The Evals Gap
dshah3 · 10mo* · 10

I'm curious whether there's a need to distinguish between safety evals built around tasks we want an AI to be good at versus tasks we don't want an AI to be good at. For example, a subset of evals where, if any AI or combination of AIs can solve them, it becomes a matter of national security. My thinking is that building these would be capital-intensive and would involve making external tools LLM-interfaceable, especially government ones. I'm not even sure what such evals would look like, but in theory there are tasks we never want a widely available LLM to solve (e.g., bio-agent design, persuasion, privacy violations).
