The Evals Gap
dshah3 · 10mo* · 10

I'm curious whether there's a need to distinguish between safety evals built around tasks we want an AI to be good at versus tasks we don't want an AI to be good at. For example, a subset of evals where, if any AI or combination of AIs can solve them, it becomes a matter of national security. My thinking is that building these would be capital-intensive and would involve making external tools LLM-interfaceable, especially government ones. I'm not even sure what such evals would look like, but in theory there are tasks we never want a widely available LLM to solve (e.g., bio-agent design, persuasion, privacy violations).
