AI ALIGNMENT FORUM

Cameron Raymond — AI Alignment Forum

30

7mo

Predicting LLM Safety Before Release by Simulating Deployment

by Tomek Korbak, Marcus Williams, micahcarroll, Cameron Raymond, and Hannah Sheahan

Paper link

Before releasing a new model, labs need to understand not just what it can do, but how it is likely to behave in real-world use, including where it might introduce new risks. This becomes even more important as capabilities increase. As part of our pre-deployment safety review, we...