Predicting LLM Safety Before Release by Simulating Deployment
by Tomek Korbak, Marcus Williams, micahcarroll, Cameron Raymond, and Hannah Sheahan
Before releasing a new model, labs need to understand not just what it can do, but how it is likely to behave in real-world use, including where it might introduce new risks. This becomes even more important as capabilities increase. As part of our pre-deployment safety review, we...
Jun 16