Auditing language models for hidden objectives — AI Alignment Forum