Why Did My Model Do That? Model Incrimination for Diagnosing LLM Misbehavior — AI Alignment Forum