Relaxed adversarial training for inner alignment — AI Alignment Forum