Interpretability Tools Are an Attack Channel — AI Alignment Forum