Symbol/Referent Confusions in Language Model Alignment Experiments — AI Alignment Forum