Measurement tampering detection as a special case of weak-to-strong generalization — AI Alignment Forum