Notes on countermeasures for exploration hacking (aka sandbagging) — AI Alignment Forum