[AN #71]: Avoiding reward tampering through current-RF optimization — AI Alignment Forum