Won't vs. Can't: Sandbagging-like Behavior from Claude Models — AI Alignment Forum