Why Do Naive SFT Filters For Safety Properties Fail? — AI Alignment Forum