Yushi Yang — AI Alignment Forum

Unlearning Needs to be More Selective [Progress Report]

by Filip Sondej, Yushi Yang, and Marcel Windys

Summary: We’d like to share our ongoing work on improving LLM unlearning. [arXiv] [github] There are myriad approaches to unlearning, so over the past 8 months we conducted hundreds of small-scale experiments, comparing many loss functions, variants of meta-learning, various neuron or weight ablations, representation engineering and many exotic...

Jun 27, 2025•24