AI ALIGNMENT FORUM

336
Julian Minder
Ω22100

PhD @ EPFL with Robert West. MATS 7 Scholar with Neel Nanda. Interested in mechanistic interpretability and in what the process of finetuning does to models.

Posts

23 · Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences · 1mo · 0
42 · What We Learned Trying to Diff Base and Chat Models (And Why It Matters) · 4mo · 0