AI ALIGNMENT FORUM

Paul Bogdan

I am a MATS 8.0 scholar working with Neel Nanda on mechanistic interpretability, currently (Summer 2025) focusing on interpreting chain-of-thought.

Posts

Paul Bogdan's Shortform (2d ago, karma 2, 0 comments)
Statistical suggestions for mech interp research and beyond (1mo ago, karma 26, 0 comments)
Unfaithful chain-of-thought as nudged reasoning (2mo ago, karma 29, 0 comments)
Thought Anchors: Which LLM Reasoning Steps Matter? (2mo ago, karma 16, 1 comment)