AI ALIGNMENT FORUM

Uzay Macar
Ω15100

MATS 8.0 scholar working with Neel Nanda on chain-of-thought interpretability. Find more about me here: uzaymacar.com

Posts

Wikitag Contributions

Comments

No Comments Found
No wikitag contributions to display.
29 Unfaithful chain-of-thought as nudged reasoning
2mo
0
16 Thought Anchors: Which LLM Reasoning Steps Matter?
2mo
1