AI ALIGNMENT FORUM
205
Rowan Wang
https://rowankwang.com/
Posts
29
Building and evaluating alignment auditing agents
3mo
0
45
Modifying LLM Beliefs with Synthetic Document Finetuning
6mo
10
48
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small
3y
4
25
Gears-Level Mental Models of Transformer Interpretability
4y
1