AI ALIGNMENT FORUM
Monte MacDiarmid
Posts
Paper: Understanding and Controlling a Maze-Solving Policy Network (36 karma, 2mo, 0 comments)
ActAdd: Steering Language Models without Optimization (44 karma, 3mo, 2 comments)
Open problems in activation engineering (24 karma, 4mo, 2 comments)
Steering GPT-2-XL by adding an activation vector (110 karma, 6mo, 63 comments)
Understanding and controlling a maze-solving policy network (129 karma, 9mo, 18 comments)