AI ALIGNMENT FORUM
Monte MacDiarmid
Posts
ActAdd: Steering Language Models without Optimization (20d; 42 karma; 2 comments)
Open problems in activation engineering (2mo; 24 karma; 2 comments)
Steering GPT-2-XL by adding an activation vector (4mo; 109 karma; 63 comments)
Understanding and controlling a maze-solving policy network (7mo; 129 karma; 18 comments)