AI ALIGNMENT FORUM

74
lisathiergart

https://admonymous.co/lisath

Wikitag Contributions

No wikitag contributions to display.

Comments

No comments found.

Posts

lisathiergart's Shortform (0 karma, 6mo, 0 comments)
Paper: Understanding and Controlling a Maze-Solving Policy Network (38 karma, 2y, 0 comments)
ActAdd: Steering Language Models without Optimization (45 karma, 2y, 2 comments)
Open problems in activation engineering (24 karma, 2y, 2 comments)
Steering GPT-2-XL by adding an activation vector (121 karma, 2y, 63 comments)
Maze-solving agents: Add a top-right vector, make the agent go to the top-right (37 karma, 3y, 7 comments)