AI ALIGNMENT FORUM

magnetoid

Comments

Refusal in LLMs is mediated by a single direction
magnetoid, 1y

transformer_lens doesn't seem to have been updated for Llama 3? I was trying to replicate the Llama 3 results and would be grateful for any pointers. Thanks.
