Oscar Obeso — AI Alignment Forum
Oscar Balcells Obeso
Math undergrad at ETH Zurich.
More info: oscarbalcells.com
Posts
Refusal in LLMs is mediated by a single direction (77 karma, 2y, 44 comments)
Refusal mechanisms: initial experiments with Llama-2-7b-chat (34 karma, 2y, 1 comment)