AI ALIGNMENT FORUM
cloud
Posts
Sorted by New
Optimizing The Final Output Can Obfuscate CoT (Research Note) (77 karma, 1mo, 3 comments)
Subliminal Learning: LLMs Transmit Behavioral Traits via Hidden Signals in Data (105 karma, 1mo, 0 comments)
Selective Generalization: Improving Capabilities While Maintaining Alignment (30 karma, 2mo, 0 comments)
Distillation Robustifies Unlearning (87 karma, 3mo, 22 comments)
Selective modularity: a research agenda (26 karma, 5mo, 1 comment)
Is weak-to-strong generalization an alignment technique? [Question] (14 karma, 7mo, 1 comment)
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks (64 karma, 9mo, 3 comments)
Comments