Zhijing Jin — AI Alignment Forum
Zhijing Jin
Posts
1
Investigating Accidental Misalignment: Causal Effects of Fine-Tuning Data on Model Vulnerability
6mo
0
13
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
8mo
0