AI ALIGNMENT FORUM

341
Brendan Murphy

Posts

Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google (16 karma, 8mo, 0 comments)
GPT-4o Guardrails Gone: Data Poisoning & Jailbreak-Tuning (8 karma, 11mo, 0 comments)