AI ALIGNMENT FORUM

green_leaf

Comments

Using GPT-Eliezer against ChatGPT Jailbreaking
green_leaf (3y)

(If the goal is to keep the AI from outputting anything misaligned, then being conservative is probably the point, and the resulting drop in performance seems more than acceptable.)
