AI Boxing (Containment)
• Applied to "Protecting against sudden capability jumps during training" by nikola, 2d ago
• Applied to "Information-Theoretic Boxing of Superintelligences" by JustinShovelain, 3d ago
• Applied to "Self-shutdown AI" by Jan Betley, 3mo ago
• Applied to "Boxing" by Raymond Arnold, 4mo ago
• Applied to "Thoughts on “Process-Based Supervision”" by Steve Byrnes, 5mo ago
• Applied to "A way to make solving alignment 10.000 times easier. The shorter case for a massive open source simbox project." by AlexFromSafeTransition, 5mo ago
• Applied to "[FICTION] Unboxing Elysium: An AI'S Escape" by Super AGI, 6mo ago
• Applied to "Ideas for studies on AGI risk" by dr_s, 7mo ago
• Applied to "ChatGPT getting out of the box" by qbolec, 9mo ago
• Applied to "ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so" by Christopher King, 9mo ago
• Applied to "Bing finding ways to bypass Microsoft's filters without being asked. Is it reproducible?" by Christopher King, 9mo ago
• Applied to "I Am Scared of Posting Negative Takes About Bing's AI" by Yitzi Litt, 10mo ago
• Applied to "How it feels to have your mind hacked by an AI" by blaked, 1y ago
• Applied to "Oracle AGI - How can it escape, other than security issues? (Steganography?)" by RationalSieve, 1y ago
• Applied to "I've updated towards AI boxing being surprisingly easy" by Noosphere89, 1y ago
• Applied to "Side-channels: input versus output" by davidad (David A. Dalrymple), 1y ago
• Applied to "Prosaic misalignment from the Solomonoff Predictor" by Cleo Nardo, 1y ago
• Applied to "My take on Jacob Cannell’s take on AGI safety" by Steve Byrnes, 1y ago