You are viewing revision 1.2.0, last edited by Oliver Habryka

AI Boxing refers to attempts, experiments, or proposals to isolate ("box") an unaligned AI so that it cannot interact with the world at large and cause harm. Part of AI Alignment.

The challenges are: 1) can you successfully prevent the AI from interacting with the world? And 2) can you prevent it from convincing you to let it out?