AI ALIGNMENT FORUM

AI Boxing (Containment)

Edited by Ruby, prhodes, and TerminalAwareness; last updated 12th Sep 2020

AI Boxing refers to attempts, experiments, or proposals to isolate ("box") an unaligned AI so that it cannot interact with the world at large and cause harm.

The challenges are: 1) can you successfully prevent it from interacting with the world? And 2) can you prevent it from convincing you to let it out?
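To make the two challenges concrete, here is a minimal, purely illustrative sketch of the setup (the class and function names below are hypothetical and not drawn from any of the posts listed on this page): the boxed AI's only channel to the outside world is a narrow text interface, and a gatekeeper reviews every message before it is released.

```python
from typing import Optional


class BoxedAI:
    """Stands in for the untrusted model. It has no direct access to the
    network, filesystem, or any actuators; its only channel is this text
    interface."""

    def respond(self, prompt: str) -> str:
        # A real system would query the isolated model here.
        return f"(model output for: {prompt!r})"


class Gatekeeper:
    """A human (or automated) reviewer who decides whether any output is
    allowed to leave the box."""

    def review(self, message: str) -> bool:
        print("Boxed AI wants to send:", message)
        return input("Release this output? [y/N] ").strip().lower() == "y"


def run_boxed_session(ai: BoxedAI, gatekeeper: Gatekeeper, prompt: str) -> Optional[str]:
    # Challenge 1: this function's return value is the *only* path out of the box.
    # Challenge 2: the gatekeeper must not be persuaded into approving a harmful output.
    reply = ai.respond(prompt)
    return reply if gatekeeper.review(reply) else None


if __name__ == "__main__":
    released = run_boxed_session(BoxedAI(), Gatekeeper(), "Summarise today's lab notes.")
    print("Released output:", released)
```

The sketch only illustrates the shape of the problem: real containment proposals also have to worry about side channels, the reliability of the gatekeeper, and whether the box holds against an agent much smarter than its designers.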

Posts tagged AI Boxing (Containment)
18 · Cryptographic Boxes for Unfriendly AI · paulfchristiano · 15y · 0 comments
41 · The case for training frontier AIs on Sumerian-only corpus · Alexandre Variengien, Charbel-Raphaël, Jonathan Claybrough · 2y · 0 comments
39 · Thoughts on “Process-Based Supervision” · Steven Byrnes · 2y · 1 comment
25 · My take on Jacob Cannell’s take on AGI safety · Steven Byrnes · 3y · 5 comments
15 · LOVE in a simbox is all you need · jacob_cannell · 3y · 0 comments
19 · Side-channels: input versus output · davidad · 3y · 12 comments
15 · [Intro to brain-like-AGI safety] 11. Safety ≠ alignment (but they’re close!) · Steven Byrnes · 3y · 0 comments
7 · How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It) · Peter S. Park, NickyP, Stephen Fowler · 3y · 1 comment
118 · Would catching your AIs trying to escape convince AI developers to slow down or undeploy? · Buck · 1y · 25 comments
45 · Decision theory does not imply that we get to have nice things · So8res · 3y · 36 comments
25 · Results of $1,000 Oracle contest! · Stuart_Armstrong · 5y · 0 comments
15 · Contest: $1,000 for good questions to ask to an Oracle AI · Stuart_Armstrong · 6y · 59 comments
32 · Counterfactual Oracles = online supervised learning with random selection of training episodes · Wei Dai · 6y · 20 comments
23 · Smoke without fire is scary · Adam Jermyn · 3y · 9 comments
15 · How to safely use an optimizer · Simon Fischer · 1y · 0 comments
(Showing 15 of 23 tagged posts.)