AI ALIGNMENT FORUM

AI Boxing (Containment)

Edited by Ruby, prhodes, and TerminalAwareness; last updated 12th Sep 2020

AI Boxing refers to attempts, experiments, or proposals to isolate ("box") an unaligned AI so that it cannot interact with the world at large and cause harm.

The challenges are: 1) can you successfully prevent it from interacting with the world? And 2) can you prevent it from convincing you to let it out?
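To make the two challenges concrete, here is a minimal, purely illustrative sketch of the setup (the class and function names below are hypothetical and not drawn from any of the posts listed on this page): the boxed AI's only channel to the outside world is a narrow text interface, and a gatekeeper reviews every message before it is released.

```python
from typing import Optional


class BoxedAI:
    """Stands in for the untrusted model. It has no direct access to the
    network, filesystem, or any actuators; its only channel is this text
    interface."""

    def respond(self, prompt: str) -> str:
        # A real system would query the isolated model here.
        return f"(model output for: {prompt!r})"


class Gatekeeper:
    """A human (or automated) reviewer who decides whether any output is
    allowed to leave the box."""

    def review(self, message: str) -> bool:
        print("Boxed AI wants to send:", message)
        return input("Release this output? [y/N] ").strip().lower() == "y"


def run_boxed_session(ai: BoxedAI, gatekeeper: Gatekeeper, prompt: str) -> Optional[str]:
    # Challenge 1: this function's return value is the *only* path out of the box.
    # Challenge 2: the gatekeeper must not be persuaded into approving a harmful output.
    reply = ai.respond(prompt)
    return reply if gatekeeper.review(reply) else None


if __name__ == "__main__":
    released = run_boxed_session(BoxedAI(), Gatekeeper(), "Summarise today's lab notes.")
    print("Released output:", released)
```

The sketch only illustrates the shape of the problem: real containment proposals also have to worry about side channels, the reliability of the gatekeeper, and whether the box holds against an agent much smarter than its designers.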

Posts tagged AI Boxing (Containment)
18 · Cryptographic Boxes for Unfriendly AI · paulfchristiano · 15y · 0 comments
41 · The case for training frontier AIs on Sumerian-only corpus · Alexandre Variengien, Charbel-Raphaël, Jonathan Claybrough · 2y · 0 comments
39 · Thoughts on “Process-Based Supervision” · Steven Byrnes · 2y · 1 comment
25 · My take on Jacob Cannell’s take on AGI safety · Steven Byrnes · 3y · 5 comments
15 · LOVE in a simbox is all you need · jacob_cannell · 3y · 0 comments
19 · Side-channels: input versus output · davidad · 3y · 12 comments
15 · [Intro to brain-like-AGI safety] 11. Safety ≠ alignment (but they’re close!) · Steven Byrnes · 3y · 0 comments
7 · How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It) · Peter S. Park, NickyP, Stephen Fowler · 3y · 1 comment
118 · Would catching your AIs trying to escape convince AI developers to slow down or undeploy? · Buck · 1y · 25 comments
45 · Decision theory does not imply that we get to have nice things · So8res · 3y · 36 comments
25 · Results of $1,000 Oracle contest! · Stuart_Armstrong · 5y · 0 comments
15 · Contest: $1,000 for good questions to ask to an Oracle AI · Stuart_Armstrong · 6y · 59 comments
32 · Counterfactual Oracles = online supervised learning with random selection of training episodes · Wei Dai · 6y · 20 comments
23 · Smoke without fire is scary · Adam Jermyn · 3y · 9 comments
15 · How to safely use an optimizer · Simon Fischer · 1y · 0 comments
(Showing 15 of 23 tagged posts.)