Oliver Habryka | v1.10.0 | Sep 12th 2020 | (+5/-4) Improve "see also" formatting
Ruben Bloom | v1.9.0 | Sep 12th 2020 | (+239/-259)
Ruben Bloom | v1.8.0 | Sep 12th 2020 | (+2996/-223)
Multicore | v1.7.0 | Aug 30th 2020
Multicore | v1.6.0 | Aug 12th 2020
Multicore | v1.5.0 | Aug 12th 2020 | (+629)
Multicore | v1.4.0 | Aug 12th 2020 | (+225)
Ruben Bloom | v1.3.0 | May 4th 2020 | (+9/-18)
Oliver Habryka | v1.2.0 | Apr 23rd 2020 | (+21)
Ruben Bloom | v1.1.0 | Apr 16th 2020 | (+222/-2984)
See also: AI, AGI, Oracle AI, Tool AI, Unfriendly AI
AI Boxing refers to attempts, experiments, or proposals to isolate ("box") a powerful AI (~AGI) so that it can't interact with the world at large, save for limited communication with its human liaison. It is often proposed that so long as the AI is physically isolated and restricted, or "boxed", it will be harmless even if it is an unfriendly artificial intelligence (UAI).
One idea for AI boxing is the Oracle AI: an AI that only answers questions and isn't designed to interact with the world in any other way. But even the act of the AI putting strings of text in front of humans poses some risk.
It is not considered likely that an AGI can be boxed in the long term. Since the AGI might be a superintelligence, it could persuade someone (the human liaison, most likely) to free it from its box and thus from human control. Some practical ways of achieving this goal include:
Other, more speculative ways include: threatening to torture millions of conscious copies of you for thousands of years, each starting in exactly the same situation as you, so that it seems overwhelmingly likely that you are yourself a simulation; or discovering and exploiting unknown physics to free itself.
Attempts to box an AGI may add some degree of safety to the development of a friendly artificial intelligence (FAI). A number of strategies for keeping an AGI in its box are discussed in Thinking inside the box and Leakproofing the Singularity. Among them are:
Both Eliezer Yudkowsky and Justin Corwin have run simulations, pretending to be a superintelligence, and been able to convince a human playing a guard to let them out on many - but not all - occasions. Eliezer's five experiments required the guard to listen for at least two hours and used participants who had approached him, while Corwin's 26 experiments had no time limit and used subjects he had approached.
The text of Eliezer's experiments has not been made public.
List of experiments
The AI Box Experiment is a game meant to explore the possible pitfalls of AI boxing. It is played over text chat, with one human roleplaying as an AI in a box, and another human roleplaying as a gatekeeper with the ability to let the AI out of the box. The AI player wins if they convince the gatekeeper to let them out of the box, and the gatekeeper wins if the AI player has not been freed after a certain period of time. The AI Box Experiment has been played several times, but the text logs are generally not made public. It was invented by Eliezer Yudkowsky, who won his first two games playing as the AI.
AI Boxing is attempts, experiments, or proposals to isolate ("box") an unaligned AI where it can't interact with the world at large and cause harm. See also: AI Alignment.
Challenges are: 1) can you successfully prevent it from interacting with the world, and 2) can you prevent it from convincing you to let it out?