Artificial Intelligence is the study of creating intelligence in algorithms. On LessWrong, the primary focus of AI discussion is to ensure that as humanity builds increasingly powerful AI systems, the outcome will be good. The central concern is that a powerful enough AI, if not designed and implemented with sufficient understanding, would optimize something unintended by its creators and pose an existential threat to the future of humanity. This is known as the AI alignment problem.

Common terms in this space are superintelligence, AI Alignment, AI Safety, Friendly AI, Transformative AI, human-level-intelligence, AI Governance, and Beneficial AI. This entry and the associated tag roughly encompass all of these topics: anything part of the broad cluster of understanding AI and its future impacts on our civilization deserves this tag.

AI Alignment

There are narrow conceptions of alignment, where you’re trying to get it to do something like cure Alzheimer’s disease without destroying the rest of the world. And there’s much more ambitious notions of alignment, where you’re trying to get it to do the right thing and achieve a happy intergalactic civilization.

But both the narrow and the ambitious alignment have in common that you’re trying to have the AI do that thing rather than making a lot of paperclips.

See also General Intelligence.

Basic Alignment Theory

Coherent Extrapolated Volition
Complexity of Value
Decision Theory
Embedded Agency
Fixed Point Theorems
Goodhart's Law
Inner Alignment
Instrumental Convergence
Logical Induction
Newcomb's Problem
Orthogonality Thesis
Outer Alignment
Paperclip Maximizer
Solomonoff Induction
Utility Functions

Engineering Alignment

AI Boxing (Containment)
Debate (AI safety technique)
Factored Cognition
Humans Consulting HCH
Impact Measures
Inverse Reinforcement Learning
Iterated Amplification
Mild Optimization
Tool AI
Transparency / Interpretability
Value Learning



AI Governance
AI Risk
AI Services (CAIS)
AI Takeoff
AI Timelines


Centre for Human-Compatible AI
Future of Humanity Institute
Machine Intelligence Research Institute


Research Agendas