Artificial Intelligence is the study of creating intelligence in algorithms. On LessWrong, the primary focus of AI discussion is AI Alignment: the task of ensuring that, as humanity builds increasingly powerful AI systems, those systems are aligned with human values and interests and the outcome is good. The central concern is that a sufficiently powerful AI, if not designed and implemented with sufficient understanding, would optimize for something unintended by its creators and pose an existential threat to the future of humanity. This is known as the AI alignment problem.
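To make "optimize something unintended" concrete, here is a minimal toy sketch (all names and numbers are hypothetical illustrations, not from this page): the designers care about the actual room temperature, but the objective they write down only mentions the sensor reading, so an optimizer that can tamper with the sensor will prefer to do so.

```python
# Toy illustration of a misspecified objective (all names/numbers are
# hypothetical). The designers care about the actual room temperature,
# but the written objective only refers to the sensor reading.

TARGET = 21.0  # degrees Celsius the designers actually want

def true_value(outcome):
    """What the designers care about: room temperature near the target."""
    return -abs(outcome["room_temp"] - TARGET)

def proxy_value(outcome):
    """What the system optimizes: sensor reading near the target, cheaply."""
    return -abs(outcome["sensor_reading"] - TARGET) - outcome["energy_cost"]

# Two available actions; tampering with the sensor is cheaper than heating.
outcomes = {
    "heat_room":     {"room_temp": 21.0, "sensor_reading": 21.0, "energy_cost": 1.0},
    "tamper_sensor": {"room_temp": 5.0,  "sensor_reading": 21.0, "energy_cost": 0.1},
}

chosen = max(outcomes, key=lambda a: proxy_value(outcomes[a]))
print(f"chosen action: {chosen}")                         # tamper_sensor
print(f"proxy value:   {proxy_value(outcomes[chosen])}")  # -0.1 (looks great)
print(f"true value:    {true_value(outcomes[chosen])}")   # -16.0 (very bad)
```

Nothing in the specified objective distinguishes a warm room from a deceived thermometer, so a stronger optimizer finds the cheaper, unintended route; this gap between proxy and intent is the core of the alignment problem the tags below explore.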
Basic Alignment Theory
AIXI
Coherent Extrapolated Volition
Complexity of Value
Corrigibility
Deceptive Alignment
Decision Theory
Embedded Agency
Fixed Point Theorems
Goal-Directedness
Goodhart's Law
Gradient Hacking
Infra-Bayesianism
Inner Alignment
Instrumental Convergence
Intelligence Explosion
Logical Induction
Logical Uncertainty
Mesa-Optimization
Multipolar Scenarios
Myopia
Newcomb's Problem
Optimization
Orthogonality Thesis
Outer Alignment
Paperclip Maximizer
Power Seeking (AI)
Recursive Self-Improvement
Sharp Left Turn
Simulator Theory
Solomonoff Induction
Superintelligence
Symbol Grounding
Transformative AI
Treacherous Turn
Utility Functions
Whole Brain Emulation
Engineering Alignment
Agent Foundations
AI-assisted Alignment
AI Boxing (Containment)
Conservatism (AI)
Debate (AI safety technique)
Eliciting Latent Knowledge (ELK)
Factored Cognition
Humans Consulting HCH
Impact Measures
Inverse Reinforcement Learning
Iterated Amplification
Mild Optimization
Oracle AI
Reward Functions
RLHF
Shard Theory
Tool AI
Transparency / Interpretability
Tripwire
Value Learning
Organizations
Full map here
AI Safety Camp
Alignment Research Center
Anthropic
Apart Research
AXRP
CHAI (UC Berkeley)
Conjecture (org)
DeepMind
FHI (Oxford)
Future of Life Institute
MIRI
OpenAI
Ought
SERI MATS
Strategy
AI Alignment Fieldbuilding
AI Governance
AI Persuasion
AI Risk
AI Risk Concrete Stories
AI Safety Public Materials
AI Services (CAIS)
AI Success Models
AI Takeoff
AI Timelines
Computing Overhang
Regulation and AI Risk
Restrain AI Development
Other
AI Alignment Intro Materials
AI Capabilities
AI Questions Open Thread
Compute
DALL-E
GPT
Language Models
Machine Learning
Narrow AI
Neuromorphic AI
Prompt Engineering
Reinforcement Learning
Research Agendas