
Books of LessWrong

Alignment

Nov 05, 2021 by Raemon

This fifth book is about alignment, the problem of aligning the thoughts and goals of artificial intelligences with those of humans.

This relates to the art of human rationality in two ways.

First, the laws of reasoning and decision-making apply similarly to all agents, whether human or artificial. Many of the deepest insights about rationality on LessWrong have come directly from users’ attempts to grapple with the problem of understanding and aligning artificial intelligences.

Second, the design of AI is a topic in which any rational agent living today has a natural interest. This technology offers enormous leverage at this point in history, precisely because this is the period in which we are actively developing it. In the words of I. J. Good, “the first ultraintelligent machine is the last invention that man need ever make.”

If you are not well-versed in the discussion around how to align powerful AI systems, treat the essays in this book as a collection of letters between scientists in a field you are not part of. Some terms will not be fully explained, but the authors attempt to speak plainly, make honest efforts to convey the structure of their arguments, and, for the most abstract and philosophical ideas, turn to hand-drawn cartoons as their mode of communication.

Arguments about fast takeoff, by paulfchristiano
Specification gaming examples in AI, by Vika
The Rocket Alignment Problem, by Eliezer Yudkowsky
Embedded Agents, by abramdemski and Scott Garrabrant
Paul's research agenda FAQ, by zhukeepa
Challenges to Christiano’s capability amplification proposal, by Eliezer Yudkowsky
Robustness to Scale, by Scott Garrabrant
Coherence arguments do not entail goal-directed behavior, by Rohin Shah