AI ALIGNMENT FORUM
False assumptions and leaky abstractions in machine learning and AI safety

by David Scott Krueger (formerly: capybaralet)

28th Jun 2019



David Scott Krueger (formerly: capybaralet) (6y)

A few more important examples of leaky abstractions that we might worry about protecting/enforcing:

Causal interventions (as "uncaused causes", à la free will).
Boxes that don't leak information (BoMAI).

Making a more complete list would be a good project.


The problems of embedded agency are due to the notion of agency implicit in reinforcement learning being a leaky abstraction.
Machine learning problem statements often make assumptions that are known to be false, for example, assuming i.i.d. data.
Examining failure modes that result from false assumptions and leaky abstractions is important for safety (at least) because they create additional possibilities for convergent rationality.
Attempting to enforce the assumptions implicit in machine learning problem statements is another important topic for safety research, since we do not fully understand the failure modes.
In practice, most machine learning research is done in settings where unrealistic assumptions are trivially enforced well enough that it is reasonable to assume they are not violated (e.g. by the use of a fixed train/valid/test set, generated via pseudo-random uniform sampling from a fixed dataset).
We can (and probably should) do machine learning research that targets failure modes of common assumptions and methods of enforcing assumptions by (instead) creating settings in which these assumptions have the potential to be violated.