Gears-Level

A gears-level model is "well-constrained" in the sense that the things you observe are strongly connected to one another: it would be hard to imagine one of the variables being different while all of the others remained the same.
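As a toy illustration of what "well-constrained" can mean, consider a chain of meshed gears in which each gear must turn opposite to its neighbor. This sketch is not taken from the original post; the function names and the +1/-1 encoding of rotation direction are assumptions made for the example. Under this model, observing one gear fixes every other gear, so changing a single observation while holding the rest fixed contradicts the model.

```python
from typing import List


def predicted_direction(first_direction: int, index: int) -> int:
    """Direction of gear `index` in a meshed chain, given gear 0's direction.

    Directions are encoded as +1 (clockwise) or -1 (counterclockwise);
    this encoding is an assumption of the sketch.
    """
    # Adjacent meshed gears turn in opposite directions, so even-indexed
    # gears match gear 0 and odd-indexed gears oppose it.
    return first_direction if index % 2 == 0 else -first_direction


def is_consistent(observed: List[int]) -> bool:
    """Return True if every observed direction fits the meshed-chain model."""
    return all(
        direction == predicted_direction(observed[0], i)
        for i, direction in enumerate(observed)
    )


observed = [+1, -1, +1, -1]

# Change one variable while leaving all the others the same.
flipped = observed.copy()
flipped[2] = -flipped[2]

print(is_consistent(observed))  # True
print(is_consistent(flipped))   # False
```

A looser model, such as "each gear turns in some direction", would accept both lists, which is one way to see why it would carry less explanatory weight.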

Related Tags: Anticipated Experiences, Double-Crux, Empiricism, Falsifiability, Map and Territory

The term gears-level was first described on LW in the post "Gears in Understanding":...

Posts tagged Gears-Level

Toward a New Technical Explanation of Technical Explanation (6y)

Attempted Gears Analysis of AGI Intervention Discussion With Eliezer (3y)

Evolution of Modularity (5y)

A Case for the Least Forgiving Take On Alignment (1y)

Abstraction, Evolution and Gears (4y)

interpreting GPT: the logit lens (4y)

Current themes in mechanistic interpretability research, by Lee Sharkey, Sid Black, Beren Millidge (2y)

Decision Transformer Interpretability, by Joseph Isaac Bloom, Paul Colognese (1y)

Beware of black boxes in AI alignment research, by Vladimir Slepnev (6y)

Value Formation: An Overarching Model (2y)