AI ALIGNMENT FORUM
A Moderate Update to your Artificial Priors
| Score | Title | Author(s) | Posted | Comments |
|---|---|---|---|---|
| 95 | ARC's first technical report: Eliciting Latent Knowledge | Paul Christiano, Mark Xu, Ajeya Cotra | 3y | 72 |
| 70 | Fun with +12 OOMs of Compute | Daniel Kokotajlo | 4y | 45 |
| 101 | What 2026 looks like | Daniel Kokotajlo | 3y | 29 |
| 83 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky, Richard Ngo | 3y | 53 |
| 75 | Another (outer) alignment failure story | Paul Christiano | 3y | 25 |
| 92 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew Critch | 3y | 49 |
| 74 | The Plan | johnswentworth | 3y | 19 |
| 54 | Finite Factored Sets | Scott Garrabrant | 3y | 70 |
| 49 | Selection Theorems: A Program For Understanding Agents | johnswentworth | 3y | 24 |
| 72 | My research methodology | Paul Christiano | 3y | 35 |
| 61 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 3y | 7 |
| 57 | Comments on Carlsmith's “Is power-seeking AI an existential risk?” | Nate Soares | 3y | 11 |
| 64 | EfficientZero: How It Works | 1a3orn | 3y | 2 |
| 28 | Specializing in Problems We Don't Understand | johnswentworth | 3y | 0 |