AI Alignment Forum: Tags

SERI MATS

Contributors: Multicore (1)

SERI MATS is the Stanford Existential Risks Initiative ML Alignment Theory Scholars program.

https://www.serimats.org/

Posts tagged SERI MATS
Most Relevant
SERI MATS Program - Winter 2022 Cohort
Ryan Kidd, Victor Warlop, Christian Smith
32 karma, 5mo, 0 comments, tag relevance 3

SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow, mwatkins
147 karma, 1mo, 14 comments, tag relevance 1

Understanding and controlling a maze-solving policy network
Alex Turner, peligrietzer, Ulisse Mini, montemac, David Udell
126 karma, 10d, 11 comments, tag relevance 3

Normative vs Descriptive Models of Agency
Matt MacDermott
12 karma, 2mo, 2 comments, tag relevance 2

Soft optimization makes the value target bigger
Jeremy Gillen
38 karma, 3mo, 1 comment, tag relevance 2

Predictions for shard theory mechanistic interpretability results
Alex Turner, Ulisse Mini, peligrietzer
43 karma, 21d, 6 comments, tag relevance 2

Conditioning Generative Models for Alignment
Arun Jose
24 karma, 8mo, 8 comments, tag relevance 2

More findings on Memorization and double descent
Marius Hobbhahn
31 karma, 2mo, 2 comments, tag relevance 1

Race Along Rashomon Ridge
Stephen Fowler, Peter S. Park, MichaelEinhorn
12 karma, 8mo, 0 comments, tag relevance 1

Auditing games for high-level interpretability
Paul Colognese
16 karma, 5mo, 0 comments, tag relevance 1

More findings on maximal data dimension
Marius Hobbhahn
18 karma, 2mo, 0 comments, tag relevance 1

A distillation of Evan Hubinger's training stories (for SERI MATS)
Daphne_W
9 karma, 8mo, 1 comment, tag relevance 1

What sorts of systems can be deceptive?
Andrei Alexandru
10 karma, 5mo, 0 comments, tag relevance 2

Finite Factored Sets in Pictures
Magdalena Wache
48 karma, 3mo, 2 comments, tag relevance 1

Natural Abstractions: Key claims, Theorems, and Critiques
Lawrence Chan, Leon Lang, Erik Jenner
67 karma, 5d, 5 comments, tag relevance 1