AI ALIGNMENT FORUM

AXRP

Edited by Multicore, DanielFilan, et al. last updated 30th Dec 2024

AXRP, the AI X-risk Research Podcast, is a podcast hosted by Daniel Filan.

See also: Audio, Interviews

Posts tagged AXRP (karma · title · author · posted · comments):

37 · AXRP Episode 31 - Singular Learning Theory with Daniel Murfet · DanielFilan · 2y · 0
38 · AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt · DanielFilan · 2y · 6
19 · AXRP Episode 24 - Superalignment with Jan Leike · DanielFilan · 2y · 3
30 · AXRP Episode 22 - Shard Theory with Quintin Pope · DanielFilan · 2y · 4
25 · AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda · DanielFilan · 3y · 0
17 · AXRP Episode 25 - Cooperative AI with Caspar Oesterheld · DanielFilan · 2y · 0
25 · AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment · DanielFilan · 1y · 0
20 · AXRP Episode 33 - RLHF Problems with Scott Emmons · DanielFilan · 1y · 0
18 · AXRP Episode 15 - Natural Abstractions with John Wentworth · DanielFilan · 3y · 0
20 · AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory · DanielFilan · 1y · 0
15 · AXRP Episode 45 - Samuel Albanie on DeepMind’s AGI Safety Approach · DanielFilan · 4mo · 0
17 · AXRP Episode 41 - Lee Sharkey on Attribution-based Parameter Decomposition · DanielFilan · 6mo · 0
13 · AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability · DanielFilan · 8mo · 0
15 · AXRP Episode 14 - Infra-Bayesian Physicalism with Vanessa Kosoy · DanielFilan · 4y · 10
15 · AXRP Episode 13 - First Principles of AGI Safety with Richard Ngo · DanielFilan · 4y · 1
(Showing 15 of 50 posts tagged AXRP.)