AI ALIGNMENT FORUM
Eliciting Latent Knowledge (ELK)
• Applied to [RFC] Possible ways to expand on "Discovering Latent Knowledge in Language Models Without Supervision". by gekaklam, 10d ago
• Applied to Collin Burns on Alignment Research And Discovering Latent Knowledge Without Supervision by Raymond Arnold, 18d ago
• Applied to [ASoT] Simulators show us behavioural properties by default by Arun Jose, 22d ago
• Applied to Can we efficiently distinguish different mechanisms? by Multicore, 22d ago
• Applied to How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme by Charbel-Raphael Segerie, 1mo ago
• Applied to My Reservations about Discovering Latent Knowledge (Burns, Ye, et al) by Robert_AIZI, 1mo ago
• Applied to Article Review: Discovering Latent Knowledge (Burns, Ye, et al) by Robert_AIZI, 1mo ago
• Applied to Can we efficiently explain model behaviors? by Raymond Arnold, 2mo ago
• Applied to How is ARC planning to use ELK? by Raymond Arnold, 2mo ago
• Applied to Discovering Latent Knowledge in Language Models Without Supervision by Xodarap, 2mo ago
• Applied to Finding gliders in the game of life by Raymond Arnold, 2mo ago
• Applied to ARC paper: Formalizing the presumption of independence by Nicholas Goldowsky-Dill, 2mo ago
• Applied to Mechanistic anomaly detection and ELK by Andrei Alexandru, 2mo ago
• Applied to The limited upside of interpretability by Peter S. Park, 3mo ago
• Applied to Mesatranslation and Metatranslation by Ruben Bloom, 3mo ago
• Applied to You won’t solve alignment without agent foundations by RobertM, 3mo ago
• Applied to For ELK truth is mostly a distraction by Cristian Trout, 3mo ago
• Applied to Logical Decision Theories: Our final failsafe? by Noosphere89, 3mo ago
• Applied to Where I currently disagree with Ryan Greenblatt’s version of the ELK approach by Multicore, 4mo ago
• Applied to The ELK Framing I’ve Used by sudo -i, 5mo ago