Alignment Jam — AI Alignment Forum
Alignment Jam
Edited by Esben Kran, last updated 16th May 2023
This lists the posts that have come from the Alignment Jam hackathons.
Posts tagged Alignment Jam
0
53
We Found An Neuron in GPT-2
Joseph Miller, Clement Neo
3y
0
0
52
Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1
StefanHex, Marius Hobbhahn
3y
0
1
38
Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2
StefanHex, Marius Hobbhahn
3y
0
0
13
How-to Transformer Mechanistic Interpretability—in 50 lines of code or less!
StefanHex
3y
0
0
14
Robustness of Model-Graded Evaluations and Automated Interpretability
Simon Lermen, viluon
2y
2
0
7
Finding Deception in Language Models
Esben Kran, Archana Vaidheeswaran
1y
0