AI ALIGNMENT FORUM
Alignment Jam
• Applied to Demonstrate and evaluate risks from AI to society at the AI x Democracy research hackathon by Jason Hoelscher-Obermaier 15d ago
• Applied to Towards AI Safety Infrastructure: Talk & Outline by Paul Bricman 4mo ago
• Applied to Tips, tricks, lessons and thoughts on hosting hackathons by gergogaspar 6mo ago
• Applied to Robustness of Model-Graded Evaluations and Automated Interpretability by Esben Kran 10mo ago
• Applied to How-to Transformer Mechanistic Interpretability—in 50 lines of code or less! by Esben Kran 11mo ago
• Applied to We Found An Neuron in GPT-2 by Esben Kran 1y ago
• Applied to Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2 by Stefan Heimersheim 1y ago
• Applied to Results from the AI testing hackathon by Esben Kran 1y ago
• Applied to Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1 by Esben Kran 1y ago
• Applied to Superposition and Dropout by Esben Kran 1y ago
• Applied to Identifying semantic neurons, mechanistic circuits & interpretability web apps by Esben Kran 1y ago
• Applied to Results from the interpretability hackathon by Esben Kran 1y ago
• Applied to Dropout can create a privileged basis in the ReLU output model. by Esben Kran 1y ago
Esben Kran, v1.0.0, May 16th 2023 (+70)
This lists the posts that have come from the Alignment Jam hackathons.
• Created by Esben Kran 1y ago