AI ALIGNMENT FORUM

Apart Research

Written by Esben Kran, Oliver Habryka, Jason Hoelscher-Obermaier. Last updated 18th Jul 2024.

Apart Research is an AI safety research lab. It hosts the Apart Sprints, large-scale international events for research experimentation. This tag includes posts written by Apart researchers and content about Apart Research.

Posts tagged Apart Research
[53] We Found An Neuron in GPT-2
Joseph Miller, Clement Neo
2y
0
[1] Analysing Adversarial Attacks with Linear Probing
Yoann Poupart, Imene Kerboua, Clement Neo, Jason Hoelscher-Obermaier
1y
0
[52] Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1
Stefan Heimersheim, Marius Hobbhahn
2y
0
[38] Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2
Stefan Heimersheim, Marius Hobbhahn
2y
0
[14] Robustness of Model-Graded Evaluations and Automated Interpretability
Simon Lermen, viluon
2y
2
[13] How-to Transformer Mechanistic Interpretability—in 50 lines of code or less!
Stefan Heimersheim
2y
0
[7] Finding Deception in Language Models
Esben Kran, Archana Vaidheeswaran
10mo
0
[3] Can startups be impactful in AI safety?
Esben Kran, Archana Vaidheeswaran
9mo
0