AI ALIGNMENT FORUM

Mechanistic Interpretability Puzzles

Jul 28, 2023 by Neel Nanda
Mech Interp Puzzle 1: Suspiciously Similar Embeddings in GPT-Neo
Neel Nanda
Mech Interp Puzzle 2: Word2Vec Style Embeddings
Neel Nanda