Léo Dana

(Léo Dana) French master's student in applied mathematics (probability & statistics), soon to be an alignment researcher (?)

Posts

20 Introducing EffiSciences’ AI Safety Unit

1y

0

Comments

Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)

Léo Dana 6mo 10

Quick question: you say that MLPs 2-6 gradually improve the representation of the athlete's sport, and that no single MLP does it in one go. Do you think the reason could be something like what this post describes? https://www.lesswrong.com/posts/8ms977XZ2uJ4LnwSR/decomposing-independent-generalizations-in-neural-networks

So MLPs 2-6 basically perform the same computation, but each in a different superposition basis, so that after several MLPs the model is fairly confident about the answer? If so, do you think there is more to say about how the bases are arranged, e.g. which concepts interfere with which? (I guess this could help answer questions like "how to change the name-surname-sport lookup table", which we are currently not able to do.)

Thanks
