Possible research directions to improve the mechanistic explanation of neural networks
Why I like the Circuits approach

In early 2020 I skimmed the neural network explainability literature, which was already quite massive at the time, and came away discouraged about the utility of the explanation techniques I saw. The level of rigor exhibited in the literature was very low...
A world simulator of some sort is probably going to be an important component of any AGI, at the very least for planning; Yann LeCun has talked about this a lot. There's also work showing that a VAE-type model can be configured to run internal simulations of the environment it was trained on.
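To make the "internal simulation" idea concrete, here is a minimal toy sketch of a latent world model: an encoder maps an observation into a latent state, a learned dynamics map rolls that state forward without touching the real environment, and a decoder turns each imagined latent back into a predicted observation. Everything here is a random linear map standing in for trained networks (the names `W_enc`, `W_dyn`, `W_dec`, and `imagine` are illustrative, not from any referenced codebase).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a learned world model. In a real system these would
# be trained networks (e.g. a VAE encoder/decoder plus a latent-dynamics
# model); here they are random linear maps just to show the control flow.
OBS_DIM, LATENT_DIM = 8, 3
W_enc = rng.normal(size=(LATENT_DIM, OBS_DIM))           # encoder: obs -> latent
W_dyn = rng.normal(size=(LATENT_DIM, LATENT_DIM)) * 0.5  # latent transition
W_dec = rng.normal(size=(OBS_DIM, LATENT_DIM))           # decoder: latent -> obs

def imagine(obs, n_steps):
    """Encode one observation, then roll the dynamics forward entirely
    in latent space -- an 'internal simulation' that never queries the
    real environment -- decoding a predicted observation at each step."""
    z = W_enc @ obs
    trajectory = []
    for _ in range(n_steps):
        z = np.tanh(W_dyn @ z)        # one imagined transition in latent space
        trajectory.append(W_dec @ z)  # predicted next observation
    return trajectory

rollout = imagine(rng.normal(size=OBS_DIM), n_steps=5)
print(len(rollout), rollout[0].shape)
```

The point of the sketch is only the shape of the computation: after the single encode step, planning can proceed by evaluating imagined trajectories entirely in latent space, which is what makes such a simulator useful for an agent.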
In brief, a few issues I see here:
- You haven't actually provided any evidence that GPT does simulation, other than: "Just saying 'this AI is a simulator' naturalizes many of the counterintuitive properties of GPT which don't usually become apparent to people until they've had a lot of hands-on experience with generating text." What