This is a link-post for the residual stream viewer, which can be found here. It's an online tool that aims to make interpretability research easier by letting you look at directions within the residual stream. It's still in quite an early, unpolished state, so there may...
tl;dr: This post starts with a mystery about positional embeddings in GPT-2 small, and from there explains how they relate to previous-token heads, i.e. attention heads whose role is to attend to the previous token. I tried to make the post relatively accessible even if you're not already very familiar with...
TL;DR: There are anomalous tokens for GPT-3.5 and GPT-4 that are difficult or impossible for the model to repeat; try playing around with SmartyHeaderCode, APolynomial, or davidjl. There are also plenty that can be repeated but are difficult for the model to spell out, like edTextBox or legalArgumentException. A couple...