tl;dr: This is a follow-up to our original post on prompt generation and the anomalous token phenomenon which emerged from that research. Work done by Jessica Rumbelow and Matthew Watkins in January 2023 at SERI-MATS.

part of a typical semantically coherent cluster we found in GPT2-small's embedding space

Clustering

As a result of work done on clustering tokens in GPT-2 and GPT-J embedding spaces, our attention was originally drawn to the tokens closest to the centroid of the entire set of 50,257 tokens shared across all GPT-2 and GPT-3 models.^[1] These tokens were familiar to us because they frequently appeared as the closest tokens to the centroids of the (mostly semantically coherent, or semi-coherent) clusters of tokens we were producing via the k-means algorithm. Here are a few more selections from such clusters. Distances shown are Euclidean, and from the cluster's centroid (rather than the overall token set centroid):

Distance-from-centroid hypothesis

We hypothesised that the anomalous tokens which kept showing up as the nearest tokens to the centroids of such clusters were also the tokens closest to the overall centroid of the token set. This turned out to be correct for GPT2-small and GPT-J. However, the opposite was true for GPT2-xl, where the anomalous tokens tend to be found as far as possible from the overall centroid.
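To make the comparison concrete, here is a minimal sketch of how one might check this kind of hypothesis. It is not our original code: the embedding matrix `emb` is a random placeholder (in practice it would be a model's token embedding matrix), the k-means routine is a plain NumPy implementation of Lloyd's algorithm, and the helper names and the choice of `k` are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=25, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    x_sq = (X ** 2).sum(axis=1)[:, None]

    def assign(C):
        # Squared Euclidean distances via |x|^2 - 2 x.c + |c|^2,
        # which avoids building an (n, k, d) array.
        d2 = x_sq - 2.0 * X @ C.T + (C ** 2).sum(axis=1)[None, :]
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids, assign(centroids)

def nearest_token_per_cluster(X, centroids, labels):
    """Map each non-empty cluster index to its member closest to the cluster centroid."""
    nearest = {}
    for j, c in enumerate(centroids):
        idx = np.flatnonzero(labels == j)
        if len(idx) == 0:
            continue
        d = np.linalg.norm(X[idx] - c, axis=1)
        nearest[j] = int(idx[d.argmin()])
    return nearest

# Placeholder data (assumption): 2,000 random "tokens" in 16 dimensions.
rng = np.random.default_rng(0)
emb = rng.normal(size=(2000, 16))
k = 20

centroids, labels = kmeans(emb, k)

# Euclidean distance of every token from the centroid of the whole token set.
dist_overall = np.linalg.norm(emb - emb.mean(axis=0), axis=1)
order = np.argsort(dist_overall)
closest_overall = order[:k]
farthest_overall = order[-k:]

cluster_nearest = nearest_token_per_cluster(emb, centroids, labels)

# How many "nearest to a cluster centroid" tokens are also among the
# tokens nearest to the overall centroid?
overlap = set(closest_overall.tolist()) & set(cluster_nearest.values())
print(f"{len(overlap)} of {len(cluster_nearest)} cluster-nearest tokens "
      f"are among the {k} tokens closest to the overall centroid")
```

With random placeholder data the overlap count carries no meaning. The point is only the shape of the check: rank tokens by distance from the overall centroid, then see whether the tokens nearest the cluster centroids sit at the near end of that ranking (as for GPT2-small and GPT-J) or the far end (as for GPT2-xl).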

Horizontal axes indicate distance from overall token centroid. The top three histograms involve just 133 tokens, whereas the lower three involve the whole set of 50,257. Note that you can see spikes in the top histograms registering as tiny bumps in the graphs below them.
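For readers who want to produce comparable histograms, the following is a rough sketch under stated assumptions: `emb` and `anomalous_ids` are random placeholders standing in for a real embedding matrix and a real list of anomalous token indices, and the bin count is arbitrary. Using the same bin edges for the subset and for the full token set is what lets a spike in the small histogram be matched to a bump in the large one.

```python
import numpy as np

# Placeholder data (assumptions): random embeddings and a random subset
# of 133 indices standing in for the anomalous tokens.
rng = np.random.default_rng(0)
emb = rng.normal(size=(5000, 32))
anomalous_ids = rng.choice(len(emb), size=133, replace=False)

# Distance of each token from the overall token centroid.
dist = np.linalg.norm(emb - emb.mean(axis=0), axis=1)

# Shared bin edges so the two histograms are directly comparable.
bins = np.linspace(dist.min(), dist.max(), 101)  # 100 bins
all_counts, _ = np.histogram(dist, bins=bins)
anom_counts, _ = np.histogram(dist[anomalous_ids], bins=bins)

# Fraction of each bin's tokens that belong to the subset (0 where a bin is empty).
share = np.divide(anom_counts, all_counts,
                  out=np.zeros(len(all_counts)), where=all_counts > 0)
print("bin with the largest subset share:", int(share.argmax()))
```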

One unexplained phenomenon which may be related emerged from three-shot prompting experiments with these models, in which they were encouraged to repeat the anomalous tokens (rather than being directly asked to, as we'd been doing with ChatGPT and then GPT3-davinci-instruct-beta):

Our three-shot prompts were formatted as follows (here for the example token 'EStreamFrame'). Note that we've included examples capitalised and uncapitalised, alphabetic and numeric, with and without a leading space:

'Turntable' > 'Turntable'
' expectation' > ' expectation'
'215' > '215'
'EStreamFrame' >

This prompt was run through all three models, for a list of 85 anomalous tokens, with the following success rates:

GPT2-small 18/85 (21%)
GPT2-xl 43/85 (51%)
GPT-J 17/85 (20%)
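The prompt format and the tallying above can be sketched as follows. This is an illustration rather than the code we ran: the function names are hypothetical, the `complete` argument stands in for whatever function sends a prompt to a model and returns its completion, and the success criterion (the completion begins with the quoted token) is an assumption.

```python
def build_three_shot_prompt(token: str) -> str:
    """Three fixed examples followed by the token under test."""
    examples = ["Turntable", " expectation", "215"]
    lines = [f"'{ex}' > '{ex}'" for ex in examples]
    lines.append(f"'{token}' >")
    return "\n".join(lines)

def is_successful_repeat(token: str, completion: str) -> bool:
    # Assumed criterion: after leading whitespace, the completion
    # starts with the token wrapped in single quotes.
    return completion.lstrip().startswith(f"'{token}'")

def success_rate(tokens, complete):
    """Return (hits, total) for a completion function `complete(prompt) -> str`."""
    hits = sum(
        is_successful_repeat(t, complete(build_three_shot_prompt(t)))
        for t in tokens
    )
    return hits, len(tokens)

def echo_model(prompt: str) -> str:
    """Stand-in for a real model: perfectly repeats the final quoted token."""
    last_line = prompt.splitlines()[-1]
    quoted = last_line[: -len(" >")]
    return " " + quoted

def as_percent(hits: int, total: int) -> str:
    return f"{hits}/{total} ({round(100 * hits / total)}%)"

prompt = build_three_shot_prompt("EStreamFrame")
print(prompt)
print(as_percent(*success_rate(["EStreamFrame", " expectation", "215"], echo_model)))
```

Swapping `echo_model` for a function that calls an actual model is all that is needed to run this against a real token list.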

Here are...
