From a mathematical point of view, the building and training of a large transformer language model (LLM) is the construction of a certain function, from one Euclidean space to another, that has certain interesting properties. It may therefore be surprising to find that many key papers announcing significant new LLMs seem reluctant to spell out the details of the function they have constructed in plain mathematical language, or indeed even in complete pseudo-code. The latter form of this complaint is the subject of the recent article of Phuong and Hutter [1]. Here, we focus on one aspect of the former perspective and seek to give a relatively ‘pure’ mathematical description of the architecture of an LLM.


This short PDF is a set of notes I made, initially just for my own benefit, while trying to understand the architecture of 'decoder-only' LLMs. It draws heavily on Anthropic's A Mathematical Framework for Transformer Circuits but is deliberately written in a 'pure math' style.

It was while writing this up for posting that I started to develop the thoughts that led to my post about the mathematics of interpretability more generally.

I still consider it something of a fragment or draft, but may develop it further.
