[ Question ]

What are the most important papers/posts/resources to read to understand more of GPT-3?

by Adam Shimi · 1 min read · 2nd Aug 2020 · 4 comments


GPT · Machine Learning · AI

I'm way more used to thinking about weird maths, distributed algorithms, or abstract philosophical problems than about concrete machine learning architectures. But based on everything I see about GPT-3, it seems like a good idea to learn more about it, if only to participate in the discussion without spouting nonsense.

So I'm asking what you think are the must-reads on GPT-3 specifically, and maybe any prerequisites needed to understand them.


2 Answers

nostalgebraist's blog is a must-read regarding GPT-x, including GPT-3. Perhaps start here ("the transformer... 'explained'?"), which helps to contextualize GPT-x within the history of machine learning.

(Though, I should note that nostalgebraist holds a contrarian "bearish" position on GPT-3 in particular; for the "bullish" case instead, read Gwern.)

2 · Adam Shimi · 4mo
Thanks for the answer! I knew about the "transformer explained" post, but I was not aware of its author's position on GPT-3.

Here's a list of resources that may be of use to you. The GPT-3 paper is light on implementation details because the changes that led to it were rather incremental (especially relative to GPT-2, and more so the farther back we look in the Transformer lineage). So the background needed to understand GPT-3 is broader than one might expect.

1 · Adam Shimi · 4mo
Thanks! I'll try to read that.