[ Question ]

What are the most important papers/post/resources to read to understand more of GPT-3?

by Adam Shimi1 min read2nd Aug 20204 comments

6

GPTMachine LearningAI
Frontpage

I'm way more used to thinking about weird maths or distributed algorithms or abstract philosophical problems than about concrete machine learning architectures. But based on everything I see about GPT-3, it seems a nice idea to learn more about it, even if only for participating in the discussion without spouting non-sense.

So I'm asking for what you think are the must-reads on GPT-3 specifically, and maybe any requirement to understand them.

New Answer
Ask Related Question
New Comment

2 Answers

nostalgebraist's blog is a must-read regarding GPT-x, including GPT-3. Perhaps, start here ("the transformer... 'explained'?"), which helps to contextualize GPT-x within the history of machine learning.

(Though, I should note that nostalgebraist holds a contrarian "bearish" position on GPT-3 in particular; for the "bullish" case instead, read Gwern.)

Here's a list of resources that may be of use to you. The GPT-3 paper isn't too specific on implementation details because the changes that led to it were rather incremental (especially from GPT-2, and more so the farther back we look at the Transformer lineage). So the scope to understand GPT-3 is broader than one might expect.