AI ALIGNMENT FORUM
What are the most important papers/post/resources to read to understand more of GPT-3?
Answer by Juraj Vitko, Aug 04, 2020

Here's a list of resources that may be of use to you. The GPT-3 paper isn't too specific on implementation details because the changes that led to it were rather incremental (especially from GPT-2, and more so the farther back we look at the Transformer lineage). So the reading needed to understand GPT-3 is broader than one might expect.

  • https://github.com/jalammar/jalammar.github.io/blob/master/notebooks/nlp/01_Exploring_Word_Embeddings.ipynb
  • http://www.peterbloem.nl/blog/transformers
  • http://jalammar.github.io/illustrated-transformer/
  • https://amaarora.github.io/2020/02/18/annotatedGPT2.html
  • http://jalammar.github.io/illustrated-gpt2/
  • http://jalammar.github.io/how-gpt3-works-visualizations-animations/
  • https://arxiv.org/pdf/1409.0473.pdf Attention (initial)
  • https://arxiv.org/pdf/1706.03762.pdf Attention Is All You Need
  • http://nlp.seas.harvard.edu/2018/04/03/attention.html (annotated)
  • https://www.arxiv-vanity.com/papers/1904.02679/ Visualizing Attention
  • https://stats.stackexchange.com/questions/421935/what-exactly-are-keys-queries-and-values-in-attention-mechanisms
  • https://arxiv.org/pdf/1807.03819.pdf Universal Transformers
  • https://arxiv.org/pdf/2007.14062.pdf Big Bird (see appendices)
  • https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_breaking_the_quadratic_attention_bottleneck_in/
  • https://www.tensorflow.org/tutorials/text/transformer
  • https://www.tensorflow.org/tutorials/text/nmt_with_attention
  • https://cdn.openai.com/blocksparse/blocksparsepaper.pdf
  • https://openai.com/blog/block-sparse-gpu-kernels/
  • https://github.com/pbloem/former/blob/master/former/transformers.py
  • https://github.com/openai/blocksparse/blob/master/examples/transformer/enwik8.py
  • https://github.com/google/trax/blob/master/trax/models/transformer.py
  • https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_gpt2.py
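
As orientation for the attention-focused links above, here is a minimal sketch of single-head causal (masked) scaled dot-product self-attention, the operation that the keys/queries/values question and the annotated Transformer walkthroughs revolve around. It is an illustrative NumPy toy under stated assumptions, not code from any of the linked repositories: the function name, the weight matrices, and the tiny dimensions are all made up for the example, and real models add multiple heads, learned projections, layer norm, and feed-forward layers on top.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention (hypothetical helper for illustration).

    x:   (T, d_model) sequence of token vectors
    w_*: (d_model, d_head) projection matrices (random here, learned in practice)
    Returns the attended outputs (T, d_head) and the attention weights (T, T).
    """
    q = x @ w_q  # queries: what each position is looking for
    k = x @ w_k  # keys: what each position offers for matching
    v = x @ w_v  # values: the content that gets mixed together

    d_head = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_head)  # (T, T) scaled dot-product similarities

    # Causal mask: position i may only attend to positions j <= i.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Row-wise softmax, shifted by the row max for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return weights @ v, weights

# Toy usage with assumed sizes: 4 tokens, 8-dimensional vectors.
rng = np.random.default_rng(0)
T, d_model, d_head = 4, 8, 8
x = rng.normal(size=(T, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, weights = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)             # one output vector per input position
print(np.round(weights, 2))  # lower-triangular: no attention to future tokens
```

The (T, T) score matrix is also where the quadratic cost discussed in the sparse-attention links comes from: both memory and compute grow with the square of the sequence length.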