What's up with the <pad> token (<pad>==<bos>==<eos> in Pythia) in the attention diagram? I assume that doesn't need to be there?