What's up with the <pad> token (<pad>==<bos>==<eos> in Pythia) in the attention diagram? I assume that doesn't need to be there?