nostalgebraist's blog is a must-read regarding GPT-x, including GPT-3. Perhaps start here ("the transformer... 'explained'?"), which helps to contextualize GPT-x within the history of machine learning.
(Though, I should note that nostalgebraist holds a contrarian "bearish" position on GPT-3 in particular; for the "bullish" case instead, read Gwern.)
Thanks for writing this post. I have a handful of quick questions: (a) What was the reference MIPS (or the corresponding CPU) you used for the c. 2019-2020 data point? (b) What was the constant amount of RAM you used to run Stockfish? (c) Do I correctly understand that the Stockfish-to-MIPS comparison is based on the equation [edit: not sure how to best format this LaTeX...]:
$$\frac{\text{slowed down SF8 running time}}{\text{reference SF8 running time}} = \frac{\text{historical MIPS}}{\text{reference MIPS}}$$
So, your post piqued my interest to investigate the Intel 80486 a bit more with the question in mind...