The point I'm making is that the human example tells us that:
If we first realize that we can't code up our values and therefore conclude that alignment is hard, then, when we later realize that mesa-optimisation is a thing, we shouldn't update towards "alignment is even harder". We should update in the opposite direction.
Because the human example tells us that a mesa-optimiser can reliably point to a complex thing even if the outer optimiser only points to a few crude things.
But I only ever see these three points (the human example, our inability to code up values, mesa-optimisation) used separately to argue that "alignment is even harder than previously thought". Taken together, that is just not the picture they paint.
Humans don't explicitly pursue inclusive genetic fitness; outer optimization even on a very exact, very simple loss function doesn't produce inner optimization in that direction.
Humans haven't been optimized to pursue inclusive genetic fitness for very long, because humans haven't been around for very long. Instead, they inherited crude heuristics pointing towards inclusive genetic fitness from their cognitively much less sophisticated predecessors. And those still kinda work!
If we are still around in a couple of million years I wouldn't be surpris...
Much better now!
The date published vs. date trained distinction was on my mind because of Gopher. It seemed very relevant to me that DeepMind trained a significantly larger model within basically half a year of the publication of GPT-3.
That, in addition to Google Brain also being quite coy about their 100+B model, made me update a lot in the direction of "the big players will replicate any new breakthrough very quickly but not necessarily talk about it."
To be clear, I also think it probably doesn't make sense to include this information in the list, because it is too rarely relevant.
It's worth noting that aside from the ridiculous situation where Googlers aren't allowed to name LaMDA (despite at least 5 published papers so far), Google has been very coy about MUM & Pathways (to the point where I'm still not sure if 'Pathways' is an actual model that exists, or merely an aspirational goal/name of a research programme). You also have the situation where models like LG's new 300b Exaone is described in a research paper which makes no mention of Exaone (the Korean coverage briefly mentions the L-Verse arch, but none of the English cov...
Some ideas for improvements:
The ability to sort by model size etc. would be nice; currently sorting is alphabetical.
Also, the columns with long textual information should sit further to the right, and the tighter, more informative numerical columns further to the left (a column that reads "deep learning" in almost all rows is not very informative). Ideally the most relevant information would be visible on the initial page without scrolling.
"Date published" and "date trained" can be quite different. Maybe worth including the latter?
This is really cool work! Congratulations!
Besides the LLM-related work, it's also somewhat reminiscent of dynamic prompting in Stable Diffusion, where part of the prompt is changed after a number of steps to achieve a mixture of prompt1 and prompt2.
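For concreteness, here is a minimal sketch of that idea (the `encode` and `denoise_step` functions are hypothetical stand-ins, not the actual Stable Diffusion API): the conditioning tensor is simply swapped partway through the denoising loop, so the early steps follow prompt1 and the later steps follow prompt2.

```python
import torch

def encode(prompt: str) -> torch.Tensor:
    """Toy text encoder (stand-in): a deterministic embedding per prompt."""
    g = torch.Generator().manual_seed(sum(ord(c) for c in prompt))
    return torch.randn(77, 768, generator=g)

def denoise_step(latents: torch.Tensor, cond: torch.Tensor, t: int) -> torch.Tensor:
    """Stand-in for one UNet denoising step: nudges latents toward a
    statistic of the conditioning, just to make the sketch runnable."""
    return latents + 0.01 * (cond.mean() - latents)

def sample_with_prompt_switch(prompt1: str, prompt2: str,
                              steps: int = 50, switch_at: int = 25) -> torch.Tensor:
    cond1, cond2 = encode(prompt1), encode(prompt2)
    latents = torch.randn(4, 64, 64)
    for t in range(steps):
        # The "dynamic" part: swap the conditioning after switch_at steps.
        cond = cond1 if t < switch_at else cond2
        latents = denoise_step(latents, cond, t)
    return latents

final = sample_with_prompt_switch("a photo of a cat", "a photo of a dog")
```

In a real pipeline the same switch would happen on the actual prompt embeddings at the chosen step; the point is only that the mixture comes from when the conditioning changes, not from blending the embeddings themselves.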
What's the TL;DR for the Vicuna 13B experiments?