FWIW I have come to similar conclusions along similar lines. I've said that I think human intelligence minus rat intelligence is probably easier to understand and implement than rat intelligence alone. Rat intelligence requires a long list of neural structures fine-tuned by natural selection, over tens of millions of years, to enable the rat to do very specific survival behaviors right out of the womb. How many individual fine-tuned behaviors? Hundreds? Thousands? Hard to say. Human intelligence, by contrast, cannot possibly be this fine-tuned, because the same machinery lets us learn and predict almost arbitrarily different* domains.
I also think that recent results in machine learning have essentially proven the conjecture that moar compute regularly and reliably leads to moar performance, all things being equal. The human neocortical algorithm probably wouldn't work very well if it were applied in a brain 100x smaller because, by its very nature, it requires massive amounts of parallel compute to work. In other words, the neocortex needs trillions of synapses to do what it does for much the same reason that GPT-3 can do things that GPT-2 can't. Size matters, at least for this particular class of architectures.
*I think this is actually wrong - I don't think we can learn arbitrary domains, not even close. Humans are not general. Yann LeCun has repeatedly said this and I'm inclined to trust him. But I think that the human intelligence architecture might be general. It's just that natural selection stopped seeing a net fitness advantage at the current brain size.
I would consider it corrigible for the AI to tell Petrov about the problem. Not "I can't answer you" but "the texts I have on hand are inconclusive and unhelpful with respect to helping you solve your problem." This is, itself, informative.
If you're an expert in radar, and I ask you if you think something is a glitch or not, and you say you "can't answer", that doesn't tell me anything. I have no idea why you can't answer. If you tell me "it's inconclusive", that's informative. The information is that you can't really distinguish between a glitch and a real signal in this case. If I'm conservatively minded, then I'll increase my confidence that it's a glitch.
I really appreciated the degree of clarity and the organization of this post.
I wonder how much the slope of L(D) is a consequence of the structure of the dataset, and whether we have much power to meaningfully shift the nature of L(D) for large datasets. A lot of the structure of language is very repetitive, and once it is learned, the model doesn't learn much from seeing more examples of the same sort of thing. But, within the dataset are buried very rare instances of important concept classes. (In other words, the Common Crawl data has a certain perplexity, and that perplexity is a function of both how much of the dataset is easy/broad/repetitive/generic and how much is hard/narrow/unique/specific.) For example: I can't, for the life of me, get GPT-3 to give correct answers on the following type of prompt:
No matter how much priming I give or how I reframe the question, GPT-3 tends to either give a basically random cardinal direction, or just repeat whatever direction I mentioned in the prompt. If you can figure out how to do it, please let me know, but as far as I can tell, GPT-3 really doesn't understand how to do this. I think this is just an example of the sort of thing which simply occurs so infrequently in the dataset that it hasn't learned the abstraction. However, I fully expect that if there were some corner of the Internet where people wrote a lot about the cardinal directions of things relative to a specified observer, GPT-3 would learn it.
It also seems that one of the important things that humans do but transformers do not is actively seek out more surprising subdomains of the learning space. The big breakthrough in transformers was attention, but currently the attention is only within-sequence, not across-dataset. What does L(D) look like if the model is empowered to notice, while training, that its loss on sequences involving words like "west" and "cardinal direction" is bad, and then to search for and prioritize other sequences with those tokens, rather than simply churning through the next 1000 examples of sequences from which it has essentially already extracted the maximum amount of information? At a certain point, you don't need to train it on "The man woke up and got out of {bed}"; it knew what the last token was going to be long ago.
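To make the "prioritize the surprising sequences" idea concrete, here is a toy sketch of loss-proportional sampling (a simple cousin of hard-example mining). Everything in it is an assumption for illustration: the 990/10 split between easy and hard examples, the starting losses, and the rule that each visit halves an example's loss. It is not how GPT-3 was trained and it says nothing about real L(D) curves; it only shows the mechanism of spending updates where the loss is still high.

```python
import random

def sample_indices(losses, k, rng, prioritized):
    """Pick k example indices, uniformly or weighted by current loss."""
    n = len(losses)
    if not prioritized or sum(losses) <= 0:
        return [rng.randrange(n) for _ in range(k)]
    # Loss-proportional sampling: high-loss examples are drawn more often.
    return rng.choices(range(n), weights=losses, k=k)

def simulate(prioritized, steps=200, batch_size=1, seed=0):
    """Return the mean loss on the rare 'hard' examples after training."""
    rng = random.Random(seed)
    # Hypothetical toy dataset: 990 "easy" sequences that are nearly learned,
    # plus 10 rare "hard" ones (think: cardinal-direction prompts).
    losses = [0.01] * 990 + [5.0] * 10
    for _ in range(steps):
        for i in sample_indices(losses, batch_size, rng, prioritized):
            losses[i] *= 0.5  # assumed learning rule: each visit halves the loss
    return sum(losses[-10:]) / 10

uniform_hard = simulate(prioritized=False)
prioritized_hard = simulate(prioritized=True)
print(f"mean hard-example loss, uniform: {uniform_hard:.3f}, "
      f"prioritized: {prioritized_hard:.3f}")
```

In this toy setup the uniform sampler spends nearly all of its draws on examples it has already learned, while the prioritized sampler concentrates on the rare hard ones until their loss falls to roughly the level of everything else. A real version would need some way to estimate per-sequence loss cheaply and to find related sequences across the dataset, which is the hard part.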
It would be good to know if I'm completely missing something here.