Presumably you're referring to this graph. The y-axis looks like the kind of score that ranges between 0 and 1, in which case this looks sort of like a sigmoid to me, which accelerates as it approaches ~50% performance (and decelerates as it approaches 100% performance).
If so, we might want to ask whether these tasks were chosen ~randomly (among tasks that are indicative of how useful AI is) or whether they were selected for difficulty in some way. In particular, assume that most tasks look sort of like a sigmoid as they're scaled up (accelerating arou...
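To make the selection effect concrete, here's a minimal sketch (my own toy model, not anything from the graph; all numbers are made up for illustration): each task's score is a logistic curve in log(parameter count), and "selecting for difficulty" means drawing task midpoints just beyond the scale of current models.

```python
# Toy model of the selection effect: every individual task is an ordinary
# sigmoid in log(N), but if the benchmark's tasks are selected to be just
# out of reach of current models, the aggregate score looks like it is
# accelerating.
import numpy as np

rng = np.random.default_rng(0)

def task_score(log_n, midpoint, slope=2.0):
    """Per-task sigmoid: ~0 well below the midpoint, ~1 well above it."""
    return 1.0 / (1.0 + np.exp(-slope * (log_n - midpoint)))

log_n = np.linspace(6, 12, 13)   # models from 10^6 to 10^12 parameters
current_scale = 9.0              # "today's" models: ~10^9 parameters

# Tasks selected for difficulty: midpoints clustered just above current scale.
hard_midpoints = rng.normal(loc=current_scale + 1.5, scale=0.5, size=50)
# Tasks chosen ~randomly: midpoints spread uniformly over the whole range.
random_midpoints = rng.uniform(6, 12, size=50)

for name, mids in [("selected-for-difficulty", hard_midpoints),
                   ("random", random_midpoints)]:
    avg = np.mean([task_score(log_n, m) for m in mids], axis=0)
    print(f"{name:>24}: " + " ".join(f"{s:.2f}" for s in avg))
```

Running this, the "random" row climbs steadily, while the "selected-for-difficulty" row sits near zero and then shoots up, i.e. apparent acceleration from selection alone.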
"your view seems to imply that we will move quickly from much worse than humans to much better than humans, but it's likely that we will move slowly through the human range on many tasks"
We might be able to falsify that in a few months.
There is a joint Google / OpenAI project called BIG-bench. They've crowdsourced ~200 highly diverse text tasks (from answering scientific questions to predicting protein interaction sites to measuring self-awareness).
One of the goals of the project is to see how performance on the tasks changes with the ...
But after the 10^10 point, something interesting happens: the score starts growing much faster (roughly linearly in N).
And for some tasks, the plot looks like a hockey stick (a sudden change from ~0 to almost-human).
Seems interestingly similar to the grokking phenomenon.
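For readers without access to the preliminary plots, here's a toy illustration of the two shapes described above (my own sketch under assumed numbers, not BIG-bench data): a score that crawls at small scale and then grows roughly linearly in N past a hypothetical knee at ~10^10 parameters, and a hockey-stick task that jumps almost at once from ~0 to near-human.

```python
import numpy as np

n = np.logspace(8, 12, num=9)   # parameter counts from 10^8 to 10^12
knee = 1e10                     # hypothetical point where growth speeds up

# Slow ~log(N) crawl below the knee, then ~N growth, capped at 1.0
# (scores are bounded between 0 and 1).
slow = 0.03 * (np.log10(n) - 8)
fast = np.clip((n - knee) / 5e11, 0.0, 1.0)
smooth_then_fast = np.clip(slow + fast, 0.0, 1.0)

# Hockey stick: essentially a step from ~0 to almost-human at the knee,
# loosely analogous to a grokking-style jump.
hockey_stick = np.where(n < knee, 0.02, 0.85)

for n_i, s, h in zip(n, smooth_then_fast, hockey_stick):
    print(f"N = 10^{np.log10(n_i):4.1f}   ~N growth: {s:.2f}   hockey stick: {h:.2f}")
```

This only shows the qualitative shapes; the real per-task curves would have to come from the benchmark itself.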
Hot damn, where can I see these preliminary results?
The preliminary results were obtained on a subset of the full benchmark (~90 tasks vs. 206 tasks), and there have been many changes since then, including scoring changes. So I'm not sure we'll see the same dynamics in the final results. Most likely yes, but maybe not.
I agree that the task selection process could create dynamics that look like acceleration. A good point.
As I understand it, the organizers accepted almost all submitted tasks (the main rejection reasons were technical: copyright etc.). So it was mostly self-selection, with the b...