Feels worth pasting in this other comment of yours from last week, which dovetails well with this:
DL so far has been easy to predict - if you bought into a specific theory of connectionism & scaling espoused by Schmidhuber, Moravec, Sutskever, and a few others, as I point out in https://www.gwern.net/newsletter/2019/13#what-progress & https://www.gwern.net/newsletter/2020/05#gpt-3 . Even the dates are more or less correct! The really surprising thing is that that particular extreme fringe lunatic theory turned out to be correct. So the question is,...
I'm imagining a tiny AI Safety organization, circa 2010, that focused on how to achieve probable alignment for scaled-up versions of that year's state-of-the-art AI designs. It's interesting to ask whether that organization would have achieved more or less than MIRI has, in terms of generalizable work and in terms of field-building.
Certainly it would have resulted in a lot of work that was initially successful but ultimately dead-end. But maybe early concrete results would have attracted more talent/attention/respect/funding, and the org could have thrown ...
a lot of AI safety work increasingly looks like it'd help make a hypothetical kind of AI safe
I think there are many reasons a researcher might still prioritize non-prosaic AI safety work. Off the top of my head:
(a)
Look, we already have superhuman intelligences. We call them corporations and while they put out a lot of good stuff, we're not wild about the effects they have on the world. We tell corporations 'hey do what human shareholders want' and the monkey's paw curls and this is what we get.
Anyway yeah that but a thousand times faster, that's what I'm nervous about.
(b)
Look, we already have superhuman intelligences. We call them governments and while they put out a lot of good stuff, we're not wild about the effects they have on the world. We tell gov... (read more)