Collecting all of the quantitative AI predictions I know of MIRI leadership having made on Arbital (let me know if I missed any):
Some caveats:
Inconsistencies are flags that something is wrong, but ass numbers are vague and unreliable enough that some inconsistency is to be expected. Similarly, ass numbers are often unstable hour-to-hour and day-to-day.
On my model, the point of ass numbers isn't to demand perfection of your gut (e.g., of the sort that would be needed to avoid multiple-stage fallacies when trying to conditionalize a lot), but to give a rough, explicit snapshot of where your gut currently is, something you can communicate, compare against others' numbers, and check for calibration later.
It may still be a terrible idea to spend too much time generating ass numbers, since "real numbers" are not the native format human brains compute probability with, and spending a lot of time working in a non-native format may skew your reasoning.
(Maybe there's some individual variation here?)
But they're at least a good tool to use sometimes, for the sake of crisper communication, calibration practice (so you can generate non-awful future probabilities when you need to), etc.
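(Tangent: since the multiple-stage point and "calibration practice" are both quantitative, here's a minimal Python sketch of what I have in mind. All the numbers are made up for illustration, not anyone's actual predictions.)

```python
# Illustrative sketch only: toy numbers, not anyone's actual predictions.

# 1. Multiple-stage compounding: even mildly conservative per-stage gut
#    numbers multiply down to a small joint probability, which is part of
#    why demanding that your gut conditionalize consistently across many
#    stages is asking a lot.
stage_probabilities = [0.8, 0.8, 0.8, 0.8, 0.8]  # hypothetical per-stage guesses
joint = 1.0
for p in stage_probabilities:
    joint *= p
print(f"joint probability across {len(stage_probabilities)} stages: {joint:.2f}")  # ~0.33

# 2. Calibration practice: one simple way to score resolved predictions is
#    a Brier score (lower is better; always guessing 50% scores 0.25).
past_predictions = [(0.9, True), (0.7, False), (0.2, False), (0.6, True)]  # hypothetical
brier = sum((p - float(outcome)) ** 2 for p, outcome in past_predictions) / len(past_predictions)
print(f"Brier score over {len(past_predictions)} resolved predictions: {brier:.3f}")
```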
In the context of a conversation with Balaji Srinivasan about my AI views snapshot, I asked Nate Soares what sorts of alignment results would impress him, and he said:
example thing that would be relatively impressive to me: specific, comprehensive understanding of models (with the caveat that that knowledge may lend itself more (and sooner) to capabilities before alignment). demonstrated e.g. by the ability to precisely predict the capabilities and quirks of the next generation (before running it)
i'd also still be impressed by simple theories of aimable cognition (i mostly don't expect that sort of thing to have time to play out any more, but if someone was able to come up with one after staring at LLMs for a while, i would at least be impressed)
fwiw i don't myself really know how to answer the question "is technical research more useful than policy research"; like that question sounds to me like it's generated from a place of "enough of either of these will save you" whereas my model is more like "you need both"
tho i'm more like "to get the requisite technical research, aim for uploads" at this juncture
if this was gonna be blasted outwards, i'd maybe also caveat that, while a bunch of this is a type of interpretability work, i also expect a bunch of interpretability work to strike me as fake, shallow, or far short of the bar i consider impressive/hopeful
(which is not itself supposed to be any kind of sideswipe; i applaud interpretability efforts even while thinking it's moving too slowly etc.)
This is a repository for miscellaneous short things I want to post. Other people are welcome to make top-level comments here if they want. (E.g., questions for me you'd rather discuss publicly than via PM; links you think will be interesting to people in this comment section but not to LW as a whole; etc.)