This piece, which predates ChatGPT, is no longer endorsed by its author.
Eliezer's recent discussion on AGI alignment is not optimistic.
> I consider the present gameboard to look incredibly grim... We can hope there's a miracle that violates some aspect of my background model, and we can try to prepare for that unknown miracle.
For this post, instead of debating Eliezer's model, I want to pretend it's true. Let's imagine we've all seen satisfactory evidence for the following:
I don't think this is an unsolvable problem. In this scenario, there are two ways to avoid catastrophe: massively increase the pace of alignment research, and delay the deployment of AGI.
I wouldn't rely solely on this option. Lots of brilliant and well-funded people are already trying really hard! But I bet we can make up some time here. Let me pull some numbers out of my arse:
Suppose we spent $2B a year. This would let us accomplish in 5 years what would otherwise have taken 22 years.**
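As the second footnote admits, these numbers are a wild guess loosely based on Price's Law. Here's one toy way to make numbers of this shape hang together (my illustration, not the author's actual math): if research throughput scales with the square root of headcount, the implied scale-up is the square of the desired speedup.

```python
# Toy model for the 5-vs-22-year guess. Assumption (mine, not the
# author's): research throughput scales roughly with the square root
# of the number of contributors, in the spirit of Price's Law (half
# the output comes from sqrt(N) contributors).

baseline_years = 22  # time to solve alignment at current effort
target_years = 5     # time we'd like to solve it in

# Required speedup in research throughput.
speedup = baseline_years / target_years  # 4.4x

# Under sqrt scaling, throughput ~ sqrt(headcount), so headcount
# must grow by speedup**2 to achieve that speedup.
headcount_multiplier = speedup ** 2

print(f"required speedup: {speedup:.1f}x")
print(f"required headcount multiplier: {headcount_multiplier:.0f}x")
```

The key caveat is the one the final footnote already makes: this only works while the pool of smart, motivated people who could contribute is nowhere near exhausted.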
$2B a year isn't realistic today, but it's realistic in this scenario, where we've seen persuasive evidence Eliezer's model is true. If AI safety is the critical path for humanity's survival, I bet a skilled fundraiser can make it happen.
Of course, skillfully administering the funds is its own issue...
The problem, as I understand it:
What can we do about this?
1. Persuade OpenAI
First, let's try the low-hanging fruit. OpenAI seems to be full of smart people who want to do the right thing. If Eliezer's position is true, then I bet some high-status rationalist-adjacent figures could be persuaded. In turn, I bet these folks could get a fair listen from Sam Altman/Elon Musk/Ilya Sutskever.
Maybe they'll change their mind. Or maybe Eliezer will change his own mind.
2. Persuade US Government to impose stronger Export Controls
Second, US export controls can buy time by slowing down the whole field. They'd also make it harder to share your research, so the leading team accumulates a bigger lead. They're easy to impose: it's a regulatory move, so an act of Congress isn't required. There are already export controls on narrow areas of AI, like automated imagery analysis. We could impose export controls on areas likely to contribute to AGI and encourage other countries to follow suit.
3. Persuade leading researchers not to deploy misaligned AI
Third, if the groups deploying AGI genuinely believed it would destroy the world, they wouldn't deploy it. I bet a lot of them are persuadable in the next 2 to 50 years.
4. Use public opinion to slow down AGI research
Fourth, public opinion is a dangerous instrument. Giving AGI research the same political prominence (and epistemic habits) as climate change would make a lot of folks miserable. But I bet it could delay AGI by quite a lot.
5. US commits to using the full range of diplomatic, economic, and military action against those who violate AGI research norms
Fifth, the US has a massive array of policy options for nuclear nonproliferation. These range from sanctions (like the ones crippling Iran's economy) to war. Right now, these aren't an option for AGI, because the foreign policy community doesn't understand the threat of misaligned AGI. If we communicate clearly and in their language, we could help them understand.
I don't know whether the grim model in Eliezer's interview is true or not. I think it's really important to find out.
If it's false (alignment efforts are likely to work), then we need to know that. Crying wolf does a lot of harm, and most of the interventions I can think of are costly and/or destructive.
But if it's true (current alignment efforts are doomed), we need to know that in a legible way. That is, it needs to be as easy as possible for smart people outside the community to verify the reasoning.
*Eliezer says his timeline is "short," but I can't find specific figures. Nate Soares gives a very substantial chance of AGI within 2 to 20 years and is 85% confident we'll see AGI by 2070.
**Wild guess, loosely based on Price's Law. I think this works as long as we're nowhere close to exhausting the pool of smart/motivated/creative people who can contribute.