Today, the AI Extinction Statement was released by the Center for AI Safety: a one-sentence statement jointly signed by a historic coalition of AI experts, professors, and tech leaders.

Geoffrey Hinton and Yoshua Bengio have signed, as have the CEOs of the major AGI labs–Sam Altman, Demis Hassabis, and Dario Amodei–as well as executives from Microsoft and Google (but notably not Meta).

The statement reads: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

We hope this statement will bring AI x-risk further into the Overton window and open up discussion around AI’s most severe risks. Given the growing number of experts and public figures who take risks from advanced AI seriously, we hope to improve epistemics by encouraging discussion and focusing public and international attention toward this issue.

Wei Dai, 11mo

Is it just me or is it nuts that a statement this obvious could have gone outside the overton window, and is now worth celebrating when it finally (re?)enters?

How is it possible to build a superintelligence at acceptable risk while this kind of thing can happen? What if there are other truths important to safely building a superintelligence, that nobody (or very few) acknowledges because they are outside the overton window?

Now that AI x-risk is finally in the overton window, what's your vote for the most important and obviously true statement that is still outside it (i.e., that almost nobody is willing to say or is interested in saying)? Here are my top candidates:

  1. Dying of old age, as well as physical and mental deterioration from it, are bad and worth substantial coordinated effort to prevent.
  2. It's possible to make serious irreversible mistakes due to having incorrect answers to important philosophical questions. In fact, this is likely, considering how much confusion and disagreement there is on many philosophical questions that seem obviously important.

Why is 1 important? It seems like something we can defer discussion of until after (if ever) alignment is solved, no?

2 is arguably in that category also, though idk.

Wei Dai, 10mo

Why is 1 important? It seems like something we can defer discussion of until after (if ever) alignment is solved, no?

If aging were solved or looked like it would be solved within the next few decades, it would make efforts to stop or slow down AI development less problematic, both practically and ethically. I think some AI accelerationists might be motivated directly by the prospect of dying/deterioration from old age, and/or view lack of interest/progress on that front as a sign of human inadequacy/stagnation (contributing to their antipathy towards humans). At the same time, the fact that pausing AI development has a large cost in lives of current people means that you have to have a high p(doom) or credence in utilitarianism/longtermism to support it (and risk committing a kind of moral atrocity if you turn out to be wrong).

2 is arguably in that category also, though idk.

2 is important because as tech/AI capabilities increase, the possibilities to "make serious irreversible mistakes due to having incorrect answers to important philosophical questions" seem to open up exponentially. Some examples:

  • premature value lock-in
  • value drift
  • handing over too much control/resources to alien/unaligned agents due to negotiation mistakes
  • mistakes related to commitment races
  • the process of creating/aligning AI might be unethical or create a costly obligation
  • failure to prevent mindcrime inside AIs
  • intentionally doing horrible things at astronomical scale due to having wrong values/philosophies

If your point is that we could delegate solving these problems to aligned AI once we have them, my worry is that AI, including aligned AI, will be much better at creating new philosophical problems (opportunities to make mistakes) than at solving them. The task of reducing this risk (e.g., by solving metaphilosophy or otherwise making sure AIs' philosophical abilities keep up with or outpace their other intellectual abilities) seems super neglected, in part because very few people seem to acknowledge the importance of avoiding errors like the ones listed above.

(BTW I was surprised to see your skepticism about 2, since it feels like I've been talking about it on LW like a broken record, and I don't recall seeing any objections from you before. Would be curious to know if anything I said above is new to you, or you've seen me say similar things before but weren't convinced.)

Something like 2% of people die every year right? So even if we ignore the value of future people and all sorts of other concerns and just focus on whether currently living people get to live or die, it would be worth delaying a year if we could thereby decrease p(doom) by 2 percentage points. My p(doom) is currently 70% so it is very easy to achieve that. Even at 10% p(doom), which I consider to be unreasonably low, it would probably be worth delaying a few years.
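The break-even arithmetic in the comment above can be sketched in a few lines. This is only an illustration under the comment's own assumptions: the "something like 2%" annual death rate is taken as stated, only currently living people are counted, and everyone who survives the delay is assumed to be saved if doom is avoided. The function names `breakeven_risk_reduction` and `delay_is_worth_it` are hypothetical, introduced here for illustration.

```python
def breakeven_risk_reduction(annual_mortality: float, delay_years: float) -> float:
    """Fraction of currently living people expected to die during the delay.

    Assumes a constant annual mortality rate, compounded yearly. Under the
    comment's framing, a delay pays for itself only if it lowers p(doom)
    by more than this fraction.
    """
    return 1.0 - (1.0 - annual_mortality) ** delay_years


def delay_is_worth_it(risk_reduction: float, annual_mortality: float, delay_years: float) -> bool:
    """True if the reduction in p(doom) exceeds the cost of the delay in lives."""
    return risk_reduction > breakeven_risk_reduction(annual_mortality, delay_years)


# The comment's assumed figure, not a measured statistic.
ASSUMED_ANNUAL_MORTALITY = 0.02

one_year = breakeven_risk_reduction(ASSUMED_ANNUAL_MORTALITY, 1)     # about 0.02
three_years = breakeven_risk_reduction(ASSUMED_ANNUAL_MORTALITY, 3)  # about 0.0588
```

On these assumptions a one-year delay needs to buy about 2 percentage points of risk reduction, and a three-year delay just under 6. The sketch leaves out the value of future people and the risk of an open-ended delay, both of which the surrounding comments discuss.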

Re: 2: Yeah I basically agree. I'm just not as confident as you are I guess. Like, maybe the answers to the problems you describe are fairly objective, fairly easy for smart AIs to see, and so all we need to do is make smart AIs that are honest and then proceed cautiously and ask them the right questions. I'm not confident in this skepticism and could imagine becoming much more convinced simply by thinking or hearing about the topic more.

Even at 10% p(doom), which I consider to be unreasonably low, it would probably be worth delaying a few years.

Someone with 10% p(doom) may worry that if they got into a coalition with others to delay AI, they can't control the delay precisely, and it could easily become more than a few years. Maybe it would be better not to take that risk, from their perspective.

And lots of people have p(doom)<10%. Scott Aaronson just gave 2% for example, and he's probably taken AI risk more seriously than most (currently working on AI safety at OpenAI), so probably the median p(doom) (or effective p(doom) for people who haven't thought about it explicitly) among the whole population is even lower.

I’m just not as confident as you are I guess. Like, maybe the answers to the problems you describe are fairly objective, fairly easy for smart AIs to see, and so all we need to do is make smart AIs that are honest and then proceed cautiously and ask them the right questions.

I think I've tried to take into account uncertainties like this. It seems that in order for my position (that the topic is important and too neglected) to be wrong, one has to reach high confidence that these kinds of problems will be easy for AIs (or humans or AI-human teams) to solve, and I don't see how that kind of conclusion could be reached today. I do have some specific arguments for why the AIs we'll build may be bad at philosophy, but I think those are not very strong arguments so I'm mostly relying on a prior that says we should be worried about and thinking about this until we see good reasons not to. (It seems hard to have strong arguments either way today, given our current state of knowledge about metaphilosophy and future AIs.)

Another argument for my position is that humans have already created a bunch of opportunities for ourselves to make serious philosophical mistakes, like around nuclear weapons, farmed animals, AI, and we can't solve those problems by just asking smart honest humans the right questions, as there is a lot of disagreement between philosophers on many important questions.

I’m not confident in this skepticism and could imagine becoming much more convinced simply by thinking or hearing about the topic more.

What's stopping you from doing this, if anything? (BTW, beyond the general societal level of neglect, I'm especially puzzled by the lack of interest/engagement on this topic from the many people in EA with formal philosophy backgrounds. If you're already interested in AI and x-risks and philosophy, how is this not an obvious topic to work on or think about?)

I guess I just think it's pretty unreasonable to have p(doom) of 10% or less at this point, if you are familiar with the field, timelines, etc. 

I totally agree the topic is important and neglected. I only said "arguably" deferrable, I have less than 50% credence that it is deferrable. As for why I'm not working on it myself, well, aaaah I'm busy idk what to do aaaaaaah! There's a lot going on that seems important. I think I've gotten wrapped up in more OAI-specific things since coming to OpenAI, and maybe that's bad & I should be stepping back and trying to go where I'm most needed even if that means leaving OpenAI. But yeah. I'm open to being convinced!

I guess part of the problem is that the people who are currently most receptive to my message are already deeply enmeshed in other x-risk work, and I don't know how to reach others for whom the message might be helpful (such as academic philosophers just starting to think about AI?). If on reflection you think it would be worth spending some of your time on this, one particularly useful thing might be to do some sort of outreach/field-building, like writing a post or paper describing the problem, presenting it at conferences, and otherwise attracting more attention to it.

(One worry I have about this is, if someone is just starting to think about AI at this late stage, maybe their thinking process just isn't very good, and I don't want them to be working on this topic! But then again maybe there's a bunch of philosophers who have been worried about AI for a while, but have stayed away due to the overton window thing?)

Somehow there are 4 copies of this post


Some notable/famous signatories that I noted: Geoffrey Hinton, Yoshua Bengio, Demis Hassabis (DeepMind CEO), Sam Altman (OpenAI CEO), Dario Amodei (Anthropic CEO), Stuart Russell, Peter Norvig, Eric Horvitz (Chief Scientific Officer at Microsoft), David Chalmers, Daniel Dennett, Bruce Schneier, Andy Clark (the guy who wrote Surfing Uncertainty), Emad Mostaque (Stability AI CEO), Lex Fridman, Sam Harris.

Edited to add: a more detailed listing from this post:

Signatories include notable philosophers, ethicists, legal scholars, economists, physicists, political scientists, pandemic scientists, nuclear scientists, and climate scientists. [...]

Signatories of the statement include:

  • The authors of the standard textbook on Artificial Intelligence (Stuart Russell and Peter Norvig)
  • Two authors of the standard textbook on Deep Learning (Ian Goodfellow and Yoshua Bengio)
  • An author of the standard textbook on Reinforcement Learning (Andrew Barto)
  • Three Turing Award winners (Geoffrey Hinton, Yoshua Bengio, and Martin Hellman)
  • CEOs of top AI labs: Sam Altman, Demis Hassabis, and Dario Amodei
  • Executives from Microsoft, OpenAI, Google, Google DeepMind, and Anthropic
  • AI professors from Chinese universities
  • The scientists behind famous AI systems such as AlphaGo and every version of GPT (David Silver, Ilya Sutskever)
  • The top two most cited computer scientists (Hinton and Bengio), and the most cited scholar in computer security and privacy (Dawn Song)

I feel somewhat frustrated by the execution of this initiative. As far as I can tell, no new signatures have been published since at least one day before the public announcement. This means that even if I asked someone famous (at least in some subfield or circles) to sign, and the person signed, their name is not on the list, which understandably frustrates them. (I have already received one piece of feedback along the lines of "the signatories are impressive, but the organization running it seems untrustworthy".)

Also if the statement is intended to serve as a beacon, allowing people who have previously been quiet about AI risk to connect with each other, it's essential for signatures to be published. It's nice that Hinton et al. signed, but for many people in academia it would be actually practically useful to know who from their institution signed - it's unlikely that most people will find collaborators in Hinton, Russell or Hassabis.

I feel even more frustrated because this is the second time a similar effort has been run by the x-risk community without the basic operational competence to accept and verify signatures. So, I make this humble appeal and offer to the organizers of any future public statements collecting signatures: if you are able to write a good statement and secure the endorsement of some initial high-profile signatories, but lack the ability to accept, verify and publish more than a few hundred names, please reach out to me - it's not that difficult to find volunteers for this work.

It's a step, likely one that couldn't be skipped. Still just short of actually acknowledging nontrivial probability of AI-caused human extinction, and the distinction between extinction and lesser global risks, availability of second chances at doing better next time. Nuclear war can't cause extinction, so it's not properly alongside AI x-risk. Engineered pandemics might eventually get extinction-worthy, but even that real risk is less urgent.