User Comment Replies — AI Alignment Forum

2 · Donald Hobson · 3y

That only holds so long as we assume the timescale of intelligence growth is slow compared to the timescale of destroying the world. Suppose an AI is smart enough to destroy the world in a year (in the hypothetical where it had to stop self-improving and act now). After a day of self-improvement, it is smart enough to destroy the world in a week. After another day, it can destroy the world in an hour.

Another possibility is an AI that doesn't choose to destroy the world at the first available moment. Imagine a paperclip maximizer. It thinks it has a 99% chance of destroying the world and turning everything into paperclips, and a 1% chance of getting caught and destroyed. If it waits for another week of self-improvement, it can get that chance of being caught down to 0.0001%.

Suppose the limiting factor is compute budget. Making each AI 1% bigger than the last means basically wasting compute running pretty much the same AI again and again. Making each AI about 2x as big as the last is sensible. If each training run costs a fortune, you can't afford to go in tiny steps.
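To put rough numbers on the compute point, here is a minimal sketch. It assumes, purely for illustration, that the cost of one training run is proportional to model size and that every intermediate size gets its own full run; neither assumption comes from the comment above, and the function name and the target size of 1024 are made up for the example.

```python
def total_training_cost(growth_factor: float, target_size: float):
    """Return (number of runs, total cost) for growing from size 1 up to
    at least target_size, multiplying the size by growth_factor each step.

    Illustrative assumption: one training run costs exactly its model size.
    """
    size = 1.0
    runs = 0
    total = 0.0
    while True:
        total += size          # pay for a full run at the current size
        runs += 1
        if size >= target_size:
            break
        size *= growth_factor  # the next model is this much bigger
    return runs, total


if __name__ == "__main__":
    for factor in (2.0, 1.01):
        runs, total = total_training_cost(factor, 1024.0)
        print(f"growth x{factor}: {runs} runs, total cost {total:.0f}")
```

Under these assumptions, doubling spends roughly twice the cost of the final run in total, while 1% steps spend on the order of a hundred times the final run's cost, since the total is a geometric series that sums to about g/(g-1) times the last term.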

The Dumbest Possible Gets There First

Artaxerxes · 3y · 10

I'm fairly agnostic about how dumb we're talking - what kinds of acts or confluence of events are actually likely to be effective complete x-risks, particularly at relatively low levels of intelligence/capability. But that's beside the point in some ways, because wherever someone might place the threshold for x-risk-capable AI, as long as you assume that greater intelligence is harder to produce (an assumption that doesn't necessarily hold, as I acknowledged), I think that suggests that we will be killed by something not much higher than that threshold o...


All of Artaxerxes's Comments + Replies