An open response to Wittkotter and Yampolskiy

Donald Hobson

A response to this paper.

https://asi-safety-lab.com/DL/Kill-Switch-For_ASI_EW_21_12_14.pdf

A substantial fraction of my argument comes down to. It is plausible that a hypothetical hypercompetent civilization could coordinate to build such infrastructure. However we don't live in a hypothetical hypercompetent civilization. We are not remotely close, and any attempt to get us there will be met with almost certain failure. Such a hypothetical civilization would not need the mechanisms presented here, as it wouldn't be stupid enough to create an unaligned AI and let it loose on the internet anyway.

The Current IT ecosystem is unprepared to switch-off globally an Artificial Superintelligence
(ASI), the likely result of an intelligence explosion.

True. You also don't give any indication of how humanity would find out about this (the ASI will only take overt obvious actions once it is convinced that human knowledge of its existence is of little threat to it, ie when it's too late.

Yampolskiy [37], Chalmers [38], and others have suggested confining ASI via AI Boxing, but
Yudkowsky [39], showed that many of these external constraints could be bypassed by an ASI.

They think that humans can't win against ASI on the battlefield of boxing. Boxing is a battlefield heavily skewed towards the defender. Stopping an ASI after it has already spread widely, good luck with that. Complexity is the enemy of security, and the whole world in general can't help be more complex than one carefully built box. (Or even a carelessly built box).

Switching off ASI is a large-scale, war-like action that requires many additional capabilities, in
particular, if we demand that the off-switch happens globally. Throughout this paper, it is assumed
that humankind could have some required capabilities, like a solution to communicate reliably,
and safely using e.g., Key Safes [9], which prevents ASI from stealing encryption keys.

The superintelligence should be at least pretty good at manipulating and tricking humans. It has high bandwith communication channels to a large number of humans. If nothing else, it can trick humans into sending messages. Suppose you have some fancy encryption with perfect verification. A small number of paranoid people use it to communicate with each other. Meanwhile 99.9% of humanity uses other means of communication and has been tricked into believing that ASI is impossible. Now you have to arrange a global effort to eradicate all hard drives that 99.9% of people are opposed to. If the ASI is convincing enough, then it only needs some communication channel to the paranoid. Even a random person on the street pulling out an unsecured smartphone.

We assume that we can send encrypted, authenticatable signals to all devices via the Internet or Radio.

Yeah, good luck with that. Firstly, what do you mean "devices". You seem to intend to eradicate all means of information storage. Any device like a disconnected hard drive can't be remotely erased, because erasing hard drives takes energy. Data can be printed on paper. Or photographed off a screen with a chemical camera. Or embedded in the internal microstructure of a 3d print. Or memorized in a human brain. (Ie if the ASI comes up with plausible but wrong AI safety research, posts it online, and hopes someone reads it, and remembers the key points to use in future AI work) Or encoded in the patterns of wheelmarks and dust removal left by a roomba. Or sent to an already existing mars rover. Somewhere out there is a random Arduino + solar panel a farmer used to count sheep. The device is in the middle of a random field and totally unsecure. The farmer has since died, and no one knows that the device is there. The key insights to make an ASI of similar design might fit on a postcard. The golfed code might be only a page. The moment you make any attempt to censor data with the legal apparatus of the state, people will share it just to be perverse. This doesn't look like anything that has any chance of working, and that is before you include the superintelligence on the other side.

Remember, you are assuming a superintelligence that can freely communicate with a large number of people. Including people with electronics knowledge and tools that could be persuaded to bypass any security measures.

Furthermore, it is assumed that humans can detect early warning signs and that the infrastructure and the national security apparatus could be sufficiently hardened. Finally, we assume that humankind has agreed-upon secure and reliable consultation- and decision-making tools among its
representatives for agreeing and sending out a signal that can end the existence of ASI if required.

Ok. wishlist of things you are never going to get. I suspect "infrastructure" covers everything from combine harvesters to GPS satellites, to car engine computers to train points systems to factory robotic arms to .... And all the infrastructure making future infrastructure. So chip fabs. The computers that operating systems are compiled on, circuit board printers. In other words its a one size fits all bucket for all the computers that can't easily be turned off without massive damage, ie everything but social media.

However, there might be circumstances in which our deterrence would rely solely on drones and robots. Helping and guiding these drones directly within the local environment with tactical information (i.e., without the server-sided ASI Hunting Service) and without human assistance could accelerate the process of search and destroy significantly

Drones and robots capable enough to do the complex real world task of chasing down and destroying computers. Yet the ASI can't possibly use these drones itself (or other drones)

Key-Safe (KS) is an indispensable tool in ASI Safety and for the distributing of the Kill-ASI signal in particular. Without having this technology, it is doubtful that ASI Safety could be provided in any meaningful way. ASI could use Reverse Code Engineering to steal any key via modifying encryption/decryption software, all keys and every key-using algorithm need to be hidden and protected from the main CPU. In Trustworthy Encryption [9], all keys, including public keys, are stored in Key Safes allowing to exchange keys only with other key safes. Under no circumstances keys are shown in cleartext. The En- and Decryption happens in separate hardware-based Encryption and Decryption Units.
Without Key Safe, ASI could establish man-in-the-middle attacks and modify not just the software updating and hashcode validation process but also create a deadman “all-ok” signal pre-radioing the signal without creating the suspicion that this is happening, while the decision-making infrastructure has already been deactivated by ASI.

Any security mechanism can only work from where the message is encrypted to where it is decrypted. I mean, sure the ASI can probably reverse engineer your encryption scheme anyway, especially if it can persuade someone to open up the chip and scan it with a scanning tunneling microscope. But at the very least, humans can be tricked and hacked. The hardware encryption unit can be used to decrypt whatever, say by changing the keyboard mapping settings between each keystroke, so that "what the user typed, almost direct from the keyboard" is a string of the ASI's choice. Or maybe the whole device is stolen. The board is cut open, and the key safe sits there decrypting whatever the ASI feeds it.

Any decision making infrastructure can be fooled by the ASI. Any human can be tricked. Any passwords can be picked up on a security camera when a human types them in. (and then a robot is used to type the same password and keep the all-ok signal going.)

AI ALIGNMENT FORUM
AF

AI ALIGNMENT FORUM
AF

5

An open response to Wittkotter and Yampolskiy

5