User Comment Replies — AI Alignment Forum

I believe this has been proposed before (I'm not sure what the first time was).

The main obstacles is that this still doesn't solve impact regularization, and a more generalized type of shutdownability then you presented.

'define a system that will let you press its off-switch without it trying to make you press the off-switch' presents no challenge at all to them...
...building a Thing all of whose designs and strategies will also contain an off-switch, such that you can abort them individually and collectively and then get low impact beyond that point.

... (read more)

Shutdown-Seeking AI

Christopher King2y32

gwern2y51

This has been proposed before (as their citations indicate), and this particular proposal does not seem to introduce any particularly novel (or good) solutions.

I think the problems with myopic agents (of which this is but a special case) are made clearer by looking at current LLMs like the hobbyist AutoGPT. Most discussions of myopic agents seem to have in mind a simplistic scenario of a single persistent agent located in a single PC running only 1 computation at a time with no self-modification or ML-related programming or change of any parameters; and t... (read more)

AI ALIGNMENT FORUM
AF

All of Christopher King's Comments + Replies