'define a system that will let you press its off-switch without it trying to make you press the off-switch' presents no challenge at all to them...
...building a Thing all of whose designs and strategies will also contain an off-switch, such that you can abort them individually and collectively and then get low impact beyond that point. This is conceptually a part meant to prevent an animated broom with a naive 'off-switch' (one that turns off just that broom) from animating other brooms that don't have off-switches in them, or from building some other automatic cauldron-filling process.


Modal Fixpoint Cooperation without Löb's Theorem

Christopher King, 2y

Wouldn't this also let you prove "not E"? 🤔 I think this system might be inconsistent.

EDIT: nvm, I guess it's assumed that the agents are some kind of FairBot (https://www.lesswrong.com/posts/iQWk5jYeDg5ACCmpx/robust-cooperation-in-the-prisoner-s-dilemma#Previously_known__CliqueBot_and_FairBot), which introduces an asymmetry between cooperate and defect.
