Very interesting. I like the long list of examples, as it helped me get my head around the idea.
So, I've been thinking a bit about similar topics, but in relation to a long reflection on value lock-in.
My basic thesis was that reversibility is what humanity should optimise for in general, since we want to be able to reach as large a part of the "moral search space" as possible.
The concept of corrigibility you are pointing towards here seems closely related to notions of reversibility. You don't want to take actions that cannot later be r...
I love your stuff and I'm very excited to see where you go next.
I would be very curious to hear what you have to say about more multipolar threat scenarios and about extending theories of agency into the collective intelligence frame.
What are your takes on Michael Levin's work on agency and morphogenesis in relation to your neuroscience ideas? What do you think about claims that these models extend hierarchically? How does this affect multipolar threat models? What are the fundamental processes that we should care about? When should we expand these concepts cognitively, and when should we constrain them?