I want to start my reply by saying I am dubious that the best future for humanity is one in which a super-intelligence we build hands all control and decision-making to humans. However, the tone of the post feels too anti-human (treating a future where humans have greater agency as necessarily "bad" rather than merely sub-optimal) and too narrow in its interpretation for me to move on without comment. There is a lot to be learned from considering the necessary conflict between human and FAI agency. Yes, conflict.

The first point I don't fully agree with is that humans lack the capacity to change or grow, even as adults. You cite the lack of growth in wisdom across the human population even though people have been calling for it for millennia. There are many possible reasons for this besides humans being incapable of growth. For one, human psychology has only been studied scientifically for about a century, and only recently in detail with modern instruments. One of the greatest difficulties of passing down wisdom is the act of teaching itself: effective teaching usually has to be personal, tailored to an individual's current state of mind, background knowledge, and skills. Even so, twin studies have found even such a rough measure as IQ to be roughly 70% heritable, leaving a substantial 30% to environment--and that is within environments far from optimal for teaching.

Further, practices found within, say, Buddhism show potential for increasing one's capacity to act on abstract empathy. Jainism, as a religion, seems to have teachings and practices strong enough to make its followers one of the most peaceful groups on the planet, without material conditions being a necessary cause. These are within even current humans' capacities to achieve. I will also point out the potential of psychedelic substances to allow adults who are set in their ways to break out of them, though the research is still relatively new and unexplored. I absolutely agree that human competence will never be able to compete with powerful AI competence, but that's not really the point. Human values (probably) allow for non-Homo Sapiens to evolve out of--or be created by--"regular" old humans.

This is a good point to move on to institutions, and why human-led institutions tend to be so stifling. There are a few points to consider here, and I'd like to start with centralization versus decentralization. Under centralization, yes, we do have a situation where people are essentially absorbed into a whole which acts much like a separate entity in its own right. However, with greater decentralization--where higher organizations are composed of collections of smaller organizations, which are composed of yet smaller orgs, and so on down to the individual--individuals generally have greater agency. Of course, they must still participate in a game with other agents--but then we get into the question of whether an agent has more or less agency alone in the wild with no assistance, or working with other agents towards collective, shared values. I can't even try to answer this in a reply (or in a book), but the point is this isn't cut and dried.

We should also consider that current institutions are set up competitively, not only with competition between institutions but also with their members competing with each other. This leads to multipolar traps in abundance, with serious implications for those institutions' capacity to be maximally effective. I think the question of whether institutions must necessarily act like this is an open one--and a problem we didn't even know how to talk about until recently. I am reminded of Daniel Schmachtenberger and "Game B".

Finally, to the point about conflict. I'd like to consider instances where it makes sense for a Friendly AI to want to increase a human's agency. To be clear, there is a vital difference between "increasing" a quantity and "maximizing" it. A sufficiently Friendly AI should easily recognize that human values need to be balanced against one another, and that no single value will preclude all others. With that said, agency is valuable to humans at least in the cases of:

i) Humans intrinsically valuing autonomy. Whether or not you believe we do, if it is true, a Friendly agent would wish to increase it to a sensible point.

ii) Human values may be computationally intractable, especially when discussing multiple humans. There is (as far as we know) a finite amount of accessible information in the universe. No matter how smart an AI gets, there are probably limits on its ability to compute the best possible world. A human seems to be, in many senses, "closer" to their own values than an outsider looking in. It is not necessarily true that a human's actions and desires are perfectly simulatable--we are made of quantum objects, after all, and even a theory of everything would not mean literally anything is possible. It may be more efficient to teach humans how to achieve their own happiness than for the AI to do everything for us, at least in cases like, say, fetching a banana from the table.

iii) An AI may wish to give its (potential) computational resources to humans, to increase their capacity to fulfill their values. This is really more of an edge case, but... there will surely be a point where something like "devoting x planet to y computronium versus having z human societies/persons" will need to be decided. That is, at some point an agent will have to decide how many resources to spend predicting human values and the paths towards them versus having humans actually instantiate the appreciation of those values. If we're talking about maximizing human values, would it not make sense for as much of the universe as possible to be devoted to such things? Consider the paper clip maker which, at the end of its journey of making the universe worthless, finally turns itself into paper clips just to put a cherry on top and leave a universe unable even to appreciate its paper-clip-ness. Similarly, the FAI may want to back off at the extremes to increase humans' capacity to enjoy their lives.

In terms of formalizing ideas of this nature, I am aware of at least one attempt in information theory: the concept of "empowerment", defined as the channel capacity--the maximum achievable mutual information--between an agent's potential actions and its resulting future state. It may be something to look into, though I don't think it's perfect.
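To make that concrete, here is a minimal sketch of computing one-step empowerment for a toy discrete environment. It treats the environment as a channel p(s'|a) from actions to next states and finds the capacity with the standard Blahut-Arimoto iteration; the toy environment and the function name `empowerment` are my own illustration, not anything from the post:

```python
import numpy as np

def empowerment(channel, iters=500, tol=1e-12):
    """One-step empowerment of a discrete agent: the channel capacity
    max_{p(a)} I(A; S'), computed via Blahut-Arimoto. Returns bits.

    channel: array of shape (n_actions, n_states); row a is p(s' | a).
    """
    n_actions, _ = channel.shape
    p_a = np.full(n_actions, 1.0 / n_actions)  # start from uniform actions
    for _ in range(iters):
        p_s = p_a @ channel                                      # p(s')
        q = (p_a[:, None] * channel) / np.maximum(p_s, 1e-300)   # q(a|s')
        # Update: p(a) proportional to exp(sum_s' p(s'|a) log q(a|s'))
        log_r = np.sum(channel * np.log(np.maximum(q, 1e-300)), axis=1)
        r = np.exp(log_r - log_r.max())
        p_new = r / r.sum()
        if np.max(np.abs(p_new - p_a)) < tol:
            p_a = p_new
            break
        p_a = p_new
    # Mutual information I(A; S') at the optimizing action distribution
    p_s = p_a @ channel
    ratio = np.maximum(channel, 1e-300) / np.maximum(p_s, 1e-300)
    return float(np.sum(p_a[:, None] * channel * np.log2(ratio)))

# Two reliable actions and one coin-flip action: the noisy action adds
# nothing the agent can control, so empowerment is 1 bit, not log2(3).
channel = np.array([
    [1.0, 0.0],   # action 0 always reaches state 0
    [0.0, 1.0],   # action 1 always reaches state 1
    [0.5, 0.5],   # action 2 is a coin flip
])
print(empowerment(channel))  # ~1.0
```

The toy example shows the flavor of the measure: it counts only the futures an agent can reliably steer toward, which is why it reads as a formalization of "agency" rather than of raw activity.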

Sorry for the length, but I hope it was at least thought-provoking.