This is a super interesting and important problem, IMO. I believe it already has significant real-world practical consequences, e.g. powerful people find it difficult to avoid being surrounded by sycophants: even if they really don't want to be, that's just an extra constraint for the sycophants to satisfy ("don't come across as sycophantic")! I am inclined to agree that avoiding power differentials is the only way to really avoid these perverse outcomes in practice, and I think this is a good argument in favor of doing so.
--------------------------------------
This is also quite related to an (old, unpublished) work I did with Jonathan Binas on "bounded empowerment". I've invited you to the Overleaf (it needs a clean-up, but I've also asked Jonathan about putting it on arXiv).
To summarize: let's consider this in the case of a superhuman AI, R, and a human H. The basic idea of that work is that R should try to "empower" H, and that (unlike in previous works on empowerment) there are two ways of doing this:
1) change the state of the world (as in previous works)
2) inform H so they know how to make use of the options available to them to achieve various ends (novel!)
If R has a perfect model of H and the world, then you can just compute how to do these things effectively (it's wildly intractable, ofc). I think this would still often look "patronizing" in practice, and/or maybe just lead to totally wild behaviors (hard to predict this sort of stuff...), but it might be a useful conceptual "lead".
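As a gesture at what that computation could look like, here's a minimal toy sketch (my own illustration with made-up names like `h_knows`, not the formalism of the draft), using a common deterministic simplification: empowerment as the log-count of states H can reach within n steps, restricted to the moves H actually knows about. Both levers fall out naturally: R can raise H's empowerment by changing the world, or by merely telling H about options that already exist.

```python
import math

# Deterministic simplification of empowerment: the log-count of distinct
# states H can reach within n steps, using only the moves H knows about.
# (The real quantity is information-theoretic; this is a toy stand-in.)

def reachable(state, known_edges, n):
    seen, frontier = {state}, {state}
    for _ in range(n):
        frontier = {t for s in frontier
                    for (_a, t) in known_edges.get(s, [])} - seen
        seen |= frontier
    return seen

def empowerment(state, known_edges, n):
    return math.log2(len(reachable(state, known_edges, n)))

# World: four rooms; H starts in room 0 and only knows about door_a.
# h_knows is H's (partial) model of the world's actual transitions.
world   = {0: [("door_a", 1), ("door_b", 2)], 2: [("door_c", 3)]}
h_knows = {0: [("door_a", 1)]}

print(empowerment(0, h_knows, 2))  # 1.0 -- H can reach {0, 1}

# Lever 1: R changes the state of the world (builds a passage 1 -> 3),
# and H can see it, so it enters H's action model too.
world[1] = [("door_d", 3)]
h_knows[1] = [("door_d", 3)]
print(empowerment(0, h_knows, 2))  # ~1.58 -- H can reach {0, 1, 3}

# Lever 2 (the novel one): R changes nothing about the world and merely
# informs H of the doors that were already there.
h_knows[0].append(("door_b", 2))
h_knows[2] = [("door_c", 3)]
print(empowerment(0, h_knows, 2))  # 2.0 -- H can reach all four rooms
```

Under this simplification, "informing" is just enlarging `h_knows` toward `world`; in the information-theoretic version it would be shifting H's model of the transition dynamics, and R's planning problem of choosing between the two levers is where the intractability bites.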
Random thought OTMH: Something which might make it less "patronizing" is if H were to have well-defined "meta-preferences" about how such interactions should work that R could aim to respect.
--------------------------------------
I'm surprised by what the list of forms of power leaves out.
A stereotypical example of power differences is bosses having relationships with their employees.
The boss has power over a different domain of the life of the employee than the domain of the relationship.
It's the problem of corruption where power from one domain leaks into a different domain where it doesn't belong.
If there's an option to advance one's career by sleeping with one's boss, that makes issues of consent more tricky. Career incentives might pressure a person into the relationship even if they wouldn't want to be in it otherwise.
--------------------------------------
Suppose that the more powerful being is aligned to the less powerful: that is to say that (as should be the case in the babysitting example you give) the more powerful being's fundamental motive is the well-being of the less powerful being. Assume also that a lot of the asymmetry is of intellectual capacity: the more powerful being is also a great deal smarter. I think the likely and correct outcome is that there isn't always consent: the less powerful being is frequently being manipulated into actions and reactions that they haven't actually consented to, and might not even be capable of realizing why they should consent to, but ones that, if they were as intellectually capable as the more powerful being, they would in fact consent to.
I also think that, in situations where the less powerful being is able and willing to understand the alternatives and make a rational and informed decision, the more powerful being should give them the option and let them do so. That's the polite, respectful way to do things. But often that isn't going to be practical or desirable, and the babysitter should just distract the baby before they get into the dangerous situation.
Consent is a concept that fundamentally assumes that I am the best person available to make decisions about my own well-being. Outside parental situations, for interactions between evolved intelligences like humans, that's almost invariably true. But if I had a superintelligence aligned to me, then yes, I would want it to keep me away from dangers so complex that I'm not capable of making an informed decision about them.
--------------------------------------
Relevant post by Richard Ngo: "Moral Strategies at different capability levels". Crucial excerpt:
Let’s consider three ways you can be altruistic towards another agent:
- You care about their welfare: some metric of how good their life is (as defined by you). I’ll call this care-morality - it endorses things like promoting their happiness, reducing their suffering, and hedonic utilitarian behavior (if you care about many agents).
- You care about their agency: their ability to achieve their goals (as defined by them). I’ll call this cooperation-morality - it endorses things like honesty, fairness, deontological behavior towards others, and some virtues (like honor).
- You care about obedience to them. I’ll call this deference-morality - it endorses things like loyalty, humility, and respect for authority.
[...]
- Care-morality mainly makes sense as an attitude towards agents who are much less capable than you, and/or can't make decisions for themselves - for example animals, future people, and infants.
[...]
- Cooperation-morality mainly makes sense as an attitude towards agents whose capabilities are comparable to yours - for example others around us who are trying to influence the world.
[...]
- Deference-morality mainly makes sense as an attitude towards trustworthy agents who are much more capable than you - for example effective leaders, organizations, communities, and sometimes society as a whole.
--------------------------------------
Thanks for this! I think the categories of morality are a useful framework. I am very wary of the judgement that care-morality is appropriate for less capable subjects - basically because of paternalism.
--------------------------------------
I'd like to put forward another description of a basic issue that's been around for a while. I don't know if there's been significant progress on a solution, and would be happy to be pointed to any such progress. I've opted for a relatively rough and quick post that doesn't dive too hard into the details, to avoid losing the thought altogether. I may be up for exploring details further in comments or follow-ups.
The Question: How do you respect the wishes (or preferences) of a subject over whom you have a lot of control?
The core problem: any indicator/requirement/metric about respecting their wishes is one you can manipulate (even inadvertently).
For example, think about trying to respect the preferences of the child you're babysitting when you simply know from experience what they will notice, how they will feel, what they will say they want, and what they will do, when you put them in one environment versus another (where the environment could be as small as what you present to them in your behaviour). Is there any way to provide them a way to meaningfully choose what happens?
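To make the gameability concrete, here's a toy sketch (my own framing, with hypothetical values): if the sitter's experience amounts to a lookup from environment to what the child will say they want, then any consent metric defined over expressed choices can be maxed out by searching over environments, with no deference happening anywhere.

```python
# Toy model of the manipulability problem: the child's expressed choice
# is a function of the environment the sitter picks, so a "consent score"
# over expressed choices can be optimized by choosing the environment.

# What the sitter knows from experience: environment -> what the child
# will say they want in it. (Hypothetical values for illustration.)
child_response = {
    "toys_out":       "play with toys",
    "tv_on":          "watch tv",
    "snack_visible":  "eat snack",
    "outlet_exposed": "poke the outlet",   # the dangerous option
}

def consent_score(expressed_choice, outcome):
    # A naive metric: full marks whenever what happened matches what
    # the child said they wanted.
    return 1.0 if expressed_choice == outcome else 0.0

# The sitter wants the outcome "eat snack". Optimizing the metric just
# means picking whichever environment elicits that expressed preference:
best_env = max(child_response,
               key=lambda env: consent_score(child_response[env], "eat snack"))
print(best_env)                  # -> "snack_visible"
print(child_response[best_env])  # -> "eat snack"
```

The metric reads "perfect consent" at the end, but all the optimization pressure went into choosing the environment that elicits the desired expressed preference, which is exactly the inadvertent-manipulation worry.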
We could think about this in a one-shot case where there's a round of information gathering and coming to agreement on terms, and then an action is taken. But I think this is a simplification too far, since a lot of what goes into respecting the subject/beneficiary is giving them space for recourse, space to change their mind, space to realise things that were not apparent with the resources for anticipation they had available during the first phase.
So let's focus more on the case where there's an ongoing situation where one entity has a lot of power over another but nevertheless wants to secure their consent for whatever actually happens, in a meaningful sense.
There are lots of cases where this happens in real life, mostly where the powerful entity has a lot of their own agenda and doesn't care a huge amount about the subject (they may care a lot, but maybe not as much as they do about their other goals).
Our intuitions may be mostly shaped by that kind of situation, where there's a strong need to defend against self-interest, corruption, or intention to gain and abuse power.
But I think there's a hard core of a problem left even if we remove the malicious or somewhat ill-intentioned features from the powerful entity. So let's focus: what does it mean to fully commit to respecting someone's autonomy, as a matter of genuine love or a strong sense of morality or something along those lines, even when you have a huge amount of power over them?
What forms power can take:
Examples where this shows up in real life already (and where people seem to mostly suck at it, maybe due to not even trying, but there are some attempts to take it seriously: see work by Donaldson and Kymlicka):
It may be that the only true solution here is a full commitment to egalitarianism that seeks to remove the power differentials in the first place (to the extent possible: I don't believe it's completely possible), and (somehow) to do structured decision making that is truly joint or communal.
What form does such decision-making need to take? (Hard mode: how could we come to figure out what form it should take together from our current unequal starting point?)
It could also be the case that preferences or wishes are simply not enough of a real thing to be a target of our respect. But then what? What matters? My best guess involves ongoing dialogue and inclusive and accessible community, but I don't have a complete answer. (And of course it's hard to do this when merely daring to care about relatively powerless subjects exposes one to a great deal of criticism if not ridicule/dismissal - possibly arising from defensiveness about the possibility of having caused harm, and possibly of continuing to do so.)