This post presents some old work on combining different possible utility functions, which seems worth sharing with the world.
I've written before about the problem of reaching an agreement between agents with different utility functions. The problem re-appears if you yourself are uncertain between two different moral theories.
For example, suppose you gave 99% credence to average utilitarianism and 1% credence to total utilitarianism. In an otherwise empty universe, you can create one person with 2 utility, or a thousand with 1 utility.
If we naively computed the expected utility of both actions, we would get 0.99(2)+0.01(2)=2 for the first choice, and 0.99(1)+0.01(1000)=10.99 for the second. It therefore seems that total utilitarianism wins by default, even though it is very unlikely (for you).
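The naive calculation can be written out directly (a minimal sketch; the helper name is mine):

```python
# A quick check of the naive expected-utility mix between the two moral theories.
def naive_value(total_utility, population):
    average = total_utility / population
    # 99% credence in average utilitarianism, 1% in total utilitarianism
    return 0.99 * average + 0.01 * total_utility

value_first = naive_value(2, 1)         # one person with 2 utility
value_second = naive_value(1000, 1000)  # a thousand people with 1 utility each
```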
But the situation can be worse. Suppose that there is a third option, which creates ten thousand people, each with 0.000001 utility. And suppose you have 99% credence in average utilitarianism, (1−10^−100)% credence in total utilitarianism, and 10^−100% credence in exponential utilitarianism, where the average utility is multiplied by two to the power of the population size. In this case the third option - and the incredibly unlikely exponential utilitarianism - win out massively.
Normalising utilities
To prevent the large-population-loving utilities from winning out by default, it's clear we need to normalise the utilities in some way before adding them together, similarly to how you normalise the utilities of opposing agents.
I'll distinguish two methods here: individual normalisations, and collective normalisations. For individual normalisations, if you have credences of pi for utilities ui∈U, then ui is normalised into ^ui using some procedure that is independent of pi, pj, and uj for j≠i. Then the normalised utilities are added to give your total utility function of:
u=∑ipi^ui.
In collective normalisations, the normalisation of ui into ^ui is allowed to depend upon the other utilities and the credences. All Pareto outcomes for the utilities are equivalent (modulo resolving ties) to maximising such a u.

The Nash Bargaining Equilibrium and the Kalai-Smorodinsky Bargaining Solution are both collective normalisations; the Mutual Worth Bargaining Solution is an individual normalisation iff the choice of the default point is individual (but doing that violates the spirit of what that method is supposed to achieve).

Note that there are no non-dictatorial Pareto normalisations, whether individual or collective, that are independent of irrelevant alternatives, or that are immune to lying.

Individual normalisations
Here I'll present the work that I did with Owen Cotton-Barratt, Toby Ord, and Will MacAskill, in order to try and come up with a principled way of doing individual normalisations. In a certain sense, this work failed: we didn't find any normalisations that were clearly superior in every way to others. But we did find a lot about the properties of the different normalisations; one interesting thing is that the dumbest normalisation - the zero-one, or min-max - has surprisingly good properties.
Let O be the option set for the agent: the choices that it can make (in our full treatment, we considered a larger set containing O, the normalisation set, but this won't be needed here).
For the purpose of this post, O will be equal to Π={πj}, the set of deterministic policies the agent can follow; this feels like a natural choice, as it's what the agent really has control over.
For any ui∈U and πj∈Π, there is the expected utility of ui conditional on the agent following policy πj; this will be designated by ui(πj).
We may have a probability distribution q over O=Π (maybe defined by the complexity of the policy?). If we don't have such a distribution, and the set of deterministic policies is finite, then we can set q to be the uniform distribution.
Then, given q, each ui becomes a real-valued random variable, taking value ui(πj) with probability q(πj). We'll normalise these ui by normalising the properties of this random variable.
First of all, let's exclude any ui that are constant on all of Π; these utilities cannot be changed, in expectation, by the agent's policies, so should make no difference. Then each ui, seen as a random variable, has the following properties:
Maximum: max ui = maxπj ui(πj).
Minimum: min ui = minπj ui(πj).
Mean: μi = ∑j q(πj) ui(πj).
Variance: σi² = ∑j q(πj) ui(πj)² − μi².
Mean difference: δi = ∑j,k q(πj) q(πk) |ui(πj) − ui(πk)|.
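These properties can be computed directly; here is a minimal sketch in Python (the function name and example values are my own):

```python
import itertools

def utility_stats(values, q):
    """Properties of a utility ui viewed as a random variable that takes the
    value ui(pi_j) with probability q(pi_j). `values` and `q` are parallel
    lists over the deterministic policies."""
    mean = sum(qj * v for qj, v in zip(q, values))
    variance = sum(qj * v * v for qj, v in zip(q, values)) - mean ** 2
    mean_diff = sum(qj * qk * abs(vj - vk)
                    for (qj, vj), (qk, vk) in itertools.product(list(zip(q, values)), repeat=2))
    return {"max": max(values), "min": min(values),
            "mean": mean, "variance": variance, "mean_diff": mean_diff}

# Uniform q over three policies with expected utilities 0, 1 and 2:
stats = utility_stats([0.0, 1.0, 2.0], [1/3, 1/3, 1/3])
```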
There are five natural normalisation methods that emerge from these properties. The first and most trivial is the min-max or zero-one normalisation: scale and translate ui so that minui takes the value 0 and maxui takes the value 1 (note that the translation doesn't change the desired policy when summing utilities, so what is actually required is to scale ui so that (maxui)−(minui)=1).
The second normalisation, the mean-max, involves setting (maxui)−μi=1; by symmetry, the min-mean normalisation involves setting μi−(minui)=1.
Finally, the last two normalisations involve setting either the variance, or the mean difference, to 1.
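The five scalings can be sketched as follows, assuming a finite policy set (names and example values are mine; note that for the variance method, dividing by the standard deviation is what sets the variance to 1):

```python
import itertools
import math

def normalisation_scale(values, q, method):
    """Return the constant c such that scaling the utility by 1/c sets the
    chosen statistic to 1 (translations don't affect the chosen policy)."""
    mean = sum(qj * v for qj, v in zip(q, values))
    if method == "min-max":
        c = max(values) - min(values)
    elif method == "mean-max":
        c = max(values) - mean
    elif method == "min-mean":
        c = mean - min(values)
    elif method == "variance":
        # dividing by the standard deviation sets the variance to 1
        c = math.sqrt(sum(qj * v * v for qj, v in zip(q, values)) - mean ** 2)
    elif method == "mean difference":
        c = sum(qj * qk * abs(vj - vk)
                for (qj, vj), (qk, vk) in itertools.product(list(zip(q, values)), repeat=2))
    else:
        raise ValueError(method)
    return c

# Uniform q over three policies with expected utilities 0, 1 and 2:
values, q = [0.0, 1.0, 2.0], [1/3, 1/3, 1/3]
scales = {m: normalisation_scale(values, q, m)
          for m in ["min-max", "mean-max", "min-mean", "variance", "mean difference"]}
```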
Meaning of the normalisations
What do these normalisations mean? Well, min-max is a normalisation that cares about the difference between perfect utopia and perfect dystopia: between the best possible and the worst possible expected outcome. Conceptually, this seems problematic - it's not clear why the dystopia matters, which seems like something that opens the utility up to extortion - but, as we'll see, the min-max normalisation has the best formal properties.
The mean-max is the normalisation that most appeals to me; the mean is the expected value of a random policy, while the max is the expected value of the best policy. In a sense, that's the job of an agent with a single utility function: to move the outcome from random to best. Thus the max has a meaning that the min, for instance, lacks.
For this reason, I don't see the min-mean normalisation as being anything meaningful; it's the difference between complete disaster and a random policy.
I don't fully grasp the meaning of the variance normalisation; Owen Cotton-Barratt did the most work on it, and showed that, in a certain sense, it was resistant to lying/strategic distortion in certain circumstances, if a given utility didn't 'know' what the other utilities would be. But I didn't fully understand this point. So bear in mind that this normalisation has positive properties that aren't made clear in this post.
Finally, the mean difference normalisation controls the spread between the utilities of the different policies, in a linear way that may seem to be more natural than the variance.
Properties of the normalisation
So, which normalisation is best? Here is where we look at the properties of the normalisations (they will be summarised in a table at the end). As we've seen, independence of irrelevant alternatives always fails, and there can always be an incentive for a utility to "lie" (as in, there are U, ui∈U, p, Π, and q, such that ui would get a higher expected utility under the final u if it were replaced with u′i≠ui).
What other properties do all the normalisations share? Well, since they normalise independently, u is continuous in p. And because the minimum, maximum, variance, etc. are continuous in q and in the ui(πj), u is also continuous in that information.
In contrast, the best policy argmaxπju(πj) of u is not typically continuous in the data. Imagine that there are two utilities and two policies: u0(π0)=u1(π1)=1 and u0(π1)=u1(π0)=0. Then for p0<1/2, π1 is the optimal policy (for all the above normalisations for uniform q), while for p0>1/2, π0 is optimal.
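This two-policy example can be checked directly (a minimal sketch; under uniform q, both utilities here get the same scale from every normalisation above, so they can be combined without rescaling):

```python
def best_policy(p0):
    """Two utilities over two policies, as in the example: u0 = (1, 0) and
    u1 = (0, 1) on (pi0, pi1). The combined utility is u = p0*u0 + (1-p0)*u1."""
    u = [p0 * 1 + (1 - p0) * 0,   # combined expected utility of pi0
         p0 * 0 + (1 - p0) * 1]   # combined expected utility of pi1
    return max(range(2), key=lambda j: u[j])
```

The optimal policy jumps discontinuously from pi1 to pi0 as p0 crosses 1/2.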
Ok, that's enough of properties that all methods share; what about ones they don't?
First of all, we can look at the negation symmetry between ui and −ui. Min-max, variance, and mean difference all give the same normalisation for ui and −ui; mean-max and min-mean do not, since the mean can be closer to the min than to the max (or vice versa).
Then we can consider what happens when some policies πj and πk are clones of each other: imagine that for all ui∈U, ui(πj)=ui(πk). Then what happens if we remove the redundant πk and normalise on Π−{πk}? Well, it's clear that the maximum or minimum value of ui cannot change (since if πk was a maximum/minimum, then so is πj, which remains), so the min-max normalisation is unaffected.
All the other normalisations change, though. This can be seen in the example U={u0,u1}, Π={π0,π1,π2,π3}, where, writing the expected utilities over (π0,π1,π2,π3) as a tuple, u0 has (1,0,−1,−1) and u1 has (0,1,−1,−1). Here π3 is a clone of π2 for both utilities; but for uniform q, removing π3 changes the mean, variance, and mean difference of both utilities, and hence changes every normalisation method other than min-max.
Thus all the other normalisations change when we add (or remove) clones of existing policies.
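The effect of a clone can be verified numerically; the values below are my own illustration, with uniform q over the policies:

```python
def scales(values):
    """Min-max and mean-max scaling constants under uniform q."""
    mean = sum(values) / len(values)
    return {"min-max": max(values) - min(values),
            "mean-max": max(values) - mean}

with_clone = [1.0, 0.0, -1.0, -1.0]   # a utility where pi2 and pi3 are clones
without_clone = [1.0, 0.0, -1.0]      # the same utility with pi3 removed

# Removing the clone shifts the mean (from -1/4 to 0), so the mean-max
# scale changes, while the min-max scale stays at 2.
```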
Finally, we can consider what happens if we are in one of several worlds, and the policies/utilities are identical in some of these worlds. This should be treated the same as if those identical worlds were all one.
So, imagine that we are in one of three worlds: W0, W1, and W2, with probabilities ρ0, ρ1, and ρ2, respectively. Before taking any actions, the agent will discover which world it is in. Thus, if Πi is the set of policies in Wi, the complete set of policies is Π0×Π1×Π2.
The worlds W1 and W2 are, however, indistinguishable for all utilities in U. Thus we can identify Π1≅Π2 via a bijection f:Π1→Π2, with ui(πj)=ui(f(πj)) for all ui∈U and all πj∈Π1. A normalisation method then combines indistinguishable choices if its normalisation is the same in the world ρ0W0+ρ1W1+ρ2W2 as in the world ρ0W0+(ρ1+ρ2)W1. Then:
Min-max, mean-max, and min-mean combine indistinguishable choices. Variance and mean difference normalisations do not.
Proof (sketch): Let uji=ui|Wj be the random variable that is ui on Wj, under the assumption that Wj is the true underlying world. Then on ρ0W0+ρ1W1+ρ2W2, ui behaves like the random variable ρ0u0i+ρ1u1i+ρ2u2i (this means that ui has probability ρj of being uji, not that independent random variables are added together). Since the agent can optimise its policy in each world separately, mean, max, and min are all linear in this decomposition: f(ρ0X0+ρ1X1+ρ2X2)=ρ0f(X0)+ρ1f(X1)+ρ2f(X2). Variance and mean difference, on the other hand, are not.
Summary of properties
In my view, it is a big worry that the variance and mean difference normalisations fail to combine indistinguishable choices. Worlds W1 and W2 could be strictly identical, except for some irrelevant information that all utility functions agree is irrelevant. We would then have to worry about whether the light from a distant star is slightly redder or slightly bluer than expected; what colour ink was used in a proposal; the height of the next animal we see; and so on.
This means that we cannot divide the universe into relevant and irrelevant variables, and focus solely on the first.
In table form, the various properties are:

Normalisation      Negation symmetry   Unaffected by clones   Combines indistinguishable choices
min-max            Yes                 Yes                    Yes
mean-max           No                  No                     Yes
min-mean           No                  No                     Yes
variance           Yes                 No                     No
mean difference    Yes                 No                     No
As can be seen, the min-max method, simplistic though it is, has all the possible nice properties.