Generalizing Foundations of Decision Theory

[-]Scott Garrabrant9y40

I am not optimistic about this project. My primary reason is that decision theory has two parts. First, there is the part that is related to this post, which I'll call "Expected Utility Theory." Then, there is the much harder part, which I'll call "Naturalized Decision Theory."

I think expected utility theory is pretty well understood, and this post plays around with details of a well understood theory, while naturalized decision theory is not well understood at all.

I think we agree that the work in this post is not directly related to naturalized decision theory, but you think it is going to help anyway.

My understanding of your argument (correct me if I am wrong) is that probability theory is to logical uncertainty (LU) as expected utility theory is to naturalized decision theory (NDT), and Dutch books led to LU progress, so VNM-ish arguments should lead to NDT progress.

I challenge this in two ways.

First, Logical Inductors look like Dutch books, but this might be because things related to probability theory can be talked about with Dutch books. I don't think that thinking about Dutch books led to the invention of Logical Inductors (although maybe it would have if I had followed the right path), and I don't think that the post hoc connection provides much evidence that thinking about Dutch books is useful. Perhaps whenever you have a theory, you can do this formal justification stuff, but formal justification does not create theories.

I realize that I actually do not stand behind this first challenge very much, but I still want to put it out there as a possibility.

Second, I think that in a way Logical Uncertainty is about resource bounded Probability theory, and this is why a weakening of dutch books helped. On the other hand, Naturalized Decision Theory is not about resource bounded Expected Utility Theory. We made a type of resource bounded Probability theory, and magically got some naturalistic reasoning out of it. I expect that we cannot do the same thing for decision theory, because the relationship is more complicated.

Expected Utility Theory is about your preferences over various worlds. If you follow the analogy with LI closely and succeed, we will be able to extend it to preferences over various worlds which contain yourself. This seems very far from a solution to naturalized decision theory. In fact, it does not feel that far from what we might be able to easily do with existing Expected Utility Theory plus logical inductors.

Perhaps I am attacking a straw man, and you mean “do the same thing we did with logical induction” less literally than I am interpreting it, but in this case there is way more special sauce in the part about what you do to generalize expected utility theory, so I expect it to be much harder than the Logical Induction case.

[-]jessicata9y30

On the other hand, Naturalized Decision Theory is not about resource bounded Expected Utility Theory.

I think there's a sense in which I buy this but it might be worth explaining more.

My current suspicion is that "agents that have utility functions over the outcome of the physics they are embedded in" is not the right concept for understanding naturalized agency (in particular, the "motive forces" of the things that emerge from processes like abiogenesis/evolution/culture/AI research). This concept is often argued for using dutch-book arguments (e.g. VNM). I think these arguments are probably invalid when applied to naturalized agents (if taken literally they assume something like a "view from nowhere" and unbounded computation, etc). As such, re-examining what arguments can be made about coherent naturalized agency while avoiding inscription errors* seems like a good path towards recovering the correct concepts for thinking about naturalized agency.

*I'm getting the term "inscription error" from Brian Cantwell Smith (On the Origin of Objects, p. 50):

It is a phenomenon that I will in general call an inscription error: a tendency for a theorist or observer, first, to write or project or impose or inscribe a set of ontological assumptions onto a computational system (onto the system itself, onto the task domain, onto the relation between the two, and so forth), and then, second, to read those assumptions or their consequences back off the system, as if that constituted an independent empirical discovery or theoretical result.

[-]abramdemski9y20

I think expected utility theory is pretty well understood, and this post plays around with details of a well understood theory, while naturalized decision theory is not well understood at all.

I think most of our disagreement actually hinges on this part. My feeling is that I, at least, don't understand EU well enough; when I look at the foundations which are supposed to argue decisively in its favor, they're not quite as solid as I'd like.

If I were happy with the VNM assumption of probability theory (which I feel is circular, since Dutch Book assumes EU), I think my position would be similar to this (linked by Alex), which strongly agrees with all of the axioms but continuity, and takes continuity as provisionally reasonable. Continuity would be something to maybe dig deeper into at some point, but not so likely to bear fruit that I'd want to investigate right away.

However, what's really interesting is justification of EU and probability theory in one stroke. The justification of the whole thing from only money-pump/dutch-book style arguments seems close enough to be tantalizing, while also having enough hard-to-justify parts to make it a real possibility that such a justification would be of an importantly generalized DT.

First, [...] I don’t think that thinking about Dutch books led to the invention of Logical Inductors (although maybe it would have if I had followed the right path), and I don’t think that the post hoc connection provides much evidence that thinking about Dutch books is useful.

All I have to say here is that I find it somewhat plausible from an outside view; an insight from a result need not be an original generator of the result. I think max-margin classifiers in machine learning are like this; the learning theory which came from explaining why they work was then fruitful in producing other algorithms. (I could be wrong here.)

Second, I think that in a way Logical Uncertainty is about resource bounded Probability theory, and this is why a weakening of dutch books helped. On the other hand, Naturalized Decision Theory is not about resource bounded Expected Utility Theory.

I don't think naturalized DT is exactly what I'm hoping to get. My highest hope that I have any concrete reason to expect is a logically-uncertain DT which is temporally consistent (without a parameter for how long to run the LI).

[-]AlexMennen9y30

I take the axiom of independence to be tier two: an intuitively strong rationality principle, but not one that’s enforced by nasty things that happen if we violate it. It surprises me that I’ve only seen this kind of justification for one of the four VNM axioms. Actually, I suspect that independence could be justified in a tier-one way; it’s just that I haven’t seen it.

Suppose A < B but pA+(1-p)C > pB + (1-p)C. A genie offers you a choice between pA+(1-p)C and pB + (1-p)C, but charges you a penny for the former. Then if A is supposed to happen rather than C, the genie offers to make B happen instead, but will charge you another penny for it. If you pay two pennies, you're doing something wrong. (Of course, these money-pumping arguments rely on the possibility of making arbitrarily small side payments.)
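The genie argument above can be traced step by step. The following is only a minimal sketch: the function name `money_pump`, the flags `prefers_a_mixture` and `prefers_b_to_a`, and the choice p = 1/3 are illustrative assumptions, not part of the original argument. It shows that the incoherent agent ends up holding exactly the lottery pB + (1-p)C that was available for free, while having paid for it.

```python
from fractions import Fraction

def money_pump(p, prefers_a_mixture=True, prefers_b_to_a=True):
    """Return (final lottery, expected pennies paid) after the genie's two offers.

    Hypothetical agent: the two flags encode its strict preferences.
    """
    # The "first" branch has probability p, the "C" branch has probability 1 - p.
    # The free default is the lottery pB + (1-p)C.
    first_outcome, paid_first, paid_c = "B", 0, 0
    if prefers_a_mixture:
        # Offer 1: pay a penny (in every branch) to hold pA + (1-p)C instead.
        first_outcome, paid_first, paid_c = "A", 1, 1
    if first_outcome == "A" and prefers_b_to_a:
        # Offer 2: made only when A is about to happen; pay a penny to swap A for B.
        first_outcome, paid_first = "B", paid_first + 1
    lottery = {first_outcome: p, "C": 1 - p}
    expected_cost = p * paid_first + (1 - p) * paid_c
    return lottery, expected_cost

p = Fraction(1, 3)
pumped = money_pump(p)                              # A < B, yet prefers the A-mixture
coherent = money_pump(p, prefers_a_mixture=False)   # takes the free B-mixture
print(pumped)
print(coherent)
```

Both agents finish with the same lottery over B and C; the only difference is the expected cost, which is 1 + p pennies for the agent whose preferences violate independence.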

I think many people would put continuity at tier two, a strong intuitive principle. I don’t see why, personally. For me, it seems like an assumption which only makes sense if we already have the intuition that expected utility is going to be the right way of doing things. This puts it in tier 3 for me; another structural axiom.

Sure, it is structural, but your description of structural axioms made them sound like something it would be better not to have to accept, in case they end up not being true, which would be very inconvenient for the theorem. But if the continuity axiom is not an accurate description of your preferences, pretending it is changes almost nothing, so accepting the continuity axiom anyway seems well-justified from a pragmatic point of view. See this and this (section "Doing without Continuity") for explanations.

Savage chooses not to define probabilities on a sigma-algebra. I haven’t seen any decision-theorist who prefers to use sigma-algebras yet. Similarly, he only derives finite additivity, not countable additivity; this also seems common among decision theorists.

This is annoying. Does anyone here know why they do this? My guess is that it's because their nice theorems about the finite case don't have straightforward generalizations that refer to sigma-algebras (I'm guessing this mainly because it appears to be the case for the VNM theorem, which only works if lotteries can only assign positive probability to finitely many outcomes).

[-]Vanessa Kosoy9y00

Savage chooses not to define probabilities on a sigma-algebra. I haven’t seen any decision-theorist who prefers to use sigma-algebras yet. Similarly, he only derives finite additivity, not countable additivity; this also seems common among decision theorists.

This is annoying. Does anyone here know why they do this? My guess is that it’s because their nice theorems about the finite case don’t have straightforward generalizations that refer to sigma-algebras (I’m guessing this mainly because it appears to be the case for the VNM theorem, which only works if lotteries can only assign positive probability to finitely many outcomes).

Is it indeed the case that the VNM theorem cannot be generalized to the measure-theoretic setting?

Hypothesis: Let $X$ be a compact Polish space. Let $R \subseteq P(X) \times P(X)$ be closed in the weak topology and satisfy the VNM axioms (in the sense that $μ \leq ν$ iff $(μ, ν) \in R$). Then there exists a continuous $u : X \to \mathbb{R}$ s.t. $(μ, ν) \in R$ iff $E_{μ}[u] \leq E_{ν}[u]$.

Counterexamples?
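For orientation, the converse direction of the hypothesis is routine, so only the existence of $u$ is really in question. A sketch, assuming the weak topology on $P(X)$ is the usual one (the coarsest topology making $μ \mapsto \int f \, dμ$ continuous for every bounded continuous $f$); the helper name $F$ is introduced here only for illustration:

```latex
% Assumption: u : X -> \mathbb{R} is continuous and X is compact, so u is bounded
% and \mu \mapsto E_\mu[u] is continuous in the weak topology.
\[
  F(\mu, \nu) \;=\; E_{\nu}[u] - E_{\mu}[u]
  \quad\text{is continuous on } P(X) \times P(X),
\]
\[
  R \;=\; \{(\mu, \nu) : E_{\mu}[u] \le E_{\nu}[u]\}
    \;=\; F^{-1}\bigl([0, \infty)\bigr)
  \quad\text{is therefore closed.}
\]
```

So any preference relation represented by a continuous utility function is automatically closed; the hypothesis asks whether closedness plus the VNM axioms is also sufficient.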

One is also tempted to conjecture a version of the above where $X$ is just a measurable space, $R$ is closed in the strong convergence topology and $u$ is just measurable. However, there's the issue that if $u$ is not bounded from either direction, there will be $μ$ s.t. $E_{μ} [u]$ is undefined. Does it mean $u$ automatically comes out bounded from one direction? Or that we need to add an additional axiom, e.g. that there exists $μ$ which is a global minimum (or maximum) in the preference ordering?
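The undefined-expectation issue can be made concrete. The construction below is a sketch under the stated assumption that $u$ is measurable and unbounded in both directions; the points $x_n$, $y_n$ are chosen for illustration, and Dirac measures are used because they are defined on any measurable space.

```latex
% Assumption: u is unbounded above and below, so for each n >= 1 we can pick
% points x_n, y_n in X with u(x_n) >= 2^n and u(y_n) <= -2^n.
\[
  \mu \;=\; \sum_{n \ge 1} 2^{-(n+1)} \bigl(\delta_{x_n} + \delta_{y_n}\bigr),
  \qquad
  \mu(X) \;=\; \sum_{n \ge 1} 2^{-n} \;=\; 1 .
\]
\[
  \int u^{+} \, d\mu \;\ge\; \sum_{n \ge 1} 2^{-(n+1)} \, 2^{n} \;=\; \infty,
  \qquad
  \int u^{-} \, d\mu \;\ge\; \sum_{n \ge 1} 2^{-(n+1)} \, 2^{n} \;=\; \infty .
\]
% Hence E_\mu[u] = \int u^+ d\mu - \int u^- d\mu has the form \infty - \infty
% and is undefined.
```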

[-]AlexMennen9y00

Both of your conjectures are correct. In the measurable / strong topology case, $u$ will necessarily be bounded (from both directions), though it does not follow that the bounds are achievable by any probability distribution.

I described the VNM theorem as failing on sigma-algebras because the preference relation being closed (in the weak or strong topologies) is an additional assumption, which seems much more poorly motivated than the VNM axioms (in Abram's terminology, the assumption is purely structural).

[-]Vanessa Kosoy9y00

I think that one can argue that a computationally bounded agent cannot reason about probabilities with infinite precision, and that therefore preferences have to depend on probabilities in a way which is in some sense sufficiently regular, which can justify the topological condition. It would be nice to make this idea precise. Btw, it seems that the topological condition implies the continuity axiom.

[-]Vanessa Kosoy9y10

Completeness is less clear...

Actually, the way you formulated it, completeness seems quite clear. If completeness is violated then there are $A$ and $B$ s.t. $A < B$ and $B < A$, which is an obvious money-pump. It is transitivity that is suspect: in order to make the Dutch book argument, you need to assume the agent would agree to switch between $A$ and $B$ s.t. neither $A < B$ nor $B < A$. On the other hand, we could have used $\leq$ as the basic relation and defined $A < B$ as "$A \leq B$ and not $B \leq A$." In this version, transitivity is "clear" (assuming appropriate semantics) but completeness (i.e. the claim that for any $A$ and $B$, either $A \leq B$ or $B \leq A$) isn't.
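The second formulation can be checked on a small finite example. This is only an illustrative sketch: the three options and the particular weak relation `leq` are hypothetical, chosen so that two options are incomparable. It derives the strict relation from $\leq$ as described above and confirms that transitivity holds while completeness fails.

```python
def strict(leq):
    """Derive A < B as 'A <= B and not B <= A' from a weak relation (a set of pairs)."""
    return {(a, b) for (a, b) in leq if (b, a) not in leq}

def is_transitive(rel):
    """Check that (a, b) and (b, d) in rel always imply (a, d) in rel."""
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def is_complete(leq, items):
    """Check that every pair of options is comparable in at least one direction."""
    return all((a, b) in leq or (b, a) in leq for a in items for b in items)

items = ["A", "B", "C"]
# Hypothetical weak preference: A <= C and B <= C, with A and B incomparable.
leq = {(x, x) for x in items} | {("A", "C"), ("B", "C")}
lt = strict(leq)

print(sorted(lt))
print(is_transitive(leq), is_transitive(lt), is_complete(leq, items))
```

Here the agent has no strict preference in either direction between A and B, but that reflects incomparability rather than indifference, which is the case the Dutch book argument does not obviously cover.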

Btw, what would be an example of a relation that satisfies the other axioms but isn't coherently extensible?

AI ALIGNMENT FORUM