My current take is that we don't have good formalisms for consequentialist, goal-directed systems that are weaker than expected utility maximisation, and therefore don't really know how to reason about them. I think this is the main cause of the overemphasis on EUM.
For example, completeness as stated in the VNM axioms is actually a really strong property. Aumann wrote a paper on dropping completeness; a representing utility function still exists, but it is no longer unique.
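For concreteness, here is the axiom and a rough, informal paraphrase of what survives without it (the notation is mine, not the original paper's):

$$\text{Completeness: for all lotteries } p, q, \text{ either } p \succeq q \text{ or } q \succeq p.$$

$$\text{Aumann (1962), roughly: there exists } u \text{ such that } p \succeq q \Rightarrow \mathbb{E}_p[u] \ge \mathbb{E}_q[u] \text{ and } p \succ q \Rightarrow \mathbb{E}_p[u] > \mathbb{E}_q[u].$$

Such a $u$ is only a one-way representation, and it is not pinned down: different admissible $u$'s can rank incomparable lotteries in opposite ways.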
Epistemic Status
Unsure[1], partially noticing my own confusion. Hoping Cunningham's Law can help resolve it.
Related Answer
Confusions About Arguments From Expected Utility Maximisation
Some MIRI people (e.g. Rob Bensinger) still highlight EU maximisers as the paradigm case for existentially dangerous AI systems. I'm confused by this for a few reasons:
I don't expect the systems that matter (in the par-human or strongly superhuman regime) to be expected utility maximisers, so I think arguments for AI x-risk that rest on expected utility maximisation are mostly disconnected from reality. I suspect that discussing the perils of expected utility maximisation in particular, as opposed to, say, dangers from powerful (consequentialist?) optimisation processes, is somewhere between a distraction and actively harmful[3].
I do not think expected utility maximisation is the limit of what generally capable optimisers look like[4].
Arguments for Expected Utility Maximisation Are Unnecessary
I don't think the case for existential risk from AI rests on expected utility maximisation. I kind of stopped alieving in expected utility maximisers a while back (only recently have I synthesised explicit beliefs rejecting them), but I still plan on working on AI existential safety, because I don't see the core threat as stemming from expected utility maximisation.
The reasons I consider AI an existential threat mostly rely on:
I do not actually expect extinction in the near term, but extinction is not the only kind of "existential catastrophe":
I optimised for writing this quickly, so my language may be stronger/more confident than I actually feel; I may not have spent as much time accurately communicating my uncertainty as was warranted.
Correct me if I'm mistaken, but I'm under the impression that RL is the main training paradigm we have that selects for agents.
I don't necessarily expect that our most capable systems would be trained via reinforcement learning, but I think our most agentic systems would be.
There may be significant opportunity cost from diverting attention away from other, more plausible pathways to doom.
In general, I think exposing people to bad arguments for a position is a poor persuasive strategy, as people who dismiss those bad arguments may (rationally) update downwards on the position's credibility.
I don't necessarily think agents are that limit either. But as "Why Subagents?" shows, expected utility maximisers aren't the limit of idealised agency.
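To make that concrete, here is a toy sketch of my own (not taken from "Why Subagents?" itself, though it gestures at the same idea): a committee of two EU maximisers, where the committee only acts when every member agrees, yields a joint preference relation that is transitive but incomplete, and hence not representable by any single utility function.

```python
from typing import Callable, Dict, List

Lottery = Dict[str, float]  # hypothetical toy type: maps outcome -> probability

def expected_utility(u: Callable[[str], float], lottery: Lottery) -> float:
    """Expected utility of a lottery under a single utility function."""
    return sum(p * u(outcome) for outcome, p in lottery.items())

def committee_prefers(utilities: List[Callable[[str], float]],
                      a: Lottery, b: Lottery) -> bool:
    """The committee weakly prefers a to b only if every member does
    (each subagent effectively has veto power)."""
    return all(expected_utility(u, a) >= expected_utility(u, b) for u in utilities)

# Two subagents with opposed tastes over two outcomes.
u1 = {"apple": 1.0, "orange": 0.0}.get
u2 = {"apple": 0.0, "orange": 1.0}.get

apple = {"apple": 1.0}
orange = {"orange": 1.0}

# Neither lottery is weakly preferred to the other, so the joint preference
# is incomplete and cannot be represented by one utility function.
print(committee_prefers([u1, u2], apple, orange))  # False
print(committee_prefers([u1, u2], orange, apple))  # False
```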