We can assign meanings to statements like “my sensor sees red” by picking out subsets of experiences, just as before.
How do you assign meaning to statements like "my sensor will see red"? (In the OP you mention "my sensors will see the heads side of the coin" but I'm not sure what your proposed semantics of such statements are in general.)
Also, here's an old puzzle of mine that I wonder if your line of thinking can help with: At time 1 you will be copied and the original will be shown "O" and the copy will be shown "C", then at time 2 the copy will be copied again, and the three of you will be shown "OO" (original), "CO" (original of copy), "CC" (copy of copy) respectively. At time 0, what are your probabilities for "I will see X" for each of the five possible values of X?
That's a very good question! It's definitely more complicated once you start including other observers (including future selves), and I don't feel that I understand this as well.
But I think it works like this: other reasoners are modeled (0P) as using this same framework. The 0P model can then make predictions about the 1P judgements of these other reasoners. For something like anticipation, I think it will have to use memories of experiences (which are also experiences) and identify observers for which this memory corresponds to the current experience. Understanding this better would require being more precise about the interplay between 0P and 1P, I think.
(I'll examine your puzzle when I have some time to think about it properly)
Defining the semantics and probabilities of anticipation seems to be a hard problem. You can see some past discussions of the difficulties at The Anthropic Trilemma and its back-references (posts that link to it). (I didn't link to this earlier in case you already found a fresh approach that solved the problem. You may also want to consider not reading the previous discussions to avoid possibly falling into the same ruts.)
[I'm replying without having looked at the link in your response to my other comment, and I also stopped reading cubefox's comment once it seemed to be going in a similar direction. ETA: I realized after posting that I have seen that article before, but not recently.]
I'll assume that the robot has a special "memory" sensor which stores the exact experience from the previous tick. It will recognize future versions of itself by looking for agents in its (timeless) 0P model which have a memory of its current experience.
For p("I will see O"), the robot will look in its 0P model for observers which have the t=0 experience in their immediate memory, and selecting from those, how many have judged "I see O" as Here. There will be two such robots, the original and the copy at time 1, and only one of those sees O. So using a uniform prior (not forced by this framework), it would give a 0P probability of 1/2. Similarly for p("I will see C").
Then it would repeat the same process for t=1 and the copy. Conditioned on "I will see C" at t=1, it will conclude "I will see CO" with probability 1/2 by the same reasoning as above. So overall, it will assign: p("I will see OO") = 1/2, p("I will see CO") = 1/4, p("I will see CC") = 1/4
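Here's a minimal sketch of that calculation as I'm imagining it (the observer list, the "t0" label, and the uniform weighting over memory-successors are illustrative assumptions, not part of the framework itself):

```python
# Sketch of the memory-successor calculation described above.
# Each observer is (time, experience, memory_of_previous_experience).
# All names and the uniform weighting are illustrative assumptions.

observers = [
    (1, "O",  "t0"),   # original at time 1, remembers the t=0 experience
    (1, "C",  "t0"),   # copy at time 1
    (2, "OO", "O"),    # original at time 2, remembers seeing "O"
    (2, "CO", "C"),    # original-of-copy at time 2, remembers seeing "C"
    (2, "CC", "C"),    # copy-of-copy at time 2
]

def successors(experience):
    """Observers whose immediate memory is the given experience."""
    return [obs for obs in observers if obs[2] == experience]

def anticipation(experience, depth):
    """Spread probability uniformly over memory-successors, `depth` ticks ahead."""
    if depth == 0:
        return {experience: 1.0}
    result = {}
    succ = successors(experience)
    for (_, exp, _) in succ:
        for final, p in anticipation(exp, depth - 1).items():
            result[final] = result.get(final, 0.0) + p / len(succ)
    return result

print(anticipation("t0", 1))  # {'O': 0.5, 'C': 0.5}
print(anticipation("t0", 2))  # {'OO': 0.5, 'CO': 0.25, 'CC': 0.25}
```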
The semantics for these kinds of things is a bit confusing. I think that it starts from an experience (the experience at t=0) which I'll call E. Then REALIZATION(E) casts E into a 0P sentence which gets taken as an axiom in the robot's 0P theory.
A different robot could carry out the same reasoning, and reach the same conclusion since this is happening on the 0P side. But the semantics are not quite the same, since the REALIZATION(E) axiom is arbitrary to a different robot, and thus the reasoning doesn't mean "I will see X" but instead means something more like "They will see X". This suggests that there's a more complex semantics that allows worlds and experiences to be combined - I need to think more about this to be sure what's going on. Thus far, I still feel confident that the 0P/1P distinction is more fundamental than whatever the more complex semantics is.
(I call the 0P -> 1P conversion SENSATIONS and the 1P -> 0P conversion REALIZATION, and think of them as adjoints, though I haven't formalized this part well enough to feel confident that this is a good way to describe it. There's a toy example here if you are interested in seeing how this might work.)
Then it would repeat the same process for t=1 and the copy. Conditioned on "I will see C" at t=1, it will conclude "I will see CO" with probability 1/2 by the same reasoning as above. So overall, it will assign: p("I will see OO") = 1/2, p("I will see CO") = 1/4, p("I will see CC") = 1/4
This all makes me think there's something wrong with the 1/2, 1/4, 1/4 answer and with the way you define probabilities of future experiences. More specifically, suppose OO wasn't just two letters but an unpleasant experience, and CO and CC were both pleasant experiences, so you prefer "I will experience CO/CC" to "I will experience OO". Then at time 0 you would be willing to pay to switch from the original setup to (2) or (3), and pay even more to switch to (4). But that seems pretty counterintuitive: why are you paying to avoid making observations in (3), or paying to make and delete copies of yourself in (4)? Both of these seem at best pointless in 0P.
But every other approach I've seen or thought of also has problems, so maybe we shouldn't dismiss this one too easily based on these issues. I would be interested to see you work out everything more formally and address the above objections (to the extent possible).
I'm confused about why 1P-logic is needed. It seems to me like you could just have a variable X which tracks "which agent am I" and then you can express things like sensor_observes(X, red) or is_located_at(X, northwest). Here and Absent are merely a special case of True and False when the statement depends on X.
Because you don't necessarily know which agent you are. If you could always point to yourself in the world uniquely, then sure, you wouldn't need 1P-Logic. But in real life, all the information you learn about the world comes through your sensors. This is inherently ambiguous, since there's no law that guarantees your sensor values are unique.
If you use X as a placeholder, the statement sensor_observes(X, red) can't be judged as True or False unless you bind X to a quantifier. And then it could not mean the thing you want it to mean: all robots would agree on the judgement, which renders it useless for a robot trying to distinguish itself amongst them.
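To illustrate that point concretely, here's a toy sketch (the world and predicate names are made up for this example; they're not part of the framework):

```python
# A toy illustration of the point above.
world = {"robot_1": "red", "robot_2": "green", "robot_3": "red"}

def sensor_observes(x, color):
    """0P fact about a named robot's sensor in this world."""
    return world[x] == color

# Binding X with a quantifier yields a single, observer-independent verdict:
exists_red = any(sensor_observes(x, "red") for x in world)  # True
forall_red = all(sensor_observes(x, "red") for x in world)  # False

# Every robot evaluating these gets the same answers, so neither sentence can
# tell robot_1 apart from robot_3; that is what the Here/Absent judgement does.
print(exists_red, forall_red)
```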
It almost works though, you just have to interpret "True" and "False" a bit differently!
Truth values in classical logic have more than one interpretation.
In 0th Person Logic, the truth values are interpreted as True and False.
In 1st Person Logic, the truth values are interpreted as Here and Absent relative to the current reasoner.
Importantly, these are both useful modes of reasoning that can coexist in a logical embedded agent.
This idea is so simple, and has brought me so much clarity, that I cannot see how an adequate formal theory of anthropics could avoid it!
Crash Course in Semantics
First, let's make sure we understand how to connect logic with meaning. Consider classical propositional logic. We set this up formally by defining terms, connectives, and rules for manipulation. Let's consider one of these terms: A. What does this mean? Well, its meaning is not specified yet!
So how do we make it mean something? Of course, we could just say something like "A represents the statement that 'a ball is red'". But that's a little unsatisfying, isn't it? We're just passing all the hard work of meaning to English.
So let's imagine that we have to convey the meaning of A without using words. We might draw pictures in which a ball is red, and pictures in which there is not a red ball, and say that only the former are A. To be completely unambiguous, we would need to consider all the possible pictures, and point out which subset of them are A. For formalization purposes, we will say that this set is the meaning of A.
There's much more that can be said about semantics (see, for example, the Highly Advanced Epistemology 101 for Beginners sequence), but this will suffice as a starting point for us.
0th Person Logic
Normally, we think of the meaning of A as independent of any observers. Sure, we're the ones defining and using it, but it's something everyone can agree on once the meaning has been established. Due to this independence from observers, I've termed this way of doing things 0th Person Logic (or 0P-logic).
The elements of a meaning set I'll call worlds in this case, since each element represents a particular specification of everything in the model. For example, say that we're only considering states of tiles on a 2x2 grid. Then we could represent each world simply by taking a snapshot of the grid.
From logic, we also have two judgments. A is judged True for a world iff that world is in the meaning of A. And False if not. This judgement does not depend on who is observing it; all logical reasoners in the same world will agree.
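Here's a minimal sketch of this setup in code, under the illustrative assumption that each tile is either "red" or "blank" (the names are mine, not a fixed convention):

```python
from itertools import product

# A minimal 0P sketch. Worlds are snapshots of a 2x2 grid; assume each tile
# is either "red" or "blank".
TILES = ("red", "blank")
WORLDS = list(product(TILES, repeat=4))  # all 16 possible snapshots

# The meaning of A = "the top-left tile is red" is just a subset of worlds.
meaning_A = {w for w in WORLDS if w[0] == "red"}

def judge_0p(meaning, world):
    """True iff the world is in the meaning set; observer-independent."""
    return world in meaning

some_world = ("red", "blank", "blank", "red")
print(judge_0p(meaning_A, some_world))  # True, for every reasoner in that world
```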
1st Person Logic
Now let's consider an observer using logical reasoning. For metaphysical clarity, let's have it be a simple, hand-coded robot. Fix a set of possible worlds, assign meanings to various symbols, and give it the ability to make, manipulate, and judge propositions built from these.
Let's give our robot a sensor, one that detects red light. At first glance, this seems completely unproblematic within the framework of 0P-logic.
But consider a world in which there are three robots with red light sensors. How do we give A the intuitive meaning of "my sensor sees red"? The obvious thing to try is to look at all the possible worlds, and pick out the ones where the robot's sensor detects red light. There are three different ways to do this, one for each instance of the robot.
That's not a problem if our robot knows which robot it is. But without sensory information, the robot doesn't have any way to know which one it is! There may be both robots whose sensor sees a red signal and robots whose sensor does not, and nothing in 0P-logic can resolve this ambiguity for the robot, because the ambiguity remains even if the robot has pinpointed the exact world it's in!
So statements like "my sensor sees red" aren't actually picking out subsets of worlds like 0P-statements are. Instead, they're picking out a different type of thing, which I'll term an experience.[1] Each specific combination of possible sensor values constitutes a possible experience.
For the most part, experiences work in exactly the same way as worlds. We can assign meanings to statements like "my sensor sees red" by picking out subsets of experiences, just as before. It's still appropriate to reason about these using logic. Semantically, we're still just doing basic set operations—but now on sets of experiences instead of sets of worlds.
The crucial difference comes from how we interpret the "truth" values. A is judged Here for an experience iff that experience is in the meaning of A. And Absent if not. This judgment only applies to the robot currently doing the reasoning—even the same robot in the future may come to different judgments about whether A is Here. Therefore, I've termed this 1st Person Logic (or 1P-logic).
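A sketch of the parallel construction, keeping the single red-light sensor from before (again, the names are just illustrative):

```python
# The parallel 1P sketch: the same set machinery, but over experiences.
# With a single red-light sensor, a possible experience is just one reading.
EXPERIENCES = ("sees_red", "sees_no_red")

# The meaning of "my sensor sees red" is a subset of experiences.
meaning_my_sensor_red = {"sees_red"}

def judge_1p(meaning, current_experience):
    """Here iff the current experience is in the meaning set; the verdict is
    relative to whichever reasoner is currently having that experience."""
    return "Here" if current_experience in meaning else "Absent"

# Robots in the same world can reach different judgements:
print(judge_1p(meaning_my_sensor_red, "sees_red"))     # Here   (a robot seeing red)
print(judge_1p(meaning_my_sensor_red, "sees_no_red"))  # Absent (a robot that isn't)
```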
We Can Use Both
In order to reason effectively about its own sensor signals, the robot needs 1P-logic.
In order to communicate effectively about the world with other agents, it needs 0P-logic, since 0P-statements are precisely the ones which are independent of the observer. This includes communicating with itself in the future, i.e. keeping track of external state.
Both modes of reasoning are useful and valid, and I think it's clear that there's no fundamental difficulty in building a robot that uses both 0P and 1P reasoning—we can just program it to have and use two logic systems like this. It's hard to see how we could build an effective embedded agent that gets by without using them in some form.
While 0P-statements and 1P-statements have different types, that doesn't mean they are separate magisteria or anything like that. From an experience, we learn something about the objective world. From a model of the world, we infer what sort of experiences are possible within it.[2]
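One toy way to picture that two-way connection (the worlds and the possible-experience map are invented for illustration):

```python
# A toy map from each world to the experiences possible for the robot in it.
possible_experiences = {
    "red_ball_in_front_of_robot": {"sees_red"},
    "red_ball_behind_robot":      {"sees_no_red"},
    "no_red_ball_anywhere":       {"sees_no_red"},
}

def worlds_compatible_with(experience):
    """The 0P information gained from a 1P experience: keep only the worlds
    in which that experience is possible."""
    return {w for w, exps in possible_experiences.items() if experience in exps}

print(worlds_compatible_with("sees_red"))     # {'red_ball_in_front_of_robot'}
print(worlds_compatible_with("sees_no_red"))  # the other two worlds
```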
As an example of the interplay between the 0P and 1P perspectives, consider adding a blue light sensor to our robot. The robot has a completely novel experience when the new sensor first gets activated! If its world model doesn't account for that already, it will have to extend it somehow. As it explores the world, it will learn associations with this new sense, such as it being commonly present in the sky. And as it studies light further, it may realize there is an entire spectrum, and be able to design a new light sensor that detects green light. It will then anticipate another completely novel experience once it has attached the green sensor to itself and the sensor has been activated.
This interplay allows for a richer sense of meaning than either perspective alone; blue is not just the output of an arbitrary new sensor, it is associated with particular things already present in the robot's ontology.
Further Exploration
I hope this has persuaded you that the 0P and 1P distinction is a core concept in anthropics, one that will provide much clarity in future discussions and will hopefully lead to a full formalization of anthropics. I'll finish by sketching some interesting directions it can be taken.
One important consequence is that it justifies having two separate kinds of Bayesian probabilities: 0P-probabilities over worlds, and 1P-probabilities over experiences. Since probability can be seen as an extension of propositional logic, accepting these two kinds of logic makes both kinds of probability unavoidable. Additionally, we can see that our robot is capable of having both, with both 0P-probabilities and 1P-probabilities being subjective in the sense that they depend on the robot's own best models and evidence.
From this, we get a nice potential explanation of the Sleeping Beauty paradox: 1/2 is the 0P-probability, and 1/3 is the 1P-probability (of slightly different statements: "the coin in fact landed heads" and "my sensors will see the heads side of the coin"). This could also explain why both intuitions are so strong.
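As a worked sketch of that reading (assuming a fair coin and, as before, a uniform weighting over awakening-experiences, which the framework does not force):

```python
# Sleeping Beauty under this reading.

# 0P: probability over worlds.
p0_worlds = {"heads_world": 0.5, "tails_world": 0.5}
p0_heads = p0_worlds["heads_world"]  # 1/2

# 1P: probability over awakening-experiences. The heads-world contains one
# awakening, the tails-world contains two.
awakenings = [("heads_world", "Monday"),
              ("tails_world", "Monday"),
              ("tails_world", "Tuesday")]
p1_heads = sum(1 for w, _ in awakenings if w == "heads_world") / len(awakenings)  # 1/3

print(p0_heads, round(p1_heads, 3))  # 0.5 0.333
```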
It's worth noting that no reference to preferences has yet been made. That's interesting because it suggests that there are both 0P-preferences and 1P-preferences. That intuitively makes sense, since I do care about both the actual state of the world, and what kind of experiences I'm having.
Additionally, this gives a simple resolution to Mary's Room. Mary has studied the qualia of 'red' all her life (gaining 0P-knowledge), but has never left her grayscale room. When she leaves it and sees red for the very first time, she does not gain any 0P-knowledge, but she does gain 1P-knowledge. Notice that there is no need to invoke anything epiphenomenal or otherwise non-material to explain this, as we do not need such things in order to construct a robot capable of reasoning with both 0P and 1P logic.[3]
Finally, this distinction may help clarify some confusing aspects of quantum mechanics (which was the original inspiration, actually). Born probabilities are 1P-probabilities, while the rest of quantum mechanics is a 0P-theory.
Special thanks to Alex Dewey, Nisan Stiennon and Claude 3 for their feedback on this essay, to Alexander Gietelink Oldenziel and Nisan for many insightful discussions while I was developing these ideas, and to Alex Zhu for his encouragement. In this age of AI text, I'll also clarify that everything here was written by myself.
The idea of using these two different interpretations of logic together like this is original to me, as far as I am aware (Claude 3 said it thought so too FWIW). However, there have been similar ideas, for example Scott Garrabrant's post about logical and indexical uncertainty, or Kaplan's theory of indexicals.
I'm calling these experiences because that is a word that mostly conveys the right intuition, but these are much more general than human Experiences and apply equally well to a simple (non-AI) robot's sensor values.
More specifically, I expect there to be an adjoint functor pair of some sort between them (under the intuition that an adjoint functor pair gives you the "best" way to cast between different types).
I'm not claiming that this explains qualia or even what they are, just that whatever they are, they are something on the 1P side of things.