«Boundaries», Part 3a: Defining boundaries as directed Markov blankets

Overall, this is my favorite thing I have read on LessWrong in the last year.

Agreements:

I agree very strongly with most of this post, both in the way you are thinking about boundaries, and in the scale and scope of applications of boundaries to important problems.

In particular on the applications, I think that boundaries as you are defining them are crucial to developing decision theory and bargaining theory (and indeed are already helpful for thinking about bargaining and fairness in real life), but I also agree with your other potential applications.

In particular on the theory, I agree that the boundary of an agent (or agent-like-thing) should be thought of as that which screens off the viscera of the agent from its environment. I agree that the agent should think of decisions as intervening on its boundary. I agree that the boundary (as the agent sees it) only partially does the screening-off thing. I agree that the agent should in part be focused on actively maintaining its boundary, as this is crucial to its integrity as an agent.

I believe the above mostly independently of this post, but the place where I think this post is doing better than my default way of thinking is in the directionality of the arrows. I have been thinking about this in a pretty symmetric way: the notion of B screening off V from E is symmetric in swapping V and E. I was aware this was a mistake (because logical mutual information is not symmetric), but this post makes it clear to me how important that mistake was. Thanks!
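To make the symmetric notion concrete, here is a minimal sketch, not anything taken from the post. It assumes a toy chain V -> B -> E of noisy binary variables (the flip probabilities are made up) and uses ordinary Shannon conditional mutual information rather than logical mutual information. It checks that I(V; E | B) is zero and unchanged when V and E are swapped, even though V and E are correlated before conditioning on B.

```python
import itertools
import math


def conditional_mutual_information(joint, x, y, z):
    """I(X; Y | Z) in bits for a joint distribution {outcome_tuple: probability}.

    x, y, z are positions within each outcome tuple.
    """
    def marginal(indices):
        out = {}
        for outcome, p in joint.items():
            key = tuple(outcome[i] for i in indices)
            out[key] = out.get(key, 0.0) + p
        return out

    p_xyz = marginal((x, y, z))
    p_xz = marginal((x, z))
    p_yz = marginal((y, z))
    p_z = marginal((z,))
    total = 0.0
    for (a, b, c), p in p_xyz.items():
        if p > 0:
            total += p * math.log2(p * p_z[(c,)] / (p_xz[(a, c)] * p_yz[(b, c)]))
    return total


def mutual_information(joint, x, y):
    """Plain I(X; Y), computed by conditioning on an appended constant coordinate."""
    padded = {outcome + (0,): p for outcome, p in joint.items()}
    constant_index = len(next(iter(padded))) - 1
    return conditional_mutual_information(padded, x, y, constant_index)


def chain_joint(flip_vb=0.1, flip_be=0.2):
    """Hypothetical toy model: V is a fair bit, B is a noisy copy of V, E a noisy copy of B."""
    joint = {}
    for v, b, e in itertools.product((0, 1), repeat=3):
        p_b = flip_vb if b != v else 1 - flip_vb
        p_e = flip_be if e != b else 1 - flip_be
        joint[(v, b, e)] = 0.5 * p_b * p_e
    return joint


V, B, E = 0, 1, 2  # positions of viscera, boundary, environment in each outcome
joint = chain_joint()
cmi_v_e = conditional_mutual_information(joint, V, E, B)  # B screens off V from E
cmi_e_v = conditional_mutual_information(joint, E, V, B)  # same quantity with V, E swapped
mi_v_e = mutual_information(joint, V, E)                  # positive without conditioning
print(cmi_v_e, cmi_e_v, mi_v_e)
```

Nothing in this quantity records which way the arrows point, which is exactly the symmetry described above.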

Disagreements:

Philosophical nitpick: I think the boundary should be thought of as part of the agent/organism and simultaneously as part of the environment. Indeed, the screening off property can be thought of as an informational (as opposed to physical) way of saying that the boundary is the intersection of the agent and environment.

I think the boundary factorization into active and passive is wrong. I am not sure what is right here. My default proposal is to think of the active as the minimal part that contains all information flow from the viscera, and the perceptive as the minimal part that contains all information flow from the environment. By definition, these cover the boundary, but they might intersect. (An alternative proposal is to define the active as the part where the agent thinks of its interventions as living, and the perceptive as the part where the agent thinks of its perceptions as living, in which case they don't cover the boundary.)
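A toy sketch of the default proposal, using explicit named features purely for illustration. The feature names and the `influences` map are hypothetical, and "information flow" is crudely approximated here by direct influence. The point is only that the two parts can cover the boundary while still overlapping.

```python
def split_boundary(boundary, influences, viscera, environment):
    """Return (active, perceptive) parts of the boundary.

    influences maps each boundary feature to the set of features that
    directly influence it (an assumed, hand-written influence relation).
    """
    active = {b for b in boundary if influences.get(b, set()) & viscera}
    perceptive = {b for b in boundary if influences.get(b, set()) & environment}
    return active, perceptive


# Hypothetical example features.
viscera = {"metabolism", "motor_plan"}
environment = {"weather", "light"}
boundary = {"muscle_tension", "retina_activation", "skin_temperature"}
influences = {
    "muscle_tension": {"motor_plan"},
    "retina_activation": {"light"},
    "skin_temperature": {"metabolism", "weather"},  # receives flow from both sides
}

active, perceptive = split_boundary(boundary, influences, viscera, environment)
print(sorted(active))
print(sorted(perceptive))
print(sorted(active & perceptive))
```

In this toy, `skin_temperature` lands in both parts, so the active and perceptive parts cover the boundary without partitioning it.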

In both of the above, I am pushing for the claim that we are not yet in the part of the theory where we need to break agent-environment symmetry. (Although we do need to track the directions of information flow separately!)

I think it is wrong to think of there being physical nodes. Unfortunately, Finite Factored Sets is not yet able to handle directionality of information flow, so I see how nodes are currently the only way you can express an important part of the model. We need to fix that, so we can think of viscera, environments, boundaries, etc. as features of the world rather than sets of nodes.

I also think that the time-embedded picture is wrong. I often complain about models that have a thing persisting across linear time like this, but I think it is especially important here. As far as I can tell, time is mostly about screening-off, and boundaries are also mostly about screening-off, so I think that this is a domain in which it is especially important to get time right.

Andrew_Critch (3y)

Thanks, Scott!

"I think the boundary factorization into active and passive is wrong."

Are you sure? The informal description I gave for A and P allows for the active boundary to be a bit passive and the passive boundary to be a bit active. From the post:

the active boundary, A — the features or parts of the boundary primarily controlled by the viscera, interpretable as "actions" of the system— and the passive boundary, P — the features or parts of the boundary primarily controlled by the environment, interpretable as "perceptions" of the system.

There's a question of how to factor B into a zillion fine-grained features in the first place, but given such a factorization, I think we can define A and P fairly straightforwardly using Shapley value to decide how much V versus E is controlling each feature, and then A and P won't overlap and will cover everything.
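As a rough sketch of how that assignment could go, assuming each fine-grained feature comes with some measure of how much control each coalition of {V, E} has over it. The `control` numbers below are invented, and the choice of control measure is the real open question. With two players the Shapley value is just an average over the two orderings, and a fixed tie-break (here, ties go to A, which is an arbitrary assumption) makes A and P disjoint and exhaustive.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values; value maps a frozenset of players to a number."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            # Marginal contribution of this player given who joined before it.
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: t / len(orderings) for p, t in totals.items()}


def assign_feature(control):
    """Label a boundary feature 'A' or 'P' from a hypothetical control measure.

    control maps each coalition of {"V", "E"} (as a frozenset) to a number.
    """
    phi = shapley_values(("V", "E"), lambda s: control[s])
    label = "A" if phi["V"] >= phi["E"] else "P"  # assumed tie-break: ties go to A
    return label, phi


def coalition_table(none, v_only, e_only, both):
    """Helper to build a control table for the two-player case."""
    return {
        frozenset(): none,
        frozenset({"V"}): v_only,
        frozenset({"E"}): e_only,
        frozenset({"V", "E"}): both,
    }


# Invented numbers for two hypothetical features.
muscle_label, muscle_phi = assign_feature(coalition_table(0.0, 0.8, 0.1, 1.0))
retina_label, retina_phi = assign_feature(coalition_table(0.0, 0.2, 0.7, 1.0))
print(muscle_label, muscle_phi)
print(retina_label, retina_phi)
```

Every feature gets exactly one label, so the result is a partition. Whether a feature with nearly equal shares is meaningfully "active" or "passive" is a separate question.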

Scott Garrabrant (3y)

Oh yeah, oops, that is what it says. I wasn't careful, and was responding based on my reading of an old draft. I agree that the post is already saying roughly what I want there. Instead, I should have said that the B = A × P bijection is especially unrealistic. Sorry.

Andrew_Critch (3y)

Why is it unrealistic? Do you actually mean it's unrealistic that the set I've defined as "A" will be interpretable as "actions" in the usual coarse-grained sense? If so I think that's a topic for another post when I get into talking about the coarsened variables ...

Scott Garrabrant (3y)

I mean, the definition is a little vague. If your meaning is something like "It goes in A if it is more accurately described as controlled by the viscera, and it goes in P if it is more accurately described as controlled by the environment," then I guess you can get a bijection by definition, but it is not obvious these are natural categories. I think there will be parts of the boundary that feel like they are controlled by both or neither, depending on how strictly you mean "controlled by."

Scott Garrabrant (3y)

Forcing the A × P bijection is an interesting idea, but it feels a little too approximate for my taste.

Scott Garrabrant (3y)

To be clear, everywhere I say “is wrong,” I mean I wish the model were slightly different, not that anything is actually mistaken. In most cases, I don’t really have much of an idea how to actually implement my recommendation.

Scott Garrabrant (3y)

More of my thoughts here.

Alex Flint (3y)

I have the sense that boundaries are so effective as a coordination mechanism that we have come to believe that they are an end in themselves. To me it seems that the over-use of boundaries leads to loneliness that eventually obviates all the goodness of the successful coordination. It's as if we discovered that cars were a great way to get from place to place, but then we got so used to driving in cars that we just never got out of them, and so kind of lost all the value of being able to get from place to place. It was because the cars were in fact so effective as transportation devices that we started to emphasize them so heavily in our lives.

You say "real-world living systems sometimes do funky things like opening up their boundaries" but that's like saying "real-world humans sometimes do funky things like getting out of their cars" -- we shouldn't begin with the view that boundaries are the default thing and then consider some "extreme cases" where people open up their boundaries.

Some specific cases to consider for a theory of boundaries-as-arising-from-coordination:

A baby grows inside a mother and is born, gradually establishing boundaries. You might say the baby has zero boundaries just prior to conception and full boundaries at age 10? age 15? age 20? How do you make appropriate sense of the coming into existence of boundaries over time?
A human dies, gradually losing agency over years. What is the appropriate way to view the attenuation-to-zero of this person's boundaries?
During an adult human life, a person finds themselves in situations where it is extremely difficult, for practical reasons, to establish certain boundaries. For example, two people locked in a tiny closet together are unable to establish, perhaps, any boundary around personal space. Perhaps it was a mistake to get locked in there in the first place, but now that they are in there, they need a way to coordinate without being able to establish certain boundaries.

Overall, I would ask "what is an effective set of boundaries given our situation and our goal?" rather than "how can we coordinate on our goals given our situation and our a priori fixed boundaries?"