This is a longer, more mathematically dense introduction to "Finite Factored Sets." For a shorter introduction, see my Topos talk/transcript.
Abstract: We propose a new approach to temporal inference, inspired by the Pearlian causal inference paradigm—though quite different from Pearl's approach formally. Rather than using directed acyclic graphs, we make use of factored sets, which are sets expressed as Cartesian products. We show that finite factored sets are powerful tools for inferring temporal relations. We introduce an analog of d-separation for factored sets, conditional orthogonality, and we demonstrate that this notion is equivalent to conditional independence in all probability distributions on a finite factored set.
1. Introduction
1.1. Pearlian Causal Inference
Judea Pearl's theory of inferred causation (e.g., as presented in chapter 2 of Causality: Models, Reasoning, and Inference) was a deep advance in our understanding of the nature of time. The Pearlian paradigm allows us to infer causal relationships between variables using statistical data, and thereby infer temporal sequence—in defiance of the old adage that correlation does not imply causation.
In particular, given a collection of variables and a joint probability distribution over those variables, the Pearlian paradigm can often infer temporal relationships between the variables.
The joint probability distribution is usually what gets emphasized in discussions of Pearl's approach. Quite a bit of work is being done, however, by the assumption that we are handed "a collection of variables" to reason about. The Pearlian paradigm is not inferring temporal relationships from purely statistical data, but rather inferring temporal relationships from statistical data together with data about how to factorize the world into variables.[1]
A doctor who misdiagnoses their patient or misidentifies a symptom may base their subsequent reasoning on a wrong factorization of the situation into causally relevant variables. We would ideally like to build fewer assumptions like this into our model of inference, and instead allow the reasoner to figure such facts out, consider the merits of different factorizations into variables, etc.
Instead of beginning with a collection of variables and a joint probability distribution over those variables, one could imagine starting with just a finite sample space and a probability distribution on that sample space. In this way, we might hope to do temporal inference purely using statistical data, without relying on a priori knowledge of a canonical way of factoring the situation into variables.
How might one do temporal inference without an existing factorization? One way might be to just consider all possible variables that can be defined on the sample space. This gives us one variable for each partition of the set.
However, when one tries to apply Pearl's methods to this collection of variables, one quickly runs into a problem: many of the variables definable on a fixed set are deterministic functions of each other. The Pearlian paradigm, as presented in the early chapters of Causality, lacks tools for performing temporal inference on variables that are highly deterministically related.[2]
We will introduce a new approach to temporal inference instead—one which is heavily inspired by the Pearlian paradigm, but approaches the problem with a very different formal apparatus, and does not make use of graphical models.
1.2. Overview
We'll begin by introducing the concept of a finite factored set, in Section 2. This will be our analogue of the directed acyclic graphs in Pearl's framework.
In Section 3, we will introduce the concepts of time and orthogonality, which can be read off of a finite factored set. In Pearl's framework, "time" corresponds to directed paths between nodes, and "orthogonality" corresponds to nodes that have no common ancestor.
In Section 4, we will introduce conditional orthogonality, which is our analogue of d-separation. We show that conditional orthogonality satisfies (a modified version of) the compositional semigraphoid axioms. We then (in Section 5) prove the fundamental theorem of finite factored sets, which states that conditional orthogonality is equivalent to conditional independence in all probability distributions on the finite factored set.
In Section 6, we discuss how to do temporal inference using finite factored sets, and give two examples. Finally, in Section 7 we discuss applications and future work, with an emphasis on temporal and conceptual inference, generalizing finite factored sets to the infinite case, and applications to embedded agency.
And here, we take our leave of Pearl. We've highlighted this approach's relationship to the Pearlian paradigm in order to motivate finite factored sets and explain how we'll be using them in this sequence. Formally, however, our approach is quite unlike Pearl's, and the rest of the sequence will stand alone.
2. Factorizations
Before giving a definition of finite factored sets, we will recall the definition of a partition, and give some basic notation related to partitions.
We do this for two reasons. First, we will use partitions in the definition of a factored set; and second, we want to draw attention to a duality between the notion of a partition, and the notion of a factorization.
2.1. Partitions
We begin with a definition of disjoint union.
Definition 1 (disjoint union). Given a set S of sets, let ⊔(S) denote the set of all ordered pairs (T,t), where T∈S and t∈T.[3]
Definition 2 (partition). A partition of a set S is a set X⊆P(S) of nonempty subsets of S such that the function ι:⊔(X)→S given by ι(x,s)=s is a bijection.[4]
Let Part(S) denote the set of all partitions of S. The elements of a partition are called parts.
An equivalent definition of partition is often given: a partition is a set A of nonempty subsets of S that are pairwise disjoint and union to S. We choose the above definition because it will make the symmetry between partitions and factorizations more obvious.
Definition 3 (trivial partition). A partition X of a set S is called trivial if |X|=1.
Definition 4. Given a partition X of a set S, and an element s∈S, let [s]X denote the unique x∈X such that s∈x.
Definition 5. Given a partition X of a set S, and elements s0,s1∈S, we say s0∼Xs1 if [s0]X=[s1]X.
Proposition 1. Given a partition X of a set S, ∼X is an equivalence relation on S.
Proof. Trivial. □
Definition 6 (finer and coarser). We say that a partition X of S is finer than another partition Y of S, if for all s0,s1∈S, if s0∼Xs1, then s0∼Ys1.
If X is finer than Y, we also say Y is coarser than X, and we write X≥SY and Y≤SX.
Definition 7 (discrete and indiscrete partitions). Given a set S, let DisS={{s}∣s∈S}.
If S is empty, let IndS={}, and if S is nonempty, let IndS={S}.
DisS is called the discrete partition, and IndS is called the indiscrete partition.
Proposition 2. For any set S, ≥S is a partial order on Part(S). Further, for all X∈Part(S), DisS≥SX and X≥SIndS.
Proof. Trivial. □
While both notations are sometimes used, it is more standard to draw the symbol in the opposite direction and have X≤Y when X is finer than Y. We choose to go against that standard because we want to think of partitions in part as the ability to distinguish between elements, and finer partitions correspond to greater ability to distinguish.[5]
Definition 8 (common refinement). Given a set C of partitions of a fixed set S, let ⋁S(C) denote the partition X∈Part(S) satisfying s0∼Xs1 if and only if s0∼cs1 for all c∈C. Given X,Y∈Part(S), we let X∨SY=⋁S({X,Y}).
2.2. Factorizations
We start with a definition of Cartesian product.
Definition 9 (Cartesian product). Given a set S of sets, let ⊓(S) denote the set of all functions f:S→⊔(S) such that for all T∈S, f(T) is of the form (T,t), for some t∈T.
We can now give the definition of a factorization of a set.
Definition 10 (factorization). A factorization of a set S is a set B⊆Part(S) of nontrivial partitions of S such that the function π:S→⊓(B), given by π(s)=(b↦(b,[s]b)), is a bijection.
Let Fact(S) denote the set of all factorizations of S. The elements of a factorization are called factors.
In other words, a set of nontrivial partitions is a factorization of S if for each way of choosing one part from each factor, there exists a unique element of S in the intersection of those parts.
Notice the duality between the definitions of partition and factorization. We replace subsets with partitions, nonempty with nontrivial, and disjoint union with Cartesian product, and we reverse the direction of the function. We can think of a factorization of S as a way to view S as a product, in the same way that a partition was a way to view S as a disjoint union.
A factored set is just a set together with a factorization of that set.
Definition 11 (factored set). A factored set F is an ordered pair (S,B), such that B is a factorization of S.
If F=(S,B) is a factored set, we let set(F)=S, and let basis(F)=B.
An important fact about factored sets is that the factors are enough to distinguish distinct elements.
Proposition 3. Given a factored set F=(S,B), and elements s0,s1∈S, if s0∼bs1 for all b∈B, then s0=s1.
Proof. Let F=(S,B) be a finite factored set, and let s0,s1∈S satisfy s0∼bs1 for all b∈B.
Let π:S→⊓(B) be given by π(s)=(b↦(b,[s]b)), as in the definition of factorization. Then π(s0)=(b↦(b,[s0]b))=(b↦(b,[s1]b))=π(s1). Since π is bijective, this means s0=s1. □
2.3. Chimera Functions
The following theorem can be viewed as an alternate characterization of factorization. We will use this alternate characterization to define chimera functions, which will be useful tools for manipulating elements of factored sets.
Theorem 1. Given a set S, a set B of nontrivial partitions of S is a factorization of S if and only if for every function g:B→S, there exists a unique s∈S such that for all b∈B, s∼bg(b).
Proof. First, we let B be a factorization of S, and let g:B→S be any function. We want to show that there exists a unique s∈S such that for all b∈B, s∼bg(b). Let π:S→⊓(B) be given by π(s)=(b↦(b,[s]b)), as in the definition of factorization. Note that π is bijective, and thus has an inverse.
Let s=π−1(b↦(b,[g(b)]b)). Observe that this is well-defined, because (b↦(b,[g(b)]b)) is in fact in ⊓(B). We will show that s∼bg(b) for all b∈B, and the uniqueness of this s will then follow directly from Proposition 3.
We have π(s)=(b↦[s]b) by the definition of π. However, we also have π(s)=(b↦[g(b)]b) by the definition of s. Thus, b↦[s]b and b↦[g(b)]b are the same function, so [s]b=[g(b)]b for all b∈B, so s∼bg(b) for all b∈B.
Conversely, let S be any set, and let B be any set of nontrivial partitions of S. Assume that for all g:B→S, there exists a unique s∈S satisfying s∼bg(b) for b∈B. Again, let π:S→⊓(B) be given by π(s)=(b↦(b,[s]b)), as in the definition of factorization. We want to show that π is invertible.
First, we show that π is injective. Take an arbitrary s0∈S, and let g:B→S be the constant function satisfying g(b)=s0 for all b∈B. Given another s1∈S, if π(s0)=π(s1), then (b↦[s0]b)=(b↦[s1]b), so [s1]b=[s0]b=[g(b)]b for all b∈B, so s0∼bs1∼bg(b) for all b∈B. Since there is a unique s∈S satisfying s∼bg(b) for all b∈B, this means s0=s1. Thus π is injective.
To see that π is surjective, consider some arbitrary h∈⊓(B). We want to show that there exists an s∈S with h=π(s).
For all b∈B, let Hb∈b be given by h(b)=(b,Hb), which is well-defined since h∈⊓(B). Note that Hb is a nonempty subset of S, so there exists a function g:B→S with g(b)∈Hb for all b∈B. Fix any such g, and let s satisfy s∼bg(b) for all b∈B.
We thus have that for all b∈B, h(b)=(b,Hb)=(b,[g(b)]b)=(b,[s]b)=π(s)(b), so h=π(s). Thus π is surjective.
Since π is bijective, we have that B is a factorization of S. □
This also gives us that factors are disjoint from each other.
Corollary 1. Given a factored set F=(S,B) and distinct factors b0,b1∈B, b0∩b1={}.
Proof. Assume by way of contradiction that T∈b0∩b1. Since b0 is nontrivial, there must be some other T′∈b0 with T∩T′={}. Let g:B→S be any function such that g(b0)∈T′ and g(b1)∈T. Then there can be no s such that s∼b0g(b0) and s∼b1g(b1), since then s would be in both T and T′. This contradicts Theorem 1. □
We are now ready to define the chimera function of a factored set.
Definition 12 (chimera function). Given a factored set F=(S,B), the chimera function (of F) is the function χF:(B→S)→S defined by χF(g)∼bg(b) for all g:B→S and b∈B.
The name "chimera function" comes from the fact that χF can be viewed as building an element of S by fusing together the properties of various different elements. Since we will often apply the chimera function to functions g that only take on two values, we will give notation for this special case.
Definition 13. Given a factored set F=(S,B), and a subset C⊆B, let χFC:S×S→S be given by χFC(s,t)=χF(g), where g:B→S is given by g(b)=s if b∈C, and g(b)=t otherwise.
For T,R⊆S, we will write χFC(T,R) for {χFC(t,r)∣t∈T,r∈R}.
The following is a list of properties of χFC, which will be useful in later proofs. All of these properties follow directly from the definition of χFC.
Proposition 4. Fix F=(S,B), a factored set, C,D⊆B, and s,t,r∈S.
χFC(s,t)∼cs for all c∈C.
χFC(s,t)∼bt for all b∈B∖C.
χFC(s,s)=s.
χFB∖C(s,t)=χFC(t,s).
χFC∪D(s,t)=χFC(s,χFD(s,t)).
χFC∩D(s,t)=χFC(χFD(s,t),t).
χFC(χFC(s,t),r)=χFC(s,χFC(t,r))=χFC(s,r).
χFC(s,χFD(t,r))=χFD(χFC(s,t),χFC(s,r)).
χFC(χFD(s,t),r)=χFD(χFC(s,r),χFC(t,r)).
χFB(s,t)=s.
χF{}(s,t)=t.
Proof. Trivial. □
2.4. Trivial Factorizations
We now define a notion of a trivial factorization of a set, and show that every set has a unique trivial factorization.
Definition 14 (trivial factorization). A factorization B of a set S is called trivial if |B|≤1. A factored set (S,B) is called trivial if B is trivial.
Proposition 5. For every set S, there exists a unique trivial factorization B of S. If |S|≠1, this trivial factorization is given by B={DisS}, and if |S|=1, it is given by B={}.
Proof. We start with the case where |S|=0. The only partition of S is {}, so we only need to consider the sets of partitions {{}} and {} as potential factorizations. {{}} is vacuously a factorization of S by Theorem 1, since there are no functions from {{}} to S. {} is not a factorization by Theorem 1, since there is a function from {} to S, but there is no element of S. Thus, when |S|=0, {{}}={DisS} is the unique trivial factorization of S.
Next, consider the case where |S|=1. First, observe that the unique s∈S vacuously satisfies s∼bg(b) for all g:{}→S and b∈{}, since there is no b∈{}. Thus, by Theorem 1, {} is a factorization of S. Further, {} is the only factorization of S, since there are no nontrivial partitions of S. Thus, when |S|=1, {} is the unique trivial factorization of S.
Next, we consider the case where |S|≥2. Observe that DisS is a nontrivial partition of S. Let B={DisS}. We want to show that B is a factorization of S. By Theorem 1, it suffices to show that for all g:B→S, there exists a unique s∈S with s∼DisSg(DisS). We can take s=g(DisS), which clearly satisfies s∼DisSg(DisS). This s is unique, since if s′∼DisSg(DisS), then s′∈[g(DisS)]DisS=[s]b={s}, so s′=s. Thus B is a factorization of S.
On the other hand, if |S|≥2, {} is not a factorization of S, since if it were, Proposition 3 would imply that all elements of S are equal. Further, for any partition b of S, with b≠{DisS}, there must exist s0,s1∈S, with s0∼bs1, but s0≠s1. Thus {b} cannot be a factorization of S by Proposition 3. Thus when |S|≥2, DisS is the unique trivial factorization of S. □
2.5. Finite Factored Sets
This sequence will primarily be about finite factored sets.
Definition 15. If F=(S,B) is a factored set, the size of F, written size(F), is the cardinality of S. The dimension of F, written dim(F), is the cardinality of B. F is called finite if its size is finite, and finite-dimensional if its dimension is finite.
We suspect that the theory of infinite factored sets is both interesting and important. However, it is outside of the scope of this sequence, which will require finiteness for many of its key results.
Some of the definitions and results in this sequence will be given for finite factored sets, in spite of the fact that they could easily be extended to finite-dimensional or arbitrary factored sets. This is because they can often be extended in more than one way, and determining which extension is most natural requires further developing the theory of arbitrary factored sets.
Proposition 6. Every finite factored set is also finite-dimensional.
Proof. If F=(S,B) is a factored set, B is a set of sets of subsets of S. Thus, |B|≤22|S|. □
This bound is horrible and will be improved in Proposition 9. First, however, we will take a look at the number of factorizations of a fixed finite set.
Proposition 7. Let F=(S,B) be a finite factored set. Then |S|=∏b∈B|b|.
Proof. Trivial. □
Proposition 8. If |S| is equal to 0, 1, or a prime, the trivial factorization of S is the only factorization of S.
Proof. If |S|=0 or |S|=1, then |Part(S)|=1, so B⊆Part(S) can have cardinality at most 1.
If |S|=p, a prime, then by Proposition 7, |b| must divide p for all b∈B. Since factorizations cannot contain trivial partitions, this means |b|=p for all b∈B. However, {{s}∣s∈S} is the only element of Part(S) of cardinality p, so |B|≤1. □
On the other hand, in the case where |S| is finite and composite, the number of factorizations of S grows very quickly, as seen in Table 1. Table 1 shows the number of factorizations of a set S with cardinality up to 25:
|S|
|Fact(S)|
|S|
|Fact(S)|
0
1
13
1
1
1
14
8648641
2
1
15
1816214401
3
1
16
181880899201
4
4
17
1
5
1
18
45951781075201
6
61
19
1
7
1
20
3379365788198401
8
1681
21
1689515283456001
9
5041
22
14079294028801
10
15121
23
1
11
1
24
4454857103544668620801
12
13638241
25
538583682060103680001
Given the naturalness of the notion of factorization, we were surprised to discover that this sequence did not exist on the On-Line Encyclopedia of Integer Sequences (OEIS). We added the sequence, A338681, on April 30, 2021.
To give one concrete example, the four factorizations of the set {0,1,2,3} are:
{{{0},{1},{2},{3}}},
{{{0,1},{2,3}},{{0,2},{1,3}}},
{{{0,1},{2,3}},{{0,3},{1,2}}}, and
{{{0,2},{1,3}},{{0,3},{1,2}}}.
Proposition 9. Let F be a finite factored set.
If size(F)=0, then dim(F)=1.
If size(F)=1, then dim(F)=0.
If size(F)=p is prime, then dim(F)=1.
If size(F)=p0…pk−1 is a product of k≥2 primes, then 1≤dim(F)≤k.
Proof. The first three parts follow directly from Proposition 5 and Proposition 8. For the fourth part, let F=(S,B), and let |S|=p0…pk−1 be a product of k≥2 primes.
By Proposition 7, |S|=∏b∈B|b|. Consider an arbitrary b∈B. Since b is a nontrivial partition of a finite set S, |b| is finite and |b|≠1. If |b| were 0, then |S| would be 0. Thus |b| is a natural number greater than or equal to 2. B cannot be empty, since |S|≠1. If |B| were greater than k, then we would be able to express |S| as a product of more than k natural numbers greater than or equal to 2, which is clearly not possible since |S| is a product of k primes. Thus 1≤dim(F)≤k. □
In the next post, we will introduce the notions of the history of a partition, orthogonality between partitions, and time.
Acknowledgments: My thanks to Alex Appel, Ramana Kumar, Xiaoyu He, Tsvi Benson-Tilsen, Andrew Critch, Sam Eisenstat, Rob Bensinger, and Claire Wang for discussion and feedback on this sequence.
Footnotes
[1] Although I say "factorize" here, note that this will not be the kind of factorization that shows up in finite factored sets, because (as we will see) disjoint factors must be independent in a finite factored set. I appeal to the same concept in both contexts because factorization is just a very general and useful concept, rather than to indicate a direct connection.
[2] At least, it lacks such causal inference tools unless we assume access to interventional data.
[3] Note that this definition and Definition 9 could have been made more general by taking S to be a multiset.
[4] P(S) denotes the power set of S.
[5] In our view, "Y≥X" is also a more natural way to visually represent a mapping between a three-part partition Y that is finer than a two-part partition X.
This is a longer, more mathematically dense introduction to "Finite Factored Sets." For a shorter introduction, see my Topos talk/transcript.
Abstract: We propose a new approach to temporal inference, inspired by the Pearlian causal inference paradigm—though quite different from Pearl's approach formally. Rather than using directed acyclic graphs, we make use of factored sets, which are sets expressed as Cartesian products. We show that finite factored sets are powerful tools for inferring temporal relations. We introduce an analog of d-separation for factored sets, conditional orthogonality, and we demonstrate that this notion is equivalent to conditional independence in all probability distributions on a finite factored set.
1. Introduction
1.1. Pearlian Causal Inference
Judea Pearl's theory of inferred causation (e.g., as presented in chapter 2 of Causality: Models, Reasoning, and Inference) was a deep advance in our understanding of the nature of time. The Pearlian paradigm allows us to infer causal relationships between variables using statistical data, and thereby infer temporal sequence—in defiance of the old adage that correlation does not imply causation.
In particular, given a collection of variables and a joint probability distribution over those variables, the Pearlian paradigm can often infer temporal relationships between the variables.
The joint probability distribution is usually what gets emphasized in discussions of Pearl's approach. Quite a bit of work is being done, however, by the assumption that we are handed "a collection of variables" to reason about. The Pearlian paradigm is not inferring temporal relationships from purely statistical data, but rather inferring temporal relationships from statistical data together with data about how to factorize the world into variables.[1]
A doctor who misdiagnoses their patient or misidentifies a symptom may base their subsequent reasoning on a wrong factorization of the situation into causally relevant variables. We would ideally like to build fewer assumptions like this into our model of inference, and instead allow the reasoner to figure such facts out, consider the merits of different factorizations into variables, etc.
Instead of beginning with a collection of variables and a joint probability distribution over those variables, one could imagine starting with just a finite sample space and a probability distribution on that sample space. In this way, we might hope to do temporal inference purely using statistical data, without relying on a priori knowledge of a canonical way of factoring the situation into variables.
How might one do temporal inference without an existing factorization? One way might be to just consider all possible variables that can be defined on the sample space. This gives us one variable for each partition of the set.
However, when one tries to apply Pearl's methods to this collection of variables, one quickly runs into a problem: many of the variables definable on a fixed set are deterministic functions of each other. The Pearlian paradigm, as presented in the early chapters of Causality, lacks tools for performing temporal inference on variables that are highly deterministically related.[2]
We will introduce a new approach to temporal inference instead—one which is heavily inspired by the Pearlian paradigm, but approaches the problem with a very different formal apparatus, and does not make use of graphical models.
1.2. Overview
We'll begin by introducing the concept of a finite factored set, in Section 2. This will be our analogue of the directed acyclic graphs in Pearl's framework.
In Section 3, we will introduce the concepts of time and orthogonality, which can be read off of a finite factored set. In Pearl's framework, "time" corresponds to directed paths between nodes, and "orthogonality" corresponds to nodes that have no common ancestor.
In Section 4, we will introduce conditional orthogonality, which is our analogue of d-separation. We show that conditional orthogonality satisfies (a modified version of) the compositional semigraphoid axioms. We then (in Section 5) prove the fundamental theorem of finite factored sets, which states that conditional orthogonality is equivalent to conditional independence in all probability distributions on the finite factored set.
In Section 6, we discuss how to do temporal inference using finite factored sets, and give two examples. Finally, in Section 7 we discuss applications and future work, with an emphasis on temporal and conceptual inference, generalizing finite factored sets to the infinite case, and applications to embedded agency.
And here, we take our leave of Pearl. We've highlighted this approach's relationship to the Pearlian paradigm in order to motivate finite factored sets and explain how we'll be using them in this sequence. Formally, however, our approach is quite unlike Pearl's, and the rest of the sequence will stand alone.
2. Factorizations
Before giving a definition of finite factored sets, we will recall the definition of a partition, and give some basic notation related to partitions.
We do this for two reasons. First, we will use partitions in the definition of a factored set; and second, we want to draw attention to a duality between the notion of a partition, and the notion of a factorization.
2.1. Partitions
We begin with a definition of disjoint union.
Definition 1 (disjoint union). Given a set S of sets, let ⊔(S) denote the set of all ordered pairs (T,t), where T∈S and t∈T.[3]
Definition 2 (partition). A partition of a set S is a set X⊆P(S) of nonempty subsets of S such that the function ι:⊔(X)→S given by ι(x,s)=s is a bijection.[4]
Let Part(S) denote the set of all partitions of S. The elements of a partition are called parts.
An equivalent definition of partition is often given: a partition is a set A of nonempty subsets of S that are pairwise disjoint and union to S. We choose the above definition because it will make the symmetry between partitions and factorizations more obvious.
Definition 3 (trivial partition). A partition X of a set S is called trivial if |X|=1.
Definition 4. Given a partition X of a set S, and an element s∈S, let [s]X denote the unique x∈X such that s∈x.
Definition 5. Given a partition X of a set S, and elements s0,s1∈S, we say s0∼Xs1 if [s0]X=[s1]X.
Proposition 1. Given a partition X of a set S, ∼X is an equivalence relation on S.
Proof. Trivial. □
Definition 6 (finer and coarser). We say that a partition X of S is finer than another partition Y of S, if for all s0,s1∈S, if s0∼Xs1, then s0∼Ys1.
If X is finer than Y, we also say Y is coarser than X, and we write X≥SY and Y≤SX.
Definition 7 (discrete and indiscrete partitions). Given a set S, let DisS={{s}∣s∈S}.
If S is empty, let IndS={}, and if S is nonempty, let IndS={S}.
DisS is called the discrete partition, and IndS is called the indiscrete partition.
Proposition 2. For any set S, ≥S is a partial order on Part(S). Further, for all X∈Part(S), DisS≥SX and X≥SIndS.
Proof. Trivial. □
While both notations are sometimes used, it is more standard to draw the symbol in the opposite direction and have X≤Y when X is finer than Y. We choose to go against that standard because we want to think of partitions in part as the ability to distinguish between elements, and finer partitions correspond to greater ability to distinguish.[5]
Definition 8 (common refinement). Given a set C of partitions of a fixed set S, let ⋁S(C) denote the partition X∈Part(S) satisfying s0∼Xs1 if and only if s0∼cs1 for all c∈C. Given X,Y∈Part(S), we let X∨SY=⋁S({X,Y}).
2.2. Factorizations
We start with a definition of Cartesian product.
Definition 9 (Cartesian product). Given a set S of sets, let ⊓(S) denote the set of all functions f:S→⊔(S) such that for all T∈S, f(T) is of the form (T,t), for some t∈T.
We can now give the definition of a factorization of a set.
Definition 10 (factorization). A factorization of a set S is a set B⊆Part(S) of nontrivial partitions of S such that the function π:S→⊓(B), given by π(s)=(b↦(b,[s]b)), is a bijection.
Let Fact(S) denote the set of all factorizations of S. The elements of a factorization are called factors.
In other words, a set of nontrivial partitions is a factorization of S if for each way of choosing one part from each factor, there exists a unique element of S in the intersection of those parts.
Notice the duality between the definitions of partition and factorization. We replace subsets with partitions, nonempty with nontrivial, and disjoint union with Cartesian product, and we reverse the direction of the function. We can think of a factorization of S as a way to view S as a product, in the same way that a partition was a way to view S as a disjoint union.
A factored set is just a set together with a factorization of that set.
Definition 11 (factored set). A factored set F is an ordered pair (S,B), such that B is a factorization of S.
If F=(S,B) is a factored set, we let set(F)=S, and let basis(F)=B.
An important fact about factored sets is that the factors are enough to distinguish distinct elements.
Proposition 3. Given a factored set F=(S,B), and elements s0,s1∈S, if s0∼bs1 for all b∈B, then s0=s1.
Proof. Let F=(S,B) be a finite factored set, and let s0,s1∈S satisfy s0∼bs1 for all b∈B.
Let π:S→⊓(B) be given by π(s)=(b↦(b,[s]b)), as in the definition of factorization. Then π(s0)=(b↦(b,[s0]b))=(b↦(b,[s1]b))=π(s1). Since π is bijective, this means s0=s1. □
2.3. Chimera Functions
The following theorem can be viewed as an alternate characterization of factorization. We will use this alternate characterization to define chimera functions, which will be useful tools for manipulating elements of factored sets.
Theorem 1. Given a set S, a set B of nontrivial partitions of S is a factorization of S if and only if for every function g:B→S, there exists a unique s∈S such that for all b∈B, s∼bg(b).
Proof. First, we let B be a factorization of S, and let g:B→S be any function. We want to show that there exists a unique s∈S such that for all b∈B, s∼bg(b). Let π:S→⊓(B) be given by π(s)=(b↦(b,[s]b)), as in the definition of factorization. Note that π is bijective, and thus has an inverse.
Let s=π−1(b↦(b,[g(b)]b)). Observe that this is well-defined, because (b↦(b,[g(b)]b)) is in fact in ⊓(B). We will show that s∼bg(b) for all b∈B, and the uniqueness of this s will then follow directly from Proposition 3.
We have π(s)=(b↦[s]b) by the definition of π. However, we also have π(s)=(b↦[g(b)]b) by the definition of s. Thus, b↦[s]b and b↦[g(b)]b are the same function, so [s]b=[g(b)]b for all b∈B, so s∼bg(b) for all b∈B.
Conversely, let S be any set, and let B be any set of nontrivial partitions of S. Assume that for all g:B→S, there exists a unique s∈S satisfying s∼bg(b) for b∈B. Again, let π:S→⊓(B) be given by π(s)=(b↦(b,[s]b)), as in the definition of factorization. We want to show that π is invertible.
First, we show that π is injective. Take an arbitrary s0∈S, and let g:B→S be the constant function satisfying g(b)=s0 for all b∈B. Given another s1∈S, if π(s0)=π(s1), then (b↦[s0]b)=(b↦[s1]b), so [s1]b=[s0]b=[g(b)]b for all b∈B, so s0∼bs1∼bg(b) for all b∈B. Since there is a unique s∈S satisfying s∼bg(b) for all b∈B, this means s0=s1. Thus π is injective.
To see that π is surjective, consider some arbitrary h∈⊓(B). We want to show that there exists an s∈S with h=π(s).
For all b∈B, let Hb∈b be given by h(b)=(b,Hb), which is well-defined since h∈⊓(B). Note that Hb is a nonempty subset of S, so there exists a function g:B→S with g(b)∈Hb for all b∈B. Fix any such g, and let s satisfy s∼bg(b) for all b∈B.
We thus have that for all b∈B, h(b)=(b,Hb)=(b,[g(b)]b)=(b,[s]b)=π(s)(b), so h=π(s). Thus π is surjective.
Since π is bijective, we have that B is a factorization of S. □
This also gives us that factors are disjoint from each other.
Corollary 1. Given a factored set F=(S,B) and distinct factors b0,b1∈B, b0∩b1={}.
Proof. Assume by way of contradiction that T∈b0∩b1. Since b0 is nontrivial, there must be some other T′∈b0 with T∩T′={}. Let g:B→S be any function such that g(b0)∈T′ and g(b1)∈T. Then there can be no s such that s∼b0g(b0) and s∼b1g(b1), since then s would be in both T and T′. This contradicts Theorem 1. □
We are now ready to define the chimera function of a factored set.
Definition 12 (chimera function). Given a factored set F=(S,B), the chimera function (of F) is the function χF:(B→S)→S defined by χF(g)∼bg(b) for all g:B→S and b∈B.
The name "chimera function" comes from the fact that χF can be viewed as building an element of S by fusing together the properties of various different elements. Since we will often apply the chimera function to functions g that only take on two values, we will give notation for this special case.
Definition 13. Given a factored set F=(S,B), and a subset C⊆B, let χFC:S×S→S be given by χFC(s,t)=χF(g), where g:B→S is given by g(b)=s if b∈C, and g(b)=t otherwise.
For T,R⊆S, we will write χFC(T,R) for {χFC(t,r)∣t∈T, r∈R}.
The following is a list of properties of χFC, which will be useful in later proofs. All of these properties follow directly from the definition of χFC.
Proposition 4. Fix F=(S,B), a factored set, C,D⊆B, and s,t,r∈S.
Proof. Trivial. □
2.4. Trivial Factorizations
We now define a notion of a trivial factorization of a set, and show that every set has a unique trivial factorization.
Definition 14 (trivial factorization). A factorization B of a set S is called trivial if |B|≤1. A factored set (S,B) is called trivial if B is trivial.
Proposition 5. For every set S, there exists a unique trivial factorization B of S. If |S|≠1, this trivial factorization is given by B={DisS}, and if |S|=1, it is given by B={}.
Proof. We start with the case where |S|=0. The only partition of S is {}, so we only need to consider the sets of partitions {{}} and {} as potential factorizations. {{}} is vacuously a factorization of S by Theorem 1, since there are no functions from {{}} to S. {} is not a factorization by Theorem 1, since there is a function from {} to S, but there is no element of S. Thus, when |S|=0, {{}}={DisS} is the unique trivial factorization of S.
Next, consider the case where |S|=1. First, observe that the unique s∈S vacuously satisfies s∼bg(b) for all g:{}→S and b∈{}, since there is no b∈{}. Thus, by Theorem 1, {} is a factorization of S. Further, {} is the only factorization of S, since there are no nontrivial partitions of S. Thus, when |S|=1, {} is the unique trivial factorization of S.
Next, we consider the case where |S|≥2. Observe that DisS is a nontrivial partition of S. Let B={DisS}. We want to show that B is a factorization of S. By Theorem 1, it suffices to show that for all g:B→S, there exists a unique s∈S with s∼DisSg(DisS). We can take s=g(DisS), which clearly satisfies s∼DisSg(DisS). This s is unique, since if s′∼DisSg(DisS), then s′∈[g(DisS)]DisS=[s]b={s}, so s′=s. Thus B is a factorization of S.
On the other hand, if |S|≥2, {} is not a factorization of S, since if it were, Proposition 3 would imply that all elements of S are equal. Further, for any partition b of S, with b≠{DisS}, there must exist s0,s1∈S, with s0∼bs1, but s0≠s1. Thus {b} cannot be a factorization of S by Proposition 3. Thus when |S|≥2, DisS is the unique trivial factorization of S. □
2.5. Finite Factored Sets
This sequence will primarily be about finite factored sets.
Definition 15. If F=(S,B) is a factored set, the size of F, written size(F), is the cardinality of S. The dimension of F, written dim(F), is the cardinality of B. F is called finite if its size is finite, and finite-dimensional if its dimension is finite.
We suspect that the theory of infinite factored sets is both interesting and important. However, it is outside of the scope of this sequence, which will require finiteness for many of its key results.
Some of the definitions and results in this sequence will be given for finite factored sets, in spite of the fact that they could easily be extended to finite-dimensional or arbitrary factored sets. This is because they can often be extended in more than one way, and determining which extension is most natural requires further developing the theory of arbitrary factored sets.
Proposition 6. Every finite factored set is also finite-dimensional.
Proof. If F=(S,B) is a factored set, B is a set of sets of subsets of S. Thus, |B|≤22|S|. □
This bound is horrible and will be improved in Proposition 9. First, however, we will take a look at the number of factorizations of a fixed finite set.
Proposition 7. Let F=(S,B) be a finite factored set. Then |S|=∏b∈B|b|.
Proof. Trivial. □
Proposition 8. If |S| is equal to 0, 1, or a prime, the trivial factorization of S is the only factorization of S.
Proof. If |S|=0 or |S|=1, then |Part(S)|=1, so B⊆Part(S) can have cardinality at most 1.
If |S|=p, a prime, then by Proposition 7, |b| must divide p for all b∈B. Since factorizations cannot contain trivial partitions, this means |b|=p for all b∈B. However, {{s}∣s∈S} is the only element of Part(S) of cardinality p, so |B|≤1. □
On the other hand, in the case where |S| is finite and composite, the number of factorizations of S grows very quickly, as seen in Table 1. Table 1 shows the number of factorizations of a set S with cardinality up to 25:
Given the naturalness of the notion of factorization, we were surprised to discover that this sequence did not exist on the On-Line Encyclopedia of Integer Sequences (OEIS). We added the sequence, A338681, on April 30, 2021.
To give one concrete example, the four factorizations of the set {0,1,2,3} are:
Proposition 9. Let F be a finite factored set.
Proof. The first three parts follow directly from Proposition 5 and Proposition 8. For the fourth part, let F=(S,B), and let |S|=p0…pk−1 be a product of k≥2 primes.
By Proposition 7, |S|=∏b∈B|b|. Consider an arbitrary b∈B. Since b is a nontrivial partition of a finite set S, |b| is finite and |b|≠1. If |b| were 0, then |S| would be 0. Thus |b| is a natural number greater than or equal to 2. B cannot be empty, since |S|≠1. If |B| were greater than k, then we would be able to express |S| as a product of more than k natural numbers greater than or equal to 2, which is clearly not possible since |S| is a product of k primes. Thus 1≤dim(F)≤k. □
In the next post, we will introduce the notions of the history of a partition, orthogonality between partitions, and time.
Acknowledgments: My thanks to Alex Appel, Ramana Kumar, Xiaoyu He, Tsvi Benson-Tilsen, Andrew Critch, Sam Eisenstat, Rob Bensinger, and Claire Wang for discussion and feedback on this sequence.
Footnotes
[1] Although I say "factorize" here, note that this will not be the kind of factorization that shows up in finite factored sets, because (as we will see) disjoint factors must be independent in a finite factored set. I appeal to the same concept in both contexts because factorization is just a very general and useful concept, rather than to indicate a direct connection.
[2] At least, it lacks such causal inference tools unless we assume access to interventional data.
[3] Note that this definition and Definition 9 could have been made more general by taking S to be a multiset.
[4] P(S) denotes the power set of S.
[5] In our view, "Y≥X" is also a more natural way to visually represent a mapping between a three-part partition Y that is finer than a two-part partition X.