The disjoint union of two sets is just the union, but with the additional information that the two sets have no elements in common. That is, we can use the phrase "disjoint union" to indicate that we've taken the union of two sets which have empty intersection. The phrase can also be used to indicate a very slightly different operation: "do something to the elements of each set to make sure they don't overlap, and then take the union".
"Disjoint union" can mean one of two things:
- the ordinary union, together with the assertion that the two sets don't overlap (the first definition);
- the operation "do something to the elements of each set to make sure they don't overlap, and then take the union" (the second definition).
(Mathematicians usually let context decide which of these meanings is intended.)
The disjoint union is written with the symbol $\sqcup$: the disjoint union of sets $A$ and $B$ is $A \sqcup B$.
Let's look at $A = \{6, 7\}$ and $B = \{8, 9\}$. These two sets don't overlap: no element of $A$ is in $B$, and no element of $B$ is in $A$. So we can announce that the union of $A$ and $B$ (that is, the set $\{6, 7, 8, 9\}$) is in fact a disjoint union.
In this instance, writing $A \sqcup B = \{6, 7, 8, 9\}$ is just giving the reader an extra little hint that $A$ and $B$ are disjoint; I could just have written $A \cup B$, and the formal meaning would be the same. For the purposes of the first definition, think of $\sqcup$ as $\cup$ but with a footnote reading "And, moreover, the union is disjoint".
As a non-example, we could not legitimately write $\{1, 2\} \sqcup \{1, 3\} = \{1, 2, 3\}$, even though $\{1, 2\} \cup \{1, 3\} = \{1, 2, 3\}$; this is because $1$ is in both of the sets we are unioning.
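The first definition can be checked mechanically. The following Python sketch returns the ordinary union, but refuses to do so when the two sets share an element. The function name `checked_disjoint_union` and the sample sets are made up for illustration.

```python
def checked_disjoint_union(a: set, b: set) -> set:
    """Return the ordinary union a | b, but only if a and b are disjoint."""
    common = a & b
    if common:
        # The "footnote" of the first definition would be false here.
        raise ValueError(f"sets are not disjoint; they share {common}")
    return a | b


# A legitimate disjoint union: the two sets have nothing in common.
print(checked_disjoint_union({6, 7}, {8, 9}))

# A non-example: both sets contain 1, so the union is not disjoint.
try:
    checked_disjoint_union({1, 2}, {1, 3})
except ValueError as err:
    print("not a disjoint union:", err)
```

When the check passes, the result is exactly the ordinary union; the function adds only the disjointness guarantee.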
The second definition is the more interesting one, and it requires some fleshing out.
Let's think about $A = \{6, 7\}$ and $B = \{6, 8\}$ (so the two sets overlap). We want to massage these two sets so that they become disjoint, but are somehow "still recognisably $A$ and $B$".
There's a clever little trick we can do. We tag every member of $A$ with a little note saying "I'm in $A$", and every member of $B$ with a note saying "I'm in $B$". To turn this into something that fits into set theory, we tag an element $a$ of $A$ by putting it in an ordered pair with the number $1$: $(a, 1)$ is "$a$ with its tag". Then our massaged version of $A$ is the set $A'$ consisting of all the elements of $A$, but where we tag them first:
$$A' = \{ (a, 1) : a \in A \}$$
Now, to tag the elements of $B$ in the same way, we should avoid using the tag $1$ because that means "I'm in $A$"; so we will use the number $2$ instead. Our massaged version of $B$ is the set $B'$ consisting of all the elements of $B$, but we tag them first as well:
$$B' = \{ (b, 2) : b \in B \}$$
Notice that $A$ bijects with $A'$ [1], and $B$ bijects with $B'$, so we've got two sets which are "recognisably $A$ and $B$".
But magically $A'$ and $B'$ are disjoint, because everything in $A'$ is a tuple with second element equal to $1$, while everything in $B'$ is a tuple with second element equal to $2$.
We define the disjoint union of $A$ and $B$ to be $A' \sqcup B'$ (where $\sqcup$ now means the first definition: the ordinary union but where we have the extra information that the two sets are disjoint). That is, "make the sets $A$ and $B$ disjoint, and then take their union".
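The tagging construction translates almost word for word into code. Here is a minimal Python sketch, assuming the tags 1 and 2 and using the hypothetical helper names `tag` and `disjoint_union`; the sample sets are illustrative.

```python
def tag(s: set, label) -> set:
    """Pair every element of s with the given label."""
    return {(x, label) for x in s}


def disjoint_union(a: set, b: set) -> set:
    """Second-sense disjoint union: tag a with 1, tag b with 2, then unite."""
    return tag(a, 1) | tag(b, 2)


A = {6, 7}
B = {6, 8}
A_tagged = tag(A, 1)
B_tagged = tag(B, 2)

# A and B overlap, but their tagged copies never do.
assert A & B == {6}
assert A_tagged.isdisjoint(B_tagged)

# Tagging loses nothing: dropping the tag recovers the original set.
assert {x for (x, _) in A_tagged} == A
assert {x for (x, _) in B_tagged} == B

# Hence the disjoint union always has len(A) + len(B) elements.
assert len(disjoint_union(A, B)) == len(A) + len(B)
```

Any two distinct labels would work equally well; 1 and 2 are only a convention.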
Take a specific example where $A = \{6, 7\}$ and $B = \{6, 8\}$. In this case, it only makes sense to use $\sqcup$ in the second sense, because $A$ and $B$ overlap (they both contain the element $6$).
Then $A' = \{(6, 1), (7, 1)\}$ and $B' = \{(6, 2), (8, 2)\}$, and the disjoint union is
$$A \sqcup B = \{(6, 1), (7, 1), (6, 2), (8, 2)\}$$
Notice that $A \cup B = \{6, 7, 8\}$ has only three elements, because $6$ is in both $A$ and $B$ and that information has been lost on taking the union. On the other hand, the disjoint union $A \sqcup B$ has the required four elements because we've retained the information that the two $6$'s are "different": they appear as $(6, 1)$ and $(6, 2)$ respectively.
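The difference between the two unions can be seen by counting, and by trying to take the result apart again. The Python sketch below uses the overlapping sets {6, 7} and {6, 8} as illustrative values; `split_by_tag` is a hypothetical helper that assumes the tags are 1 and 2.

```python
def split_by_tag(tagged: set) -> tuple:
    """Recover the two original sets from a disjoint union tagged with 1 and 2."""
    first = {x for (x, t) in tagged if t == 1}
    second = {x for (x, t) in tagged if t == 2}
    return first, second


ordinary = {6, 7} | {6, 8}
disjoint = {(6, 1), (7, 1), (6, 2), (8, 2)}

print(len(ordinary))  # 3: the shared 6 is counted only once
print(len(disjoint))  # 4: (6, 1) and (6, 2) are different elements

# The tags let us recover both original sets; the ordinary union cannot.
print(split_by_tag(disjoint) == ({6, 7}, {6, 8}))  # True
```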
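Now take $A = \{1, 2\}$ and $B = \{3, 4\}$. In this example, the notation $A \sqcup B$ is slightly ambiguous, since $A$ and $B$ are disjoint already. Depending on context, it could either mean $A \cup B = \{1, 2, 3, 4\}$, or it could mean $A' \cup B' = \{(1, 1), (2, 1), (3, 2), (4, 2)\}$ (where $A' = \{(1, 1), (2, 1)\}$ and $B' = \{(3, 2), (4, 2)\}$). It will usually be clear which of the two senses is meant; the former is more common in everyday maths, while the latter is usually intended in set theory.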
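As an exercise: what happens if $A = B = \{6, 7\}$?
Only the second definition makes sense, because the two sets overlap completely.
Then $A' = \{(6, 1), (7, 1)\}$ and $B' = \{(6, 2), (7, 2)\}$, so $A \sqcup B = \{(6, 1), (7, 1), (6, 2), (7, 2)\}$, which has four elements.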
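Let $A$ be the set $\mathbb{N}$ of natural numbers including $0$, and let $B$ be the set $\{1, 2, x\}$ containing two natural numbers and one symbol $x$ which is not a natural number.
Then $A \sqcup B$ only makes sense under the second definition; it is the union of $A' = \{(0, 1), (1, 1), (2, 1), (3, 1), \dots\}$ and $B' = \{(1, 2), (2, 2), (x, 2)\}$, or
$$\{(0, 1), (1, 1), (2, 1), (3, 1), \dots, (1, 2), (2, 2), (x, 2)\}$$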
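Finally, take $A = \mathbb{N}$ and $B = \{x, y\}$, where $x$ and $y$ are symbols which are not natural numbers. In this case, again the notation $A \sqcup B$ is ambiguous; it could mean $\{0, 1, 2, \dots, x, y\}$, or it could mean $\{(0, 1), (1, 1), (2, 1), \dots, (x, 2), (y, 2)\}$.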
We can generalise the disjoint union so that we can write $A \sqcup B \sqcup C$ instead of just $A \sqcup B$.
To use the first definition, the generalisation is easy to formulate: it's just $A \cup B \cup C$, but with the extra information that $A$, $B$ and $C$ are pairwise disjoint (so there is nothing in any of their intersections: $A$ and $B$ are disjoint, $B$ and $C$ are disjoint, and $A$ and $C$ are disjoint).
To use the second definition, we just tag each set again: let $A' = \{(a, 1) : a \in A\}$, $B' = \{(b, 2) : b \in B\}$, and $C' = \{(c, 3) : c \in C\}$. Then $A \sqcup B \sqcup C$ is defined to be $A' \cup B' \cup C'$.
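Both generalisations are short to write down in code. In the Python sketch below, the names `pairwise_disjoint` and `disjoint_union_tagged` are hypothetical, and the tags are assumed to be 1, 2, 3, ... in the order the sets are given. The example sets are chosen so that the triple intersection is empty while the sets are still not pairwise disjoint, which shows why the first definition asks for the pairwise condition.

```python
from itertools import combinations


def pairwise_disjoint(*sets: set) -> bool:
    """True if no two of the given sets share an element."""
    return all(s.isdisjoint(t) for s, t in combinations(sets, 2))


def disjoint_union_tagged(*sets: set) -> set:
    """Tag the k-th set with k (counting from 1) and take the union."""
    return {(x, k) for k, s in enumerate(sets, start=1) for x in s}


A, B, C = {1, 2}, {2, 3}, {3, 1}

print(A & B & C == set())           # True: nothing lies in all three sets
print(pairwise_disjoint(A, B, C))   # False: every pair still overlaps
print(pairwise_disjoint({1}, {2}, {3}))     # True
print(len(disjoint_union_tagged(A, B, C)))  # 6
```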
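In fact, both definitions generalise even further, to unions indexed by an arbitrary set $I$. Indeed, in the first sense we can define
$$\bigsqcup_{i \in I} A_i = \bigcup_{i \in I} A_i$$
together with the information that no two of the $A_i$ intersect: $A_i \cap A_j = \emptyset$ whenever $i \neq j$.
In the second sense, we can define
$$\bigsqcup_{i \in I} A_i = \bigcup_{i \in I} A_i'$$
where $A_i' = \{(a, i) : a \in A_i\}$.
For example,
$$\bigsqcup_{n \in \mathbb{N}} \{0, 1, \dots, n\} = \{(0, 0)\} \cup \{(0, 1), (1, 1)\} \cup \{(0, 2), (1, 2), (2, 2)\} \cup \dots = \{(a, n) : a, n \in \mathbb{N},\ a \leq n\}$$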
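For a finite index set, the second-sense construction can be sketched in Python by storing the family as a dictionary from index to set and using each index as its own tag. The function name `disjoint_union_family` and the sample family $A_n = \{0, 1, \dots, n\}$, cut off at $n = 2$, are illustrative choices.

```python
def disjoint_union_family(family: dict) -> set:
    """Second-sense disjoint union of a family given as {index: set}."""
    return {(a, i) for i, members in family.items() for a in members}


# A finite slice of the family A_n = {0, 1, ..., n}, for n = 0, 1, 2.
family = {n: set(range(n + 1)) for n in range(3)}
result = disjoint_union_family(family)

# Each element a of A_n shows up as the pair (a, n).
assert result == {(0, 0), (0, 1), (1, 1), (0, 2), (1, 2), (2, 2)}

# Equivalently: all pairs (a, n) with a <= n, within the slice.
assert result == {(a, n) for n in range(3) for a in range(3) if a <= n}

# The size is the sum of the sizes of the members of the family.
assert len(result) == sum(len(members) for members in family.values())
```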
The first definition is basically just a notational convenience: it saves a few words when saying "… and moreover the sets are pairwise disjoint".
The real meat of the idea is the second definition, which provides a way of forcing the sets to be disjoint. It's not necessarily the only way we could coherently define a disjoint union (since there's more than one way we could have tagged the sets; if nothing else, $A \sqcup B$ could be defined the other way round, as $A' \cup B'$ where $A' = \{(a, 2) : a \in A\}$ and $B' = \{(b, 1) : b \in B\}$, swapping the tags). But it's the one we use by convention. Usually when we're using the second definition we don't much care exactly how we force the sets to be disjoint; we only care that there is such a way. (For comparison, there is more than one way to define the ordered pair in ZF set theory, but we almost never really care which exact definition we use; only that there is a definition that has the properties we want from it.)
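To see concretely why the choice of tags doesn't matter, here is a small Python sketch comparing the usual convention with the swapped one. The function names are hypothetical and the sets are illustrative; the map `swap_tag` is its own inverse, so it is a bijection between the two versions of the disjoint union.

```python
def union_tags_12(a: set, b: set) -> set:
    """The usual convention: a gets tag 1, b gets tag 2."""
    return {(x, 1) for x in a} | {(y, 2) for y in b}


def union_tags_21(a: set, b: set) -> set:
    """The swapped convention: a gets tag 2, b gets tag 1."""
    return {(x, 2) for x in a} | {(y, 1) for y in b}


def swap_tag(pair: tuple) -> tuple:
    """Send (x, 1) to (x, 2) and (x, 2) to (x, 1)."""
    x, t = pair
    return (x, 3 - t)


A, B = {6, 7}, {6, 8}
usual = union_tags_12(A, B)
swapped = union_tags_21(A, B)

# The two conventions give different sets...
assert usual != swapped
# ...but swap_tag carries each one exactly onto the other.
assert {swap_tag(p) for p in usual} == swapped
assert {swap_tag(p) for p in swapped} == usual
```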
[1] Indeed, a bijection from $A$ to $A'$ is the map $a \mapsto (a, 1)$.