Recall the formal statement of Rice's theorem:
We will use the notation for the th Turing machine under some fixed numbering system. Each such machine induces a function, which we will also write as where this is unambiguous due to context; then it makes sense to write for the value that machine outputs when it is run on input .
Let be a non-empty, proper [1] subset of , where is the graph of the partial function computed by , the th Turing machine. Then there is no Turing machine such that:
- is if
- is if .
We give a proof that is (very nearly) constructive: one which (if we could be bothered to work it all through) gives us an explicit example [2] of a Turing machine whose "am-I-in-" nature cannot be determined by a Turing machine. [3]
We will present an intermediate lemma which does all the heavy lifting; this makes the actual reasoning rather unclear but very succinct, so we will also include an extensive worked example of what this lemma does for us.
The intermediate lemma is a certain fixed-point theorem.
Let be total computable: that is, it halts on every input. Then there is such that . [4]
That is, the "underlying function" of - the partial function computed by - has the same output, at every point, as the function computed by . If we view as a way of manipulating a program (as specified by its description_number), then this fixed-point theorem states that we can find a program whose underlying function is not changed at all by .
The proof of this lemma is quite simple once the magic steps have been discovered, but it is devilishly difficult to intuit, because it involves two rather strange and confusing recursions and some self-reference.
Recall the theorem, which states that there is a total computable function of two variables such that for every , we have : that is, there is a total computable way of currying computable functions. (Strictly speaking, our Turing machines only take one argument. Therefore we should use a computable pairing scheme such as Cantor's pairing function, so that actually should be interpreted as .)
Then the function which takes the pair and outputs the value of is computable, so it has a description number , say. [5]
Now we claim that is the we seek. [6] Indeed, for any , by definition of ; this is by the theorem; this is by definition of ; and that is by definition of .
Therefore , so we have found our fixed point.
Suppose our description numbering scheme is just "expand as a number in base , and interpret the result as an ASCII [7] string; then interpret that string as Python code".
Then our function , whatever it may be, can be viewed just as transforming Python code.
Suppose does nothing more than insert the following line of code as the second line of its input:
x = 0
So, for instance, it takes the string
x = 1
print(x)
and returns
x = 1
x = 0
print(x)
thereby changing the function computed from "return the constant " to "return the constant ", in this case. Note that many other functions will not change at all: for example, those which don't contain a variable in the first place will be unchanged, because all the modification does is add in an initialisation of a variable which will never subsequently be used.
The fixed-point theorem guarantees that there is indeed a Python program which will not change at all under this modification (though in this case it's very obvious). In fact the theorem constructs such a program; can we work out what it is?
First of all, can be implemented as follows.
We will take our Python code to be written so that their input is given in the variable r1
, so is simply the Python code represented by but where the code-variable r1
is initialised to first; that is, it can be found by prepending the line r1 = 5
to the code represented by .
Then we will assume that Python comes with a function eval
(corresponding to ) which takes as its input a string [8] and another argument with which eval
initialises the variable x
before running the string as a Python program in a separate instance of Python:
eval("print(r1)", 5) # does nothing other than print the number 5
eval("print(y)", 5) # throws an error because `y` is not defined when it comes to printing it
eval("print(6)", 5) # prints 6, ignoring the fact that the variable `r1` is equal to `5` in the sub-instance
Remember, our proof of the fixed point theorem says that the program we want has code , where takes a pair as input, and outputs . What is specifically here? Well, on the one hand we're viewing it as a string of code (because it comes as the first argument to ), and on the other we're viewing it as an integer (because it also comes as the second argument to ).
As code, a
is the following string, where h
is to be replaced by whatever we've already decided is:
eval("r1 = e; h(eval(r1, str_as_int(r1)))", x)
We are assuming the existence of a function str_as_int
which takes an ASCII string and returns the integer whose places in base 128 are the ASCII for each character of the string in turn.
For example, we have inserting the line x = 0
as the second line, so our a
is:
eval("r1 = e; x = 0; eval(r1, str_as_int(r1))", x)
As a number, a
is just the ASCII for this, interpreted in base 128 (i.e. a certain number which in this case happens to have 106 digits, which is why we don't give it here).
The claim of the fixed-point theorem, then, is that the following program is unchanged by :
eval("eval(\"r1 = e; x = 0; eval(r1, str_as_int(r1))\", x)", str_to_int("eval(\"r1 = e; x = 0; eval(r1, str_as_int(r1))\", x)"))
You may recognise this as a quining construction.
Finally, Rice's theorem follows quickly: suppose we could decide in general whether or not, and label by the computable function which decides this (that is, whose value is if , and otherwise).
Since is nonempty and proper, there are natural numbers and such that but . Define the computable function which takes and outputs if , and otherwise. (That is, it flips its input: if its input had the property of , the function outputs whose graph is not in , and vice versa. Informally, it is the program-transformer that reads in a program, determines whether the program computes a function in or not, and transforms the program into a specific canonical example of something which has the opposite -ness status.)
By the fixed-point theorem, we can find such that .
But now we can ask whether is in (and therefore whether is in ).
We have obtained contradictions in both cases (namely that is both in and not in ), so it must be the case that does not exist after all.
That is, it is not the entire set.
Well, very nearly; see the next note.
It's only "very nearly" constructive. It would be actually constructive if we knew in advance a specific example of a program whose function is in , and a program whose function is in . The proof here assumes the existence of a program of each type, but ironically the theorem itself guarantees that there is no fully-general way to find such programs.
And, moreover, we can actually find such an .
This is the first strange part: we are treating both as a description number, and as an input to \(E\), when we consider .
This is the second strange part, for the same reason as was the first; but this one is even worse, because the definition of already involves a weird recursion and we've just added another one on top.
This is a standard, agreed-upon method of turning a number between and into a character.
The string is standing in place of , but we have just skipped the intermediate step of "unpack the integer into a string" and gone straight to assuming it is a string.