2. PR Indices and Diagonalization
Kevin T. Kelly
Department of Philosophy
Carnegie Mellon University
Our general strategy for indexing functions (the
input-output behaviors of programs) will be to effectively decode numbers
into unique programs and say that a function has a given index if decoding
the index yields a program of the function. This will mean that,
typically, each function has many indices, since many different programs
can compute it. In primitive recursion this is readily seen to be
the case, since any number of projections can be composed on the outside
of a sensible program and the result is the same.
C(p[1,1], f) = C(p[1,1], C(p[1,1], f)) = C(p[1,1], C(p[1,1], C(p[1,1], f))) = ...
For this reason, we don't have to be too careful
about each program having a unique index corresponding to it: functions
will end up with infinitely many redundant numbers anyway!
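The redundancy can be seen concretely. Here is a small Python sketch (the names C, p11, and s stand for this document's composition operator, the projection p[1,1], and successor; the translation into Python is illustrative only):

```python
def p11(y):
    # the projection p[1,1]: returns its single argument unchanged
    return y

def s(y):
    # successor, a basic PR function
    return y + 1

def C(outer, *inner):
    # composition: C(g, h1, ..., hm)(y...) = g(h1(y...), ..., hm(y...))
    return lambda *ys: outer(*(h(*ys) for h in inner))

# Wrapping any number of p[1,1]'s around s never changes the behavior,
# so every one of these programs computes the successor function.
f1 = C(p11, s)
f2 = C(p11, C(p11, s))
for n in range(10):
    assert f1(n) == f2(n) == s(n) == n + 1
```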
The important properties are:
- There is an effective procedure to decode numbers into unique programs.
- Every PR program has a number.
- Every number decodes into some program.
We have to index functions of every arity.
There are two possible ways to proceed:
- One big enumeration of functions of all arities: We could number all the functions of all arities at once, so that the index determines the arity. But then we have to go back and construct sub-enumerations of the unary functions, binary functions, etc. in order to set up diagonal arguments. Also, the inductive clauses in the definition become inelegant because, for example, you can't apply primitive recursion to functions of the wrong arities.
- Separate enumerations for different arities: We can have the same index pick out different functions for different specified arities. This approach is natural when we get to Turing machines, since what a Turing machine does depends on how many arguments it is given.
We will follow the second strategy.
f[x, k](y1, ..., yk) = the value of the xth k-ary program on inputs (y1, ..., yk).
Then the unary PR functions will be enumerated as:
f[0, 1], f[1, 1], ....
Here's one way to do it. Recall that d(n, i) = (n)_i is the PR decoding function for the PR finite sequence encoding <x>.
An indexing of PR functions of given arities:
The idea of our indexing is to use the length of the sequence encoded by
the index to determine the last primitive recursive operator applied to
generate the function and to use the components of the coded sequence to
determine which functions the operation is applied to.
In reading the following definition, don't forget that positions in coded sequences are counted starting from 0 and that the length of a code number is the number of items occurring in the sequence (including the one indexed as 0); e.g.
d(<i, j, k>, 2) = k;
lh(<i, j, k>) = 3.
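For concreteness, here is a Python sketch of one standard choice of sequence coding (prime-power coding; the notes do not fix a particular coding, so this choice is an assumption) with the decoding function d and length function lh behaving as described:

```python
def nth_prime(i):
    # the i-th prime, counting from 0: nth_prime(0) == 2
    count, n = -1, 1
    while count < i:
        n += 1
        if all(n % m for m in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def encode(seq):
    # <x0, ..., x_{m-1}> = prod_i nth_prime(i) ** (x_i + 1);
    # the +1 keeps zero components from erasing their prime, so lh is recoverable
    code = 1
    for i, x in enumerate(seq):
        code *= nth_prime(i) ** (x + 1)
    return code

def d(n, i):
    # d(n, i) = (n)_i, the item in position i (counting from 0); -1 if absent
    p, e = nth_prime(i), 0
    while n % p == 0:
        n //= p
        e += 1
    return e - 1

def lh(n):
    # length: the number of items in the coded sequence
    i = 0
    while d(n, i) >= 0:
        i += 1
    return i

# matching the examples in the text, with <i, j, k> instantiated as <4, 7, 2>:
c = encode([4, 7, 2])
assert d(c, 2) == 2 and lh(c) == 3
```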
Now define the function denoted by f[x, k] by cases
as follows:
k = 0 ⇒ f[x, k] = x;
k > 0 ⇒
    lh(x) = 0 ⇒ f[x, k] = C(o, p[1,k]);
    lh(x) = 1 ⇒ f[x, k] = C(s, p[1,k]);
    lh(x) = 2 ⇒ f[x, k] = p[min(d(x, 0), k), k];
    lh(x) = 3 ⇒ f[x, k] = C(f[d(x, 0), lh(d(x, 1))], f[d(d(x, 1), 0), k], ..., f[d(d(x, 1), lh(d(x, 1)) -. 1), k]);
    lh(x) = 4 ⇒ f[x, k] = R(f[d(x, 0), k -. 1], f[d(x, 1), k + 1]);
    lh(x) > 4 ⇒ f[x, k] = C(o, p[1,k]).
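To check that the clauses hang together, here is a Python sketch of an interpreter pr_eval(x, k, args) that evaluates f[x, k] on a k-tuple by following the cases above. The prime-power sequence coding and the convention that projection subscripts run from 1 to k are assumptions of this sketch, not part of the official definition:

```python
def nth_prime(i):
    # the i-th prime, counting from 0
    count, n = -1, 1
    while count < i:
        n += 1
        if all(n % m for m in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def encode(seq):
    # <x0, ..., x_{m-1}> = prod_i nth_prime(i) ** (x_i + 1)
    code = 1
    for i, x in enumerate(seq):
        code *= nth_prime(i) ** (x + 1)
    return code

def d(n, i):
    # d(n, i) = (n)_i; returns -1 when position i is empty
    p, e = nth_prime(i), 0
    while n % p == 0:
        n //= p
        e += 1
    return e - 1

def lh(n):
    i = 0
    while d(n, i) >= 0:
        i += 1
    return i

def pr_eval(x, k, args):
    # Evaluate f[x, k] on the k-tuple args, by the case definition above.
    if k == 0:
        return x                     # f[x, 0] is the constant x
    L = lh(x)
    if L == 0 or L > 4:
        return 0                     # C(o, p[1,k]): constantly zero
    if L == 1:
        return args[0] + 1           # C(s, p[1,k]): successor of first argument
    if L == 2:
        m = max(1, min(d(x, 0), k))  # projection p[min(d(x,0),k), k];
        return args[m - 1]           # subscripts assumed to run from 1 to k
    if L == 3:                       # composition
        j = d(x, 1)
        inner = tuple(pr_eval(d(j, i), k, args) for i in range(lh(j)))
        return pr_eval(d(x, 0), lh(j), inner)
    # L == 4: primitive recursion R(g, h), recursing on the last argument
    *rest, y = args
    if y == 0:
        return pr_eval(d(x, 0), k - 1, tuple(rest))
    prev = pr_eval(x, k, (*rest, y - 1))
    return pr_eval(d(x, 1), k + 1, (*rest, y - 1, prev))

# encode([0]) = 2 has lh 1, so f[2, 1] is the successor function
assert pr_eval(2, 1, (5,)) == 6
# encode([1, 0]) = 12 has lh 2 with d(12, 0) = 1, so f[12, 1] = p[1,1]
assert pr_eval(12, 1, (7,)) == 7
# lh 3: composing successor (index 2) with p[1,1] (index 12) gives successor again
comp = encode([2, encode([12]), 0])
assert pr_eval(comp, 1, (4,)) == 5
# lh 4 with g = the 0-ary constant 0 and h = p[1,2] defines the predecessor
pred = encode([0, 12, 0, 0])
assert pr_eval(pred, 1, (5,)) == 4
```

Note how the same index evaluated at different arities picks out different functions, exactly as the second indexing strategy intends.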
Observations:
- The most complicated clause is the composition case (lh(x) = 3). Given index x, we decode x into a sequence <i, j, m> and treat i = d(x, 0) as the index of the outer function and j = d(x, 1) as the code of a list of indices d(d(x, 1), 0), ..., d(d(x, 1), lh(d(x, 1)) - 1), each of which is interpreted as the index of a function embedded in the composition. The arity of the outer function is lh(d(x, 1)) and the arity of the inner functions is taken to be k, so that the resulting composition has the required arity.
- Each PR function of a given arity has a number (by induction on the depth of operator applications in PR definitions).
- Each number codes some function (by induction on N).
- Each map h_k such that h_k(x, y) = f[x, k](y) is intuitively effective.
- Component functions always have lower indices than the functions they are components of. Thus, induction on N corresponds to induction on function embedding depth.
- Notice how nicely having the arity given as an input, together with our convention that constants are 0-ary functions, allows us to steer past the restrictions on the domains of C and R without including them as special conditions. For example, the composition case works by treating the second item occurring in the sequence encoded by x as the code number of the sequence of indices of embedded functions. This nails down the arity of the outer function. The arities of the embedded functions are recovered from the given k.
Exercise 1: Find the indices
of a few PR functions. You can present them in terms of the encoding.
Exercise 2: Prove: all the functions occurring
in a derivation tree for f[x, k] have indices strictly
less than x.
Diagonalization (Rogers pp. 10-12).
Programming (= defining) can only show that a function
is PR. Failure to find a program is like failure to find a logical
proof. It leaves the following choice: either there is no proof or
we aren't smart enough to find it (David Hume: which would be the
greater miracle?). To prove that there is no program at all, we need
to have a mathematical specification of the possible programs and of the
functions the programs define and then we have to show that a given function
is different from each of those. Typically this is done by making
our function differ from each given function in one place, depending on
its position in an enumeration. If we think of each function's values
being listed as rows in an infinite table, then one technique is to show
that the given function differs from a given row on the diagonal.
So negative arguments are often called diagonal arguments even if the place
where a given function differs isn't exactly on the diagonal.
Here's an easy "logician's" example of an intuitively effective, total
function that is not PR:
diag(x) = f[x, 1](x) + 1.
Intuitively, decode x into a unary program,
simulate the program on x itself and apply successor. But
evidently diag(x) differs from the xth unary PR function
at position x, so diag is not PR.
This is literally a diagonal argument: if we think of a table T whose entry
T[x, y] = f[x, 1](y),
then
diag(x) = T[x, x] + 1,
so diag modifies each cell of the table along the diagonal.
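The picture can be animated with a toy table in Python (three arbitrary total unary functions standing in for the rows f[0, 1], f[1, 1], ...; this finite stand-in is illustrative, not the real PR enumeration):

```python
# rows of a (finite, toy) table of total unary functions
table = [lambda y: 0, lambda y: y + 1, lambda y: 2 * y]

def diag(x):
    # differs from row x at column x, so diag can equal no row of the table
    return table[x](x) + 1

for x in range(len(table)):
    assert diag(x) != table[x](x)
```

Because every row is total, every diagonal cell T[x, x] is defined and gives diag a "target" to miss.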
Thesis:
intuitive effectiveness outruns the PR functions.
Note that this conclusion is not (yet) a theorem because "effective" is (still) not mathematically definite. Could some ultra-fancy kind of effective recursion exhaust all of the intuitively effective functions? No, for the logic of the diagonal argument will still work. Just add a clause for the new, fancy kind of recursion into our enumeration g[x, k](y). Now define
diag(x) = g[x, 1](x) + 1.
This will still be an intuitively effective total function that transcends
the newly defined collection! So our thesis is even stronger:
Strengthened thesis: intuitive effectiveness outruns
any kind of effective primitive recursive operation that produces only
total functions.
The trouble is that recursion operators are guaranteed to produce total
functions. Total functions leave "targets" everywhere along the diagonal
for the diagonal argument to "hit". Defining total functions out
of partial functions allows for the possibility that some cells along the
diagonal are "missing". Then diag will also be undefined there (since
it adds 1 to an undefined value) and may be identical to the function characterizing
that row in the table. This is where the theory of computability
is headed.
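How partiality blocks the argument can be sketched with another toy table, using None to stand in for an undefined (divergent) value (a modeling assumption of this sketch, not the official treatment of partial functions):

```python
# row 1 is a partial function, undefined at input 1 (None marks divergence)
table = [lambda y: y, lambda y: None if y == 1 else 0]

def diag(x):
    v = table[x](x)
    return None if v is None else v + 1  # adding 1 to undefined stays undefined

# at row 0 the diagonal argument still works...
assert diag(0) != table[0](0)
# ...but at row 1 the diagonal cell is "missing": diag is undefined there too,
# so diag is no longer guaranteed to differ from row 1
assert diag(1) is None and table[1](1) is None
```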
But first, we are going to look at a hierarchy based on primitive recursion.
Next lecture.