2. PR Indices and Diagonalization
Kevin T. Kelly
Department of Philosophy
Carnegie Mellon University
Our general strategy for indexing functions (the
input-output behaviors of programs) will be to effectively decode numbers
into unique programs and say that a function has a given index if decoding
the index yields a program of the function. This will mean that,
typically, each function has many indices, since many different programs
can compute it. In primitive recursion this is readily seen to be
the case, since any number of projections can be composed on the outside
of a sensible program and the result is the same.
C(p[1,1], f) = C(p[1,1], C(p[1,1], f)) = C(p[1,1], C(p[1,1], C(p[1,1], f))) = ...
For this reason, we don't have to be too careful
about each program having a unique index corresponding to it: functions
will end up with infinitely many redundant numbers anyway!
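The redundancy can be seen concretely. Here is a small Python sketch (the names C, p11, and s stand for this document's composition operator, the projection p[1,1], and successor; the translation into Python is illustrative only):

```python
def p11(y):
    # the projection p[1,1]: returns its single argument unchanged
    return y

def s(y):
    # successor, a basic PR function
    return y + 1

def C(outer, *inner):
    # composition: C(g, h1, ..., hm)(y...) = g(h1(y...), ..., hm(y...))
    return lambda *ys: outer(*(h(*ys) for h in inner))

# Wrapping any number of p[1,1]'s around s never changes the behavior,
# so every one of these programs computes the successor function.
f1 = C(p11, s)
f2 = C(p11, C(p11, s))
for n in range(10):
    assert f1(n) == f2(n) == s(n) == n + 1
```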
The important properties are:
- There is an effective procedure to decode numbers into unique programs.
- Every PR program has a number.
- Every number decodes into some program.
We have to index functions of every arity.
There are two possible ways to proceed:
- One big enumeration of functions of all arities: We could number all the functions of all arities at once, so that the index determines the arity. But then we have to go back and construct sub-enumerations of the unary functions, binary functions, etc. in order to set up diagonal arguments. Also, the inductive clauses in the definition become inelegant because, for example, you can't apply primitive recursion to functions of the wrong arities.
- Separate enumerations for different arities: We can have the same index pick out different functions for different specified arities. This approach is natural when we get to Turing machines, since what a Turing machine does depends on how many arguments it is given.
We will follow the second strategy.
f[x, k](y1, ..., yk) = the value of the xth k-ary program on inputs (y1, ..., yk).
Then the unary PR functions will be enumerated as:
f[0, 1], f[1, 1], ....
Here's one way to do it. Recall that d(n, i) = (n)_i is the PR decoding function for the PR finite sequence encoding <x>.
An indexing of PR functions of given arities:
The idea of our indexing is to use the length of the sequence encoded by
the index to determine the last primitive recursive operator applied to
generate the function and to use the components of the coded sequence to
determine which functions the operation is applied to.
In reading the following definition, don't forget that positions in coded sequences are counted starting from 0 and that the length of a code number is the number of items occurring in the sequence (including the one indexed as 0); e.g.
d(<i, j, k>, 2) = k;
lh(<i, j, k>) = 3.
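For concreteness, here is a Python sketch of one standard choice of sequence coding (prime-power coding; the notes do not fix a particular coding, so this choice is an assumption) with the decoding function d and length function lh behaving as described:

```python
def nth_prime(i):
    # the i-th prime, counting from 0: nth_prime(0) == 2
    count, n = -1, 1
    while count < i:
        n += 1
        if all(n % m for m in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def encode(seq):
    # <x0, ..., x_{m-1}> = prod_i nth_prime(i) ** (x_i + 1);
    # the +1 keeps zero components from erasing their prime, so lh is recoverable
    code = 1
    for i, x in enumerate(seq):
        code *= nth_prime(i) ** (x + 1)
    return code

def d(n, i):
    # d(n, i) = (n)_i, the item in position i (counting from 0); -1 if absent
    p, e = nth_prime(i), 0
    while n % p == 0:
        n //= p
        e += 1
    return e - 1

def lh(n):
    # length: the number of items in the coded sequence
    i = 0
    while d(n, i) >= 0:
        i += 1
    return i

# matching the examples in the text, with <i, j, k> instantiated as <4, 7, 2>:
c = encode([4, 7, 2])
assert d(c, 2) == 2 and lh(c) == 3
```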
Now define the function denoted by f[x, k] by cases
as follows:
k = 0 ⇒ f[x, k] = x;
k > 0 ⇒
    lh(x) = 0 ⇒ f[x, k] = C(o, p[1,k]);
    lh(x) = 1 ⇒ f[x, k] = C(s, p[1,k]);
    lh(x) = 2 ⇒ f[x, k] = p[min(d(x, 0), k), k];
    lh(x) = 3 ⇒ f[x, k] = C(f[d(x, 0), lh(d(x, 1))], f[d(d(x, 1), 0), k], ..., f[d(d(x, 1), lh(d(x, 1)) -. 1), k]);
    lh(x) = 4 ⇒ f[x, k] = R(f[d(x, 0), k -. 1], f[d(x, 1), k + 1]);
    lh(x) > 4 ⇒ f[x, k] = C(o, p[1,k]).
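To check that the clauses hang together, here is a Python sketch of an interpreter pr_eval(x, k, args) that evaluates f[x, k] on a k-tuple by following the cases above. The prime-power sequence coding and the convention that projection subscripts run from 1 to k are assumptions of this sketch, not part of the official definition:

```python
def nth_prime(i):
    # the i-th prime, counting from 0
    count, n = -1, 1
    while count < i:
        n += 1
        if all(n % m for m in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def encode(seq):
    # <x0, ..., x_{m-1}> = prod_i nth_prime(i) ** (x_i + 1)
    code = 1
    for i, x in enumerate(seq):
        code *= nth_prime(i) ** (x + 1)
    return code

def d(n, i):
    # d(n, i) = (n)_i; returns -1 when position i is empty
    p, e = nth_prime(i), 0
    while n % p == 0:
        n //= p
        e += 1
    return e - 1

def lh(n):
    i = 0
    while d(n, i) >= 0:
        i += 1
    return i

def pr_eval(x, k, args):
    # Evaluate f[x, k] on the k-tuple args, by the case definition above.
    if k == 0:
        return x                     # f[x, 0] is the constant x
    L = lh(x)
    if L == 0 or L > 4:
        return 0                     # C(o, p[1,k]): constantly zero
    if L == 1:
        return args[0] + 1           # C(s, p[1,k]): successor of first argument
    if L == 2:
        m = max(1, min(d(x, 0), k))  # projection p[min(d(x,0),k), k];
        return args[m - 1]           # subscripts assumed to run from 1 to k
    if L == 3:                       # composition
        j = d(x, 1)
        inner = tuple(pr_eval(d(j, i), k, args) for i in range(lh(j)))
        return pr_eval(d(x, 0), lh(j), inner)
    # L == 4: primitive recursion R(g, h), recursing on the last argument
    *rest, y = args
    if y == 0:
        return pr_eval(d(x, 0), k - 1, tuple(rest))
    prev = pr_eval(x, k, (*rest, y - 1))
    return pr_eval(d(x, 1), k + 1, (*rest, y - 1, prev))

# encode([0]) = 2 has lh 1, so f[2, 1] is the successor function
assert pr_eval(2, 1, (5,)) == 6
# encode([1, 0]) = 12 has lh 2 with d(12, 0) = 1, so f[12, 1] = p[1,1]
assert pr_eval(12, 1, (7,)) == 7
# lh 3: composing successor (index 2) with p[1,1] (index 12) gives successor again
comp = encode([2, encode([12]), 0])
assert pr_eval(comp, 1, (4,)) == 5
# lh 4 with g = the 0-ary constant 0 and h = p[1,2] defines the predecessor
pred = encode([0, 12, 0, 0])
assert pr_eval(pred, 1, (5,)) == 4
```

Note how the same index evaluated at different arities picks out different functions, exactly as the second indexing strategy intends.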
Observations:
- The most complicated clause is the composition case (lh(x) = 3). Given index x, we decode x into a sequence <i, j, m> and treat i = d(x, 0) as the index of the outer function and j = d(x, 1) as the code of a list of indices d(d(x, 1), 0), ..., d(d(x, 1), lh(d(x, 1)) - 1), each of which is interpreted as the index of a function embedded in the composition. The arity of the outer function is lh(d(x, 1)) and the arity of the inner functions is taken to be k, so that the resulting composition has the required arity.
- Each PR function of a given arity has a number (by induction on the depth of operator applications in PR definitions).
- Each number codes some function (by induction on N).
- Each map h_k such that h_k(x, y) = f[x, k](y) is intuitively effective.
- Component functions always have lower indices than the functions they are components of. Thus, induction on N corresponds to induction on function embedding depth.
- Notice how nicely having the arity given as an input, together with our convention that constants are 0-ary functions, allows us to steer past the restrictions on the domains of C and R without including them as special conditions. For example, the composition case works by treating the second item occurring in the sequence encoded by x as the code number of the sequence of indices of embedded functions. This nails down the arity of the outer function. The arities of the embedded functions are recovered from the given k.
Exercise 1: Find the indices
of a few PR functions. You can present them in terms of the encoding.
Exercise 2: Prove: all the functions occurring
in a derivation tree for f[x, k] have indices strictly
less than x.
Diagonalization (Rogers pp. 10-12).
Programming (= defining) can only show that a function
is PR. Failure to find a program is like failure to find a logical
proof. It leaves the following choice: either there is no proof or
we aren't smart enough to find it (David Hume: which would be the
greater miracle?). To prove that there is no program at all, we need
to have a mathematical specification of the possible programs and of the
functions the programs define and then we have to show that a given function
is different from each of those. Typically this is done by making
our function differ from each given function in one place, depending on
its position in an enumeration. If we think of each function's values
being listed as rows in an infinite table, then one technique is to show
that the given function differs from a given row on the diagonal.
So negative arguments are often called diagonal arguments even if the place
where a given function differs isn't exactly on the diagonal.
Here's an easy "logician's" example of an intuitively effective, total
function that is not PR:
diag(x) = f[x, 1](x) + 1.
Intuitively, decode x into a unary program,
simulate the program on x itself and apply successor. But
evidently diag(x) differs from the xth unary PR function
at position x, so diag is not PR.
This is literally a diagonal argument: if we think of a table T whose entry
T[x, y] = f[x, 1](y),
then
diag(x) = T[x, x] + 1,
so diag modifies each cell of the table along the diagonal.
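The picture can be animated with a toy table in Python (three arbitrary total unary functions standing in for the rows f[0, 1], f[1, 1], ...; this finite stand-in is illustrative, not the real PR enumeration):

```python
# rows of a (finite, toy) table of total unary functions
table = [lambda y: 0, lambda y: y + 1, lambda y: 2 * y]

def diag(x):
    # differs from row x at column x, so diag can equal no row of the table
    return table[x](x) + 1

for x in range(len(table)):
    assert diag(x) != table[x](x)
```

Because every row is total, every diagonal cell T[x, x] is defined and gives diag a "target" to miss.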
Thesis:
intuitive effectiveness outruns the PR functions.
Note that this conclusion is not (yet) a theorem because "effective" is (still) not mathematically definite. Could some ultra-fancy kind of effective recursion exhaust all of the intuitively effective functions? No, for the logic of the diagonal argument will still work. Just add a clause for the new, fancy kind of recursion into our enumeration g[x, k](y). Now define
diag(x) = g[x, 1](x) + 1.
This will still be an intuitively effective total function that transcends
the newly defined collection! So our thesis is even stronger:
Strengthened thesis: intuitive effectiveness outruns
any kind of effective primitive recursive operation that produces only
total functions.
The trouble is that recursion operators are guaranteed to produce total
functions. Total functions leave "targets" everywhere along the diagonal
for the diagonal argument to "hit". Defining total functions out
of partial functions allows for the possibility that some cells along the
diagonal are "missing". Then diag will also be undefined there (since
it adds 1 to an undefined value) and may be identical to the function characterizing
that row in the table. This is where the theory of computability
is headed.
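How partiality blocks the argument can be sketched with another toy table, using None to stand in for an undefined (divergent) value (a modeling assumption of this sketch, not the official treatment of partial functions):

```python
# row 1 is a partial function, undefined at input 1 (None marks divergence)
table = [lambda y: y, lambda y: None if y == 1 else 0]

def diag(x):
    v = table[x](x)
    return None if v is None else v + 1  # adding 1 to undefined stays undefined

# at row 0 the diagonal argument still works...
assert diag(0) != table[0](0)
# ...but at row 1 the diagonal cell is "missing": diag is undefined there too,
# so diag is no longer guaranteed to differ from row 1
assert diag(1) is None and table[1](1) is None
```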
But first, we are going to look at a hierarchy based on primitive recursion.
Next lecture.