Home Help About Profile Search Home
Guest: login

Educational Technology Leadership*05-06 > Hot Topic!!! Turing Machines

Rotisserie Question

elindblom

Below are three articles on the Turing Machine. Once read, an understanding shall develop. Please feel free to make comments on the material!!!

Eric J. Lindblom, Ph.D.

http://plato.stanford.edu/entries/turing-machine/
Stanford University

"Turing Machines
Turing machines, first described by Alan Turing in (Turing 1937), are simple abstract computational devices intended to help investigate the extent and limitations of what can be computed.

Turing, writing before the invention of the modern digital computer, was interested in the question of what it means to be computable. Intuitively a task is computable if one can specify a sequence of instructions which when followed will result in the completion of the task. Such a set of instructions is called an effective procedure, or algorithm, for the task. This intuition must be made precise by defining the capabilities of the device that is to carry out the instructions. Devices with different capabilities may be able to complete different instruction sets, and therefore may result in different classes of computable tasks (see the entry on computability and complexity).

Turing proposed a class of devices that came to be known as Turing machines. These devices lead to a formal notion of computation that we will call Turing-computability.

The proposition that Turing's notion captures exactly the intuitive idea of effective procedure is called the Church-Turing thesis. This proposition, being a claim about the relationship between a formal concept and intuition, is not provable, though it would be refuted by an intuitively acceptable algorithm for a task that is not Turing-computable. That no such counterexample has been found, together with the fact that Turing-computability is equivalent to independently defined notions of computability based on alternative foundations, such as recursive functions and abacus machines, indicates that there is at least something natural about this notion of computability.

Turing machines are not physical objects but mathematical ones. We require neither soldering irons nor silicon chips to build one. The architecture is simply described, and the actions that may be carried out by the machine are simple and unambiguously specified. Turing recognized that it is not necessary to talk about how the machine carries out its actions, but merely to take as given the twin ideas that the machine can carry out the specified actions, and that those actions may be uniquely described.

1. A Definition of Turing Machine
1.1 The Definition Formalized
2. Describing Turing Machines
2.1 Examples
2.2 Instantaneous Descriptions of a Computation
3. Varieties of Turing Machines
4. What Can Be Computed
5. What Cannot Be Computed
5.1 The Busy Beaver
5.2 The Halting Problem
6. Alternative Formulations of Computability
6.1 Recursive Functions
6.2 Abacus Machines
7. Restricted Turing Machines
Bibliography
Other Internet Resources
Related Entries

--------------------------------------------------------------------------------

1. A Definition of Turing Machines
A Turing machine is a kind of state machine. At any time the machine is in any one of a finite number of states. Instructions for a Turing machine consist in specified conditions under which the machine will transition between one state and another.

A Turing machine has an infinite one-dimensional tape divided into cells. Traditionally we think of the tape as being horizontal with the cells arranged in a left-right orientation. The tape has one end, at the left say, and stretches infinitely far to the right. Each cell is able to contain one symbol, either ‘0’ or ‘1’.

The machine has a read-write head, which at any time scanning a single cell on the tape. This read-write head can move left and right along the tape to scan successive cells.

The action of a Turing machine is determined completely by (1) the current state of the machine (2) the symbol in the cell currently being scanned by the head and (3) a table of transition rules, which serve as the “program” for the machine.

Each transition rule is a 4-tuple:

< State0, Symbol, Statenext, Action >
which can be read as saying “if the machine is in state State0 and the current cell contains Symbol then move into state Statenext taking Action”. The actions available to a Turing machine are either to write a symbol on the tape in the current cell (which we will denote with the symbol in question), or to move the head one cell to the left or right, which we will denote by the symbols « and » respectively.

If the machine reaches a situation in which there is not exactly one transition rule specified, i.e., none or more than one, then the machine halts.

In modern terms, the tape serves as the memory of the machine, while the read-write head is the memory bus through which data is accessed (and updated) by the machine. There are two important things to notice about the definition. The first is that the machine's tape is infinite in length, corresponding to an assumption that the memory of the machine is infinite. The second is similar in nature, but not explicit in the definition of the machine, namely that a function will be Turing-computable if there exists a set of instructions that will result in the machine computing the function regardless of the amount of time it takes. One can think of this as assuming the availability of infinite time to complete the computation.

These two assumptions are intended to ensure that the definition of computation that results is not too narrow. This is, it ensures that no computable function will fail to be Turing-computable solely because there is insufficient time or memory to complete the computation. If a function is not Turing-computable it is because Turing machines lack the computational machinery to carry it out, not because of a lack of spatio-temporal resources.

1.1. The Definition Formalized
Talk of "tape" and a "read-write head" is intended to aid the intution (and reveals something of the time in which Turing was writing) but plays no important role in the definition of Turing machines. In situations where a formal analysis of Turing machines is required, it is appropriate to spell out the definition of the machinery and program in more mathematical terms. Purely formally a machine might be defined to consist of:
A finite set of states Q with a distinguished start state,
A finite set of symbols S
A computation state of the machine can be described
Qstate a member of Q.
A function sstate from the natural numbers into S.
A natural number hstate.
The transition function for the machine d is a function from computation states to computation states, such that if d(S) = T
sT agrees with sS everywhere except on hS (and perhaps there too.)
If sS(hS) &#8800; sT(hS) then hT = hS otherwise, |hT - hS| = 1
A formal definition such as this facilitates the analysis of the concept of Turing machine, at the cost of obscuring the underlying intutions about such machines. To see how the formal definition relates to the original one, think of the function s as representing the tape by assigning an index to each cell of the tape, and then encoding the symbol in that cell as the value of the function s given the index of the cell.

The read-write head is represented by the number h which is the index of the cell scanned by the head.

The transition function determines the new content of the tape by returning a new function s but this new function is constrained to differ from the original in at most the cell being scanned. If that cell is modified, then the head must not be moved in the transition, and if it is not modified, then the head is constrained to move at most one cell in either direction.

This definition is very similar to that given in the entry on computability and complexity, with the significant difference that the transition function may write a new symbol as well as move during any transition. This change does not alter the set of Turing-computable functions, and simplifies the formal definition by removing the second condition on the transition function in our definition. Both formal definitions permit the alphabet of symbols on the tape to be any finite set, while the original definition insisted on S={0,1} this change also does not impact the definition of the set of Turing-computable functions.

2. Describing Turing Machines
Every Turing machine has the same machinery. What makes one Turing machine perform one task and another a different task is the table of transition rules that make up the machine's program, and a specified initial state for the machine. We will assume throughout that a machine starts in the lowest numbered of its states.
We can describe a Turing machine, therefore, by specifying only the 4-tuples that make up its program. Here are the tuples describing a simple machine.

<s0, 1, s0, » >
< s0, 0, s1, 1 >
< s1, 1, s1, « >
< s1, 0, s2, » >

When we are interested in examining the behaviour of a Turing machine, it is common and perhaps more perspicuous to represent the machine using a state diagram. Here is the machine represented in this format.


Figure 1: A State Diagram
In this figure, states are represented by the circles, with the unique double circle being the initial state. A transition is represented as an arrow originating from one circle and landing at another (possibly the same) circle. The arrows are labeled by a pair consisting first of the symbol that must be being scanned for the arrow to be followed, and second the action that is to be taken as the transition is made. The action will either be the symbol to be written, or « or » indicating a move to the left or right.

In what follows we will describe Turing machines in the state machine format.

2.1 Examples
In order to speak about a Turing machine that does something useful, we will have to provide an interpretation of the symbols recorded on the tape. For example, if we want to design a machine which will perform some mathematical function, addition say, then we will need to describe how to interpret the ones and zeros appearing on the tape as numbers.

In the examples that follow we will represent the number n as a block of n+1 copies of the symbol ‘1’ on the tape. Thus we will represent the number 0 as a single ‘1’ and the number 3 as a block of four ‘1’s.

We will also have to make some assumptions about the configuration of the tape when the machine is started, and when it finishes, in order to interpret the computation. We will assume that if the function to be computed requires n arguments, then the Turing machine will start with its head scanning the leftmost ‘1’ of a sequence of n blocks of ‘1’s. the blocks of ‘1's representing the arguments must be separated by a single occurrence of the symbol ‘0’. For example, to compute the sum 3+4, a Turing machine will start in the following configuration, where the ellipses indicate that the tape has only zeros on the cells that we can't see, and the upward arrow indicates the cell that is currently scanned.


A machine must finish in standard configuration too. There must be a single block of ‘1’s on the tape, and the machine must be scanning the leftmost such ‘1’. If the machine correctly computes the function then this block must represent the correct answer. So an addition machine started in the configuration above must finish on a tape that looks like this:


Adopting this convention for the terminating configuration of a Turing machine means that we can compose machines by identifying the final state of one machine with the initial state of the next.

Under these conventions, the state diagram in Figure 1 describes a machine which computes the successor (add-one) function. That is when started in standard configuration on a tape representing the number n it will halt in standard configuration representing the number n+1. It does this by using state s0 to scan to the first ‘0’ to the right of the (single) block of ‘1’s. It then replaces that ‘0’ by a ‘1’, and scans left in state s1 until a ‘0’ is found (this is the first zero to the left of the block of ‘1’s.) It then moves back to scan the first ‘1’ and halts in state s2.

You can see a movie of the execution of this machine here.

For another example, consider the machine in Figure 2 which computes the addition function. That is, when started on a standard tape representing the numbers n and m, the machine halts on a tape representing n+m.


Figure 2: A Machine for Computing n+m
Notice that this machine is like the add one machine in that states s0 through s 2 cause the machine to write a ‘1’ to the right of the first block of ‘1’s, and returns the head to the leftmost ‘1’. In standard configuration for addition, this joins the two blocks of ‘1’s into a single block, containing (n+1)+1+(m+1) copies of the symbol ‘1’, so that on entering state s2 the tape represents the number n+m+2. In order to correct this, we need to remove two copies of the symbol ‘1’, which is achieved by states s 2 and s3, each of which replaces a ‘1’ by ‘0’ and then moves to the right.
Since the tape initially contains at least two ‘1’s and we write one more, the deletion of two ‘1’s will leave at least one on the tape at the end of the computation, and we will be scanning the leftmost of them.

2.2 Instantaneous Descriptions of a Computation
The complete state of a Turing machine at any point during a computation may be described by the name of the state that the machine is in, the symbols on the tape, and the cell that is currently being scanned. A description of these three data is called an instantaneous description of the computation. We will represent such a description using a figure like that in Figure 3, in which the arrow represents the currently scanned cell and the name of the current state is written below the arrow.


Figure 3: The Instantaneous Description of a Turing Machine Computation
3. Varieties of Turing Machines
We have presented here one of the most common formulations of Turing's basic idea. There are a number of variations to the formulation that turn out to be equivalent to this one, and different authors present Turing machines using any of these. Since they are all provably equivalent to one another we can consider any of the formulations as being the definition of Turing machine as we find convenient.
Formulation F1 and formulation F2 are equivalent if for every machine described in formulation F1 there is machine a described in F2 which has the same input-output behavior, and vice versa, i.e. when started on the same tape at the same cell, will terminate with the same tape on the same cell.

Two-way infinite tapes

In our original formulation we specified that the tape had an end, at the left say, and stretched infinitely far to the right. Relaxing this stipulation to allow the tape to stretch infinitely far to right and left results in a new formulation of Turing machines equivalent to the original. That is for any Turing machine using a two way tape there is a Turing machine with a one-way infinite tape with the same input-output behavior, and vice versa.

Two-dimensional tapes

Instead of a one-dimensional infinite tape, we could consider a two-dimensional “tape”, which stretches infinitely far up and down as well as left and right. We would add to the formulation that a machine transition can cause the read-write head to move up or down one cell in addition to being able to move left and right. Again this formulation is equivalent to the original.

Arbitrary movement of the head

Modifying the definition of a Turing machine so that the read-write head may move an arbitrary number of cells at any given transition does not alter the notion of Turing-computability.

Arbitrary numbers of read-write heads

Modifying the definition of a Turing machine so that the machine has several read-write heads does not alter the notion of Turing-computability.

Arbitrary finite alphabet

In our original formulation we allowed the use of only two symbols on the tape. In fact we do not increase the power of Turing machines by allowing the use of any finite alphabet of symbols.

5-tuple formulation

A common way to describe Turing machines is to allow the machine to both write and move its head in the same transition. This formulation requires the 4-tuples of the original formulation to be replaced by 5-tuples

<State0, Symbol, Statenew, Symbolnew, Move >
where Symbolnew is the symbol written, and Move is one of « and ».

Again, this additional freedom does not result in a new definition of Turing-computable. For every one of the new machines there is one of the old machines with the same properties.

Non-deterministic Turing machines

An apparently more radical reformulation of the notion of Turing machine allows the machine to explore alternatives computations in parallel. In the original formulation we said that if the machine specified multiple transitions for a given state/symbol pair, and the machine was in such a state then it would halt. In this reformulation, all transitions are taken, and all the resulting computations are continued in parallel. One way to visualize this is that the machine spawns an exact copy of itself and the tape for each alternative available transition, and each machine continues the computation. If any of the machines terminates successfully, then the entire computation terminates and inherits that machine's resulting tape. Notice the word successfully in the preceding sentence. In this formulation, some states are designated as accepting states and when the machine terminates in one of these states, then the computation is successful, otherwise the computation is unsuccessful and any other machines continue in their search for a successful outcome.

The addition of non-determinism to Turing machines does not alter the definition of Turing-computable.

Turing's original formulation of Turing Machines used the 5-tuple representation of machines. Post introduced the 4-tuple representation, and the use of a two-way infinite tape.

A more complex machine

In addition to performing numerical functions using unary representation for numbers, we can perform tasks such as copying blocks of symbols, erasing blocks of symbols and so on. Here is an example of a Turing machine which when started in standard configuration on a tape containing a single block of ‘1’s, halts on a tape containing two copies of that block of ‘1’s, with the blocks separated by a single ‘0’. It uses an alphabet consisting of the symbols ‘0’, ‘1’ and ‘A’.


Figure 4: A Machine for Copying a Block of 1s
The action of this machine is to repeatedly change one of the original ones into an A, and then write a new ‘1’ to the right of all remaining ‘1’ on the tape, after leaving a zero between the original block and the copy. When we run out of the original ‘1’s, we turn the As back into ‘1’s.
The initial state, s0, is used to change a ‘1’ into an ‘A’, and move to the right and into state s1. In state s1 we skip the remainder of the block of ‘1’s until we find a ‘0’ (the block separator) and in s2 we skip any ‘1’s to the right of that ‘0’ (this is the copy of the block of ‘1’s that we are making.). When we reach the end of that block, we find a ‘0’, which we turn into a ‘1’ and head back to the left, and into state s3. States s3 and s4 skip leftward over the ‘1’s and separating ‘0’ on the tape until an ‘A’ is found. When this occurs, we go back into state s0, and move rightward.

At this point, we are either scanning the next ‘1’ of the original block, or the original block has all been turned into ‘A's, and we are scanning the separator ‘0’. In the former case, we make another trip through states s1–s4, but in the latter, we move into state s5, moving leftward. In this state we will repeatedly find ‘A's, which we replace with ‘1’s, and move to the left. If we find a ‘0’, then all of the ‘A's have been turned back into ‘1’s. We will be scanning the ‘0’ to the left of the original cell, and so we move right, and into the final state s6.

This copying machine could be used in conjunction with the addition machine of Figure 2 to construct a doubling machine, i.e. a machine which, when started on a tape representing the number n halts on a tape representing 2n. We could do this by first using the copying machine to produce a tape with two copies of n on the tape, and then using the addition machine to compute n+n (=2n). We would do this by identifying the copying machine's halt state (s6) with the adding machine's initial state (s0).

The construction just suggested relies on the fact that the copying machine terminates in standard position, which is required for the adding machine to correctly compute its result. By designing Turing machines which start and end in standard configuration, we can ensure that they may be composed in this manner. In the example, the copying machine has a unique terminating state, but this is not necessary. We might build a Turing machine which indicates the result of its computation by terminating on one of many states, and we can the combine that machine with more that one machine, with the identity of the machine which follow dependent on the switching machine. This would enable us to create a machine which adds one to the input if that input is even, and doubles it if odd, for example (should we want to for some reason).

4. What Can Be Computed
Turing machines are very powerful. For a very large number of computational problems, it is possible to build a Turing machine that will be able to perform that computation. We have seen that it is possible to design Turing machines for arithmetic on the natural numbers, for example.
Computable Numbers

Turing's original paper concerned computable numbers. A number is Turing-computable if there exists a Turing machine which starting from a blank tape computes an arbitrarily precise approximation to that number. All of the algebraic numbers (roots of polynomials with algebraic coefficients) and many transcendental mathematical constants, such as e and p are Turing-computable.

Computable Functions

As we have seen, Turing machines can do more than write down numbers. Among other things they can compute numeric functions, such as the machine for addition (presented in Figure 2) multiplication, proper subtraction, exponentiation, factorial and so on.

By adopting a convention for representing TRUE and FALSE, perhaps that TRUE is represented as a sequence of two ‘1’s and FALSE as one ‘1’, we can compute the characteristic functions of computable predicates, and we may combine the results of such functions using the boolean functions: AND, NOT, OR, IF-THEN-ELSE.

In fact the Turing-computable functions are just the recursive functions, described below.

Universal Turing Machines

The most striking positive result concerning the capabilities of Turing machines is the existence of Universal Turing Machines (UTM). When started on a tape containing the encoding of another Turing machine, call it T, followed by the input to T, a UTM produces the same result as T would when started on that input. Essentially a UTM can simulate the behavior of any Turing machine (including itself.)

One way to think of a UTM is as a programmable computer. When a UTM is given a program (a description of another machine), it makes itself behave as if it were that machine while processing the input.

Note again, our identification of input-output equivalence with “behaving identically”. A machine T working on input t is likely to execute far fewer transitions that a UTM simulating T working on t, but for our purposes this fact is irrelevant.

In order to design such a machine, it is first necessary to define a way of representing a Turing machine on the tape for the UTM to process. To do this we will recall that Turing machines are formally represented as a collection of 4-tuples. We will first design an encoding for individual tuples, and then for sequences of tuples.

Encoding Turing Machines

Each 4-tuple in the machine specification will be encoded as a sequence of four blocks of ‘1’s, separated by a single ‘0’

The first block of ones will encode the current state number, using the unary number convention above (n+1 ones represents the number n).
The second block of ones will encode the current symbol, using one ‘1’ to represent the symbol zero, and two to represent the symbol ‘1’ (again because we can't use zero ones to represent ‘0’.)
The third element of the tuple will represent the new state number in unary number notation.
The fourth element represents the action, and there are four possibilities: symbols will be encoded as above, with a block of three ‘1’s representing a move to the left («) and a block of four ‘1’s representing a move to the right (»).
Using this convention the tuple <0, ‘1’, 0, »> would be represented as in Figure 4.

Figure 4: The Encoding of the Tuple <0, ‘1’, 0, »>

To encode a complete machine, we need to simply write down the tuples on the tape, in any order, but separated from one another by two blank cells. The add-one machine of Figure 1, would be represented by the somewhat intimidating string shown in Figure 5.

… 010110101111 00 101011011 00 110110110111 00 11010111011110 …

Figure 5: The Encoding of the Machine in Figure 1
If we interpret encodings such as that in Figure 5 as a binary number (ignoring trailing zeros), we see that each Turing machine has a serial number.

5. What Cannot Be Computed
A crucial observation about Turing machine is that there are only countably many machines (a set is countable if it is finite, or may be placed in one-to-one correspondence with the integers.) It follows immediately that there are uncountably many real numbers that are not computable, since the reals are not countable. There are simply not enough Turing machines to compute all of those numbers.

Like the real numbers, the number of functions on the natural numbers is not countable. It follows therefore that there are uncountably many such functions that are not computable by any Turing machine.

Here we give two examples of uncomputable functions.

5.1 The Busy Beaver
Define the productivity of a Turing machine as the number of ones on the tape when it halts in standard configuration, if it does, and 0 if it does not. The machine is assumed to start in its lowest number state and with an initially blank tape. Now define p(n) to be the productivity of the most productive n-state machine. There is no Turing machine which will compute the function p, i.e., which when started in standard configuration on a tape with n ‘1’s will halt in standard configuration on a tape with p(n) ‘1’s. This example is due to Tibor Radó (Radó 1962).

We can prove the uncomputability of the busy beaver function by deriving a contradiction from the assumption that such a machine exists. The crucial step in the proof, is to note that if the Busy Beaver machine were to exist, the productivity of the most productive n+2k state machine is at least p(p(n)), where k is the number of states of the (alleged) Busy Beaver machine. This is demonstrated by building a machine which initially writes n ‘1’s on the tape (which can be done in n states — exercise for the reader), and then connecting the halt state of that machine to the start state of the Busy Beaver machine (resulting in a computation of p(n)), and then connecting the halt state of that machine with another copy of the Busy Beaver machine, resulting a computation of p(p(n)). Thus (if the Busy Beaver machine exists)

p(n+2k) = p(p(n)) for any n.
It is easy to show that the productivity of Turing machines increases as states are added, i.e.
if i < j, then p(i) < p(j)
(another exercise). Consequently (if the Busy Beaver machine exists)
n+2k = p(n) for any n.
Since this is true for any n, it is true for n+11, yielding:
n+11+2k = p(n+11) for any n.
But it is easy to show that p(n+11) = 2n (another exercise, but show that there is an eleven state machine for doubling the number of ‘1’ on the tape, and compose such a machine with the n-state machine for writing n ‘1’s). Combining this fact with the previous inequality we have:
n+11+2k = p(n+11) = 2n for any n.
from which by subtracting n from both sides we have 11+2k = n for any n if the Busy Beaver exists, which is a contradiction.
Even though the productivity function is uncomputable, there is considerable interest in the search for Busy Beaver Turing machines (most productive machines with a given number of states). Some candidates can be found by following links in the Other Internet resources section of this article.

5.2 The Halting Problem
It would be very useful to be able to examine the description of a Turing machine and determine whether it halts on a given input. this problem is called the Halting problem and is, regrettably, uncomputable. That is, no Turing machine exists which computes the function h(t,n) which is defined to be TRUE if machine t halts on input n and FALSE otherwise.

To see the uncomputability of the halting function, imagine that such a machine H exists, and consider a new machine built by composing the copying machine of Figure 4 with H by joining the halt state of the copier to the start state of H. Such a machine, when started on a tape with n ‘1’s determines whether the machine whose code is n halts when given input n, i.e. it computes M(n) = h(n,n).

Now lets add another little machine to the halt state of H. This machine goes into an infinite sequence of transitions if the tape contains TRUE when it starts, and halts if the tape contains FALSE (its an exercise for the reader to construct this machine, assume that TRUE is represented by ‘11’, and FALSE by ‘1’.)

This composed machine, call it M, halts if the machine with the input code n does not halt on an initial tape containing n (because if machine n does not halt on n, the halting machine will leave TRUE on the tape, and M will then go into its infinite sequence.) and vice versa.

To see that this is impossible, consider the code for M itself. What happens when M is started on a tape containing Ms code? Assume that M halts on M, then by the definition of the machine M it does not halt. But equally, if it does not halt on M the definition of M says that it should halt.

This is a contradiction, and the Halting machine cannot exist. The fact that the halting problem is not Turing-computable was first proved by Turing in (Turing 1937).

6. Alternative Formulations of Computability
6.1 Recursive Functions
Recursive function theory is the study of the functions that can be defined using recursive techniques (see the entry on recursive functions). Briefly, the primitive recursive functions are those that can be formed from the basic functions:

the zero function: z(x) = 0, for all x
the successor function: s(x) = x+1, for all x
the ith projection over j arguments: pi,j(x0,…xj) = xi, for all xi,i,j

by using the operations of composition and primitive recursion.

Composition: f(x1,…,xn) = g(h1(x1,…,xn),…, hm(x1,…,xn)) for all g,h1,…,hm
Primitive Recursion: f(x,0) = g(x) for any g
f(x,s(y)) = h(x,y,f(x,y)) for any h

The recursive functions are formed by the addition of the minimization operator, which takes a function f and returns h defined as follows:
Minimization: h(x1, …,xn) = y, if f(x1, …,xn,y)=0 and &#8704; t < y. f(x1, …,xn,t) is defined and positive.  
= undefined otherwise.

It is known that the Turing computable functions are exactly the recursive functions.

6.2 Abacus Machines
Abacus machines abstract the more familiar architecture of the modern digital computer (the von Neumann architecture.) In its simplest form a computer with such an architecture has a number of addressable registers each of which can hold a single datum, and a processor which can read and write to these registers.

The machine can perform two basic operations, namely: add one to the content of a named register (which we will symbolize as n+, where n is the name of the register) and (attempt to) subtract one from a named register, with two possible outcomes: a success branch if the register was initially non-zero, and a failure branch if the register was initially zero (we will symbolize the operation as n-).

These are called abacus computers by Lambek (Lambek 1961), and are known to be equivalent to Turing machines.

The modern digital computer is subject to finiteness constraints that we have abstracted away in the definition of abacus machines, just as we did in the case of Turing machines. Physical computers are limited in the number of memory locations that they have, and in the storage capacity of each of those locations, while abacus machines are not subject to those constraints. Thus some abacus-computable functions will not be computable by any physical machine. (We won't consider whether Turing machines and modern digital computers remain equivalent when both are given external inputs, since that would require us to change the definition of a Turing machine.)

7. Restricted Turing Machines
One way to modify the definition of Turing machines is by removing their ability to write to the tape. The resulting machines are called finite state machines. They are provably less powerful than Turing machines, since they cannot use the tape to remember the state of the computation. For example, finite state machines cannot determine whether an input string consists of some As followed by the same number of Bs. The reason is that the machine cannot remember how many As it has seen so far, except by being in a state that represents this fact, and determining whether the number of As and Bs match in all cases would require the machine to have infinitely many states (one to remember that it has seen one A, one to remember that it has seen 2, and so on.)
Bibliography
Barwise, J. and Etchemendy, J., 1993, Turing's World, Stanford: CSLI Publications.
Boolos, G.S. and Jeffrey, R.C., 1974, Computability and Logic, Cambridge: Cambridge University Press.
Davis, M., 1958, Computability and Unsolvability, McGraw-Hill; reprinted Dover 1982.
Herken, R., (ed.), 1988, The Universal Turing Machine: A Half-Century Survey, New York: Oxford University Press.
Hodges, A., 1983, Alan Turing, The Enigma, New York: Simon and Schuster.
Kleene, S.C., 1936, "General recursive Functions of Natural Numbers". Mathematische Annalen, 112.
Lambek, J., 1961, "How to program an infinite abacus", Canadian Mathematical Bulletin, 4: 279–293.
Lewis, H.R. and Papadimitriou, C.H., 1981, Elements of the Theory of Computation, Prentice-Hall.
Lin, S. and Radó, T., 1965, "Computer Studies of Turing Machine Problems," Journal of the Association for Computing Machinery, 12, 196–212.
Post, E., 1947, "Recursive Unsolvability of a Problem of Thue," The Journal of Symbolic Logic, 12.
Radó, T., 1962, "On Non-computable functions", Bell System Technical Journal, May.
Turing, A.M., 1936-7, "On Computable Numbers, With an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, (2) 42, pp 230-265; correction ibid. 43, pp 544-546 (1937). [Available Online].
Turing, A.M., 1937, "Computability and &#955;-Definability", The Journal of Symbolic Logic, 2.
Other Internet Resources
Turing Machines, the original version of the present entry, written by the Editors at the Stanford Encyclopedia of Philosophy in September 1995 and last modified in the Summer 2003 Edition.
The Alan Turing Home Page
The Turing's World Homepage
Busy Beaver
Turing's World Busy Beaver page
Michael Somos' page of Busy Beaver references.
The Halting Problem
Halting problem solvable (funny)
Online Turing Machine Simulators
There are many Turing machine simulators available. Here are three that use different technologies to implement simulators using your browser.
Andrew Hodges' Turing Machine Simulator (for limited number of machines)
Suzanne Britton's Turing Machine Simulator (A Java Applet)
CGI Based Turing Machine Simulator
Here are some applications that you can run on the desktop (no endorsement of these programs is implied).
Turing's World: a courseware package available from CSLI Publications (Macintosh only)
Visual Turing: freeware simulator for Widnows 95/98/NT/2000
During the second world war, Alan Turing was involved in code breaking activites at Station X in Bletchley Park in the UK.
Related Entries
Church, Alonzo | Church-Turing Thesis | computability and complexity | function: recursive | Turing, Alan

Copyright © 2004
David Barker-Plummer
dbp@csli.stanford.edu

--------------------------------------------------------------------------------

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

Stanford Encyclopedia of Philosophy


"Turing machine
From Wikipedia, the free encyclopedia.
The Turing machine is an abstract machine introduced in 1936 by Alan Turing to give a mathematically precise definition of algorithm or 'mechanical procedure'. As such it is still widely used in theoretical computer science, especially in complexity theory and the theory of computation. The thesis that states that Turing machines indeed capture the informal notion of effective or mechanical method in logic and mathematics is known as the Church-Turing thesis.

The concept of the Turing machine is based on the idea of a person executing a well-defined procedure by changing the contents of an infinite amount of ordered paper sheets that can contain one of a finite set of symbols. The person needs to remember one of a finite set of states and the procedure is formulated in very basic steps in the form of "If your state is 42 and the symbol you see is a '0' then replace this with a '1', remember the state 17, and go to the following sheet."

Turing machines shouldn't be confused with the Turing test, Turing's attempt to capture the notion of artificial intelligence.

A Turing machine that is able to simulate any other Turing machine is called a universal Turing machine or simply a universal machine as Turing described it in 1947:

It can be shown that a single special machine of that type can be made to do the work of all. It could in fact be made to work as a model of any other machine. The special machine may be called the universal machine.
Contents [showhide]  
1 Definition

1.1 Informal description
1.2 Formal definition


1.2.1 One-tape Turing machine
1.2.2 k-tape Turing machine


2 Example

3 Deterministic and non-deterministic Turing machines

4 Universal Turing machines

5 Comparison with real machines

6 A physical Turing machine

7 See also

8 References

9 External links

10 Simulators

[edit]
Definition
[edit]
Informal description
A Turing machine is a pushdown automaton made more powerful by relaxing the last-in-first-out requirement of its stack. (Interestingly, a seemingly minor relaxation enables the Turing machine to perform such a wide variety of computations that it can serve as a model for the computational capabilities of all modern computer software.)

More precisely, a Turing machine consists of:

A tape which is divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet. The alphabet contains a special blank symbol (here written as '0') and one or more other symbols. The tape is assumed to be arbitrarily extendible to the left and to the right, i.e., the Turing machine is always supplied with as much tape as it needs for its computation. Cells that have not been written to before are assumed to be filled with the blank symbol.
A head that can read and write symbols on the tape and move left and right.
A state register that stores the state of the Turing machine. The number of different states is always finite and there is one special start state with which the state register is initialized.
An action table (or transition function) that tells the machine what symbol to write, how to move the head ('L' for one step left, and 'R' for one step right) and what its new state will be, given the symbol it has just read on the tape and the state it is currently in. If there is no entry in the table for the current combination of symbol and state then the machine will halt.
Note that every part of the machine is finite, but it is the potentially unlimited amount of tape that gives it an unbounded amount of storage space.

[edit]
Formal definition
[edit]
One-tape Turing machine
More formally, a (one-tape) Turing machine is usually defined as a 6-tuple M = (Q,G,s,b,F,d), where

Q is a finite set of states
G is a finite set of the tape alphabet
is the initial state
is the blank symbol (the only symbol allowed to occur on the tape infinitely often at any step during the computation)
is the set of final or accepting states
is a partial function called the transition function, where L is left shift, R is right shift.
Definitions in the literature sometimes differ slightly, to make arguments or proofs easier or clearer, but this is always done in such a way that the resulting machine has the same computational power. For example, changing the set {L,R} to {L,R,S}, where S would allow the machine to stay on the same tape cell instead of moving left or right, does not increase the machine's computational power.

[edit]
k-tape Turing machine
A k-tape Turing machine can similarly be described as a 6-tuple M = (Q,G,s,b,F,d), where

Q is a finite set of states
G is a finite set of the tape alphabet
is the initial state
is the blank symbol
is the set of final or accepting states
is a partial function called the transition function, where L is left shift, R is right shift, S is no shift.
[edit]
Example
The following Turing machine has an alphabet {'0', '1'}, with 0 being the blank symbol. It expects a series of 1's on the tape, with the head initially on the leftmost 1, and doubles the 1's with a 0 in between, i.e., "111" becomes "1110111". The set of states is {s1, s2, s3, s4, s5} and the start state is s1. The action table is as follows.

Old Read   Wr.      New     Old Read   Wr.      New
St. Sym.   Sym. Mv. St.     St. Sym.   Sym. Mv. St.
- - - - - - - - - - - -     - - - - - - - - - - - -  
s1  1  ->  0    R   s2      s4  1  ->  1    L   s4  
s2  1  ->  1    R   s2      s4  0  ->  0    L   s5  
s2  0  ->  0    R   s3      s5  1  ->  1    L   s5  
s3  0  ->  1    L   s4      s5  0  ->  1    R   s1  
s3  1  ->  1    R   s3

A computation of this Turing machine might for example be: (the position of the head is indicated by displaying the cell in bold face)

Step State Tape   Step State Tape
- - - - - - - -   - - - - - - - - -    
1  s1   11        9  s2   1001    
2  s2   01       10  s3   1001    
3  s2   010      11  s3   10010    
4  s3   0100     12  s4   10011    
5  s4   0101     13  s4   10011    
6  s5   0101     14  s5   10011    
7  s5   0101     15  s1   11011    
8  s1   1101       -- halt --

The behavior of this machine can be described as a loop: it starts out in s1, replaces the first 1 with a 0, then uses s2 to move to the right, skipping over 1's and the first 0 encountered. S3 then skips over the next sequence of 1's (initially there are none) and replaces the first 0 it finds with a 1. S4 moves back to the left, skipping over 1's until it finds a 0 and switches to s5. s5 then moves to the left, skipping over 1's until it finds the 0 that was originally written by s1. It replaces that 0 with a 1, moves one position to the right and enters s1 again for another round of the loop. This continues until s1 finds a 0 (this is the 0 right in the middle between the two strings of 1's) at which time the machine halts.

[edit]
Deterministic and non-deterministic Turing machines
If the action table has at most one entry for each combination of symbol and state then the machine is a deterministic Turing machine (DTM). If the action table contains multiple entries for a combination of symbol and state then the machine is a non-deterministic Turing machine (NDTM or NTM).

[edit]
Universal Turing machines
Every Turing machine computes a certain fixed partial computable function from the input strings over its alphabet. In that sense it behaves like a computer with a fixed program. However, as Alan Turing already described, we can encode the action table of any Turing machine in a string. Thus we might try to construct a Turing machine that expects on its tape a string describing an action table followed by a string describing the input tape, and then computes the tape that the encoded Turing machine would have computed. As Turing showed, such a Turing machine is indeed possible and since it is able to simulate any other Turing machine it is called a universal Turing machine.

With this encoding of action tables as strings, it becomes possible in principle for Turing machines to answer questions about the behavior of other Turing machines. Most of these questions, however, are undecidable, meaning that the function in question cannot be calculated by any Turing machine. For instance, the problem of determining whether a particular Turing machine will halt on a particular input, or on all inputs, known as the Halting problem, was shown to be undecidable in Turing's original paper. Rice's theorem shows that any nontrivial question about the behavior or output of a Turing machine is undecidable.

If we broaden the definition to include any Turing machine that simulates some Turing-complete computational model, not just Turing machines that directly simulate other Turing machines, a universal Turing machine can be fairly simple, using just a few states and a few symbols. For example, only 2 states are needed, since a 2×18 (meaning 2 states, 18 symbols) universal Turing machine is known. A complete list of the smallest known universal Turing machines is: 2×18, 3×10, 4×6, 5×5, 7×4, 10×3, 22×2. These simulate a computational model called tag systems.

A universal Turing machine is Turing-complete. It can calculate any recursive function, decide any recursive language, and accept any recursively enumerable language. According to the Church-Turing thesis, the problems solvable by a universal Turing machine are exactly those problems solvable by an algorithm or an effective method of computation, for any reasonable definition of those terms.

[edit]
Comparison with real machines
It's often said that Turing machines, unlike simpler automata, are as powerful as real machines, and are able to execute any operation that a real program can. What is missed in this statement is that almost any particular program running on a particular machine and given a finite amount of input is in fact nothing but a deterministic finite automaton, since the machine it runs on can only be in finitely many states. Turing machines would actually only be equivalent to a real machine that is magically given an infinite amount of storage space. We might ask, then, why Turing machines are useful models of real computers. There are a number of ways to answer this:

Even the less powerful models of real machines are much more complex than a Turing machine. For example, a Turing machine describing an algorithm may have a few hundred states, while the equivalent DFA on a given real machine has quadrillions.
Turing machines can describe algorithms over all machines at once, regardless of how much memory they have; there is a maximum to the amount of memory any machine has now, but this limit can rise arbitrarily in time. Statements about algorithms should be timeless.
Turing machines can describe algorithms without reference to any sort of machine-specific characteristics, such as the memory limit.
Turing machines simplify the statement of algorithms. Algorithms running on Turing-equivalent abstract machines are usually more general than their counterparts running on real machines, because they have arbitrary-precision datatypes available and never have to deal with unexpected conditions such as running out of memory.
On the other hand, there's one way in which Turing machines are often considered a poor model for programs: real programs, such as operating systems and word processors, often receive an unbounded amount of input over time, and never "finish" their task. Turing machines do not model such ongoing computation well, but can still model portions of it, such as individual procedures.

[edit]
A physical Turing machine
It is not difficult to simulate a Turing machine on a modern computer (except for the limited memory of actually existing computers).

It is also possible to build a Turing machine on a purely mechanical basis. The mathematician Karl Scherer has indeed built such a machine in 1986 using metal and plastic construction sets, and some wood. The 1.5 meter high machine uses the pulling of strings to read, move and write the data (which is represented using ball bearings).

The machine is now exhibited in the entrance of the Department of Computer Science of the University of Heidelberg, Germany.

Also, using a few hundred mirrors, one can build an optical universal Turing machine in one's backyard, using the Horseshoe map. This is based on a work by Stephen Smale.

The concept of a Turing machine was used as an educational tool in the science fiction novel, The Diamond Age (1995), by Neal Stephenson. The main character, Nell, possesses an interactive book which teaches her creative thinking and logic by presenting puzzles in a story as Turing machines which become more and more complex as the story progresses. They begin as a simple chain-fed clockwork device and progresses to abstract economic processes, trades, and finally the interaction of entire fictional kingdoms.

[edit]
See also
Langton's ant, a simple two-dimensional Turing machine.
Paterson's worms, a family of two-dimensional Turing machines
Church-Turing thesis, effective Turing machine can perform any computation in any language.
Busy Beaver
Computability logic
Turing tarpit, a term describing a minimalistic turing-complete programming language
[edit]
References
Rolf Herken: The Universal Turing Machine - A Half-Century Survey, Springer Verlag, ISBN 3-211-82637-8
Paul Strathern: Turing and the Computer - The big idea, Anchor Books/Doubleday, ISBN 0-385-49243-X
Turing, A., On Computable Numbers, With an Application to the Entscheidungsproblem (http://www.abelard.org/turpap2/tp2-ie.asp), Proceedings of the London Mathematical Society, Series 2, Volume 42, 1936; reprinted in M. David (ed.), The Undecidable, Hewlett, NY: Raven Press, 1965;
Boolos, G. and Jeffrey, R., Computability and Logic, 2nd ed., Cambridge: Cambridge University Press, 1980.
Rogozhin, Yurii, "A Universal Turing Machine with 22 States and 2 Symbols" (http://www.imt.ro/Romjist/Volum1/Vol1_3/turing.htm), Romanian Journal Of Information Science and Technology, 1(3), 259-265, 1998. (surveys known results about small universal Turing machines)
[edit]
External links
Turing Machine on Stanford Encyclopedia of Philosophy (http://plato.stanford.edu/entries/turing-machine/)
Detailed info on the Church-Turing Hypothesis (http://plato.stanford.edu/entries/church-turing/) (Stanford Encyclopedia of Philosophy)
Computing machinery and intelligence - A.M. Turing (http://www.abelard.org/turpap/turpap.htm) Original 1950 article by Alan Turing on machine intelligence, where he introduces the famous Turing test
The Turing test and intelligencedocument (http://www.abelard.org/turing/tur-hi.htm) clarifing the meaning of the Turing test and suggests that meeting the Turing test is already in the process of being achieved
Citations from CiteSeer (http://citeseer.org/cs?q=Turing+machine)
[edit]
Simulators
Visual Turing, a Turing machine interactive simulator/IDE (http://www.cheransoft.com/vturing/) (free software for Windows).
Suzanne Brittons Turing Machine Simulator (http://ironphoenix.org/tril/tm/) (java applet).
C++ Simulator of a Nondeterministic and Deterministic Multitape Turing Machine (http://semillon.wpi.edu/~aofa/AofA/msg00020.html) (free software).
C++ Simulator of a Universal Turing Machine (which accepts Multitape Turing Machine) (http://semillon.wpi.edu/~aofa/AofA/msg00024.html) (free software).
Turing Machine as an educational freeware board game for PCs (http://www.zillions-of-games.com/cgi-bin/zilligames/submissions.cgi/71230?do=show;id=9)
Turing Machine II - an educational freeware board game for PCs (http://www.zillions-of-games.com/cgi-bin/zilligames/submissions.cgi/71230?do=show;id=10)
Turing Train Teminal (http://www.monochrom.at/turingtrainterminal/) - A working Turing machine built out of scale trains.

Retrieved from "http://en.wikipedia.org/wiki/Turing_machine"
Categories: Alan Turing | Computational models

ViewsArticleDiscussionEdit this pageHistory Personal toolsCreate account / log in NavigationMain Page
Community portal
Current events
Recent changes
Random page
Help
Donations
Search   ToolboxWhat links here
Related changes
Special pages
In other languages&#1041;&#1098;&#1083;&#1075;&#1072;&#1088;&#1089;&#1082;&#1080;
Cesky
Deutsch
Espańol
Français
&#54620;&#44397;&#50612;
Italiano
&#1506;&#1489;&#1512;&#1497;&#1514;
Magyar
Nederlands
&#26085;&#26412;&#35486;
Lietuviu
Polski
&#1056;&#1091;&#1089;&#1089;&#1082;&#1080;&#1081;
Slovenšcina
Suomi
Svenska
Türkçe
&#20013;&#25991;

This page was last modified 23:46, 12 Feb 2005. All text is available under the terms of the GNU Free Documentation License (see Copyrights for details).
About Wikipedia Disclaimers"

http://en.wikipedia.org/wiki/Turing_machine

"Historical Introduction --- A Century of Controversy Over the Foundations of Mathematics
G.J. Chaitin's 2 March 2000 Carnegie Mellon University School of Computer Science Distinguished Lecture. The speaker was introduced by Manuel Blum. The lecture was videotaped; this is an edited transcript which appeared on pp. 12-21 of a special issue of Complexity magazine on ``Limits in Mathematics and Physics'' (Vol. 5, No. 5, May/June 2000).
Thanks very much Manuel! It's a great pleasure to be here!

We're in a state of euphoria now in the computer business because things are going so well: the web, e-commerce. It's all paying for our salaries, and it's a nice moment to be around, when things are going so well. But I'd like to make the outrageous claim, that has a little bit of truth, that actually all of this that's happening now with the computer taking over the world, the digitalization of our society, of information in human society, you could say in a way is the result of a philosophical question that was raised by David Hilbert at the beginning of the century.

It's not a complete lie to say that Turing invented the computer in order to shed light on a philosophical question about the foundations of mathematics that was asked by Hilbert. And in a funny way that led to the creation of the computer business.

It's not completely true, but there is some truth in it. You know, most historical statements are a lie, so this one isn't that much worse than most others!

So I'd like to explain the philosophical history of the computer. In a way what happened, and I'll tell you more, is that Hilbert said we should formalize all of mathematics, mathematical reasoning. And this failed: it took Gödel and Turing to show that it couldn't be done. It failed in that precise technical sense. But in fact it succeeded magnificently, not formalization of reasoning, but formalization of algorithms has been the great technological success of our time --- computer programming languages!

So if you look back at the history of the beginning of this century you'll see papers by logicians studying the foundations of mathematics in which they had programming languages. Now you look back and you say this is clearly a programming language! If you look at Turing's paper of course there's a machine language. If you look at papers by Alonzo Church you see the lambda calculus, which is a functional programming language. If you look at Gödel's original paper you see what to me looks like LISP, it's very close to LISP, the paper begs to be rewritten in LISP!

So I'd like to give you this hidden philosophical history of computer technology which is how philosophically minded mathematicians set out to solve once and for all the foundational problems of mathematics and did not succeed but helped to create computer technology as a by product. This was the failure of this project! We're all benefiting from the glorious failure of this project!

However this project has not died completely. --- I'm going to start more systematically from the beginning; but I'm trying to give an introduction. --- It's popular to think, well Gödel did this wonderful thing in 1931 and Turing added a lot of profound stuff in 1936, but the world has moved on from that point. And what I'd like to do is to tell you that in fact I've done some more work in this area.

You may think it's misguided! Most of the world has shrugged and gone on. We had this disappointment. What Gödel and Turing showed is that axiomatic formal reasoning has certain limitations. You can't formalize it all. And at first people were tremendously shocked and then they shrugged and said, so what? Mathematicians went on, ignoring this. And my misfortune or fortune was that I didn't want to shrug. I said, I want to understand this better. And I'm going to tell you the story of my attempt to understand Gödel incompleteness. --- It's a psychological problem that a good psychiatrist could have cured me of, and then I wouldn't have done any of this work!

So let me start at the beginning and tell you this story of a hundred years of intense worry, crisis, self-doubt, self-examination and angst about the philosophy of mathematics.

There've been lots of crises in the history of mathematics. Mathematics is not placid, static and eternal.

One of the first crises was the Pythagorean result that the square root of two is irrational. And the fact that this was a crisis survives in the word ``irrational''. Remember the Greeks thought that rationality was the supreme goal --- Plato! Reason! If a number is called irrational that means that this was the Gödel incompleteness theorem of ancient Greece. So there was a crisis there.

Another crisis was caused by the calculus. A lot of people said this is nonsense, we're talking about infinitesimals, what is this? Bishop Berkeley was a theologian and he said, pure mathematicians make as little sense as theologians, you can't reject us saying we're unreasonable. The way you deal with evanescent quantities in the calculus --- this was before the calculus had a rigorous foundation --- is as bad as our theological discussions! So at that time it was pretty bad!

Then there was a crisis about the parallel postulate, about non-Euclidean geometries.

So mathematics is not static and eternal!

But the particular crisis that I want to tell you about goes back a little more than a hundred years to work of Cantor on set theory.


Cantor: Theory of Infinite Sets
So my talk is very impractical. We all know that you can have a start-up and in one year make a million dollars if you're lucky with the web. So this is about how not to make any money with the web. This is about how to ruin your career by thinking about philosophy instead.

So Cantor was obsessed with the notion of infinite, and it's not mentioned that he was obsessed with infinite because he was interested in theology and God, which is edited out from the accounts now, but that was the original idea.

And Cantor had the idea that if you have 1, 2, 3, ... why stop there?


1, 2, 3, ... &#969;
--- I'm giving you a cartoon version of Cantor's theory of infinite sets. --- You put an omega, &#969;, this is a Greek letter, the lower case of the last letter in the Greek alphabet, that's the reason to pick it. So you just say, I'm going to put another number here instead of stopping with 1, 2, 3, ... This is going to be the first number after all the finite numbers. This is the first transfinite number.

You can keep going for a while.


1, 2, 3, ... &#969;, &#969; + 1, &#969; + 2, ...
And then you have another thing like a copy of 1, 2, 3, ... : &#969; + 1, &#969; + 2, &#969; + 3, ... These are names. And then you say, why stop here? I'm going to put something after all this, so 2&#969;, 2&#969; + 1, + 2, + 3, then later 3&#969;, 4&#969; ... Well, what comes after all of those? Why stop there? So, &#969; squared, obviously.


1, 2, 3, ... &#969;, &#969; + 1, &#969; + 2, ... 2&#969;   3&#969;   4&#969;   &#969;2
Then you keep going. 5&#969;2 + 8&#969; + 96! And then much later you get to &#969; cubed! And then eventually &#969; to the fourth. You keep going and why stop there? This sequence goes on forever, but let's put something after all of those. So what would that be? That would be obviously &#969; to the &#969;. This is starting to get interesting! Then you keep going and you have &#969; to the &#969; to the &#969;. This is a pretty far-out number already!


1, 2, 3, ... &#969;, &#969; + 1, &#969; + 2, ... 2&#969;   3&#969;   4&#969;   &#969;2   &#969;3   &#969;4   &#969;&#969;   &#969;&#969;&#969;
You can see why this is becoming theological. This is the mathematical equivalent of drug addiction. Instead of getting high on alcohol or grass you get high on ideas like this. After a while you don't know where you're standing or what's going on!

Then the next number is &#969; to the &#969; to the &#969; forever.


&#969; &#969; &#969; &#969; &#969; &#969; ...
This number is the smallest solution of the equation


x = &#969;x
And it's called e0, epsilon nought, I don't know why. Because you start having problems with how to name things, because up to here I was using normal algebraic notation just throwing in &#969;.

So anyway you can see this is fantastic stuff! I don't know whether it's mathematics, but it's very imaginative, it's very pretty, and actually there was a lot of practical spin-off for pure mathematicians from what Cantor was doing.

Some people regarded set theory as a disease. Poincaré, the great French mathematician, said set theory is a disease, he said, from which I hope future generations will recover. But other people redid all of mathematics using the set-theoretic approach. So modern topology and a lot of abstract mathematics of the twentieth century is a result of this more abstract set-theoretic approach, which generalized questions. The mathematics of the nineteenth century was at a lower level in some ways, more involved with special cases and formulas. The mathematics of the twentieth century --- it's hard to write a history of mathematics from the year ten-thousand looking back because we're right here --- but the mathematics of the twentieth century you could almost say is set-theoretical, ``structural'' would be a way to describe it. The mathematics of the nineteenth century was concerned with formulas, infinite Taylor series perhaps. But the mathematics of the twentieth century went on to a set-theoretic level of abstraction.

And in part that's due to Cantor, and some people hate it saying that Cantor wrecked and ruined mathematics by taking it from being concrete and making it wishy-washy, for example, from hard analysis to abstract analysis. Other people loved this. It was very controversial.

It was very controversial, and what didn't help is in fact that there were some contradictions. It became more than just a matter of opinion. There were some cases in which you got into really bad trouble, you got obvious nonsense out. And the place you get obvious nonsense out in fact is a theorem of Cantor's that says that for any infinite set there's a larger infinite set which is the set of all its subsets, which sounds pretty reasonable. This is Cantor's diagonal argument --- I don't have time to give you the details.

So then the problem is that if you believe that for any infinite set there's a set that's even larger, what happens if you apply this to the universal set, the set of everything? The problem is that by definition the set of everything has everything, and this method supposedly would give you a larger set, which is the set of all subsets of everything. So there's got to be a problem, and the problem was noticed by Bertrand Russell.


Bertrand Russell

Cantor I think may have noticed it, but Bertrand Russell went around telling everyone about it, giving the bad news to everyone! --- At least Gödel attributes to Russell the recognition that there was a serious crisis.
The disaster that Russell noticed in this proof of Cantor's was the set of all sets that are not members of themselves, that turns out to be the key step in the proof. And the set of all sets that aren't members of themselves sounds like a reasonable way to define a set, but if you ask if it's inside itself or not, whatever you assume you get the opposite, it's a contradiction, it's like saying this statement is false. The set of all sets that are not members of themselves is contained in itself if and only if it's not contained in itself.

So does this mean that some ways of defining sets are bad, or that the universal set gets you into trouble? What's wrong with the set of everything? So there was a problem with set theory --- that became increasingly clear. I think Russell helped to make it be recognized by everybody that we had a serious crisis and that methods of reasoning that seemed at first sight perfectly legitimate in some cases led to obvious disaster, to contradictions. There were a whole bunch of paradoxes that Russell advertised: the Berry paradox, the one I just mentioned is called the Russell paradox, and there's another paradox, the Burali-Forti paradox.

A lot of these paradoxes in fact were really brought to the attention of the world by Russell. Russell would typically have a footnote saying this paradox occurred to me while I was reading a paper by Burali-Forti, so everyone calls it the Burali-Forti paradox. Burali-Forti I think spent his whole life trying to live down this attribution because he didn't believe that mathematics was in trouble!

Okay so there was a crisis, and I think Russell was one of the key figures in this. At this point David Hilbert comes to the rescue.


David Hilbert

David Hilbert was a very important mathematician around the turn of the century. Unlike Poincaré, a very important French mathematician --- Hilbert was a very important German mathematician --- Hilbert liked set theory. He liked this abstract Cantorian approach. And Hilbert had the idea of solving once and for all these problems. How was he going to do it?
The way Hilbert was going to do it is using the axiomatic method, which of course goes back to Euclid --- Hilbert didn't invent this. But he went one significant step further.


Hilbert: Formal Axiomatic Method

Hilbert said let's use all the technology from symbolic logic, which a lot of people were involved in inventing, and let's go to some final extreme. Because one of the reasons you got into trouble and got contradictions in mathematics with set theory is because words are very vague. What we want to do to get rid of all these problems in mathematics and in reasoning is get rid of pronouns for example, you don't know what pronouns refer to. And there are all kinds of things that are vague in normal language.
Hilbert said that the way to get rid of all these problems is to come up with a finite set of axioms and an artificial language for doing mathematics --- this is the idea of formalism taken to the limit.


Formalism


Take formalism to the absolute limit and invent a completely artificial language with completely precise rules of the game --- artificial grammar and everything --- and eliminate all these problems, like the problems that Russell had. This was an ambitious program to once and for all put mathematics on a firm footing.
And one thing that Hilbert emphasized, which was as far as I know a key contribution that he himself made, was that he wanted the rules of the game for this formal axiomatic system for all of mathematics to be so precise that you have a mechanical proof checker. So it's completely certain and objective and mechanical whether a proof obeys the rules or not. There should be no human element, there should be no subjective element, there should be no question of interpretation. If somebody claims they have a proof, it should be absolutely clear, mechanical, to check it and see, does it obey the rules and you proved a theorem or does it have a mistake, does it fail.

So this is the idea that mathematics should be absolutely black or white, precise, absolute truth. This is the traditional notion of mathematics.


Black or White

The real world we know is an absolute mess --- right? --- everything's complicated and messy. But the one place where things should be absolutely clear, black or white, is in pure mathematics.
So this is sort of what Hilbert is saying, and he proposed this as a goal, to have this formalization of all of mathematics and eliminate all the problems. Now this was a program, this was not supposed to be something you did over a weekend. Hilbert proposed this as a goal for putting mathematics on a very firm foundation. And he and a group of very bright collaborators, including John von Neumann, set to work on this, and for a while, for thirty years, it looked sort of encouraging. And then --- this is a quick summary of a century of work --- then as I'm sure all of you know there were a few little problems!

The problems are 1931, Kurt Gödel, and 1936, Alan Turing.


1931 Gödel
1936 Turing

They showed that it could not be done, that there were fundamental obstacles to formalizing all of mathematics and making mathematics absolutely black and white and absolutely crystal clear. Remember what Hilbert is proposing is that we should formalize all of mathematics so that everyone on planet earth can agree that a proof is either correct or incorrect. The rules of the game should be absolutely explicit, it should be an artificial language and then mathematics will give you absolute truth. ``Absolute truth'' should be underlined in a very beautiful font and you should hear the angels singing when you say these words! This was the thought that we mathematicians have absolute truth. It's ours --- no one else has it, only us! That was the idea.
So it turns out this doesn't quite work. Why doesn't it work?

Gödel shocked people quite a bit by showing that it couldn't work. It was very, very surprising when Gödel did this in 1931. And Turing went I think more deeply into it. So let me give you a cartoon five minute summary, my take on what they did.

Gödel starts with ``this statement is false'', what I'm now saying is a lie, I'm lying. If I'm lying, and it's a lie that I'm lying, then I'm telling the truth! So ``this statement is false'' is false if and only if it's true, so there's a problem. Gödel considered instead ``this statement is unprovable''.


``This stmt is unprovable!''

Here unprovable means unprovable from the axioms of Hilbert's formal axiomatic system, unprovable within the system that Hilbert was trying to create.
Now think about a statement that says that it's unprovable. There are two possibilities: it's provable or it's unprovable. This is assuming you can make a statement say it's unprovable, that there's some way to say this within Hilbert's system. That required enormous cleverness: Gödel numbering, trickery for a statement to refer to itself indirectly, because pronouns that say ``this'' or ``I'' aren't usually found in mathematical formulas. So this required a lot of cleverness on Gödel's part. But the basic idea is ``this statement is unprovable''.

So there are two possibilities. Either it's provable or it's unprovable. And this means provable or unprovable from the system that Hilbert had proposed, the final goal of formalizing all of mathematics.

Well, if it's provable, and it says it's unprovable, we're proving something that's false. So that's not very nice. And if it's unprovable and it says it's unprovable, well then, what it states is true, it's unprovable, and we have a hole. Instead of proving something false we have incompleteness, we have a true statement that our formalization has not succeeded in capturing.

So the idea is that either we're proving false statements, which is terrifying, or we get something which is not as bad, but is still awful, which is that our formal axiomatic system is incomplete --- there's something that's true but we can't prove it within our system. And therefore the goal of formalizing once and for all all of mathematics ends up on the floor!

Now I don't think that Hilbert really wanted us to formalize all of mathematics. He didn't say that we should all work in an artificial language and have formal proofs. Formal proofs tend to be very long and inhuman and hard to read. I think Hilbert's goal was philosophical. If you believe that mathematics gives absolute truth, then it seems to me that Hilbert has got to be right, that there ought to have been a way to formalize once and for all all of mathematics. That's sort of what mathematical logic was trying to do, that's sort of what the axiomatic method was trying to do, the idea of breaking proofs into smaller and smaller steps. And Leibniz thought about this, and Boole thought about this, and Frege and Peano and Russell and Whitehead thought about this. It's the idea of making very clear how mathematics operates step by step. So that doesn't sound bad. Unfortunately it crashes at this point!

So everyone is in a terrible state of shock at this point. You read essays by Hermann Weyl or John von Neumann saying things like this: I became a mathematician because this was my religion, I believed in absolute truth, here was beauty, the real world was awful, but I took refuge in number theory. And all of a sudden Gödel comes and ruins everything, and I want to kill myself!

So this was pretty awful. However, this


``This stmt is unprovable!''

is a very strange looking statement. And there are ways of rationalizing, human beings are good at that, you don't want to face unpleasant reality. And this unpleasant reality is very easy to shrug off: you just say, well, who cares! The statements I work with normally in mathematics, they're not statements of this kind. This is nonsense! If you do this kind of stupidity, obviously you're going to get into trouble.
But that's rationalizing too far. Because in fact Gödel made this


``This stmt is unprovable!''

into a statement in elementary number theory. In its original form, sure, it's nonsense, who ever heard of a statement in mathematics that says it's unprovable? But in fact Gödel made this into a numerical statement in elementary number theory, in arithmetic. It was a large statement, but in some clever way, involving Gödel numbering of all arithmetic statements using prime numbers, he was writing it so that it looked like a statement in real mathematics. But it really indirectly was referring to itself and saying that it's unprovable.
So that's why there's a problem. But people didn't really know what to make of this. So I would put ``surprising'' here, surprising, a terrible shock!


1931 Gödel ``This stmt is unprovable!'' Surprising

Now my reaction as a child reading this proof is that I follow it step by step, but I don't like it. It doesn't appeal to me! Which is good, because if I had said I like it, it's wonderful, finished, I go ahead and become a molecular biologist and start up a biotech company, and now I'd be rich, but I wouldn't have done any work in this area!
Then comes Turing.


1936 Turing

Now I prefer Turing's approach. Turing goes more deeply into this. Turing starts talking about computers. This is the point where it happens!

1936 Turing Computer

Turing has to invent the computer, because Hilbert says that there should be a mechanical procedure to decide if a proof is correct or not. Turing says what Hilbert really means is that there should be a computer program for checking proofs. But first Turing has to say what a computer is, it's a Turing machine, and all of this is in a paper of Turing's in 1936, when there were no computers, so it's a fantastic piece of work. And I would like to claim that this is the invention of the computer. These were general-purpose computers, that was the idea, on paper.
What Turing shows is in fact that there is a relatively concrete statement that escapes the power of mathematics. We now think of computers as physical devices, so they're almost like something in physics. It's a machine working away, it's an idealization of that, you have this machine working, and Turing discovers the halting problem.


1936 Turing Computer Halting problem

The halting problem says there's no way to decide if a computer program will eventually halt.
Now obviously to decide if a computer program halts is the easiest thing in the world. You run it and when you run out of patience, that's it, it doesn't halt as far as you're concerned. Who cares, you can't wait any longer! But what Turing showed is that there's a problem if you put no time limit. This is very abstract mathematics --- in the real world there's always a time limit! You can't run a program a million years, a billion years, 101010 years! If you put a time limit, the halting problem is very easy to decide, in principle: you just run the program that long and you see, does it halt by that point or not.

But what Turing showed is that if you put no time limit, then there is no solution. There's no way to decide in advance whether a computer program will halt or not. If it halts you can eventually discover that by running it. The problem is to realize that you've got to give up. So there's no mechanical procedure that will decide in advance if a computer program will halt or not, and therefore, it turns out, there is no set of mathematical axioms in Hilbert's sense that can enable you to prove whether a program will halt or not.

Because if you could always prove whether a program will halt or not, you could run through all possible proofs in size order and check whether they're correct, and eventually either find a proof that the program's going to halt or find a proof that it's not going to halt. And this would give you a way to decide in advance whether a program's going to halt.

Now in practice running through all possible proofs requires an astronomical amount of time. Imagine how many proofs are there that are one page long! You'd never get through them! But in principle you can run through all possible proofs in size order and check whether they obey the rules, if it's a Hilbert formal axiomatic system. So if you had a formal axiomatization of mathematics that enabled you to always prove whether a program halts or not, that would give you a mechanical procedure, by running through all possible proofs in size order, to decide whether a program will halt or not. And Turing showed that you can't do it. His proof, by the way, involves Cantor's diagonal argument --- all these ideas are connected, but there's no time to go into that.

So I think that Turing's work makes the limits of mathematics seem much more natural, because we're talking about a question about a physical device, it's a computer.


1936 Turing Computer Halting problem Natural

You fantasize a little bit, you make it a theoretical computer, a computer that can go on forever, that never breaks down, that has as much storage as it wants, so that if numbers get too big it can keep going anyway. But that's not too much of a fantasy; we have devices like that everywhere, right? So it sounds much more concrete. The limits of mathematics discovered by Turing sound more serious, more dangerous than the ones that Gödel found.
And this is the invention of the computer, for this crazy kind of theoretical argument! You don't see billions and billions of dollars of technology in this 1936 paper, but it was all there in embryonic form, as von Neumann kept emphasizing: the universal Turing machine is really the notion of a general-purpose programmable computer. You had machines that did calculations before, but they did special-purpose calculations, they were adding machines, mechanical calculating machines, and I used them when I was a kid. But the notion of a computer is Turing's notion of a machine that can do what any calculating machine can do, and that's the idea of software: it's a very general-purpose machine, it's a flexible machine. So it's really there, von Neumann kept saying, very clearly in Turing's paper. So you have this whole technology there!

And in fact Gödel's paper as I said uses LISP, there's a programming language hidden in it, and in Turing's paper there's a programming language, given explicitly, Turing machines, and it's a machine language. It's actually a very bad machine language, it's a machine that no person in their sane mind would want to program. But Turing wanted to keep it as simple as possible. Obviously, if his paper had included a manual for the machine language of a real machine, it would have been hopeless, no one would have understood it.

Okay, now what happens with all of this? What happens with all of this is that Hilbert dies, World War II comes, and when I'm a child in the 1950's I could still read essays by John von Neumann talking about all of this, but the world was clearly going in a less philosophical direction. Things were going downhill rapidly until we're all billionaires with our web start-ups! People were less concerned about philosophy, and computers were becoming a technology, and Turing was very involved in that, and so was von Neumann.

But stupidly I wanted to understand what was going on in the foundations of mathematics, so in a way I'm stuck in the 1930's, I never got past that stage. What happened? What happened with me is that I couldn't accept the fact that everybody said, who cares! Now it's true that there are a lot of things in life besides the foundations of mathematics and epistemology! There're things like having a family, earning a living, wars, politics, lots of stuff out there, obviously! But what I couldn't accept was that even in the world of pure mathematics, mathematicians were saying, so what, in practice we should do mathematics exactly the same as we've always done it, this does not apply to the problems I care about! That was basically the reaction to Gödel's and Turing's work on incompleteness.

At first there was terrible shock, then it went from one extreme to another. Who cares, people would say, it's obvious, or it's irrelevant! This has no impact in practice on how we should do mathematics. I was very unhappy with that. I was obsessed by incompleteness, and I had an idea.

When I was a kid I really wanted to be a physicist, and a lot of mathematicians say I never made it into mathematics really --- I never succeeded, I'm still stuck! I wanted to be a physicist, and I got corrupted by a lot of ideas from physics. While all of this crisis was going on in mathematics, there was a parallel crisis going on in physics, which actually started in the 1920's: that's quantum mechanics, and the key date is 1924.


1924 Quantum Mechanics

And that's the whole question of uncertainty and randomness in fundamental physics. So when I was a kid, besides reading essays talking about Gödel's incompleteness theorem saying ``Oh, my God'', there were also essays asking what happened to determinism in physics, what happened to predictability, can there be randomness, does God play dice? Einstein said no, God doesn't play dice. He hated quantum mechanics. And everybody else said yes, God plays dice.

God plays dice!

Quantum mechanics is the most successful physical theory ever. We get transistors and computers from it. But even though Einstein helped to contribute to the creation of quantum mechanics he hated it. So it looks like Einstein was wrong. God does play dice!
So I had a crazy idea. I thought that maybe the problem is larger and Gödel and Turing were just the tip of the iceberg. Maybe things are much worse and what we really have here in pure mathematics is randomness. In other words, maybe sometimes the reason you can't prove something is not because you're stupid or you haven't worked on it long enough, the reason you can't prove something is because there's nothing there! Sometimes the reason you can't solve a mathematical problem isn't because you're not smart enough, or you're not determined enough, it's because there is no solution because maybe the mathematical question has no structure, maybe the answer has no pattern, maybe there is no order or structure that you can try to understand in the world of pure mathematics. Maybe sometimes the reason that you don't see a pattern or structure is because there is no pattern or structure!

And one of my motivations was the prime numbers. There's some work on the prime numbers that says that in some ways the prime numbers can be looked at statistically. There seems to be a certain amount of randomness in the distribution of the primes. That's one of the ways that people try to think about the prime numbers. And this even happens in number theory, which is the queen of pure mathematics!

So on the one hand I heard this talk about probabilistic ways of thinking about the primes --- this was heuristic --- and this stuff about God plays dice in fundamental physics --- what goes on in the atom is random --- and I begin to think, well, maybe that's what's going on in the foundations of mathematics.

This is what I set out to do, and this project took a long time. One of the first steps is clarifying what do you mean by randomness. What do you mean by lack of structure, lack of order, lack of pattern?


Randomness: lack of structure

So this is a kind of a logical notion of randomness rather than a statistical notion of randomness. It's not like in physics where you say a physical process is random like coin tossing. I don't care where something comes from. I just look at something and say does it have structure or pattern or not. So this is logical or structural randomness as opposed to physical unpredictability and randomness. It's different --- it's very closely related, but they're different.
And the idea that I came up with --- and Kolmogorov came up with at the same time independently --- is the idea that something is random if it can't be compressed into a shorter description, if essentially you just have to write it out as it is. In other words, there's no concise theory that produces it. For example, a set of physical data would be random if the only way to publish it is as is in a table, but if there's a theory you're compressing a lot of observations into a small number of physical principles or laws. And the more the compression, the better the theory: in accord with Occam's razor, the best theory is the simplest theory. I would say that a theory is a program --- also Ray Solomonoff did some thinking along these lines for doing induction --- he didn't go on to define randomness, but he should have! If you think of a theory as a program that calculates the observations, the smaller the program is relative to the output, which is the observations, the better the theory is.

By the way, this is also what axioms do. I would say that axioms are the same idea. You have a lot of theorems or mathematical truth and you're compressing them into a set of axioms. Now why is this good? Because then there's less risk. Because the axioms are hypotheses that you have to make and every time you make a hypothesis you have to take it on faith and there's risk --- you're not proving it from anything, you're taking it as a given, and the less you assume, the safer it is. So the fewer axioms you have, the better off you are. So the more compression of a lot of theorems, of a body of theory, into a small set of axioms, the better off you are, I would say, in mathematics as well as physics.

Okay, so this is this notion of lack of structure or randomness. You have to define it first! If I'm going to find randomness or lack of structure, lack of pattern, in pure mathematics, first I've got to say what do I mean by that. And I like to call this subject algorithmic information theory. It deals with this algorithmic information. Or you can call it complexity if you like, program-size complexity.


Algorithmic Information

The basic concept is to look at the size of the most concise program, the smallest program --- I don't care about running time --- it's the most concise program that calculates something. That's the number of bits I have to give a computer in order to get it to produce this object. That's my most concise algorithmic description of something, and that's how I measure its complexity, its algorithmic information content or its program-size complexity.
This is like recursive function theory: I don't care about run time --- so this is very impractical! So in that sense also what I'm doing is 1930's stuff, with this one extra idea thrown in of program size, of looking at the size of programs.

So what happens when you start looking at the size of programs? --- and then something is random if the smallest program that calculates it is the same size as it is, and there's no compression. So the whole idea is, look at the size of computer programs, don't care about run time --- if it takes a billion, billion years I don't care! Information is the only thing I'm thinking about, bits of information, size of computer programs. Okay?

So what happens when you start playing with this idea? What happens is, everywhere you turn, you get incompleteness and undecidability, and you get it in the worst possible way. For example this happens with the first thing you want to do: you can never decide that an individual string of digits satisfies this definition of randomness. Impossible! You can never calculate the program-size complexity of anything. You can never determine what the size of the smallest program is.

If you have a program that calculates something, that gives you an upper bound, its size is an upper bound on the program-size complexity of what it calculates. But you can never prove any lower bounds. And that's my first incompleteness result in this area and I think Jack Schwartz got very excited about it.

In normal, practical, useful complexity theory where you talk about time rather than bits of information, lower bounds are much harder than upper bounds. To get lower bounds on complexity is much harder than getting upper bounds on complexity. Because if you find a clever algorithm you get an upper bound on the time it takes to calculate something; if you find a way to do it that's fast you've shown that it can be done that fast. The problem is to show that you've gotten the fastest possible algorithm, that's much harder, right? But it can be done in some cases, within a class of possible algorithms. Well, in algorithmic information theory you can't prove any lower bounds! And I had an article about this in 1975 in Scientific American.

The basic idea is that you can't prove any lower bounds on the program-size complexity of individual objects. So in particular even though most strings of digits satisfy this definition of randomness, they're incompressible in this sense, they're random in this sense of lack of structure --- it turns out you can show easily that most objects satisfy this definition, they have no structure --- if you look at all hundred digit numbers, almost all of them have no structure according to this definition, but you can never be sure in individual cases, you can never prove it in individual cases.

More precisely, there may be finitely many exceptions. With N bits of axioms you can determine all the objects of program-size complexity up to N. But that's as far as you can go.

And my worst incompleteness result, my very worst incompleteness result, where you have complete lack of structure in pure mathematics, has to do with a number I defined called the halting probability.

O = halting probability

How is this number defined? It's very simple. Turing said you can't decide whether a program halts, there's no mechanical procedure for doing that. And I say, let's consider a real number O which is the probability that a program generated by tossing a coin halts. So I'm averaging over Turing's halting problem, saying if I generate a program by coin tossing, what is the probability that it halts, with no time limit? So this will give me a real number that's determined if you tell me --- there's a subscript --- what's the programming language.

Ocomputer = halting probability of computer

Once you decide, then O is a well-defined real number. Mathematically it's not a very sophisticated thing! Compared to large cardinals, sophisticated mathematics, this is a fairly low-brow object.
However it turns out this object O is maximally unknowable!


O is maximally unknowable

What is it that's maximally unknowable? Well, it's the digits or bits of this number. Once I fix the computer programming language this halting probability is a specific real number, that depends on the choice of computer, or the programming language in which I generate a program by coin tossing. So this becomes a specific real number, and let's say I write it out in binary, so I get a sequence of 0's and 1's, it's a very simple-minded definition. Well, it turns out these 0's and 1's have no mathematical structure. They cannot be compressed. To calculate the first N bits of this number in binary requires an N-bit program. To be able to prove what the first N bits of this number are requires N bits of axioms. This is irreducible mathematical information, that's the key idea.

O is irreducible information

This should be a shocking idea, irreducible mathematical information, because the whole normal idea of mathematics, the Hilbertian idea, the classical idea of mathematics, is that all of mathematical truth can be reduced to a small set of axioms that we can all agree on, that are ``self-evident'' hopefully. But if you want to determine what the bits of the halting probability O are, this is something that cannot be reduced to anything simpler than it is.

O has a mathematical definition with a rather simple structure once I specify the computer, or the programming language, I've even written out a program in LISP that calculates this number in a weak sense. You can't calculate this number. If you could calculate it, then it wouldn't be unknowable! You can get it in the limit from below, but it converges very, very slowly --- you can never know how close you are --- there is no computable regulator of convergence, there is no way to decide how far out to go to get the first N bits of O right. To get O in the limit from below, you just look at more and more programs, for more and more time, and every time you see that a K-bit program halts, that contributes 1/2K to the halting probability.


O = &#8721; p halts 2 -|p|

So the time you need to get the first N bits of O right grows like the longest possible finite run-time of an N-bit program, which is a version of the Busy-Beaver function.
So what's the precise definition of O? Generate a program by tossing a coin for each bit, that's independent tosses of a fair coin. The key point is that the program has to be ``self-delimiting''. The computer has got to ask for each bit one by one. Every time the computer says I want another bit of the program, you flip the coin. And the computer has to decide by itself that it has enough bits, that it has the whole program. The program has to be self-delimiting to define this probability measure correctly. So there's no blank to indicate where a program ends: a program has to indicate within itself how long it is with some trick, some coding trick. That's the technical issue to get this probability to be well-defined. That's the one technical point in my theory.

So this number O is a real number between 0 and 1. It's the probability that a program each of whose bits is generated by an independent toss of a fair coin eventually halts. And I'm fixing the programming language, I pick the universal Turing machine, there's a subscript, it's OUTM, it's the halting probability of a particular universal Turing machine. And I actually pick a particular UTM that I programmed in LISP, just to fix the ideas. But you could do it with essentially any universal Turing machine with self-delimiting programs; it would work.

So O is maximally unknowable. This is a case where mathematical truth has no structure or pattern and it's something we're never going to know! So let me tell you what I've got here. What I've got here is maximum randomness --- like independent tosses of a fair coin --- in pure mathematics. In fact, I can even do it in elementary number theory, like Gödel did. I can make determining bits of O into an assertion about a diophantine equation.

The point is, here you've got a simple mathematical question --- which is what is each bit of O: is the first bit 0 or 1, is the second bit 0 or 1, is the third bit 0 or 1 --- but the answers have no structure, they look like independent tosses of a fair coin, even though each answer is well-defined mathematically, because it's a specific bit of a specific real number and it has to be a 0 or a 1. In fact, we're never going to know: this is my version of independent tosses of a fair coin in pure mathematics. Even if you knew all the even bits of O it wouldn't help you to get any of the odd bits. Even if you knew the first million bits, it wouldn't help you to get the next one. It really looks like independent tosses of a fair coin, it's maximally random, it has maximum entropy.

Physicists feel comfortable with randomness, but this is the black or white world of pure mathematics --- how is this possible, how can it be? Each of these bits is well-defined, it's a specific 0 or a 1, because O is a specific real number once I fix the universal Turing machine or the programming language that I'm dealing with. But it turns out that the right way to think about each bit is that it's not black or white, it's not that it's a 0 or a 1, it's so well balanced, it's so delicately balanced, that it's grey!

Here's another way to put it. Let's go back to Leibniz. What's the idea of mathematics? The normal idea is that if something is true, it's true for a reason --- Leibniz! --- if something is true it's true for a reason. Now in pure math, the reason that something is true is called a proof, and the job of the mathematician is to find proofs, to find the reason something is true. But the bits of this number O, whether they're 0 or 1, are mathematical truths that are true for no reason, they're true by accident! And that's why we will never know what these bits are.

In other words, it's not just that Hilbert was a little bit wrong. It's not just that the normal notion of pure mathematics is a little bit wrong, that there are a few small holes, that there are a few degenerate cases like ``This statement is unprovable''. It's not that way! It's much, much worse than that! There are extreme cases where mathematical truth has no structure at all, where it's maximally unknowable, where it's completely accidental, where you have mathematical truths that are like coin tosses, they're true by accident, they're true for no reason. That's why you can never prove whether individual bits of O are 0 or are 1, because there is no reason that individual bits are 0 or 1! That's why you can't find a proof. In other words, it's so delicately balanced whether each bit is 0 or 1 that we're never going to know.

So it turned out that not only Hilbert was wrong, as Gödel and Turing showed... I want to summarize all of this. With Gödel it looks surprising that you have incompleteness, that no finite set of axioms can contain all of mathematical truth. With Turing incompleteness seems much more natural. But with my approach, when you look at program size, I would say that it looks inevitable. Wherever you turn, you smash up against a stone wall and incompleteness hits you in the face!


Program-size complexity   &   O   &   irreducible information
&#8594;   make incompleteness seem inevitable

So this is what I've been working on. Now what is the reaction of the world to this work?! Well, I think it's fair to say that the only people who like what I'm doing are physicists! This is not surprising, because the idea came in a way from physics. I have a foreign idea called randomness that I'm bringing into logic, and logicians feel very uncomfortable with it. You know, the notion of program size, program-size complexity is like the idea of entropy in thermodynamics. So it turns out that physicists find this nice because they view it as ideas from their field invading logic. But logicians don't like this very much.

I think there may be political reasons, but I think there are also legitimate conceptual reasons, because these are ideas that are so foreign, the idea of randomness or of things that are true by accident is so foreign to a mathematician or a logician, that it's a nightmare! This is their worst nightmare come true! I think they would prefer not to think about it.

On the other hand, physicists think this is delightful! Because they remember well the crisis that they went through in the 1920's about randomness at the foundations of physics, and they say, it's not just us, we're not the only people who have randomness, pure math has it too, they're not any better than we are!

I'll give an example of the attitude of physicists to my theory. It just so happens that this week I found it by chance. There's an English magazine New Scientist that comes out every week; it's like an English version of Scientific American, except that it's a little livelier, it's a little more fun, and it comes out every week. And the current issue --- the one that appeared February 26th, the next issue hasn't come out yet --- of New Scientist has on its cover an article called ``Random Reality''. And if you open the issue and look at this article, it turns out to be an article about the work of two physicists, very speculative work. They're trying to get space and time, three or four dimensional spacetime, our world, to emerge from a random substratum underneath.

Go look at it if you like. There's a link on my web site to this article, ``Random Reality''. Or get the New Scientist.

The reason that I mention this article is that these physicists say that their work was inspired by Gödel's and my work on the limits of logic; they're trying to absorb this stuff. They say that physicists were interested in Gödel's result, but they couldn't relate to it, it's not in terms that make sense to a physicist. But my work, they say, that makes sense to a physicist! It's not surprising: I got the idea by reading physics. So it makes sense to them because it's an idea that came from their field and is coming back to their field.

Actually, they don't use my definitions or my theorems at all, because I was asked to referee their paper, and I had to say that it really has nothing to do with me. My stuff is mentioned in the introduction because it helped to stimulate their work, but actually their work is in physics and has nothing to do with my area, which is algorithmic information theory.

But I think this is an interesting example of the fact that crazy ideas sometimes have unexpected consequences! As I said, formal systems did not succeed for reasoning, but they succeeded wonderfully for computation. So Hilbert is the most incredible success in the world, but as technology, not as epistemology.

And unexpectedly there are physicists who are interested in my notion of program-size complexity; they view it as another take on thermodynamical entropy. There's some work by real physicists on Maxwell's demon using my ideas; I mention this for those of you who have some physics background.

But I must say that philosophers have not picked up the ball. I think logicians hate my work, they detest it! And I'm like pornography, I'm sort of an unmentionable subject in the world of logic, because my results are so disgusting!

So this is my story! To end, let me quote from a posthumous collection of essays by Isaiah Berlin, The Power of Ideas, that was just published: ``Over a hundred years ago, the German poet Heine warned the French not to underestimate the power of ideas: philosophical concepts nurtured in the stillness of a professor's study could destroy a civilization.'' So beware of ideas, I think it's really true.

Hilbert's idea of going to the limit, of complete formalization, which was for epistemological reasons, this was a philosophical controversy about the foundations of mathematics --- are there foundations? And in a way this project failed, as I've explained, because of the work of Gödel and Turing. But here we are with these complete formalizations which are computer programming languages, they're everywhere! They pay my salary, they probably pay your salary... well, this is the School of Computer Science, it pays for all of this, right? Here we are!

So it worked! In another sense, it worked tremendously.

So I like to apologize in an aggressive way about my field. I like to say that my field has no applications, that the most interesting thing about the field of program-size complexity is that it has no applications, that it proves that it cannot be applied! Because you can't calculate the size of the smallest program. But that's what's fascinating about it, because it reveals limits to what we can know. That's why program-size complexity has epistemological significance.

More seriously, I think the moral of the story is that deep ideas don't have a spin-off in dollars right away, but sometimes they have vastly unexpected consequences. I never expected to see two physicists refer to my stuff the way they did in ``Random Reality''. So who knows!

It's true that the computer pays for our salaries but I think it's also true that there are a lot of fascinating impractical ideas out there, and sometimes when an idea is so beautiful --- I've been having wonderful conversations with people here --- Peter Lee told me over lunch, this idea is so beautiful, it's got to be right! Those are the ideas to watch out for! Those are the dangerous ones, the ones that can transform our society. This little idea of a web, for example, of linking stuff into a web! Or the idea of having completely artificial languages, because then it becomes mechanical to see what they mean... Very dangerous ideas! Thanks very much!

Bibliography
G.J. Chaitin, Algorithmic Information Theory, Cambridge University Press, 1987.
G.J. Chaitin, Information, Randomness & Incompleteness, World Scientific, 1987.
G.J. Chaitin, Information, Randomness & Incompleteness, 2nd Ed., World Scientific, 1990.
G.J. Chaitin, Information-Theoretic Incompleteness, World Scientific, 1992.
G.J. Chaitin, The Limits of Mathematics, Springer-Verlag, 1998.
G.J. Chaitin, The Unknowable, Springer-Verlag, 1999.
I've recently finished programming my entire theory in LISP. This will eventually be my book Algorithmic Information Theory in LISP, in preparation. [This book!]


--------------------------------------------------------------------------------


Notes

A transcript of another version of ``A century of controversy over the foundations of mathematics'' was published in C. Calude and G. Paun, Finite versus Infinite, Springer-Verlag London, 2000, pp. 75-100.

For a newer version of the physics described in the New Scientist ``Random reality'' article, see http://arXiv.org/abs/gr-qc/0009023, ``Process physics: modelling reality as self-organising information,'' by R. Cahill, C. Klinger and K. Kitto. The original reference is M. Chown, ``Random reality,'' New Scientist, Vol. 165, No. 2227 (26 February 2000), pp. 24-28.

For a reaction to my work from mathematicians, see J. Casti and W. DePauli, Gödel, Perseus Publishing, 2000. For a reaction from a physicist, see D. Ruelle, Chance and Chaos, Princeton University Press, 1991. For a reaction from philosophers, see L. Brisson and F. Walter Meyerstein, Puissance et limites de la raison, Les Belles Lettres, 1995; Inventing the Universe, SUNY Press, 1995; Inventer L'Univers, Les Belles Lettres, 1991.

Even though Gödel and Turing exploded Hilbert's dream of a formal axiomatic system for all of mathematics, for several decades Jacob T. (``Jack'') Schwartz (Courant Institute, NYU) has been working on precisely this! But what Schwartz is trying to achieve is not a formal axiomatic system that captures all of mathematical truth, which is impossible, but just one that can get to a significant theorem of real mathematics, say, the Cauchy integral theorem in the theory of functions of a complex variable. The goal is formal proofs that are comprehensible to humans and that simultaneously can be verified by a mechanical proof checker. In order to achieve this, Schwartz uses as his formal system a rich axiomatization of set theory within first order logic. And he and collaborators have discovered decision procedures for several significant sublanguages of set theory, which the proposed formal proof checker would incorporate. As an application for such a system, Schwartz envisions a proof checker that examines programs written in a set-theoretic programming language interspersed with assertions, and that attempts to verify that all the subsequent assertions follow from the initial assertion. Schwartz has convinced me that probably no fundamental obstacles stand in the way of achieving these goals, and that what is required is basically good engineering and a great deal of hard work. Schwartz has programmed substantial fragments of this system in his powerful set-theoretic programming language SETL, which employs standard mathematical notation together with finite sets and mappings. And he has also worked through many of the necessary proof schemas. I believe that his intuition that general-purpose first order logic is too weak, and that a rich axiomatization of set theory is a much better basis for actually developing a formalization of real, working mathematics, is fundamentally sound."

http://www.umcs.maine.edu/~chaitin/cmu.html

Status ?

Done. Started on
2/14/05 6:30 PM
Schedule
Start: 2/14/05 6:30 PM
Round 1 Due: 2/15/05 6:30 PM
Round 2 Due: 2/16/05 6:30 PM
Final Due: 2/17/05 6:30 PM