22C:096
Computation, Information, and Description

Department of Computer Science
The University of Iowa

Lecture Notes

Last modified: 10 June 1997

Three simple computing systems

Although we are interested in why computing systems are what they are, I think that we will understand the why better if we already know some of the what. So, we will look at 3 particular computing systems just to observe how they work, and then understand why they are computing systems later. We are most familiar with computing systems that are implemented electronically, and intended to perform impressive (perhaps even useful) computations as efficiently as possible. But, the computing systems that we study in this course are designed to illuminate the essential nature of computation. They are much simpler than the popular commercial systems, but they can accomplish the same computations.

The Post-Turing Machine: modelling the physical behavior of human computers

Historical Note

We noticed before that ``computer'' was a job title, before it referred to an electronic device. Alan Turing was involved in the creation of some of the first electronic computers. But, when he set out to define the basic nature of computation, he was particularly concerned to capture computation by humans. When he defined the Post-Turing Machine, he called it a machine to emphasize the fact that it could be automated. But, he designed the structure of the Post-Turing Machine to mimic the behavior of a human computer. The Turing Machine only mimics the deliberate and conscious compuational behavior of a human. Many cognitive psychologists, including Jerry Fodor, whose work we will look at later, believe that much of human thinking should be understood as a type of computation. If so, then the results of thinking can be achieved by a Post-Turing Machine. But, the Post-Turing Machine is not designed to mimic the structure of human thinking. In fact, Turing was able to capture the structure of deliberate and conscious human compuation precisely because deep thinking is not really required for such behavior. The structure of thought as computation is still a mystery.

Emil Post discovered the idea of the Post-Turing Machine independently of Turing, at about the same time. Post was a teacher in grade school in the USA (high school in New York I think: I'll have to verify this). He suffered from some nervous/emotional disorder. And, mathematicians in the USA at the time tended to ignore work on logic and compuation. So, Post's work was not given the attention that it deserved. I like Post's version of the machine slightly better than Turing's for teaching purposes. Post certainly understood a lot of important issues way ahead of his time. Even accounting for the unfortunate neglect of Post's work, Turing was even more advanced. Having given credit to Post, I will revert to convention, and refer to the ``Turing Machine,'' even when I use Post's version of it.

The Structure of the Machine

When we say ``the Turing Machine,'' we really mean ``the concept of machine proposed by Turing,'' or ``the class of machines called `Turing Machines.' '' That is, Turing and Post invented a theoretical type of machine, and there are an infinite number of different machines of that type. Why did they do this, instead of inventing a single machine, capable of running an infinite number of different programs? Probably because they wanted to make it clear that they were allowing for the variations in capability between different human computers, and different alphabets in which they might write down their computations. In a future lecture, we will see that there is very little difference between choosing a computer from a large class of computers, vs. programming a single computer, in terms of the computations that can be accomplished.

To design or choose a single Turing Machine from the infinite class of Turing Machines, the first thing to specify is the alphabet in which computations may be written. This alphabet may contain any finite number of symbols, and we assume that those symbols can be distinguished reliably from one another. A Turing machine has an unlimited one-dimensional scratch pad, called a ``tape,'' on which to write symbols involved in its comutation. The tape is unlimited in both directions (although we see later that this doesn't matter much). It is divided into squares, each one of which can hold a single symbol from the alphabet. During a computation, the machine considers the symbol on a single square of the tape (we say that it ``scans'' that square), it may leave that symbol unchanged, or replace it with any other symbol, and it may stay at the same square, or move one square to the left, or one square to the right.

The behavior of a Turing Machine is controlled by its program, which is the last item required to design/choose a specific Turing Machine. It doesn't matter much precisely how this program is written. Let's think of it as a sequence of labelled instructions, each in one of the following forms:

Move left
Move right
Print c
If c then go to l
Halt

In these forms, c stands for any character in the alphabet, and l stands for any label on an instruction. The intent of the first two instructions, and the last, should be obvious. ``Print c'' means print the character c on whatever tape square the machine is scanning, eliminating whatever character was there before. ``If c then go to l'' means check the square that you are scanning, and if the character c is there, continue by executing the instruction with the label l. If any character other than c is there, continue by executing the next instruction in sequence.

There are lots of little details that I've left out. The last instruction in a program should be ``Halt''. Each label should be unique, and the label in each ``If'' command should occur as the label of some instruction in the program. After executing any instruction other than ``If ...'' or ``Halt'' the machine goes on to the next instruction in the sequence. Oh yes, the machine starts with the first instruction in the sequence. These details are very important, but they are so conventional by now that we don't need to worry about them much. Also, it makes no important difference whether we say that Turing Machines are only those machines with programs satisfying the rules about labels and ``Halt''s, or whether we allow Turing Machines with bad programs, and assume that their computation crashes.

In summary, a Turing Machine is specified by

a finite alphabet of distinguishable symbols, and
a program consisting of a linear sequence of labelled instructions of the forms shown above.

A Turing Machine computes by executing the instructions in its program on a data structure called a ``tape'', which is essentially an infinite one-dimensional array of symbols from its alphabet. Since every Turing Machine has the same sort of tape, the tape is not part of the specification of the particular machine. The crucial thing that I've left out is the initial contents of the tape, and the location of the first square to be scanned, when the machine begins to execute its program.

Out of all the symbols in the alphabet, one is chosen to be the ``blank.'' Essentially everybody who writes examples of Turing Machine computations denotes the blank by white space, but all that matters is that blank is distinguishable from the other characters. We have an emotional tendency to consider white space in a special way, but in principle it is just another possible content for a region on the tape. If you think of the tape as real live paper, notice that one could in principle acquire the paper preprinted with some chosen pencil mark, instead of white. Whew. Before starting a Turing Machine computation, we write some finite sequence of nonblank characters on contiguous squares of the tape, leave the infinitely many other squares blank, and start the machine scanning the leftmost nonblank square. Most people who write about Turing Machines include the choice of blank character as part of the specification of the machine. This doesn't matter much.

It is important to have an even more careful and systematic definition of Turing Machines than I've given here. But, we don't need it in the lecture notes. I will give you a copy of Turing's and Post's original papers defining the Machines. And, I will lend you a copy of Turing's World by Barwise and Etchemendy, which includes a book and Macintosh software presenting a precise view of Turing Machines. You need to go through these precise descriptions and understand how they make sense as minor variations on what I've described above.

How is a Turing Machine Like a Human?

Turing, Post, and Barwise/Etchemendy all discuss at some length the sense in which Turing Machines capture the computational behavior of a person who is deliberately and consciously computing. Please read this material very carefully, since it is crucial to our understanding of the conceptual impact of Turing Machines. Here are some key points to look for. Turing et. al. cover these points more carefully than I do: my intent is to attract your attention to them.

The fact that real people don't always divide their work paper explicitly into squares is not important. As long as there is some limit to our visual precision, we can conceive of the paper as divided into pixels, each of which is either black or white. Or, if we insist on gray scale or color, we can only distinguish some finite collection of possible pixel contents. In principle, a person who claims to be computing with undisciplined scribblings can be described as computing at the pixel level.
The use of two-dimensional paper instead of a one-dimensional ``tape'' is not important, as long as we can demonstrate that anything accomplished with two-dimensional paper may also be accomplished with a tape. This is not obvious, but it is also not very hard, as theoretical mathematical proofs go. I'll sketch the structure of the proof for you, but the important thing for this course is to develop confidence that it is true. Notice that the pixelization idea in item 1 really requires the 2-1 reduction of dimension. Think carefully about the possibility that there is circular reasoning going on here. There isn't, but we should worry about the possibility.
The precise form of the instructions can be varied, as long as we have the basic capabilities to look at a symbol and, depending on what we see, to write a new symbol and move left or right.

Using these points, and other considerations, Turing et al. argue in detail that any computation that can be done deliberately and consciously by a human can also be done by an appropriately defined Turing Machine.

Subtleties Regarding Memory and Program

At first glance, it appears that the only memory in a Turing Machine is the tape. But, there is an implicit memory involved in keeping track of the instruction to execute in the program. In modern electronic computers, this memory is an explicit register, called the ``program counter,'' which is built essentially the same as other memory elements. In Turing Machines, it is treated as something different. But, since a Turing Machine has one fixed, finite program, the amount of memory in this implicit program counter is also fixed and finite. So, the only unlimited memory is the tape. Notice that not all of the tape memory is stored in the form of the symbols on the tape. The location of the scan, often called the ``read/write head,'' is another implicit memory resource, and this one has no fixed bound.

The form of Turing Machine with an explicit program is more like Post's version than Turing's. The operation of Post's version is easier to understand, but some of the detail of the connection to the human mathematician is clearer in Turing's version. Turing took the set of locations in the program (that is, the set of values that the program counter could take), and called them ``states.'' Instead of writing down the program as a list of instructions, Turing used a table that associates each possible combination of a state and a tape symbol with an action. The action includes an optional rewrite of the symbol, and an optional move of the head. So, each of Turing's actions might correspond to two instruction executions in my version above. It's not hard to see that this doesn't affect the basic capabilities of the machine.

In the Post version of the Turing Machine, with an explicit program, it is easy to get the (mis)impression that the program must be written down, and the human computer must be alternately reading the program and executing one of its instructions on the tape. In this case, the human computer needs to keep track of the program counter with some sort of mark on the paper, or perhaps a moveable token such as a pebble. One might object that this limits the possibilities for the human computer, who might prefer to use her own brain in some well defined way. To deal with this potential objection, Turing associated the states of the machine, not with points in a written program, but with ``states of mind'' of the human computer. This view still allows the human computer to work from written instructions (which might be in the form of a Post program, or Turing's table of operations, or some other form containing the same essential information). But, it also allows the human computer to memorize the program, or any other relevant information, as long as there is a fixed finite limit on the information kept in the human computer's mind.

At first, we might worry that Turing's presentation limits the human computer even more severely than Post's. Can we be sure that real live humans have only a fixed finite set of possible states of mind, each one reliably distinguishable from the others? Probably not, but Turing's analysis doesn't depend on such an assumption. Turing does not assume that the human mind is finite, he only claims that in carrying out a precisely defined computation, a human computer may only use a finite amount of mental capacity in a relevant way. A crucial quality of computation is that, given a careful description of the rules, any person who cares to can check the correctness of computation. There may be correctable errors in any particular person's computational work, but given sufficient care and attention, there is no room for serious debate about what is and isn't a computation according to the rules. Turing claims, and it is hard to avoid agreeing, that if a human computer uses an infinite amount of mental capacity, then there is no absolutely reliable, objective, and effective way to describe the rules to another person. The description must be finite, and to be reliable it must be essentially computational itself, .... There might be a small loophole here, but nobody has driven any impressive argument through it. In order to describe the use of unlimited mental capacity to perform computation, one must give a way of computing the result of the mental work, which means that all but a finite amount of the work must be carried out externally on paper for all to see, unless one already had a way of applying infinite mental capacity with computational reliability to the description of the computational use of infinite mental capacity. I don't think that we can quite prove that computation can always be done with only a finite use of the mind, but at least we've made it look very difficult if not impossible to prove the reverse. And, in some sense a computation isn't fully reliable, and therefore is not a computation, if we can't prove that it's a computation (prove in some intuitive sense which hasn't been made precise, not in the mathematical sense). Whew! I think I've said too much already. To appreciate this line of thought, you really have to stuff it into your mind and think it over for yourself.

I see one way to misunderstand the previous paragraph, and I don't see how to rewrite it, so I'll try to kill the misunderstanding a posteriori. Turing's argument does not require us to believe that computation cannot be carried out using infinite mental capacity (for example, to memorize the contents of the tape, and avoid actually writing it down). All that it requires is that, in order to be computation, all but a finite amount of the mental capacity must be replacable by external resources, such as the contents of the tape written on paper, in principle visible to any observer who cares to make the effort to verify things. Since we cannot observe one another's states of mind, we almost surely cannot verify one another's purely mental work reliably and objectively, without redoing that work in an external form. And, if we could observe one another's states of mind and verify their operations reliably and objectively, then almost surely we could also represent those mental operations in some sort of external medium equivalent (in its basic power as memory) to a Turing Machine tape.

Enough, if not more than.

Counter, or Bucket, Machines

I'm getting behind on typing the notes, so this part will be sketchy.

Turing Machines are designed to clarify the mechanizability of computation by humans. Counter/Bucket Machines illustrate how primitive the individual operations in computation can be.

A Counter Machine has some finite number of numerical registers, called ``counters'' because the only arithmetic operations are increment, decrement, and test for 0. A counter could be implemented as a bucket holding pebbles. Increment means place a pebble in a bucket, decrement means remove a pebble from a bucket, and test for 0 means look in a bucket to see whether it is empty. It's interesting to see how powerful a Counter Machine can be, even with a small number of counters. We did this in some detail in class, for now I'll just summarize the results.

One Counter

A Counter Machine with only one counter can't do very much. In addition to the primitive operations A := A+1, A := A-1, and A=0?, we can program:

A := A+k for a constant k
A := A-k
A := k
A := A mod k
A>k?
A=k?
A<k?

That's essentially all. It is possible to characterize precisely the functions computable in one counter, based on the operations above.

Two Counters

With more than one counter, life gets a bit more complicated. It's easier to describe the possibilities if we consider that the machine state, or program counter, can implement a fixed number of registers with a fixed bound on the value in each register. Many two-counter operations are destructive: that is, to calculate a certain value in one counter, we must destroy the value in another counter. In principle, there are a number of results that can end up in the wiped out counter, but I will always describe the value as 0. Other values can be thought of as stored in the state registers, and then loaded back into the counters on demand. So, we get the following defined operations:

B := B+kA; A := 0
B := B-kA; A := 0
B := B+A/k; A := 0 (A/k is rounded down)
B := B-A/k; A := 0
A := s where s is a state variable
s := k
s := A mod k

The last three, strictly speaking, are one-counter operations, but they become important when we have wiped out a counter value in a two-counter operation. These are all of the obvious two-counter operations. Are there more?

Three Counters

B := B+kA; C := C+mA; A := 0
B := B+kA; C := C+A/m; A := 0
B := B+A/k; C := C+mA; A := 0
B := B+A/k; C := C+A/m; A := 0
C := Bk^(mA); A := 0; B := 0
C := Bk^(A/m); A := 0; B := 0
C := C+ log_k B; A := 0; B := 0

With three counters we can scale the value in one counter A up or down by a constant, and simultaneously add/subtract it into two other counters (the first 4 operations above). In doing so, we destroy the value in A. Probably the most important case of this is copying: B := A; C := A; A := 0.

We can also exponentiate and take logarithms (the last 3 operations above), but at the cost of destroying the value that we are exponentiating or loging, and another value. Specifically, we can assign C := k^A, but to do so we must stream an initial value 1 between C and B, multiplying by k each time, and decrementing A to keep count.

These are all of the obvious possibilities with three counters.

Four Counters

It's weird that we can exponentiate with three counters, but the obvious way to multiply requires four counters.

D := BC; A,B,C := 0

Roughly, we copy the value from B into A, while adding it into D. We decrement C each time we add B into D.

Once we have multiplication, we can do anything that we want. The proof is not obvious, but we'll take it on faith. Four counters are enough to compute all of the computable functions, in particular, to simulate a Turing Machine.

Two Counters, Revisited

Consider the two-counter machine again. Let the value in A be in the form (2^w)(3^x)(5^y)(7^z). Because prime factorization is unique, the single number in A represents the four values w, x, y, z. A := 2A yields (2^(w+1))(3^x)(5^y)(7^z), so multiplying A by 2 is the same as incrementing w. Similarly, A := A/5 yields (2^w)(3^x)(5^(y-1))(7^z), so dividing A by 5 is the same as decrementing y (assuming that A was divisible by 5 in the first place). Testing w, x, y z for equality to 0 is the same as testing A for divisibility by 2, 3, 5, 7. Using these ideas, we can simulate a computation using as many counters as we like with only two counters.

Three Counters, Revisited

Let f be any computable function. Given an input value n in one of three counters, we may compute 2^n in another counter, and perform a two-counter simulation of four counters, or as many as we please, to compute 2^(f(n)). Then, we may take log_2 of the final result, and leave f(n) in whichever counter we select for output. So, three counters are enough to compute all computable functions. Think about the peculiarly indirect way that we would do multiplication with three counters.

Two Counters, Rerevisited: I/O subtleties

Back again to two counters, with any old computable function f. Given 2^n in one of the counters, we may simulate a multicounter program for f, leaving 2^(f(n)) as a final counter value. In particular, if f(n)=2^n, we may start with 2^n in a counter, and finish with 2^(2^n). But, there appears to be no way to start with n, and produce 2^n. I think that someone has proved this, but I have never found it in writing.

Are two counters enough for computing? That depends on precisely what we mean. If we are willing to exponentiate as part of the input encoding, and take a logarithm as part of the output encoding, then two counters are enough. If we insist on the more natural enoding of input n as the value n in a counter, then they are not enough. Bizarre. In any case, it is clear that two-counter computation has all of the complexity of full-powered computation, which will affect us if we try to analyze a two-counter machine.

A Restricted Three-Counter Machine

The simplest counter machine that can do all the computable functions with the natural form of I/O appears to be a three-counter machine, but one of the counters is used only for I/O. That is, the special I/O counter contains the input initially. It is decremented while exponentiating into one of the two working registers. Then, the main computation is carried out in the two working registers. Finally, the I/O register is incremented to contain the logarithm of the result in the working registers.

Combinator Calculus

The final computing system that we study is the Combinator Calculus. There is a brief definition and discussion of the calculus associated with the description of a logo for the Chicago Journal of Theoretical Computer Science. You will also learn about the Combinator Calulus using my SKIlift software, described in a posted message.

Notice that the Combinator Calculus manipulates binary trees, rather than strings. In one sense, the rules are quite primitive, but from another point of view, the S rule involves copying an arbitrarily large subtree. There is also a shared version of the Combinator Calculus, implemented on a heap (the data structure used by Lisp), in which the S rule sets two pointers to the same location, rather than copying.

The main point of studying the Combinator Calculus is to illuminate the structural properties of computing systems. Some of these structural properties are very obvious in the Combinators. Because all computing systems simulate one another, the obvious structural properties of the Combinator Calculus translate to more subtle structural properties of other systems.

Maintained by Michael J. O'Donnell, email:

odonnell@cs.uiowa.edu

22C:096 Computation, Information, and Description