Halting problem

In computability theory the halting problem is a decision problem which can be informally stated as follows:

Given a description of a program and its initial input, determine whether the program, when executed on this input, ever halts (completes). The alternative is that it runs forever without halting.

Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible inputs cannot exist. We say that the halting problem is undecidable over Turing machines. (See [1] regarding the attribution of the term "halting problem" to Turing.)

Formal statement

One possible way of formally stating the halting problem is as follows:

Given a Gödel numbering <math>\varphi</math> of the computable functions,

with <math>\langle i, x \rangle</math> the Cantor pairing function,

the set <math>K_{\varphi}^{0} := \{ \langle i, x \rangle | \varphi_i(x) \ \mathrm{exists} \}</math> is called the halting set.

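The problem of deciding membership in the halting set, that is, of determining for arbitrary i and x whether <math>\langle i, x \rangle</math> belongs to it, is called the halting problem. The set is recursively enumerable but not recursive, so the halting problem is not solvable by a computable function.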

Alternative equivalent formulations, for instance explicitly using Turing machines, are possible.

Importance and consequences

The historical importance of the halting problem lies in the fact that it was one of the first problems to be proved undecidable. (Turing's proof went to press in May 1936, whereas Church's proof of the undecidability of a problem in the lambda calculus had already been published in April 1936.) Subsequently, many other such problems have been described; the typical method of proving a problem to be undecidable is the technique of reduction. To do this, one shows that if a solution to the new problem were found, it could be used to decide an undecidable problem (by transforming instances of the undecidable problem into instances of the new problem). Since we already know that no method can decide the old problem, no method can decide the new problem either.

One consequence of the halting problem's undecidability is that there cannot be a general algorithm that decides whether a given statement about natural numbers is true or not. The reason for this is that the proposition stating that a certain algorithm will halt given a certain input can be converted into an equivalent statement about natural numbers. If we had an algorithm that could decide any statement about natural numbers, it could certainly decide this one; but that would determine whether the original program halts, which is impossible, since the halting problem is undecidable.

Another consequence of the undecidability of the halting problem is Rice's theorem, which states that the truth of any non-trivial statement about the function that is defined by an algorithm is undecidable. So, for example, the decision problem "will this algorithm halt for the input 0" is already undecidable. Note that this theorem holds for the function defined by the algorithm and not the algorithm itself. It is, for example, quite possible to decide if an algorithm will halt within 100 steps, but this is not a statement about the function that is defined by the algorithm.

Gregory Chaitin has given an undecidable problem in algorithmic information theory which does not depend on the halting problem. Chaitin also gave the intriguing definition of the halting probability which represents the probability that a randomly produced program halts.

While Turing's proof shows that there can be no general method or algorithm to determine whether algorithms halt, individual instances of that problem may very well be susceptible to attack. Given a specific algorithm, one can often show that it must halt for any input, and in fact computer scientists often do just that as part of a correctness proof. But every such proof requires new arguments: there is no mechanical, general way to determine whether algorithms on a Turing machine halt. However, there are some heuristics that can be used in an automated fashion to attempt to construct a proof, which succeed frequently on typical programs. This field of research is known as automated termination analysis.

There is another caveat. The undecidability of the halting problem relies on the fact that algorithms are assumed to have potentially infinite storage: at any one time they can only store finitely many things, but they can always store more and they never run out of memory. However, computers that actually exist are not equivalent to a Turing machine but instead to a linear bounded automaton, as their memory and external storage are limited. In this case, the halting problem for programs running on that machine can be solved with a very simple general algorithm (albeit one that is so inefficient that it could never be useful in practice). It involves running the program and trying to find a cycle over the states of the machine's memory.

The machine model that Turing introduced in the paper, now known as the Turing machine, has since proved a convenient model for much of theoretical computer science.

Sketch of proof

The proof proceeds by reductio ad absurdum. Start by choosing a programming language, a scheme that associates every program with at least one string description. Now suppose that someone claims to have found an algorithm halt(p, i) that returns true if p describes a program that halts when given as input the string i, and returns false otherwise. Construct another program trouble that uses halt as a subroutine:

 function trouble(string s)
     if halt(s, s) == false
         return true
     else
         loop forever

This program takes a string s as its argument and runs the algorithm halt, giving it s both as the description of the program to check and as the initial data to feed to that program. If halt returns false, then trouble returns true, otherwise trouble goes into an infinite loop. Since all programs can be represented by strings, there is a string t that represents the program trouble. Does trouble(t) halt?

Consider both cases:

  1. If trouble(t) halts, it must be because halt(t, t) returned false, but that would mean that trouble(t) should not have halted.
  2. If trouble(t) runs forever, it is either because halt itself runs forever, or because it returned true. This would mean either that halt does not work for every valid input, or that trouble(t) should have halted.

Either case concludes that halt did not give a correct answer, contrary to the original claim. Since the same reasoning applies to any program that someone might offer as a solution to the halting problem, there can be no solution.
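The construction above can be sketched in Python. This is only an illustration under stated assumptions: a Python function object stands in for the string description of a program, and make_trouble and refute are hypothetical helper names that do not appear in the proof. Given any candidate halt, refute reports how that candidate is wrong about trouble(trouble).

```python
def make_trouble(halt):
    """Build the 'trouble' program for a claimed halting decider `halt`."""
    def trouble(s):
        if not halt(s, s):
            return True      # halt claims s(s) runs forever, so halt at once
        while True:          # halt claims s(s) halts, so loop forever
            pass
    return trouble


def refute(halt):
    """Explain why `halt` answers wrongly about trouble(trouble)."""
    trouble = make_trouble(halt)
    verdict = halt(trouble, trouble)
    if not verdict:
        # Safe to run: trouble takes the 'return True' branch and halts.
        assert trouble(trouble) is True
        return "halt said 'runs forever', but trouble(trouble) halted"
    # The verdict was 'halts', so trouble(trouble) would enter the infinite
    # loop. It is not executed here; the contradiction follows by inspection.
    return "halt said 'halts', but trouble(trouble) loops forever"
```

Whatever total function is passed in as halt, one of the two branches applies, which mirrors the two cases of the proof.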

This classic proof is typically referred to as the diagonalization proof, so called because if one imagines a grid containing all the values of halt(p, i), with every possible p value given its own row and every possible i value given its own column, then the values of halt(s, s) are arranged along the main diagonal of this grid. The proof can be framed in the form of the question: what row of the grid corresponds to the string t? The trouble function is devised so that its behaviour on each input s is the opposite of the diagonal entry halt(s, s). The row for t would therefore have to differ from every row of the grid in at least one position, namely that row's diagonal position; in particular, it would have to differ from itself where i = t. This contradicts the requirement that the grid contains a row for every possible p value, and therefore constitutes a proof by contradiction that the halting problem is undecidable.

Common pitfalls

Many students, upon analyzing the above proof, ask whether there might be an algorithm that can return a third option for some programs, such as "undecidable" or "would lead to a contradiction." This reflects a misunderstanding of decidability. It is easy to construct one algorithm that always answers "halts" and another that always answers "doesn't halt." For any specific program and input, one of these two algorithms answers correctly, even though nobody may know which one. The difficulty of the halting problem lies not in particular programs, but in the requirement that a solution must work for all programs.

It is worth noting that the halting problem is decidable for deterministic machines with finite memory. A machine with finite memory has a finite number of states, and thus any deterministic program on it must eventually either halt or repeat a previous state. Repetition of a previous state indicates a loop, so a program that repeats a previous state is thus known to not halt.
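A minimal sketch of this cycle-detection argument in Python follows. It assumes the machine is modelled by a hypothetical function step that maps a hashable state to the next state, or to None when the machine halts; these names are illustrative only. The procedure is guaranteed to terminate only when the set of reachable states is finite.

```python
def halts_on_finite_machine(step, start):
    """Decide halting for a deterministic machine with finitely many states.

    step(state) returns the successor state, or None once the machine halts.
    """
    seen = set()
    state = start
    while state is not None:
        if state in seen:
            return False     # a state repeated, so the run cycles forever
        seen.add(state)
        state = step(state)
    return True              # step returned None: the machine halted
```

For example, a countdown such as lambda s: None if s == 0 else s - 1 is reported as halting, while a counter modulo 4 such as lambda s: (s + 1) % 4 is reported as looping. The set of visited states can grow as large as the number of machine states, which is why the method is impractical for real computers.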

Formalization of the halting problem

In his original proof Turing formalized the concept of algorithm by introducing Turing machines. However, the result is in no way specific to them; it applies equally to any other model of computation that is equivalent in its computational power to Turing machines, such as Markov algorithms, lambda calculus, Post systems or register machines.

What is important is that the formalization allows a straightforward mapping of algorithms to some data type that the algorithm can operate upon. For example, if the formalism lets algorithms define functions over strings (such as Turing machines) then there should be a mapping of these algorithms to strings, and if the formalism lets algorithms define functions over natural numbers (such as recursive functions) then there should be a mapping of algorithms to natural numbers. The mapping to strings is usually the most straightforward, but strings over an alphabet with n characters can also be mapped to numbers by interpreting them as numbers in an n-ary numeral system.
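One way to realize the mapping from strings to numbers is sketched below. Note the assumption: a plain n-ary reading is not one-to-one when the symbol acting as digit 0 may lead the string, so this sketch uses bijective base-n numeration (digits 1 to n) instead. The function names are illustrative.

```python
def string_to_number(s, alphabet):
    """Map a string over `alphabet` to a natural number (bijective base n)."""
    n = len(alphabet)
    digit = {ch: k + 1 for k, ch in enumerate(alphabet)}  # digits 1..n
    value = 0
    for ch in s:
        value = value * n + digit[ch]
    return value


def number_to_string(value, alphabet):
    """Inverse of string_to_number for the same alphabet."""
    n = len(alphabet)
    chars = []
    while value > 0:
        value, r = divmod(value - 1, n)
        chars.append(alphabet[r])
    return "".join(reversed(chars))
```

With the alphabet "ab", the strings "", "a", "b", "aa", "ab" map to 0, 1, 2, 3, 4, so every natural number corresponds to exactly one string.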

Relationship with Gödel's incompleteness theorem

The concepts raised by Gödel's incompleteness theorems are very similar to those raised by the halting problem, and the proofs are quite similar. In fact, a weaker form of the First Incompleteness Theorem is an easy consequence of the undecidability of the halting problem. This weaker form differs from the standard statement of the incompleteness theorem by asserting that a complete, consistent and sound axiomatization of all statements about natural numbers is unachievable. The "sound" part is the weakening: it means that we require the axiomatic system in question to prove only true statements about natural numbers (it's very important to observe that the statement of the standard form of Gödel's First Incompleteness Theorem is completely unconcerned with the question of truth, but only concerns the issue of whether it can be proven).

The weaker form of the theorem can be proved from the undecidability of the halting problem as follows. Assume that we have a consistent and complete axiomatization of all true first-order logic statements about natural numbers. Then we can build an algorithm that enumerates all these statements. This means that there is an algorithm N(n) that, given a natural number n, computes a true first-order logic statement about natural numbers such that, for all the true statements, there is at least one n such that N(n) yields that statement. Now suppose we want to decide if the algorithm with representation a halts on input i. We know that this statement can be expressed with a first-order logic statement, say H(a, i). Since the axiomatization is complete it follows that either there is an n such that N(n) = H(a, i) or there is an n' such that N(n') = ¬ H(a, i). So if we iterate over all n until we either find H(a, i) or its negation, we will always halt. This means that this gives us an algorithm to decide the halting problem. Since we know that there cannot be such an algorithm, it follows that the assumption that there is a consistent and complete axiomatization of all true first-order logic statements about natural numbers must be false.
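The search at the heart of this argument can be sketched as follows. Here N is the hypothetical enumerator from the text, and statement and negation stand for H(a, i) and ¬H(a, i); representing statements as comparable Python values is an assumption made for illustration. The argument shows that no such N can exist for arithmetic, so the sketch can only be exercised with toy enumerators.

```python
from itertools import count


def decide_by_enumeration(N, statement, negation):
    """Search N(0), N(1), ... until the statement or its negation appears.

    Returns True if `statement` is enumerated, False if `negation` is.
    Terminates only if one of the two is eventually produced, which is
    exactly what completeness of the axiomatization would guarantee.
    """
    for n in count():
        theorem = N(n)
        if theorem == statement:
            return True
        if theorem == negation:
            return False
```

If the enumeration were complete, the loop would end for every pair (a, i), giving a decision procedure for halting; that is the contradiction.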

Can humans solve the halting problem?

It might seem like humans could solve the halting problem. After all, a programmer can often look at a program and tell whether it will halt. It is useful to understand why this cannot be true. For simplicity, we will consider the halting problem for programs with no input, which is also undecidable.

To "solve" the halting problem means to be able to look at any program and tell whether it halts. It is not enough to be able to look at some programs and decide. Humans may also not be able to solve the halting problem, due to the sheer size of the input (a program with millions of lines of code). Even for short programs, it isn't clear that humans can always tell whether they halt. For example, we might ask if this pseudocode function, which corresponds to a particular Turing machine, ever halts:

 function searchForOddPerfectNumber()
     var int n:=1     // arbitrary-precision integer
     loop {
         var int sumOfFactors := 0
         for factor from 1 to n-1
             if factor is a factor of n
                 sumOfFactors := sumOfFactors + factor
         if sumOfFactors = n then
             exit loop
         n := n + 2
     }
     return

This program searches until it finds an odd perfect number, then halts. It halts if and only if such a number exists, which is a major open question in mathematics. So, after centuries of work, mathematicians have yet to discover whether a simple, ten-line program halts. This makes it difficult to see how humans could solve the halting problem.
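For experimentation, the search can be written in Python with an explicit bound so that it always returns. The limit, start and step parameters are additions for this sketch and are not part of the pseudocode above; with the defaults it examines the odd numbers 1, 3, 5, ... up to limit.

```python
def search_for_perfect_number(limit, start=1, step=2):
    """Return the first perfect number among start, start+step, ... <= limit.

    Returns None if the bounded search finds nothing.
    """
    n = start
    while n <= limit:
        # Sum the proper divisors of n (all factors from 1 to n-1).
        sum_of_factors = sum(f for f in range(1, n) if n % f == 0)
        if sum_of_factors == n:
            return n
        n += step
    return None
```

Searching the even numbers instead, with start=2, finds the perfect number 6, which confirms that the test itself works. A bounded search over odd numbers that returns None says nothing about larger numbers, and that is precisely why the unbounded version is hard to reason about.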

More generally, it's usually easy to see how to write a simple brute-force search program that looks for counterexamples to any particular conjecture in number theory; if the program finds a counterexample, it stops and prints out the counterexample, and otherwise it keeps searching forever. For example, consider the famous (and still unsolved) twin prime conjecture. This asks whether there are arbitrarily large prime numbers p and q with p+2 = q. Now consider the following program, which accepts an input N:

 function findTwinPrimeAbove(int N)
     int p := N
     loop
         if p is prime and p + 2 is prime
             return
         else
             p := p + 1

This program searches for twin primes p and p+2 both at least as large as N. If there are arbitrarily large twin primes, it will halt for all possible inputs. But if there is a pair of twin primes P and P+2 larger than all other twin primes, then the program will never halt if it is given an input N larger than P. Thus if we could answer the question of whether this program halts on all inputs, we would have the long-sought answer to the twin prime conjecture. It's similarly straightforward to write programs whose halting depends on the truth or falsehood of many other conjectures of number theory.
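A runnable Python version of the twin prime search is given below as a sketch. The trial-division helper is_prime is an assumption of this example, chosen for brevity rather than speed, and the function returns the pair it finds instead of returning nothing.

```python
def is_prime(k):
    """Trial division; adequate for small illustrative inputs."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def find_twin_prime_above(n):
    """Return the first twin prime pair (p, p + 2) with p >= n.

    Whether this loop terminates for every n is the twin prime conjecture.
    """
    p = n
    while True:
        if is_prime(p) and is_prime(p + 2):
            return p, p + 2
        p += 1
```

For small inputs the search ends quickly, for instance find_twin_prime_above(20) returns (29, 31). Nothing in the code reveals whether it ends for every input.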

Because of this, one might say that the undecidability of the halting problem is itself unsurprising. If there were a mechanical way to decide whether arbitrary programs would halt, then many apparently difficult mathematical problems would succumb to it. A counterargument to this, however, is that even if the halting problem were decidable over Turing machines, as it is over physical computers and other LBAs, it might still be infeasible in practice because it takes too much time or memory to execute. For example, there are some very large upper bounds on numbers with certain properties in number theory, but it's not feasible to check all values below such a bound in a naïve way with a computer, which cannot even hold some of these numbers in memory.

Recognizing partial solutions

There are many programs that either return a correct answer to the halting problem or do not return an answer at all. If it were possible to decide whether a program gives only correct answers, one might hope to collect a large number of such programs and run them in parallel, in the hope of being able to determine whether many programs halt. Unfortunately, recognizing such partial halting solvers (PHS) is just as hard as the halting problem itself.

Suppose someone claims that program PHSR is a partial halting solver recognizer. Construct a program H:

 input a program P
 X := "input Q. if Q = P output "halts" else loop forever"
 run PHSR with X as input

If PHSR recognizes the constructed program X as a partial halting solver, that means that P, the only input for which X produces a result, halts. If PHSR fails to recognize X, then it must be because P does not halt. Therefore H can decide whether an arbitrary program P halts; it solves the halting problem. Since this is impossible, the program PHSR could not have been a partial halting solver recognizer as claimed. Therefore no program can be a partial halting solver recognizer.
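The construction of H can be sketched in Python as follows. Here phsr is a hypothetical stand-in for the claimed recognizer (the argument shows no real one can exist), programs are represented by arbitrary comparable values, and build_x and h are illustrative names.

```python
def build_x(p):
    """Build X: answer "halts" for the single input P, diverge otherwise."""
    def x(q):
        if q == p:
            return "halts"
        while True:          # X produces no answer for any other input
            pass
    return x


def h(p, phsr):
    """H from the text: hand X to the claimed recognizer, return its verdict."""
    return phsr(build_x(p))
```

X produces an answer for exactly one input, so X is a partial halting solver precisely when that one answer is correct, that is, when P halts. A correct phsr would therefore make h a decision procedure for halting.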

Another example, HT, of a Turing machine which gives correct answers only for some instances of the halting problem can be described by the requirements that, if HT is started scanning a field which carries the first of a finite string of a consecutive "1"s, followed by one field with symbol "0" (i.e. a blank field), and followed in turn by a finite string of i consecutive "1"s, on an otherwise blank tape, then

  • HT halts for any such starting state, i.e. for any input of finite positive integers a and i;
  • HT halts on a completely blank tape if and only if the Turing machine represented by a does not halt when given the starting state and input represented by i; and
  • HT halts on a nonblank tape, scanning an appropriate field (which however does not necessarily carry the symbol "1") if and only if the Turing machine represented by a does halt when given the starting state and input represented by i. In this case, the final state in which HT halted (contents of the tape, and field being scanned) shall be equal to some particular intermediate state which the Turing machine represented by a attains when given the starting state and input represented by i; or, if all those intermediate states (including the starting state represented by i) leave the tape blank, then the final state in which HT halted shall be scanning a "1" on an otherwise blank tape.

While its existence has not been refuted (essentially because there is no Turing machine which would halt only if started on a blank tape), such a Turing machine HT would also solve the halting problem only partially: if the Turing machine represented by a does halt when given the starting state and input represented by i, HT does not necessarily scan the symbol "1" in its final state, as explicit statements of the halting problem for Turing machines may require.

History of the halting problem

In the following: U refers to the source "Undecidable"

1900 -- Hilbert poses his "23 questions" (cf. Hilbert problems) at the Second International Congress of Mathematicians in Paris. "Of these, the second was that of proving the consistency of the 'Peano axioms' on which, as he had shown, the rigour of mathematics depended" (Hodges p. 83; commentary in U p. 108; also Penrose p. 34). See also his address The Future of Mathematics, reprinted in Reid p. 74ff, and his famous pronouncement: "This conviction of the solvability of every mathematical problem is a powerful incentive to the worker. We hear within us the perpetual call: There is the problem. Seek its solution. You can find it by pure reason, for in mathematics, there is no ignorabimus" (ibid p. 81).

1928 -- Hilbert recasts his 'Second Problem' [verification required! cf Penrose p.34 states this is a recast of his 10th problem but Reid does not agree] at the Bologna International Congress (cf Reid pp.188-189). "Hilbert now added to the problem of consistency another problem, that of the completeness of the formal system" (p. 189 Reid). Hodges claims he posed three questions: i.e. #1: Was mathematics complete? #2: Was mathematics consistent? #3: Was mathematics decidable? (Hodges p. 91). The third question is known as the Entscheidungsproblem (Decision Problem) (Hodges p.91, Penrose p.34)

1930 -- Hilbert retires, delivers his "Farewell to Teaching" (Reid p. 190) and reaffirms his "Positivist belief" (Hodges p. 92) that "...there is no such thing as an unsolvable problem." (Hilbert quoted in Hodges p. 92). "...he denied again, at the end of his career, the 'foolish ignorabimus' of du Bois-Reymond and his followers." At almost the same time [still needs verification] Gödel announces his proof as an answer to the first two of Hilbert's 1928 questions [cf Reid p. 198]. Gödel's paper is received on 17 November (U p. 5). "At first he [Hilbert] was only angry and frustrated, but then he began to try to deal constructively with the problem... Gödel himself felt -- and expressed the thought in his paper -- that his work did not contradict Hilbert's formalistic point of view" (Reid p. 199).

1931 -- The paper of Kurt Gödel appears: "On Formally Undecidable Propositions of Principia Mathematica and Related Systems I", (reprinted in U p. 5 ff)

19 April 1935 -- Paper of Alonzo Church "An Unsolvable Problem of Elementary Number Theory" is presented to the American Mathematical Society, (reprinted in U p. 89ff). Church identifies effective calculability with "the notion of recursive function of positive integers" (U p. 100). Such a function will have an algorithm, and "...the fact that the algorithm has terminated [italics added] becomes effectively known and the value of F(n) is effectively calculable" (ibid).

1936 -- Alonzo Church publishes the first proof that the Entscheidungsproblem is unsolvable [A Note on the Entscheidungsproblem, reprinted in U p. 110].

7 October 1936 -- Paper of Emil Post is received by Church's Journal of Symbolic Logic (Hodges p. 125). His paper appeared as "Finite Combinatory Processes. Formulation I" (reprinted in U p. 298ff). Post's brief paper introduces the word "terminate". Church had to certify that Post was unaware of Turing's work and vice versa (cf commentary in U p. 288, also Hodges p. 125). See Footnote|Post.

January 1937 -- Turing's On Computable Numbers With an Application to the Entscheidungsproblem is published (reprinted in U, p. 115). With three theorems he answers the "decision problem": "I shall show that there is no general method which tells whether a given formula U is provable in K [Principia Mathematica], or what comes to the same, whether the system consisting of K with -U adjoined as an extra axiom is consistent" (p. 145, ibid). See Footnote|Turing.

1939 -- J.B. Rosser observes the essential equivalence of "effective method" defined by Gödel, Church, and Turing (Rosser in U p. 273, "Informal Exposition of Proofs of Gödel's Theorem and Church's Theorem").

1943 -- Stephen Kleene discusses "algorithmic theories" ("Recursive Predicates and Quantifiers", reprinted in U pp. 255ff). He states that "In setting up a complete algorithmic theory, what we do is describe a procedure ... which procedure necessarily terminates and in such manner that from the outcome we can read a definite answer, 'Yes' or 'No,' to the question, 'Is the predicate value true?'"

Footnotes

Footnote|Davis: Turing did not use the word "halting" or "termination". Turing's biographer Hodges does not have the word "halting" or words "halting problem" in his index. The earliest known use of the words "halting problem" is in a proof by Davis (p. 70-71, Davis 1958). He uses Gödelization to prove the theorem:

"Theorem 2.2 There exists a Turing machine whose halting problem is recursively unsolvable.
"A related problem is the printing problem for a simple Turing machine Z with respect to a symbol Si" (p. 70).

Davis then goes on to prove his Theorem 2.3 that "...the printing problem for Z with respect to Sk is recursively unsolvable" (p. 71). This proof uses a form similar to the antinomies that appear in Minsky, Beltrami and this page, above. Davis adds no attribution for these proofs, so we can infer they are original with him.

"Halting Problem" does not appear in either of Alonzo Church's texts dated 1944 and 1956, nor in E. F. Moore's A Simplified Universal Turing Machine, Proc. ACM, Sept 1952, 1953. Moore's paper references "mimeographed notes" of a lecture by Davis at the University of Illinois in 1951, so this source would need to be investigated. Hao Wang's A Variant to Turing's Theory of Computing Machines, Journal of the ACM 4(1):63-92 January 1957 does mention "halt" as an instruction (p. 65), but not the "halting problem." Wang in turn references Post (ibid); see Footnote|Post below. By 1965 the "halting problem" has appeared in Fischer, On formalisms for Turing Machines, Journal of the ACM 12,4 (Oct 1965), Aanderaa & Fischer, The Solvability of the Halting Problem for two state Post Machines, Journal of the ACM 14(4):677-682 (Oct 1967), and in Minsky's text (1967).

Footnote|Post: In his paper Post describes a "formulation" (i.e. process, not a machine) consisting of "a worker" who follows a "set of instructions" (instructions that are, as it turned out, virtually identical to those of Turing's machines). But Post adds another instruction "(C) Stop". Thus "...This process will terminate when and only when it comes to the direction of type (C)." He called such a process "type 1 ... if the process it determines terminates for each specific problem." He went on to remove the "Stop" instruction when evaluating "symbolic logics"; in this case "a deterministic process will be set up which is unending" [his italics]. Post did not address directly the "Entscheidungsproblem" in his "formulation"; see Post-Turing Machine for more.

References

  • Alan Turing, On computable numbers, with an application to the Entscheidungsproblem, Proceedings of the London Mathematical Society, Series 2, 42 (1936), pp 230-265. This is the epochal paper where Turing defines Turing machines, formulates the halting problem, and shows that it (as well as the Entscheidungsproblem) is unsolvable.
  • Wiki:HaltingProblem
  • Martin Davis, The Undecidable, Basic Papers on Undecidable Propositions, Unsolvable Problems And Computable Functions, Raven Press, New York, 1965. Turing's paper is #3 in this volume. Papers include those by Gödel, Church, Rosser, Kleene, and Post.
  • Martin Davis, Computability and Unsolvability, McGraw-Hill, New York, 1958.
  • Alfred North Whitehead and Bertrand Russell, Principia Mathematica to *56, Cambridge at the University Press, 1962. Re: the problem of paradoxes, the authors discuss the problem of a set not being an object in any of its "determining functions", in particular Introduction, Chap. 1 p. 24 "...difficulties which arise in formal logic", and Chap. 2.I. "The Vicious-Circle Principle" p. 37ff, and Chap. 2.VIII. "The Contradictions" p. 60ff.
  • Martin Davis, "What is a computation", in Mathematics Today, Lynn Arthur Steen, Vintage Books (Random House), 1980. A wonderful little paper, perhaps the best ever written about Turing Machines for the non-specialist. Davis reduces the Turing Machine to a far-simpler model based on Post's model of a computation. Discusses Chaitin proof. Includes little biographies of Emil Post, Julia Robinson.
  • Marvin Minsky, Computation, Finite and Infinite Machines, Prentice-Hall, Inc., N.J., 1967. See chapter 8, Section 8.2 "The Unsolvability of the Halting Problem." Excellent, i.e. readable, sometimes fun. A classic.
  • Roger Penrose, The Emperor's New Mind: Concerning computers, Minds and the Laws of Physics, Oxford University Press, Oxford England, 1990 (with corrections). Cf: Chapter 2, "Algorithms and Turing Machines". An overly-complicated presentation (see Davis's paper for a better model), but a thorough presentation of Turing machines and the halting problem, and Church's Lambda Calculus.
  • John Hopcroft and Jeffrey Ullman, Introduction to Automata Theory, Languages and Computation, Addison-Wesley, Reading Mass, 1979. See Chapter 7 "Turing Machines." A book centered around the machine-interpretation of "languages", NP-Completeness, etc.
  • Andrew Hodges, Alan Turing: The Enigma, Simon and Schuster, New York. Cf Chapter "The Spirit of Truth" for a history leading to, and a discussion of, his proof. A wonderful biography.
  • Constance Reid, Hilbert, Copernicus: Springer-Verlag, New York, 1996 (first published 1970). Fascinating history of German mathematics and physics from 1880's through 1930's. Hundreds of names familiar to mathematicians, physicists and engineers appear in its pages. Perhaps marred by no overt references and few footnotes: Reid states her sources were numerous interviews with those who personally knew Hilbert, and Hilbert's letters and papers.
  • Edward Beltrami, What is Random? Chance and order in mathematics and life, Copernicus: Springer-Verlag, New York, 1999. Nice, gentle read for the mathematically-inclined non-specialist, puts tougher stuff at the end. Has a Turing-machine model in it. Discusses the Chaitin contributions.
  • Ernest Nagel and James R. Newman, Gödel's Proof, New York University Press, 1958. Wonderful writing about a very difficult subject. For the mathematically-inclined non-specialist. Discusses Gentzen's proof on pages 96-97 and footnotes. Appendices discuss the Peano Axioms briefly, gently introduce readers to formal logic.
  • Taylor Booth, Sequential Machines and Automata Theory, Wiley, New York, 1967. Cf Chapter 9, Turing Machines. Difficult book, meant for electrical engineers and technical specialists. Discusses recursion, partial-recursion with reference to Turing Machines, halting problem. Has a Turing Machine model in it. References at end of Chapter 9 catch most of the older books (i.e. 1952 until 1967 including authors Martin Davis, F. C. Hennie, H. Hermes, S. C. Kleene, M. Minsky, T. Rado) and various technical papers. See note under Busy-Beaver Programs.
  • Busy Beaver Programs are described in Scientific American, August 1984, also March 1985 p. 23. A reference in Booth attributes them to Rado, T.(1962), On non-computable functions, Bell Systems Tech. J. 41. Booth also defines Rado's Busy Beaver Problem in problems 3, 4, 5, 6 of Chapter 9, p. 396.
  • David Bolter, Turing's Man: Western Culture in the Computer Age, The University of North Carolina Press, Chapel Hill, 1984. For the general reader. May be dated. Has yet another (very simple) Turing Machine model in it.