# The Dershowitz/Falkovich proof of the Extended Church-Turing Thesis

In a previous post, I considered a proof of the Church-Turing Thesis that Dershowitz and Gurevich published in the Bulletin of Symbolic Logic in 2008.  It is safe to say that the proof is controversial — not because it is technically inaccurate, but because it relies on an axiomatization of computation that excludes randomness, parallelism and quantum computing.  (For more discussion of that paper, please see the previous blog entry, or the question about disproving the Church-Turing Thesis on CSTheory.)  Nevertheless, one of the authors, Dershowitz, and his PhD student Falkovich, have just published a proof of an even more controversial statement, the Extended Church-Turing Thesis.  It is this new proof which I will now discuss.

The Extended Church-Turing Thesis is the statement that every physically realizable computational device can be simulated efficiently by Turing machines.  (By contrast, the “mere” Church-Turing Thesis says that Turing machines can simulate any computational device, but perhaps at the cost of an exponential blowup in time and/or space.)  Unlike the Church-Turing Thesis, which conventional wisdom in computer science holds to be true, many CS people believe the Extended Church-Turing Thesis to be false, because quantum computers are expected to be strictly stronger than classical computers.  (The most famous example of this difference in strength is Peter Shor’s quantum algorithm that factors large integers in polynomial time.  Most CS people believe that classical computers cannot factor integers in polynomial time, and many cryptographic systems depend on this infeasibility.)  Dershowitz and Falkovich sidestep this issue by attempting to axiomatize only “classical” algorithms — no quantum, no randomization, no massive parallelism.  They define an algorithm’s implementation, a program, and an efficient program, and then prove that any efficient program (in their general axiomatic sense) can be simulated in linear time by a Random Access Machine (RAM).  Since a RAM can be simulated efficiently by a Turing machine, the Extended Church-Turing Thesis follows as a consequence.

## A definition of classical algorithms

My last post on this general topic was informal and philosophical, so I will treat the Dershowitz/Falkovich paper more technically.  To start off, the authors use a formal definition of “classical algorithm” from a paper by Yuri Gurevich that appeared in the very first issue of the ACM Transactions on Computational Logic.

Definition [Gurevich]:

A classical algorithm is a (deterministic) state-transition system, satisfying the following three postulates:

1. It is comprised of a set $S$ of states, a subset $S_0 \subseteq S$ of initial states, and a partial transition function $\tau: S \rightarrow S$ from states to states.  States for which there is no transition are terminal.
2. All states in $S$ are (first-order) structures over the same finite vocabulary $F$, and $X$ and $\tau(X)$ share the same domain for every $X \in S$.  The sets of states, initial states, and terminal states are each closed under isomorphism.  Moreover, transitions respect isomorphisms.  Specifically, if $X$ and $Y$ are isomorphic, then either both are terminal or else $\tau(X)$ and $\tau(Y)$ are also isomorphic by the same isomorphism.
3. There exists a fixed finite set $T$ of critical terms over $F$ that fully determines the behavior of the algorithm.  Viewing any state $X$ over $F$ with domain $D$ as a set of location-value pairs $f(a_1,\ldots,a_n) \mapsto a_0$, where $f \in F$ and $a_0,a_1,\ldots,a_n \in D$, this means that whenever states $X$ and $Y$ agree on $T$, in the sense that every critical term $t \in T$ has the same value in both $X$ and $Y$, then either $X$ and $Y$ are both terminal states, or the state changes are the same: $\tau(X) \setminus X = \tau(Y) \setminus Y$.
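
To make the third postulate concrete, here is a minimal Python sketch — my own illustration, not taken from the paper.  States are dictionaries of location-value pairs, and the transition inspects only two hypothetical critical terms, `counter` and `limit`; everything else in the state is invisible to it.

```python
# Illustrative sketch (not from the paper): states as location -> value
# maps, with a transition that reads only a fixed set of critical terms
# and returns an "update set", as in Gurevich's third postulate.

CRITICAL_TERMS = ["counter", "limit"]  # hypothetical critical terms

def transition(state):
    """Return the update set tau(X) \\ X, or None if the state is terminal.

    Behavior depends only on the values of the critical terms.
    """
    if state["counter"] >= state["limit"]:
        return None  # terminal: no transition is defined
    return {"counter": state["counter"] + 1}

# Two states that agree on every critical term...
x = {"counter": 3, "limit": 10, "scratch": "abc"}
y = {"counter": 3, "limit": 10, "scratch": "xyz"}

# ...must produce the same state change, whatever else they disagree on.
assert transition(x) == transition(y) == {"counter": 4}
```

The point of the postulate is exactly the final assertion: agreement on the critical terms forces agreement on the update set.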

The above definition of “algorithm” can be thought of as capturing a piece of pseudocode that is not connected to any specific state space.  Importantly, although the algorithm is specified by finitely many critical terms over a finite vocabulary, the input to the algorithm (the initial state) may be intuitively uncomputable, or the algorithm may ask questions of extremely powerful oracles in a single time step, because the domain $D$ over which the algorithm operates is not restricted.  So the algorithm’s transitions rely on a finite number of critical terms, but those terms may evaluate to answers to finitely-askable queries like, “Does program $P$ halt?”

An implementation of the algorithm, then, is the algorithm restricted to a specific domain $D$.  Further, the authors define a basic implementation to be one in which all initial states are constructible in a way that is unfortunately unclear to me, because of what may be a typo — the appearance, in two paragraphs of the definition, of a nowhere-defined $J$.  It appears, though, from a paragraph of informal explanation that follows, that the intent is to restrict initial states to those that can be built up from finitely many (and algebraically simple) constructors acting finitely often on a finite alphabet.  Then, if the constructors can construct $D$, and all oracles that can be called are themselves subroutines constructible from the constructors, the implementation is effective.  We can now state the paper’s main theorem.

Theorem: Any effective implementation with complexity $T(n)$ with respect to a valid size measure can be simulated by a RAM in order $n + n\,T(n) + T(n)^2$ steps, with a word size that grows up to order $\log T(n)$.
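
Before unpacking the theorem’s hypotheses, the idea of a domain generated by finitely many constructors can be pictured with a small Python sketch.  This is my own reading of the (unclear, to me) definition, not the paper’s formalism: Peano-style naturals generated by `zero` and `succ`, with an operation defined purely in terms of those constructors.

```python
# Illustrative sketch (my reading of "effective implementation", not the
# paper's): a domain generated by finitely many constructors acting on a
# finite base — here, Peano-style naturals built from "zero" and "succ".

ZERO = ("zero",)

def succ(t):
    return ("succ", t)

def build(n):
    """Construct the domain element representing the natural number n."""
    t = ZERO
    for _ in range(n):
        t = succ(t)
    return t

def add(a, b):
    """An operation defined purely in terms of the constructors."""
    while a != ZERO:
        a, b = a[1], succ(b)  # peel one "succ" off a, wrap one onto b
    return b

assert add(build(2), build(3)) == build(5)
```

Any “oracle” the implementation calls would itself have to be a subroutine built from these same constructors, which is what makes the implementation effective in the authors’ sense.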

We have not yet defined the complexity of an implementation, nor the notion of a valid size measure.  Briefly, the authors associate a “size” with each element of a domain, and note that a domain may have infinitely many ways to measure size.  For example, we are accustomed to measuring the size of a natural number by the number of bits required to represent it, so the size of $n$ is roughly $\log_2 n$.  But there are infinitely many other ways to represent the natural numbers, some of which may require an algorithm to “work exponentially harder” than others to solve the same problem.  So a valid size measure for elements of a domain is a function $f$ such that, if $A$ is an effective implementation over $D$, then there exists an effective implementation $B$, over a domain in bijection with the range of $f$, such that $A$ and $B$ have identical computational runs up to isomorphism.  Informally, then, the theorem above states: “For every computationally equivalent representation of an effective domain, a RAM can simulate a classical algorithm’s behavior over that representation, with only polynomially many more steps than the algorithm takes.”
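
To illustrate why the choice of size measure matters, here is a small Python example of my own (not from the paper): trial-division primality testing takes roughly $\sqrt{n}$ division steps, which is polynomial in the unary size of $n$ but exponential in its binary size $\log_2 n$.

```python
import math

# Illustration (mine, not the paper's): the cost of one algorithm
# measured against two different size measures for natural numbers.

def is_prime_trial_division(n):
    """Trial division: roughly sqrt(n) division steps."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

n = 1_000_003
steps = math.isqrt(n)          # about 1000 divisions in the worst case
unary_size = n                 # size of n in unary representation
binary_size = n.bit_length()   # size of n in binary: 20 bits

# sqrt(n) is polynomial (indeed sublinear) in the unary size, but grows
# like 2^(binary_size / 2), i.e. exponentially in the binary size.
assert steps <= unary_size
assert steps >= 2 ** (binary_size // 2 - 1)
```

Restricting attention to “valid” size measures rules out such representation tricks, so the theorem’s bound is not an artifact of choosing a bloated encoding.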

The main technical point of the proof that a RAM can efficiently simulate a classical algorithm is the use of a data structure trick that reminded me of memoization: the authors define a tangle to be a graph that captures all terms in memory without repetition — repetition could cause an exponential blowup of states for the RAM.  (This is not the same as John Conway’s notion of tangle in knot theory, to be clear.)
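
The sharing idea can be pictured with a classic hash-consing sketch — my analogy for the tangle, not the paper’s exact construction: every term is interned once, so a term whose subterms repeat is stored as a DAG rather than an exponentially larger tree.

```python
# Illustrative sketch of the sharing behind "tangles" (my analogy, not
# the paper's construction): intern each term once, so repeated subterms
# are stored a single time (a DAG instead of a tree).

_interned = {}

def make(op, *args):
    """Hash-consing constructor: structurally equal terms share one object."""
    key = (op, args)
    if key not in _interned:
        _interned[key] = key
    return _interned[key]

# Build pair(t, t) nested 20 levels deep.  As a tree this term has
# 2^21 - 1 nodes; with sharing, each level reuses the same subterm twice.
t = make("x")
for _ in range(20):
    t = make("pair", t, t)

# Only 21 distinct nodes are ever stored.
assert len(_interned) == 21
```

Without some such sharing, a RAM naively copying terms could blow up exponentially, which is exactly what the linear-time simulation has to avoid.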

## Conclusion

This paper is brand new, and has just appeared in a workshop.  I find the concept of axiomatizing our intuitions of computation — much as Zermelo, Fraenkel, and others axiomatized the intuition of “set” a century ago — to be quite intriguing.  With formal axioms, we can then see whether consequences we expect to follow actually do.  However, the historical situations are quite different.  In the case of sets, fundamental paradoxes were staring practitioners in the face, and there was a clear need to be able to perform mathematics in a non-contradictory framework.  There is no similar “intellectual crisis” in modern computer science.  In fact, quantum computing and complexity theory expert Scott Aaronson recently said (kindly responding to a question I asked on CSTheory), “I don’t see how talking about ‘proofs’ of the CT or ECT adds any light to this discussion. Such ‘proofs’ tend to be exactly as good as the assumptions they rest on—in other words, as what they take words like ‘computation’ or ‘efficient computation’ to mean. So then why not proceed right away to a discussion of the assumptions, and dispense with the word ‘proof’?”

I find the Dershowitz/Falkovich paper intriguing but, I’m afraid, ultimately unsatisfying in its current incarnation, and I look forward to a journal version.  The philosophical justification for the axiomatization of “efficiently computable” is brief, and I was unable to fully parse the technical definitions, despite having backgrounds in both computer science and mathematical logic.  I see this paper as a very interesting contribution to a conversation that deserves to be continued — and I plan to continue it elsewhere.  Stay tuned. :-)

Nachum Dershowitz & Evgenia Falkovich (2011). A Formalization and Proof of the Extended Church-Turing Thesis. *International Workshop on the Development of Computational Models*.