Automated theorem proving
Automated theorem proving (also known as ATP or automated deduction) is a subfield of
Logical foundations
While the roots of formalised
In 1929,
However, shortly after this positive result, Kurt Gödel published On Formally Undecidable Propositions of Principia Mathematica and Related Systems (1931), showing that in any sufficiently strong axiomatic system there are true statements that cannot be proved in the system. This topic was further developed in the 1930s by Alonzo Church and Alan Turing, who on the one hand gave two independent but equivalent definitions of computability, and on the other gave concrete examples of undecidable questions.
First implementations
Shortly after
The "heuristic" approach of the Logic Theorist tried to emulate human mathematicians, and could not guarantee that a proof could be found for every valid theorem even in principle. In contrast, other, more systematic algorithms achieved, at least theoretically,
Decidability of the problem
Depending on the underlying logic, the problem of deciding the validity of a formula varies from trivial to impossible. For the common case of
The above applies to first-order theories, such as
Related problems
A simpler, but related, problem is
Since the proofs generated by automated theorem provers are typically very large, the problem of proof compression is crucial, and various techniques aiming at making the prover's output smaller, and consequently more easily understandable and checkable, have been developed.
However, these successes are sporadic, and work on hard problems usually requires a proficient user.Another distinction is sometimes drawn between theorem proving and other techniques, where a process is considered to be theorem proving if it consists of a traditional proof, starting with axioms and producing new inference steps using rules of inference. Other techniques would include model checking, which, in the simplest case, involves brute-force enumeration of many possible states (although the actual implementation of model checkers requires much cleverness, and does not simply reduce to brute force).
There are hybrid theorem proving systems that use model checking as an inference rule. There are also programs that were written to prove a particular theorem, with a (usually informal) proof that if the program finishes with a certain result, then the theorem is true. A good example of this was the machine-aided proof of the
Applications
Commercial use of automated theorem proving is mostly concentrated in
Other uses of theorem provers include program synthesis, constructing programs that satisfy a formal specification.[13] Automated theorem provers have been integrated with proof assistants, including Isabelle/HOL.[14]
First-order theorem proving
In the late 1960s agencies funding research in automated deduction began to emphasize the need for practical applications.[
First-order theorem proving is one of the most mature subfields of automated theorem proving. The logic is expressive enough to allow the specification of arbitrary problems, often in a reasonably natural and intuitive way. On the other hand, it is still semi-decidable, and a number of sound and complete calculi have been developed, enabling fully automated systems.[18] More expressive logics, such as higher-order logics, allow the convenient expression of a wider range of problems than first-order logic, but theorem proving for these logics is less well developed.[19][20]
Relationship with SMT
There is substantial overlap between first-order automated theorem provers and SMT solvers. Generally, automated theorem provers focus on supporting full first-order logic with quantifiers, whereas SMT solvers focus more on supporting various theories (interpreted predicate symbols). ATPs excel at problems with lots of quantifiers, whereas SMT solvers do well on large problems without quantifiers.[21] The line is blurry enough that some ATPs participate in SMT-COMP, while some SMT solvers participate in CASC.[22]
Benchmarks, competitions, and sources
The quality of implemented systems has benefited from the existence of a large library of standard benchmark examples—the Thousands of Problems for Theorem Provers (TPTP) Problem Library[23]—as well as from the CADE ATP System Competition (CASC), a yearly competition of first-order systems for many important classes of first-order problems.
Some important systems (all have won at least one CASC competition division) are listed below.
- E is a high-performance prover for full first-order logic, but built on a purely equational calculus, originally developed in the automated reasoning group of Technical University of Munich under the direction of Wolfgang Bibel, and now at Baden-Württemberg Cooperative State University in Stuttgart.
- Mace4.
- SETHEO is a high-performance system based on the goal-directed model elimination calculus, originally developed by a team under direction of Wolfgang Bibel. E and SETHEO have been combined (with other systems) in the composite theorem prover E-SETHEO.
- Vampire was originally developed and implemented at Manchester University by Andrei Voronkov and Kryštof Hoder. It is now developed by a growing international team. It has won the FOF division (among other divisions) at the CADE ATP System Competition regularly since 2001.[24]
- Waldmeister is a specialized system for unit-equational first-order logic developed by Arnim Buch and Thomas Hillenbrand. It won the CASC UEQ division for fourteen consecutive years (1997–2010).
- Max Planck Institute for Computer Science.
The Theorem Prover Museum[25] is an initiative to conserve the sources of theorem prover systems for future analysis, since they are important cultural/scientific artefacts. It has the sources of many of the systems mentioned above.
Popular techniques
This article is in prose. is available. (December 2023) |
- unification
- Model elimination
- Method of analytic tableaux
- Superposition and term rewriting
- Model checking
- Mathematical induction[26]
- Binary decision diagrams
- DPLL
- Higher-order unification
- Quantifier elimination[27]
Software systems
Name | License type | Web service | Library | Standalone | Last update ( YYYY-mm-dd format )
|
---|---|---|---|---|---|
ACL2 | 3-clause BSD |
No | No | Yes | May 2019 |
Prover9/Otter | Public Domain | Via System on TPTP | Yes | No | 2009 |
Jape | GPLv2 |
Yes | Yes | No | May 15, 2015 |
PVS | GPLv2 |
No | Yes | No | January 14, 2013 |
EQP | ? | No | Yes | No | May 2009 |
PhoX | ? | No | Yes | No | September 28, 2017 |
E |
GPL |
Via System on TPTP | No | Yes | July 4, 2017 |
SNARK |
Mozilla Public License 1.1 | No | Yes | No | 2012 |
Vampire |
Vampire License | Via System on TPTP | Yes | Yes | December 14, 2017 |
Theorem Proving System (TPS) | TPS Distribution Agreement | No | Yes | No | February 4, 2012 |
SPASS | FreeBSD license |
Yes | Yes | Yes | November 2005 |
IsaPlanner | GPL |
No | Yes | Yes | 2007 |
KeY | GPL |
Yes | Yes | Yes | October 11, 2017 |
Z3 Theorem Prover | MIT License | Yes | Yes | Yes | November 19, 2019 |
Free software
- Alt-Ergo
- Automath
- CVC
- E
- IsaPlanner
- LCF
- Mizar
- NuPRL
- Paradox
- Prover9
- PVS
- SPARK (programming language)
- Twelf
- Z3 Theorem Prover
Proprietary software
- CARINE
- Wolfram Mathematica
- ResearchCyc
See also
- Curry–Howard correspondence
- Symbolic computation
- Ramanujan machine
- Computer-aided proof
- Formal verification
- Logic programming
- Proof checking
- Model checking
- Proof complexity
- Computer algebra system
- Program analysis (computer science)
- General Problem Solver
- Metamath language for formalized mathematics
- De Bruijn factor
Notes
- ^ Frege, Gottlob (1879). Begriffsschrift. Verlag Louis Neuert.
- ^ Frege, Gottlob (1884). Die Grundlagen der Arithmetik (PDF). Breslau: Wilhelm Kobner. Archived from the original (PDF) on 2007-09-26. Retrieved 2012-09-02.
- ^ Russell, Bertrand; Whitehead, Alfred North (1910–1913). Principia Mathematica (1st ed.). Cambridge University Press.
- ^ Russell, Bertrand; Whitehead, Alfred North (1927). Principia Mathematica (2nd ed.). Cambridge University Press.
- ^ Herbrand, J. (1930). Recherches sur la théorie de la démonstration (PhD) (in French). University of Paris.
- ^ Presburger, Mojżesz (1929). "Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt". Comptes Rendus du I Congrès de Mathématiciens des Pays Slaves. Warszawa: 92–101.
- ^ a b c d Davis, Martin (2001). "The Early History of Automated Deduction". Robinson & Voronkov 2001. Archived from the original on 2012-07-28. Retrieved 2012-09-08.
- ^ Bibel, Wolfgang (2007). "Early History and Perspectives of Automated Deduction" (PDF). Ki 2007. LNAI (4667). Springer: 2–18. Archived (PDF) from the original on 2022-10-09. Retrieved 2 September 2012.
- .
- S2CID 30847540.
- ^ Kolata, Gina (December 10, 1996). "Computer Math Proof Shows Reasoning Power". The New York Times. Retrieved 2008-10-11.
- ISBN 978-981-15-6401-7, retrieved 2024-02-10
- CiteSeerX 10.1.1.62.4976.
- S2CID 7716709.
- ^ Luckham, David C.; Suzuki, Norihisa (Mar 1976). Automatic Program Verification V: Verification-Oriented Proof Rules for Arrays, Records, and Pointers (Technical Report AD-A027 455). Defense Technical Information Center. Archived from the original on August 12, 2021.
- S2CID 10088183.
- ^ Luckham, D.; German, S.; von Henke, F.; Karp, R.; Milne, P.; Oppen, D.; Polak, W.; Scherlis, W. (1979). Stanford Pascal verifier user manual (Technical report). Stanford University. CS-TR-79-731.
- S2CID 14361631.
- ^ Kerber, Manfred. "How to prove higher order theorems in first order logic." (1999).
- ^ Benzmüller, Christoph, et al. "LEO-II-a cooperative automatic theorem prover for classical higher-order logic (system description)." International Joint Conference on Automated Reasoning. Berlin, Germany and Heidelberg: Springer, 2008.
- S2CID 5389933.
ATPs and SMT solvers have complementary strengths. The former handle quantifiers more elegantly, whereas the latter excel on large, mostly ground problems.
- .
In recent years, we have seen a blurring of lines between SMT-COMP and CASC with SMT solvers competing in CASC and ATPs competing in SMT-COMP.
- ^ Sutcliffe, Geoff. "The TPTP Problem Library for Automated Theorem Proving". Retrieved 15 July 2019.
- ^ "History". vprover.github.io.
- ^ "The Theorem Prover Museum". Michael Kohlhase. Retrieved 2022-11-20.
- hdl:1842/3394.
- ^ Gabbay, Dov M., and Hans Jürgen Ohlbach. "Quantifier elimination in second-order predicate logic." (1992).
References
- Chang, Chin-Liang; Lee, Richard Char-Tung (2014) [1973]. Symbolic Logic and Mechanical Theorem Proving. Elsevier. ISBN 9780080917283.
- ISBN 9781483296777.
- ISBN 978-1461396871.
- ISBN 978-0-486-78082-5.
This material may be reproduced for any educational purpose, ...
- Duffy, David A. (1991). Principles of Automated Theorem Proving. Wiley. ISBN 9780471927846.
- ISBN 9780079112514.
- ISBN 9780262182232.
- ISBN 9781461223603.