Theorem Proving | George Shakan

To familiarize myself with computer proofs, I aimed to write down some basic theorems from my area of research in Lean. I chose Lean primarily because of its very active and helpful community. In fact, everything I did in Lean was done alongside Yael Dillies, whom I met through Zulip. In what follows, I will discuss how I went about this and what I took away from the experience.

First of all, what is Lean? It is an interactive theorem prover (like Isabelle, Coq, and others). For our purposes, it is a way to input mathematical statements and their corresponding proofs. The power of Lean is that it verifies that the proof you enter is indeed correct. The user writes a mathematical statement she wishes to prove. The interactive window then shows which goals remain to be proved, and she can begin working to establish them. Correct proofs are rewarded by a “Proof Complete” (and a festive 🎉). Incomplete proofs produce unaccomplished goals or errors. To get a feel for how this works, one can load up the natural number game.
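As a small illustration of a statement, its proof, and the goals that remain along the way, here is a minimal sketch. It is written in Lean 4 syntax, which is an assumption (the syntax differs between Lean versions), and the name `my_and_swap` is made up for this example.

```lean
-- Statement: if p and q both hold, then q and p both hold.
-- `my_and_swap` is a hypothetical name chosen for this sketch.
theorem my_and_swap (p q : Prop) (h : p ∧ q) : q ∧ p := by
  -- Goal at this point: q ∧ p
  constructor
  -- Two goals now remain: first q, then p
  · exact h.right   -- closes the goal q
  · exact h.left    -- closes the goal p
```

Deleting either of the last two lines leaves one goal unaccomplished, and Lean reports it as an error rather than accepting the proof.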

An interactive theorem prover offers interesting possibilities. The most obvious is that we can be sure that a proof is correct. Of course, this is not completely true, as one has to make sure that the previous statements and definitions are themselves correctly stated! One can consult this Quanta article about the impressive computer formalization of some of Peter Scholze’s work.

An even more interesting goal is to automate parts of mathematical research. One idea, independent of theorem provers, is an advanced autocomplete to be used when writing mathematical papers in LaTeX. The technology to build this already exists, has been applied successfully to programming, and is still evolving. For instance, Google’s Minerva has already had some success in using language models to perform mathematical reasoning.

Another goal would be to automate the process of checking the correctness of proofs. An effective system of this type would save mathematicians countless hours in refereeing alone. This can be viewed loosely as a neural machine translation problem. There has been a lot of success in this domain with natural language (say, English to German), both in research and in practice.

Finally, a third, and even more ambitious, goal is for an AI to be able to come up with its own proofs. Needless to say, this program is currently in its infancy. The interested reader can consult OpenAI’s somewhat recent paper on the subject.

For further reading on these sorts of things, I recommend “A Promising Path Towards Autoformalization and General Artificial Intelligence” by Christian Szegedy. There, Szegedy outlines his thoughts on so-called autoformalization, as well as how such a program would fit into the general development of AI.

My first step in getting familiar with Lean was to work through the natural number game. This is basically a guided tour of proving some basic statements in Lean, avoiding some technical difficulties in the process (for instance, it is browser based). Here one has a goal to prove, and must use the given hypotheses to establish the goal. After each new line of code is written, one can see how it changes the hypotheses and goals.
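The flavor of such an exercise can be sketched as follows. This is a minimal illustration in Lean 4 syntax and not the game's actual code: the type `MyNat`, the function `add`, and the theorem name are all defined here just for the example.

```lean
-- A home-made copy of the natural numbers (hypothetical, for illustration).
inductive MyNat where
  | zero : MyNat
  | succ : MyNat → MyNat

namespace MyNat

-- Addition, defined by recursion on the second argument.
def add : MyNat → MyNat → MyNat
  | m, zero => m
  | m, succ n => succ (add m n)

-- Adding zero on the left does nothing; this needs induction,
-- because `add` only unfolds on its second argument.
theorem zero_add (n : MyNat) : add zero n = n := by
  induction n with
  | zero => rfl
  | succ k ih =>
    -- Hypothesis ih : add zero k = k
    -- Goal: add zero (succ k) = succ k
    -- Unfold one step of the definition of `add`:
    show succ (add zero k) = succ k
    -- Rewrite with the induction hypothesis; the goal becomes
    -- succ k = succ k, which `rw` closes by reflexivity.
    rw [ih]

end MyNat
```

Stepping through the proof line by line shows the goal changing after `show` and again after `rw [ih]`, which is the kind of feedback loop described above.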

After this, I installed Lean on my local machine and worked through a tutorial, which is again a guided tour, but a bit more in depth than the natural number game (I was joined for some of this by Tsutomu Okano). I worked through various problems one might see in an undergraduate analysis course, relating to sequences, limits, and other topics.

The next leap is to begin to contribute to mathlib. Mathlib is a library of mathematics in Lean that hundreds (in my estimation) of people have worked on. This is not an easy step; for instance, knowledge of the incredibly useful GitHub is helpful for figuring out what is going on. To proceed here, it helps that Lean has a very active community on Zulip. The active community was my main reason for choosing Lean over other interactive theorem provers.

After asking around, I was lucky that Bhavik Mehta and Yael Dillies were both knowledgeable about Lean and interested in my main area of research. Moreover, there was already some work in mathlib on additive combinatorics and finite sets, which was very useful to build on (see this work of Bloom and Mehta). I started off by trying to prove Ruzsa’s triangle inequality.

Theorem 1: Let $A, B, C \subset \mathbb{Z}$ be finite sets. Then

$|A-C| |B| \leq |A-B||B-C|. \ \ \ \ \ \spadesuit$

I thought this was a good starting point, as it is fundamental to the area whilst having a short proof (see Lemma 2.6 in Tao and Vu). This turned out to be a bit harder than I expected. The argument can be formulated either as a double counting argument or as showing that a certain function is an injection. It turns out that the former formulation is a bit easier to input into Lean due to what is already in mathlib, borrowing some of Bhavik’s work on the sum-product problem (however, I was shown that with enough perseverance one can use the latter proof). Moreover, some very simple things, like pointwise set subtraction, had to be added to mathlib. This turned out to be quite a non-trivial task and was completed by Yael. This is perhaps a result of additive combinatorics being a relatively new addition to mathlib.
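For concreteness, the injection formulation can be sketched as follows. The notation $a(x)$, $c(x)$ and $\phi$ is chosen here for illustration and need not match Tao and Vu or the proof in mathlib.

For each $x \in A - C$, fix one representation $x = a(x) - c(x)$ with $a(x) \in A$ and $c(x) \in C$. Define

$\phi : (A-C) \times B \to (A-B) \times (B-C), \ \ \ \ \ \phi(x, b) = (a(x) - b, \ b - c(x)).$

If $\phi(x, b) = (u, v)$, then

$u + v = (a(x) - b) + (b - c(x)) = a(x) - c(x) = x,$

so $x$ is determined by $(u, v)$. Hence so are $a(x)$ and $c(x)$, and then $b = a(x) - u$. Thus $\phi$ is injective, and comparing the sizes of the domain and codomain gives $\spadesuit$.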

Next, Yael and I input Plunnecke’s inequality into mathlib. Plunnecke’s inequality is another fundamental result that has a relatively short proof (thanks to Petridis). Here, Yael took over the Lean part. Their workflow was far more efficient than mine, as they are a veteran of Lean. They have developed a plethora of tricks for finding results in mathlib and even figuring out what to use in mathlib. While they cannot write in Lean as quickly as I can in LaTeX, they are not ten times slower.
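For reference, one commonly stated form of the inequality is the following. Whether this is exactly the form that was entered into mathlib is an assumption.

Let $A, B \subset \mathbb{Z}$ be finite and non-empty with $|A+B| \leq K|A|$. Then for all non-negative integers $m, n$,

$|mB - nB| \leq K^{m+n} |A|.$

Here $mB$ denotes the $m$-fold sumset $B + \cdots + B$.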

We eventually made a pull request to mathlib containing Ruzsa’s triangle inequality and Plunnecke’s inequality. Mathematically, I did not gain too much from working on this. I believe this is in part because I already had a thorough understanding of the proofs. Nevertheless, thinking about how to actually convert theorems to Lean did force me to revisit these classical theorems from multiple perspectives, and I could see this being productive as one scales.

Overall, I enjoyed the experience of working in Lean, largely due to the vibrant community. Transferring mathematics to theorem provers requires a unique combination of mathematics and software engineering (both to contribute and under the hood). We already have some talented people (both young and old) working on this, and I certainly look forward to seeing what will be accomplished along these lines.



My experience with Computer Proofs