Theorem proving with Coq

Antonis Stampoulis

What is Coq?

Coq is a tool to help you write formal proofs, that are mechanically verifiable. This means that once you have proved something in Coq, you have very high assurance that it is true -- more than what you usually have when doing a pen-and-paper proof. It can be used in an interactive style, thus we call it an interactive proof assistant. It is based on a very expressive logic, the Calculus of Inductive Constructions.

What Coq is not.

Coq is not a tool that will automatically prove theorems. So, you can't have Coq solve your logic or math homework for you, or prove the Goldbach conjecture. Still, it greatly simplifies the development of formal proofs, by automating some aspects of it.

Why you should learn to use it.

Many reasons: to understand formal mathematical logic better; to verify that a proof is indeed valid; to develop certified software.

How you could learn to use it.

Sit through this section for the basics.
Follow the standard Coq tutorial. (this presentation is based on it)
Use the reference manual for things not covered there.
For more detailed introductions, look at the Coq'Art book and Adam Chlipala's book.
Treat it as a programming language: the only proper way to learn it is to use it!

How to use it.

Coq is installed in the zoo machines, in the directory


/c/cs430/bin

As a development environment for it, you can either use the CoqIDE or ProofGeneral, which is an Emacs-based environment. Both are installed in the same directory.

Propositional Logic

We will now see how to prove a simple theorem of propositional logic, namely the S axiom of the Hilbert calculus, in Coq. Its statement is
(P → Q → R) → (P → Q) → P → R

As we said, Coq is an interactive tool. That means that instead of having to write a development in full and then pass it to Coq (the way we write programs and pass them to a compiler), we can feed a development little by little to Coq, which will then help us proceed. Environments like CoqIDE and ProofGeneral help us do that, and one should use the rest of this tutorial like that.

As a first step, we state the theorem we're trying to prove.

Theorem hilbert_axiom_S : (P → Q → R) → (P → Q) → P → R.

Logical negation is represented as implication of False; so not A is equivalent to A → False. Still, the treatment of logical negation in Coq is uncommon, in that in its underlying logic the law of the excluded middle does not hold, unless we explicitly include it. We can still prove the law of noncontradiction:

Theorem noncontradiction : not (P ∧ not P).

We can summarize some of the tactics we can use for propositional logic in the following table:

	As a goal	As a hypothesis
→	intros	(various)
∧	split	destruct
∨	left or right	destruct
True	trivial	--
False	--	destruct

Predicate logic

Universally quantified goals can be proven by introducing a new free variable through the intros tactic, while such hypotheses can be used by specifying instantiations for the quantified variables (implicitly or explicitly) when using tactics like apply and destruct. Existentially quantified goals can be proven by selecting a witness for which they hold; existentially quantified hypotheses can be opened up so as to give a name to their contained witness for the rest of the proof.

We can thus augment the previous tactics table with the following:

	As a goal	As a hypothesis
∀	intros	... with (x := ...)
∃	exists	destruct
=	trivial when a = a	rewrite

Inductive definitions

In Coq we specify the objects that we reason about by defining inductively how such objects can be constructed. For example, if we want to be able to reason about natural numbers, we would give the following definition:

Inductive nat : Set :=
O : nat
| S : nat → nat.

O is the constructor that corresponds to zero; S is the constructor that corresponds to the successor function.

Immediately from the previous definition, we get the principle of induction over natural numbers:

Print nat_ind.

nat_ind : ∀ P : nat → Prop,
           P O →
           (∀ n : nat, P n → P (S n)) →
           ∀ n : nat, P n

In fact, inductive definitions play a much more central role in Coq than just being able to inductively define sets like natural numbers. Predicates like the ≤-relation for natural numbers are also be inductively defined.

Inductive le : nat → nat → Prop :=
le_n : ∀ n : nat, le n n
| le_S : ∀ n m : nat, le n m → le n (S m).

Even logical connectives like ∨ are defined inductively.

Inductive or (A B : Prop) : Prop :=
or_introl : A → or A B
| or_intror : B → or A B.

Let's now see some example proofs involving induction. We'll prove the theorems:

∀ n : nat, O ≤ n

∀ n m : nat, n ≤ m → S n ≤ S m

Let's now move on to the next theorem.

Theorem le_S_S : ∀ n m : nat, n ≤ m → S n ≤ S m.

What is interesting is that it is easier to prove this by performing induction on the derivation of n ≤ m instead of natural number induction on n or m.

The induction principle that we get from the definition of ≤ is the following:
(∀ n.P n n) → (∀ n, m.P n m → P n (S m)) →
(∀ n, m.n ≤ m → P n m)

By instantiating P to fun a b ⇒ S a ≤ S b we get:
(∀ n.S n ≤ S n) → (∀ n, m.S n ≤ S m → S n ≤ S (S m)) →
(∀ n, m.n ≤ m → S n ≤ S m)

As a last example, let's see how recursive functions are handled. We can use the Fixpoint command to define such functions, e.g. the addition of natural numbers:

Fixpoint plus (a b : nat) : nat :=
  match a with
     O ⇒ b
    |S a ⇒ S (plus a b)
  end.

These functions need to be terminating, in order to ensure logical consistency.

We can now prove theorems like the following:

Theorem le_plus : ∀ n m p, n ≤ m → n ≤ plus p m.

The simpl tactic can be used to perform evaluation of a function.

Declarative proof style

The proof scripts we've been writing are very hard to read, and also perform too many low-level steps. There are a number of ways we can mitigate these problems, and arrive at scripts that are better at conveying the essential ideas of proofs, and easier to write. One way to do this is to use Coq's new declarative proof mode, where we specify intermediate steps rather than the tactics required to reach them; another way is through the careful use of automation. Here we'll briefly see the declarative style.

Let's revisit a theorem we proved earlier and see its declarative proof.

Theorem latin_square_property_decl : ∀ a b : G, ∃ x : G, a × x = b.
proof.
  let a : G, b : G.
  take (inv a × b).
  have H1:(a × (inv a × b) = (a × inv a) × b) by associativity.
  have H2:(a × inv a = e) by inverse.
  have (a × (inv a × b) = e × b) by H1, H2.
                       ~= b by identity.
  hence thesis.
end proof.
Qed.

The basic way of doing proofs in the declarative style is the following:

We use let and assume instead of intros.
We use have to state one fact and explicitly provide justification for it by listing the hypotheses we use. then is similar but includes the previous fact in the justifications.
We use thus to show (part of) the thesis, including justification if needed. hence includes the previous fact.
When doing induction, we need to explicitly specify the different cases. Dependent induction is still not supported though...