5th HLF – Lecture: How to Write a 21st Century Proof
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 49 | |
Author | ||
License | No Open Access License: German copyright law applies. This film may be used for your own use but it may not be distributed via the internet or passed on to external parties. | |
Identifiers | 10.5446/40137 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
2
00:00
Internet forumDescriptive statisticsOrder (biology)Different (Kate Ryan album)Meeting/Interview
00:33
Internet forumBit rateComputer scienceMereologySphereProof theoryTheoremWordNetwork topologyDisk read-and-write headMathematicsQuicksortWell-formed formulaMedical imagingEqualiser (mathematics)MathematicianLecture/Conference
02:36
Bit rateTheoremMultiplication signProof theoryWell-formed formulaDerivation (linguistics)WritingAlgorithmProgrammer (hardware)WordSoftware bugTrailMathematicianSet (mathematics)HypothesisError messagePositional notationPrime idealFiber bundleLecture/Conference
05:10
Bit rateComputer programmingProof theoryMathematicsTheoremWell-formed formulaStatement (computer science)BitArithmetic meanSymbol tableMereologyNichtlineares GleichungssystemPoint (geometry)Elementary arithmeticData structureForm (programming)CalculusPositional notationMean value theoremPrime idealAdditionFigurate number1 (number)Computer scienceStudent's t-testState of matterWeb pageSummierbarkeitDifferentiable functionWordNumberMathematical analysisCompact spaceMathematicianWritingLecture/Conference
12:10
Bit rateError messageProof theoryHypertextElement (mathematics)Goodness of fitLevel (video gaming)MereologyTheoremData structureExpandierender GraphMathematicsReal numberVideo gameReading (process)Online helpWeb pageExistenceCategory of beingStatement (computer science)Physical systemSocial classPoint (geometry)Arithmetic meanHierarchyStudent's t-testAlgebraNumberMultiplication signMean value theoremFiber bundleLecture/Conference
18:53
Bit rateProof theoryError messageLevel (video gaming)ThumbnailGreatest elementMultiplication signWeb 2.0TrailComputer scienceComputer programmingRule of inferenceHierarchyNeuroinformatikLecture/Conference
20:41
Proof theoryError messageComputer scienceData structureLevel (video gaming)Web 2.0TheoremNP-hardMathematicsReal numberLecture/Conference
21:55
First-order logicPoint (geometry)Proof theoryComputer scienceCASE <Informatik>Constructor (object-oriented programming)Level (video gaming)AlgorithmNumberLecture/Conference
23:08
Bit rateHeegaard splittingProof theoryCASE <Informatik>Point (geometry)Data structureLevel (video gaming)Lecture/Conference
23:32
Internet forumWeb pageProof theoryInformationMusical ensembleLecture/Conference
Transcript: English(auto-generated)
00:01
And now it's my pleasure to introduce the third speaker of this first session today, Leslie Lampert. And here I would just encourage people to look at this interesting text written in the description about all the different kinds of accomplishments that Leslie has made in his scientific career.
00:21
And two pages, only the last sentence mentions LaTeX, which probably most people know about. But there is so much more, so much exciting thing, which I will not start talking about in order to not take his time. Please have a look. Okay, so this is mostly a talk for mathematicians, but there are still a few computer scientists who write proofs these days.
00:42
So I'm talking to tell you about how to write a proof. Or more precisely, how to write a 21st century proof. And how to stop writing the 17th century proofs that those of you who are writing proofs are probably writing now. A brief history of writing formulas. Here's what a formula might look like written in the 17th century.
01:08
See if you can read it and see if you can figure out, recognize that formula. Anyone recognize it? I see a couple of heads nodding. Well, here it is written in the 20th century.
01:22
You will notice it as a rather old but fairly recently proved theorem of Fermat. Here's how it might look like in the later part of the 21st century, where we've eliminated all the words and now just use formulas. But don't panic. We're not going to worry about entering the later parts of the 21st century today.
01:47
What's important is leaving the 17th century. So, a brief history of writing proofs. Here is a 17th century theorem. Homogeneous and equal spherical bodies opposed blah blah blah.
02:02
This is from Newton's Principia, or Principia as I'm sometimes told it's supposed to be called. And here's his proof. Obviously don't try to read it, but just, you know, get that sort of image in your mind. Here's a 20th century theorem. It's the Schroeder Bernstein theorem, for those mathematicians among you.
02:26
And it's from General Kelly's General Topology book, which used to be a bestseller as far as math books go, and I think it's still used. And here's what his proof looks like. Now, it's not very different from Newton's proof.
02:44
And like writing formulas and words, writing proofs and prose must have led to lots of errors. I mean, try proving Fermat's theorem if instead of using formulas you were writing it out in 17th century style. You're not going to get very far.
03:01
Well, in fact, writing formulas and words and writing proofs and probes has led to many errors. There's anecdotal evidence that suggests that about one third of all published and refereed mathematical papers have false theorems.
03:22
This proof is wrong. And I'll talk about that later. Now, I'm going to show you my way of writing proofs in the 21st century. If you don't like my way, that's fine. Find a better way. I really mean that. I'm challenging you.
03:42
I like the notation f prime for the first derivative of f. If you don't, maybe you want to write df or df of x dx. Fine. Or invent another notation. Don't write the first derivative of f in a formula. It's time to leave the 17th century.
04:04
We've stopped writing formulas and words. It's time to stop writing proofs and words. Okay, what led me to this? Well, about 35 years ago, actually maybe it's more like 40 years ago today, I started writing rigorous correctness proofs of algorithms.
04:22
Now, proofs of algorithms are different from mathematical proofs. They tend not to be mathematically deep, but they have lots of details. And if a mathematician forgets a little detail, like the hypothesis that a set is non-empty, it's a trivial oversight.
04:41
Programmers have a name for such trivial oversights. They're called bugs. I began by writing my proofs the way mathematicians do, and not surprisingly, I was trained as a mathematician. But when I tried proving correctness of non-simple algorithms, I discovered that this kind of proof didn't work.
05:03
It was impossible to keep track of all the details, and to make sure that I didn't miss any. I had to invent a different way to write proofs. It worked for proofs of programs, so I tried it on my proofs of mathematical theorems, and I found that it worked great for them too.
05:21
Now, my goal is not to convince you that I know the best way to write proofs. I just want to convince you that now you are not writing good proofs. If you care about your proofs, you will look for a better way. And looking at my method is a good place to start. Now, I'm going to talk about mathematical proofs, not algorithm proofs.
05:41
Written by mathematicians, and occasionally computer scientists write proofs like that. So, why are formulas easier to read than prose? Now, why, you know, this is hard, but what makes formulas easier to read than this? Well, there are several things. First is naming.
06:01
If I were to give names to those numbers, instead of the sum, you know, first two numbers, the sum of x and y, it makes it a little easier to read. Second thing is structure. If I structure the formula like this, now you'll find that this is just about as easy to read as the modern form.
06:21
And what the third leg of what formulas provide is a compact notation. But for proofs, most of you are going to want to stop here. And that's fine. Notice that's a lot easier to read than the first statement of that formula.
06:45
And you'll have gone from the 17th century to the early part of the 21st century. And that's a pretty good-sized leap. Okay, adding structure and naming to proofs. Let's start, I'm going to start with an example from Mike Spivak's calculus book.
07:02
I love what Mike wrote in the introduction, so I'll read it. In addition to developing the student's intuition about the beautiful concepts of analysis, it is surely equally important to persuade them that precision and rigor are neither deterrence to intuition, nor ends in themselves, but the natural medium in which to formulate and think about mathematical questions.
07:29
Precision and rigor are the natural medium in which to formulate and think about mathematical questions. So let's see how precise and rigorous he was in proving the corollary to the mean value theorem.
07:44
This is an example I chose, not by random, but by looking at a theorem at a corollary of the right size to present in a talk. And here's the statement. It's a corollary to the mean value theorem. If you remember your calculus, it's straightforward.
08:01
I'm not going to, don't try to, you know, read the details. I'll be pointing out the relevant ones to you. It's adopted from Spivak's calculus book. So what's wrong with it? Precision and rigor. It's not precise and rigorous to introduce a symbol without saying what it is.
08:21
What is this F? This is a text for freshmen. How can they learn to be precise and rigorous from a text that isn't? So, and I should mention that this is the most precise and rigorous elementary text, calculus text, well that I saw when I looked for them at them when I was teaching calculus decades ago,
08:43
but it's still in use and it's still considered very rigorous textbook. So we'll say what F is. It's a differentiable function on an interval. Okay, on an interval, an interval, the interval. Naming. We'll give that interval a name, call it I.
09:05
And we'll add some structure, state it as what are the assumptions and what it is that we have to prove. Okay, now do you prefer this, writing it out F prime of X is greater than zero for all X and I instead of that?
09:23
Then why not write this? You know, why not get rid of all the words? But again, we're going to approach the 21st century slowly. So we'll let you write your equations with a few words in there, that's fine. Okay, the proof. Let A and B be two points in I with A less than B.
09:43
But A and B might not exist because the interval might contain only a single point. He didn't make any assumption about it being an un-empty interval. So is this proof wrong? Well actually it's not. But you have to look at the end of the proof to see that it's not.
10:01
And you'll see that what it's trying to prove is F of B is greater than F of A. And he's doing it by the standard way of proving that it's increasing by showing that you have to choose two points and show that F of one is bigger than F of the other. But should you have to read to the end of the proof to understand the first sentence?
10:22
I don't think so. So, right, we assume that A and B are two points in I with A less than B and prove that F of B is greater than F of A. We need to say it right out front. Then he said there's some X in AB which satisfies this.
10:41
And why is this true? By the mean value theorem, which is fine that he doesn't mention it explicitly. It's a corollary to the mean value theorem. But is it? Here is a statement of the mean value theorem. And notice that it has this assumption that F is continuous on this interval.
11:01
But why is this true? Well, if you look at the assumptions up here, you know, that assumption isn't there. What he's actually using without mentioning it is a theorem 35 pages earlier. That if F is differentiable at a point, that it's continuous at the point.
11:23
Well, should the reader have to figure out by herself that a theorem stated 35 pages earlier is being used? Remember, this is a freshman. Okay, 17th century prose proofs encourage this imprecision and lack of rigor.
11:41
And let's write, rewrite the proof in a way that encourages precision and rigor. So, what we'll do is first step, it suffices to assume A and B is less than B to prove F of A is less than F of B.
12:02
Then a little bit of notation instead of just abbreviate that a bit. And the reason for it, every step has a proof. And it's by definition of increasing. Now we choose an A and B with this property. And the reason by mean value theorem, theorem one, and assumptions one and two.
12:27
And step one. Because of the assumption that A and B are elements of the interval. And then step three, that's by assumptions one and three, and step one and step two.
12:49
So all being stated out for the reader can then go over, look at all of these things, and make sure that this step follows by these reasons.
13:00
I don't have time, you know, to let you really look carefully and convince yourself that is true. And thereby step one, from A less than B, and we're assuming that the reader can figure out that A less than B implies B minus A greater than zero. But it doesn't hurt to give a clue as to what part of step one we're using.
13:27
And QED, this QED step is an abbreviation for the current goal. So the thing that this is proving is the current goal, that F is increasing. And it steps by three and four and assumes that we can do the algebra.
13:46
Okay, here it is. Structure, naming. Notice we have named all the assumptions that are being used and all the steps. And every step's proof names every fact needed to prove it.
14:07
And if you've ever written, read mathematics, you come to some sentence and you say, well, how the hell does this follow? You know, is it a trivial consequence of the next previous statement? Is it using something, you know, three sentences back or what?
14:22
And it makes math really hard to read. But you don't have that problem with this method. Reasoning is easier to follow and makes the proof easier to read. Now this step might be hard for a freshman to understand.
14:40
So let's give her some help. Now in prose proofs, there's a downside to doing that. Because when you add details, it makes the proof longer. And when a proof is longer, it's harder to follow. But in structured proofs, all we do is we add another level of proof.
15:04
We break it down into a number of steps. And again, this QED step for a choose statement to prove that you can choose something. You have to show that that thing you're trying to choose exists.
15:24
Now in the latter part of the 21st century, proofs are going to be written in hypertext. So what we're going to do is we'll just have to click on a step to show or hide its proof. So you click on this corollary and we get the high level of the proof.
15:42
We click on this step and we get that proof of that step expanded. And we click again, we get it hidden again. So we can add detail and as much detail as we want without losing the basic structure of the proof.
16:06
So adding detail makes the proof easier to read with no downside. And we can also right click and get an informal sketch of the proof. If somebody wants to just get a general idea of why this is true and here's a high level sketch.
16:25
And you know, left click if you want the whole story. Oh well, we don't have hypertext in 2017, not very much of it. And no good hypertext system I know for writing math. But you can do it today, you don't have to wait for this technology to be developed.
16:45
You can show hierarchy with indentation and you can set off informal explanation with shading. It's not as good as hypertext but it works. Structured proofs help avoid errors. I have lots of personal evidence to support this.
17:03
I'm going to tell you one story. Remember this incorrect proof. How did I discover it was wrong? Well this theorem is very simple, the Schroeder Bernstein theorem. But it has a non-trivial proof and I was teaching an introductory math course. And I thought that would be a great example to use.
17:22
So I took this and I looked around and this was the shortest proof of the Schroeder Bernstein theorem I could find. But in one page long, but it's written for math graduate students, not for freshmen. So I had to expand it, explain the details. So I went and after about four pages of filling in details, I discovered that the proof was wrong.
17:46
Now I was writing this in the middle of the night before the class. And at that point I didn't see how to fix the error and I just forgot about it. Okay, how I found the error again. 20 years later, I read it again.
18:01
And again. And again. And again. And I looked at that proof and said, there's no error there. I must have been misremembering. Maybe it was something else. Maybe I was just too sleepy to see something. But I said, well, okay. May not be a wrong proof, but it's a nice little proof.
18:21
And it'd be a good example. So I began writing it as a structured proof. And I found the error in 15 minutes. Structured proofs in real life. Perhaps it's overkill for a simple proof and sophisticated readers. When you're sure that the theorem is true, we write proofs to make sure a theorem is true.
18:47
Now that's not the only reason we write proofs, but that's the first reason we write a proof. Before you go trying to figure out how to, you know, make it pretty, how to explain it, how to make it sound really cool to somebody else, we want to make sure that it's true.
19:03
And our proofs are often not simple. Hierarchical structuring is the best way to avoid errors. Tony Hoare helped a computer scientist discover that 40 years ago, 50 years ago. I've lost track of the time. And it works for everything, it's not just for computer programs, it works for proofs as well.
19:26
Especially for long and subtle proofs. The lowest level proofs should be short paragraphs. And if you're not completely sure about the proof of a step, then don't just write a bigger paragraph.
19:42
You break it up and add another level. And my rule of thumb is to decompose the proof until each lowest level step is obviously true. And then go one level deeper. Later, it's easy to replace the lowest levels with longer paragraphs for your less skeptical colleagues.
20:06
You know, I've done that. It's very easy to just snip off the bottom levels of the proof and reproduce it by, you know, and replace it by a simple paragraph explaining what's going on.
20:20
And for publications, these proofs get to be long, but you don't have to publish the whole thing, you can publish the sketch, and these days you can put the complete proof on the web. I've published the papers this way with, you know, a short proof in the paper and an appendix. This is an ACM publication.
20:41
The appendix was published by the ACM just on the web. Now I've been writing structured proofs since about 1990. And here's a typical example of a proof that I've published. Obviously don't try to read it. Notice that it's about four levels deep.
21:00
And nothing hard about it. Okay, why shouldn't you write structured proofs? Well, there are lots of reasons. You hate change. Everybody hates change. It means you have to learn something new. You'll have to learn to write these proofs. You know, it takes some practice to get good at it. You're afraid to be different.
21:22
Well, there's no need to be afraid. Even if nobody else is writing them, readers have no trouble with them. It's more work. It is more work. Sloppiness is easier than precision and rigor. It takes a little work to make something precise and rigorous. Nothing is as easy as this proof.
21:44
And the real reason is you might discover your theorem is wrong. You won't be able to publish it. And after all, the referees wouldn't have found the error. Everyone makes errors anyway, especially computer scientists writing algorithms.
22:00
I wish I were joking. I haven't found this in mathematicians, but I've spoken to computer scientists who have published an incorrect proof, and you know, they'll sound contrite, but I can tell that that's what's in the back of their mind, why I'm glad the referees didn't see that. It's easy to find other reasons.
22:20
Why you should write structured proofs, there's only one reason. You want to really make sure that what you prove is true. What else do you need to know? A few details. Numbering steps. Once you start getting a little deep, you know, numbers like 2.3... They're not going to get you very far.
22:40
But there's an easy solution to this. You replace that with this step. It's step two of the current level six proof. And when you think about it, it works because there's at most one step with that label that it's legal to refer to at any point in the proof.
23:01
But you have to think about that for a while. Anyways, the case construct. Case n greater than zero is just an abbreviation for assume greater than zero and prove the current goal. Useful for proofs by case splitting. Definitions. You often need to define things locally at a certain point in the sub-proof.
23:21
And it's easy, you just define them and the structuring level tells you where that, what the scope of that definition is. And that's it. This is not rocket science. Just do it. For more information, look at my paper How to Write a 21st Century Proof.
23:41
It's on my publication page. Be fruitful and prove. Thank you.