A binary tree is:
(define-struct empty-tree ())
(define-struct branch (datum left right))
Prove: for any binary tree, there is one more leaf than branch.
NOT the way to argue the inductive step:
"Take any binary tree of size k; I'll splice out a leaf and add a branch with two leaves. This gives me a binary tree of size k+1, which has one more branch and (net) one more leaf."
Problem: this gives you some binary tree of size k+1, but can you guarantee that all such binary trees were created through this method?
To be clear, start with an arbitrary binary tree of size k+1, and show that it has the property (knowing that any way you trim it down gives you a binary tree of size k, for which the inductive hypothesis will hold).
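As a sanity check on the claim (not a proof), here is a minimal Python sketch that mirrors the two structs above and counts leaves and branches by taking an arbitrary tree apart into its sub-parts. The names EmptyTree, Branch, leaves, branches, all_trees and check_up_to are illustrative assumptions, and the sketch assumes that each empty-tree counts as a leaf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmptyTree:
    """Mirrors (define-struct empty-tree ()); assumed to count as a leaf."""


@dataclass(frozen=True)
class Branch:
    """Mirrors (define-struct branch (datum left right))."""
    datum: object
    left: object
    right: object


def branches(t):
    # Take the given tree apart: an empty tree has no branches.
    if isinstance(t, EmptyTree):
        return 0
    return 1 + branches(t.left) + branches(t.right)


def leaves(t):
    if isinstance(t, EmptyTree):
        return 1
    return leaves(t.left) + leaves(t.right)


def all_trees(k):
    """Every tree shape with exactly k branches (datum fixed to None)."""
    if k == 0:
        return [EmptyTree()]
    result = []
    for i in range(k):
        # The root uses one branch; split the remaining k-1 between the subtrees.
        for left in all_trees(i):
            for right in all_trees(k - 1 - i):
                result.append(Branch(None, left, right))
    return result


def check_up_to(max_size):
    """True if leaves = branches + 1 for every tree with at most max_size branches."""
    return all(leaves(t) == branches(t) + 1
               for k in range(max_size + 1)
               for t in all_trees(k))
```

Note that all_trees enumerates every shape of a given size, which is exactly what the "splice out a leaf" argument fails to guarantee by itself.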
We'll define the size of a binary tree as the number of branches:
We'll define the height of a binary tree as:
Prove: For any binary tree t, size(t)+1 ≤ 2^height(t)
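To experiment with this bound, here is a small Python sketch. The definition of height used here is an assumption (0 for the empty tree, otherwise 1 plus the larger subtree height), chosen because it makes the bound hold with equality on completely filled trees. The helper names full_tree and chain are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmptyTree:
    pass


@dataclass(frozen=True)
class Branch:
    datum: object
    left: object
    right: object


def size(t):
    """Number of branches in t."""
    if isinstance(t, EmptyTree):
        return 0
    return 1 + size(t.left) + size(t.right)


def height(t):
    """ASSUMED definition: 0 for the empty tree, else 1 + max subtree height."""
    if isinstance(t, EmptyTree):
        return 0
    return 1 + max(height(t.left), height(t.right))


def full_tree(h):
    """A tree of height h with every level filled (the bound is tight here)."""
    if h == 0:
        return EmptyTree()
    sub = full_tree(h - 1)
    return Branch(None, sub, sub)


def chain(n):
    """A left-leaning chain of n branches: size n and height n."""
    t = EmptyTree()
    for _ in range(n):
        t = Branch(None, t, EmptyTree())
    return t


def bound_holds(t):
    return size(t) + 1 <= 2 ** height(t)
```

For full_tree(3) the two sides are equal (7+1 and 2^3), while for chain(5) the bound has plenty of slack (6 versus 32).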
Similarly, structural induction on WFFs of first-order logic: A WFF is:
Def'n: An "atom" is T, F, or a proposition.
Prove: for any WFF built from atoms and binary connectives, the number of connectives is equal to the number of atoms minus 1.
(More generally, we can include not: #conns ≥ #props-1.)
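Both counts are easy to compute by recursion on the structure of a formula. The sketch below assumes a hypothetical encoding: an atom is a Python string ("T", "F", or a proposition name), a negation is the tuple ("not", sub), and any other connective is a tuple (op, left, right). The function names are illustrative only.

```python
def connectives(wff):
    """Count connectives in a formula under the assumed tuple encoding."""
    if isinstance(wff, str):            # an atom
        return 0
    if wff[0] == "not":                 # unary connective
        return 1 + connectives(wff[1])
    _, left, right = wff                # binary connective
    return 1 + connectives(left) + connectives(right)


def atoms(wff):
    """Count atom occurrences in a formula under the assumed tuple encoding."""
    if isinstance(wff, str):
        return 1
    if wff[0] == "not":
        return atoms(wff[1])
    _, left, right = wff
    return atoms(left) + atoms(right)


# p and (q or T): two binary connectives, three atoms.
binary_only = ("and", "p", ("or", "q", "T"))

# not (p and q): the "not" adds a connective without adding an atom.
with_not = ("not", ("and", "p", "q"))
```

With binary connectives only, the counts differ by exactly one; each "not" adds a connective and no atom, which is why the general statement is an inequality.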
Here is a def'n of string: For a set Σ (the alphabet), the set Σ* ("strings over Σ") contains:
Once we have this, how do we define concatenation?
For v,w ∈ Σ*, vw ∈ Σ* as follows:
Th'm (book): length(vw) = length(v) + length(w).
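To make the recursion concrete, here is a Python sketch under an assumed Rosen-style definition: the empty string is in Σ*, and if w is in Σ* and x is in Σ then wx is in Σ*. Concatenation then recurs on its second argument. The representation (None for the empty string, a pair (w, x) for wx) and all function names are illustrative assumptions.

```python
EMPTY = None  # the empty string


def snoc(w, x):
    """Build the string wx from a string w and a symbol x."""
    return (w, x)


def length(w):
    """length(empty) = 0; length(wx) = length(w) + 1."""
    return 0 if w is EMPTY else 1 + length(w[0])


def concat(v, w):
    """v.empty = v; v.(w'x) = (v.w')x, recurring on the second argument only."""
    if w is EMPTY:
        return v
    w_prime, x = w
    return snoc(concat(v, w_prime), x)


def from_str(s):
    """Convert a Python str into the recursive representation."""
    w = EMPTY
    for ch in s:
        w = snoc(w, ch)
    return w


def to_str(w):
    """Convert the recursive representation back into a Python str."""
    return "" if w is EMPTY else to_str(w[0]) + w[1]
```

Because concat only ever takes apart w, a proof of the length theorem naturally inducts on w and treats v as arbitrary.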
Some exercises:
Book (and a hw problem) talks about the reversal w^R of a string w. What is the skeleton for showing
(vw)^R = w^R v^R ?
Again, make P() about w only, with a "forall v ∈ Σ* …". Use the Def'n of concatenation!
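Before writing the proof, the identity can be tested on all short strings. This Python sketch assumes a recursive definition of reversal (the reverse of the empty string is empty, and (wx)^R = x w^R) together with concatenation defined by recursion on its second argument. The representation and every name below are hypothetical.

```python
from itertools import product

EMPTY = None  # the empty string


def snoc(w, x):
    """Build the string wx from a string w and a symbol x."""
    return (w, x)


def concat(v, w):
    """v.empty = v; v.(w'x) = (v.w')x."""
    if w is EMPTY:
        return v
    w_prime, x = w
    return snoc(concat(v, w_prime), x)


def reverse(w):
    """ASSUMED definition: empty^R = empty; (w'x)^R = x.(w'^R)."""
    if w is EMPTY:
        return EMPTY
    w_prime, x = w
    return concat(snoc(EMPTY, x), reverse(w_prime))


def from_str(s):
    w = EMPTY
    for ch in s:
        w = snoc(w, ch)
    return w


def to_str(w):
    return "" if w is EMPTY else to_str(w[0]) + w[1]


def reversal_law_holds(max_len, alphabet="01"):
    """Check (vw)^R = w^R v^R for all v, w of length at most max_len."""
    words = [from_str("".join(p))
             for n in range(max_len + 1)
             for p in product(alphabet, repeat=n)]
    return all(reverse(concat(v, w)) == concat(reverse(w), reverse(v))
               for v in words for w in words)
```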
Section 3.4, #30: Prove that a string contains at most one more occurrence of 01 than of 10. Notation, to help: for a string s, #01(s) is the number of 01's in s, and #10(s) is correspondingly the number of 10's.
Let P(s) be "#01(s) ≤ #10(s)+1". Proof by structural induction:
Let P(s) be ``#01(s) ≤ #10(s)+1.
Furthermore, if s doesn't end in 1, then
this inequality is strict: #01(s) < #10(s)+1.''
Proof by structural induction:
[The first time I wrote out this proof, I didn't use this notation "#01()". As I was repeating the same words over and over, I went back and re-wrote it with notation.
By the way, deciding what things to give special shorthand/names to is part of the art of writing a good proof… just as deciding precisely what tasks to include as separate functions is part of the art of writing a good program.
Beware adding too much new notation — this can result in a more-confusing proof. Using "standard" names like x,y for real numbers, m,n for natnums, i,j for indices can also make your notation more accessible. ]
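The same naming decision shows up if we check the strengthened P(s) from the 01/10 exercise by brute force. In this Python sketch the helpers count01 and count10 play the role of the notation #01() and #10(); the function names, and the choice to test all bit strings up to a fixed length, are illustrative assumptions.

```python
from itertools import product


def count01(s):
    """Number of positions where the substring 01 occurs in s."""
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "01")


def count10(s):
    """Number of positions where the substring 10 occurs in s."""
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "10")


def p(s):
    """The strengthened property: the inequality, strict unless s ends in 1."""
    weak = count01(s) <= count10(s) + 1
    strict_when_needed = s.endswith("1") or count01(s) < count10(s) + 1
    return weak and strict_when_needed


def check_all(max_len):
    """True if p holds for every bit string of length at most max_len."""
    return all(p("".join(bits))
               for n in range(max_len + 1)
               for bits in product("01", repeat=n))
```

A passing check is only evidence, of course; the structural induction is still what covers strings of every length.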
In structural induction (and in general for the inductive step(s)), start with an arbitrary structure, then name the sub-parts it's made out of, and then invoke the inductive hypothesis.
Example:
Let P(t) be ``2^height(t) ≥ size(t)''. We prove P(t) holds for all trees t by structural induction:
The reason the first situation is more clear is that it's nicer to say "hey, you give me any tree t. I'll reason with it."
The second approach is round-about, saying "well, if something holds for some smaller trees t1 and t2, then it holds for a tree made out of them." Left unstated is: "Oh, and this tree I made out of the t1,t2 happens to be the particular tree you are interested in, even though I never really told you which t1 and t2 I was choosing."
For you to ponder: If I restricted you to only use a single base case, would this suffice to still solve the problem? That is, is strong-induction-with-multiple-premises a truly more powerful inference rule than strong-induction-with-single-premise?
For that matter, we introduced mathematical (non-strong) induction as a new rule of inference, and then sneakily slipped in strong induction as well. Are we justified in throwing in stronger and stronger rules of inference whenever we feel like it? (Rosen even has a couple of exercises in this vein: sect 3.3, #55,56.)
Suppose you are given some code for append, and you try to prove
For any list a, (append a (append a a)) = (append (append a a) a)
Seems like this shouldn't be so bad. But you try doing it by induction, and you get stuck: your inductive hypothesis doesn't quite give you enough information to get things to go through.
;; append: list, list --> list
(define (append x y)
  (cond [(empty? x) y]
        [(cons? x) (cons (first x) (append (rest x) y))]))
Let P(a) be ``(append a (append a a)) = (append (append a a) a)''. We will (try unsuccessfully to) prove by structural induction that this property holds for our code as written.
Case a = empty:
(append empty (append empty empty))
= (append empty empty)
= (append (append empty empty) empty)
Case a = (cons a0 a*):
(append a (append a a))
= (append (cons a0 a*) (append a a))    ; by this case for a
= (cons a0 (append a* (append a a)))    ; by code for append
So far so good. We can't apply our inductive hypothesis (since we're not triply-appending the same list any more), though I guess we can pursue the inner append, knowing something about the structure of a:
= (cons a0 (append a* (append (cons a0 a*) a)))    ; by this case for a
= (cons a0 (append a* (cons a0 (append a* a))))    ; by code for append
= ???
Unfortunately, we are stuck: we have a statement where the second argument to the (first) append is in terms of cons, but that doesn't help us, since our code never disassembles its second argument.
Ironically, by proving a stronger statement, your life actually becomes easier! It's not too hard to show that
For any lists a,b,c: (append a (append b c)) = (append (append a b) c).
We will still induct on a (alone): let P(a) be ``For all lists b,c, (append a (append b c)) = (append (append a b) c)''. This is called ``loading the inductive hypothesis''.
We will (successfully, this time) prove by structural induction that P(a) holds for all lists a.
Case a = empty: for any lists b,c,
(append empty (append b c))
= (append b c)                    ; by code for append
= (append (append empty b) c)     ; by code (in reverse)
Case a = (cons a0 a*): for any lists b,c,
(append a (append b c))
= (append (cons a0 a*) (append b c))    ; by this case for a
= (cons a0 (append a* (append b c)))    ; by code for append
= (cons a0 (append (append a* b) c))    ; by our inductive hypothesis!
= (append (cons a0 (append a* b)) c)    ; by code (in reverse)
= (append (append (cons a0 a*) b) c)    ; by code (in reverse)
= (append (append a b) c)               ; by this case for a
(Hmm, in this case the trouble is that the first statement, while it can be (correctly) viewed as an instance of "append associates", can also be (incorrectly) viewed as "append commutes"!)
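For experimenting with the general statement, here is a Python sketch of the same append, using an assumed representation of lists (None for empty, a pair (first, rest) for a cons). It checks associativity on all short lists over a small set of symbols; this is only a test, not a substitute for the induction above, and every name in it is hypothetical.

```python
from itertools import product

EMPTY = None  # stands in for Scheme's empty


def cons(first, rest):
    return (first, rest)


def append(x, y):
    """Same shape as the Scheme code: recur on x only, never inspect y."""
    if x is EMPTY:
        return y
    first, rest = x
    return cons(first, append(rest, y))


def from_py(items):
    """Build a cons-list from a Python list."""
    result = EMPTY
    for item in reversed(items):
        result = cons(item, result)
    return result


def to_py(lst):
    """Flatten a cons-list back into a Python list."""
    out = []
    while lst is not EMPTY:
        out.append(lst[0])
        lst = lst[1]
    return out


def associative_up_to(max_len, symbols=(0, 1)):
    """Check (append a (append b c)) = (append (append a b) c) on short lists."""
    lists = [from_py(list(p))
             for n in range(max_len + 1)
             for p in product(symbols, repeat=n)]
    return all(append(a, append(b, c)) == append(append(a, b), c)
               for a in lists for b in lists for c in lists)
```

The check ranges over independent a, b and c, matching the loaded hypothesis rather than the original a-only statement.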