COMP 210 Homework #7

Spring 2003

Due Wed. March 5, 2003 at the start of class

All problems will require all (applicable) steps from the design recipe, including a template. Before you tackle the homework, remind yourself of our General Advice, Advice on Homeworks (in particular: staple and provide questions), and the Grading Guidelines.

Remember: Contract, purpose, header and test cases are always required!

(30 pts total) Cleaning-up with local
Rewrite the mergesort code first seen in Lecture 15, using local to eliminate redundant calculations and to encapsulate (hide) the implementation details of the merge sort.

All the code you need to start with can be downloaded here: mergesort.scm.
Do not use the code from the Lecture 15 download -- the above version runs more consistently and will save you a few unnecessary headaches later.

Do not use multi-way cond implementations!
1. (15 pts) First take merge, unzip and mergesort and use local to eliminate their globally scoped helper functions as well as to eliminate redundant calculations.
2. (15 pts) Write a second mergesort2 that combines merge, unzip and mergesort into a single function. The only globally visible function should be mergesort2 itself (and the functions you wrote in Part a -- which should not be accessed by mergesort2).
(90 pts total) A More General Kind of Sorting
1. (10 pts) A Template for Sorting
  
  The merge sort above actually represents a general form of "binary comparison sorting" -- that is, sorting done by comparing the values of only two elements at a time. The merge sort above is implemented using Susan Merritt's template-based sorting algorithm (see, for instance, Dr. Wong and Dr. Nguyen's paper on an object-oriented implementation of Merritt's ideas). In a nutshell, Merritt's thesis is that all comparison-based sorting can be described as follows:
  
  Step 1: Is the list one element long?
  
  Step 2: Yes: Then the list is already sorted and we are done.
  No: The split the list into two sub-lists, sort the first sub-list, sort the second sublist and then join the two sub-lists back into the final resultant list.
  
  This is called a "divide-and-conquer" algorithm because it conquers the whole problem (sorting the whole list) by dividing it into smaller problems and conquering those individually.
  
  It is called a "template pattern" because the split-sort-sort-join constitutes a programmatic template that all binary comparison sorting algorithms follow.
  
  In the above algorithm, the check for the list being one element long occurs when the process is delegated to the helper function. The empty? case of the helper is only called when the list is one element long.
  
  The unzip function performs the splitting of the list into two sub-lists. These sub-lists are then sorted.
  
  The merge function joins the two sorted sub-lists back into a single, sorted list.
  
  In general however, to sort a list, we just need to create a matching pair of functions, one to split the list into two pieces and another to join the split lists back together again after they have been sorted.
  
  I recommend that you run the Java applet demo on the above link to see an animation of template pattern sorting. When the demo comes up, select "Graphical Ordering", "Graphic Split/Join" and "Merge Sort". Click the "Build" button to create a random set of values, then click "Sort" to see them get sorted. You may want to increase the "Compare Delay" and "Sort Delay" to slow the animation down. Some people prefer to see the animation using bars rather than dots. The split lists (arrays actually, here) show up as different colors and the white flashing values are the ones being compared in size at any given moment.
  
  To accomplish this, we need to modify the mergesort function from above (not mergesort2 -- why not?) to be more general. We will do this by first defining a structure, called Sorter, that holds two functions:
  
  ;; Sorter is structure that holds the split and join functions needed for template pattern sorting.
  ;; where split and join have the following abstract contracts
  ;; split: list-of-any --> (list list-of-any list-of-any)
  ;; join: list-of-any list-of-any --> list-of-any
  ;; (make-Sorter function function)
  (define-struct Sorter split join)
  
  So here's what you're going to do:
  
  -- Copy your mergesort code and rename it from "mergesort" to "sort".
  -- Modify sort to take a second parameter of type Sorter (I called mine "sorter").
  -- Modify sort to replace the references to unzip and merge with (Sorter-split sorter) and (Sorter-join sorter).
  -- Instantiate a Sorter structure called "merger" with unzip and merge as the values for its two fields.
  -- Run the same test cases as you did for mergesort, but with Sorter and the Sorter structure.
  
  Notice how sort contains 100% invariant code now, because all new implementation of sorting will simply be new instantiations of the Sorter structure. Implementing a sorting algorithm has been reduced to simply writing a split/join function pair.
2. (20 pts) Insertion Sort
  
  To prove the above conjecture, let's try another implementation of sorting. All we have to do is to specify a split function and a join function that follow the contracts given in the definition of the Sorter structure above.
  
  DO NOT TOUCH THE SORT FUNCTION CODE!!
  
  This implementation is called an "insertion sort" because it is done by inserting single values into an already sorted list of values. Many people sort a hand of playing cards by doing an insertion sort.
  
  Here are the specifications for the split and join functions:
  
  Split: Take a list and return a list of two lists, where the first list is a single element list with the original list's first and the second list is the rest of the original list.
  Join: Take two lists of the form described above and assume that the second list is already sorted. Return a single list which is the second list with the single element from the first list inserted into its correct location, e.g. {4} {2 3 5 6} --> {2 3 4 5 6}.
  
  See the Java applet demo described above to see the insertion sorting in action. Just select "Insertion Sort".
  
  For the record, insertion sort is refered to as an "easy-split/hard-join" algorithm because the split function is trivial and the join function is non-trivial. Merge sort is also an easy-split/hard-join algorithm, though that is not as obvious because the split function is trivial if you are using arrays, but when you are using lists as we are here, the split (unzip) is non-trivial, but still easier than the join (merge).
  
  -- Write the above described split and join functions, calling them insert-split and insert-join respectively. Recommendation: First write a stand-alone function that does an ordered insert of a value into an ordered list, test it, then attempt to write the split function.
  -- Instantiate a Sorter structure called "insertioner" with those two functions.
  -- Test sort using insertioner using the same test cases as you did for merger.
3. (20 pts) An Even Better Insertion Sort
  
  Did it bother you that in both the merge sort and insertion sort above, that their respective split and join functions were completely visible to the outside world, even though they are never used outside of the confines of the Sorter structure? Sure it did.
  
  What we need to do is to encapsulate the split and join functions inside of their Sorter instantiations.
  
  We could use local to do this, but still creates an unnecessary naming of the split and join functions. We don't need to name the functions ever because once they are inserted into the Sorter structure, they can be referenced as (Sorter-split structureName) and (Sorter-join structureName).
  
  A lambda function is what we need -- a nameless function.
  
  -- Create a new Sorter instantiation called "insertioner2" that uses lambda functions and no globally scoped functions.
  
  Your code should look like this:
  
  (define insertioner2
            (make-Sorter
                    (lambda (lon)
                             ...)
                    (lambda (lon1 lon2)
                             ...)))
  
  Your sorters are now fully encapsulated.
  
  You'll encounter one problem though: in Scheme there is no apparent way for a lambda function to refer to itself. This causes problems with recursive functions like the join function in insertion sorting. To refer to the join function, use the fact that the defined name of the Sorter instantiation, insertioner2, is in scope inside the body of the the join lambda function. Thus, to make the recursive call, write ((Sorter-join insertioner2) ... ... ).
  
  -- Test insertioner2 with the sort function test cases as before.
4. (20 pts) And a Better Merge Sort Too!
  
  -- Write a new merge sort ("merger2") that is also a Sorter structure like merger, but that uses lambda functions for split and join..
  
  Be sure to encapsulate any helper functions inside merger2 by defining them inside of a local. If the helper needs to be recursive, use a regular function definition
  (as we've always done before) for it.
5. (15 pts) Can't Get No Satisfaction!
  
  Whoa! We've been barreling down our crusade for invariant behavior and encapsulated code, but we ran right past something! The comparison function, "<" or ">" used in the insertion and merge sorts determines whether the sorting is ascending or descending. But does the sorting process actually "care" which comparison function is being used?
  
  I smell an abstraction coming here....
  
  What if sort took an abstract comparison function as an input? sort doesn't actually use the comparison function, but it could simply pass it on to the split and join functions who might -- note that since the sort adheres to the abstract contract for split and join, that all split and join functions would have to take a comparison function even if they did not use it.
  
  The contract for the comparison function is the same as for "<" and ">":
  
  comp: number number --> boolean
  
  Actually, we can expand the contract to
  
  comp: any any --> boolean
  
  You'll see why in a minute...heh, heh...:-)
  
  -- Copy your sort, merger2, insertioner2 and Sorter definitions over to a new DrScheme file (plus any other documentations you need).
  -- Modify the functions to accept a comparison function as an input. Note that this will entail modifying the abstract contract for split and join. Are split and join still limited to processing list-of-nums?
  -- Test your new code with an expanded test suite that tests both ascending and descending sorts.
6. (5 pts) Sorting Other Things
  
  With our new contract for the comparison operator, our quest for abstraction has bought us something for free:
  
  -- Use the string comparison operators, string<? and string>?, to show that you can alphabetically sort a list of strings (ascending and descending, of course) using any of your Sorters.
  
  THIS SHOULD INVOLVE NO NEW CODE, JUST NEW TEST CASES!!!
  
  Be sure your test cases probe the ability to distinguish more than just differences in the first character in a string, e.g. "Stephen" vs. "Steven"..
  
  By pushing the abstractions by way of enforcing encapsulations, we have completely decoupled the sorting process from what is being sorted.
  
  Ain't abstraction wonderful?
7. (Extra Credit 30 pts) Bubble Sort
  
  Bubble sorting is a sorting process where one removes the first element of a list and compares it against the second element. Whichever element is smaller (larger) is then compared against the third element in the list. The other element is put back into the list at the second element's position. This process is repeated until the end of the list is reached. The element that is still being carried forward was then the smallest (largest) element in the list and has been removed from the list. This processing is called "bubbling" the data.
  
  In a template pattern sorting, that remaining element is the one that is split off into its own sub-list. The remainder of the original list is the other sub-list.
  
  The splitting can be done either by a tail-recursive method that utilizes two accumulators, or a non-tail-recursive method that uses one accumulator.
  
  The join process is thus a trivial appending of the two sub-lists together.
  
  Bubble sort a "hard-split/easy-join" algorithm as you can see.
  
  -- Write a Sorter instantiation for bubble sort called "bubbler".

120 points total + 30 pts Extra Credit.

Drop-off homework here: ftp://comp210@www.exciton.cs.rice.edu/comp210/dropoff/hw07