COMP 210 Homework #7
Spring 2003
Due Wed. March 5, 2003 at the
start of class
All problems will require all (applicable) steps from the design recipe, including
a template. Before you tackle the homework, remind yourself of our General
Advice, Advice on Homeworks
(in particular: staple
and provide
questions), and the Grading
Guidelines.
Remember: Contract, purpose, header and test cases
are always required!
- (30 pts total) Cleaning-up with local
Rewrite the mergesort code first seen in Lecture
15, using local to eliminate redundant calculations and to encapsulate
(hide) the implementation details of the merge sort.
All the code you need to start with can be downloaded here: mergesort.scm.
Do not use the code from the Lecture 15 download -- the above version
runs more consistently and will save you a few unnecessary headaches later.
Do not use multi-way cond implementations!
- (15 pts) First take merge,
unzip and mergesort
and use local to eliminate
their globally scoped helper functions as well as to eliminate redundant
calculations.
- (15 pts) Write a second mergesort2
that combines merge, unzip
and mergesort into a single
function. The only globally visible function should be mergesort2
itself (and the functions you wrote in Part a -- which should not
be accessed by mergesort2).
- (90 pts total) A More General Kind of Sorting
- (10 pts) A Template for Sorting
The merge sort above actually represents a general form of "binary
comparison sorting" -- that is, sorting done by comparing the values
of only two elements at a time. The merge sort above is implemented using
Susan Merritt's template-based sorting algorithm
(see, for instance,
Dr. Wong and Dr. Nguyen's paper on an object-oriented implementation
of Merritt's ideas). In a nutshell, Merritt's thesis is that all comparison-based
sorting can be described as follows:
Step 1: Is the list one element long?
Step 2: Yes: Then the list is already sorted and we are done.
No:
The split the list into two sub-lists,
sort
the first sub-list, sort
the second sublist and then join the
two sub-lists back into the final resultant list.
This is called a "divide-and-conquer" algorithm because
it conquers the whole problem (sorting the whole list) by dividing it
into smaller problems and conquering those individually.
It is called a "template pattern" because the split-sort-sort-join
constitutes a programmatic template that all binary comparison sorting
algorithms follow.
In the above algorithm, the check for the list being one element long
occurs when the process is delegated to the helper function. The empty?
case of the helper is only called when the list is one element
long.
The unzip function performs
the splitting of the list into two sub-lists. These sub-lists are then
sorted.
The merge function joins
the two sorted sub-lists back into a single, sorted list.
In general however, to sort a list, we just need to create a matching
pair of functions, one to split the list into two pieces and another to
join the split lists back together again after they have been sorted.
I recommend that you run the Java applet demo on the above link to see
an animation of template pattern sorting. When the demo comes up, select
"Graphical Ordering", "Graphic Split/Join" and "Merge
Sort". Click the "Build" button to create a random set
of values, then click "Sort" to see them get sorted. You may
want to increase the "Compare Delay" and "Sort Delay"
to slow the animation down. Some people prefer to see the animation using
bars rather than dots. The split lists (arrays actually, here) show up
as different colors and the white flashing values are the ones being compared
in size at any given moment.
To accomplish this, we need to modify the mergesort
function from above (not mergesort2
-- why not?) to be more
general. We will do this by first defining a structure, called Sorter,
that holds two functions:
;; Sorter is structure that holds
the split and join functions needed for template pattern sorting.
;; where split and join have the following abstract contracts
;; split: list-of-any --> (list list-of-any list-of-any)
;; join: list-of-any list-of-any --> list-of-any
;; (make-Sorter function function)
(define-struct Sorter split join)
So here's what you're going to do:
-- Copy your mergesort
code and rename it from "mergesort"
to "sort".
-- Modify sort to take
a second parameter of type Sorter
(I called mine "sorter").
-- Modify sort to replace
the references to unzip
and merge with (Sorter-split
sorter) and (Sorter-join
sorter).
-- Instantiate a Sorter
structure called "merger"
with unzip and merge
as the values for its two fields.
-- Run the same test cases as you did for mergesort,
but with Sorter and the
Sorter structure.
Notice how sort
contains 100% invariant code now, because all new implementation of sorting
will simply be new instantiations of the Sorter
structure. Implementing a sorting algorithm has been reduced to
simply writing a split/join function pair.
- (20 pts) Insertion Sort
To prove the above conjecture, let's try another implementation of sorting.
All we have to do is to specify a split function and a join function that
follow the contracts given in the definition of the Sorter
structure above.
DO NOT TOUCH THE SORT FUNCTION
CODE!!
This implementation is called an "insertion sort" because it
is done by inserting single values into an already sorted list of values.
Many people sort a hand of playing cards by doing an insertion sort.
Here are the specifications for the split and join functions:
Split: Take a list and return a list of two lists, where the first
list is a single element list with the original list's first
and the second list is the rest
of the original list.
Join: Take two lists of the form described above and assume that
the second list is already sorted. Return a single list which is the second
list with the single element from the first list inserted into its correct
location, e.g. {4} {2 3 5 6} --> {2 3 4 5 6}.
See the Java applet demo described above to see the insertion sorting
in action. Just select "Insertion Sort".
For the record, insertion sort is refered to as an "easy-split/hard-join"
algorithm because the split function is trivial and the join function
is non-trivial. Merge sort is also an easy-split/hard-join algorithm,
though that is not as obvious because the split function is trivial if
you are using arrays, but when you are using lists as we are here, the
split (unzip) is non-trivial, but still easier than the join (merge).
-- Write the above described split and join functions, calling them
insert-split and insert-join
respectively. Recommendation: First write a stand-alone function
that does an ordered insert of a value into an ordered list, test it,
then attempt to write the split function.
-- Instantiate a Sorter structure called "insertioner"
with those two functions.
-- Test sort using insertioner
using the same test cases as you did for merger.
- (20 pts) An Even Better Insertion Sort
Did it bother you that in both the merge sort and insertion sort above,
that their respective split and join functions were completely visible
to the outside world, even though they are never used outside of the confines
of the Sorter structure?
Sure it did.
What we need to do is to encapsulate the split and join functions inside
of their Sorter instantiations.
We could use local to do
this, but still creates an unnecessary naming of the split and join functions.
We don't need to name the functions ever because once they are inserted
into the Sorter structure,
they can be referenced as (Sorter-split
structureName) and (Sorter-join
structureName).
A lambda function is what
we need -- a nameless function.
-- Create a new Sorter
instantiation called "insertioner2"
that uses lambda functions and no globally scoped functions.
Your code should look like this:
(define insertioner2
(make-Sorter
(lambda
(lon)
...)
(lambda
(lon1 lon2)
...)))
Your sorters are now fully encapsulated.
You'll encounter one problem though: in Scheme there is no apparent way
for a lambda function to
refer to itself. This causes problems with recursive functions like the
join function in insertion sorting. To refer to the join function, use
the fact that the defined name of the Sorter
instantiation, insertioner2,
is in scope inside the body of the the join lambda
function. Thus, to make the recursive call, write ((Sorter-join
insertioner2) ... ... ).
-- Test insertioner2 with
the sort function test
cases as before.
- (20 pts) And a Better Merge Sort Too!
-- Write a new merge sort ("merger2")
that is also a Sorter structure
like merger, but that uses
lambda functions for split
and join..
Be sure to encapsulate any helper functions inside merger2
by defining them inside of a local.
If the helper needs to be recursive, use a regular function definition
(as we've always done before) for it.
- (15 pts) Can't Get No Satisfaction!
Whoa! We've been barreling down our crusade for invariant behavior and
encapsulated code, but we ran right past something! The comparison function,
"<" or ">" used in the insertion and merge sorts
determines whether the sorting is ascending or descending. But does the
sorting process actually "care" which comparison function is
being used?
I smell an abstraction coming here....
What if sort took an abstract
comparison function as an input? sort
doesn't actually use the comparison function, but it could simply pass
it on to the split and join functions who might -- note that since the
sort adheres to the abstract
contract for split and join, that all split and join functions would have
to take a comparison function even if they did not use it.
The contract for the comparison function is the same as for "<"
and ">":
comp:
number number --> boolean
Actually, we can expand the contract to
comp:
any any --> boolean
You'll see why in a minute...heh, heh...:-)
-- Copy your sort, merger2,
insertioner2 and Sorter
definitions over to a new DrScheme file (plus any other
documentations you need).
-- Modify the functions to accept a comparison function as an input. Note
that this will entail modifying the abstract contract for split and join.
Are split and join still limited to processing list-of-nums?
-- Test your new code with an expanded test suite that tests both ascending
and descending sorts.
- (5 pts) Sorting Other Things
With our new contract for the comparison operator, our quest for abstraction
has bought us something for free:
-- Use the string comparison operators, string<? and string>?, to
show that you can alphabetically sort a list of strings (ascending and
descending, of course) using any of your Sorters.
THIS SHOULD INVOLVE NO NEW CODE, JUST NEW TEST CASES!!!
Be sure your test cases probe the ability to distinguish more than just
differences in the first character in a string, e.g. "Stephen"
vs. "Steven"..
By pushing the abstractions by way of enforcing encapsulations, we
have completely decoupled the sorting process from what is being sorted.
Ain't abstraction wonderful?
- (Extra Credit 30 pts) Bubble Sort
Bubble sorting is a sorting process where one removes the first element
of a list and compares it against the second element. Whichever element
is smaller (larger) is then compared against the third element in the
list. The other element is put back into the list at the second element's
position. This process is repeated until the end of the list is reached.
The element that is still being carried forward was then the smallest
(largest) element in the list and has been removed from the list. This
processing is called "bubbling" the data.
In a template pattern sorting, that remaining element is the one that
is split off into its own sub-list. The remainder of the original list
is the other sub-list.
The splitting can be done either by a tail-recursive method that utilizes
two accumulators, or a non-tail-recursive method that uses one accumulator.
The join process is thus a trivial appending of the two sub-lists together.
Bubble sort a "hard-split/easy-join" algorithm as you can see.
-- Write a Sorter instantiation
for bubble sort called "bubbler".
120 points total + 30 pts Extra Credit.
Drop-off homework here: ftp://comp210@www.exciton.cs.rice.edu/comp210/dropoff/hw07