|
Comp210: Principles of Computing and Programming
Fall 2004 -- Homework
#7
|
All problems will require all (applicable) steps from the design recipe,
including a template. Before you tackle the homework, remind yourself of
our General Advice,
Advice on Homeworks (in particular:
staple
and provide questions), and the
Grading Guidelines.
Remember: Contract, purpose, header and test cases
are always required!
- (30 pts total) Cleaning-up with
local
Rewrite the mergesort code first seen in
Lecture 15, using local to
eliminate redundant calculations and to encapsulate (hide) the
implementation details of the merge sort.
All the code you need to start with can be downloaded here:
mergesort.scm.
Do not use the code from the Lecture 15 download -- the above
version runs more consistently and will save you a few unnecessary
headaches later.
Do not use multi-way cond implementations!
- (15 pts) First take merge,
unzip and
mergesort
and use local to
eliminate their globally scoped helper functions as well as to eliminate
redundant calculations.
- (15 pts) Write a second
mergesort2
that combines merge,
unzip and
mergesort
into a single function. The only globally visible function
should be mergesort2
itself (and the functions you wrote in Part a -- which should not
be accessed by mergesort2).
- (90 pts total) A More General Kind of Sorting
- (10 pts) A Template for Sorting
The merge sort above actually represents a general form of "binary
comparison sorting" -- that is, sorting done by comparing the values of
only two elements at a time. The merge sort above is implemented using
Susan Merritt's template-based sorting algorithm (see, for
instance,
Dr. Wong and Dr. Nguyen's paper on an object-oriented implementation
of Merritt's ideas). In a nutshell, Merritt's thesis is that all
comparison-based sorting can be described as follows:
Step 1: Is the list one element long?
Step 2: Yes: Then the list is already sorted and we are done.
No: The split the list
into two sub-lists,
sort
the first sub-list,
sort
the second sublist and then join
the two sub-lists back into the final resultant list.
This is called a "divide-and-conquer" algorithm because it
conquers the whole problem (sorting the whole list) by dividing it into
smaller problems and conquering those individually.
It is called a "template pattern" because the
split-sort-sort-join constitutes a programmatic template that all binary
comparison sorting algorithms follow.
In the above algorithm, the check for the list being one element long
occurs when the process is delegated to the helper function. The
empty?
case of the helper is only called when the list is one element
long.
The unzip function
performs the splitting of the list into two sub-lists. These sub-lists
are then sorted.
The merge function
joins the two sorted sub-lists back into a single, sorted list.
In general however, to sort a list, we just need to create a matching
pair of functions, one to split the list into two pieces and another to
join the split lists back together again after they have been sorted.
I recommend that you run the Java applet demo on the above link to see
an animation of template pattern sorting. When the demo comes up, select
"Graphical Ordering", "Graphic Split/Join" and "Merge Sort". Click the
"Build" button to create a random set of values, then click "Sort" to
see them get sorted. You may want to increase the "Compare Delay" and
"Sort Delay" to slow the animation down. Some people prefer to see the
animation using bars rather than dots. The split lists (arrays actually,
here) show up as different colors and the white flashing values are the
ones being compared in size at any given moment.
To accomplish this, we need to modify the
mergesort
function from above (not
mergesort2
-- why not?) to be
more general. We will do this by first defining a structure, called
Sorter, that holds two
functions:
;; Sorter is structure that
holds the split and join functions needed for template pattern sorting.
;; where split and join have the following abstract contracts
;; split: list-of-any --> (list list-of-any list-of-any)
;; join: list-of-any list-of-any --> list-of-any
;; (make-Sorter function function)
(define-struct Sorter split join)
So here's what you're going to do:
-- Copy your mergesort
code and rename it from "mergesort"
to "sort".
-- Modify sort to take
a second parameter of type
Sorter
(I called mine "sorter").
-- Modify sort to
replace the references to unzip
and merge with
(Sorter-split sorter) and
(Sorter-join sorter).
-- Instantiate a Sorter
structure called "merger"
with unzip and
merge
as the values for its two fields.
-- Run the same test cases as you did for
mergesort, but with
Sorter and the
Sorter structure.
Notice how
sort
contains 100% invariant code now, because all new implementation of
sorting will simply be new instantiations of the
Sorter
structure. Implementing a sorting algorithm has been reduced to
simply writing a split/join function pair.
- (20 pts) Insertion Sort
To prove the above conjecture, let's try another implementation of
sorting. All we have to do is to specify a split function and a join
function that follow the contracts given in the definition of the
Sorter
structure above.
DO NOT TOUCH THE SORT
FUNCTION CODE!!
This implementation is called an "insertion sort" because it is done
by inserting single values into an already sorted list of values. Many
people sort a hand of playing cards by doing an insertion sort.
Here are the specifications for the split and join functions:
Split: Take a list and return a list of two lists, where the
first list is a single element list with the original list's
first
and the second list is the
rest
of the original list.
Join: Take two lists of the form described above and assume
that the second list is already sorted. Return a single list which is
the second list with the single element from the first list inserted
into its correct location, e.g. {4} {2 3 5 6} --> {2 3 4 5 6}.
See the Java applet demo described above to see the insertion sorting
in action. Just select "Insertion Sort".
For the record, insertion sort is refered to as an
"easy-split/hard-join" algorithm because the split function is trivial
and the join function is non-trivial. Merge sort is also an
easy-split/hard-join algorithm, though that is not as obvious because
the split function is trivial if you are using arrays, but when you are
using lists as we are here, the split (unzip) is non-trivial, but still
easier than the join (merge).
-- Write the above described split and join functions, calling them
insert-split and
insert-join
respectively. Recommendation: First write a stand-alone
function that does an ordered insert of a value into an ordered list,
test it,
then attempt to write the split function.
-- Instantiate a Sorter structure called "insertioner"
with those two functions.
-- Test sort using
insertioner
using the same test cases as you did for
merger.
- (20 pts) An Even Better Insertion Sort
Did it bother you that in both the merge sort and insertion sort
above, that their respective split and join functions were completely
visible to the outside world, even though they are never used outside of
the confines of the Sorter
structure? Sure it did.
What we need to do is to encapsulate the split and join functions
inside of their Sorter
instantiations.
We could use local to
do this, but still creates an unnecessary naming of the split and join
functions. We don't need to name the functions ever because once they
are inserted into the Sorter
structure, they can be referenced as
(Sorter-split
structureName) and
(Sorter-join
structureName).
A lambda function is
what we need -- a nameless function.
-- Create a new Sorter
instantiation called "insertioner2"
that uses lambda functions and no globally scoped functions.
Your code should look like this:
(define insertioner2
(make-Sorter
(lambda (lon)
...)
(lambda (lon1 lon2)
...)))
Your sorters are now fully encapsulated.
You'll encounter one problem though: in Scheme there is no apparent
way for a lambda function
to refer to itself. This causes problems with recursive functions like
the join function in insertion sorting. To refer to the join function,
use the fact that the defined name of the
Sorter
instantiation, insertioner2,
is in scope inside the body of the the join
lambda
function. Thus, to make the recursive call, write
((Sorter-join insertioner2) ... ... ).
-- Test insertioner2
with the sort function
test cases as before.
- (20 pts) And a Better Merge Sort Too!
-- Write a new merge sort ("merger2")
that is also a Sorter
structure like merger,
but that uses lambda
functions for split and
join..
Be sure to encapsulate any helper functions inside
merger2
by defining them inside of a
local. If the helper needs to be recursive, use a regular
function definition
(as we've always done before) for it.
- (15 pts) Can't Get No Satisfaction!
Whoa! We've been barreling down our crusade for invariant behavior and
encapsulated code, but we ran right past something! The comparison
function, "<" or ">" used in the insertion and merge sorts determines
whether the sorting is ascending or descending. But does the sorting
process actually "care" which comparison function is being used?
I smell an abstraction coming here....
What if sort took an
abstract comparison function as an input?
sort
doesn't actually use the comparison function, but it could simply pass
it on to the split and join functions who might -- note that since the
sort adheres to the
abstract contract for split and join, that all split and join functions
would have to take a comparison function even if they did not use it.
The contract for the comparison function is the same as for "<" and
">":
comp: number number -->
boolean
Actually, we can expand the contract to
comp: any any --> boolean
You'll see why in a minute...heh, heh...:-)
-- Copy your sort,
merger2,
insertioner2 and
Sorter
definitions over to a new DrScheme file (plus any other
documentations you need).
-- Modify the functions to accept a comparison function as an input.
Note that this will entail modifying the abstract contract for split and
join. Are split and join still limited to processing list-of-nums?
-- Test your new code with an expanded test suite that tests both
ascending and descending sorts.
- (5 pts) Sorting Other Things
With our new contract for the comparison operator, our quest for
abstraction has bought us something for free:
-- Use the string comparison operators, string<? and string>?, to show
that you can alphabetically sort a list of strings (ascending and
descending, of course) using any of your
Sorters.
THIS SHOULD INVOLVE NO NEW CODE, JUST NEW TEST CASES!!!
Be sure your test cases probe the ability to distinguish more than
just differences in the first character in a string, e.g. "Stephen" vs.
"Steven"..
By pushing the abstractions by way of enforcing encapsulations, we
have completely decoupled the sorting process from what is being sorted.
Ain't abstraction wonderful?
- (Extra Credit 30 pts) Bubble Sort
Bubble sorting is a sorting process where one removes the first
element of a list and compares it against the second element. Whichever
element is smaller (larger) is then compared against the third element
in the list. The other element is put back into the list at the second
element's position. This process is repeated until the end of the list
is reached. The element that is still being carried forward was then the
smallest (largest) element in the list and has been removed from the
list. This processing is called "bubbling" the data.
In a template pattern sorting, that remaining element is the one that
is split off into its own sub-list. The remainder of the original list
is the other sub-list.
The splitting can be done either by a tail-recursive method that
utilizes two accumulators, or a non-tail-recursive method that uses one
accumulator.
The join process is thus a trivial appending of the two sub-lists
together.
Bubble sort a "hard-split/easy-join" algorithm as you can see.
-- Write a Sorter
instantiation for bubble sort called "bubbler".
120 points total + 30 pts Extra Credit.
Last Revised
Saturday, 09-Oct-2004 01:35:05 CDT
©2004 Stephen Wong and Dung Nguyen