Comp 210 Lab 15: Timing Sorting Algorithms

Insertion Sort vs. Mergesort, Analyzing the results


Insertion Sort vs. Mergesort

In class, we've seen two different ways of sorting numbers: insertion sort (``isort'') and mergesort (``msort''). A few moments in lecture discussed how many steps each of these algorithms takes to sort a list of n numbers. In today's lab we'll run some timing experiments, repeat the theoretical analysis, and see if the two jibe. In addition, we also offer you code for a third sorting method, called quicksort. We will use it for timing purposes, but won't discuss it otherwise.

So we have three fundamentally different approaches to sorting a list. It seems unsurprising that some might behave differently than others. Can we observe this difference? Can we provide explanations?

If your programs only sort small lists a few times, it doesn't much matter which method you use; any sort that is easy to write and get correct is fine. However, for longer lists, the difference really is huge. In Real Life (tm), sorting is an operation often performed repeatedly on very long lists. Only after we've decided that sorting is critical to having our programs run reasonably do we worry about which sorting method is fastest.

Making Test Lists

To do: In order to compare these different sorting algorithms, let's create some large lists of numbers to try them on.

  1. Write a function nums-down, where (nums-down 7) returns a descending list of 7 numbers, namely (list 7 6 5 4 3 2 1). (One possible version of each of these functions is sketched below, after this list.)

  2. Now write a function nums-up which similarly creates an ascending list of any desired length.

  3. Now write a function nums-rand, which similarly creates a list of random numbers in the range from 0 to 32767. To create a single random number in the range from 0 to 9, you could use (random 10). For larger numbers, try:

         (define max-rand 32768)
         (random max-rand)
         
    (The function random is built-in; the value 32768 happens to be the biggest useful number to pass to that function. But this could conceivably change in the future; using the place-holder max-rand in your code makes it both easier to read, and easier to maintain.)

    Note that random is certainly a function in the Scheme sense, but not in the mathematical sense (since the output is not determined by the input).

You should be able to whip out these test-data generating functions pretty easily.
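
If you get stuck, here is one possible sketch of all three functions. (This is just one approach; it assumes the Intermediate-level language from class, with empty and local available, and your own versions may look different.)

     ;; nums-down : num -> list-of-num
     ;; Returns the descending list (n n-1 ... 2 1).
     (define (nums-down n)
       (cond ((zero? n) empty)
             (else (cons n (nums-down (- n 1))))))

     ;; nums-up : num -> list-of-num
     ;; Returns the ascending list (1 2 ... n).
     (define (nums-up n)
       (reverse (nums-down n)))

     (define max-rand 32768)

     ;; nums-rand : num -> list-of-num
     ;; Returns a list of n random numbers, each between 0 and max-rand - 1.
     (define (nums-rand n)
       (cond ((zero? n) empty)
             (else (cons (random max-rand) (nums-rand (- n 1))))))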

Timing

There is a built-in function time-apply, which can be used to time how long it takes the computer to evaluate something.

We'd like to be able to evaluate (time-apply (msort big-list)), where big-list is some previously-defined list. However, this wouldn't quite achieve the effect we want. (Q: Why not?) Instead, you must give time-apply a function of zero arguments. That is, something like (time-apply msort-big-list), where msort-big-list takes no arguments, and predictably sorts big-list. (Remember, big-list must be previously defined.)

To do: Write msort-big-list and time it with time-apply. (You may want to have msort-big-list do the work of sorting, but then always return just #t, since we don't really want to see the sorted list printed out--it will be long.)
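
For instance, it might look like this (a sketch only; we assume the msort from class, the nums-rand from above, and the zero-argument style of time-apply described here--some later versions of DrScheme instead expect a procedure together with a list of arguments):

     ;; A big list to sort, defined once so every timing run sees the same data.
     (define big-list (nums-rand 1000))

     ;; msort-big-list :  -> boolean
     ;; Sorts big-list, but always returns #t so the long result isn't printed.
     (define (msort-big-list)
       (local ((define sorted (msort big-list)))
         #t))

     (time-apply msort-big-list)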

time-apply reports two numbers, both of which are times in milliseconds. The second number is the actual time elapsed. The first number is the cpu time--that is, how much time DrScheme itself used up, as opposed to time spent on your Netscape program, updating any mouse movements you made, handling any mail requests coming in, etc.

Note that the time might vary for repeated evaluations of the same expression. This is because there might be different amounts of time spent waiting to load parts of DrScheme from disk or cache memory, figuring out what memory is no longer being used and can be recycled (garbage collection), etc. You should time the same function several times to get a better idea of how much time is typically required.

To do: Okay, now we're set! Divide your lab section into three teams (Team iSort, Team mSort, Team qSort); each team is in charge of running one of the sorting algorithms on lists of size 500, 1000, and 2000, for each of the types of data (ascending, descending, random). For instance, your task might be to write a function my-task that takes no arguments and calls mSort on a list of 1000 ascending numbers, and then find (time-apply my-task).
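
Such a function might look like this (again just a sketch, assuming the msort from class and the nums-up from above):

     ;; my-task :  -> boolean
     ;; Calls msort on a list of 1000 ascending numbers; returns #t.
     (define (my-task)
       (local ((define sorted (msort (nums-up 1000))))
         #t))

     (time-apply my-task)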

To do (challenging): How would you write a function which takes a list as an argument (we'll call it nums), and returns a function which takes no arguments and calls mSort on the list nums? Hint: use local. Try it! One possible answer is sketched below.
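
Here is one possible solution sketch (assuming the msort from class):

     ;; make-sort-thunk : list-of-num -> ( -> list-of-num)
     ;; Returns a function of no arguments which, when called, msorts nums.
     (define (make-sort-thunk nums)
       (local ((define (sort-nums) (msort nums)))
         sort-nums))

For example, (time-apply (make-sort-thunk (nums-rand 1000))) would then time mergesort on a fresh random list.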

Q: On the board, your section leader will tally all the results. Which algorithm appears to be quickest for random data? Is this also the best sort to use if your data happens to be mostly-sorted? (Consider a bank, which has a list of your checks already sorted by transaction date, and then wants to sort them by check-number.)

                                 input size
                   |      500       |      1000      |      2000
                   | up  down  rand | up  down  rand | up  down  rand
    Insertion sort |                |                |
    Mergesort      |                |                |
    Quicksort      |                |                |

(To be fair, quicksort does somewhat better if your data is stored in a different manner (vectors rather than lists), as is often the case in scientific applications.)


Analyzing the Results

We can theoretically reason about how many steps insertion sort requires, as well as how many steps mergesort requires, to sort a list of n elements. Insertion sort inserts each of the n elements into an already-sorted list, and each insertion can take up to n steps, for roughly n * n = n^2 steps in all. Mergesort instead splits the list in half about log n times, and each round of merging takes about n steps, for roughly n log n steps in all.

Big-Oh (optional)

Can we get a more systematic idea of how fast these different sorting algorithms run?

So far, we've tried to get a gut feeling for the behavior of algorithms. However, there is "science" in computer science; it is important to be able to formalize these notions, without any hand-wavy arguments. This is done with the ``order-of'' concept, which has a formal description.

We say a "function f(n) is order of n2" to mean ``f(n) < n2'' along with some fine print: ``up to some constant factor, and perhaps ignoring small values of n''.

"Function f(n) is order of n2" is abbreviated by saying "f(n) is in big-Oh of n2", which is written "f(n) is in O(n2)".

Of course, there is nothing special about n^2; this generalizes to any function.

The Nitty-gritty

Enough hand-waving. What do we mean "up to some constant factor", and "small values of n"? Don't worry, there is a technical definition:

Definition: A function f(n) is in O(g(n)) if there exist numbers c and bigEnough such that for all n >= bigEnough, f(n) <= c * g(n).
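
For example, f(n) = 3n^2 + 5n is in O(n^2): take c = 4 and bigEnough = 5, since 3n^2 + 5n <= 4n^2 whenever n >= 5.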


Finally, we can say that mergesort runs in time O(n log n), and insertion sort and quicksort in time O(n^2). This refers to how long these algorithms take in the worst case. (Though these are only upper bounds, we usually try to make the strongest statement we can. Thus saying mergesort runs in time O(n^17) is true, but that's not a very strong statement.)

(We can also talk about running times in the "average" case, i.e., as an average over all possible inputs. Quicksort runs in time O(n log n) average case. Quicksort's worst case time can be improved to this bound as well if we use vectors and a better way of picking the pivot.)