COMP 210 Lab 9: Sorting Code

Insertion Sort Code, Insertion Sort Analysis, Merge Sort Code, Merge Sort Analysis, Quicksort Code,


Insertion Sort

;; iSort: list-of-numbers --> list-of-numbers
;; Return a list with all elements of nums, in ascending order.
;; Uses an insertion-sort.
;;
(define (iSort nums)
  (cond [(empty? nums) empty]
        [(cons? nums)  (insert (first nums) (iSort (rest nums)))]))

;; insert: number, list-of-numbers --> list-of-numbers
;; Return an ascending list with the elements of already-sorted
;;   and also new, inserted into the correct (ascending) place.
;;
;; Pre-condition: Already-sorted must be in ascending order.
;; 
(define (insert new already-sorted)
  (cond [(empty? already-sorted) (list new)]
        [(cons? already-sorted) 
         (cond [(< new (first already-sorted)) (cons new already-sorted)]
               [else (cons (first already-sorted)
                           (insert new (rest already-sorted)))])]))


Analyzing Insertion Sort

To sort a list of n items, Insertion-sort does the following:

  1. insert an item into the empty list,
  2. insert an item into a list of length 1,
  3. insert an item into a list of length 2,
  4. ...
  5. insert an item into a list of length n-1.
Approximately how many steps, at most, could it take to insert an item into a list of length k?

As a crude overestimate, consider insertion sort as doing n repetitions of the task "insert an item into t alist of length at most n". So roughly, you could say that insertion sort, given a list of n items, takes about n2 steps.

You can see this is only overestimating by about a factor of two:

  n      XXXXXXXXXX
+ n-1    XXXXXXXXX
+ n-2    XXXXXXXX
+ n-3    XXXXXXX
+ ...      ...
+ 3      XXX
+ 2      XX
+ 1      X
----    ---------------
=        hey, that's roughly half of a nxn square!

Of course you don't know how long it takes the computer to do exactly one 'step' (a term we are using fuzzily), but do the times of insertion sort suggest that, in the worst case, indeed it takes on the order of n2 time?


Merge Sort Code

;; mSort: list-of-number --> list-of-number
;; Return a list with the same elements of lon, in ascending order.
;;
(define (mSort lon)
    (cond [(length<=1? lon) lon]
          [else (local {(define two-parts (unzip lon))
                        (define one-part (first two-parts))
                        (define other-part (second two-parts))}
                  (merge-two (mSort one-part)
                             (mSort other-part)))]))


;; unzip: list-of-x --> (list list-of-x list-of-x))
;; Return two lists, each containing every-other element of lst
;; (in unspecified order).


;; (merge-two l1 l2)
;; l1, l2 are ascending lists of numbers
;; return a single ascending list of the numbers of of l1,l2.
;;
;; Example:
;;    (merge-two (list 3 8 11) (list 1 3 4 9 29))
;;  = (list 1 3 3 4 8 9 11 29)
;;
(define (merge-two l1 l2)
  (cond [(and (empty? l1) (empty? l2)) empty]
        [(and (empty? l1) (cons?  l2)) l2]
        [(and (cons?  l1) (empty? l2)) l1]
        [(and (cons?  l1) (cons?  l2))
         (cond [(> (first l1) (first l2))
                (cons (first l2) (merge-two l1 (rest l2)))]
               [else (cons (first l1) (merge-two (rest l1) l2))])]))
                                                  

Analyzing Merge Sort

Now consider merge sort. Given a list of n items, what happens? It takes about n steps to unzip the list. Furthermore, it takes about n steps to call merge-two (in merging two sorted lists, each element is looked at once). So altogether, mergesorting a list of n elements needs about 3n steps, plus however many steps are involved in the two recursive calls.

So:
3n + 2 * [steps to mergesort n/2 elements]
= 3n + 2*[ 3(n/2) + 2*[steps to mergesort n/4 elements]]
= 3n + 2*(3n/2) + 4[steps to mergesort n/4 elements]
= 3n + 3n + 4*[3(n/4) + 2[steps to mergesort (n/8) elements]]
= 3n + 3n + 3n + 8[steps to mergesort (n/8) elements] = ...
= 3n + 3n + 3n + ... + 3n.
But how many times are in the "..."? If you think about it, this is asking, how many times can you double 2, until you reach n? (Since when n/(2??) is 1 element, then we reach the mergesort's base case. The answer is, log2(n). (This is actually the definition of log2(n).)

Thus we conclude, it takes = 3n log2(n) steps, for mergesort to sort a list of n elements. (Read here for justification of why the particular base of the logarithm doesn't really matter for our purposes.) The factor of 3 isn't important; the part to remember is that roughly, merge-sort takes on the order of n logn steps. Do the timings support this estimate?


Quicksort

Like mergesort, quicksort works on the idea of breaking its input into two parts and recurring. But for quicksort, the two parts are "the small numbers" and "the big numbers"; the insight is that combining these to lists is a cinch -- you just append them.

Of course, this raises the question, is 10 a big number or a small one? Well, it depends on the list, of course. Quicksort classifies numbers as bigger or smaller than a "pivot" value, and the question becomes, what to choose as a pivot? The easiest approach is to choose the first number in the list; everything smaller than that is considered small, and everything bigger than that is considered big.
(You might ponder more sophisticated ideas of selecting a pivot, to perhaps bypass quicksort's performance on already-sorted lists.)

;; qSort: list-of-numbers --> list-of-numbers
;; Return a list with all elements of nums, in ascending order.
;;
(define (qSort nums)
    (cond [(empty? nums) empty]
          [else      
           (local [(define pivot (first nums))]
             (append (qSort (filter (lambda (n) (< n pivot)) nums))
                            (filter (lambda (n) (= n pivot)) nums)  ; DON'T recur here!
                     (qSort (filter (lambda (n) (> n pivot)) nums))))]))

Analyzing Quick Sort

We won't give any analysis, beyond observing that when the pivot tends to split the list half : half, the running time behaves like mergesort, but if it only splits the list 1 : n-1 then the running time behaves like insert-sort. Hopefully the former happens more often. (What if the split always happens to be one-third : two-thirds?)