Comp 210 Lab 15: Sorting Code

Insertion Sort Code, Insertion Sort Analysis, Merge Sort Code, Merge Sort Analysis, Quicksort Code, Generating Test Data, Doing the Timing

Insertion Sort

(define iSort
  (lambda (list-to-sort)
    (cond ((empty? list-to-sort) empty)
	  (else (insert (first list-to-sort) (iSort (rest list-to-sort)))))))

(define insert
  (lambda (new already-sorted)
     (cond
	[(empty? already-sorted) 
         (list new)]
        [else 
         (if (< new (first already-sorted))
             (cons new already-sorted)
             (cons (first already-sorted)
                   (insert new (rest already-sorted))))])))

Analyzing Insertion Sort

To sort a list of n items, Insertion-sort does the following:

insert an item into the empty list,
insert an item into a list of length 1,
insert an item into a list of length 2,
...
insert an item into a list of length n-1.

Approximately how many steps, at most, could it take to insert an item into a list of length k?

As a crude overestimate, consider insertion sort as doing n repetitions of the task "insert an item into t alist of length at most n". So roughly, you could say that insertion sort, given a list of n items, takes about n² steps.

You can see this is only overestimating by about a factor of two:

  n      XXXXXXXXXX
+ n-1    XXXXXXXXX
+ n-2    XXXXXXXX
+ n-3    XXXXXXX
+ ...      ...
+ 3      XXX
+ 2      XX
+ 1      X
----    ---------------
=        hey, that's roughly half of a nxn square!

Of course you don't know how long it takes the computer to do exactly one 'step' (a term we are using fuzzily), but do the times of insertion sort suggest that, in the worst case, indeed it takes on the order of n² time?

Merge Sort Code

;; (mSort lon)
;; lon is a list of numbers
;; return a list with the same elements in ascending order
;;
(define mSort
  (lambda (lon)
    (if (or (empty? lon) (empty? (rest lon)))
        lon
        (merge-two (mSort (every-other-odd lon))
                   (mSort (every-other-even lon))))))


;; (merge-two l1 l2)
;; l1, l2 are ascending lists of numbers
;; return a single ascending list of the numbers of of l1,l2.
;;
;; Example:
;;    (merge-two (list 3 8 11) (list 1 3 4 9 29))
;;  = (list 1 3 3 4 8 9 11 29)
;;
(define merge-two
  (lambda (l1 l2)
    (cond [(empty? l1) l2]
	  [(empty? l2) l1]
	  [(> (first l1) (first l2))
	   (cons (first l2) (merge-two l1 (rest l2)))]
	  [else (cons (first l1) (merge-two (rest l1) l2))])))

;; (every-other-odd l)
;; return a list containing every other element of l,
;; those with odd indices into the list (starting with 1).
;; E.g.  (every-other-odd (list 'a 'b 'c 'd 'e) = (list 'a 'c 'e).
;;
(define every-other-odd
   (lambda (l)
      (cond [(empty? l) empty]
            [else       (cons (first l) (every-other-even (rest l)))])))


;; (every-other-even l)
;; return a list containing every other element of l,
;; those with even indiced into the list (starting with 1).
;; E.g.  (every-other-even (list 'a 'b 'c 'd 'e) = (list 'b 'd).
;;
(define every-other-even
   (lambda (l)
      (cond [(empty? l) empty]
            [else       (every-other-odd (rest l))])))

Analyzing Merge Sort

Now consider merge sort. Given a list of n items, what happens? It takes about n steps to pull out every-other-even element, and another n steps to pull out every-other odd. Furthermore, it takes about n steps to call merge-two (in merging two sorted lists, each element is looked at once). So altogether, mergesorting a list of n elements needs about 3n steps, plus however many steps are involved in the two recursive calls.

So:
3n + 2 * [steps to mergesort n/2 elements]
= 3n + 2*[ 3(n/2) + 2*[steps to mergesort n/4 elements]]
= 3n + 2*(3n/2) + 4[steps to mergesort n/4 elements]
= 3n + 3n + 4*[3(n/4) + 2[steps to mergesort (n/8) elements]]
= 3n + 3n + 3n + 8[steps to mergesort (n/8) elements] = ...
= 3n + 3n + 3n + ... + 3n.
But how many times are in the "..."? If you think about it, this is asking, how many times can you double 2, until you reach n? (Since when n/(2^??) is 1 element, then we reach the mergesort's base case. The answer is, log₂(n). (This is actually the definition of log₂(n).)

Thus we conclude, it takes = 3n log₂(n) steps, for mergesort to sort a list of n elements. (Read here for justification of why the particular base of the logarithm doesn't really matter for our purposes.) The factor of 3 isn't important; the part to remember is that roughly, merge-sort takes on the order of n logn steps. Do the timings support this estimate?

Quicksort

Here's the basic idea of quicksort. It is vaguely similar to mergesort, in that it partitions its input into two smaller lists, recursively sorts each of those, and then puts them back together. However, it partitions them differently that mergesort. (Can you look at the code, and figure out how?)

(define qSort
  (lambda (l)
    (local [(define pivot (first l))]
           (cond [(empty? l) empty]
	         [else       (append (qSort (lesser l pivot))
			             (equal-to l pivot)
                                     (qSort (greater l pivot)))]))))

(define lesser
  (lambda (l e)
    (gather l < e)))

(define greater
  (lambda (l e)
    (gather l > e)))

(define equal-to
  (lambda (l e)
    (gather l = e)))

(define gather
  (lambda (data test? reference)
    (cond [(empty? data) empty]
	  [else (if (test? (first data) reference)
		    (cons (first data) (gather (rest data) test? reference))
		    (gather (rest data) test? reference))])))

Generating Test Lists

(define nums-down
  (lambda (n)
    (cond [(zero? n) empty]
          [else      (cons n (nums-down (sub1 n)))])))

(define nums-up
  (lambda (n)
    (local [(define help-up
              (lambda (curr)
                (if (> curr n)
                    empty
                    (cons curr (help-up (add1 curr))))))]
      (help-up 1))))


(define max-rand 32768)

(define nums-rand
  (lambda (n)
    (cond [(zero? n) empty]
          [else      (cons (random max-rand) (nums-rand (sub1 n)))])))

Doing the Timing

Here's one way to time the sort:

(local [(define big-list (nums-rand 1000))]
  (time-apply (lambda () (empty? (mSort big-list)))))

Note how it would be a drag to see a 1000-element list printed; we're just interested in the times, not the return value. So we throw in the hack of returning not the list, but empty? of the list.

A more general way to test functions: We write a function do-test, whith takes a sorting function, a function to create a data list, and the length of the desired data list. It times the function, and returns #f.

(define do-test
  (lambda (sort-fn data-fn data-len)
    (local [(define data (data-fn data-len))]
      (time-apply (lambda () (empty? (sort-fn data)))))))

(do-test iSort nums-rand 2000)
(do-test iSort nums-up 500)
;; etc