
Quantifying algorithms

Comparing insertion sort and mergesort

A couple of motivating code examples: insertion sort and merge sort. (We'll reconsider quicksort later.)

;;;;;;;;;;;;;;;;;;;;;;;;;;;; Insertion Sort ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; iSort: list-of-numbers --> list-of-numbers
;; Return a list with all elements of nums, in ascending order.
;; Uses an insertion-sort.
;;
(define (iSort nums)
  (cond [(empty? nums) empty]
        [(cons? nums)  (insert (first nums) (iSort (rest nums)))]))

;; insert: number, list-of-numbers --> list-of-numbers
;; Return an ascending list containing new and all the elements of
;;   already-sorted, with new inserted into the correct (ascending) place.
;;
;; Pre-condition: Already-sorted must be in ascending order.
;; 
(define (insert new already-sorted)
  (cond [(empty? already-sorted) (list new)]
        [(cons? already-sorted) 
         (cond [(< new (first already-sorted)) (cons new already-sorted)]
               [else (cons (first already-sorted)
                           (insert new (rest already-sorted)))])]))
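
For concreteness, a couple of examples of what these functions produce (added here as illustration, in the same spirit as the merge-two example below):

;; For example:
;;    (insert 23 (list 11 33 50))  =  (list 11 23 33 50)
;;    (iSort (list 33 11 50 23))   =  (list 11 23 33 50)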



                                           
;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Merge Sort ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; mSort: list-of-number --> list-of-number
;; Return a list with the same elements as lon, in ascending order.
;;
(define (mSort lon)
    (cond [(length<=1? lon) lon]
          [else (local {(define two-parts (unzip lon))
                        (define one-part (first two-parts))
                        (define other-part (second two-parts))}
                  (merge-two (mSort one-part)
                             (mSort other-part)))]))
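
(mSort relies on a small helper, length<=1?, that isn't shown in these notes; one straightforward definition, filled in here for completeness:)

;; length<=1?: list-of-number --> boolean
;; Return true if lon has fewer than two elements.
;;
(define (length<=1? lon)
  (or (empty? lon) (empty? (rest lon))))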


;; merge-two: list-of-number list-of-number --> list-of-number
;; l1, l2 are ascending lists of numbers.
;; Return a single ascending list of the numbers of l1 and l2.
;;
;; Example:
;;    (merge-two (list 3 8 11) (list 1 3 4 9 29))
;;  = (list 1 3 3 4 8 9 11 29)
;;
(define (merge-two l1 l2)
  (cond [(and (empty? l1) (empty? l2)) empty]
        [(and (empty? l1) (cons?  l2)) l2]
        [(and (cons?  l1) (empty? l2)) l1]
        [(and (cons?  l1) (cons?  l2))
         (cond [(> (first l1) (first l2))
                (cons (first l2) (merge-two l1 (rest l2)))]
               [else (cons (first l1) (merge-two (rest l1) l2))])]))

                                                  
;; unzip: list-of-x --> (list list-of-x list-of-x)
;; Return two lists, each containing every-other element of lst
;; (in unspecified order).
;;
(define (unzip lst) (unzip-help lst empty empty))


;; unzip-help: list-of-x list-of-x list-of-x --> (list list-of-x list-of-x)
;; Return two lists, each containing every-other element of lst
;; and the elements of so-far1 (so-far2) respectively.
;;
(define (unzip-help lst so-far1 so-far2)
  (cond [(empty? lst) (list so-far1 so-far2)]
        [(cons?  lst) (unzip-help (rest lst) so-far2 (cons (first lst) so-far1))]))
                      ; Fancy footwork: swap order of so-far1, so-far2.
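
To see what unzip produces (an illustration added here; recall the contract leaves the order unspecified):

;; Example:
;;    (unzip (list 1 2 3 4 5))
;;  = (list (list 4 2) (list 5 3 1))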

In a Comp210 lab we saw empirical tests showing that these two sorts behave differently, and we discussed the reasons there. Today we'll look at the tools needed to formalize those observations.

For a function "insert", we'll consider the running time of iSort -- a function tiSort, which takes a list, and returns how long iSort runs on that input … on a 1GHz sparc20 with 512MB RAM, no other applications running besides the OS …

Suppose:

t_iSort( (list 50 23) )       =   7ns
t_iSort( (list 11 50 23) )    =  19ns
t_iSort( (list 33 11 50 23) ) =  31ns

t_iSort( (list 11 23 33 50) ) =   8ns
t_iSort( (list 50 33 23 11) ) =  60ns

We can compare this to mergesort:

t_mSort( (list 50 23) )       =  11ns
t_mSort( (list 11 50 23) )    =  15ns
t_mSort( (list 33 11 50 23) ) =  19ns

t_mSort( (list 11 23 33 50) ) =  20ns
t_mSort( (list 50 33 23 11) ) =  19ns
(These are fictitious but representative numbers)
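
If you want to gather numbers like these yourself, one rough approach is full Racket's time form on large random inputs (a sketch -- time isn't part of the student languages used for the code above, and the numbers will of course depend on your machine):

;; In full Racket:
(define test-input (build-list 2000 (lambda (i) (random 10000))))
(time (iSort test-input))   ; prints cpu, real, and gc times, in milliseconds
(time (mSort test-input))   ; same input, so the two sorts compare fairly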

We want to be able to compare these two sorts, to arrive at a general conclusion. There are several glitches:

- The measured times depend on the particular machine (and compiler, and whatever else it was doing), not just on the algorithm.
- Even on one machine, the time depends on the particular input, not merely its size: the two four-element lists above took 8ns and 60ns.
- A handful of measurements on small inputs doesn't tell us how the running time grows on large inputs, which is what we really care about.

We answer each of these concerns in turn (roughly: count abstract steps rather than nanoseconds; take the worst case over all inputs of a given size; and compare how running times grow, rather than their exact values). We want general answers, so we can also analyze the repeated-squaring algorithms (multiple versions) from the homeworks, as compared to other algorithms for exponentiation.

Some of these approximations are definite trade-offs of accuracy vs keeping your model simple.

For these latter two reasons, we'll develop a formal notion of comparing (the growth of) two functions.
[We'll be talking about functions for a while now, having shifted the topic from algorithms, to their running times (functions), to their worst-case running times on inputs of a given size.]

Defining Big-Oh

Intuition: f = O(g) (pronounced ``f is big-Oh of g'') means "f ≤ g, up to a constant, ignoring small inputs"

Definition: f ∈ O(g) iff:
∃ c, n₀ such that ∀ n > n₀: |f(n)| ≤ c⋅|g(n)|.

Examples

For instance, 3n² + 10n ∈ O(n²): take c = 4 and n₀ = 10, since 3n² + 10n ≤ 4n² whenever n > 10. Similarly n ∈ O(n²), but n² ∉ O(n): no single constant c makes n² ≤ c⋅n hold for all large n.
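
You can even sanity-check a candidate pair (c, n₀) mechanically. Here is a small sketch (the helper big-oh-holds-up-to? is made up for illustration, not part of these notes); a finite check can refute a claimed pair, though it can never prove f ∈ O(g):

;; big-oh-holds-up-to?: (number -> number) (number -> number) number number number --> boolean
;; Check that |f(n)| <= c*|g(n)| for every integer n with n0 < n <= limit.
;;
(define (big-oh-holds-up-to? f g c n0 limit)
  (andmap (lambda (n) (<= (abs (f n)) (* c (abs (g n)))))
          (filter (lambda (n) (> n n0))
                  (build-list limit add1))))

;; Example:
;;    (big-oh-holds-up-to? (lambda (n) (+ (* 3 n n) (* 10 n)))
;;                         (lambda (n) (* n n))
;;                         4 10 1000)
;;  = true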

Comments

(Book uses "k" instead of "n0"; we are usually interested in functions in N+→N+ rather than ℜ→ℜ. We'll often ignore the absolute values, for the same reason. And there is a fair amount of play in the details &emdash; we can have replace strict-inequalities with non-strict ones and vice-versa, without actually changing anything.)

Abuse alert: Although people write "f = O(g)", really O(g) is a set of functions, and we are saying ``f ∈ O(g)''. O(g) is the set of all functions that g is no less than … up to a constant, ignoring small inputs.

We often use n to mean "the size of the input". Be careful: when sorting, n is naturally the length of the list; but when the input is a single number (as in the exponentiation algorithms), its "size" usually means the number of digits (or bits) needed to write it down, not the number's value.

≤ is to O(•) as ≥ is to Ω(•) as = is to Θ(•). That is, these are useful (respectively) for expressing upper bounds, lower bounds, and tight bounds
(with the usual caveats: up to a constant factor, ignoring small inputs).

Th'm: the running time of any comparison-based¹ sort is Ω(n log n).
This is a very strong result — you can't be substantially cleverer than mergesort! We will show this later, after covering a bit of counting and the pigeon-hole principle.

Less common are little-oh (omicron, ο) and little-omega (ω); these are the companions of strict < and >, respectively.

Note that ``f ∈ O(g)'' is close to saying that ∃ c such that the limit as n→∞ of f(n)/g(n) is < c. It's not exactly equivalent, for two technicalities: the limit might not exist at all even when f ∈ O(g) (the ratio can oscillate while staying bounded; lim sup repairs this), and g(n) might be 0 for some n, leaving the ratio undefined.

Big-Oh and Big-Theta are also good for expressing error estimates, in a formal way:
For example, Stirling's Approximation says that

n! ≈ √(2πn)⋅(n/e)ⁿ
But what if we are working with factorials and want to be sure we have an upper bound? A more useful statement of Stirling's approximation is
n! = ( √(2πn)⋅(n/e)ⁿ ) ⋅ (1 + Θ(1/n))
or even
n! = ( √(2πn)⋅(n/e)ⁿ ) ⋅ (1 + 1/(12n) + Θ(1/n²))
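
As a quick numeric illustration (a sketch added here, not from the original notes), compare n! against the leading term of Stirling's formula; the ratio lands right where the refined statement says it should:

;; factorial: natural --> natural
(define (factorial n)
  (cond [(zero? n) 1]
        [else (* n (factorial (sub1 n)))]))

;; stirling: natural --> number
;; The leading term sqrt(2*pi*n)*(n/e)^n; pi is predefined in the student
;; languages and in racket/math, and (exp 1) is e.
(define (stirling n)
  (* (sqrt (* 2 pi n)) (expt (/ n (exp 1)) n)))

;; Example:
;;    (exact->inexact (/ (factorial 10) (stirling 10)))
;;  = 1.00836...   which is about 1 + 1/(12*10) = 1.00833..., as promised.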

Of note:


¹ At least, not without doing more than just comparisons between the objects you're sorting. But often there is a further bit of information: Suppose you are sorting N exams by score. If scores are always integers in [0,100], then you can set up 101 bins, and sort all N exams in a single pass. (``Bin sort'').
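
A sketch of that idea in full Racket (added here for illustration; the footnote's single pass is the pass over the N exams, followed by one short pass over the 101 bins):

;; bin-sort: list-of-number --> list-of-number
;; Sort scores, assumed to be integers in [0,100], by tallying them into
;; 101 bins in one pass, then reading the bins back out in order.
(define (bin-sort scores)
  (define bins (make-vector 101 0))
  (for ([s (in-list scores)])                      ; one pass over the exams
    (vector-set! bins s (add1 (vector-ref bins s))))
  (append* (for/list ([score (in-range 101)])      ; one pass over the bins
             (make-list (vector-ref bins score) score))))

;; Example:  (bin-sort (list 90 75 90 60))  =  (list 60 75 90 90)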
