A couple of motivating code examples:
insertion sort, and merge sort.
(We'll re-consider quicksort later.)
;;;;;;;;;;;;;;;;;;;;;;;;;;;; Insertion Sort ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; iSort: list-of-numbers --> list-of-numbers
;; Return a list with all elements of nums, in ascending order.
;; Uses an insertion-sort.
;;
(define (iSort nums)
  (cond [(empty? nums) empty]
        [(cons? nums) (insert (first nums) (iSort (rest nums)))]))
;; insert: number, list-of-numbers --> list-of-numbers
;; Return an ascending list with the elements of already-sorted
;; and also new, inserted into the correct (ascending) place.
;;
;; Pre-condition: Already-sorted must be in ascending order.
;;
(define (insert new already-sorted)
  (cond [(empty? already-sorted) (list new)]
        [(cons? already-sorted)
         (cond [(< new (first already-sorted)) (cons new already-sorted)]
               [else (cons (first already-sorted)
                           (insert new (rest already-sorted)))])]))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Merge Sort ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; mSort: list-of-numbers --> list-of-numbers
;; Return a list with the same elements as lon, in ascending order.
;;
(define (mSort lon)
  (cond [(length<=1? lon) lon]
        [else (local {(define two-parts (unzip lon))
                      (define one-part (first two-parts))
                      (define other-part (second two-parts))}
                (merge-two (mSort one-part)
                           (mSort other-part)))]))
;; (merge-two l1 l2)
;; l1, l2 are ascending lists of numbers
;; return a single ascending list of the numbers of l1, l2.
;;
;; Example:
;; (merge-two (list 3 8 11) (list 1 3 4 9 29))
;; = (list 1 3 3 4 8 9 11 29)
;;
(define (merge-two l1 l2)
  (cond [(and (empty? l1) (empty? l2)) empty]
        [(and (empty? l1) (cons? l2)) l2]
        [(and (cons? l1) (empty? l2)) l1]
        [(and (cons? l1) (cons? l2))
         (cond [(> (first l1) (first l2))
                (cons (first l2) (merge-two l1 (rest l2)))]
               [else (cons (first l1) (merge-two (rest l1) l2))])]))
;; unzip: list-of-x --> (list list-of-x list-of-x)
;; Return two lists, each containing every-other element of lst
;; (in unspecified order).
;;
(define (unzip lst) (unzip-help lst empty empty))
;; unzip-help: list-of-x list-of-x list-of-x --> (list list-of-x list-of-x)
;; Return two lists, each containing every-other element of lst
;; and the elements of so-far1 (so-far2) respectively.
;;
(define (unzip-help lst so-far1 so-far2)
  (cond [(empty? lst) (list so-far1 so-far2)]
        [(cons? lst) (unzip-help (rest lst) so-far2 (cons (first lst) so-far1))]))
; Fancy footwork: swap order of so-far1, so-far2.
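mSort calls length<=1?, which these notes never define. A minimal
sketch of what it might look like, assuming it simply means "lon has
zero or one elements" (the definition below is a guess, not the
course's own code, and is written for full Racket rather than the
teaching language):

```racket
#lang racket
;; length<=1?: list-of-x --> boolean
;; HYPOTHETICAL helper: the notes use this name but never define it.
;; True exactly when lst has zero or one elements.
;; Looks at no more than the first cons cell, so it takes constant time
;; (unlike computing the full length of lst).
(define (length<=1? lst)
  (or (empty? lst)
      (empty? (rest lst))))
```

With this base case mSort terminates: unzip splits any list of two or
more elements into two strictly shorter lists.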
In a Comp210 lab we saw empirical tests showing that these two types
of sorting behave differently, and we discussed the reasons there.
Today we'll look at the tools needed to formalize these concepts.
For a function such as iSort, we'll consider its running time:
a function t_iSort, which takes a list, and returns
how long iSort runs on that input ... on a 1GHz sparc20 with 512MB
RAM, no other applications running besides the OS ...
Suppose:
t_iSort( (list 50 23) ) = 7ns
t_iSort( (list 11 50 23) ) = 19ns
t_iSort( (list 33 11 50 23) ) = 31ns
t_iSort( (list 11 23 33 50) ) = 8ns
t_iSort( (list 50 33 23 11) ) = 60ns
We can compare this to mergesort:
t_mSort( (list 50 23) ) = 11ns
t_mSort( (list 11 50 23) ) = 15ns
t_mSort( (list 33 11 50 23) ) = 19ns
t_mSort( (list 11 23 33 50) ) = 20ns
t_mSort( (list 50 33 23 11) ) = 19ns
(fictitious but representative numbers)
We want to be able to compare these two sorts,
to arrive at a general conclusion.
There are several glitches:
- The exact function is very complicated, since it is defined on
individual lists, and there are lists of every length.
Is there a more useful function, to capture the
essence of the running time?
- This is tied to one particular technology.
Isn't there something more fundamental about these
algorithms, independent of technology?
How to capture that?
- One algorithm might be much more efficient than another
in general, but (for small inputs) suffer from startup overhead
(initializing variables, bringing code into memory, etc).
In fact, those small inputs we care about least, since
they're not the inputs causing us to wait.
How can we have a notion which overlooks startup overhead?
We answer each of these concerns in turn.
We want general answers, so we can also analyze
the repeated-squaring algorithms (multiple versions) from hws,
as compared to other algorithms for exponentiation.
Some of these approximations are definite trade-offs of
accuracy vs keeping your model simple.
- The exact function is very complicated [when input is a list]
Solution: Rather than look at individual lists,
look at lists of length n.
Take the ...best-case? average-case? worst-case?
We'll take worst-case, with the theory that when we show
the worst-case isn't so bad, we have an ironclad guarantee.
(Also, average-case is much more difficult in general.)
We extend t_iSort over N:
t_iSort(n) = max_{l in R^n} t_iSort(l)
The max corresponds to worst-case. (What would best-case be? average-case?)
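One possible answer to the parenthetical question, written in the same
style as the worst-case definition. The names t_best, t_avg and the
distribution D_n are assumptions introduced here for illustration;
average-case only makes sense once some distribution over inputs of
length n has been chosen (for instance, all orderings of n distinct
numbers equally likely).

```latex
\[
t_{\mathrm{best}}(n) = \min_{l \in \mathbb{R}^n} t_{\mathrm{iSort}}(l),
\qquad
t_{\mathrm{avg}}(n) = \mathbb{E}_{l \sim D_n}\bigl[\, t_{\mathrm{iSort}}(l) \,\bigr]
\]
```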
- This is tied to one particular technology.
Solution: We might count just the number of atomic operations
made. This abstracts away OS, memory size, etc.
It does require some consensus about what is an atomic operation;
what takes 3 steps on one processor might take 7 on another.
And in two essentially-the-same algorithms, one might use 30% more
atomic operations than another (but be
Besides, this #operations is talking about machine code (technology-dependent),
not high-level source code, where we'd prefer to keep the discussion.
However, these two processors might always have a constant-factor
conversion between them:
E.g. *whenever* you see those 3 steps, you can always convert
them into (no more than) 7 steps on the other.
Thus we might be happy to take the
Solution: capture the running time, up to a constant factor.
This constant factor corresponds to running a 30% less-efficient compiler
(on average) on a 57% faster machine, or adding more cache (making
all memory accesses nearly 3 times faster), etc.
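One way to make "count the atomic operations" concrete is to agree
that a single < comparison is the atomic operation. The sketch below
is an illustration under that assumption, not part of the original
code: insert/count, iSort/count, comparisons and worst-case are
hypothetical names, it is written for full Racket rather than the
teaching language, and worst-case tries only the orderings of n
distinct numbers rather than every list in R^n.

```racket
#lang racket
;; insert/count: number, ascending list-of-numbers --> (list list-of-numbers number)
;; Like insert, but also returns how many < comparisons were made.
(define (insert/count new already-sorted)
  (cond [(empty? already-sorted) (list (list new) 0)]
        [(< new (first already-sorted)) (list (cons new already-sorted) 1)]
        [else (local {(define sub (insert/count new (rest already-sorted)))}
                (list (cons (first already-sorted) (first sub))
                      (+ 1 (second sub))))]))

;; iSort/count: list-of-numbers --> (list list-of-numbers number)
;; Like iSort, but also returns the total number of comparisons.
(define (iSort/count nums)
  (cond [(empty? nums) (list empty 0)]
        [else (local {(define sub (iSort/count (rest nums)))
                      (define ins (insert/count (first nums) (first sub)))}
                (list (first ins)
                      (+ (second sub) (second ins))))]))

;; comparisons: list-of-numbers --> number
(define (comparisons nums) (second (iSort/count nums)))

;; worst-case: natural --> number
;; Largest comparison count over all orderings of 0, 1, ..., n-1.
;; Tries n! lists, so it is only usable for small n.
(define (worst-case n)
  (apply max (map comparisons (permutations (range n)))))
```

For example, (comparisons (list 11 23 33 50)) is 3 while
(comparisons (list 50 33 23 11)) is 6, and (worst-case 4) is also 6:
under this counting, the descending list is a worst case for length 4.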
- Ignore startup overhead for small inputs.
When comparing two algorithms, we'll simply compare their
*asymptotic* (marginal) time complexity.
For these latter two concerns, we want a notion of comparing two functions.
[We'll be talking about functions for a while now, leaving
algorithms behind.]
"f is big-oh of g" means "f <= g, up to a constant, ignoring small inputs"
More formally: there exist c, n_0 such that for all n > n_0, |f(n)| <= c*|g(n)|.
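The definition can be spot-checked mechanically for a proposed pair
(c, n_0). The sketch below is only a finite test up to some limit, so
a #t answer is evidence rather than a proof; witness-holds?, p and q
are hypothetical names chosen for this illustration, and the code
assumes full Racket.

```racket
#lang racket
;; witness-holds?: (N -> number) (N -> number) number natural natural --> boolean
;; Checks |f(n)| <= c*|g(n)| for every integer n with n0 < n <= limit.
;; A finite check only: it cannot establish the claim for all n.
(define (witness-holds? f g c n0 limit)
  (for/and ([n (in-range (+ n0 1) (+ limit 1))])
    (<= (abs (f n)) (* c (abs (g n))))))

;; Illustrative functions: p(n) = 3n + 5 and q(n) = n.
(define (p n) (+ (* 3 n) 5))
(define (q n) n)

(witness-holds? p q 4 5 1000)  ; c = 4, n_0 = 5: 3n + 5 <= 4n once n > 5
(witness-holds? p q 3 5 1000)  ; c = 3 is too small: 3n + 5 > 3n always
```

The first call returns #t and the second #f, which matches the
algebra: 3n + 5 <= 4n exactly when n >= 5.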
Example:
f=10n, g=n^2.
Is f big-oh of g? g big-oh of f?
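A sketch of one way to answer both questions directly from the
definition. The witnesses c = 1, n_0 = 10 are one choice among many.

```latex
% f is big-oh of g: for n > 10 we have 10 < n, hence
\[
|f(n)| = 10n \le n \cdot n = 1 \cdot |g(n)| \quad \text{for all } n > 10 ,
\]
% so c = 1, n_0 = 10 works.
%
% g is not big-oh of f: suppose some c, n_0 gave
\[
n^2 \le c \cdot 10n \quad \text{for all } n > n_0 .
\]
% Dividing by n gives n <= 10c for all n > n_0, which fails for any
% n > max(n_0, 10c). So no such c, n_0 exists.
```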
(Book uses "k" instead of "n_0"; we are usually interested
in functions in N+->N+ rather than R->R. We'll often ignore
the absolute values, for the same reason)
Abuse alert: people write "f = O(g)"; really this is "f in O(g)", where
O(g) is the set of all functions which g is bigger than, up to a constant,
ignoring small inputs.
polynomials in general:
Sum_{i=0}^{n} a_i*x^i is O(x^n) (Which c, k?)
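One possible choice of witnesses, sketched under the assumption that
x ranges over values >= 1 (so that x^i <= x^n whenever 0 <= i <= n):

```latex
\[
\Bigl|\sum_{i=0}^{n} a_i x^i\Bigr|
\;\le\; \sum_{i=0}^{n} |a_i|\, x^i
\;\le\; \Bigl(\sum_{i=0}^{n} |a_i|\Bigr) x^n
\qquad \text{for all } x \ge 1 ,
\]
% so c = |a_0| + |a_1| + ... + |a_n| and k = 1 suffice.
```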
base of log. [leave as exercise]
sum i = O(n^2); sum i^2 = O(n^3). Prove the upper bound!
[The lower bound can be shown by calculus -- sketch.]
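A sketch of both bounds. The upper bounds replace every term by the
largest one; the lower bounds compare each term with the integral over
the unit interval just below it, which is one way to read "by
calculus".

```latex
% Upper bounds: each of the n terms is at most the last term.
\[
\sum_{i=1}^{n} i \;\le\; \sum_{i=1}^{n} n = n^2 ,
\qquad
\sum_{i=1}^{n} i^2 \;\le\; \sum_{i=1}^{n} n^2 = n^3 ,
\]
% so c = 1 works for every n >= 1.
%
% Lower bounds: i >= x and i^2 >= x^2 for x in [i-1, i], hence
\[
\sum_{i=1}^{n} i \;\ge\; \int_{0}^{n} x \, dx = \frac{n^2}{2} ,
\qquad
\sum_{i=1}^{n} i^2 \;\ge\; \int_{0}^{n} x^2 \, dx = \frac{n^3}{3} .
\]
```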
Consider a loop, where the body takes O(k) (linear) time.
This body is executed up to n times.
What is the overall running time?
What about code which has 15n^3 setup, and a loop
with n calls to an O(i) function?
The same, except there are n^3 passes through the loop?
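A sketch of the three answers, under one reading of the questions: the
i-th pass through the loop costs at most a*i steps for some constant a
(the constant a and this reading of "O(k)" and "O(i)" are assumptions
made here), and n >= 1.

```latex
% (1) n passes, pass i costing at most a*i:
\[
\sum_{i=1}^{n} a\, i = a\,\frac{n(n+1)}{2} \;\le\; a\, n^2
\quad\Longrightarrow\quad O(n^2) .
\]
% (2) 15n^3 setup plus the same loop:
\[
15 n^3 + \sum_{i=1}^{n} a\, i \;\le\; 15 n^3 + a\, n^2 \;\le\; (15 + a)\, n^3
\quad\Longrightarrow\quad O(n^3) .
\]
% (3) 15n^3 setup plus n^3 passes:
\[
15 n^3 + \sum_{i=1}^{n^3} a\, i
= 15 n^3 + a\,\frac{n^3 (n^3 + 1)}{2}
\;\le\; (15 + a)\, n^6
\quad\Longrightarrow\quad O(n^6) .
\]
```

In (2) the setup dominates the loop; in (3) the loop dominates the setup.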
big-Oh is an upper bound.
Big-omega, big-theta.
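For reference, the standard definitions in the same shape as the
big-oh definition above (stated here with c > 0, which the lower bound
needs in order to say anything):

```latex
\[
f \in \Omega(g) \iff \exists\, c > 0,\ n_0 \;\; \forall n > n_0 : \; |f(n)| \ge c\,|g(n)|
\]
\[
f \in \Theta(g) \iff f \in O(g) \ \text{and} \ f \in \Omega(g)
\]
% Big-omega is a lower bound; big-theta is a bound from both sides.
```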
We often use n to mean "the size of the input".
(be careful -- adding a list of 15 numbers, but
each number has 3000 digits?)