Higher and Higher....

Mapping

A few lectures ago (Lecture 19) we tried unsuccessfully to abstract the processes on a list. Let's try a slightly different tack here by first analyzing a few example programs:

;; squares each element in a list-of-numbers
(define (square lon)
    (cond
        [(empty? lon) empty]
        [(cons? lon) (cons (* (first lon) (first lon)) (square (rest lon)))]))

;; replaces the symbol 'junk with '__ in a list-of-symbols
(define (removeJunk los)
    (cond
        [(empty? los) empty]
        [(cons? los) (cons (cond [(symbol=? 'junk (first los)) '__]
                                 [else (first los)])
                           (removeJunk (rest los)))]))

How could we abstractly describe the above two programs, which are not unlike many list-processing functions we've written before? What is the essence of both programs?

Some things you might have noticed:

Both follow the design template -- duh.
Both return lists.
Both have the same base case.
Both have the same recursive call.
In the inductive case, both cons on a value to the result of the recursive call.
The value cons'ed on is a function of first.

In short, both functions return a list where a fixed function (squaring, or replacing 'junk) has been applied to every element in the input list. In mathematical terms, we say that both functions map another function over the list.

Map

To map a function over a set is to apply the function to every element of the set. In regular algebra:

map(f, {x0 x1 x2 x3 ... xn}) = {f(x0) f(x1) f(x2) f(x3) ... f(xn)}

Note that f is necessarily a function of one input parameter.

So, by separating the variant code from the invariant code, can we write a function that will express the abstract process of mapping? Simple--just take the above code, rip out the part that processes first, and replace it with a function call:

;; mapper:  (f: any1 --> any2) list-of-any1 --> list-of-any2
;; maps f over a-list.
(define (mapper f a-list)
    (cond
        [(empty? a-list) empty]
        [(cons? a-list) (cons (f (first a-list)) (mapper f (rest a-list)))]))

Notice how for all mapping processes, mapper is 100% invariant code.

Thus the above functions can be written simply as (for example):

(mapper sqr (list 1 2 3 4))

(mapper (lambda (x)
             (cond [(symbol=? 'junk x) '__]
                   [else x])) 
        (list 'a 'b 'junk 'c 'junk 'd))

Mapping is one of the most useful higher order functions because it enables us to think in terms of high level mapping rather than in terms of the lower level list traversal.
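To see the two calls above in action, here is a minimal sketch that restates mapper so it runs on its own. The helper name replaceJunk is made up for this example, and the results in the comments were worked out by hand.

```scheme
;; mapper: (f: any1 --> any2) list-of-any1 --> list-of-any2
;; restated from above so this example stands alone
(define (mapper f a-list)
  (cond
    [(empty? a-list) empty]
    [(cons? a-list) (cons (f (first a-list)) (mapper f (rest a-list)))]))

;; replaceJunk: symbol --> symbol
;; hypothetical helper: turns 'junk into '__ and leaves other symbols alone
(define (replaceJunk x)
  (cond [(symbol=? 'junk x) '__]
        [else x]))

(mapper (lambda (n) (* n n)) (list 1 2 3 4))          ; (list 1 4 9 16)
(mapper replaceJunk (list 'a 'b 'junk 'c 'junk 'd))   ; (list 'a 'b '__ 'c '__ 'd)
```

The only code specific to each problem is the one-element function handed to mapper; the traversal is never written again.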

Note: The function above is called "mapper" here instead of "map" because Scheme already has map built in (I wonder why?).

Question to think about: What if the function used by map depends on more than just any single element in the list? For instance, suppose it also depends on some other value defined elsewhere in the program? Can map handle this?
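One possible answer, as a sketch: a lambda may refer to any variable visible where the lambda is written, so the function handed to mapper can depend on outside values even though mapper only ever passes it one element. The name scaleAll and its factor parameter are hypothetical, chosen just for this illustration.

```scheme
;; mapper restated from above so this example stands alone
(define (mapper f a-list)
  (cond
    [(empty? a-list) empty]
    [(cons? a-list) (cons (f (first a-list)) (mapper f (rest a-list)))]))

;; scaleAll: number list-of-numbers --> list-of-numbers
;; hypothetical example: multiplies every element of lon by factor.
;; The lambda takes one input, x, but also sees factor from the
;; enclosing function.
(define (scaleAll factor lon)
  (mapper (lambda (x) (* factor x)) lon))

(scaleAll 3 (list 1 2 3))   ; (list 3 6 9)
```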

Map is pretty cool and very useful, but not all problems can be expressed in terms of mapping. We need something even more abstract...

Folding

We've done a lot of "natural recursion" or "reverse accumulation" problems so far. Is there a way to express them all in one abstract form? Let's take a look at a couple of examples:

;; sums a list-of-numbers
(define (sum lon)
    (cond
        [(empty? lon) 0]
        [(cons? lon) (+ (first lon) (sum (rest lon)))]))

;; removes all the negative values from a list-of-nums
(define (removeNeg lon)
  (cond
    [(empty? lon) empty]
    [(cons? lon) 
     (cond  [(> 0 (first lon)) (removeNeg (rest lon))]
            [else (cons (first lon) (removeNeg (rest lon)))])]))

What's the same about these two examples? What's the abstraction?

Both follow the design template--strange but true.
Both return a particular value from their empty cases.
Both inductive cases can be expressed as functions of first and the recursive call.

That is, both functions have this form, where lon = {x-N x-N-1 ... x-2 x-1}, f is some abstract function, and base is the value returned by the base case:

(f x-N (f x-N-1 (f .....(f x-2 (f x-1 base))...)))

This process is so common that it has a name: fold-right
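To make the nested form concrete, here are the two examples above unrolled by hand for short lists. The name keepNonNeg is hypothetical; it simply gives a name to the f that is buried inside removeNeg.

```scheme
;; keepNonNeg: number list-of-numbers --> list-of-numbers
;; hypothetical name for the combining function inside removeNeg:
;; drops x if it is negative, otherwise conses it onto r
(define (keepNonNeg x r)
  (cond [(> 0 x) r]
        [else (cons x r)]))

;; sum of (list 1 2 3): f is +, base is 0
(+ 1 (+ 2 (+ 3 0)))                                    ; 6

;; removeNeg of (list 1 -2 3): f is keepNonNeg, base is empty
(keepNonNeg 1 (keepNonNeg -2 (keepNonNeg 3 empty)))    ; (list 1 3)
```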

Once again, we separate the variant from the invariant and we can thus write the Scheme code for fold-right, which amounts to an implementation of Scheme's built-in foldr function (are those guys ahead of us or what?)

;; foldRight: (f: any1 any2 --> any2) any2 list-of-any1 --> any2
;; Implementation of foldr
(define (foldRight f base a-list)
  (cond
      [(empty? a-list) base]
      [(cons? a-list) (f (first a-list) (foldRight f base (rest a-list)))]))

For reverse accumulation processes, foldRight is 100% invariant code. The above examples thus become:

(equal? (sum (list 1 2 3 4 5)) (foldRight + 0 (list 1 2 3 4 5)))

(foldRight (lambda (x r)
               (cond
                   [(> 0 x) r]
                   [else (cons x r)]))
           empty 
           (list 1 -2 -3 4 -5))
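Fold-right is also general enough to swallow mapping whole. As a sketch (the name mapViaFold is hypothetical, and foldRight is restated so the example runs on its own), mapping f is just a fold-right whose combining function conses (f x) onto the result of the recursive call:

```scheme
;; foldRight restated from above so this example stands alone
(define (foldRight f base a-list)
  (cond
    [(empty? a-list) base]
    [(cons? a-list) (f (first a-list) (foldRight f base (rest a-list)))]))

;; mapViaFold: (f: any1 --> any2) list-of-any1 --> list-of-any2
;; hypothetical name: maps f over a-list using nothing but foldRight
(define (mapViaFold f a-list)
  (foldRight (lambda (x r) (cons (f x) r))
             empty
             a-list))

(mapViaFold (lambda (n) (* n n)) (list 1 2 3 4))   ; (list 1 4 9 16)
```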

You're secretly impressed, I know--you just don't show it.

"Ahhh," you say, "fold-right is all well and good, but not all algorithms are nicely expressible in terms of reverse accumulation. What about all those forward accumulation algorithms, eh? Take these, for instance:"

;; revList:  list-of-any --> list-of-any
;; reverses the elements of a list.
(define (revList a-list)
  (cond
      [(empty? a-list) empty]
      [(cons? a-list) 
           (local [(define (helper a-list acc)
                       (cond
                           [(empty? a-list) acc]
                           [(cons? a-list) (helper (rest a-list) (cons (first a-list) acc))]))]
                  (helper a-list empty))]))


;; findMax: non-empty-list-of-numbers --> number
;; finds the maximum of a non-empty list of numbers
(define (findMax nelon)
  (local [(define (helper a-list acc)
            (cond
                [(empty? a-list) acc]
                [(cons? a-list) (helper (rest a-list) 
                                    (cond
                                        [(< (first a-list) acc) acc]
                                        [else (first a-list)]))]))]
         (helper nelon (first nelon))))

So what's the essence here?

Both follow the design template -- could have guessed that one!
Both use a helper function.
The result of the helper's base case is the accumulator.
The recursive call in the helper involves a function of first and the accumulator.

The process involved with both functions above is called fold-left, where the result is of the form:

(f x-1 (f x-2 (... (f x-N-1 (f x-N base)) ...)))

Scheme has fold-left built in as the function foldl.

Separating the variant from the invariant, we get the following invariant code:

;; foldLeft: (f: any1 any2 --> any2) any2 list-of-any1 --> any2
;; Implementation of the built-in foldl
(define (foldLeft f base a-list)
  (cond
      [(empty? a-list) base]
      [(cons? a-list) 
           (local  [(define (helper a-list acc)
                        (cond
                            [(empty? a-list) acc]
                            [(cons? a-list) (helper (rest a-list) (f (first a-list) acc))]))]
                   (helper a-list base))]))

The two examples above thus become:

"foldLeft test cases:"
(equal? (reverse (list 'a 'b 'c 'd 'e)) (foldLeft (lambda (x a) 
                                                      (cons x a)) 
                                                  empty 
                                                  (list 'a 'b 'c 'd 'e)))

(define lon1 (list 2 -3 4 -5 -4 -3))

(equal? (findMax lon1) (foldLeft (lambda (x a) 
                                     (cond 
                                         [(< x a) a]
                                         [else x])) 
                                 (first lon1) 
                                 lon1))
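It can help to trace the first test case by hand on a shorter list. The sketch below restates foldLeft so it runs on its own, and the comments follow the accumulator through each call to helper:

```scheme
;; foldLeft restated from above so this example stands alone
(define (foldLeft f base a-list)
  (cond
    [(empty? a-list) base]
    [(cons? a-list)
     (local [(define (helper a-list acc)
               (cond
                 [(empty? a-list) acc]
                 [(cons? a-list) (helper (rest a-list) (f (first a-list) acc))]))]
       (helper a-list base))]))

;; Trace of (foldLeft cons empty (list 'a 'b 'c)):
;;   (helper (list 'a 'b 'c) empty)
;;   (helper (list 'b 'c)    (list 'a))
;;   (helper (list 'c)       (list 'b 'a))
;;   (helper empty           (list 'c 'b 'a))
(foldLeft cons empty (list 'a 'b 'c))     ; (list 'c 'b 'a)

;; the same result, written out as nested calls
(cons 'c (cons 'b (cons 'a empty)))       ; (list 'c 'b 'a)
```

The first element is combined with the base first, so it ends up innermost, which is exactly why folding cons from the left reverses the list.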

Fold-Right & Fold-Left

Fold-Right is the abstract process of reverse accumulation through a list--moving information from the rest to first.

Fold-Left is the abstract process of forward accumulation through a list--moving information from first to rest.

Both fold processes require elements for the two fundamental parts of recursion:

Base case: a value is required.

Inductive case: a function is required.

Half the code for forward and reverse accumulation algorithms has now been written for us -- permanently!
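The direction of the fold matters whenever the order in which f combines the elements changes the answer. As a small sketch, with both folds restated so the example runs on its own (foldLeft is written here without the outer cond, just to keep it short), subtraction gives different answers over the same list:

```scheme
;; foldRight and foldLeft restated so this example stands alone
(define (foldRight f base a-list)
  (cond
    [(empty? a-list) base]
    [(cons? a-list) (f (first a-list) (foldRight f base (rest a-list)))]))

(define (foldLeft f base a-list)
  (local [(define (helper a-list acc)
            (cond
              [(empty? a-list) acc]
              [(cons? a-list) (helper (rest a-list) (f (first a-list) acc))]))]
    (helper a-list base)))

;; fold-right: (- 1 (- 2 (- 3 (- 4 0))))
(foldRight - 0 (list 1 2 3 4))   ; -2

;; fold-left: acc goes 0, (- 1 0) = 1, (- 2 1) = 1, (- 3 1) = 2, (- 4 2) = 2
(foldLeft - 0 (list 1 2 3 4))    ; 2
```

For + or * the two folds agree, which is why sum could have been written either way.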

The code for the above examples can be downloaded here: map-fold.scm

"But...but...but..." you stammer, "aren't there a lot of functions on lists that are neither (or both) forward nor (and) reverse accumulation? What about them?"

What's the ultimate list processing abstraction? Can we code the mother-of-all-list-processing-abstractions?

Pushing to the Limit...

To progress forward enough to find the ultimate list function abstraction, we must go backwards in our thought.

Look at the above analyses of map, fold-right, and fold-left. What is the essence that runs through all three of these higher order functions?

Our old friend, the design template. What does the design template say about processing a list?

The design template says that all list processing functions have this form:

(define (listFunc a-list param)
    (cond
        [(empty? a-list) (baseCaseFunc a-list param)]
        [(cons? a-list) (inductiveCaseFunc a-list param)]))

where all the possible parameters have been lumped together into a single parameter, param, which could be a list of values.

That is, in essence, the design template says:

The base case does something.
The inductive case does something else.

Deep, huh?

One important thing to note, however, is that the base case functionality and the inductive case functionality are not completely independent. They form a matched set. In terms of encapsulation, this implies that the base case function and the inductive case function should be encapsulated together.

How do we encapsulate two things together?

Visiting

We put them together in a structure called a "Visitor":

;; A Visitor is a structure made up of two functions
;; One for the base case and 
;; one for the inductive case
(define-struct Visitor (fBase fInduct))

The abstract list processing algorithm can thus be trivially expressed in terms of a Visitor structure. I will call this function "execute":

;; execute: list-of-any1 Visitor any2 --> any3
;; Executes ("accepts") the visitor on the list
;; returning the result.
;; param is passed to the visitor unmodified.
;; The base case of the Visitor is called on the empty list:
;; Visitor-fBase: list-of-any1 any2 --> any3
;; The inductive case is called on the non-empty list.
;; Visitor-fInduct: list-of-any1 any2 --> any3

(define (execute a-list visitor param)
  (cond
      [(empty? a-list) ((Visitor-fBase visitor) a-list param)]
      [(cons? a-list) ((Visitor-fInduct visitor) a-list param)]))

execute is 100% invariant code for any function on a list.

(cond [(empty? ...) ...] [(cons? ...) ...]) is gone completely from the variant code now. execute is the invariant code embodiment of the design template.

The Visitor structure and the execute function together are an example of the Visitor Design Pattern -- yeah, this covers more than just lists. More information, albeit in object-oriented programming terms, can be found here.

Note that the base case and the inductive case have the same contract -- this is superfluous here, but comes into play when the visitor pattern is applied to more general structures than just lists.

Examples of Visitors:

;; sums a list of numbers
;; param is ignored.
(define sumVisitor
    (local 
        [(define (fBase a-list param) 
            0)
        (define (fInduct a-list param)
            (+ (first a-list) (execute (rest a-list) sumVisitor 0)))]
        (make-Visitor fBase fInduct)))

"sumVisitor test case:"
(execute (list 4 2 7 3 5 6 1) sumVisitor 0)


;;reverseVisitor reverses a list
(define reverseVisitor
  (local
      [(define (fBase a-list param)
           empty)
       (define (fInduct a-list param)
           (execute a-list reverseVisitorHelper empty))
       
       ;; reverseVisitorHelper reverses a list
       ;; The reversed list is prepended onto the param.
       (define reverseVisitorHelper
         (local 
             [(define (fBase a-list param) 
                  param)
              (define (fInduct a-list param)
                  (execute (rest a-list) reverseVisitorHelper (cons (first a-list) param)))]
             (make-Visitor fBase fInduct)))]
    
      (make-Visitor fBase fInduct)))

"reverseVisitor test case:"
(equal? (list 7 6 5 4 3 2 1) (execute (list 1 2 3 4 5 6 7) reverseVisitor empty))

Not having forgotten everything we learned about lambda functions, we can see that visitors were made for lambdas:

;; prodVisitor multiplies all the elements of a list together.
;; param is ignored.
(define prodVisitor
  (make-Visitor
     (lambda (a-list param)
         1)
     (lambda (a-list param)
        (* (first a-list) (execute (rest a-list) prodVisitor param)))))

"prodVisitor test case:"
(= 5040 (execute (list 1 2 3 4 5 6 7) prodVisitor empty))
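In all the visitors above, param is either ignored or used as an accumulator. It can also be an ordinary input. As a sketch, the hypothetical containsVisitor below treats param as the value to search for (Visitor and execute are restated so the example runs on its own):

```scheme
;; Visitor and execute restated from above so this example stands alone
(define-struct Visitor (fBase fInduct))

(define (execute a-list visitor param)
  (cond
    [(empty? a-list) ((Visitor-fBase visitor) a-list param)]
    [(cons? a-list) ((Visitor-fInduct visitor) a-list param)]))

;; containsVisitor (hypothetical): is param an element of the list?
;; param is the value being searched for and is passed along unchanged.
(define containsVisitor
  (make-Visitor
    (lambda (a-list param)
      false)
    (lambda (a-list param)
      (cond
        [(equal? (first a-list) param) true]
        [else (execute (rest a-list) containsVisitor param)]))))

(execute (list 4 2 7) containsVisitor 7)   ; true
(execute (list 4 2 7) containsVisitor 9)   ; false
```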

Visitor Design Pattern

The visitor design pattern, in Scheme, consists of two pieces:

The invariant function execute (written above).
The Visitor structure, which holds the variant base and inductive case code.

The visitor design pattern decouples the variant base and inductive case processing on a list from the invariant method in which those processes are applied to the list.

A Visitor structure contains only that code which is unique to that particular process on a list and none of the code that is common to all processes on a list.

All functions on a list can be expressed in terms of Visitors.
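As a sketch of that claim, fold-right itself can be packaged as a visitor. The helper name makeFoldRVisitor is hypothetical, and using param to carry the base value is just one possible design:

```scheme
;; Visitor and execute restated from above so this example stands alone
(define-struct Visitor (fBase fInduct))

(define (execute a-list visitor param)
  (cond
    [(empty? a-list) ((Visitor-fBase visitor) a-list param)]
    [(cons? a-list) ((Visitor-fInduct visitor) a-list param)]))

;; makeFoldRVisitor: (f: any1 any2 --> any2) --> Visitor
;; hypothetical helper: builds a visitor that fold-rights f over the list.
;; param plays the role of the base value.
(define (makeFoldRVisitor f)
  (local [(define v
            (make-Visitor
              ;; base case: the fold of an empty list is the base value
              (lambda (a-list param) param)
              ;; inductive case: combine first with the fold of rest
              (lambda (a-list param)
                (f (first a-list) (execute (rest a-list) v param)))))]
    v))

(execute (list 1 2 3 4 5) (makeFoldRVisitor +) 0)      ; 15
(execute (list 1 2 3) (makeFoldRVisitor cons) empty)   ; (list 1 2 3)
```

Since mapping can be written as a fold-right, this one visitor maker covers map as well.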

Take a deep breath. We made it!

Download the visitor code here: visitors.scm

COMP 210: Principles of Computing and Programming

Lecture #21 Fall 2003