Comp 210 Lab 10: Accumulators

Index: The Big Picture: Review, More examples, Missionaries & Cannibal tips

The Big Picture: Review

In-class, we saw three examples for accumulators:

reversing a list,
finding the maximum element of a list, and
summing the elements of a list.

The idea was that we use an extra argument in our function(s), called the accumulator, which "accumulates" some information about the problem, often the eventual result. In the first, it was the reverse of the front part of the list. In the second, it was the maximum element of the front part of the list. In the third, it was the sum of the front part of the list.

When do we use accumulators? It's hard to give any firm rules, and it's best to learn from examples. In general, you should always try to solve a problem without accumulators (using plain structural or generative recursion), and only add accumulators when it

simplifies the structure of the solution (e.g., arguably list reversal),
fundamentally improves the asymptotic running time of the program (e.g., list reversal improves from O(n²) to O(n)), or
converts a program to tail-recursive form to fundamentally improve the asymptotic space requirement of the program (e.g., maximum of list, sum of list)

For the curious... A program P runs in time (or uses space) O(f(n)) for some function f, if the running time (space usage) on inputs of size n, for suffciently large n, is bounded by C*f(n) for some constant C. The "sufficiently large n" qualification provides exceptions for values of n below some threshold N. For example, say that we define the size of a list as the length of the list and we have written a linear version of the reverse function. The reverse function on the empty list runs very fast but it does not run in zero time, which is what our definition would require if we did not allow a finite number of exceptions.

Obviously any function that runs time O(n) also technically runs in time O(n²) and O(2ⁿ). By convention, we use the best (smallest) possible function f when stating asymptotic complexity. If a program runs in linear time (O(n)), it is misleading to say that is runs in quadratic time (O(n²)).

This will be more fully explained in COMP 212 and COMP 280.

When using accumulators, we normally hide the accumulator-based function inside a local. For example,

     (define (sum alon)
          (local [(define (sum-acc alon accum) ...)]
              (sum-acc alon 0)))

This way we ensure no one initializes the accumulator incorrectly.

More Examples

Labbies: Most students won't be able to complete all of these during lab. Do foldl and a representative set of the rest.

`foldl` exercise
Recall the equational definition of the abstract function `foldl`: (foldl f base (list x₁ x₂ ... x_N)) = (f x_N ... (f x₂ (f x₁ base))...) Previously, we did not provide code to define `foldl`, and hinted that you needed to use an accumulator. The example definition (define (reverse aloa) (foldl cons empty aloa)) should hint why -- its implementation is very similar to the accumulator-based `reverse`. Write `foldl` using structural recursion and an accumulator. Do not use any helper functions.

Recall that in Mergesort, we split a list into two halves and recursively sort them. One way to split the list was to put every other element in one list and the other elements in another list. Such splitting did not need to preserve the order of the lists.

Develop functions split-list-odd and split-list-even to split the list. E.g., (split-list-odd (list 1 2 3 4 5)) should return some list with elements 1, 3, and 5.
Define a pair structure: (define-struct pair (first second)).

Develop a function split-list to split the list, without using the previous functions. It should return a pair of lists. In the function you should use two accumulators.

For the curious... Scheme has a more convenient built-in way to return multiple values from a function. We're simply not introducing the extra syntax.

Having multiple accumulators is occasionally useful, although it is not the most common case. There's another example of this below.

`flatten` exercise
Recall the function `flatten` from homework. It "flattens" a `list-of-list-or-symbol` to a `list-of-symbol`. Write a definition for `flatten` that uses an accumulator to form the flattened list.

Duplicate names exercises
We would like to write a function `dup-names` that found duplicate names in parent-based family tree. E.g., given a family tree node, where the given person's mother's mother's father and the person's father were each named `'Joe`, the returned list would contain `'Joe`. Develop a version of `dup-names` that uses an accumulator to record the names seen so far. Develop a version of `dup-names` that uses two accumulators: one to record the names seen so far, and one to accumulate the result. It should also ensure that the second accumulator (and result) has no duplicates. As these illustrate, we can use accumulators using any data structure, not just lists.

Missionaries & Cannibals tips

Debugging

When you are done, you will have a relatively large program -- much larger than for any previous homework in this course. If you just try to write the whole thing, then test it, you will have little if any luck in trying to find mistakes.

You should follow the design methodology -- create and use test cases for all helper functions. In previous programs, that might have sounded like busywork, but here it will save your sanity. Each of the functions should be reasonably straightforward, but as a whole system, they will solve a complicated problem and interact in complicated ways.

In the end, you'll want to hide your helper functions inside a big local. But first, make all your functions global (i.e., not inside a local), and do your testing. When everything works, then edit to hide your helper functions.

Data representation

In lab, we will discuss some ideas about what states are for the Missionaries & Cannibals problem.

Breadth-First Search

Our in-class route-finding example used a graph-searching strategy called "depth first search". The idea was to search for a solution along one full path, and then backtrack to find another path if you don't find a solution on the current path.

If you follow the homework directions, you will use an alternate strategy called "breadth first search". The idea is to look at longer and longer paths in the graph until you find a solution. However, instead of looking at one path at a time and backtracking to the next path, you are look at all paths at once and just keep looking deeper and deeper along those paths.

For details of both of these, please refer to the class discussion and the book. Labbies: Illustrate the difference between these strategies on a small graph.

There are several tradeoffs in using these two strategies. In answering all the homework parts, you will look at some of these tradeoffs, but you'll look at these strategies in more detail in later COMP courses.