Comp 210 Lab 4: Trees

Index: Binary Search Trees, Arithmetic Expressions

Trees are a very common form of data. In fact, lists are simply trees with at most one branch at each node. This week in lab, we'll use more examples of trees.


Binary Search Trees

We often want to be able to build a data structure that allows us to look up information based on some key value. One simple idea would be to just create a list of all the data items. The disadvantage is that it can then take a long time to find the items at the end of the list, since we have to look at all the prior items first. Many data structures can be used for this problem, one of which is a binary search tree.

For simplicity, let's assume we just want to keep track of a bunch of numbers, remembering which ones we've seen. From this, we could easily generalize our results to, for example, keeping track of a sports team's player information indexed by player number, or to indexing on names rather than numbers.

The following are some binary search trees for the numbers 1,3,4,6,9:

Each node holds an index (here, number), data for that index (here, nothing), and two possibly-empty subtrees. All the nodes in the left branch are less than the current node. All the nodes in the right branch are greater than the current node.

Each of the above three trees is a correct binary search tree. Which is "best" is a question we'll delay until Comp 212.

To do:

  1. Make a data definition for these binary search trees.

  2. Make example data representing the above three trees.

  3. Develop a function search which consumes a binary search tree and a number and returns the boolean indicating whether the number is in the tree.

  4. Develop a function insert which consumes a binary search tree and a number and returns a new binary search tree. It should contain all the number from the first tree, plus the new number, but the new number shouldn't be duplicated. Hint: The easiest option is to add the new node at the bottom of the appropriate branch.

    Your example function uses should including represent each of the above three trees with only the empty tree and appropriate calls to insert.

  5. Develop a function in-order which consumes a binary search tree and returns a list of the numbers in the tree in sorted order. You'll want to use the built-in function append to append lists together. Hint: Note that the leaves of the tree are sorted left-to-right.

  6. For the curious... Describe how long search and insert take in terms of the shape of the input tree.

    Many operations on binary search trees would be faster if the trees were also balanced, i.e., the branches were all more-or-less equally deep. Implementing balanced binary trees is much more difficult, as you'll see in Comp 212.

  7. For the curious... Change the definition of a node to also store more information than just the index. E.g., consider each number to be a team player's number, and add the player's stats. Adapt search and insert to look up and add this information to the tree.


Arithmetic Expressions

Arithmetic expressions can also be modeled as trees. For example, the Scheme expression

     (+ 5 (* (- 1 8) (+ 7 1)))
is pictorially
        +
      /   \
     5     *
         /   \
        -     +
       / \   / \
      1   8 7   1
We would like to treat this expression as a piece of data. This is just like how DrScheme works -- one of your programs is really just a big piece of data for DrScheme.

An A-Expression is one of the following:

I.e., the A-Expression (make-add ...) will represent the appropriate arithmetic expression involving addition. We use the name "A-Expression" to distinguish it from its meaning of an arithmetic expression you'd write on a piece of paper or type into DrScheme.

Thus the above Scheme expression is represented by

     (make-add 5
               (make-mul (make-sub 1 8)
                         (make-add 7 1)))
As always, we distinguish between the information (the Scheme expression) and its data representation (the A-Expression).

To do:

  1. Make more example data and a template for A-Expressions.

  2. Develop the function evaluate which consumes an A-Expression and returns a number. Your function should be like a mini-DrScheme, computing the number that the A-Expression represents.

    DrScheme, like any language interpreter, is just a generalization of evaluate. A Scheme expression is like an arithmetic expression, but with more cases.

  3. For the curious... Let's say we had many basic operations, not just these four. We would want to have one structure for any binary operation, using a separate data definition enumerating all of our operations. Rewrite the data definitions, example, and evaluate for this definition.

  4. For the very curious... Adapt the previous idea to also allow unary operations.