From Data Description to Data Definitions

Thus far we have used Scheme's existing data to represent the information we want to process. To get a better understanding of information processing functions, we rigorously described the subsets of data that we were interested in. However, in many cases it is better to introduce entirely new forms of data. In Scheme the format of new data is defined with the definition:

(define-structure (Name Sel1 Sel2 ... SelN))

This structure definition defines N+2 new functions:

a constructor: make-Name;
a predicate: Name?;
the selectors: Name-Sel1, Name-Sel2, ..., Name-SelN.

To create a value of this new kind, we apply the constructor make-Name to N values. The result is a single value that contains all N values. This compound piece of information is sometimes called a tuple, an aggregate, a structure, a record, or an object. We will use tuple in this course.

Generalized Data Descriptions

Structure definitions extend the set of Scheme values, making it easier to represent sets of information in Scheme. We will use names for the data sets that are enclosed in < >.

When we design a representation for the information we are interested in, we are typically confronted with two situations:

Tuples: The first situation requires the "combination" of several pieces of information. Some of this information may be represented as plain Scheme data, others may be other forms of tuples that we have introduced (or will introduce). To describe a combination of N different kinds of data, we write
```
  < KindName >  ::= (sel1 sel2 ... selN)
```
Union: The second situation requires the collection of different, disjoint sets of information. One good method for describing such a union is:
```
  < KindName > ::= (pred1 sel11 sel12  ... ) |
                   (pred2 sel21 sel22  ... ) |
                    ...
                   (predM selM1 selM2  ... )
```
This description denotes that our data is one possibility out of a collection of M types, each defined as a tuple. Note that a union requires the addition of a field, predI, to distinguish the various case.

Using Data Definitions to Implement Data Descriptions

define-structure creates constructors, predicates, and selectors for tuples and unions in Scheme. For data defined as a tuple,

  < KindName > ::= (sel1 sel2 ... selN)

we add the Scheme definition

(define-structure (KindName sel1 sel2 ... selN))

to our program.

For a union description of the form

  < KindName > ::= (pred1 sel11 sel12  ... ) |
                   (pred2 sel21 sel22  ... ) |
                   ...
                   (predM selM1 selM2  ...)

we add the following M Scheme definitions:

(define-structure (KindName-pred1 sel11 sel12 ... ))
(define-structure (KindName-pred2 sel21 sel22 ... ))
...
(define-structure (KindName-predM selM1 selM2 ... ))

By convention, we distinguish the Ith possible case by appending predI to KindName. These define-structure expressions produce the constructors make-KindName-predi, the predicates KindName-predi?, and the selectors typname-predi-selij.

Data definition to program templates

Given constructors, selectors, and predicates, we can derive a program template from the data description as follows:

For a tuple description

  < KindName > ::= (sel1 sel2 ... selN)

the program template for a function foo that takes as input a value (named: bar) of type <KindName> is:

(define foo
   (lambda (bar)
      ... (KindName-sel1 bar) ... (KindName-sel2 bar) ... ))

This template uses the selectors to break the value named bar into simpler pieces. Completing the program template specifies how foo assembles these pieces into the answer.

For a union description of the form

   < KindName > ::= (pred1 sel11 sel12  ... ) |
                    (pred2 sel21 sel22  ... ) |
                      .....
                    (predM selM1 selM2  ... )

the program template for a function baz that takes as input a value (named bum) of type <KindName> is:

(define baz
  (lambda (bum)
    (cond
      ((KindName-pred1? bum)
       (KindName-pred1-sel11 bum) ... (KindName-pred1-sel12 bum) ...)
      ((KindName-pred2? bum)
       (KindName-pred2-sel21 bum) ... (KindName-pred2-sel22 bum) ...)
      ....
      ((KindName-predM? bum)
       (KindName-predM-selM1 bum) ... (KindName-predM-selM2 bum) ))))

If a selector returns a value of the kind KindName, then a recursive call to baz should be included in the program template. If a selector returns a value of some other described kind of data, the resulting program template should include a helper function that operates on these values. This helper function has a program template that can also be derived from its data definition of this value.

This approach can be used to derive mutually recursive functions from mutually defined data types (see the third example). Of course, this approach may also lead to a number of simple helper functions. These functions can removed (if so desired) when simplifying the completed program. But do not simplify until you have the function correct!

Examples

Lists

Data description

< list > := (empty) |
            (join first rest)

where first is some primitive kind of data (e.g. symbol, number) and rest is a <list>.

Constructors, selectors, predicates

(define-structure (list-empty))
(define-structure (list-join first rest))

Constructors: make-list-empty, make-list-join
Predicates: list-empty?, list-join?
Selectors: list-join-first, list-join-rest

Program template

(define list-proc
  (lambda (l)
    (cond
      ((list-empty? l)
       ...)
      ((list-join? l)
       ... (list-join-first l) ... (list-proc (list-join-rest l)) ... ))))

Trees

Data description

< tree > ::= (empty) |
             (branch info left right)

where info is a <symbol> and left and right are <tree>s.

Constructors, selectors, predicates

(define-structure (tree-empty))
(define-structure (tree-branch info left right))

Constructors: make-tree-empty, make-tree-branch
Predicates: tree-empty?, tree-branch?
Selectors: tree-branch-info, tree-branch-left, tree-branch-right

Program template

(define tree-proc
  (lambda (t)
    (cond
      ((tree-empty? t)
       ...)
      ((tree-branch? t)
       ... (tree-branch-info t) ...
       ... (tree-proc (tree-branch-left t)) ...
       ... (tree-proc (tree-branch-right t)) ... ))))