COMP 210 Lab 6: Mutually Recursive Data

Files and Directories

The following are data definitions for one possible (simplified) representation of files and directories. Observe the mutual recursion between files and list-of-files.

     ; A file is one of
     ;   - a symbol
     ;     representing a "simple" file's name
     ;   - a directory
     ;     (make-dir name contents)
     ;     where name is a symbol, and contents is a l-o-f.

     ; A list-of-files (l-o-f) is one of
     ;   - empty
     ;   - (cons f lofd)
     ;     where f is a file, and lofd is a l-o-f

This is very similar to the descendant trees data structure seen in class. Tree-based data structures are very common!

Directory exercises
  1. Create some sample data for the above types.

  2. Write the templates for the above types.

  3. Develop a function

             ; find? : symbol file -> boolean
             ; Returns whether the filename is anywhere in the
             ; tree of files represented by the file.
             

    An aside

    Note that this is a vast simplification of find, the mother-of-all everything-but-the-kitchen-sink UNIX directory traversing command. If you're curious, at a UNIX prompt, enter man find to see what it can do.

  4. Use DrScheme's stepper to step through an example use of find?.

    Following the templates leads to an overall strategy known as depth-first search. I.e., it explores each tree branch to the end before moving on to the next branch.

  5. Develop a function

             ; any-duplicate-names? : file -> boolean
             ; Returns whether any (sub)directory directly or indirectly contains
             ; another directory or file of the same name.  It does NOT check
             ; for duplicated names in separate branches of the tree.
             
    There's a straightforward way that just follows the template.

  6. For the curious... Develop a program to check for duplicated names among all directories and files in the given tree, not just subdirectories. Here's a hint.

  7. Develop the following function:

             ; flatten-dir-once : symbol file -> (file or l-o-f)
             ; Returns a structure like the original file, except that any
             ; (sub)directory with that name is removed and its contents
             ; moved up one level.
             

    Here are two pictorial examples, in both cases removing the directory named to-remove. These illustrate why this function can return either a file or a list of files.

    Input Output
    Example 1:
             foo
           /  |  \
        bar baz to-remove
                 / \
               one two
             
              foo
           /  /  \  \
        bar baz one two
             
    Example 2:
         to-remove
          /  |  \
        foo bar baz
             
        foo bar baz
             

    Follow the templates and think about a single case at a time. If you do that, it shouldn't be too difficult. If you don't, you'll probably have real trouble.

Sample solutions.


Descendant Family Trees

Review: In class, we originally discussed Ancestor Family Trees (where each person had exactly two parents), and today turned that on its head and introduced Descendent Family Trees (where each person has any number of children).

Note on Family Trees

Pragmatically, we would have a representation of family trees that had the advantages of both the ancestor- and descendant-based versions. That is indeed possible. For that, we want a graph, a kind of data structure that we will look at later in the course.

The descendant trees entailed two separate interrelated types:

     ;; A Person is a structure
     ;; where name is their name, year is their birth year
     ;; and kids is a list the person's children.
     ;; (make-Person symbol num list-of-Persons)
     (define-struct Person (name year kids))

     ;; A list-of-Persons is either
     ;; - empty
     ;; - (cons Person list-of-Persons)

As a reminder, here are their templates:

     ;; Template for Person
     (define (f-Person a-person)
        ...(Person-name a-person)...
        ...(Person-year a-person)...
        ...(f-lop (Person-kids a-person))...)))
     ;; Template for list-of-Persons

      (define (f-lop a-lop)
        (cond
          [(empty? a-lop) ...]
          [(cons? a-lop)  ...(f-Person (first a-lop))...
                          ...(f-lop (rest a-lop))...)]))

Some examples of this data might be

     (define Bart    (make-Person 'Bart    1979 empty))
     (define Lisa    (make-Person 'Lisa    1981 empty))
     (define Maggie  (make-Person 'Maggie  1988 empty))
     (define Homer   (make-Person 'Homer   1955 (list Bart Lisa Maggie)))
     (define Marge   (make-Person 'Marge   1956 (list Bart Lisa Maggie)))
     (define Selma   (make-Person 'Selma   1950 empty))
     (define Patty   (make-Person 'Patty   1950 empty))
     (define Jackie  (make-Person 'Jackie  1926 (list Selma Patty Marge)))
     (define Herbert (make-Person 'Herbert 1950 empty))
     (define Mona    (make-Person 'Mona    1929 (list Homer Herbert)))
     (define Abe     (make-Person 'Abe     1920 (list Homer Herbert)))

Descendant Family Tree warm-up exercises

These are variations on the in-class example. Do some to make sure you understand the template, before moving on to the more difficult next exercise.

  1. Note how we're using define to save ourselves typing. How would we write the descendent tree for Homer, if we hadn't have used define? How about for Jackie?

  2. Develop a function that counts the number of decendants in a person's family tree. Include the person in the count.

  3. Develop a function that returns whether there is someone in a family tree with a certain given name.

Descendant Family Tree pretty-printing exercises

Note how DrScheme displays the value Abe: nicely indented, but in Scheme format, using "make-Person", "list", etc. This is the preferred way for us programmers, because it shows the structure of the data. But suppose we want to show our data to non-programmers. We still want things indent appropriately, but not necessarily as internalized structure, e.g.,

     Jackie (b. 1926)
     |_ Selma (b. 1950)
     |_ Patty (b. 1950)
     |_ Marge (b. 1956)
        |_ Bart (b. 1979)
        |_ Lisa (b. 1981)
        |_ Maggie (b. 1988)
  
or
     Abe (b. 1920)
     |_ Homer (b. 1955)
        |_ Bart (b. 1979)
        |_ Lisa (b. 1981)
        |_ Maggie (b. 1988)
     |_ Herbert (b. 1950)
  
We will write a function Person->string produces that output as a string. This will be similar to the corresponding function for Ascestor Family Trees shown in class.

Like the other pretty-printing examples, this program will involve two separate conceptual strategies: data-driven mutual recursion and problem-driven accumulation (of the prefix string).

  1. What are a couple of other sample outputs (e.g., for Abe and Bart)?

    Assume that a list of children should print out in the same order as listed in the data structure. That does not necessarily correspond to birth order.

  2. Group discussion: How do the solutions to the sub-problem (e.g., the string version of Marge) relate to the total solution (e.g., the string version of Jackie)?

  3. Group discussion: What info do we need to keep track of, when writing these examples on the board? We need to keep track of what to output at the beginning of each line -- at first nothing, but for each generation, some additional space. This is an accumulator!

  4. Okay, write the function!

    A couple of built-in functions you'll need to use are symbol->string and number->string. Also, load the teachpack /home/comp210/Labs/Lab06/lab06-lib.ss via DrScheme's Language.. menu. This provides you with the following constant and function:

             ;; newline: string
             ;; The one-character string with a carriage-return.
      
             ;; display-string: string -> true
             ;; Display the given string in a text-box.
             ;; Always returns true.
    
             ; Example:
             (define limerick (string-append "There once was a man from Kali,"
                                             newline
                                             "Whose limerick stopped at line three."
                                             newline
                                             "You'd think it just started --"))
    
             limerick
             (display-string limerick)
             
  5. For the curious... For similarity to the pretty-printing with ancestor trees, we'd prefer the output for Abe to be

             Abe (b. 1920)
             |_ Homer (b. 1955)
             |  |_ Bart (b. 1979)
             |  |_ Lisa (b. 1981)
             |  |_ Maggie (b. 1988)
             |_ Herbert (b. 1950)
             
    without changing the output for Jackie. Write that version.

    Hint: Note that the last child is treated differently than the others in that his/her children are not prefixed by a vertical bar. You'll need to check for that, while not breaking encapsulation.

Sample solution.

A couple of notes on strings and I/O
  • Some people thing that input and output are intrinsic to programming. They're not -- they're just impediments to get around, before you can even begin to solve the problem you're interested in. DrScheme makes I/O fairly transparent via the calculator-like behavior of the read-eval-print loop.

    Formatting output is more important when writing programs that other non-programmers will want to use. It's rarely relevant to solving the problem itself.n Thus, it's necessary, but uninteresting. (Some language environments don't have default output so nice; in those cases, programmers usually have to spend time, for each structure, writing how to format it, just so they can debug more easily.)

  • Converting data to strings throws away the data's structure. You should never convert your high-level, structured data down into a string, pass that string to some other function, and force that function to parse the string back into a structure.

    Working within a single language, it's easy to pass your high-level structures immediately. Saving structured data to a file or communicating across a web-link is more problematic. The recent solution is to encode structure as a series of "tags" in a language called XML. XML is a standard way to convert (nested) lists into strings, in a way that makes it trivial to reconstitute the original structure. If you know some XML (or HTML), you may be interested in DrScheme's menu item Insert XML box.