COMP 210 Lab 13: Functions Sharing State

We have seen different functions which share the same closure, and (separately) we have seen the use of set! to keep track of state, such as in lecture's pseudo-random number generator. Let's combine the two.

An early homework involved a phone-directory, implemented as a list-of-name/number pairs. We could add things to the directory, and we could also lookup a name in a directory.

There is no need to restrict ourselves to name/number pairs; in general a "dictionary" might be a set of key/value pairs. You can add a key/value pair to the dictionary, and you can lookup the value associated with a given key. (There can only one value associated with a given key.)

(define-struct entry (key val))

Note that a dictionary can be used in many different ways -- Instead of names(keys) and phone-numbers(values), we could analyze how frequently words occur in a text -- each word is a key, and the value is the number-of-occurrences. Or, a common situation is to use a student-id or social-security-number as a key, with the value being a Person structure containing all further information; The dictionary is the personnel database, indexed by id number.

We'll write two sorts of dictionaries -- one that uses a list of Entries, and another than uses a vector. However, note that code using dictionaries don't care about how the data is stored -- it just calls functions to add and lookup entries.

We'll start by writing a dictionary implementation that has some problems, and then we'll incrementally improve it.
(Of course, test each part thoroughly, before proceeding to the next step.)
  1. Create a global list-of-entries, named contents, which will be the "datastore" for our dictionary. Then, write two functions lookup and add!
    ;; add!: ANY ANY --> list-of-entries
    ;; SIDE-EFFECT: change the global placeholder "contents".
    ;; We return (the updated) contents.
    ;;
    (define (add! key val)
      ...)
    
    
    
    ;; lookup: key --> entry or false
    ;; Does the key occur inside "contents"?
    ;; If so, return that entry, else return the sentinel false.
    ;;
    (define (lookup name)
      (lookup-helper name contents))
    
    ;; lookup-helper: key, list-of-entries --> entry or false
    ;; Does the key occur inside "entries"?
    ;; If so, return that entry, else return the sentinel false.
    ;;
    (define (lookup-helper key entries)
      (cond [(empty? entries) ...]
            [(cons?  entries) ...]))
    ;
    ; NB this function can be of course easily re-written as a list visitor.
    
    (soln)
  2. The above code should work and test okay. Hewever, it has a couple of huge drawbacks:

    We'll Fix these flaws by

    ;; A Dictionary is a structure with two "methods" (fields which are functions):
    ;; add-fn! and lookup-fn are functions which signatures as add!,lookup above.
    ;;
    (define-struct dict (add-fn! lookup-fn))
    
    Write a function new-dict, which takes no arguments but returns a new, independent dictionary.

    Make sure you can create several dictionaries which don't interact.
    Hint: instead of contents being global, you want it to be in a local, with what two functions? You'll want a new local variable per each call to new-dict.

    (soln)

  3. Optional Suppose we'd like to keep track of how many dictionaries get made.
    1. Have an (initially) global variable num-dicts which keeps track of this.
      Then, the tricky part: make this variable non-global. But, clearly it does need to have new-dict inside its scope, and we only want one variable -- not one-per-call-to-new-dict.
      Hint:
      (define new-dict
        (local {...}
          (lambda () ...)))
      
      This defines a function (lambda), which is inside a local, but that local is only evaluated once -- not every single time the lambda is called!
    2. Currently this count of dictionaries isn't being used. Add another field to dictionaries -- id -- so that the first dictionary has an id of 1, the second an id of 2, etc. (Making new-dictionaries should not change the ids of existing dictionaries, of course.)
    3. Discuss with your partner: What are the drawbacks of making this id number a regular field (containing a number), rather than a method (a function returning a number)? If your code is going to be used by many other programmers, how easy is it -- intentionally or accidentally -- for them to subvert your data?

    (soln)

  4. Note that anybody using your dictionary couldn't care less, that you are using a list-of-Entries to store your data; they just call the add!-fn and lookup-fn.

    Other programmers tell you, that they often use your dictionary, but that the keys are often numbers in some bounded range: say, 1..75 for the programmer keeping track of comp210 grades, 1..2500 for the programmer in the registrar's office for students, and 1..1000000 (the Rice id number) for all rice personnel, or 1..10 (These latter actually have redundant information encoded into the number, to guard against mis-keying one student ID and accidentally getting another person's valid number.)

    We can make a Dictionary which is optimized for quick lookups this case. Rather than lists, we can use ...? First, rename the old function new-dict to be new-dict/list. Then, write:

    ;; new-dict/vec: number --> dictionary
    ;; 
    (define (new-dict/vec max-key)
      ..)
    
    For the moment, write this assuming the key will always be a number less than max-key.

    (soln)

  5. To be fair, the previous function doesn't quite return a true dictionary, since supposedly the key should be anything. Modify your previous function so that if the key is a non-number, it still works. How? Well, we could have our dictionary contain both a vector (to hold all the numeric keys), and it also holds a list-of-entries (to hold all the non-numeric keys).
    Hint: you could copy/paste code from new-dict/list. But we hate repeating code, remember? How about, having your new-dict/vec secretly containing not just a vector (for numeric keys), but also a whole other new-dict/list, where it stashes all its entries for (large or) non-numeric keys!

    (soln)

  6. optional

    Vectors are great when we know how many entries will be used in advance. But what if we don't? What if somebody provides a numeric key bigger than max-key? We want the unbounded nature of a list, with the random-access nature of a vector.

    Here's a clever trick so that we can have our cake and eat it too: Start with a vector of size (say) 100. For keys of size less than 100, no problem. If we eventually get a key that is 100 or bigger, then we'll make a whole new vector of size 200, copy over the contents of our size-100 vector, keeping only the new, bigger vector! This "repeated doubling" technique is handy.

    (soln)

Observe: other programmers who wrote functions which accepted dictionaries, could use them, regardless if they were handed a list-type dictionary or a vec-type dictionary. But it's even better: Next week you might sit down and write an even snazzier dictionary implementation (like a list that's kept in a sorted order, or something exotic like a hash table (see comp212!)). Even though their code was written before your wacky new dictionary implementation had even been concieved of, their code works without modification, even given your fancy new dictionary. This is an essential feature (and wouldn't have held, if you'd revealed the innards of your .)

We say that Dictionary is an abstract data type; The dict-list and dict-vector and hash-table structures are all specific implementations of this abstract data type. It is critical skill to all programmers, to separate the abstract behavior (add!, lookup) from the concrete implementation (lists, vectors, hash tables, whatever).