Index: list, Directory Example, Accumulators, Local & Scope (reading only)
First, we have a brief aside from our main topic.
The following is a convenient abbreviation shown in class, and already introduced in some lab sections:
(list 1 3 5 7) = (cons 1 (cons 3 (cons 5 (cons 7 empty))))Using list introduces the exact same cons structure, but it is easier for humans to read in big or nested lists, as in some data structures we're now using. When writing programs on lists, you should still think in terms of the cons structure.
In short, cons and empty are the constructors of lists. list is another function which makes lists, and is convenient shorthand for combinations of the constructors.
The following are data definitions for for a simplified representation of files and directories, as seen in class:
A file is a symbol (its name). A directory is a structure (make-dir name contents) where name is a symbol, and contents is a l-o-f-d. A list-of-files-and-directories (l-o-f-d) is one of - empty - (cons f lofd) where f is a file, and lofd is a l-o-f-d - (cons d lofd) where d is a directory, and lofd is a l-o-f-d.Observe the mutual recursion between directories and list-of-files-and-directories.
This is just another example of a tree data structure. We warned you trees were common!
Also, as seen in class, the following are the corresponding templates:
;(define (dir-fn a-dir) ; ...(dir-name a-dir)... ; ...(lofd-fn (dir-contents a-dir))...) ;(define (lofd-fn a-lofd) ; (cond ; [(empty? a-lofd) ; ...] ; [(file? (first a-lofd)) ; ...(first a-lofd)...(lofd-fn (rest a-lofd))...] ; [(dir? (first a-lofd)) ; ...(dir-fn (first a-lofd))...(lofd-fn (rest a-lofd))...]))
To do:
Develop a function
; find? : directory file -> boolean ; Returns whether the given file is in the directory or one ; of its subdirectories.
Note that this is a vast simplification of find, the mother-of-all everything-but-the-kitchen-sink UNIX directory traversing command.
(optional -- go do the accumulator exercises first please) Develop a function
; flatten-dir-once : directory symbol -> (directory or l-o-f-d) ; Returns a structure like the original directory, except that the ; named directory is removed, with its contents moved up one level.Here are two pictorial examples, in both cases removing the directory named to-remove.
Input | Output | |
---|---|---|
Example 1: |
foo / | \ bar baz to-remove / \ one two |
foo / / \ \ bar baz one two |
Example 2: |
to-remove / | \ foo bar baz |
foo bar baz |
Follow the templates and think about a single case at a time. If you do that, it shouldn't be too difficult. If you don't, you'll probably have real trouble.
If you still have time, develop a function
; any-duplicate-names? : directory -> boolean ; Returns whether any directory directly or indirectly contains ; another directory or file of the same name. It does NOT check ; for duplicated names in separate branches of the tree.There's a straightforward way that just follows the template. There's a more efficient way that follows a template we'll introduce later in the course.
For the very curious... Develop a program to check for duplicated names among all directories and files in the given tree. Here's a hint.
For the curious... Give more accurate data definitions for modeling UNIX directories and files. Files have not only names, but contents and other properties, such as an owner and last-modified date. Furthermore, a directory is considered to be a kind of file.
Here are solutions.
Accumulators are an algorithmic technique where the result is stored in a parameter that is passed to the recursive call. The final result is thus "accumulated" in that parameter, which is usually simply returned by the base case. Note that accumulator-style functions usually need helper functions, which are the actual workhorse accumulating & recurring functions. The main function's job is often just to initialize the accumulator and set up the helper.
To do:
(0 1)
) and returns a Fibonacci sequence with the
next value at the end of the list (i.e. next to the empty list).(fib (list 0 1)) --> (list 0 1 1)
or (fib (fib
(fib (list 0 1)))) -> (list 0 1 1 2 3)
(1 0)
)and returns the a Fibonacci
sequence with the next value at the beginning of the list (i.e. (first list)).
(fib2 (list 1 0)) -> (list 1 1 0)
or (fib2 (fib2
(fib2 (list 1 0)))) -> (list 3 2 1 1 0)
(Actually, I think this
one is easier than the first one.)Index: What is local?, Scope, When to use local?, When not to use local?, local & the Design Methodology
First, to use local, change DrScheme's language level to "Intermediate Student".
Class introduced local. Let's start with a quick review. Its syntax is
(local [definitions] expression)for example,
(local [(define x 3) (define y (+ x 1))] (+ x y))
How do we evaluate it? Basically, we
local does not allow you to solve any problems that you couldn't solve before, but it does allow you to solve them with better programs.
To do: Hand-evaluate all steps of the following program:
(define x 3) (define z 6) (local [(define x 7) (define y (+ x 4))] (+ x y z))
Previously, in programs like
(define x 3) (define (square x) (* x x)) (square 4)we knew that the variable x inside square was somehow different from the one outside. We didn't discuss this much, since it was fairly intuitive. Our terminology is that the x inside the function is local, while the other is global. We also say that the local variable shadows (or hides or masks) the identically-named global variable.
This distinction is one of scope. I.e., we can have multiple distinct variables with the same name. Each variable has a scope -- the part of the code where that variable can be referred to. Previously, we variables either had local scope or global scope. Now, local allows us to make further distinctions. I.e., some variables will be "more local" than others.
Thinking about scope is basically a shortcut to the whole renaming step of local evaluation. Once you understand scope, you simply say that a use of variable x refers to the most local definition of x.
The scoping rules are rather simple:
Global definitions: The scope of x in (define x expression) or (define (x ...) expression) is the entire program. It is NOT just the remainder of the program following this definition (as erroneously stated in some old versions of the Comp 210 lab materials). If this claim were true, mutual recursion would be impossible! The variable x does not have a value until its define statement has been executed, but the variable still exists and can be referenced inside a function definition that precedes the defintion of x.
For the curious... As we'll soon see, (define (x ...) expression) is shorthand for (define x some-similar-expression), so this case could be described more simply.
Local to a function: The scope of x in (define (foo ... x ...) expression) is the expression.
Local to local: The scope of a define'd variable in a local's definitions is both all of the definitions in the enclosing local (including those that precede it!) and the local expression body.
That's a lot of words for an idea that most people find reasonably intuitive. Really the main difference now is that we have a mechanism for nesting definitions, so we can have more than just the two levels of "global" and "local" variables.
If you use DrScheme's "Check Syntax" button, it will show you the scope of variables:
To do: The following are stylistically ugly toy examples for scoping. Draw arrows between definition and uses in the following examples, and figure out what are the results of the expressions. Check your answers with DrScheme's "Check Syntax" button and its evaluation.
(define x 1) (define y 2) (define z 3) (define (fee x y) (local [(define z 7) (define y 4)] (+ x y z))) (fee x z) (define (fie x y) (local [(define z 7) (define y (+ y 3))] (+ x y z))) (fie x z) (define (foe x y) (local [(define z 7) (define y (+ x z)) (define x (local ((define y 10)) (+ y z)))] (+ x y z))) (foe x z) (define (fum x y) (local [(define fum 7) (define (x z) (+ y z))] (x fum))) (fum x z) (define (foo x y) (local [(define (z y) (cond [(zero? y) 1] [(positive? y) (* x (z (sub1 y)))]))] (local [(define x (+ y 1))] (+ x (z y))))) (foo x z)Fortunately, real-world examples as convoluted as these are uncommon. However, the last example's use of x in z is a common technique we'll see more of later.
There are several overlapping reasons for using local:
To hide a helper function or variable from other code.
E.g., hiding bigger in our first class example:
(define (bigger num1 num2) ...) (define (best-score a-lon) ...(bigger (first a-lon) (best-score (rest a-lon)))...)
This is a very important rule stylistically. On large projects with multiple programmers, for example, it is easy for multiple programmers to define helper functions with the same names. Hiding these prevents others from interfering with your programs' correctness.
In the end, your programs tend to look like
(define (my-main-function ...) (local [(define (helper1 ...) ...) (define (helper2 ...) ...) (define (helper3 ...) ...) ...] code-using-helpers))
Reuse code and avoid duplicating code.
Again, referring to the class best-score example, the reason the first version was inefficient was because we recomputed recursive calls. By only making the recursive call once, we improved the algorithm. We could have done that without a helper function, as in
(define (best-score a-lon) (cond [(empty? a-lon) 0] ; the minimum score [(cons? a-lon) (local [(define best-score-rest (best-score (rest a-lon)))] (cond [(< (first a-lon) best-score-rest) best-score-rest] [else (first a-lon)]))]))
Avoiding duplication also helps stylistically, since it is a special case of our general rule to reuse code when possible. It also helps avoid errors.
A special case of this is eliding invariant function arguments. A variant of our in-class example:
(define (is-in-los? a-los a-sym) (local [(define (is-in-alos? a-los) (cond [(empty? a-los) false] [(cons? a-los) (or (symbol=? (first a-los) a-sym) (is-in-alos? (rest a-los)))]))] (is-in-alos? a-los)))The argument a-sym is invariant in the search, and this technique allows us to not pass it around everywhere. For the curious... At the lowest levels of implementation, this idea can improve efficiency, but whether it does at the high-level of Scheme is unclear, because Scheme has lots of leeway in implementing your code.
To name interesting subcomputations.
Even when not accomplishing either above goal, it can be useful to name the results of subexpressions to improve the readability of your code.
As a small example, compare the readability of
(define (rectangle-area upper-left-point lower-right-point) (* (abs (- (point-x upper-left-point) (point-x lower-right-point))) (abs (- (point-y upper-left-point) (point-y lower-right-point))))) (define (rectangle-area upper-left-point lower-right-point) (local [(define top-length (abs (- (point-x upper-left-point) (point-x lower-right-point)))) (define side-length (abs (- (point-y upper-left-point) (point-y lower-right-point))))] (* top-length side-length)))
To do: Go back and review your last two assignments. Use local where appropriate.
You are expected to use local on the current and future assignments where appropriate.
For the curious... You can also use define-struct in a local. Q: When would this be appropriate?
Usually you shouldn't consider identical recursive calls in different conditional cases to be repeated code.
Consider the following code:
(define (count-symbols a-lons) (cond [(empty? a-lons) 0] [(number? (first a-lons)) (count-symbols (rest a-lons))] [(symbol? (first a-lons)) (add1 (count-symbols (rest a-lons)))]))Q: What is wrong in the following uses of local?
(define (count-symbols a-lons) (local [(define rest-count (count-symbols (rest a-lons)))] (cond [(empty? a-lons) 0] [(number? (first a-lons)) rest-count] [(symbol? (first a-lons)) (add1 rest-count)])) (define (count-symbols a-lons) (cond [(empty? a-lons) 0] [else (local [(define rest-count (lons-length (rest a-lons)))] (cond [(number? (first a-lons)) rest-count] [(symbol? (first a-lons)) (add1 rest-count)]))]))
Don't bother naming uninteresting subcomputations.
You should still use the design methodology for developing local functions. In particular, local functions still need a contract and purpose.
However, there are complications. You can't test a local function independently, because it is hidden. A standard technique is to define the function globally for testing, then move it into a local. This techniques doesn't work when the function uses variables that aren't local to the function, as when eliding invariant arguments, e.g.,
(define (expt x y) (local [(define (expt-of-x y) (cond [(zero? y) 1] [(positive? y) (* x (expt-of-x (sub1 y)))]))] (expt-of-x y)))