Comp 210 optional lab 1: Simple I/O and Programs as Data


representations and s-expressions

So far when writing our programs, all of the information we need is given in the program. We define functions and constants, and that is all we need to solve the problem. Sometimes we need to look outside of DrScheme for information. We could need to communicate with the user of the program or read information from a text file. We could even be sending requests to a database or talking to another computer on the network. All of these operations are grouped under input and output, or I/O.

In all of our programs we have emphasized the use of structured data. We write a data definition, and then we develop the program around it. Scheme's I/O functions also encourage the use of structured data. However, unlike programs we write, Scheme's I/O functions only know about the types of data built into Scheme. So we cannot read and write make-posn s or make-famtrees directly. We are limited to a type of data called an s-expresssion (for symbolic expression).

An s-expression is one of
- a symbol
- a string
- a number
- a list-of-s-expressions
A list-of-sexpressions is one of
- empty
- (cons <s-expression> <list-of-se-xpressions>)

This data definition describes the types of Scheme values we can read and write. But what do they look like when we write them to a file? I will distinguish between s-expressions (values in DrScheme) with s-expression representations (how they look in a file, for example).

File contains (representation)
Scheme value (s-expression)
happy
'happy
(4 2 even)
(list 4 2 'even)
("one" two (3))
(list "one" 'two (list 3))
()
empty

So single words are read as symbols, numbers are just numbers, and text inside of double quotes is read as a string. A sequence of s-expression representations between parentheses is read as a list.

We've seen that the function list can be used as a shortcut for writing down lists. There is another shortcut which is more useful when we want to express large s-expressions.  The special form quote interprets an s-expression as data, rather than trying to execute it.

(quote (1 2 happy sad)) => (list 1 2 'happy 'sad)
(quote angry) => 'angry

On the other hand, what if we want to make an s-expression which is mostly constant, except for some pieces somewhere in the middle? Nothing inside of quote will be evaluated, so we can't use a placeholder to stick another value inside of a quoted list.  There is another form called quasiquote, which can be used to insert any unquoted expressions into an otherwise constant s-expression.

(define my-var 3)
(quote (1 2 my-var)) => (list 1 2 'my-var)
(quasiquote (1 2 (unquote my-var))) => (list 1 2 3)
(quasiquote (1 3 ,(+ 2 3) ,(+ 3 4)) => (list 1 3 5 7)

quote can also be written as ' (the single quote mark), quasiquote as ` (the back-quote), and unquote as , (the comma).  In that syntax, the above examples become:

(define my-var 3)
'(1 2 my-var) => (list 1 2 'my-var)
`(1 2 ,my-var) => (list 1 2 3)

Like list, quote will never be of any use writing functions which recursively construct lists.  Be sure that you don't use list or the quote syntax when you mean cons.


ports and I/O

In some programs you write you will want to read information from a file. This section introduces the idea of ports (also called streams in some languages). A port can be in one of two states.  It can have data left, or it can be at the end of the file.  Unlike lists, where you can use empty? to find out what type of list you have, you find out what state a port is in by calling read .  If read returns an end-of-file object (which can be tested with the predicate eof-object?), then the file has no more data left. Otherwise, read will return the next data in the file.  Once you call read on a port, the port remembers that you've read from it, and the next call to read will return the next data - this is called a side-effect.

So what does read actually return?  It returns the next entire s-expression in the file.  This may be a simple symbol or number, or it may be a big list with sublists and symbols and numbers inside of it

#|
The "template" for using an input-port.  
Not really a template in the sense we've been using.
(define (reading-func iport)
  (local [(define next-sexpr (read iport))] ;; IMPORTANT
      (cond [(eof-object? next-sexp) ...]
[(symbol? next-sexpr)
(... next-sexpr ... (reading-func iport))]
[(number? next-sexpr)
(... next-sexpr ... (reading-func iport))]
[(string? next-sexpr)
(... next-sexpr ... (reading-func iport))]
[(list? next-sexpr)
(... next-sexpr ... (reading-func iport))])))
|#
Functions to know:
open-input-file : string -> input-port
open-input-string : string -> input-port
read : input-port -> s-expression or eof-object

For example, suppose we wanted to write a program that returned the sum of all the numbers in a file between the word start and the word end . (We only want to count numbers "at the top level", not numbers buried inside of lists.)

;; sum-from-file : input-port -> number
(define (sum-from-file iport)
(local [(define next-sexpr (read iport))]
(cond [(eof-object? next-sexpr) 0]
[(and (symbol? next-sexpr) (symbol=? next-sexpr 'start))
(sum-from-file-helper iport)]
[else (sum-from-file iport)])))
;; sum-from-file : input-port -> number
(define (sum-from-file-helper iport)
(local [(define next-sexpr (read iport))]
(cond [(eof-object? next-sexpr) 0]
[(and (symbol? next-sexpr) (symbol=? next-sexpr 'end))
0]
[(number? next-sexpr)
(+ next-sexpr (sum-from-file-helper iport))]
[else (sum-from-file-helper iport)])))

To do:
  1. Write a function contains-word? which takes in an input-port and a symbol and returns true if the symbol is ever read from the file.  Then try (contains-word? (open-input-file "/usr/dict/words" 'zealous)). Look up some other words in the dictionary.
  2. Write list-functions, which returns a list of all the functions defined in a Scheme file (you can use your old homework to run it on).  How do you know when you've found a function definition? You'll read in something like (list 'define (list 'my-function-name 'parameter1 'parameter2) ...)).

representing HTML

DrScheme comes with a library which lets you represent XML and HTML as a special kind of s-expression. We can use this library to produce web pages.

We'll define an x-expression (like an s-expression but for XML) as follows:
An x-expression is one of
- a string
- (cons symbol x-list)
An x-list is
- empty
- (cons x-expression x-list)

So how do we represent HTML?

x-expression
HTML (text)
(p "this is simply" (em "fascinating"))
<p>this is simply <em>fascinating</em></p>
(html (title "My page") ,,,) <html><title>Under construction</title> ... </html>


I will provide a function write-xml-to-file which takes an x-expression and writes the corresponding HTML to a file. Notice also that everything except the tag names (the parts that end up between <>) must be strings or numbers. You may need the function symbol->string to convert symbols into strings.

write-xml-to-file: x-expression string -> true

<HTML tutorial, if necessary>

To do:

  1. Write a program to generate simple documentation for a Scheme program. The page should have a header with the name of the Scheme file. It should have three sections. The first should list all of the structure types defined. The second should list all of the functions, and the third should list all of the other constants defined. Use the <ul> and <li> tags to make bulleted lists.

Given the program program on top in a file called my-program.scm, you should produce something similar to the output on bottom.
;; all-positive? : (listof numbers) -> boolean
(define (all-positive? lon)
  (cond [(empty? lon) true]
        [(cons? lon)
         (and (positive? (first lon))
              (all-positive? (rest lon)))])

;; pizza-cost : number
;; Pizza cost is the price of a pizza in dollars
(define pizza-cost 12)

;; pizza-slices : number
;; Number of slices a pizza is cut into
(define pizza-slices 8)

;; lunch-bill : number -> number
;; Given the number of slices eaten, calculate the amount owed
(define (lunch-bill slices)
  (* (/ pizza-cost pizza-slices) slices))

;; A lyst is either
;;  - empty
;;  - (make-kons <something> <lyst>)
(define-struct (kons furst wrest))

;; somm : lyst-of-numbers -> number
(define (somm lon)
  (cond [(empty? lon) 0]
        [(kons? lon)
         (+ (kons-furst lon)
            (somm (kons-wrest lon)))]))
<html>
<title>my-program.scm</title>
<body>
<h1>my-program.scm</h1>
<ul>Structures
<li>kons</li>
</ul>
<ul>Functions
<li>all-positive?<li>
<li>lunch-bill</li>
<li>somm</li>
</ul>
<ul>Constants
<li>pizza-cost</li>
<li>pizza-slices</li>
</ul>
</body>
</html>


Note that whitespace is not significant in HTML, and DrScheme will write it much more spaces out than I have above. It will appear the same way in your browser.

preview of next time

In this lab, we dealt with structured input, and we deferred output to a library function that handled XML. Next lab, we'll explore the low-level I/O functions which allow us to do I/O at the character level. We'll also talk about network I/O some, and in the end we will write a web server.