Spring 1996
At this point, you should have read the entire C Primer. If you haven't, you are likely to have trouble understanding this material; it's a waste of your time and ours for you to ask lots of questions that are answered in the handout.
In lecture (and in the C Primer) it is noted that arrays and pointers are equivalent; thus, the expressions a[7] and *(a+7) mean the same thing in C. While this is true, you should never use the latter expression. The former is clearer, easier to read and write, and no faster (though old C compilers sometimes generated more efficient code for the pointer version, leading a generation of C programmers to use the obfuscated version simply in order to get a little more performance). You should understand that C implements arrays in terms of pointers, but don't use that knowledge when you write programs.
Similarly, the &
address-of operator was introduced in class. Don't
use it; it is very dangerous and its only real use is to produce blackboard
examples before you know about new (which you have read about in the
C Primer by now).
When we create a new pointer, it doesn't point anywhere in particular; it could have any value, which could refer to any address in the machine (including ones that don't belong to us), so it's dangerous and wrong to dereference it. However, C can't distinguish between such nonsense values and pointers to real objects, so you have to keep track of which is which yourself.
Sometimes you want a pointer not to point anywhere in particular; use the pointer value NULL (sometimes called the null pointer) for that purpose. You may never dereference NULL, but its advantage is that it can be distinguished from real pointers by means of the test (ptr == NULL) which is true for the null pointer and false otherwise. If you have a pointer that might or might not be NULL, then you should test it before using it; if it's NULL, then there is nothing to do.
Like Scheme, C can create new types by combining primitive data types. A collection of one or more variables, possibly of different types, is a structure. For instance, the following defines a new type called point which is suitable for representing points in the plane.
struct point { int x; int y; };Now point is a type just like int, float, etc., and we can define variables of that type. Be sure to fill in the parts of any points you create. Structure slots (sometimes called members) are referred to by the syntax structure.slotname.
point origin; origin.x = 0; origin.y = 0;It is also possible to initialize an instance (only at the time of its definition) by enclosing the slot values in curly braces:
point pt = {100, 200}; point pt2; pt2 = {100,200}; // ILLEGALNow pt.x and pt.y have values 100 and 200 respectively.
Now we will define lists in C. Recall the data definition for a list of integers:
intlist := null | (cons int intlist)Each cons cell has two parts: its
car
is an integer and its
cdr
is a list. We might think of defining a structure to
represent cons cells as follows:
struct conscell { int car; conscell cdr; // WRONG };
This definition has the problem that it never stops: the cdr field of every conscell is another conscell. A conscell is as large as an integer plus another conscell: in other words, a conscell is infinitely large. The difficulty is that we have used only one alternative from the data definition:
intlist := (cons int intlist)
We can use pointers to solve our problems: a pointer has a definite size, and a pointer can either point to more data or to no more data (if it is NULL). Thus, the following definition:
struct conscell { int car; conscell * cdr; // RIGHT };
We represent a list not as a conscell, but as a pointer to a conscell. (That's important; read it again.) Thus, the second element of conscell is a list (i.e., its type is conscell *, just as the data definition states. The reason we don't represent a list as a conscell is that such a representation doesn't admit a reasonable way to specify the empty list; but when pointers are used, the value NULL can be used as a conscell *, in which case it stands for the empty list.
Given a conscell k, we can extract k's elements via k.car and k.cdr; their types are int and conscell *. We can get the element after k.car via (*(k.cdr)).car. We can't do just k.cdr.car because k.car is not a conscell but a pointer to a conscell, so we can't use the . operator--which is applicable only to structures--on it. Since the action of dereferencing then extracting a structure field is common, C provides a shorthand for it: a->b is equivalent to (*a).b. Thus, (k.cdr->car is equivalent to (*(k.cdr)).car.
In Scheme, in order to build a 22-element list, we must call
22 times. Likewise, in C we must create 22 conscell objects in order
to make such a list. (Scheme's cons
function provides a
convenient shorthand, but list
is still being called that many
times. C doesn't have such a shorthand, and can't be expected to since you
just defined the conscell type.)
cons
We have two ways to define a new conscell. The first is to define a variable of that type:
{ conscell c; ... }The problem with this approach is that when the scope of the variable c is exited, then variable c goes away, as does its value. Our new conscell doesn't survive the function call or block in which it was created.
To construct objects that stick around forever, use the new operator. Followed by a type, it forms an expression whose result type is a pointer to that type. For instance, new int is an expression which allocates space for an integer and returns a pointer to that storage. Similarly, new conscell creates a new conscell and returns a pointer to it. You should not use new as a statement:
new conscell; // WRONG, pointer is discardedbecause while this creates a new conscell, the pointer is thrown away. It's as if you had turned some other expression into a statement:
7 + 15; // WRONG, result is discardedYou should use the value of the expression; the most typical thing to do is assign it to a variable (of the right type)
conscell * ccp = new conscell;and then refer to the memory through the pointer. When the scope of ccp is exited, the variable and its value are eliminated. That value is just a pointer and it was stored elsewhere, you can still manipulate the new memory through it. For instance, here is one way to write
cons
in C:
conscell * cons(int newcar, conscell * newcdr) { conscell * result = new conscell; result->car = newcar; result->cdr = newcdr; return result; }Notice that this function takes a list as its second argument and returns a list, as expected.
read_list
that repeatedly prompts the user for a
positive integer and adds it to the front of a list using cons, terminating
(and returning the list) when the user enters 0. Or, your function could
read a length, then that many integers.