Jam2000 lecture notes

JAM 2000

We have talked about computation at a high level, leaving undefined (a) the primitive functions +, cons, ..., and (b) how to carry out law-of-Scheme steps like "look up a placeholder's value" or "rewrite the body with substitutions...". Here is an alternate, low-level view of computing. "Low-level" means that you need to understand the actual machine hardware before understanding the language.

The JAM machine -- a model of all modern electronic computers -- has several parts to it:
[picture]

Instructions are things like "multiply the contents of R3 and R5 and put the result into R0 (R0 <-- R3*R5)", "put the number 17 into R4 (R4 <-- 17)", and "R8 contains a memory address: put the value of R0 into the indicated location in memory mem[R8] <-- R0".

You'll want to bring the Jam2000 reference sheet to lectures; we'll refer to it frequently.

The memory holds both data and instructions. Wait a minute -- since memory can hold only numbers, we have to encode instructions as numbers.
Some examples of encoding (see reference sheet):

   (add R3 R8 R9) ; (meaning: R3 <-- R8 + R9) is encoded as 00098310.
   (ldi R4 156)   ; (meaning: R4 <-- 156)     is encoded as    15649.
(Note: when I say ninety-eight thousand, three hundred and ten, this is actually a (lame) pun: the instruction is a sequence of digits, not a number. In Arabic numerals, numbers also happen to be encoded as a sequence of digits.)

Note that the only ultimate work the cpu does -- adding (and branching?) -- is done in registers.
We need to transfer data from main memory to registers and back: we'll see the meaning of "ld" and "st" later. Lab will introduce the JAM-inspector interface, which lets you change memory contents and then start the actual JAM cpu running.


JAM II
(c) imperative programming:
(define x 3)
(define y 17)
;; ...
(set! x (- x y))
In JAM:
                  ; Choose to store x at location 100
                  ; Choose to store y at location 200

                  ; R0 holds a copy of x
                  ; R1 holds a copy of y
                  ; R2 holds the address of x, 100
                  ;   This version is sneaky: no copy of 200 is kept around!
                  ; R4 stores x-y

   (ldi R2 100)   ; R2 <-- 100
   (ld  R0 R2)    ; R0 <-- Mem[R2]  =  Mem[100] = x

   (ldi R1 200)   ; R1 <-- 200
   (ld  R1 R1)    ; R1 <-- Mem[R1]  =  Mem[200] = y

   (sub R4 R0 R1) ; R4 <-- R0 - R1  =  x - y

   (st  R4 R2)    ; Mem[R2] <-- R4, i.e. x <-- x - y
   (halt)

Now:
Compute the sum of data[0]...data[size-1], where data and size are already-defined placeholders. We have seen how to do this with the template; you could also do it accumulator style,
and even with fori= using (set! sum-so-far ...). [Do these as an exercise.]

(define data (vector ...))
(define size (vector-length data)) ; It's 10.
;; sum-data-to-size:  --> number
;; An imperative-style program, to sum data[0]...data[size-1].
;;
(define i 0)
(define sum-so-far 0)
(define (sum-data-to-size)
    (if (< i size)
        (begin (set! sum-so-far (+ sum-so-far (vector-ref data i)))
               (set! i (+ i 1))
               (sum-data-to-size))
        sum-so-far))
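
For comparison, here is a sketch of the accumulator-style version mentioned above (the helper name is ours); same job, but no set!:

(define (sum-data-acc i sum-so-far)
  (if (< i size)
      (sum-data-acc (+ i 1) (+ sum-so-far (vector-ref data i)))
      sum-so-far))
;; (sum-data-acc 0 0) produces data[0]+...+data[size-1].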

; (How can you make sure you can run the imperative program above twice?)
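
One possible answer, sketched here (the wrapper name is ours): use set! to re-initialize the state placeholders before each run.

(define (run-sum-data)
  (begin (set! i 0)
         (set! sum-so-far 0)
         (sum-data-to-size)))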

The CPU operates on a four-step cycle:
  1. fetch mem[PC] and remember it.
  2. decode the fetched value as an instruction.
  3. increment PC.
  4. execute the instruction.
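
A rough Scheme sketch of that cycle, just to fix the idea (mem, decode, and execute! are hypothetical helpers standing in for the real hardware, not part of the JAM inspector):

(define pc 186)                        ; the program counter
(define (run-cpu)
  (let* ((code  (vector-ref mem pc))   ; 1. fetch mem[PC] and remember it
         (instr (decode code)))        ; 2. decode that number as an instruction
    (set! pc (+ pc 1))                 ; 3. increment PC
    (execute! instr)                   ; 4. execute (a jump/branch may overwrite PC here)
    (run-cpu)))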

What does this program do?


r y    ; Interface command to reset the machine (and "y" to confirm).
p 186  ; Interface command, so next instruction entered will be at addr#186.


; Okay, now the real jam assembly instructions:

;assembly         addr         comment
;---------        ----         -------
                               ; R0 will hold "i", which will count 0..size-1.
                               ; R1 will hold "size", which is 10.
                               ; R2 will hold "sumSoFar", initially 0.
                               ; R3 will hold the constant 1.
                               ; R7 will hold 200: where "data" starts in mem
                               ; R8 will hold R7+i.
                               ; R9 will hold "data[i]".
             
(ldi  R0 0)      ; [186]       ; i        <-- 0
(ldi  R1 10)     ; [187]       ; size     <-- 10
(ldi  R2 0)      ; [188]       ; sumSoFar <-- 0
(ldi  R3 1)      ; [189]       ; r3       <-- 1
(ldi  R7 200)    ; [190]       ; the start of "data"
                 ; 
(sub  R4 R0 R1)  ; [191] loop: ; Compare i with size (actually, compute i-size)
(bgez R4 198)    ; [192]       ; If i-size >= 0, branch to label "done"
(add  R8 R7 R0)  ; [193]       ; R8 <-- R7+i, the *address* of data[i].
(ld   R9 R8)     ; [194]       ; R9 <-- data[i]
(add  R2 R2 R9)  ; [195]       ; sumSoFar <-- sumSoFar + data[i]
(add  R0 R0 R3)  ; [196]       ; i        <-- i + 1
(jmpi 191)       ; [197]       ; Go back and repeat.
         
(print R2)       ; [198] done: ; Print sumSoFar
(newline)        ; [199]


; Okay, now that we've put these into memory, let's start the cpu running!
p 186
x y

We will walk through the above code. This is always the ultimate method of debugging: see what the code does (not what you think it does).
[I got through the variable declarations.] Finish walking through the above JAM program. Remember what Scheme program it came from. [show slide]


Any questions on the above? Here are a few that come to mind:

Is there a shorter way to look up data[i]? Yes -- if you look at the reference sheet, you can see that you can use "ldx" to accomplish the lookup (for which we currently use an "add" followed by "ld").

Wouldn't it be nice to write "(jmpi loop)" instead of "(jmpi 191)", making use of the semantic label we gave to the instruction living at 191? Alas, the low-level JAM assembly language doesn't understand this, but sometimes we'll write the higher-level version on the board anyway.

Going over the JAM program:
Hmmm, these bgez/jmpi instructions are new. This code works fine up until the last moment, when chaos ensues. Why? What line of Scheme code does the instruction "(jmpi loop)" correspond to? Why is the order of CPU steps important (why can't you swap steps 3 and 4)? Why does the code go off into oblivion at the end?

N.B. Keep in mind that printing a value and having a function return a value are different. In DrScheme, when you type in an expression, it computes the value and, at the very end, prints it out. (Many other values were being returned from (recursive) function calls, which never got printed, thank goodness.) In many imperative languages (JAM, C, Java), nothing ever gets printed for you; you must always explicitly call print.

Question: was the Scheme version tail-recursive? Yep. Tail recursion = loops.

How would you implement subroutines, in JAM? (Note the instruction jsr, "jump to subroutine", but it doesn't quite do the full trick. In fact, it only does what the original Fortran did.)

Real computers: the register/main-memory distinction is made purely because of a technological tradeoff: fast, expensive memory vs. slow, cheap memory. What other levels in the memory hierarchy arise from the same tradeoff? (It's a pretty standard tradeoff which will always be with us ... or will it max out someday, once we can cheaply get the fastest/smallest memory possible?)

Current computers run at 1GHz -- that is, the cpu does its 4-step cycle 10^9 times per second. (How far can light travel during one cycle?) (Well, actually, it does pipelining: it does a quarter of that cycle, but at the same time some other hardware is doing the next quarter of the cycle for the next instruction, and elsewhere the next quarter-cycle of the one after that, so on average it's doing one instruction per cycle ... except that branches goof this up, as do delays when you can't pre-fetch memory. Check out Elec326 and Comp320!)
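(Rough arithmetic for the light question: light travels about 3*10^8 meters per second, so in one 10^-9-second cycle it covers about 3*10^8 * 10^-9 = 0.3 meters -- roughly one foot.)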

Why are we covering assembly language, in comp210?

By the way, there is also a low-level mathematical view of computation: e.g. the lambda calculus, where people experimented with how little of Scheme you need to compute everything. E.g. you don't really need +, - and all the numbers if you have 0, add1, zero?, and "if". In fact, you don't even need those: represent 0 with empty, (add1 n) with (cons empty n), and zero? with empty?. You still need "if". Can we do away with needing "if", if we're clever? Can we do away with cons? (Yes -- use functions! -- e.g. the empty list is the identity function, '(()) is represented by the function returning the identity function, ...) Interestingly, one can get by w/o even having "define"!
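
A sketch of the "numbers as lists" encoding just mentioned (the starred names and the plus example are ours, for illustration):

(define zero* empty)
(define (add1* n) (cons empty n))
(define (zero?* n) (empty? n))
(define (sub1* n) (rest n))
;; 3 is (add1* (add1* (add1* zero*))), and addition is just structural recursion:
(define (plus* m n)
  (if (zero?* m)
      n
      (add1* (plus* (sub1* m) n))))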

Calling functions in JAM:
Suppose you had sqrt written at 5000.

Need to remember a whole stack full of pending recursive calls... use a "stack" (the ADT of a Scheme list).
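
A sketch of that stack, using a Scheme list (the names here are ours):

(define call-stack empty)
(define (push! return-addr)
  (set! call-stack (cons return-addr call-stack)))
(define (pop!)
  (let ((top (first call-stack)))
    (set! call-stack (rest call-stack))
    top))
;; Before jumping to sqrt at 5000, push the address to come back to;
;; when sqrt finishes, pop it to know where to resume.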


Further info not covered directly:
98.nov.18
A loop is when you have this jmpi thing, as seen in the last program. Tail recursion = loops. Consider:

;; avg-pos: vec(vector of num), i(natNum), sum-so-far(num), num-pos(natNum)
;;   --> number
;; Return the average of the positive numbers in vec[0]...vec[i-1],
;; where we've previously seen num-pos such numbers with accumulated sum-so-far.
;;
(define (avg-pos vec i sum-so-far num-pos)
  (cond [(zero? i) (/ sum-so-far num-pos)]
        [else (if (positive? (vector-ref vec (sub1 i)))
                  (avg-pos vec (sub1 i)
                           (+ sum-so-far (vector-ref vec (sub1 i)))
                           (add1 num-pos))
                  (avg-pos vec (sub1 i) sum-so-far num-pos))]))
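
For example, (avg-pos (vector 3 -1 4) 3 0 0) walks the vector from the right, skips the -1, and evaluates to (4+3)/2 = 7/2.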

What is the JAM version? Outline:

                 ; [3999]         ; 1.
                 ; [4000]         ; i = vector-size,
                 ; [4001]         ; sum-so-far = 0,
                 ; [4002]         ; num-pos = 0.

                 ; [4003] loop:   ; If (i == 0), then branch to "end".
                 ; [4004]         ; Load vec[i-1] from memory, and
                 ; [4005]         ; If (vec[i-1] <= 0), jump to "skip-update".
                 ; [4006] update: ; sum-so-far = sum-so-far + vec[i-1].
                 ; [4007]         ; num-pos = num-pos + 1.
                 ; [4008] skip-update:  ; i = i - 1.
                 ; [4009]         ; jump to "loop".

                 ; [4010] end:    ; print answer.

[Questions? Could you write this?]
What happened to the tail-recursion, with two accumulated arguments? That's the loop.

Note what we had: it was nicer to talk about variable names than register names, and to do "if..." statements in one line. And again, a for-loop would be nicer yet, to separate out the control flow.


The complete code from lecture:
       ; A program to add the contents of a vector.
       ;
r y    ; Inspector command to reset the machine (and "y" to confirm).
p 200  ; Put data numbers @addresses 200-209.
20874  ; This is placed @address 200.
00126  ; This is placed @address 201.
5
5
5
5
5
5
300060
5      ; This is placed @address 209.
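; (For reference: these ten values sum to 20874 + 126 + 7*5 + 300060 = 321095
; -- so that's the number the program below should print at the end.)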


p 186  ; Inspector command, so next instruction entered will be at addr#186.


; Okay, now the real jam commands:

;assembly         addr         comment
;---------        ----         -------
                               ; R0 will hold "i", which will count 0..size-1.
                               ; R1 will hold "size", which is 10.
                               ; R2 will hold "sumSoFar", initially 0.
                               ; R3 will hold the constant 1.
                               ; R7 will hold 200: where "data" starts in mem
                               ; R8 will hold R7+i.
                               ; R9 will hold "data[i]".
             
(ldi  R0 0)      ; [186]       ; i        <-- 0
(ldi  R1 10)     ; [187]       ; size     <-- 10
(ldi  R2 0)      ; [188]       ; sumSoFar <-- 0
(ldi  R3 1)      ; [189]       ; the constant 1
(ldi  R7 200)    ; [190]       ; the start of "data"
                 ; 
(sub  R4 R0 R1)  ; [191] loop: ; Compare i with size (actually, compute i-size)
(bgez R4 198)    ; [192]       ; If i-size >= 0, branch to label "done"
(add  R8 R7 R0)  ; [193]       ; R8 <-- R7+i, the *address* of data[i].
(ld   R9 R8)     ; [194]       ; R9 <-- data[i]
(add  R2 R2 R9)  ; [195]       ; sumSoFar <-- sumSoFar + data[i]
(add  R0 R0 R3)  ; [196]       ; i        <-- i + 1
(jmpi 191)       ; [197]       ; Go back and repeat.
         
(print R2)       ; [198] done: ; Print sumSoFar
(newline)        ; [199]


mem 186 199
; Okay, now that we've put these into memory, let's start the cpu running!
p 186
s