Previously, we got binary and unary primitive operations working and set up automated testing. This time around, we'll add support for more basic forms and add support for an incredibly simplistic memory allocator.
Local Variables
Since we added stack indices last time around, adding local variables is less
complex than you might expect. We'll extend our compiler to handle the form
(let ((var0 val0) (var1 val1) ...) body ...)
, which binds variables to the
values of the given expressions, then evaluates each of the body
forms in
order, returning the value of the last one.
To implement this, we'll need a place to store the values of each variable, a
way to track the bindings for each variable, and extensions to compile-expr
to
handle the new types of form. Since we already have stack handling integrated,
we can just store all the bindings there, but we'll need to extend our compiler
by threading an environment structure through. For simplicity, our environment
will just be a list of the form ((name . value) (name2 . value2) ...)
; a hash
table or binary tree would be more efficient, but for now our priority is to get
things working, not to optimize them. The Scheme standard includes a function
assoc
we can use to look things up in this structure (called an "association
list" in the standard).
First, we augment our existing functions to thread an environment through the
calls, and add a function to handle let
expressions.
(define (compile-primitive-call form si env) (case (primitive-op form) ... all existing primitives, with env added to the end of compile-expr calls ... )) (define (let? x) (eq? (car x) 'let)) (define (compile-let bindings body si env) ... ) (define (compile-expr e si env) (cond ((immediate? e) (emit "movl $~a, %eax" (immediate-rep e))) ((let? e) (compile-let (cadr e) (cddr e) si env)) ((primitive-call? e) (compile-primitive-call e si env)) )) (define (compile-program program) ... (compile-expr program (- wordsize) '()) ...)
Now that let
forms are recognized, we can actually implement the compilation
in compile-let
. For a given let
, we have to evaluate each of the binding
values, store them to their locations on the stack, then loop through each body
form and evaluate them. For now, we can just assume that values in the
environment are offsets from the stack pointer that can be used to access that
value.
; build a list of numbers (a ... b-1) (define (range a b) (if (= a b) '() (cons a (range (+ 1 a) b)))) (define (compile-let bindings body si env) (let* ((stack-offsets (map (lambda (x) (- si (* x wordsize))) (range 0 (length bindings)))) (names (map car bindings)) (exprs (map cadr bindings)) (inner-si (- si (* (length bindings) wordsize))) (inner-env (append (map cons names stack-offsets) env))) ; evaluate exprs and assign them to stack locations (for-each (lambda (expr offset) (compile-expr expr inner-si env) (emit "movl %eax, $~a(%esp)" offset)) exprs stack-offsets) ; evaluate all body forms - this will leave the last one's output in %eax (for-each (lambda (form) (compile-expr form inner-si inner-env)) body) ))
This is a little more complex than anything we've done so far, so let's unpack
it a bit. First, it defines a list stack-offsets
with elements corresponding
to the stack offsets of each binding. For example, given the binding list
((x 1) (y 2))
, we get stack-offsets = (-4 -8)
, meaning x
is stored at
offset -4 from %esp
, y
is stored at -8
, and inner expressions are
evaluated with si = -12
. At the same time, we also define inner-si
and
inner-env
, which are the values of si
and env
used inside the let
form's
scope.
bindings (x 1) (y 2) (z 3) stack-offsets -4 -8 -12 names x y z values 1 2 3
In the Scheme definition, let
doesn't actually apply the bindings until the
values are evaluated, so we first evaluate each of the binding expressions with
our original environment and the new si
, then store their %eax
values at the
correct offsets onto the stack. Next, we loop through each body form and
evaluate it with the new stack index and environment, so our newly-defined names
are visible in that scope.
We also have to make sure variables are actually accessible. At the moment,
our compile-expr
definition only handles immediates, let
forms, and
primitive calls. If we try to compile x
or another variable, nothing happens.
(define (variable? x) (symbol? x)) (define (compile-var-load v si env) (emit "movl ~a(%esp), %eax" (cdr (assoc v env)))) (define (compile-expr e si env) (cond ... ((variable? e) (compile-var-load e si env)) ... ))
Since the values of each variable in the environment refer to their appropriate
stack offsets, all we have to do in order to evaluate a variable form my-var
is look it up in the environment, then emit movl var_offset(%esp), %eax
to
pull the value out of its storage location on the stack.
Now we can reload the program in our REPL and test it:
scheme@(guile-user)> (load "compiler.scm") ... scheme@(guile-user)> (compile-and-run '(let ((x 1)) x)) 1 scheme@(guile-user)> (compile-and-run '(let ((x (primcall add1 1))) x)) 2 scheme@(guile-user)> (compile-and-run '(let ((x 1) (y 4)) (primcall + x y))) 5
Success! We can define local variables and evaluate them! If we wanted to extend this, we could also check the variable names against the passed environment to make sure they actually exist and emit an error if they don't. That's for later, though - right now we just want to get something working.
Conditional Expressions
Next up, it's time to implement if
. In Scheme, this means handling the form
(if test true-body false-body)
, which evaluates test
, and evaluates
true-body
if the result isn't equal to #f
or false-body
if it is. For our
implementation, we'll also support omitting false-body
, in which case it'll
default to #f
.
The implementation follows the same pattern we've seen for other types of form:
augment compile-expr
to handle it, then delegate to a specialized compile-if
function to generate the actual code.
(define (if? x) (eq? (car x) 'if)) (define (compile-if test t-body f-body si env) ...) (define (compile-expr e si env) (cond ... ((if? e) (compile-if (cadr e) (caddr e) (if (null? (cdddr e)) #f (cadddr e)) si env)) ... ))
To implement compile-if
, we have to generate code that looks something like
this:
... [evaluate test] cmpl #f, %eax je false_label [evaluate true-body] jmp end_label false_label: [evaluate false-body] end_label: ...
However, in order to avoid conflicts, we'll have to generate unique names and
use them in place of true-label
and false-label
. Since this needs to persist
across calls, we'll set up a new global variable and keep updating it every time
we need a new label, then reset it every time we run compile-program
. Ideally
it'd be thread-local, but standard Scheme doesn't support that, so we'll use a
global for now.
(define current-label 0) (define (unique-label) (let ((l current-label)) (set! current-label (+ 1 l)) (format #f "l_~a:" l))) (define (compile-if test t-body f-body si env) (let ((false-label (unique-label)) (end-label (unique-label))) (compile-expr test si env) (emit "cmpl $~a, %eax" (immediate-rep #f)) (emit "je ~a" false-label) (compile-expr t-body si env) (emit "jmp ~a" end-label) (emit "~a:" false-label) (compile-expr f-body si env) (emit "~a:" end-label))) (define (compile-program program) (set! current-label 0) ...)
The actual implementation in our compiler is pretty much a 1:1 translation of
the code shown above to Lisp, when the addition of a (unique-label)
function
to generate a new unused label for branching. We don't need to handle the
default value for f-body
inside compile-if
since the condition in
compile-expr
will pass the correct literal if the input leaves that argument
out.
A Primitive Heap
At this point, we're starting to run out of features that can be implemented without finally facing the music and tackling memory allocation. That doesn't mean writing a full garbage collector quite yet, but for vectors, pairs, strings and the like we'll need a way to store data in memory. For now, let's use a bump allocator - we keep a pointer to the first free byte, then increment that pointer whenever we allocate memory. Since we're not dealing with larger programs quite yet, we can just allocate a big "heap" buffer in our runtime and ignore the possibility of out-of-memory until later.
First, we modify the runtime system to allocate a 4MB heap buffer and pass it to our Scheme code.
#define HEAP_SIZE 0x400000 __attribute__((__cdecl__)) extern int scheme_entry(void* heap); int main(int argc, const char **argv) { void* heap = aligned_alloc(8, HEAP_SIZE); int val = scheme_entry(heap); show(val); printf("\n"); free(heap); return 0; }
So far our Scheme code obeys the convention that %eax
holds the result of each
expression. Since we'll probably be allocating a lot, we'll add a second part to
that and assert that %esi
always holds a pointer to the first unallocated byte
on the heap. That's not part of the cdecl calling convention, so we'll have to
update our function preamble to fetch the pointer parameter and store it in
%esi
before evaluating our actual code.
When we enter scheme_entry
from C, our stack looks like this:
low addresses --------------- | | | ... | | | |-------------| | | | | | our data | | | |-------------| | esi | <---- %esp | edi | | edx | |-------------| | return addr | |-------------| | heap ptr | |-------------| | caller %ebp | <---- %ebp |-------------| | ... | |-------------| high addresses
The first thing we do is save the old values of %esi
(our heap pointer),
%edi
, and %edx
(which is overwritten when we multiply integers). Then we
have to fetch our heap pointer into %esi
by accessing the heap pointer, stored
16 bytes (return addr + 3 words for saved registers) before the stack pointer.
(define (compile-program program) ... (emit "movl 16(%esp), %esi") ...)
Pair Allocation
To illustrate the basic process for memory allocation, let's implement the
(cons x y)
primitive, which constructs a pair with the car
field set to x
adn the cdr
field set to y
. To implement it, we just evaluate x
and y
,
store the results in the next two words after the allocation pointer, bump it up
by 8 bytes, then tag the resulting pointer based on the data model we defined in
part 1. Note how this implementation stores the cdr
first, so it can re-use
%eax
to hold the car
value as it's loaded off the stack.
(define (immediate-rep x) (cond ... ((null? x) pair-tag))) (define (compile-primitive-call form si env) (case (primitive-op form) ... ; cons - build a pair from the two arguments ((cons) ; evaluate x and y (compile-expr (primitive-op-arg1 form) si env) (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) ; store cdr (emit "movl %eax, ~a(%esi)" wordsize) ; store car (emit "movl ~a(%esp), %eax" si) (emit "movl %eax, 0(%esi)") ; save pointer and tag it, then increment heap ptr (emit "movl %esi, %eax") (emit "orl $~a, %eax" pair-tag) (emit "addl $~a, %esi" (* 2 wordsize))) ))
Before we actually test it, we need to update our runtime system's show
function to support pairs. This is also a good time to define the representation
for the null object, ()
- it'll be a pair pointer whose address is zero. By
convention, pairs whose cdr
points to another pair are flattened, so (1 2 3)
is printed instead of (1 (2 (3 ())))
. However, when the cdr
isn't a pair,
the printer should put a dot between the other elements and the cdr
. If we
want to see lists correctly in our output, we need to add that logic to the
runtime system.
void show(int x) { ... } else if((x & PTR_MASK) == PAIR_TAG) { int* ptr = (int*)(x - PAIR_TAG); if(ptr == NULL) { printf("()"); return; } // either a list or a dotted pair int car = ptr[0]; int cdr = ptr[1]; putchar('('); show(car); // show additional space-separated elems while((cdr & PTR_MASK) == PAIR_TAG) { ptr = (int*)(cdr - PAIR_TAG); if(ptr == NULL) break; car = ptr[0]; cdr = ptr[1]; putchar(' '); show(car); } // show dotted pair notation if relevant if((cdr & PTR_MASK) != PAIR_TAG) { printf(" . "); show(cdr); } putchar(')'); } }
Finally, we can test out our new implementation of cons
by building and
printing some lists!
scheme@(guile-user)> (load "compiler.scm") ... scheme@(guile-user)> (compile-and-run '(primcall cons 6 (primcall cons 1 ()))) (6 1) scheme@(guile-user)> (compile-and-run '(primcall cons 6 (primcall cons 1 9))) (6 1 . 9)
While we're at it, we should probably implement a few more important
pair-related primitives: null?
, car
, cdr
, set-car!
, and set-cdr!
. The
implementations aren't that complex; they all pretty much boil down to "subtract
the pair tag from the input to get a pointer, then do whatever is needed with
the structure at that address." There's a clever optimization we can do, though;
instead of mutating the pointer in %eax
, we can just use a constant offset
from it to access its fields, since we already know how many bytes we need to
go; subtracting the tag and then adding a constant k
is the same thing as
adding k-tag
, which can be encoded as one instruction rather than two. We also
end up needing a spare register in set-car!
and set-cdr!
, but we're already
treating %edx
as disposable anyway due to the way integer multiplies work, so
we can use that.
(define (compile-primitive-call form si env) (case (primitive-op form) ... ; car - get the car of a pair ((car) (compile-expr (primitive-op-arg1 form) si env) (emit "movl ~a(%eax), %eax" (- pair-tag))) ; 0(%eax) plus tag offset ; cdr - get the cdr of a pair ((cdr) (compile-expr (primitive-op-arg1 form) si env) (emit "movl ~a(%eax), %eax" (- 4 pair-tag))) ; 4(%eax) plus tag offset ; null? - check whether an object is null ((null?) (compile-expr (primitive-op-arg1 form) si env) (emit-is-eax-equal-to pair-tag)) ; (set-car! pair obj) - mutate the car of a pair ((set-car!) (compile-expr (primitive-op-arg1 form) si env) ; pair (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) ; obj (emit "movl ~a(%esp), %edx" si) (emit "movl %eax, ~a(%edx)" (- pair-tag))) ; (set-cdr! pair obj) - mutate the cdr of a pair ((set-cdr!) (compile-expr (primitive-op-arg1 form) si env) ; pair (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) ; obj (emit "movl ~a(%esp), %edx" si) (emit "movl %eax, ~a(%edx)" (- 4 pair-tag))) ))
Vector and String Allocation
Vectors and strings are almost identical conceptually, but require slightly different implementations due to the differences in how their contents are stored. Both of them are basically variable-length packed sequences of objects. For strings, the objects are bytes, and for lists they're words. In memory, we can store them as a packed array of their contents preceded by one unsigned word holding the length of the array.
STRING ------------------------------------------------ | | | | | | | | | | | | | length |c0|c1|c2|c3|c4|c5|c6|c7|c8|c9| ... | | | | | | | | | | | | ------------------------------------------------ 00 01 02 03 04 05 06 07 08 09 ... VECTOR ----------------------------------------------------- | | | | | | length | elem 0 | elem 1 | elem 2 | ... | | | | | ----------------------------------------------------- 00 01 02 03 04 05 06 07 08 09 ...
Now that we know how things are going to be laid out in memory, let's start by
implementing the string and vector allocation functions, make-string
and
make-vector
. These take either one or two arguments; the first one is always
the length of the buffer, and the second, if present, is a value the buffer
should be filled with. Just like with cons
, we can just start storing data at
%esi
, then move it forward once we've built our object.
(define (compile-primitive-call form si env) (case (primitive-op form) ... ; (make-string k) - construct a string of length k ; (make-string k c) - construct a string of length k filled with c ((make-string) (compile-expr (primitive-op-arg1 form) si env) ; deal with the length (emit "shr $~a, %eax" fixnum-shift) ; remove fixnum tag (emit "movl %eax, 0(%esi)") ; store length into new structure (emit "movl %eax, %ecx") ; save actual length while we tag the pointer (emit "movl %esi, %eax") ; tag the result ptr (emit "orl $~a, %eax" str-tag) (emit "addl $11, %ecx") ; pad size to the next multiple of 8 bytes (emit "andl $-8, %ecx") (emit "addl %ecx, %esi") ; advance allocation pointer (if (null? (cdddr form)) '() ; no extra work for one-arg form (begin ; emit initialization code ; what should we fill with? (emit "movl %eax, ~a(%esp)" si) ; save the pointer (compile-expr (primitive-op-arg2 form) (- si wordsize) env) (emit "shrl $~a, %eax" char-shift) ; grab character into AL ; perform fill - rep stosb takes from AL, so we don't need to mask ; out any higher bits (emit "movl ~a(%esp), %edi" si) ; target pointer (emit "subl $~a, %edi" str-tag) (emit "movl 0(%edi), %ecx") ; load buffer length (emit "addl $~a, %edi" wordsize) ; skip over length word (emit "rep stosb") ; fill with byte (emit "movl ~a(%esp), %eax" si)))) ; recover final value ; (make-vector k) - construct a vector of length k ; (make-vector k o) - construct a vector of length k with cells set to o ((make-vector) (compile-expr (primitive-op-arg1 form) si env) ; deal with the length (emit "shr $~a, %eax" fixnum-shift) ; remove fixnum tag (emit "movl %eax, 0(%esi)") ; store length into new structure (emit "movl %eax, %ecx") ; save actual length (emit "sall $2, %ecx") ; multiply by 4 to get bytes (emit "movl %esi, %eax") ; tag the result ptr (emit "orl $~a, %eax" vec-tag) (emit "addl $11, %ecx") ; pad size to the next multiple of 8 bytes (emit "andl $-8, %ecx") (emit "addl %ecx, %esi") ; advance allocation pointer (if (null? (cdddr form)) '() ; no extra work for one-arg form (begin ; emit initialization code ; what should we fill with? (emit "movl %eax, ~a(%esp)" si) ; save the pointer (compile-expr (primitive-op-arg2 form) (- si wordsize) env) ; load target pointer and count (emit "movl ~a(%esp), %edi" si) ; ptr (emit "subl $~a, %edi" vec-tag) (emit "movl 0(%edi), %ecx") ; count (emit "addl $~a, %edi" wordsize) ; don't write to length ; initialize items and return (emit "rep stosl") ; fill with word (emit "movl ~a(%esp), %eax" si)))) ; recover final value ))
The two primitives work almost identically; they save the length, get the number
of bytes to allocate, advance the allocation pointer, then initialize the array
if needed. Note that make-vector
has to multiply the length by 4 (the word
size) to get bytes, whereas make-string
can use it directly since characters
are one byte each.
For initialization, they use variants of the rep stos
instruction, which
copies the lower section of eax
repeated ecx
times into the buffer starting
at edi
. Depending on the suffix (e.g. stosb
vs stosl
), the section of
eax
taken and the amount that edi
moves forward each time change. For
strings, we use rep stosb
, which copies one byte at a time, and for vectors we
use rep stosl
, which copies one word at a time. We could probably do this more
efficiently with vectorized code, but for now we're going for simplicity.
Once we have the allocation functions, we can move on to some of the other string and vector-related primitives. Specifically, the get/set functions, which access or modify the element at a given index.
(define (compile-primitive-call form si env) (case (primitive-op form) ... ; (string-ref str k) - get the kth character of str ((string-ref) (compile-expr (primitive-op-arg1 form) si env) ; get str (emit "subl $~a, %eax" str-tag) ; untag pointer (emit "addl $~a, %eax" wordsize) ; skip length word (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) (emit "shrl $~a, %eax" fixnum-shift) ; get index (emit "addl ~a(%esp), %eax" si) ; offset pointer (emit "movzbl (%eax), %eax") (emit "shl $~a, %eax" char-shift) ; tag as char (emit "orl $~a, %eax" char-tag)) ; (string-set! str k c) - set the kth character of str to c ((string-set!) (compile-expr (primitive-op-arg1 form) si env) ; get str (emit "subl $~a, %eax" str-tag) ; untag pointer (emit "addl $~a, %eax" wordsize) ; skip length word (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) (emit "shrl $~a, %eax" fixnum-shift) ; get index (emit "addl %eax, ~a(%esp)" si) ; offset pointer (compile-expr (primitive-op-arg3 form) (- si wordsize) env) ; value (emit "shrl $~a, %eax" char-shift) ; get character value (emit "movl ~a(%esp), %ecx" si) ; store (emit "movb %al, (%ecx)") (emit "xorl %eax, %eax")) ; return 0 ; (vector-ref vec k) - get the kth element of vec ((vector-ref) (compile-expr (primitive-op-arg1 form) si env) ; get vec (emit "subl $~a, %eax" vec-tag) ; untag pointer (emit "addl $~a, %eax" wordsize) ; skip length word (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) (emit "shrl $~a, %eax" (- fixnum-shift 2)) ; fixnum->idx (*4) (emit "addl ~a(%esp), %eax" si) ; offset pointer (emit "movl (%eax), %eax")) ; load and return ; (vector-set! vec k v) - set the kth element of vec to v ((vector-set!) (compile-expr (primitive-op-arg1 form) si env) ; get vec (emit "subl $~a, %eax" vec-tag) ; untag pointer (emit "addl $~a, %eax" wordsize) ; skip length word (emit "movl %eax, ~a(%esp)" si) (compile-expr (primitive-op-arg2 form) (- si wordsize) env) (emit "shrl $~a, %eax" (- fixnum-shift 2)) ; fixnum->idx (*4) (emit "addl %eax, ~a(%esp)" si) ; offset pointer (compile-expr (primitive-op-arg3 form) (- si wordsize) env) ; value (emit "movl ~a(%esp), %ecx" si) ; store (emit "movl %eax, (%ecx)") (emit "xorl %eax, %eax")) ; return 0 ))
Again, the implementations for strings and vectors are almost identical, save
for a few index conversions to deal with the differing element sizes. Later on,
we'll consolidate this code so we aren't repeating ourselves quite so much. To
finish up the compiler side of our implementation, we implement the last few
vector and string primitives: string-length
/vector-length
, string?
, and
vector?
. The length operations just load from the length field of the given
pointer, and the type-checking functions work just the same way as all our other
type predicates.
(define (compile-primitive-call form si env) (case (primitive-op form) ... ; string? - return whether an object is a string ((string?) (compile-expr (primitive-op-arg1 form) si env) (emit "andl $~a, %eax" ptr-mask) (emit-is-eax-equal-to str-tag)) ; vector? - return whether an object is a vector ((vector?) (compile-expr (primitive-op-arg1 form) si env) (emit "andl $~a, %eax" ptr-mask) (emit-is-eax-equal-to vec-tag)) ; string-length - get the length in chars of a string ((string-length) (compile-expr (primitive-op-arg1 form) si env) (emit "movl ~a(%eax), %eax" (- str-tag)) ; get length (emit "sall $~a, %eax" fixnum-shift)) ; convert to fixnum ; vector-length - get the length in chars of a vector ((vector-length) (compile-expr (primitive-op-arg1 form) si env) (emit "movl ~a(%eax), %eax" (- vec-tag)) ; get length (emit "sall $~a, %eax" fixnum-shift)) ; convert to fixnum ))
We also need to update our runtime system so it knows how to display strings and vectors. If we were to try, the current version would just ignore it. The implementation here isn't too complex - it just loads the length and prints that many elements.
void show(int x) { ... } else if((x & PTR_MASK) == STR_TAG) { int* ptr = (int*)(x - STR_TAG); int len = *ptr; char* body = (char*)(ptr+1); putchar('"'); for(;len > 0;len--) putchar(*body++); putchar('"'); } else if((x & PTR_MASK) == VEC_TAG) { int* ptr = (int*)(x - VEC_TAG); int len = *ptr++; printf("#("); for(;len > 0;len--) { show(*ptr++); if(len != 1) putchar(' '); } printf(")"); } }
Now that we have everything written, let's test it! If you're following along and implementing things yourself, you should keep in mind that I actually test more often than I'm showing here; I just don't want to clutter these posts with all the output I'd have to include if I showed every test run.
scheme@(guile-user)> (load "compiler.scm") ... scheme@(guile-user)> (compile-and-run '(primcall make-string 10 #\c)) "cccccccccc" scheme@(guile-user)> (compile-and-run '(primcall make-vector 10 27)) #(27 27 27 27 27 27 27 27 27 27)
At this point, our Scheme can handle most of the primitive datatypes needed to run itself, and has support for quite a few basic primitives. Next time, we'll add support for function calls, closures, and tail call recursion.