Let's Build a Compiler 03: Locals, Conditionals, and the Heap

20 minute read Published: 2019-03-30

Previously, we got binary and unary primitive operations working and set up automated testing. This time around, we'll add support for more basic forms and add support for an incredibly simplistic memory allocator.

Local Variables

Since we added stack indices last time around, adding local variables is less complex than you might expect. We'll extend our compiler to handle the form (let ((var0 val0) (var1 val1) ...) body ...), which binds variables to the values of the given expressions, then evaluates each of the body forms in order, returning the value of the last one.

To implement this, we'll need a place to store the values of each variable, a way to track the bindings for each variable, and extensions to compile-expr to handle the new types of form. Since we already have stack handling integrated, we can just store all the bindings there, but we'll need to extend our compiler by threading an environment structure through. For simplicity, our environment will just be a list of the form ((name . value) (name2 . value2) ...); a hash table or binary tree would be more efficient, but for now our priority is to get things working, not to optimize them. The Scheme standard includes a function assoc we can use to look things up in this structure (called an "association list" in the standard).

First, we augment our existing functions to thread an environment through the calls, and add a function to handle let expressions.

(define (compile-primitive-call form si env)
  (case (primitive-op form)
    ...
    all existing primitives, with env added to the end of compile-expr calls
    ...
   ))

(define (let? x) (eq? (car x) 'let))
(define (compile-let bindings body si env)
  ...
  )

(define (compile-expr e si env)
  (cond
    ((immediate? e) (emit "movl $~a, %eax" (immediate-rep e)))
    ((let? e) (compile-let (cadr e) (cddr e) si env))
    ((primitive-call? e) (compile-primitive-call e si env))
    ))

(define (compile-program program)
  ...
  (compile-expr program (- wordsize) '())
  ...)

Now that let forms are recognized, we can actually implement the compilation in compile-let. For a given let, we have to evaluate each of the binding values, store them to their locations on the stack, then loop through each body form and evaluate them. For now, we can just assume that values in the environment are offsets from the stack pointer that can be used to access that value.

; build a list of numbers (a ... b-1)
(define (range a b)
  (if (= a b) '()
      (cons a (range (+ 1 a) b))))

(define (compile-let bindings body si env)
  (let* ((stack-offsets (map (lambda (x) (- si (* x wordsize)))
                             (range 0 (length bindings))))
         (names (map car bindings))
         (exprs (map cadr bindings))
         (inner-si (- si (* (length bindings) wordsize)))
         (inner-env (append (map cons names stack-offsets) env)))
    ; evaluate exprs and assign them to stack locations
    (for-each (lambda (expr offset)
                (compile-expr expr inner-si env)
                (emit "movl %eax, $~a(%esp)" offset))
              exprs stack-offsets)

    ; evaluate all body forms - this will leave the last one's output in %eax
    (for-each (lambda (form) (compile-expr form inner-si inner-env)) body)
    ))

This is a little more complex than anything we've done so far, so let's unpack it a bit. First, it defines a list stack-offsets with elements corresponding to the stack offsets of each binding. For example, given the binding list ((x 1) (y 2)), we get stack-offsets = (-4 -8), meaning x is stored at offset -4 from %esp, y is stored at -8, and inner expressions are evaluated with si = -12. At the same time, we also define inner-si and inner-env, which are the values of si and env used inside the let form's scope.

      bindings  (x 1)   (y 2)   (z 3)
 stack-offsets   -4      -8      -12
         names    x       y       z
        values    1       2       3

In the Scheme definition, let doesn't actually apply the bindings until the values are evaluated, so we first evaluate each of the binding expressions with our original environment and the new si, then store their %eax values at the correct offsets onto the stack. Next, we loop through each body form and evaluate it with the new stack index and environment, so our newly-defined names are visible in that scope.

We also have to make sure variables are actually accessible. At the moment, our compile-expr definition only handles immediates, let forms, and primitive calls. If we try to compile x or another variable, nothing happens.

(define (variable? x) (symbol? x))

(define (compile-var-load v si env)
  (emit "movl ~a(%esp), %eax" (cdr (assoc v env))))

(define (compile-expr e si env)
  (cond
    ...
    ((variable? e) (compile-var-load e si env))
    ...
    ))

Since the values of each variable in the environment refer to their appropriate stack offsets, all we have to do in order to evaluate a variable form my-var is look it up in the environment, then emit movl var_offset(%esp), %eax to pull the value out of its storage location on the stack.

Now we can reload the program in our REPL and test it:

scheme@(guile-user)> (load "compiler.scm")
...
scheme@(guile-user)> (compile-and-run '(let ((x 1)) x))
1
scheme@(guile-user)> (compile-and-run '(let ((x (primcall add1 1))) x))
2
scheme@(guile-user)> (compile-and-run '(let ((x 1) (y 4)) (primcall + x y)))
5

Success! We can define local variables and evaluate them! If we wanted to extend this, we could also check the variable names against the passed environment to make sure they actually exist and emit an error if they don't. That's for later, though - right now we just want to get something working.

Conditional Expressions

Next up, it's time to implement if. In Scheme, this means handling the form (if test true-body false-body), which evaluates test, and evaluates true-body if the result isn't equal to #f or false-body if it is. For our implementation, we'll also support omitting false-body, in which case it'll default to #f.

The implementation follows the same pattern we've seen for other types of form: augment compile-expr to handle it, then delegate to a specialized compile-if function to generate the actual code.

(define (if? x) (eq? (car x) 'if))

(define (compile-if test t-body f-body si env)
  ...)

(define (compile-expr e si env)
  (cond
    ...
    ((if? e) (compile-if (cadr e) (caddr e)
                         (if (null? (cdddr e)) #f (cadddr e))
                         si env))
    ...
    ))

To implement compile-if, we have to generate code that looks something like this:

             ...
             [evaluate test]
             cmpl #f, %eax
             je false_label
             [evaluate true-body]
             jmp end_label
false_label: [evaluate false-body]
end_label:   ...

However, in order to avoid conflicts, we'll have to generate unique names and use them in place of true-label and false-label. Since this needs to persist across calls, we'll set up a new global variable and keep updating it every time we need a new label, then reset it every time we run compile-program. Ideally it'd be thread-local, but standard Scheme doesn't support that, so we'll use a global for now.

(define current-label 0)
(define (unique-label)
  (let ((l current-label)) (set! current-label (+ 1 l))
    (format #f "l_~a:" l)))

(define (compile-if test t-body f-body si env)
  (let ((false-label (unique-label))
        (end-label (unique-label)))
    (compile-expr test si env)
    (emit "cmpl $~a, %eax" (immediate-rep #f))
    (emit "je ~a" false-label)
    (compile-expr t-body si env)
    (emit "jmp ~a" end-label)
    (emit "~a:" false-label)
    (compile-expr f-body si env)
    (emit "~a:" end-label)))

(define (compile-program program)
  (set! current-label 0)
  ...)

The actual implementation in our compiler is pretty much a 1:1 translation of the code shown above to Lisp, when the addition of a (unique-label) function to generate a new unused label for branching. We don't need to handle the default value for f-body inside compile-if since the condition in compile-expr will pass the correct literal if the input leaves that argument out.

A Primitive Heap

At this point, we're starting to run out of features that can be implemented without finally facing the music and tackling memory allocation. That doesn't mean writing a full garbage collector quite yet, but for vectors, pairs, strings and the like we'll need a way to store data in memory. For now, let's use a bump allocator - we keep a pointer to the first free byte, then increment that pointer whenever we allocate memory. Since we're not dealing with larger programs quite yet, we can just allocate a big "heap" buffer in our runtime and ignore the possibility of out-of-memory until later.

First, we modify the runtime system to allocate a 4MB heap buffer and pass it to our Scheme code.

#define HEAP_SIZE       0x400000

__attribute__((__cdecl__))
extern int scheme_entry(void* heap);

int main(int argc, const char **argv) {
    void* heap = aligned_alloc(8, HEAP_SIZE);
    int val = scheme_entry(heap);
    show(val);
    printf("\n");
    free(heap);
    return 0;
}

So far our Scheme code obeys the convention that %eax holds the result of each expression. Since we'll probably be allocating a lot, we'll add a second part to that and assert that %esi always holds a pointer to the first unallocated byte on the heap. That's not part of the cdecl calling convention, so we'll have to update our function preamble to fetch the pointer parameter and store it in %esi before evaluating our actual code.

When we enter scheme_entry from C, our stack looks like this:

   low addresses
  ---------------
  |             |
  |     ...     |
  |             |
  |-------------|
  |             |
  |             |
  |  our data   |
  |             |
  |-------------|
  |     esi     | <---- %esp
  |     edi     |
  |     edx     |
  |-------------|
  | return addr |
  |-------------|
  |  heap ptr   |
  |-------------|
  | caller %ebp | <---- %ebp
  |-------------|
  |     ...     |
  |-------------|
   high addresses

The first thing we do is save the old values of %esi (our heap pointer), %edi, and %edx (which is overwritten when we multiply integers). Then we have to fetch our heap pointer into %esi by accessing the heap pointer, stored 16 bytes (return addr + 3 words for saved registers) before the stack pointer.

(define (compile-program program)
  ...
  (emit "movl 16(%esp), %esi")
  ...)

Pair Allocation

To illustrate the basic process for memory allocation, let's implement the (cons x y) primitive, which constructs a pair with the car field set to x adn the cdr field set to y. To implement it, we just evaluate x and y, store the results in the next two words after the allocation pointer, bump it up by 8 bytes, then tag the resulting pointer based on the data model we defined in part 1. Note how this implementation stores the cdr first, so it can re-use %eax to hold the car value as it's loaded off the stack.

(define (immediate-rep x)
  (cond ...
        ((null? x) pair-tag)))


(define (compile-primitive-call form si env)
  (case (primitive-op form)
    ...
    ; cons - build a pair from the two arguments
    ((cons)
     ; evaluate x and y
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env)
     ; store cdr
     (emit "movl %eax, ~a(%esi)" wordsize)
     ; store car
     (emit "movl ~a(%esp), %eax" si)
     (emit "movl %eax, 0(%esi)")
     ; save pointer and tag it, then increment heap ptr
     (emit "movl %esi, %eax")
     (emit "orl $~a, %eax" pair-tag)
     (emit "addl $~a, %esi" (* 2 wordsize)))
  ))

Before we actually test it, we need to update our runtime system's show function to support pairs. This is also a good time to define the representation for the null object, () - it'll be a pair pointer whose address is zero. By convention, pairs whose cdr points to another pair are flattened, so (1 2 3) is printed instead of (1 (2 (3 ()))). However, when the cdr isn't a pair, the printer should put a dot between the other elements and the cdr. If we want to see lists correctly in our output, we need to add that logic to the runtime system.

void show(int x) {
    ...
    } else if((x & PTR_MASK) == PAIR_TAG) {
        int* ptr = (int*)(x - PAIR_TAG);
        if(ptr == NULL) {
            printf("()");
            return;
        }

        // either a list or a dotted pair
        int car = ptr[0];
        int cdr = ptr[1];
        putchar('(');
        show(car);

        // show additional space-separated elems
        while((cdr & PTR_MASK) == PAIR_TAG) {
            ptr = (int*)(cdr - PAIR_TAG);
            if(ptr == NULL) break;

            car = ptr[0];
            cdr = ptr[1];
            putchar(' ');
            show(car);
        }

        // show dotted pair notation if relevant
        if((cdr & PTR_MASK) != PAIR_TAG) {
            printf(" . ");
            show(cdr);
        }
        putchar(')');
    }
}

Finally, we can test out our new implementation of cons by building and printing some lists!

scheme@(guile-user)> (load "compiler.scm")
...
scheme@(guile-user)> (compile-and-run '(primcall cons 6 (primcall cons 1 ())))
(6 1)
scheme@(guile-user)> (compile-and-run '(primcall cons 6 (primcall cons 1 9)))
(6 1 . 9)

While we're at it, we should probably implement a few more important pair-related primitives: null?, car, cdr, set-car!, and set-cdr!. The implementations aren't that complex; they all pretty much boil down to "subtract the pair tag from the input to get a pointer, then do whatever is needed with the structure at that address." There's a clever optimization we can do, though; instead of mutating the pointer in %eax, we can just use a constant offset from it to access its fields, since we already know how many bytes we need to go; subtracting the tag and then adding a constant k is the same thing as adding k-tag, which can be encoded as one instruction rather than two. We also end up needing a spare register in set-car! and set-cdr!, but we're already treating %edx as disposable anyway due to the way integer multiplies work, so we can use that.

(define (compile-primitive-call form si env)
  (case (primitive-op form)
    ...
    ; car - get the car of a pair
    ((car)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "movl ~a(%eax), %eax" (- pair-tag))) ; 0(%eax) plus tag offset

    ; cdr - get the cdr of a pair
    ((cdr)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "movl ~a(%eax), %eax" (- 4 pair-tag))) ; 4(%eax) plus tag offset

    ; null? - check whether an object is null
    ((null?)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit-is-eax-equal-to pair-tag))

    ; (set-car! pair obj) - mutate the car of a pair
    ((set-car!)
     (compile-expr (primitive-op-arg1 form) si env) ; pair
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env) ; obj
     (emit "movl ~a(%esp), %edx" si)
     (emit "movl %eax, ~a(%edx)" (- pair-tag)))

    ; (set-cdr! pair obj) - mutate the cdr of a pair
    ((set-cdr!)
     (compile-expr (primitive-op-arg1 form) si env) ; pair
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env) ; obj
     (emit "movl ~a(%esp), %edx" si)
     (emit "movl %eax, ~a(%edx)" (- 4 pair-tag)))
  ))

Vector and String Allocation

Vectors and strings are almost identical conceptually, but require slightly different implementations due to the differences in how their contents are stored. Both of them are basically variable-length packed sequences of objects. For strings, the objects are bytes, and for lists they're words. In memory, we can store them as a packed array of their contents preceded by one unsigned word holding the length of the array.

                        STRING
  ------------------------------------------------
  |           |  |  |  |  |  |  |  |  |  |  |  
  |   length  |c0|c1|c2|c3|c4|c5|c6|c7|c8|c9| ...
  |           |  |  |  |  |  |  |  |  |  |  |  
  ------------------------------------------------
   00 01 02 03 04 05 06 07 08 09 ...

                        VECTOR
  -----------------------------------------------------
  |           |           |           |           |
  |   length  |   elem 0  |   elem 1  |   elem 2  | ...
  |           |           |           |           |
  -----------------------------------------------------
   00 01 02 03 04 05 06 07 08 09 ...

Now that we know how things are going to be laid out in memory, let's start by implementing the string and vector allocation functions, make-string and make-vector. These take either one or two arguments; the first one is always the length of the buffer, and the second, if present, is a value the buffer should be filled with. Just like with cons, we can just start storing data at %esi, then move it forward once we've built our object.

(define (compile-primitive-call form si env)
  (case (primitive-op form)
    ...

    ; (make-string k)   - construct a string of length k
    ; (make-string k c) - construct a string of length k filled with c
    ((make-string)
     (compile-expr (primitive-op-arg1 form) si env) ; deal with the length
     (emit "shr $~a, %eax" fixnum-shift) ; remove fixnum tag
     (emit "movl %eax, 0(%esi)") ; store length into new structure
     (emit "movl %eax, %ecx") ; save actual length while we tag the pointer
     (emit "movl %esi, %eax") ; tag the result ptr
     (emit "orl $~a, %eax" str-tag)
     (emit "addl $11, %ecx") ; pad size to the next multiple of 8 bytes
     (emit "andl $-8, %ecx")
     (emit "addl %ecx, %esi") ; advance allocation pointer
     (if (null? (cdddr form))
         '() ; no extra work for one-arg form
         (begin ; emit initialization code
           ; what should we fill with?
           (emit "movl %eax, ~a(%esp)" si) ; save the pointer
           (compile-expr (primitive-op-arg2 form) (- si wordsize) env)
           (emit "shrl $~a, %eax" char-shift) ; grab character into AL

           ; perform fill - rep stosb takes from AL, so we don't need to mask
           ; out any higher bits
           (emit "movl ~a(%esp), %edi" si) ; target pointer
           (emit "subl $~a, %edi" str-tag)
           (emit "movl 0(%edi), %ecx") ; load buffer length
           (emit "addl $~a, %edi" wordsize) ; skip over length word
           (emit "rep stosb") ; fill with byte

           (emit "movl ~a(%esp), %eax" si)))) ; recover final value

    ; (make-vector k)   - construct a vector of length k
    ; (make-vector k o) - construct a vector of length k with cells set to o
    ((make-vector)
     (compile-expr (primitive-op-arg1 form) si env) ; deal with the length
     (emit "shr $~a, %eax" fixnum-shift) ; remove fixnum tag
     (emit "movl %eax, 0(%esi)") ; store length into new structure
     (emit "movl %eax, %ecx") ; save actual length
     (emit "sall $2, %ecx") ; multiply by 4 to get bytes
     (emit "movl %esi, %eax") ; tag the result ptr
     (emit "orl $~a, %eax" vec-tag)
     (emit "addl $11, %ecx") ; pad size to the next multiple of 8 bytes
     (emit "andl $-8, %ecx")
     (emit "addl %ecx, %esi") ; advance allocation pointer
     (if (null? (cdddr form))
         '() ; no extra work for one-arg form
         (begin ; emit initialization code
           ; what should we fill with?
           (emit "movl %eax, ~a(%esp)" si) ; save the pointer
           (compile-expr (primitive-op-arg2 form) (- si wordsize) env)

           ; load target pointer and count
           (emit "movl ~a(%esp), %edi" si) ; ptr
           (emit "subl $~a, %edi" vec-tag)
           (emit "movl 0(%edi), %ecx") ; count
           (emit "addl $~a, %edi" wordsize) ; don't write to length

           ; initialize items and return
           (emit "rep stosl") ; fill with word
           (emit "movl ~a(%esp), %eax" si)))) ; recover final value
  ))

The two primitives work almost identically; they save the length, get the number of bytes to allocate, advance the allocation pointer, then initialize the array if needed. Note that make-vector has to multiply the length by 4 (the word size) to get bytes, whereas make-string can use it directly since characters are one byte each.

For initialization, they use variants of the rep stos instruction, which copies the lower section of eax repeated ecx times into the buffer starting at edi. Depending on the suffix (e.g. stosb vs stosl), the section of eax taken and the amount that edi moves forward each time change. For strings, we use rep stosb, which copies one byte at a time, and for vectors we use rep stosl, which copies one word at a time. We could probably do this more efficiently with vectorized code, but for now we're going for simplicity.

Once we have the allocation functions, we can move on to some of the other string and vector-related primitives. Specifically, the get/set functions, which access or modify the element at a given index.

(define (compile-primitive-call form si env)
  (case (primitive-op form)
    ...

    ; (string-ref str k) - get the kth character of str
    ((string-ref)
     (compile-expr (primitive-op-arg1 form) si env) ; get str
     (emit "subl $~a, %eax" str-tag) ; untag pointer
     (emit "addl $~a, %eax" wordsize) ; skip length word
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env)
     (emit "shrl $~a, %eax" fixnum-shift) ; get index
     (emit "addl ~a(%esp), %eax" si) ; offset pointer
     (emit "movzbl (%eax), %eax")
     (emit "shl $~a, %eax" char-shift) ; tag as char
     (emit "orl $~a, %eax" char-tag))

    ; (string-set! str k c) - set the kth character of str to c
    ((string-set!)
     (compile-expr (primitive-op-arg1 form) si env) ; get str
     (emit "subl $~a, %eax" str-tag) ; untag pointer
     (emit "addl $~a, %eax" wordsize) ; skip length word
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env)
     (emit "shrl $~a, %eax" fixnum-shift) ; get index
     (emit "addl %eax, ~a(%esp)" si) ; offset pointer
     (compile-expr (primitive-op-arg3 form) (- si wordsize) env) ; value
     (emit "shrl $~a, %eax" char-shift) ; get character value
     (emit "movl ~a(%esp), %ecx" si) ; store
     (emit "movb %al, (%ecx)")
     (emit "xorl %eax, %eax")) ; return 0

    ; (vector-ref vec k) - get the kth element of vec
    ((vector-ref)
     (compile-expr (primitive-op-arg1 form) si env) ; get vec
     (emit "subl $~a, %eax" vec-tag) ; untag pointer
     (emit "addl $~a, %eax" wordsize) ; skip length word
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env)
     (emit "shrl $~a, %eax" (- fixnum-shift 2)) ; fixnum->idx (*4)
     (emit "addl ~a(%esp), %eax" si) ; offset pointer
     (emit "movl (%eax), %eax")) ; load and return

    ; (vector-set! vec k v) - set the kth element of vec to v
    ((vector-set!)
     (compile-expr (primitive-op-arg1 form) si env) ; get vec
     (emit "subl $~a, %eax" vec-tag) ; untag pointer
     (emit "addl $~a, %eax" wordsize) ; skip length word
     (emit "movl %eax, ~a(%esp)" si)
     (compile-expr (primitive-op-arg2 form) (- si wordsize) env)
     (emit "shrl $~a, %eax" (- fixnum-shift 2)) ; fixnum->idx (*4)
     (emit "addl %eax, ~a(%esp)" si) ; offset pointer
     (compile-expr (primitive-op-arg3 form) (- si wordsize) env) ; value
     (emit "movl ~a(%esp), %ecx" si) ; store
     (emit "movl %eax, (%ecx)")
     (emit "xorl %eax, %eax")) ; return 0
  ))

Again, the implementations for strings and vectors are almost identical, save for a few index conversions to deal with the differing element sizes. Later on, we'll consolidate this code so we aren't repeating ourselves quite so much. To finish up the compiler side of our implementation, we implement the last few vector and string primitives: string-length/vector-length, string?, and vector?. The length operations just load from the length field of the given pointer, and the type-checking functions work just the same way as all our other type predicates.

(define (compile-primitive-call form si env)
  (case (primitive-op form)
    ...

    ; string? - return whether an object is a string
    ((string?)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "andl $~a, %eax" ptr-mask)
     (emit-is-eax-equal-to str-tag))

    ; vector? - return whether an object is a vector
    ((vector?)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "andl $~a, %eax" ptr-mask)
     (emit-is-eax-equal-to vec-tag))

    ; string-length - get the length in chars of a string
    ((string-length)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "movl ~a(%eax), %eax" (- str-tag)) ; get length
     (emit "sall $~a, %eax" fixnum-shift)) ; convert to fixnum

    ; vector-length - get the length in chars of a vector
    ((vector-length)
     (compile-expr (primitive-op-arg1 form) si env)
     (emit "movl ~a(%eax), %eax" (- vec-tag)) ; get length
     (emit "sall $~a, %eax" fixnum-shift)) ; convert to fixnum
  ))

We also need to update our runtime system so it knows how to display strings and vectors. If we were to try, the current version would just ignore it. The implementation here isn't too complex - it just loads the length and prints that many elements.

void show(int x) {
    ...
    } else if((x & PTR_MASK) == STR_TAG) {
        int* ptr = (int*)(x - STR_TAG);
        int len = *ptr;
        char* body = (char*)(ptr+1);
        putchar('"');
        for(;len > 0;len--) putchar(*body++);
        putchar('"');
    } else if((x & PTR_MASK) == VEC_TAG) {
        int* ptr = (int*)(x - VEC_TAG);
        int len = *ptr++;

        printf("#(");
        for(;len > 0;len--) {
            show(*ptr++);
            if(len != 1) putchar(' ');
        }
        printf(")");
    }
}

Now that we have everything written, let's test it! If you're following along and implementing things yourself, you should keep in mind that I actually test more often than I'm showing here; I just don't want to clutter these posts with all the output I'd have to include if I showed every test run.

scheme@(guile-user)> (load "compiler.scm")
...
scheme@(guile-user)> (compile-and-run '(primcall make-string 10 #\c))
"cccccccccc"
scheme@(guile-user)> (compile-and-run '(primcall make-vector 10 27))
#(27 27 27 27 27 27 27 27 27 27)

At this point, our Scheme can handle most of the primitive datatypes needed to run itself, and has support for quite a few basic primitives. Next time, we'll add support for function calls, closures, and tail call recursion.