Two tiny Lisp evaluators

:: lisp, programming

Everyone who has written Lisp has written tiny Lisp evaluators in Lisp: here are two more.

Following two recent articles I wrote on scope and extent in Common Lisp, I thought I would finish with two very tiny evaluators for dynamically and lexically bound variants on a tiny Lisp.

The language

The tiny Lisp these evaluators interpret is not minimal: it has constructs other than lambda, and even has assignment. But it is pretty small. Other than the binding rules the languages are identical.

  • λ & lambda are synonyms and construct procedures, which can take any number of arguments;
  • quote quotes its argument;
  • if is conditional expression (the else part is optional);
  • set! is assignment and mutates a binding.

That is all that exists.

Both evaluators understand primitives, which are usually just functions in the underlying Lisp: since the languages are Lisp–1s, you could also expose other sorts of things of course (for instance true and false values). You can provide a list of initial bindings to them to define useful primitives.

Requirements

Both evaluators rely on my iterate and spam hacks: they could easily be rewritten not to do so.

The dynamic evaluator

A procedure is represented by a structure which has a list of formals and a body of one or more forms.

(defstruct (procedure
            (:print-function
             (lambda (p s d)
               (declare (ignore d))
               (print-unreadable-object (p s)
                 (format s "λ ~S" (procedure-formals p))))))
  (formals '())
  (body '()))

The evaluator simply dispatches on the type of thing and then on the operator for compound forms.

(defun evaluate (thing bindings)
  (typecase thing
    (symbol
     (let ((found (assoc thing bindings)))
       (unless found
         (error "~S unbound" thing))
       (cdr found)))
    (list
     (destructuring-bind (op . arguments) thing
       (case op
         ((lambda λ)
          (matching arguments
            ((head-matches (list-of #'symbolp))
             (make-procedure :formals (first arguments)
                             :body (rest arguments)))
            (otherwise
             (error "bad lambda form ~S" thing))))
         ((quote)
          (matching arguments
            ((list-matches (any))
             (first arguments))
            (otherwise
             (error "bad quote form ~S" thing))))
         ((if)
          (matching arguments
            ((list-matches (any) (any))
             (if (evaluate (first arguments) bindings)
                 (evaluate (second arguments) bindings)))
            ((list-matches (any) (any) (any))
             (if (evaluate (first arguments) bindings)
                 (evaluate (second arguments) bindings)
               (evaluate (third arguments) bindings)))
            (otherwise
             (error "bad if form ~S" thing))))
         ((set!)
          (matching arguments
            ((list-matches #'symbolp (any))
             (let ((found (assoc (first arguments) bindings)))
               (unless found
                 (error "~S unbound" (first arguments)))
               (setf (cdr found) (evaluate (second arguments) bindings))))
            (otherwise
             (error "bad set! form ~S" thing))))
         (t
          (applicate (evaluate (first thing) bindings)
                     (mapcar (lambda (form)
                               (evaluate form bindings))
                             (rest thing))
                     bindings)))))
    (t thing)))

The interesting thing here is that applicate needs to know the current set of bindings so it can extend them dynamically.

Here is applicate which has a case for primitives and procedures

(defun applicate (thing arguments bindings)
  (etypecase thing
    (function
     ;; a primitive
     (apply thing arguments))
    (procedure
     (iterate bind ((vtail (procedure-formals thing))
                    (atail arguments)
                    (extended-bindings bindings))
       (cond
        ((and (null vtail) (null atail))
         (iterate eval-body ((btail (procedure-body thing)))
           (if (null (rest btail))
               (evaluate (first btail) extended-bindings)
             (progn
               (evaluate (first btail) extended-bindings)
               (eval-body (rest btail))))))
        ((null vtail)
         (error "too many arguments"))
        ((null atail)
         (error "not enough arguments"))
        (t
         (bind (rest vtail)
               (rest atail)
               (acons (first vtail) (first atail)
                      extended-bindings))))))))

The thing that makes this evaluator dynamic is that the bindings that applicate extends are those it was given: procedures do not remember bindings.

The lexical evaluator

A procedure is represented by a structure as before, but this time it has a set of bindings associated with it: the bindings in place when it was created.

(defstruct (procedure
            (:print-function
             (lambda (p s d)
               (declare (ignore d))
               (print-unreadable-object (p s)
                 (format s "λ ~S" (procedure-formals p))))))
  (formals '())
  (body '())
  (bindings '()))

The evaluator is almost identical:

(defun evaluate (thing bindings)
  (typecase thing
    (symbol
     (let ((found (assoc thing bindings)))
       (unless found
         (error "~S unbound" thing))
       (cdr found)))
    (list
     (destructuring-bind (op . arguments) thing
       (case op
         ((lambda λ)
          (matching arguments
            ((head-matches (list-of #'symbolp))
             (make-procedure :formals (first arguments)
                             :body (rest arguments)
                             :bindings bindings))
            (otherwise
             (error "bad lambda form ~S" thing))))
         ((quote)
          (matching arguments
            ((list-matches (any))
             (first arguments))
            (otherwise
             (error "bad quote form ~S" thing))))
         ((if)
          (matching arguments
            ((list-matches (any) (any))
             (if (evaluate (first arguments) bindings)
                 (evaluate (second arguments) bindings)))
            ((list-matches (any) (any) (any))
             (if (evaluate (first arguments) bindings)
                 (evaluate (second arguments) bindings)
               (evaluate (third arguments) bindings)))
            (otherwise
             (error "bad if form ~S" thing))))
         ((set!)
          (matching arguments
            ((list-matches #'symbolp (any))
             (let ((found (assoc (first arguments) bindings)))
               (unless found
                 (error "~S unbound" (first arguments)))
               (setf (cdr found) (evaluate (second arguments) bindings))))
            (otherwise
             (error "bad set! form ~S" thing))))
         (t
          (applicate (evaluate (first thing) bindings)
                     (mapcar (lambda (form)
                               (evaluate form bindings))
                             (rest thing)))))))
    (t thing)))

The differences are that when constructing a procedure the current bindings are recorded in the procedure, and it is no longer necessary to pass bindings to applicate.

applicate is also almost identical:

(defun applicate (thing arguments)
  (etypecase thing
    (function
     ;; a primitive
     (apply thing arguments))
    (procedure
     (iterate bind ((vtail (procedure-formals thing))
                    (atail arguments)
                    (extended-bindings (procedure-bindings thing)))
       (cond
        ((and (null vtail) (null atail))
         (iterate eval-body ((btail (procedure-body thing)))
           (if (null (rest btail))
               (evaluate (first btail) extended-bindings)
             (progn
               (evaluate (first btail) extended-bindings)
               (eval-body (rest btail))))))
        ((null vtail)
         (error "too many arguments"))
        ((null atail)
         (error "not enough arguments"))
        (t
         (bind (rest vtail)
               (rest atail)
               (acons (first vtail) (first atail)
                      extended-bindings))))))))

The difference is that the bindings it extends when binding arguments are the bindings which the procedure remembered, not the dynamically-current bindings, which it does not even know.

The difference between them

Here is the example that shows how these two evaluators differ.

With the dynamic evaluator:

? ((λ (f)
     ((λ (x)
        ;; bind x to 1 around the call to f
        (f))
      1))
   ((λ (x)
      ;; bind x to 2 when the function that will be f is created
      (λ () x))
    2))
1

The binding in effect is the dynamically current one, not the one that was in effect when the procedure was created.

With the lexical evaluator:

? ((λ (f)
     ((λ (x)
        ;; bind x to 1 around the call to f
        (f))
      1))
   ((λ (x)
      ;; bind x to 2 when the function that will be f is created
      (λ () x))
    2))
2

Now the binding in effect is the one that existed when the procedure was created.

Something more interesting is how you create recursive procedures in the lexical evaluator. With suitable bindings for primitives, it’s easy to see that this can’t work:

((λ (length)
   (length '(1 2 3)))
 (λ (l)
   (if (null? l)
       0
       (+ (length (cdr l)) 1))))

It can’t work because length is not in scope in the body of length. it will work in the dynamic evaluator.

The first fix, which is similar to what Scheme does with letrec, is to use assignment to mutate the binding so it is correct:

((λ (length)
   (set! length (λ (l)
                  (if (null? l)
                      0
                      (+ (length (cdr l)) 1))))
   (length '(1 2 3)))
 0)

Note the initial value of length is never used.

The second fix is to use something like the U combinator (you could use Y of course: I think U is simpler to understand):

((λ (length)
   (length '(1 2 3)))
 (λ (l)
   ((λ (c)
      (c c l 0))
    (λ (c t s)
      (if (null? t)
          s
          (c c (cdr t) (+ s 1)))))))

Source code

These two evaluators, together with a rudimentary REPL which can use either of them, can be found here.