Generic interfaces in Racket

:: computer, lisp

Or: things you do to distract yourself from watching an attempted fascist coup.

A thing that exists in many languages with a notion of a sequence of objects is a function variously known as fold or reduce: this takes another function of two arguments, some initial value, and walks along the sequence successively reducing it using the function. So, for instance:

  1. (fold + 0 '(1 2 3)) turns into (fold + (+ 0 1) '(2 3)) which turns into …
  2. (fold + 1 '(2 3)) turns into (fold + (+ 1 2) '(3)) which turns into …
  3. (fold + 3 '(3)) turns into (fold + (+ 3 3) '()) which turns into …
  4. 6.

It’s pretty easy to write a version of fold for lists:

(define (fold op initial l)
  (if (null? l)
      initial
      (fold op (op initial (first l)) (rest l))))

Racket calls this (or a more careful version of this) foldl: there is also foldr which works from the other end of the list and is more expensive as a result.

Well, one thing you might want to do is have a version of fold which works on trees rather than just lists. One definition of a tree is:

  1. it’s a collection of nodes;
  2. nodes have values;
  3. nodes have zero or more unique children, which are nodes.
  4. no node is the descendant of more than one node;
  5. there is exactly one root node which is the descendant of no other nodes.

A variant of this (which will matter below) is that the children of a node are either nodes or any other object, and there is some way of knowing if something is a node or not1.

You can obviously represent trees as conses, with the value of a cons being its car, and the children being its cdr. Whatever builds the tree needs to make sure that (3), (4) and (5) are true, or you get a more general graph structure.

But you might want to have other sorts of trees, and you’d want the fold function not to care about what sort of tree it was processing: just that it was processing a tree. Indeed, it would be nice if it was possible to provide special implementations for, for instance, binary trees where rather than iterating over some sequence of children you’d know there were exactly two.

So, I wondered if there was a nice way of expressing this in Racket and it turns out there mostly is. Racket has a notion of generic interfaces which are really intended as a way for different structure types to provide common interfaces, I think. But it turns out they can be (ab?)used to do this, as well.

Generic interfaces are not provided by racket but by racket/generic: everything below assumed (require racket/generic).

A generic treelike interface

A treelike object supports two operations:

  • node-value returns the value of a node;
  • node-children returns a list of the node’s children.

The second of these is a bit nasty: it would be better perhaps to either provide an interface for mapping over a node’s children, or to return some general, possibly lazy, sequence of children. But this is just playing, so I don’t mind.

Here is a definition of a generic treelike interface, which includes default methods for lists:

(define-generics treelike
  ;; treelike objects have values and children
  (node-value treelike)
  (node-children treelike)
  #:fast-defaults
  (((λ (t)
      (and (cons? t) (list? t)))
    ;; non-null proper lists are trees: their value is their car;
    ;; their children are their cdr.
    (define node-value car)
    (define node-children cdr))))

Notes:

  • This uses #:fast-defaults instead of #:defaults, which means that the dispatch to objects which satisfy list? happens. This is fine in this case: lists are never going to be confused with any other tree type.
  • This relies on Racket’s (and Scheme’s?) list? predicate returning true only for proper lists rather than CL’s cheap listp which just returns true for anything which is either nil or a cons.
  • There are lots of other options to define-generics which I’m not using and many of which I don’t understand.

With this definition:

> (treelike? '())
#f
> (treelike? '(1 2 3))
#t
> (treelike? '(1 2 . 3))
#f
> (node-children '(1 2 3))
'(2 3)

So, OK.

A treelike binary tree

We could then define a binary-tree type which implements this generic interface:

(struct binary-tree (value left right)
  #:transparent
  #:methods gen:treelike
  ((define (node-value bt)
     (binary-tree-value bt))
   (define (node-children bt)
     (list (binary-tree-left bt)
           (binary-tree-right bt)))))

The #:methods gen:treelike tells the structure we’re defining the methods needed for this thing to be a treelike object.

And now we can check things:

> (treelike? (binary-tree 1 2 3))
#t
> (node-value (binary-tree 1 2 3))
1
> (node-children (binary-tree 1 2 3))
'(2 3)

OK.

Two attempts at a generic foldable interface

So now I want to define another interface for things which can be folded. And the first thing I tried is this:

(define-generics foldable
  ;; broken
  (fold operation initial foldable)
  #:defaults
  ((treelike?
    (define (fold op initial treelike)
      (let ([current (op initial (node-value treelike))]
            [children (node-children treelike)])
        (if (null? children)
            current
            (fold op (fold op current (first children))
                  (rest children))))))
   ((const true)
    (define (fold op initial any)
      (op initial any)))))

So this tries to define a fold generic function, which has two implementations: one for treelike objects and one for all other objects. So this means that all objects are foldable, and, for instance (fold + 0 1) simply turns into (+ 0 1). This is a bit odd but it simplifies the implementation of the interface for treelike objects on the assumption that the children of nodes may not themselves be nodes (see above).

There is another complexity: if the list of a treelike node’s children isn’t null, then it’s a treelike, so it can safely be recursed over rather than explicitly iterated over. This is a slightly questionable pun I think, but, well, I am a slightly questionable programmer.

And this … doesn’t work:

> (fold + 0 '(1 2 3))
; node-value: contract violation:
; expected: treelike?
; given: 2
; argument position: 1st

It took me a long time to understand this, and the answer is that the definitions of fold inside the define-generic form aren’t adding methods to a generic function: what they are doing is defining a little local function, fold which then gets glued into the generic function. So references to fold in the definition refer to the little local function. It is exactly as if you had done this, in fact:

(define-generics foldable
  ;; this is why it's broken
  (fold operation initial foldable)
  #:defaults
  ((treelike?
    (define fold
      (letrec ([fold (λ (op initial treelike)
                       (let ([current (op initial (node-value treelike))]
                             [children (node-children treelike)])
                         (if (null? children)
                             current
                             (fold op (fold op current (first children))
                                   (rest children)))))])
        fold)))
   ((const true)
    (define (fold op initial any)
      (op initial any)))))

And you can see why this can’t work: the fold bound by the letrec calls itself rather than going through the generic dispatch.

The way to fix this is to use the magic define/generic form to get a copy of the generic function, and then call that. This is syntactically horrid, but you can see why it is needed given the above. So a working version of this interface purports to be:

(define-generics foldable
  ;; not broken
  (fold operation initial foldable)
  #:defaults
  ((treelike?
    (define/generic fold/g fold)
    (define (fold op initial treelike)
      (let ([current (op initial (node-value treelike))]
            [children (node-children treelike)])
        (if (null? children)
            current
            (fold op (fold/g op current (first children))
                  (rest children))))))
   ((const true)
    (define (fold op initial any)
      (op initial any)))))

And indeed it is not broken:

> (fold + 0 '(1 2 3))
6

and with some tracing added:

> (fold + 0 '(1 2 3))
fold/treelike + 0 (1 2 3)
fold/any + 1 2
fold/treelike + 3 (3)
6

Adding a special case to fold for the binary tree

So now, finally, we can add a special case to fold to the binary tree defined above, rather than needlessly consing a list of children. We will need the same explicit-generic-function hack as before as the children of a binary tree may not be binary trees.

(struct binary-tree (value left right)
  #:transparent
  #:methods gen:treelike
  ((define (node-value bt)
     (binary-tree-value bt))
   (define (node-children bt)
     (list (binary-tree-left bt)
           (binary-tree-right bt))))
  #:methods gen:foldable
  ((define/generic fold/g fold)
   (define (fold op initial bt)
     (fold/g op
             (fold/g op (op initial (binary-tree-value bt))
                     (binary-tree-left bt))
             (binary-tree-right bt)))))

And now

> (fold + 0 (binary-tree 1
                         (binary-tree 2 3 4)
                         (binary-tree 5 6 7)))
28

and with some tracing

> (fold + 0 (binary-tree 1
                         (binary-tree 2 3 4)
                         (binary-tree 5 6 7)))
fold/bt + 0 #(struct:binary-tree 1 #(struct:binary-tree 2 3 4) #(struct:binary-tree 5 6 7))
fold/bt + 1 #(struct:binary-tree 2 3 4)
fold/any + 3 3
fold/any + 6 4
fold/bt + 10 #(struct:binary-tree 5 6 7)
fold/any + 15 6
fold/any + 21 7
28

Missing CLOS

In some ways this makes me miss CLOS: the explicit-generic-function hack is very annoying, single dispatch is annoying, not being able to define predicate-based methods separately from the define-generics form is annoying. But on the other hand predicate-based dispatch is pretty cool.


  1. Perhaps these should be called ‘sloppy trees’ or something.