Štar: an iteration construct for Common Lisp

:: programming, lisp

Štar is a concise and extensible iteration construct for Common Lisp which aims to be pleasant to use, easy to understand, fast if needed, general, and not to look like Fortran.

Common Lisp has multiple iteration constructs: mapping functions such as mapcar, special-purpose constructs such as dotimes and dolist, the general but somewhat clumsy construct which is do and do*, and finally the extended loop macro which aims to embed a ‘more friendly’ iteration language into CL and succeeds in being so complex that it is often hard to know whether a given form is legal or not without poring over loop’s grammar.

None of these constructs manage to be all three of pleasant to use, easy to understand and general. loop somehow fails to be any of these things in many cases. None are extensible1.

But Common Lisp is a Lisp, and Lisp’s huge advantage is that it is a programming language in which it is easy to write programming languages, or parts of them, like iteration constructs. That is, after all, how most or all of the existing constructs started life.

Lots of these have been written, of course. Štar tries to distinguish itself by being as simple as possible: it has as little special syntax as I could work out how to give it – there is no special little language you need to learn. It also has no inherent knowledge about how to iterate over any particular structure: it doesn’t know how to iterate over lists, or ranges of numbers. Rather it knows that iterating has to answer two questions:

  • is there more?
  • what is the next thing?

In addition it knows how to ask another question:

  • is there any information I can use to make asking the first two questions faster?

What it looks like

(for ((e (in-list l)))
  (print e))

(for (((k v) (in-hash-table h)))
  ...)

(for* ((entry (in-list entries))
       (element (in-list entry)))
  ...)

(defun in-alist (alist)
  (values
    (lambda () (not (null alist)))
    (lambda ()
      (destructuring-bind ((k . v) . more) alist
        (setf alist more)
        (values k v)))))

(for (((k v) (in-alist ...)))
  ...)

These are some simple examples: the last shows how easy it is to teach Štar to iterate over new things, or over existing things in new ways. Not shown here is that it’s also pretty easy to teach it how to optimize new iterators and to make various declarations about things.

What Štar is not

Štar is an iteration construct: what it does is to iterate. It has nothing to do with collecting values. For Štar, iteration and value accumulation are orthogonal problems which should be solved by orthogonal constructs. In particular if you wanted to make a list of the even numbers from a list, you might to this by using Štar together with a value-collection macro:

(collecting
  (for ((e (in-list ...)))
    (when (evenp e)
      (collect e))))

This is yet another way in which Štar differs from loop and many other constructs. In particular Štar’s point of view is that mixing together iteration and value accumulation results in a system that is not very good at either.

Similarly, Štar doesn’t contain a mass of syntax letting you select only certain values, or allowing you to terminate iteration early: you don’t write

(loop for x in l while (numberp x) do ...)

Instead you write

(for ((x (in-list l)))
  (unless (numberp x) (final))
  ...)

The body of an iteration is exactly like the body of defun, except there are some local functions which you can call to skip to the next iteration or finish the iteration.

Štar, of course, doesn’t bundle some destructuring system which will inevitably be subtly incompatible with other destructuring systems while also not being usable independently. If you want destructuring, use a full-fat system of your choice.

You probably get the idea: Štar is a tool whose job is to iterate: not some leaking bag of broken abstractions.

Multiple values

One thing that Štar does do is to take multiple values seriously. A clause which specifies a list of variables will bind them to the multiple values returned by the iterator. Multiple values, unlike destructuring, are something you really have to have in the iteration construct itself.

The thing that doesn’t really matter but everyone cares about

So, OK, Štar is an extensible, general, iteration construct. Obviously it will have traded performance for all this. I mean, it’s the old Lisp story, the one Gabriel told us long ago. Right?

* Running benchmarks
** lists of length 2000, nesting 3
what                                                  seconds      ratio
star                                                   30.312      1.000
loop                                                   30.071      0.992
dolist                                                 21.594      0.712
** range 100000, nesting 2
what                                                  seconds      ratio
star/with-step                                          9.414      1.000
star/no-step                                            9.406      0.999
loop                                                   18.412      1.956
dotimes                                                 9.469      1.006

A sketch of Štar

Štar has three parts, four if you count the iterators:

  • the iteration constructs themselves and bindings they make;
  • a protocol for defining new iterators;
  • a protocol for defining optimizers for iterators;
  • a collection of predefined iterators and optimizers for them.

The first three parts are much more finished than the fourth: most of the existing iterators were written as proofs of concept and may well change, get better, or go away.

Iterators

These are forms (usually, named function calls) which return two values, both functions of no arguments:

  • the first value is called and should return true if there is more to do;
  • the second value is called to return the next value or values of the iterator.

These functions answer the first two questions Štar needs to ask: is there more, and if there is, what is it? These two functions obviously share state in general.

To answer the third question – how do I make things faster? – named iterator functions can have optimizers: these are functions called by Štar at macroexpansion time which tell it how to make things faster. It’s up to an optimizer to ensure the semantics are the same for the optimized and unoptimized versions. Optimizers can specify a set of bindings to make, declarations for them, how to iterate, and some other things.

It’s possible to install and remove optimizers, and to dynamically bind sets of them. This might be useful, for instance, to compile a file where some particular assumptions (‘all vectors are vectors of floats’) are true.

Iteration constructs

There are two:

  • for iterates in parallel;
  • for* defines nested loops: (for* ((...) (...)) ...) is like (for ((...)) (for ((...)) ...)).

Note that because of the way iterators work, sequential binding within one loop makes no sense.

The first argument of for or for* specifies a number of clauses: each clause is of the form <var/s> <iterator>). Multiple values are supported, and it is possible to make various declarations about variables: this matters for for*, where there is no room for declarations for other than the last (innermost) clause. Variables whose name is "_" are ignored by default.

Within the body of an iteration there are four local functions:

  • next skips to the next iteration;
  • next* skips to the next outer-level iteration for for* and is the same as next for for;
  • final returns zero or more values from the current iteration
  • final* returns zero or more values for the outer-level iteration for for* and is the same as final for for.

Provided iterators

There are iterators which work for lists, vectors, general sequences, hash tables, ranges of reals, as well as some more interesting ones. Not all iterators currently have optimizers. You only need to care about writing optimizers if you need to make iterators very fast: they can be significantly fiddly to write.

The iterator over ranges of reals has an optimizer which tries hard to make things pretty fast.

As well as this there are some more interesting iterators. An example is sequentially:

(for ((a (sequentially 1 2)))
  ...)

will bind a sequentually to 1 and 2. But this iterator is a macro which has the behaviour of a FEXPR, so it evaluates its arguments only when needed (and as many times as needed). A variant of sequentially is sequentially* which ‘sticks’ on its last argument. So for instance:

> (let ((v 0))
    (for ((a (sequentially* (incf v)))
          (_ (in-range 3)))
      (print a)))

1 
2 
3 

Another way to write this is:

> (for ((a (let ((v 0)) (sequentially* (incf v))))
        (_ (in-range 4)))
    (print a))

1 
2 
3 
4 

And of course there is a meta-iterator (also implemented as a macro), in-iterators:

> (for ((a (in-iterators
            (in-list '(1 2))
            (in-list '(3 4)))))
    (print a))

1 
2 
3 
4 

It’s possible to construct very general iterators with tools like this.

As I said above, Štar’s current iterators are in a fairly rough state: a lot of this might change.


Notes

Declarations

A form like

(for* ((a (in-range 10))
       (b (in-range 10)))
  ...)

corresponds roughly to

(for ((a (in-range 10)))
  (for ((b (in-range 10)))
    ...)

The problem is now how to make declarations which apply to a. This is hard because CL doesn’t provide the tools you need to know whether a declaration refers to a variable or not: to know whether (declare (foo a)) refers to a or not you need to know, at least, whether foo names a type, which you can’t do. You often can guess, but not always.

So, rather than trying to solve an intractable problem, Štar lets you specify some properties of a variable in the clause that binds it: you can say

(for* (((a :type fixnum) (in-range 10))
       (b (in-range 10)))
  ...)

for instance. This is a bit ugly, but it solves the problem. It is only useful for for*.

Binding

Štar binds, rather than assigns:

(collecting
  (for ((a (in-list 1 2)))
    (collect
      (lambda (&optional (v vp))
        (if vp (setf a v) a)))))

will return two closures over independent bindings.

Epilogue

Štar’s source code is here. The manual is included with the source code but also here.

Štar is pronounced roughly ‘shtar’.

Much of the inspiration for Štar came from my friend Zyni: thanks to her for the inspiration behind it, actually making me write it and for many other things.

Štar is dedicated to her, and to Ian Anderson.


  1. Some implementations have mechanisms for extending loop