The U combinator allows you to define recursive functions and I think it is simpler to understand than the Y combinator.
It’s not obvious how things like
letrec get defined in Scheme, without using secret assignment. In fact I think they are defined using secret assignment:
(letrec ([f (λ (...) ... (f ...) ...)]) ...)
(let ([f ...]) (set! f (λ (...) ... (f ...) ...)) ...)
But it’s interesting to see how you can define recursive functions without relying on assignment, including mutually-recursive collections of functions. One way is using the U combinator.
I suspect that there is lots of information about this out there, but it’s seriously hard to search for anything which looks like ’*-combinator’ now (even now I am starting a set of companies called ‘integration by parts’, ‘the quotient rule’ &c).
You can famously do this with the Y combinator, but I didn’t want to do that because Y is something I find I can understand for a few hours at a time and then I have to work it all out again. But it turns out that you can use something much simpler: the U combinator. It seems to be even harder to search for this than Y, but here is a quote about it:
In the theory of programming languages, the U combinator, \(U\), is the mathematical function that applies its argument to its argument; that is \(U(f) = f(f)\), or equivalently, \(U = \lambda f \cdot f(f)\).
Self-application permits the simulation of recursion in the λ-calculus, which means that the U combinator enables universal computation. (The U combinator is actually more primitive than the more well-known fixed-point Y combinator.)
The expression \(U(U)\) is the smallest non-terminating program.
(Text mildly edited from here, which unfortunately is not a site all about the U combinator other than this quote.)
All of the following code samples are in Racket. The macros are certainly Racket-specific and some of the other code probably is as well. To make the macros work you will need
(require (for-syntax syntax/parse))
However note that my use of
syntax-parse is naïve in the extreme: I’m really just an unfrozen CL caveman pretending to understand Racket’s macro system.
Also note I have not ruthlessly turned everything into λ: Rather than
((λ (...) ...) ...) there is
(let ([... ...] ...) ...) in this code; there is use of multiple values including
let-values; there is
(define (f ...) ...) rather than
(define f (λ (...) ...)) and so on.
Two versions of U
The first version of U is the obvious one:
(define (U f) (f f))
But this will run into some problems with an applicative-order language, which Racket is by default. To avoid that we can make the assumption that
(f f) is going to be a function, and wrap that form in another function to delay its evaluation until it’s needed: this is the standard trick that you have to do for Y in an applicative-order language as well. I’m only going to use the applicative-order U when I have to, so I’ll give it a different name:
(define (U/ao f) (λ args (apply (f f) args)))
Note also that I’m allowing more than one argument rather than doing the pure-λ-calculus thing.
Using U to construct a recursive functions
To do this we do a similar trick that you do with Y: write a function which, if given a function as argument which deals with the recursive cases, will return a recursive function. And obviously I’ll use the Fibonacci function as the canonical recursive function.
So, consider this thing:
(define fibber (λ (f) (λ (n) (if (<= n 2) 1 (+ ((U f) (- n 1)) ((U f) (- n 2)))))))
This is a function which, given another function,
U of which computes smaller Fibonacci numbers, will return a function which will compute the Fibonacci number for
In other words,
U of this function is the Fibonacci function!
And we can test this:
> (define fibonacci (U fibber)) > (fibonacci 10) 55
So that’s very nice.
Wrapping U in a macro
So, to hide all this the first thing to do is to remove the explicit calls to
U in the recursion. We can lift them out of the inner function completely:
(define fibber/broken (λ (f) (let ([fib (U f)]) (λ (n) (if (<= n 2) 1 (+ (fib (- n 1)) (fib (- n 2))))))))
Don’t try to compute
U of this: it will recurse endlessly because
(U fibber/broken) ->
(fibber/broken fibber/broken) and this involves computing
(U fibber/broken), and we’re doomed.
Instead we can use
(define fibber (λ (f) (let ([fib (U/ao f)]) (λ (n) (if (<= n 2) 1 (+ (fib (- n 1)) (fib (- n 2))))))))
And this is all fine
((U fibber) 10) is
55 (and terminates!).
Purists can then turn
λ in the usual way:
(define fibber (λ (f) ((λ (fib) (λ (n) (if (<= n 2) 1 (+ (fib (- n 1)) (fib (- n 2)))))) (U/ao f))))
And this is really all you need to be able to write the macro:
(define-syntax (with-recursive-binding stx) (syntax-parse stx [(_ (name:id value:expr) form ...+) #'(let ([name (U (λ (f) (let ([name (U/ao f)]) value)))]) form ...)]))
Or, for the pure of heart:
(define-syntax (with-recursive-binding stx) (syntax-parse stx [(_ (name:id value:expr) form ...+) #'((λ (name) form ...) (U (λ (f) ((λ (name) value) (U/ao f)))))]))
And this works fine:
(with-recursive-binding (fib (λ (n) (if (<= n 2) 1 (+ (fib (- n 1)) (fib (- n 2)))))) (fib 10))
A caveat on bindings
One fairly obvious thing here is that there are two bindings constructed by this macro: the outer one, and an inner one of the same name. And these are not bound to the same function in the sense of
(with-recursive-binding (ts (λ (it) (eq? ts it))) (ts ts))
#f. This matters only in a language where bindings can be mutated: a language with assignment in other words. Both the outer and inner bindings, unless they have been mutated, are to functions which are identical as functions: they compute the same values for all values of their arguments. In fact, it’s hard to see what purpose
eq? would serve in a language without assignment.
This caveat will apply below as well.
Two versions of U for many functions
The obvious generalization of U, U*, to many functions is that \(U^*(f_1, \ldots, f_n)\) is the tuple \((f_1(f_1, \ldots, f_n), f_2(f_1, \ldots, f_n), \ldots)\). And a nice way of expressing that in Racket is to use multiple values:
(define (U* . fs) (apply values (map (λ (f) (apply f fs)) fs)))
And we need the applicative-order one as well:
(define (U*/ao . fs) (apply values (map (λ (f) (λ args (apply (apply f fs) args))) fs)))
Note that U* is a true generalization of U:
(U f) and
(U* f) are the same.
Using U* to construct mutually-recursive functions
I’ll work with a trivial pair of functions:
- an object is a numeric tree if it is a cons and its car and cdr are numeric objects;
- an objct is a numeric object if it is a number, or if it is a numeric tree.
So we can define ‘maker’ functions (with an ’-er’ convention: a function which makes an x is an xer, or, if x has hyphens in it, an x-er) which will make suitable functions:
(define numeric-tree-er (λ (nter noer) (λ (o) (let-values ([(nt? no?) (U* nter noer)]) (and (cons? o) (no? (car o)) (no? (cdr o))))))) (define numeric-object-er (λ (nter noer) (λ (o) (let-values ([(nt? no?) (U* nter noer)]) (cond [(number? o) #t] [(cons? o) (nt? o)] [else #f])))))
Note that for both of these I’ve raised the call to
U* a little, simply to make the call to the appropriate value of
U* less opaque.
And this works:
(define-values (numeric-tree? numeric-object?) (U* numeric-tree-er numeric-object-er))
> (numeric-tree? 1) #f > (numeric-object? 1) #t > (numeric-tree? '(1 . 2)) #t > (numeric-tree? '(1 2 . (3 4))) #f
Wrapping U* in a macro
The same problem as previously happens when we raise the inner call to
U* with the same result: we need to use
U*/ao. In addition the macro becomes significantly more hairy and I’m moderately surprised that I got it right so easily. It’s not conceptually hard: it’s just not obvious to me that the pattern-matching works.
(define-syntax (with-recursive-bindings stx) (syntax-parse stx [(_ ((name:id value:expr) ...) form ...+) #:fail-when (check-duplicate-identifier (syntax->list #'(name ...))) "duplicate variable name" (with-syntax ([(argname ...) (generate-temporaries #'(name ...))]) #'(let-values ([(name ...) (U* (λ (argname ...) (let-values ([(name ...) (U*/ao argname ...)]) value)) ...)]) form ...))]))
And now, in a shower of sparks, we can write:
(with-recursive-bindings ((numeric-tree? (λ (o) (and (cons? o) (numeric-object? (car o)) (numeric-object? (cdr o))))) (numeric-object? (λ (o) (cond [(number? o) #t] [(cons? o) (numeric-tree? o)] [else #f])))) (numeric-tree? '(1 2 3 (4 (5 . 6) . 7) . 8)))
As I said, I am sure there are well-known better ways to do this, but I thought this was interesting enough not to lose. This originated as an answer to this Stack Overflow question.