# The U combinator

The U combinator allows you to define recursive functions and I think it is simpler to understand than the Y combinator.

It’s not obvious how things like `letrec`

get defined in Scheme, without using secret assignment. In fact I think they *are* defined using secret assignment:

```
(letrec ([f (λ (...) ... (f ...) ...)])
...)
```

turns into

```
(let ([f ...])
(set! f (λ (...) ... (f ...) ...))
...)
```

But it’s interesting to see how you can define recursive functions without relying on assignment, including mutually-recursive collections of functions. One way is using the U combinator.

I suspect that there is lots of information about this out there, but it’s seriously hard to search for anything which looks like ’*-combinator’ now (even now I am starting a set of companies called ‘integration by parts’, ‘the quotient rule’ &c).

You can famously do this with the Y combinator, but I didn’t want to do that because Y is something I find I can understand for a few hours at a time and then I have to work it all out again. But it turns out that you can use something much simpler: the U combinator. It seems to be even harder to search for this than Y, but here is a quote about it:

In the theory of programming languages, the U combinator, \(U\), is the mathematical function that applies its argument to its argument; that is \(U(f) = f(f)\), or equivalently, \(U = \lambda f \cdot f(f)\).

Self-application permits the simulation of recursion in the λ-calculus, which means that the U combinator enables universal computation. (The U combinator is actually more primitive than the more well-known fixed-point Y combinator.)

The expression \(U(U)\) is the smallest non-terminating program.

(Text mildly edited from here, which unfortunately is not a site all about the U combinator other than this quote.)

## Prerequisites

All of the following code samples are in Racket. The macros are certainly Racket-specific and some of the other code probably is as well. To make the macros work you will need `syntax-parse`

via:

`(require (for-syntax syntax/parse))`

However note that my use of `syntax-parse`

is naïve in the extreme: I’m really just an unfrozen CL caveman pretending to understand Racket’s macro system.

Also note I have not ruthlessly turned everything into λ: Rather than `((λ (...) ...) ...)`

there is `(let ([... ...] ...) ...)`

in this code; there is use of multiple values including `let-values`

; there is `(define (f ...) ...)`

rather than `(define f (λ (...) ...))`

and so on.

## Two versions of U

The first version of U is the obvious one:

```
(define (U f)
(f f))
```

But this will run into some problems with an applicative-order language, which Racket is by default. To avoid that we can make the assumption that `(f f)`

is going to be a function, and wrap that form in another function to delay its evaluation until it’s needed: this is the standard trick that you have to do for Y in an applicative-order language as well. I’m only going to use the applicative-order U when I have to, so I’ll give it a different name:

```
(define (U/ao f)
(λ args (apply (f f) args)))
```

Note also that I’m allowing more than one argument rather than doing the pure-λ-calculus thing.

## Using U to construct a recursive functions

To do this we do a similar trick that you do with Y: write a function which, if given a function as argument which deals with the recursive cases, will return a recursive function. And obviously I’ll use the Fibonacci function as the canonical recursive function.

So, consider this thing:

```
(define fibber
(λ (f)
(λ (n)
(if (<= n 2)
1
(+ ((U f) (- n 1))
((U f) (- n 2)))))))
```

This is a function which, given another function, `U`

of which computes smaller Fibonacci numbers, will return a function which will compute the Fibonacci number for `n`

.

In other words, * U of this function is the Fibonacci function*!

And we can test this:

```
> (define fibonacci (U fibber))
> (fibonacci 10)
55
```

So that’s very nice.

## Wrapping U in a macro

So, to hide all this the first thing to do is to remove the explicit calls to `U`

in the recursion. We can lift them out of the inner function completely:

```
(define fibber/broken
(λ (f)
(let ([fib (U f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))
```

*Don’t try to compute U of this*: it will recurse endlessly because

`(U fibber/broken)`

-> `(fibber/broken fibber/broken)`

and this involves computing `(U fibber/broken)`

, and we’re doomed.Instead we can use `U/ao`

:

```
(define fibber
(λ (f)
(let ([fib (U/ao f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))
```

And this is all fine `((U fibber) 10)`

is `55`

(and terminates!).

Purists can then turn `let`

into `λ`

in the usual way:

```
(define fibber
(λ (f)
((λ (fib)
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(U/ao f))))
```

And this is really all you need to be able to write the macro:

```
(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'(let ([name (U (λ (f)
(let ([name (U/ao f)])
value)))])
form ...)]))
```

Or, for the pure of heart:

```
(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'((λ (name)
form ...)
(U (λ (f)
((λ (name)
value)
(U/ao f)))))]))
```

And this works fine:

```
(with-recursive-binding (fib (λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(fib 10))
```

## A caveat on bindings

One fairly obvious thing here is that there are *two* bindings constructed by this macro: the outer one, and an inner one of the same name. And these are not bound to the same function in the sense of `eq?`

:

```
(with-recursive-binding (ts (λ (it)
(eq? ts it)))
(ts ts))
```

is `#f`

. This matters only in a language where bindings can be mutated: a language with assignment in other words. Both the outer and inner bindings, unless they have been mutated, are to functions which are identical *as functions*: they compute the same values for all values of their arguments. In fact, it’s hard to see what purpose `eq?`

would serve in a language without assignment.

This caveat will apply below as well.

## Two versions of U for many functions

The obvious generalization of U, U*, to many functions is that \(U^*(f_1, \ldots, f_n)\) is the tuple \((f_1(f_1, \ldots, f_n), f_2(f_1, \ldots, f_n), \ldots)\). And a nice way of expressing that in Racket is to use multiple values:

```
(define (U* . fs)
(apply values (map (λ (f)
(apply f fs))
fs)))
```

And we need the applicative-order one as well:

```
(define (U*/ao . fs)
(apply values (map (λ (f)
(λ args (apply (apply f fs) args)))
fs)))
```

Note that U* is a true generalization of U: `(U f)`

and `(U* f)`

are the same.

## Using U* to construct mutually-recursive functions

I’ll work with a trivial pair of functions:

- an object is a
*numeric tree*if it is a cons and its car and cdr are numeric objects; - an objct is a
*numeric object*if it is a number, or if it is a numeric tree.

So we can define ‘maker’ functions (with an ’-er’ convention: a function which makes an *x* is an *x*er, or, if *x* has hyphens in it, an *x*-er) which will make suitable functions:

```
(define numeric-tree-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(and (cons? o)
(no? (car o))
(no? (cdr o)))))))
(define numeric-object-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(cond
[(number? o) #t]
[(cons? o) (nt? o)]
[else #f])))))
```

Note that for both of these I’ve raised the call to `U*`

a little, simply to make the call to the appropriate value of `U*`

less opaque.

And this works:

```
(define-values (numeric-tree? numeric-object?)
(U* numeric-tree-er numeric-object-er))
```

And now:

```
> (numeric-tree? 1)
#f
> (numeric-object? 1)
#t
> (numeric-tree? '(1 . 2))
#t
> (numeric-tree? '(1 2 . (3 4)))
#f
```

## Wrapping U* in a macro

The same problem as previously happens when we raise the inner call to `U*`

with the same result: we need to use `U*/ao`

. In addition the macro becomes significantly more hairy and I’m moderately surprised that I got it right so easily. It’s not conceptually hard: it’s just not obvious to me that the pattern-matching works.

```
(define-syntax (with-recursive-bindings stx)
(syntax-parse stx
[(_ ((name:id value:expr) ...) form ...+)
#:fail-when (check-duplicate-identifier (syntax->list #'(name ...)))
"duplicate variable name"
(with-syntax ([(argname ...) (generate-temporaries #'(name ...))])
#'(let-values
([(name ...) (U* (λ (argname ...)
(let-values ([(name ...)
(U*/ao argname ...)])
value)) ...)])
form ...))]))
```

And now, in a shower of sparks, we can write:

```
(with-recursive-bindings ((numeric-tree?
(λ (o)
(and (cons? o)
(numeric-object? (car o))
(numeric-object? (cdr o)))))
(numeric-object?
(λ (o)
(cond [(number? o) #t]
[(cons? o) (numeric-tree? o)]
[else #f]))))
(numeric-tree? '(1 2 3 (4 (5 . 6) . 7) . 8)))
```

and get `#t`

.

As I said, I am sure there are well-known better ways to do this, but I thought this was interesting enough not to lose. This originated as an answer to this Stack Overflow question.