It's been a while since I wrote about any coding here. Most of my coding lately has been done at my job, and while it has been interesting, it's not for public consumption. However, I started a fun little project last weekend and finished it this weekend: adding tail-call optimization (TCO) to emacs lisp.
If you're not familiar with TCO, it's not too complicated. Most programming language interpreters (including the CPU) use a stack to keep track of the local variables in a function. When a new function is called, its arguments are pushed on the stack before the call and read from the stack. When the function returns, the arguments are removed from the stack and the old local variables are available just as they were before. However, in the case where the result of the function call is being returned unmodified, this is somewhat wasteful: the old local variables are not going to be used any more so we could just put the arguments on the stack where the old variables were. We can then just jump into the new function.
Using this technique, multiple tail-calls won't grow the stack, and tail-recursive functions (that is, functions that only recurse when they are about to return) become a good alternative to loops. This is heavily used in Scheme, where tail-call elimination is part of the standard. Note that the reason this is an optimization isn't speed of execution; it's more about the amount of memory that is used.
Until recently, TCO was not possible at all in emacs lisp. This is
because, until recently, all of emacs lisp was dynamically scoped.
Dynamic scoping essentially means that local variables shadow global
variables in functions that are called while the local variables are
in scope. So if you have global variable
a and function
something that depends on
a, then you can change the behavior of
b when you call it from
c by setting the value of
This rule, while unconventional, can be quite useful in configuring
the behavior of the editor. Anyway, dynamic scoping means that
before any function returns, all the bound local variables must be
unbound so that they no longer affect other functions. This means
that a function call is almost never followed by a return.
However, emacs 24 added the option to compile files using lexical scope (the more common scoping rule that you are probably already familiar with if you've gotten this far). There is no need to explicitly unbind variables before a return in lexical scope because local variables aren't bound. Thus we can do TCO on these functions. In short, this is what I added.
The first thing to decide was where to add the TCO code. Since emacs byte-code doesn't have the ability to jump into a function, I quickly decided that I needed to modify the byte-code interpreter instead. This means going into the C guts of emacs.
The first thing I did was to follow the path that a call to a
lexically-bound function would make from the byte-code interpreter.
I added all this code plus a
goto statement in case the operation
that followed the call was a
return. It wasn't a whole lot of
code, so I was feeling fairly good about things.
The first problem with this approach is that the emacs stack could
be a different size from what was needed by the calling function.
Since the stack is allocated by
alloca (which allocates on the
processor's stack), and because there is only one
return from the
byte-code interpreter, I thought I would be fine changing the
alloca to a combination of
free. Strangely, when I
went to recompile the emacs lisp files, I kept getting errors. This
left me fairly befuddled.
I finally figured it out when I realized the garbage collector was
looking at the processor's stack to find variables that are in use.
Thus to work properly, the emacs stack must be allocated on the
processor's stack. That meant I had to go back to
modify the garbage collector – something I was not likely to do
correctly). Since the (emacs) stack size could change, the memory
allocated for it would also need to change. I needed something that
let me blow away part of the (CPU) stack –
goto keeps the stack
In my spelunking in the emacs C code, I had seen functions
longjmp used to implement emacs's exception mechanism. I had
never used these functions before (and with good reason – if you
goto creates spaghetti code, these functions are much
worse), but it seemed like they might do what I want. That is, when
longjmp, you get the effect of a
goto, but the
processor's stack is restored to the state that it was when you were
at the place that jumped to. So by using a
longjmp within the
function, I could allocate the emacs stack on the processor's stack
and still do TCO without growing the amount of memory used for each
By the way, I also learned more about the
volatile keyword than I
wanted to know, including this beauty:
funargs = *(volatile Lisp_Object *) &funargs;
That means (as far as I understand it) "put this in memory right now".
Benefits / Limitations
There is an obvious benefit from this work: people can now write tail-recursive functions and not worry about the the size of the input causing the function to stop working. It does, however, have some limitations. It
- only works for lexically-bound functions,
- only works for byte-compiled functions, and
- could cause problems with debugging.
I've gone through the reasons for the first two, and the third should be well known to people who are familiar with TCO. The main reason is that since the function call can't add to the memory usage, it must also not add information about itself to the list of function calls displayed in a backtrace.
I am going to test this for a while longer before I release it. However, I think it's really going to work, and it will allow an entirely different style of programming in emacs lisp. That's pretty exciting for me.