<h1>Hiking and Coding — pages tagged <a href="http://chrismgray.github.com//tags/lab-notes/">lab-notes</a></h1>
<h1><a href="http://chrismgray.github.com//posts/emacs-tco/">Tail-call optimization in emacs lisp</a> (2012-12-09)</h1>
<p>It's been a while since I wrote about any coding here. Most of my
coding lately has been done at my job, and while it has been
interesting, it's not for public consumption. However, I started a
fun little project last weekend and finished it this weekend: adding
tail-call optimization (TCO) to emacs lisp.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Background</h2>
<div class="outline-text-2" id="text-1">
<p>
If you're not familiar with TCO, it's not too complicated. Most
programming language interpreters (and CPUs themselves) use a stack to
keep track of the local variables in a function. When a new
function is called, its arguments are pushed on the stack before the
call and read from the stack. When the function returns, the
arguments are removed from the stack and the old local variables are
available just as they were before. However, in the case where the
result of the function call is being returned unmodified, this is
somewhat wasteful: the old local variables are not going to be used
any more so we could just put the arguments on the stack where the
old variables were. We can then just jump into the new function.
</p>
<p>
Using this technique, multiple tail-calls won't grow the stack, and
tail-recursive functions (that is, functions that only recurse when
they are about to return) become a good alternative to loops. This
is heavily used in Scheme, where tail-call elimination is part of
the standard. Note that the reason this is an optimization isn't
speed of execution; it's more about the amount of memory that is
used.
</p>
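<p>
A sketch of the effect, in Python rather than emacs lisp (Python does
not do TCO, which makes the stack growth easy to demonstrate): the
tail-recursive form below exhausts the stack on large inputs, while the
loop it would be optimized into runs in constant stack space.
</p>

```python
import sys

def sum_to_recursive(n, acc=0):
    # Tail-recursive sum: the recursive call's result is returned
    # unmodified, so a TCO-ing interpreter could reuse the stack frame.
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)

def sum_to_loop(n):
    # What TCO effectively turns the function above into: the "call"
    # becomes an assignment to the arguments plus a jump to the top.
    acc = 0
    while n != 0:
        n, acc = n - 1, acc + n
    return acc

# Without TCO, deep tail recursion exhausts the stack ...
try:
    sum_to_recursive(10 * sys.getrecursionlimit())
    overflowed = False
except RecursionError:
    overflowed = True

# ... while the loop form handles any input size.
print(overflowed, sum_to_loop(100))  # True 5050
```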
<p>
Until recently, TCO was not possible at all in emacs lisp. This is
because, until recently, all of emacs lisp was dynamically scoped.
Dynamic scoping essentially means that local variables shadow global
variables in functions that are called while the local variables are
in scope. So if you have global variable <code>a</code> and function <code>b</code> does
something that depends on <code>a</code>, then you can change the behavior of
<code>b</code> when you call it from <code>c</code> by setting the value of <code>a</code> locally.
This rule, while unconventional, can be quite useful in configuring
the behavior of the editor. Anyway, dynamic scoping means that
before any function returns, all the bound local variables must be
unbound so that they no longer affect other functions. This means
that a function call is almost never followed by a return.
</p>
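<p>
The unbinding step can be simulated in Python (hypothetical names; this
is just a toy model of elisp's dynamic scope with an explicit
save/restore): even when <code>b</code> is called in tail position,
<code>c</code> still has work to do after it returns.
</p>

```python
# Toy dynamic scope: "global" bindings live in one environment, and a
# local binding saves the old value, installs a new one, and must
# restore it before the function returns.
env = {"a": 1}

def b():
    return env["a"] * 10      # behavior depends on the dynamic value of a

def c():
    old = env["a"]
    env["a"] = 5              # bind a locally for the duration of the call
    try:
        return b()            # even in tail position, the unbind must follow
    finally:
        env["a"] = old        # unbind: restore the caller's value

print(b(), c(), b())  # 10 50 10
```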
<p>
However, emacs 24 added the option to compile files using lexical
scope (the more common scoping rule that you are probably already
familiar with if you've gotten this far). Under lexical scope there is
no need to explicitly unbind variables before a return, because local
variables are never dynamically bound in the first place. Thus we can
do TCO on these functions. In short, this is what I added.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">Implementation Challenges</h2>
<div class="outline-text-2" id="text-2">
<p>
The first thing to decide was where to add the TCO code. Since
emacs byte-code doesn't have the ability to jump into a function, I
quickly decided that I needed to modify the byte-code interpreter
instead. This means going into the C guts of emacs.
</p>
<p>
The first thing I did was to follow the path that a call to a
lexically-bound function would make from the byte-code interpreter.
I added all this code plus a <code>goto</code> statement in case the operation
that followed the call was a <code>return</code>. It wasn't a whole lot of
code, so I was feeling fairly good about things.
</p>
<p>
The first problem with this approach is that the emacs stack could
be a different size from what was needed by the calling function.
Since the stack is allocated by <code>alloca</code> (which allocates on the
processor's stack), and because there is only one <code>return</code> from the
byte-code interpreter, I thought I would be fine changing the
<code>alloca</code> to a combination of <code>malloc</code> and <code>free</code>. Strangely, when I
went to recompile the emacs lisp files, I kept getting errors. This
left me fairly befuddled.
</p>
<p>
I finally figured it out when I realized the garbage collector was
looking at the processor's stack to find variables that are in use.
Thus to work properly, the emacs stack must be allocated on the
processor's stack. That meant I had to go back to <code>alloca</code> (or
modify the garbage collector – something I was not likely to do
correctly). Since the (emacs) stack size could change, the memory
allocated for it would also need to change. I needed something that
let me blow away part of the (CPU) stack – <code>goto</code> keeps the stack
in place.
</p>
<p>
In my spelunking in the emacs C code, I had seen functions <code>setjmp</code>
and <code>longjmp</code> used to implement emacs's exception mechanism. I had
never used these functions before (and with good reason – if you
think <code>goto</code> creates spaghetti code, these functions are much
worse), but it seemed like they might do what I wanted: when you call
<code>longjmp</code>, you get the effect of a <code>goto</code>, but the
processor's stack is also restored to the state it had at the
corresponding <code>setjmp</code> call. So by using a <code>longjmp</code> within the
function, I could allocate the emacs stack on the processor's stack
and still do TCO without growing the amount of memory used on each
tail call.
</p>
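<p>
In languages without a <code>longjmp</code> to reach for, the same
"reuse the stack" effect is usually faked with a trampoline. A rough
Python sketch of the idea (an illustration, not the emacs
implementation): the callee returns a description of the tail call
instead of making it, and a driver loop restarts from the top.
</p>

```python
# A trampoline: instead of the interpreter calling itself recursively
# for a tail call, the callee returns a marker describing the next call
# and this loop performs it, so the CPU stack never grows.
def trampoline(fn, *args):
    result = fn(*args)
    while isinstance(result, tuple) and result and result[0] == "tail-call":
        _, fn, args = result
        result = fn(*args)    # the "jump back to the top"
    return result

def countdown(n):
    if n == 0:
        return "done"
    return ("tail-call", countdown, (n - 1,))  # signal, don't recurse

print(trampoline(countdown, 1_000_000))  # done
```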
<p>
By the way, I also learned more about the <code>volatile</code> keyword than I
wanted to know, including this beauty:
</p>
<pre class="example">funargs = *(volatile Lisp_Object *) &funargs;
</pre>
<p>
That means (as far as I understand it) "put this in memory <b>right now</b>".
</p>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">Benefits / Limitations</h2>
<div class="outline-text-2" id="text-3">
<p>
There is an obvious benefit from this work: people can now write
tail-recursive functions and not worry about the size of the
input causing the function to stop working. It does, however, have
some limitations. It
</p>
<ul>
<li>only works for lexically-bound functions,
</li>
<li>only works for byte-compiled functions, and
</li>
<li>could cause problems with debugging.
</li>
</ul>
<p>
I've gone through the reasons for the first two, and the third
should be well known to people who are familiar with TCO. The main
reason is that since the function call can't add to the memory
usage, it must also not add information about itself to the list of
function calls displayed in a backtrace.
</p>
</div>
</div>
<div id="outline-container-4" class="outline-2">
<h2 id="sec-4">Conclusion</h2>
<div class="outline-text-2" id="text-4">
<p>
I am going to test this for a while longer before I release it.
However, I think it's really going to work, and it will allow an
entirely different style of programming in emacs lisp. That's
pretty exciting for me.
</p>
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/cljs-macros-02/">More Clojurescript Macros</a> (2012-04-20)</h1>
<p>Most of my time this week was spent traveling and in meetings, but I
did have a chance to work on the Clojurescript macro system that I
have been working on.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Testing</h2>
<div class="outline-text-2" id="text-1">
<p>
Things didn't start too well for the system, since it would not run
the simplest of tests written for it in the Clojurescript testing
system. My test was a macro that would take a positive integer,
recursively decrease it until it was zero, and then output the
zero. (Clearly not the smartest macro in the world – it was just
meant to test recursion.) However, I kept getting errors saying
that the system didn't know how to test equality among numbers.
This seemed strange until I realized that the <code>extend-type</code> call for
numbers wasn't getting executed because it wasn't expanded into a
<code>def</code>. (I had been executing all <code>def</code> calls, which defined all
functions and variables, but nothing else.)
</p>
<p>
So I briefly flirted with executing a bunch of other special forms,
including <code>set!</code> and <code>deftype</code>, but I could see that this would make
things pretty unwieldy, and that the compiler would be doing a lot
of confusing things, for example trying to execute a <code>set!</code>
buried deep inside a function without knowing anything about the
local variables whose values it referred to.
</p>
<p>
That was clearly the way of madness, so I eventually decided to
execute every special form that was at the top level. This worked
really well, and the tests started passing.
</p>
<p>
I also added tests to make sure that the namespaces work as
expected. You can <code>use</code> or <code>require</code> macros from other namespaces
just like other functions.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">What's not there (yet)</h2>
<div class="outline-text-2" id="text-2">
<p>
So far I have not added the <code>&env</code> and <code>&form</code> variables to macros.
I expect that it will be necessary to add them at some point (and
honestly not too hard). I know they are extremely useful in some
projects like <a href="https://github.com/clojure/core.match">core.match</a>, but that project has already been ported
to Clojurescript (via Clojure macros), so it might make more sense
to leave the large macros that need those facilities to Clojure.
</p>
<p>
Backquoted forms don't work as nicely as in Clojure. The reason is
that we are using Clojure's reader, which qualifies backquoted
symbols with their full namespace. Unfortunately, it doesn't know
anything about Clojurescript namespaces, so expect to need to
qualify symbols inside backquotes. This is an area where true
reader macros inside Clojure would be really helpful, but we have to
live with what we have.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">One step closer</h2>
<div class="outline-text-2" id="text-3">
<p>
… to Clojure in Clojure. Clojurescript already has a reader and a
compiler that can compile most of itself. The reader isn't totally
complete – it doesn't have backquote working yet for example – but
it is close. It recently got persistent vectors and persistent
hash-maps. With macros added, all that's left for Clojure in
Clojure is to finish up the reader and get rid of the calls to
Java.
</p>
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/cljs-macros/">Clojurescript Macros</a> (2012-04-11)</h1>
<p>Today I wrote a proof-of-concept implementation of a macro system for
Clojurescript. The code is <a href="https://github.com/chrismgray/clojurescript/tree/defmacro">here</a>.
</p>
<p>
Clojurescript already had a sort of macro system in the form of
Clojure macros, but this is different – the macros are written in
Clojurescript, get compiled to JavaScript, and are evaluated by a
JavaScript interpreter. They can be mixed in with Clojurescript code
and call Clojurescript functions. In theory, they should work with
any Clojurescript backend that implements the REPL-related protocols
(but who knows if that's true).
</p>
<p>
So that's the big announcement. What follows are some implementation
details and other notes.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Implementation details</h2>
<div class="outline-text-2" id="text-1">
<p>
The macros are defined by passing strings back and forth between the
Clojurescript compiler and the Rhino JavaScript interpreter. The
strings from Rhino are read by Clojure using <code>read-string</code>, so
macros are limited to things that can be printed by Clojurescript in
a form that Clojure can read.
</p>
<p>
Macros are not yet put into the correct namespaces. I don't think
it will be too hard to do that correctly though.
</p>
<p>
Rhino is a slow beast. It adds multiple seconds to the startup time
of the compiler. It might be smart to scan the file for calls to
<code>defmacro</code> before including the macro-interpreting code. However,
since macro expansion requires that all functions are defined in the
interpreter, once a <code>defmacro</code> is hit, all the functions in the file
preceding it (and in all <code>require</code>'d files) must be re-parsed.
</p>
<p>
Existing Clojure macros should still work. If two macros have the
same name, the Clojurescript one will take precedence, but of course
getting namespaces working should eliminate most conflicts.
</p>
<p>
It should go without saying that this is completely experimental at
this point. Things seem like they work to me, but they might yet
blow up in unexpected ways.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">An example</h2>
<div class="outline-text-2" id="text-2">
<p>
Here is a simple Clojurescript file which implements the <code>unless</code>
macro (also known as <code>when-not</code> in Clojure, but I think giving it a
different name makes it clearer that this is really not using Clojure's
macros).
</p>
<pre class="example">(defmacro unless
  [pred & body]
  `(if (not ~pred)
     (do ~@body)
     nil))

(let [a (unless true (/ 1 0) (+ 1 1))]
  a)
</pre>
<p>
And here is its output:
</p>
<pre class="example">var a__5799 = (cljs.core.truth_(cljs.core.not.call(null,true))?(function (){(1 / 0);
return (1 + 1);
})():null);
a__5799;
</pre>
<p>
It's a bit ugly, but it should be obvious what's going on.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">Conclusion</h2>
<div class="outline-text-2" id="text-3">
<p>
I'm pretty happy with the progress so far. It really shows how
flexible the Clojurescript compiler is that a macro system could be
added in under 75 lines of code, with nearly half of that being very
lightly modified from the compiler itself.
</p>
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/clojure-data-structures/">Clojure Data Structures</a> (2012-04-01)</h1>
<p>This past week, I've been knee-deep in Clojure data structures --
mostly persistent vector and hash-map. The goal has been to provide
replacements written in pure Clojure that can be ported to
ClojureScript, but it has also been a great way to learn about how
these structures work. So in this post, I'll try to summarize what I
learned and give some of the results of my hacking.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Vector</h2>
<div class="outline-text-2" id="text-1">
<p>
I started this project inspired by a tweet by <a href="https://twitter.com/#!/swannodette">David Nolen</a>. I can't
find it any more, but he mentioned that he had ported the
persistent-vector data structure to JavaScript and that he was
hoping to use it in ClojureScript. That made me think: since
the ClojureScript compiler now supports languages other than
JavaScript as its backends, wouldn't it be helpful, at least for
those backends, to port the data structure to pure Clojure, so that
any ClojureScript backend could use it?
</p>
<p>
So I did. The result is <a href="https://github.com/chrismgray/persistent-data-structures/blob/master/src/persistent_data_structures/vector.clj">here</a>. Instead of merely mechanically
porting the Java into Clojure, I tried to really understand what was
going on. The gist is really simple (and has been talked about
before): the vector is actually a tree with branching-factor 32, and
a buffer of size 32. When you add to the vector, you really add to
the buffer if you can, copying the buffer before writing to it.
When the buffer fills up, it is put in the tree, and again nodes are
copied rather than simply written-to.
</p>
<p>
This procedure means that adding an element is a \(O(\log n)\)
operation, but the constants are really good: for one thing, the
base of the \(\log\) is 32, so the influence of the \(\log n\) in
the overall running time is pretty small even for very large \(n\).
For another thing, the \(\log n\) penalty is only incurred on every
thirty-second insert. Most inserts are constant-time.
</p>
<p>
Another cool feature is the way that the tree is structured. Each
node has its children in an array of length 32. This means that the
subtree that contains a given element can be found very quickly by
simple bit operations. To find the subtree containing element \(i\)
in tree level \(k\), you shift \(i\) to the right by \(5k\) places
and take the rightmost 5 bits. The same procedure would work with
arrays whose length is any power of 2; you just replace 5 with the
\(\log_2\) of the length.
</p>
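<p>
The index arithmetic is compact enough to show directly; a sketch in
Python:
</p>

```python
BITS = 5                  # log2 of the branching factor, 32
MASK = (1 << BITS) - 1    # 0b11111

def subtree_index(i, level):
    # Which child of a node at height `level` leads to element i:
    # shift i right by 5*level places and keep the rightmost 5 bits.
    return (i >> (BITS * level)) & MASK

# Element 1000: its path is slot 31 at level 1, then slot 8 at level 0.
print(subtree_index(1000, 1), subtree_index(1000, 0))  # 31 8
```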
<p>
Anyway, László Török had pretty much the same idea as I did (and the
author of gvec.clj, which is in the core distribution, had the same
idea before either of us), and his code was better-tuned than mine,
so it is what went into the new release of ClojureScript.
</p>
</div>
<div id="outline-container-1-1" class="outline-3">
<h3 id="sec-1-1">An aside</h3>
<div class="outline-text-3" id="text-1-1">
<p>
When vectors are created in Clojure (but not yet in ClojureScript),
they are created by way of a transient vector. The idea behind this
is that the transient vector is only around for the creation of the
vector, so it does not need to be persistent. Thus, it does not
need to use the copy-on-write policy that persistent vectors use.
In order to build a larger vector, it simply adds to an empty vector
using the same procedure that I outlined above.
</p>
<p>
Theoretically at least, that is not the optimal algorithm. The
trees can be built from the bottom-up (much like <a href="http://chrismgray.github.com//tags/lab-notes/./../../posts/clojure-sorting">heaps</a>), giving a
\(O(n)\) algorithm, which also has very good constants. I haven't
computed the constant, but it should be less than \(1 {1 \over
16}\). I implemented this algorithm in Clojure, but the difference
in constants between Clojure code and pure Java code, and the fact
that the constants in the Java implementation are <b>so</b> low meant
that my implementation was slower even for very large \(n\). I may
one day try to port my bottom-up algorithm back to Java, but I kind
of doubt it would make much difference. Sometimes the
theoretically better solution just isn't better in the real world.
</p>
</div>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">Hash Maps</h2>
<div class="outline-text-2" id="text-2">
<p>
I also had a look at hash maps. The basic idea behind any hash-map
is to store a key-value pair by first computing the <i>hash</i> of the
key to something much smaller – in Clojure's case a 32-bit integer
– and then storing the pair in a location based on that hash.
There are many ways that this could be done, and I had a look at the
Java implementation used in Clojure and decided that I didn't really
want to do much more porting. So I came up with a couple of ideas
that are related to Clojure's way of doing it.
</p>
<p>
The first idea was to make the same tree as used in the vector
implementation, but to fill it in only as needed. In essence, this
was a sparse vector. The Clojure implementation does something
similar to this, but goes one step further by using a bit-map to
signify which branches of the tree are actually used. In this way,
it can avoid copying arrays that are mostly full of placeholders.
</p>
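<p>
The bit-map trick is worth a sketch (Python, with hypothetical names;
this shows the indexing scheme of a hash-array-mapped trie, not
Clojure's actual code): a node stores only its occupied branches, and
the array position of a branch is the number of set bits below its bit.
</p>

```python
def bitmap_index(bitmap, child):
    # `bitmap` has bit k set iff branch k (of 32) exists; the occupied
    # branches are stored contiguously, so branch `child` lives at the
    # popcount of the bits below it.
    bit = 1 << child
    if not (bitmap & bit):
        return None                        # branch absent, no slot at all
    return bin(bitmap & (bit - 1)).count("1")

# A node with branches 1, 4, and 9 occupied stores a 3-element array.
bm = (1 << 1) | (1 << 4) | (1 << 9)
print(bitmap_index(bm, 4), bitmap_index(bm, 9), bitmap_index(bm, 2))
# 1 2 None
```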
<p>
The next idea was a bit more radical. It involved making a <a href="https://en.wikipedia.org/wiki/Patricia_trie">radix tree</a> of the input data. Of all my experiments, this might have been
the most successful. In Clojure, its performance is very close to
the built-in data structure. In ClojureScript, for some reason it
is much faster at storing new data than the built-in data structure
but somewhat slower at retrieving data. I am thinking that there
must be some JavaScript-specific reason for this, but I haven't
found it yet. It could just be that the ClojureScript data
structure is asymmetric in the opposite way (I haven't yet looked
very deeply at it).
</p>
<p>
In all cases, the theoretical running times are pretty much what
you'd expect. To either insert a new key-value pair or to find a
key, it's basically the sum of the time needed to hash the key, a
constant to find the place where the key is stored, and then
something proportional to the number of other keys that have the
same hash value that have been inserted so far.
</p>
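<p>
That cost model falls out of even the simplest bucketed design; a
minimal Python sketch (not Clojure's trie, just the idea of hashing to a
location and then scanning past colliding keys):
</p>

```python
def make_map(nbuckets=32):
    return [[] for _ in range(nbuckets)]

def insert(m, key, value):
    # Constant work to find the bucket, then a scan past any keys that
    # landed in the same place.
    bucket = m[hash(key) % len(m)]
    for i, (k, _) in enumerate(bucket):
        if k == key:
            bucket[i] = (key, value)       # overwrite an existing key
            return
    bucket.append((key, value))

def lookup(m, key):
    for k, v in m[hash(key) % len(m)]:
        if k == key:
            return v
    return None

m = make_map()
insert(m, "a", 1)
insert(m, "a", 2)       # replaces the first binding of "a"
insert(m, "b", 3)
print(lookup(m, "a"), lookup(m, "b"))  # 2 3
```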
<p>
In practice, large collections must be hashed each time they're
inserted or searched for, so using them as keys is probably not a
great idea. In Clojure, negative numbers hash to one less than
their absolute value, so there will probably be collisions if you
use a large number of positive and negative numbers as keys, but
it's not that big of a deal, since the number of collisions
generated by this is at most two.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">Conclusions</h2>
<div class="outline-text-2" id="text-3">
<p>
At the very least, this was a good way to learn what's really going
on under the hood. I also did a lot of tuning – not only adding
typehints, but re-implementing some lazy functions to be non-lazy --
which is certainly good experience. I never made a data structure
that was faster than one of the ones implemented in Java, but I got
pretty close in both cases. In order to get meaningful tests, I had
to do things in a very big way: putting millions of things in
vectors, and querying hash-maps hundreds of thousands of times.
Most programs will not do these operations at such scales, and most
of the tests still took less than a second on my underpowered
laptop. That makes me think pretty highly of the existing data
structures in Clojure.
</p>
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/org-ikiwiki-plugin-5/">Org Plugin Rethink</a> (2012-03-14)</h1>
<p>I am glad to announce a new version of my <a href="https://github.com/chrismgray/ikiwiki-org-plugin">ikiwiki org plugin</a>. This is
a major change to the way things had worked previously, and is much
faster than previous versions.
</p>
<p>
The installation procedure has changed, so please read the
<code>README.org</code> file in the repository for information about how to
install the plugin. Also, if you have installed a previous version,
please remove it, since the two may conflict. Finally, be sure to kill
any daemonized emacs instances that are running.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Background</h2>
<div class="outline-text-2" id="text-1">
<p>
This version of the plugin is no longer what ikiwiki calls an
"external plugin". The overhead associated with XML-RPC was simply
too high (at least on the emacs side of things). I tried
<a href="http://chrismgray.github.com//tags/lab-notes/./../../posts/ikiwiki-hacking">working around</a> it by having less communication between the compiler
and the plugin, but the ikiwiki developers <a href="http://ikiwiki.info/todo/be_more_selective_about_running_hooks/">didn't really think it was a great idea</a>. So I have basically scrapped the idea of the
plugin being external.
</p>
<p>
The nice thing about that is that it allowed me to remove a lot of
code on the emacs side. Instead of communication via XML-RPC, the
new plugin can put the content to be interpreted in a file and call
emacsclient directly.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">Security</h2>
<div class="outline-text-2" id="text-2">
<p>
I note in <code>README.org</code> that there is a concern about security. This
is because any program that calls another program with
user-generated arguments should be concerned about security. As far
as I can tell, the only way that something bad can happen is if
someone creates a filename with an unmatched double-quote and
convinces ikiwiki to accept it. Since ikiwiki doesn't accept
filenames with double-quotes anyway, there really shouldn't be a
problem.
</p>
<p>
However, I would recommend that this plugin not be used in
situations where untrusted people can create files (such as from the
CGI interface), unless you personally verify that nothing bad can
happen. As always, this is free software and comes with no
warranty.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">Remaining todos</h2>
<div class="outline-text-2" id="text-3">
<p>
I would like to get rid of the potential security hole so that
people could feel safe running this plugin from the CGI interface.
I would also like to get rid of the hardcoded org configuration and
let that be stored in the user's setup file. I have a branch that
does that using the old architecture, so it should not be too
difficult to port to the new architecture.
</p>
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/ikiwiki-hacking/">A bit of ikiwiki hackery</a> (2012-02-09)</h1>
<p>Spurred on by the first comment by an actual user of my plugin today,
I tracked down one of the major bottlenecks for external ikiwiki
plugins. My fix might not be the most elegant, but it works so far.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">The problem</h2>
<div class="outline-text-2" id="text-1">
<p>
Ikiwiki plugins work by calling a function named <code>hook</code> which lets
them tell the ikiwiki compiler that they implement a certain
subroutine. Whenever that subroutine needs to be called, the
ikiwiki compiler calls <code>run_hooks</code>, and that runs all the registered
functions.
</p>
<p>
For most of these hooked functions (<code>htmlize</code> is the only exception
I can think of), all of the registered functions are called,
regardless of file type. This is normally not a problem, because
the plugin can check the type of content that it is given and only
operate on the types of content that it knows anything about.
</p>
<p>
For external plugins, however, there is a problem. Ikiwiki
communicates with external plugins via xml-rpc on <code>stdin</code> and
<code>stdout</code>. Every function call that it makes involves writing the
function's parameters on <code>stdout</code>. When there is a file type that
is unknown to the external plugin, all that effort (I/O is the main
time sink I think, but the content must also be encoded correctly
for xml-rpc) is wasted.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">The solution</h2>
<div class="outline-text-2" id="text-2">
<p>
I have added a parameter named "exclusive" to the optional arguments
that are accepted by the <code>hook</code> function. When this argument exists
(and is true), the registered function is only run when the type of
file is the same as the type of file that the plugin accepts. So
far, it only works with the <code>scan</code> and <code>linkify</code> hooks, which were
the main areas that slowed my plugin down. However, it is quite
simple to get it to work with other hooks as well.
</p>
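<p>
A toy model of the dispatcher (Python, hypothetical names; the real
thing is Perl inside the ikiwiki compiler): an exclusive hook is run
only when the page's type matches the type the plugin registered for.
</p>

```python
hooks = []

def hook(fn, filetype, exclusive=False):
    hooks.append((fn, filetype, exclusive))

def run_hooks(page_type, content):
    for fn, filetype, exclusive in hooks:
        if exclusive and filetype != page_type:
            continue              # skip: no wasted round-trip to the plugin
        content = fn(content)
    return content

hook(lambda c: c + " [org]", "org", exclusive=True)
hook(lambda c: c + " [any]", "mdwn")      # ordinary hook: always runs

print(run_hooks("mdwn", "page"))  # page [any]
print(run_hooks("org", "page"))   # page [org] [any]
```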
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/small-clojure-libs/">Two Small Clojure Packages</a> (2012-01-07)</h1>
<p>I've been working on a larger project lately, but in doing so I found
two little things that were independent of the project and useful
enough for me to release as Clojure packages.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Parallel Thrush</h2>
<div class="outline-text-2" id="text-1">
<p>
I discussed the Clojure <a href="http://chrismgray.github.com//tags/lab-notes/./../../posts/luhnybin">thrush operator</a> once before. The way that I
have been using it lately is to operate on a stream of data, perhaps
condensing the stream at the end into one value. This suggested to
me that I should be able to operate on the stream of data in
parallel, since all of the functions that I am using are pure.
</p>
<p>
What I have done with the new package is to create a new macro
called <code>||->></code>. This operates just like the <code>->></code> macro already in
Clojure, except that it splits the data in the stream and runs it in
parallel. Behind the scenes, it uses the Clojure <code>pmap</code> function,
which advises that it should only be used when the operation being
done is CPU-intensive enough to exceed the coordination cost. Since
multiple functions are put together by my macro before they are
passed to <code>pmap</code>, following this advice should become easier.
</p>
<p>
As an example:
</p>
<pre class="example">(||->> data
       (map big-function)
       (filter odd?)
       :merge
       (reduce +))
</pre>
<p>
is the same as
</p>
<pre class="example">(->> data
     (map big-function)
     (filter odd?)
     (reduce +))
</pre>
<p>
but everything before the <code>:merge</code> statement is executed in
parallel.
</p>
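<p>
A hypothetical Python analogue of what the macro arranges (names made
up; the real thing is built on <code>pmap</code>): the stages before the
merge point run over chunks of the stream in parallel, and everything
after it runs sequentially on the recombined stream.
</p>

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def parallel_thrush(data, big_function, workers=4):
    chunks = [data[i::workers] for i in range(workers)]

    def stage(chunk):
        # The part before :merge -- pure functions, so safe in parallel.
        return [x for x in map(big_function, chunk) if x % 2 == 1]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pieces = pool.map(stage, chunks)

    merged = [x for piece in pieces for x in piece]   # the :merge point
    return reduce(lambda a, b: a + b, merged, 0)      # the part after it

print(parallel_thrush(list(range(10)), lambda x: x * 3))  # 75
```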
<p>
Of course, the source is on <a href="https://github.com/chrismgray/parallel-thrush">my github</a>. The jar can also be easily
downloaded from <a href="https://clojars.org/parallel-thrush">clojars</a>.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">LRU maps</h2>
<div class="outline-text-2" id="text-2">
<p>
The project I've been working on has partly been a struggle to
manage resources. Thus, I've needed a good way to keep some of the
resources in memory and get rid of the ones that are no longer
relevant. Using a <code>hash-map</code> data structure is always nice, so I've
implemented a variant of a <code>hash-map</code> that has an upper limit on its
size. When you try to add an element that would cause the map to
have too many elements, it kicks out the element that was least
recently added or modified.
</p>
<pre class="example">(apply lru-map 4 (range 20))
;=> {12 13, 14 15, 16 17, 18 19}
</pre>
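<p>
A rough Python analogue of that behavior (hypothetical code, not the
package's implementation): a bounded map that evicts the least recently
added-or-modified entry.
</p>

```python
from collections import OrderedDict

def lru_map(limit, *kvs):
    m = OrderedDict()
    for k, v in zip(kvs[0::2], kvs[1::2]):
        if k in m:
            del m[k]                 # re-adding counts as "recently modified"
        m[k] = v
        if len(m) > limit:
            m.popitem(last=False)    # kick out the oldest entry
    return dict(m)

print(lru_map(4, *range(20)))  # {12: 13, 14: 15, 16: 17, 18: 19}
```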
<p>
Sometimes, you want to do something with an element as it gets
kicked out. For that, there is <code>lru-map-with</code>. This takes two
extra arguments: a function that combines some "state" with the
element that is getting kicked out, and the initial "state". (Behind
the scenes, this "state" isn't really state, but it is helpful to
think of it as state.)
</p>
<pre class="example">(apply lru-map-with 4 conj [] (range 20))
;=> {12 13, 14 15, 16 17, 18 19, :lru-state [[0 1] [2 3] [4 5] [6 7] [8 9] [10 11]]}
</pre>
<p>
Again, this little package is on <a href="https://github.com/chrismgray/least-recently-used-map">github</a> and the jar is on <a href="https://clojars.org/least-recently-used-map">clojars</a>.
</p>
</div>
</div>
<h1><a href="http://chrismgray.github.com//posts/org-ikiwiki-plugin-3/">Lessons learned (so far) from the ikiwiki plugin</a> (2011-12-07)</h1>
<p>Writing my <a href="http://chrismgray.github.com//tags/lab-notes/./../../posts/org-ikiwiki-plugin">new plugin</a> in Emacs Lisp, I learned a few things. Here are
some of them, in no particular order.
</p>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">xml-rpc.el has some problems</h2>
<div class="outline-text-2" id="text-1">
<p>
I don't mean to be unkind here. xml-rpc.el seems pretty good if you
are doing exactly one kind of thing: calling a method on a server
that can be reached by http. For anything else, it is very hard to
use. My plugin has the following requirements:
</p>
<ul>
<li>It must be called by another program through xml-rpc.
</li>
<li>It must read from and write to files (and not http).
</li>
</ul>
<p>
Neither of these things is made easy with xml-rpc. The first I can
understand – xml-rpc would have to somehow insinuate itself into
the emacs event loop and watch for calls all the time. The second
is less easy to understand. Parsing xml-rpc is not really related
to reading from http. So why are the two things tied together? In
my opinion, xml-rpc.el would be a much nicer library if parsing the
xml-rpc was separated completely from the http stuff. There could
be convenience functions, but they would be a layer of abstraction
on top of the other two layers.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">xml.el has some problems</h2>
<div class="outline-text-2" id="text-2">
<p>
There isn't a function to get a list of all nodes with a given name?
Are you kidding me? Here is my implementation, but I bet there is a
better one somewhere:
</p>
<pre class="example">(defun xml-find-nodes-matching (node name)
  "Return all nodes in the tree rooted at NODE whose `xml-node-name' equals NAME."
  (if (or (null node) (not (listp node)))
      '()
    (let ((child-matches
           (apply 'append
                  (mapcar (lambda (nd) (xml-find-nodes-matching nd name))
                          (xml-node-children node)))))
      (if (equal (xml-node-name node) name)
          (cons node child-matches)
        child-matches))))
</pre>
<p>
In general, both xml-rpc.el and xml.el use a distressing number of
calls to <code>car</code>, <code>cdr</code>, and <code>cdaddr</code> (and all the versions in
between).
</p>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">Perl has some problems</h2>
<div class="outline-text-2" id="text-3">
<p>
Ikiwiki sends named parameters as an even-length xml-rpc array
rather than as an xml-rpc struct.  This is because not all of the
functions that can be called via xml-rpc take named parameters, and I
guess Perl isn't smart enough to tell a hash from an even-length
array.  This isn't a huge problem, but it does mean that I need to
convert the input to each of the functions that I write into a hash
before I use it.
</p>
</div>
</div>
<div id="outline-container-4" class="outline-2">
<h2 id="sec-4">Ikiwiki has some problems</h2>
<div class="outline-text-2" id="text-4">
<p>
I would like to be able to ignore files that don't have a particular
extension. For <code>htmlify</code>, this is the way it works by default. It
seems like most of the functions that plugins can implement are not
this way by default, though, and that is a shame. If the plugin is
not external – that is, it is written in Perl – there is really no
problem. The function is called, checks the extension of the source
file, and returns without doing anything. However, when the plugin
is external and the call must happen through xml-rpc, ikiwiki must
transmit the data via xml-rpc and receive the returned data back via
xml-rpc. Unnecessary calls take a lot longer in that context.
</p>
<p>
So I would like for most calls to <code>hook</code> to take an optional
<code>extension</code> parameter that takes an extension (or even better, a
regexp), and only call the function if the file name has the same
extension (or matches the regexp).
</p>
</div>
</div>
<div id="outline-container-5" class="outline-2">
<h2 id="sec-5">Working with emacs buffers is pretty nice</h2>
<div class="outline-text-2" id="text-5">
<p>
Does a function that you're trying to write in emacs lisp give you a
string? It's pretty easy to throw it in a temporary buffer and then
tell emacs to do the things that you would normally do while you
were editing in order to get the proper information out of the
string.  The <code>with-temp-buffer</code> macro makes it especially easy
to do just that.
</p>
</div>
</div>
<div id="outline-container-6" class="outline-2">
<h2 id="sec-6">Getting info from a structured list is easier to do as a recursive function</h2>
<div class="outline-text-2" id="text-6">
<p>
One of the things that took me the longest was to figure out what
this couple of lines of code was doing:
</p>
<pre class="example">(setq valtype (car (caddar xml-list))
valvalue (caddr (caddar xml-list)))
</pre>
<p>
What should the value of <code>xml-list</code> look like in order to get the
correct thing out of it? It turned out that I needed to take the
<code>cdr</code> of the <code>cdaddr</code> of the <code>caddar</code> of the parsed xml in order to
get the correct value. That only worked when ikiwiki was responding
to a method call, though. I had a much easier time getting the
right values out when I simply started using the
<code>xml-find-nodes-matching</code> function that I showed above.
</p>
<p>
When you see yourself writing more than a few <code>car</code> or <code>cdr</code> calls
in a row (or <code>first</code>, <code>rest</code>, or <code>nth</code> calls in Clojure), stop and
try to write a function that finds what you are looking for. The
function doesn't have to be recursive, but that might be the easiest
way to do it.
</p>
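<p>
In Clojure, for instance, such a search function falls out of
<code>tree-seq</code>.  This is a hypothetical sketch (the function name
and the keyword-tagged tree are my own, not anything from the plugin):
</p>

```clojure
;; Instead of hand-writing chains of first/rest calls, search the
;; whole nested "parsed-xml"-style structure in one pass.
(defn find-nodes-matching
  "Return every sub-node of a parsed-xml-style tree whose tag is `tag`."
  [tag tree]
  (filter #(and (sequential? %) (= tag (first %)))
          (tree-seq sequential? seq tree)))

(find-nodes-matching :value
                     [:methodResponse
                      [:params
                       [:param [:value "a"]]
                       [:param [:value "b"]]]])
;; => ([:value "a"] [:value "b"])
```

<p>
One call returns every tagged node, however deeply nested, with no
<code>car</code>/<code>cdr</code> chains in sight.
</p>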
</div>
</div>
<div id="outline-container-7" class="outline-2">
<h2 id="sec-7">It's nice when all calls and responses are dumped to a file</h2>
<div class="outline-text-2" id="text-7">
<p>
In a sense, this is just saying that code can be easier to debug if
you're tracing it. But since the calls between ikiwiki and the
plugin must go through files anyway, we get the program traced
automatically.
</p>
</div>
</div>
Convex Hulls (http://chrismgray.github.com//posts/convex-hulls/, published 2011-11-30, updated 2011-12-02)
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Convex Hulls Three Ways</h2>
<div class="outline-text-2" id="text-1">
<p>
Whenever I watch cooking competition shows, they always have chefs
presenting a foodstuff cooked in multiple different ways. Today,
I'm doing that with algorithms.
</p>
<p>
The algorithm in question today is the <a href="http://en.wikipedia.org/wiki/Convex_hull">convex hull</a> algorithm. In
order of implementation complexity, and descending order of
theoretical running time, there is the Jarvis March, Graham's Scan,
and Chan's Algorithm. All three are implemented in Clojure in my
<a href="https://github.com/chrismgray/convex-hull">github repository</a>.
</p>
</div>
<div id="outline-container-1-1" class="outline-3">
<h3 id="sec-1-1">Jarvis March</h3>
<div class="outline-text-3" id="text-1-1">
<p>
The simplest of the algorithms, the Jarvis March was also one of
the first output-sensitive computational geometry algorithms. In a
nutshell, you first find a point that you know to be on the convex
hull, and then you find the next point by looking at all the rest
of the points and determining which one has a segment that has the
property that all the rest of the points are on one side of it.
You repeatedly find the next point using this procedure until you
get back to the first point. There is basically no way to
implement this algorithm that does not have a running time of
\(O(hn)\), where \(h\) is the number of points on the convex hull
and \(n\) is the number of input points.
</p>
<p>
The one implementation detail of the Jarvis March that is
interesting is that whenever you see the concept of "finding the
next thing" given some previous information, the Clojure
implementation should almost always be lazy. It turns out that
implementing Jarvis March lazily will help in implementing Chan's
Algorithm, so keep that in mind.
</p>
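<p>
To make that concrete, here is a hypothetical lazy Jarvis March sketch.
It is not the code from the repository; the helper names and the
orientation test are my own:
</p>

```clojure
;; Orientation test: cross > 0 means b lies to the left of the
;; directed line o -> a.
(defn cross [[ox oy] [ax ay] [bx by]]
  (- (* (- ax ox) (- by oy))
     (* (- ay oy) (- bx ox))))

;; The next hull point after p is the q such that no other point
;; lies to the right of the directed segment p -> q.
(defn next-hull-point [pts p]
  (let [candidates (remove #{p} pts)]
    (reduce (fn [q r] (if (neg? (cross p q r)) r q))
            (first candidates)
            candidates)))

;; Walk the hull lazily, starting from the leftmost point (which is
;; guaranteed to be on the hull), until we wrap back to the start.
(defn jarvis-march [pts]
  (let [start (apply min-key first pts)]
    (cons start
          (take-while #(not= start %)
                      (rest (iterate #(next-hull-point pts %) start))))))
```

<p>
Because the result is a lazy seq, <code>(take 2 (jarvis-march pts))</code>
computes only two hull points, which is exactly the property that
Chan's Algorithm exploits.
</p>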
</div>
</div>
<div id="outline-container-1-2" class="outline-3">
<h3 id="sec-1-2">Graham's Scan</h3>
<div class="outline-text-3" id="text-1-2">
<p>
Graham's Scan is one of the algorithms I remember most vividly from
the undergraduate computational geometry course that I took. The
professor, Godfried Toussaint, always referred to it as the "three
coins" algorithm, so I have kept up that tradition in some of my
function names in my implementation.
</p>
<p>
The algorithm first makes a polygon of the input points by sorting
them by angle about the bottom-most point. Then it goes around the
polygon with a stack, pushing and popping points as it goes. If
that sounds familiar, it should – it's the same idea as what I was
talking about when I brought up the idea of <a href="http://chrismgray.github.com/posts/parsing-polygons/">parsing polygons</a> a week
and a half ago.
</p>
<p>
Thus, I used the same polygon-parsing monad in my implementation as
when I computed the visibility polygons last week. It still works
just as well.
</p>
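<p>
For comparison, a plain stack-based sketch of the scan, without the
polygon-parsing monad, might look like this (hypothetical code, not
the implementation from the repository):
</p>

```clojure
;; Orientation test: cross > 0 means b lies to the left of o -> a.
(defn cross [[ox oy] [ax ay] [bx by]]
  (- (* (- ax ox) (- by oy))
     (* (- ay oy) (- bx ox))))

;; Sort the points by angle around the bottom-most point, then walk
;; the resulting polygon with a stack, popping whenever the two
;; points on top of the stack and the new point fail to make a
;; left turn.
(defn grahams-scan [pts]
  (let [bottom (apply min-key second pts)
        angle  (fn [[x y]]
                 (Math/atan2 (- y (second bottom)) (- x (first bottom))))
        sorted (cons bottom (sort-by angle (remove #{bottom} pts)))]
    (reduce (fn [stack p]
              (loop [s stack]
                (if (and (>= (count s) 2)
                         (not (pos? (cross (second s) (first s) p))))
                  (recur (rest s))   ; pop
                  (cons p s))))      ; push
            (list)
            sorted)))
```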
<p>
Since the points must be sorted, Graham's Scan takes \(\Theta(n
\log n)\) time.  Sorting can be reduced to computing convex hulls, so
computing convex hulls has an \(\Omega(n \log n)\) lower bound,
meaning that this algorithm is optimal.
</p>
<p>
But Chan's Algorithm is better.  Weird, huh?
</p>
</div>
</div>
<div id="outline-container-1-3" class="outline-3">
<h3 id="sec-1-3">Chan's Algorithm</h3>
<div class="outline-text-3" id="text-1-3">
<p>
I must confess that I had always been a little intimidated by
Chan's Algorithm. It was invented by Timothy Chan, who has a
well-earned reputation for being a genius, so I thought it would be
really complicated. It's not. There is a decent <a href="http://en.wikipedia.org/wiki/Chan's_algorithm">description</a> of it
on Wikipedia, so I won't go into the details. The gist is that you
combine the two previous algorithms that I discussed. The Jarvis
March needs to be modified so that the points can be input as a
list of smaller convex hulls, and the next point on the convex hull
is found by doing a binary search on the smaller convex hulls. But
that is really the hardest part about the algorithm. I have put
the whole thing below, because I think it's pretty beautiful.
</p>
<pre class="example">(defn chans-algorithm
  "Finds the convex hull of a set of points by
  the algorithm known as 'Chan's Algorithm'."
  [pts]
  (let [bottom-pt (apply min-key :y pts)]
    (loop [m 3] ; start out with triangles
      (let [partitioned-pts (partition m m [] pts)
            c-hs (map grahams-scan partitioned-pts)
            potential-ch (take m (apply jarvis-march pts c-hs))]
        (if (= bottom-pt (last potential-ch)) ; assumes jarvis-march returns bottom-pt last
          potential-ch
          (recur (min (* m m) (count pts))))))))
</pre>
<p>
The great thing about Chan's Algorithm is that it is also
output-sensitive. But instead of being \(O(nh)\) (which is
\(O(n^2)\) in the worst case), it is \(O(n \log h)\), which is at
least as good as Graham's Scan, but often better. It is also quite
simple to implement, given decent implementations of Jarvis March
and Graham's Scan.
</p>
</div>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">Conclusion</h2>
<div class="outline-text-2" id="text-2">
<p>
Convex hull algorithms are great.  If I were ever to teach a
computational geometry course (admittedly, that's looking like a long
shot now), I might start and finish the course with them.  The
progression from the ultra-simple Jarvis March to the
more-complicated Chan's Algorithm is really nice, and there are
interesting new things to talk about the whole way. They also show
that computational geometry is not so hard to do in a functional
style. In fact, using laziness is what makes the implementation of
Chan's Algorithm so simple. So this might make a nice talk to give
people who are already into functional programming as well.
</p>
<p>
The next thing I have in mind for this project is to animate the
algorithms. Viewing the output of algorithms is already pretty easy
using Processing, but I would like to be able to see them as they
are operating. It would be great if I could do that without
changing the code too much. I have a couple of ideas, but I'm not
sure if they'll work yet.
</p>
<p>
Also, it is slightly embarrassing to admit, but my blogging system
seems to not support putting images in posts. So I am going to have
to figure out how to work around (or even fix) that before I can
show any results.
</p>
</div>
</div>
Lazy Sorting in Clojure (http://chrismgray.github.com//posts/clojure-sorting/, published 2011-11-27, updated 2011-12-01)
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Penance</h2>
<div class="outline-text-2" id="text-1">
<p>
In a job interview last week I said a couple of things related to
sorting in Clojure that turned out to be incorrect when I checked
them. This is my way of making up for them – if I can make them
true, then no one will notice. Though to be fair, I'm sure that I'm
the only one that thinks that way.
</p>
<p>
The first thing I said was that heap-sort is my favorite sorting
algorithm, but then I declined to implement it, preferring to
implement merge-sort instead. I think this is actually pretty
reasonable in an interview situation – heap-sort is a conceptually
more difficult sorting algorithm, so being able to remember it on
the spot is more difficult. The second thing I said was that
Clojure itself uses heap-sort and that given its love affair with
laziness that it would not be unreasonable to assume that <code>(first (sort lst))</code> was a linear-time operation. I might have read
something like this on a mailing list or IRC, but it is not
correct. Clojure currently uses Java's <code>sort</code> function, which is a
slightly-modified merge-sort. There is not much point in making
that algorithm lazy, because getting the first element from the
sorted list requires \(O(n \log n)\) time anyway.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">Heap Sort</h2>
<div class="outline-text-2" id="text-2">
<p>
For those that are not familiar with it, heap-sort is something that
is usually taught in a second-year undergraduate Computer Science
class. So it's not <b>that</b> difficult an algorithm, but it does
require some thinking, and there is some fancy analysis that goes
into part of it. For a full discussion, see the <a href="http://www.amazon.com/gp/product/0262033844/ref=as_li_qf_sp_asin_tl?ie=UTF8&tag=hikiandcodi-20&linkCode=as2&camp=217145&creative=399369&creativeASIN=0262033844">Introduction to Algorithms</a> book by Cormen, Lieserson, Rivest, and Stein.
</p>
</div>
<div id="outline-container-2-1" class="outline-3">
<h3 id="sec-2-1">Heap</h3>
<div class="outline-text-3" id="text-2-1">
<p>
To start with, a <i>heap</i> is conceptually a tree where values are
stored at the nodes.  The largest stored value is at the root, and
the two subtrees of the root are themselves heaps.  Heaps are
usually required to be as close to balanced as possible – every
level is full except possibly the bottom one, whose nodes are all
bunched to the left.
</p>
<p>
Such a tree is usually implemented as an array, where the child
nodes of a node can be obtained by some simple arithmetic on the
index of the node in the array.
</p>
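<p>
For a 0-based array, that arithmetic is just doubling and halving the
index; a quick sketch:
</p>

```clojure
;; With a 0-based array layout, the children of the node at index i
;; live at indices 2i+1 and 2i+2, and its parent at (i-1)/2
;; (integer division).
(defn left-child  [i] (inc (* 2 i)))
(defn right-child [i] (+ 2 (* 2 i)))
(defn parent      [i] (quot (dec i) 2))
```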
</div>
</div>
<div id="outline-container-2-2" class="outline-3">
<h3 id="sec-2-2">Building a Heap</h3>
<div class="outline-text-3" id="text-2-2">
<p>
Given the definition, there is an intuitive algorithm for building
such a heap on a set of numbers: first find the largest number in
your set and have it be the root of the heap, then split the rest
of the numbers in half, and recursively make heaps on those sets.
</p>
<p>
Such an algorithm is clearly correct, but it is also clearly
\(\Theta(n \log n)\). We can do better with a bottom-up algorithm.
If we continue imagining the heap as a tree, we start by putting
the input numbers into the tree willy-nilly. This clearly does not
satisfy the heap properties laid out above. However, some of it
does satisfy the heap properties – the leaves of the tree are
trivially heaps. If we go up one level from the leaves, we can fix
the trees rooted there by exchanging the root of the tree with its
largest child (or not exchanging it if it's already the largest of
the three). Higher levels are a bit more difficult, because if the
root of a tree is exchanged, then we must make sure to fix the tree
that it ends up being the root of. You can imagine the root of a
tree traveling down the heap until it is larger than both of its
children.
</p>
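<p>
Here is a small functional sketch of this bottom-up procedure, using a
Clojure vector as a 0-based max-heap.  This is hypothetical
illustration code; the repository implementation is array-based and
much more heavily optimized:
</p>

```clojure
;; sift-down: let the value at index i travel down until it is at
;; least as large as both of its children.  The heap occupies the
;; first n slots of the vector v.
(defn sift-down [v i n]
  (let [l (inc (* 2 i))
        r (+ 2 (* 2 i))
        largest (as-> i m
                  (if (and (< l n) (> (v l) (v m))) l m)
                  (if (and (< r n) (> (v r) (v m))) r m))]
    (if (= largest i)
      v
      (recur (assoc v i (v largest) largest (v i)) largest n))))

;; build-heap: fix the subtrees bottom-up, starting at the last
;; internal node.  The leaves are already trivially heaps.
(defn build-heap [v]
  (let [n (count v)]
    (reduce #(sift-down %1 %2 n) v (range (dec (quot n 2)) -1 -1))))

(build-heap [1 5 3 7 2 8])
;; => [8 7 3 5 2 1]
```

<p>
The result <code>[8 7 3 5 2 1]</code> satisfies the heap property: every
node is at least as large as both of its children.
</p>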
<p>
The correctness of this algorithm is a bit harder to see and it
also appears to take \(O(n \log n)\) time. It does, but there is a
slightly more sophisticated analysis that shows that it is really
\(\Theta(n)\). I won't go into the analysis, but a hint is that
most of the values don't actually travel very far in the tree.
</p>
</div>
</div>
<div id="outline-container-2-3" class="outline-3">
<h3 id="sec-2-3">Using a heap to sort</h3>
<div class="outline-text-3" id="text-2-3">
<p>
With the heap properties in mind, we can easily see how to get the
largest value of a set of numbers – just take the top element from
the heap. How can we use the heap properties to sort? Well, we
want the largest number, then the largest number that remains, then
the largest number that remains after that, and so on. So if we
can take the largest number from the heap and then fix the heap so
that it retains the heap properties, then we'd be done.
</p>
<p>
We just devised a way to fix heaps when building the heap, so we
use that. What we do is to take the very last node in the heap
(which is not necessarily the smallest, but it doesn't hurt to
think about it as the smallest) and put that at the top of the
heap. The resulting tree is clearly not a heap, but if we call the
algorithm to fix heaps on the root of the tree, then we end up with
a heap again. The node that we put on top of the heap might end up
traveling all the way to the bottom, so this update takes
\(\Theta(\log n)\) time. Thus if we sort the entire input set, we
have a \(\Theta(n \log n)\) algorithm.
</p>
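<p>
Putting the pieces together, a hypothetical lazy heap-sort (with the
helpers repeated so the sketch is self-contained) might look like
this:
</p>

```clojure
;; sift-down: push the value at index i down until both children
;; (within the first n slots) are no larger than it.
(defn sift-down [v i n]
  (let [l (inc (* 2 i))
        r (+ 2 (* 2 i))
        largest (as-> i m
                  (if (and (< l n) (> (v l) (v m))) l m)
                  (if (and (< r n) (> (v r) (v m))) r m))]
    (if (= largest i)
      v
      (recur (assoc v i (v largest) largest (v i)) largest n))))

;; build-heap: bottom-up heap construction in linear time.
(defn build-heap [v]
  (let [n (count v)]
    (reduce #(sift-down %1 %2 n) v (range (dec (quot n 2)) -1 -1))))

;; Lazily yield the elements in descending order: take the root,
;; move the last element to the top, and fix the heap -- but only
;; once per element actually forced.
(defn heap-sort [coll]
  (letfn [(step [h n]
            (lazy-seq
              (when (pos? n)
                (let [top (h 0)
                      h'  (sift-down (assoc h 0 (h (dec n))) 0 (dec n))]
                  (cons top (step h' (dec n)))))))]
    (let [v (vec coll)]
      (step (build-heap v) (count v)))))
```

<p>
Forcing only <code>(first (heap-sort coll))</code> costs the linear-time
<code>build-heap</code> plus a single \(O(\log n)\) pop, which is the
payoff discussed in the next section.
</p>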
</div>
</div>
<div id="outline-container-2-4" class="outline-3">
<h3 id="sec-2-4">Advantages</h3>
<div class="outline-text-3" id="text-2-4">
<p>
That's the best we can do theoretically, which is great, but the
Java <code>sort</code> algorithm is also \(\Theta(n \log n)\), so why is there
any advantage to using heap-sort? In a language where laziness is
embraced (such as Clojure), heap-sort can be made lazy. That is,
the next element of the sorted list can be computed only when it is
needed. Since the build-heap procedure described above takes
linear time, getting the first element from the sorted list takes
\(O(n)\) time. Each subsequent element then takes \(O(\log n)\)
time. Thus, if only a small number of elements from the sorted
list are needed, then this lazy version of heap-sort is
theoretically faster than other sorts.
</p>
<p>
I can think of situations where this would actually have practical
advantages. For example, what if you were writing a search engine
and wanted to obtain the \(k\) best results? You could write an
ad-hoc function that found the best result, removed it and recursed
\(k - 1\) times. Or you could just <code>(take k (heap-sort input))</code>.
The first would take \(O(kn)\) time, whereas the second would take
\(O(k \log n + n)\) time.  In many practical situations, \(k\) is
around \(\log n\), in which case the first takes \(\Theta(n
\log n)\) time, whereas the second takes only \(O(n)\) time.
Essentially, the first is no better than the second solution would
be with a non-lazy sorting algorithm.
</p>
</div>
</div>
<div id="outline-container-2-5" class="outline-3">
<h3 id="sec-2-5">Disadvantages</h3>
<div class="outline-text-3" id="text-2-5">
<p>
Heap-sort has some disadvantages compared to other sorts of <code>sort</code>
algorithms.  The least theoretically significant is that the
constants hidden in the big-O notation are higher than those of
other sorting algorithms (tuned versions of quicksort can have
extremely small constants).
</p>
<p>
Another disadvantage can be seen when dealing with data sets so
large that they no longer fit in the computer's main memory.
Something like merge-sort can be modified fairly easily so that the
number of I/O operations is minimized.  I haven't thought about it too
deeply, but this doesn't seem quite so easy with heap-sort.
However, I think that people dealing with such large datasets
should probably be using specialized libraries anyway, so perhaps
that isn't too bad.
</p>
</div>
</div>
</div>
<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">Implementation</h2>
<div class="outline-text-2" id="text-3">
<p>
This whole discussion is a bit useless if it only remains at the
theoretical level. I have an implementation <a href="https://github.com/chrismgray/clojure-heap-sort">in my github repo</a> that
implements most of the ideas that are given above. The code is
highly optimized so that it is somewhat competitive with the native
Java implementation. This makes the code on the HEAD of the
<code>master</code> branch somewhat less than readable. However, the first
commit to the repository used Clojure vectors and a functional
style, so if you would like to understand the code, you might start
there.
</p>
<p>
I (unfortunately) needed to use Java arrays and mutation rather than
the more functional style that I have gotten used to, but the
results speak for themselves. Finding the first few elements of a
sorted array is significantly faster than the Java version. Finding
the entire sorted list is somewhat slower than the Java version, but
not too much. This is not surprising for a couple of reasons.
First, heap-sort tends to have larger constants than other sorting
methods. Secondly, this code is one day old. The Java sort method
has had years to be optimized.
</p>
</div>
</div>
<div id="outline-container-4" class="outline-2">
<h2 id="sec-4">Conclusion</h2>
<div class="outline-text-2" id="text-4">
<p>
I may be dreaming, but I would love to see this idea (if not this
implementation) put into Clojure proper. I think the advantages
from laziness outweigh the small constant slowdown versus using
Java's <code>sort</code>.
</p>
</div>
</div>