[Mini Kanren] Benchmarking different substitution data structures
In logic programming, a substitution is a mapping from logic variables to values (including other logic variables). Logic programs are composed of goals, with the type signature Subst => Subst* — i.e. it might fail (thus generating 0 possible substitution), succeed once, or succeed multiple times, or even an infinite number of times.
It is necessary for substitutions to behave as values — if you take a substitution, and add a binding, you better get a new substitution, rather than do the modification in place, because that substitution might be used by other goals. In Lisp, the traditional data structure for this is the association list: a list of lists, where the cars (first elements) of each inner list are the keys. Clojure takes Phil Bagwell’s Hash Array Mapped Trie and adds persistency on top of it, to the effect that you have a copy-on-write trie map, with logarithmic rather than linear lookup times.
Technorati Tags: functional, logic, programming, clojure, scala, jvm, data-structures, tries
Using association lists, my Scala code was performing about 3x slower than Petite Chez Scheme. Horrendous results — interestingly, using linked case objects optimized for storing keys and values reduce the gap, to the point that it was only about 10-15% slower. Still not great, but acceptable. Since my Mini Kanren implementation is written in Scala, not Clojure, I tried using Scala’s own immutable maps, reasoning that with Bagwell tries, the boost in look-up performance (O(log n) vs O(n)) ought to outweigh the increased cost of extending (O(log n) vs O(1)). Alas, Scala’s maps are not Bagwell tries. One simply gets an OOM error; I’d have to dig deeper to find out exactly why.
The solution, of course, is to use Clojure’s maps in Scala. Doable, after patching its Java interface a bit — Clojure being a Lisp, there is a penchant for short names, and therefore an IMapEntry has a getter called val(), instead of value() — but val is a keyword in Scala (denoting values, i.e. names, like variables but immutable). Patched code lives in a branch on my Clojure fork; hopefully this can get merged in as a stop-gap until Scala gets its own tries.
And with Clojure maps? Scala Kanren is now 2.5 times *faster* than Mini Kanren on Petite Chez. Mind-boggling. Oh, and you’d want the raw numbers, naturally. Bear in mind that this is used on a desktop system that is running Firefox, etc. at the same time. I’d need to do longer runs (e.g. 100x) from runlevel S to get better results.
Next speed-up is probably going to be obtained by reimplementing this back in Clojure — it has a built-in parallel map that works on lazy sequences, unlike Scala.
| Strategy | 1 | 2 | 3 | avg | rel |
|---|---|---|---|---|---|
| association list (Scheme) | 39312 | 38853 | 39207 | 39124.0 | 1.0 |
| association list (Scala) | 114562 | 111629 | 109939 | 112043.3. | 2.86x |
| linked triples (Scala) | 47833 | 44813 | 44277 | 45641.0 | 1.167 |
| Immutable maps (Scala) | OOM | ||||
| Clojure persistent maps (Scala) | 17955 | 15586 | 13482 | 15674.3. | 0.40 |








try `var` to work around the keyword issue in scala.
Yup, thanks for that — found that out on the Scala mailing lists. The workaround should go away once Scala 2.8 comes out — provided the new persistent map implementation benchmarks well.