[PYTHON] Find of the day: B+ Tree-based lists

I was looking at implementing Clojure’s persistent data structures on other languages — being able to assume that collections are immutable make writing concurrent programs much easier, since these collections can be shared without locking.

While looking if this has been done before, I came across a rejected Python Enhancement Proposal:

The common case for list operations is on small lists. The current array-based list implementation excels at small lists due to the strong locality of reference and infrequency of memory allocation operations. However, an array takes O(n) time to insert and delete elements, which can become problematic as the list gets large.

This PEP introduces a new data type, the BList, that has array-like and tree-like aspects. It enjoys the same good performance on small lists as the existing array-based implementation, but offers superior asymptotic performance for most operations. This PEP proposes replacing the makes two mutually exclusive proposals for including the BList type in Python:

  1. Add it to the collections module, or
  2. Replace the existing list type

It is currently rejected, but could be added to the collections module if there is sufficient outside interest. This is not quite the immutable vector from Clojure, but close enough: one merely needs to subclass it, and modify the setters (__setitem__, __setslice__, etc.) to first copy the collection and then operate on the copy. Copy-on-write ensures that only the changed leaf is actually copied, plus the internal nodes on the path leading to the leaf.

I’ve packaged this for Fedora (review request). Anyone cares to review it?

Technorati Tags: , , ,


2 responses to “[PYTHON] Find of the day: B+ Tree-based lists