iTunes feed extractor

The downside of Apple’s iPod/iPhone being so popular is that so many podcasts only publish iTunes links, instead of the more standard RSS/Atom feeds. And I know of OS X and Windows users who detest iTunes — imagine how Unix users feel!

Well, the feeds are still there, but hidden from plain sight — turns out, though, that if you pretend to be iTunes, you can actually trick the iTMS server into giving you the raw data. And with Python 2.6’s built-in support for Apple’s property lists, extracting the feed is a trivial matter.
Continue reading

OOP in Lua: abstract methods

I discovered Python’s delightful abc module (introduced in PEP 3119) a while back, and have been wondering how a similar functionality could be introduced to other dynamic languages (e.g. Lua and Ruby).

Being more familiar with the former, over the weekend I decided to try and write an equivalent module. Two problems came to mind:

  • Python uses an @abstractmethod decorator to mark an otherwise-normal function definition. While a decorator pattern can be used in Lua (though not a standard practice), this Python decorator works by setting an attribute in the function it decorates:

    funcobj.__isabstractmethod__ = True

    This is not an option in Lua, because functions don’t have individual metatables.

  • Python has a standard way of doing OOP, and the enforcement that an instantiated class does not contain any abstract method is done in the __new__ method of the ABCMeta metaclass. Contrast to Lua, where OOP is normally done in an ad-hoc manner — and even the LOOP library provides several OOP inheritance mechanisms.

The solution I adopted is delightfully simple: the abc module provides two functions: an amethod function that throws an exception no matter what arguments it is passed, and a verify function that takes a classname, and reflects on the members (using loop.cached.allmembers if possible, falling back to pairs if loop is not installed) and check if any of them is equal to amethod.

This works even on simple tables, though it shifts the responsibility a bit to the programmer: unit tests should be used to verify each created class. As a fallback, attempts to use the abstract methods will fail at runtime, but that’s taking dynamism a bit too far…

Still working out how to get the Pythonic behavior (instantiating a class with some abstract members should fail) when using LOOP. Meanwhile, try it out for yourself and make sure to file bug reports!

Technorati Tags: , , , , , , ,

[PYTHON] Find of the day: B+ Tree-based lists

I was looking at implementing Clojure’s persistent data structures on other languages — being able to assume that collections are immutable make writing concurrent programs much easier, since these collections can be shared without locking.

While looking if this has been done before, I came across a rejected Python Enhancement Proposal:

The common case for list operations is on small lists. The current array-based list implementation excels at small lists due to the strong locality of reference and infrequency of memory allocation operations. However, an array takes O(n) time to insert and delete elements, which can become problematic as the list gets large.

This PEP introduces a new data type, the BList, that has array-like and tree-like aspects. It enjoys the same good performance on small lists as the existing array-based implementation, but offers superior asymptotic performance for most operations. This PEP proposes replacing the makes two mutually exclusive proposals for including the BList type in Python:

  1. Add it to the collections module, or
  2. Replace the existing list type

It is currently rejected, but could be added to the collections module if there is sufficient outside interest. This is not quite the immutable vector from Clojure, but close enough: one merely needs to subclass it, and modify the setters (__setitem__, __setslice__, etc.) to first copy the collection and then operate on the copy. Copy-on-write ensures that only the changed leaf is actually copied, plus the internal nodes on the path leading to the leaf.

I’ve packaged this for Fedora (review request). Anyone cares to review it?

Technorati Tags: , , ,

ContactBot: noiselessly telling people where you are

Ever been in a situation where you are abroad for a short period of time, or you forgot your mobile phone charger, or otherwise not reachable at your normal phone number? I’ve been there, and it can be annoying to handle, unless you’re the kind of person who keeps lists (and even then, text messages ain’t free!). Or worse, you have several mobile phones, and you juggle them around, and people don’t know which one you are on right now…

There’s Google Voice, and other forwarding services, but that would require giving people a different number; one whose voice quality is probably not as good. And if the new number is only needed temporarily, that seems like an overkill.

Enter ContactBot. It provides a Twitter bot; you can tell it your phone number and location, and your friends (people you follow, not people who follow you) can then query the bot for your whereabouts.

The test bot is @hircus_contact. It is rate-limited at the basic 150 messages per hour, so if you want to heavily test it, or do not want to give me your phone number (I run the bot, after all!), feel free to check out the code and run your own bot!

Note that the bot does not currently add you back, so you’d have to wait until I personally add you before you can send direct messages. Should take less than a day.

TODO: timezone support, so the bot can warn your tactless friends not to call you at 3 a.m., documentation, and test cases. And online help.

Current command set:

  • D hircus_contact set phone 555-5555
  • D hircus_contact set location New York, NY
  • D hircus_contact phone ma_cherie
  • D hircus_contact location ma_cherie

Technorati Tags: , , , ,

Python tail-call optimization, done right

It occured to me this morning to revisit the issue of tail-call-optmization using function decorators in Python. Last time I checked, the working trick involves stack inspection (works only in CPython) and throwing an exception whenever a tail call is detected. In short: non-portable and slow. I posted an enhancement here that allows for mutual recursion (function A tail-calling function B tail-calling function A …), but it did not occur to me that the stack inspection hack, clever as it is, could be improved on.

Improved on it has: Miguel Perez is reporting that his solution runs pretty much as fast as normal looping. Supports mutual recursion and is completely portable too.

clipped from groups.google.com
Please critique this tail call optimizing decorator I’ve written. I’ve tried to fix the pitfalls of other proposed decorators, and the result is this one that supports mutual recursion, does not use exceptions, stack inspection or any implementation-dependent hack, and is pretty short and fast – the fastest out of the ones I could find and try. In fact, in tail-recursive environments I tested the impact of using the decorator is difficult to even measure, as the extra time the decorator takes to run is probably saved by the better use of cache memory. The only caveat is that if used in a function that’s not called in a tail-recursive fashion, bad things will happen.
  blog it

Python tail-call decorator

Two days ago, this tail call decorator
by Crutcher Dunnavant was mentioned on LtU. It’s a clever little trick: the decorator returns a new function that, when called, checks if its grandparent
is itself. This would mean that a function declared to be tail-call-optimized is called from another tail-called-optimized function, and so the caller’s frame in the stack can be wiped. How does it do this? The function throws an exception. This is caught by the grandparent, which then calls the saved function with the arguments passed in the exception.

This is the original implementation:


#!/usr/bin/env python2.4
# This program shows off a python decorator(
# which implements tail call optimization. It
# does this by throwing an exception if it is
# it's own grandparent, and catching such
# exceptions to recall the stack.

import sys

class TailRecurseException:
  def __init__(self, args, kwargs):
    self.args = args
    self.kwargs = kwargs

def tail_call_optimized(g):
  """
  This function decorates a function with tail call
  optimization. It does this by throwing an exception
  if it is it's own grandparent, and catching such
  exceptions to fake the tail call optimization.

  This function fails if the decorated
  function recurses in a non-tail context.
  """
  def func(*args, **kwargs):
    f = sys._getframe()
    if f.f_back and f.f_back.f_back         and f.f_back.f_back.f_code == f.f_code:
      raise TailRecurseException(args, kwargs)
    else:
      while 1:
        try:
          return g(*args, **kwargs)
        except TailRecurseException, e:
          args = e.args
          kwargs = e.kwargs
  func.__doc__ = g.__doc__
  return func

@tail_call_optimized
def factorial(n, acc=1):
  "calculate a factorial"
  if n == 0:
    return acc
  return factorial(n-1, n*acc)

print factorial(10000)
# prints a big, big number,
# but doesn't hit the recursion limit.

@tail_call_optimized
def fib(i, current = 0, next = 1):
  if i == 0:
    return current
  else:
    return fib(i - 1, next, current + next)

print fib(10000)
# also prints a big number,
# but doesn't hit the recursion limit.

There is a subtle bug: after handling the exception, func then calls its own saved version of g, not the g of the func that throws the exception. Modifying func and the TailRecurseException class so that the function to be called can be passed in the exception solves this nicely:

class TailRecurseException:
  def __init__(self, g, args, kwargs):
    self.g = g
    self.args = args
    self.kwargs = kwargs

  ...
  def func(*args, **kwargs):
    f = sys._getframe()
    if f.f_back and f.f_back.f_back         and f.f_back.f_back.f_code == f.f_code:
      raise TailRecurseException(g, args, kwargs)
    else:
      newg = g
      while 1:
        try:
          return newg(*args, **kwargs)
        except TailRecurseException, e:
          newg = e.g
          args = e.args
          kwargs = e.kwargs
  ...

Throwing an exception just to handle the required stack manipulation because the interpreter won’t handle it is grossly inefficient, granted, and requiring the programmer to declare functions as optimizable is not be ideal, but still a really neat trick.

As a side note, if a function can be called in tail and non-tail position, a non-decorated version should be made available, otherwise bad things can happen:

@tail_call_optimized
def add1(n):
  return (n+1)

@tail_call_optimized
def add2(n):
  return add1(add1(n))

This returns 41.

Modified version, and test cases, available here [IUCS] (backup copy here).

Update: added isTailCall by jorend. Now it is safe to optimize every function (though for efficiency reason you’d only optimize the ones that make tail calls).

Static (lexical) vs dynamic scoping

Eric and I were discussing scoping in Scheme and Python earlier today, our third over the past few weeks – and we finally nailed it shut. The first time he brought up dynamic scoping in Common Lisp and how Prof. Friedman dislikes it; the second was on how Python appears to have dynamic scoping (which turns out to be true, pre-Python 2.2), and now, thanks to Wikipedia, I think we have it right.

Provided Eric gets the H211 Introduction to Programming (Honors) class, which is in Python, and I get the C211 Introduction to Programming (Scheme), our discussion should stand us in good steed, though funnily today I played the Python guy and he played the Scheme one.

I’m going to show the examples, both in Scheme and Python; the first one in each section would appear to show that the language in question features dynamic scoping, which is incorrect as both actually do lexical scoping.

Scheme:
Bad:

(let ((pi 3.1415))
 (define area
  (lambda (r)
   (* pi r r)))
 (display (area 10))
 (newline)
 (set! pi 3)
 (display (area 10)))

Good:

(let ((pi 3.1415))
 (define area
  (lambda (r)
   (* pi r r)))
 (display (area 10))
 (newline)
 (let ((pi 3))
  (display (area 10))))

Python:
Bad:

pi = 3.1415
def area(r):
 return pi*r*r
area(10)
pi = 3
area(10)

Good:

pi = 3.1415
pi_holder = 10
def create_area():
 pi_holder = pi  # local pi_holder, different from pi_holder outside
 def area(r):
  return pi_holder*r*r
 return area
area = create_area()
area(10)    # 314.15
pi_holder    # Still 10
pi_holder = 3
area(10)    # Still 314.15

The above works, but is a bit problematic. I introduced pi_holder=10 to show that, (1), pi_holder inside of create_area() is a local variable; (2), that this local pi_holder is the one that is in area‘s scope, and thus changing the value of pi_holder does not affect it.

Isn’t it easier to just do pi = pi ? Well, that does not work. My initial hunch was that Python reads the LHS of the expression, decided pi has been redeclared as a local variable, and thus since it’s not been initialized it got confused trying to assign it the value of itself. But it’s actually worse; this code does not work either:

x = 42
def local_var_test():
 temp = x
 print temp  # 42
 x = temp
 print x   # 42?

Surprise! Python won’t let you do that either. Take out the last two lines and the code works, though. Basically, if in the block a variable is declared anywhere, it is a local variable everywhere in that block, and trying to refer to a variable declared in the surrounding scope, even before the local declaration, will fail.

But this is where default parameters come in handy. A better way to rewrite the clunky code above is as follows:

pi = 3.1415
def create_area(pi = pi):
 def area(r):
  return pi*r*r
 return area
area = create_area()
area(10)    #314.15
pi = 3
area(10)    #314.15

So Python has static scoping after all. The thing to bear in mind is that Scheme functions are named closures, while Python functions inherit the surrounding scope, so to freeze the variables you depend on you have to wrap your function definition inside another function that copies in the values you need into its local variables.

References:

And the funny thing is, I started the day trying to find good dynamic languages that run on the Java platform (platform envy, I guess, since .NET more prominently touts its language neutrality). Sun’s finally catching up, though – Tim Bray wrote a few months back about the Coyote project to support dynamic languages in Sun’s open source IDE, NetBeans, and pointed to an interesting Sun-developed scripting language, Pnuts. Which reminded me of Groovy and Boo.

Googling for groovy boo .net – Groovy being a Ruby-like scripting language for Java that received a lot of attention a few months ago, and then taken some flak over its development model, and Boo being the Python-like language for .NET – yields this very interesting Slashdot discussion that led me to such intriguing functional OO languages as Scala and Nice. .NET fans do not get to have all the fun!

Groovy, on the other hand, seems rather disappointing. Oh well. Scala looks more like Haskell, but with dynamic type inference (like Boo).. yay!

Update2005/06/06

Realized a few days ago, but haven’t gotten round to posting about it, that I was unfairly comparing Scheme and Python, and that Python methods are closures in themselves. Note:

pi = 3.1415
def area(r):
 return pi*r*r
print area(10)  # 314.15
def test():
 pi = 3
 print area(10) # 314.15
test()

In the earlier example, overriding the value of pi with pi = 3 is the equivalent of doing (set! pi 3) in Scheme, i.e. it will change the value of the variable that both the top-level pi, which is the one that area knows. In a dynamic scope, which uses a stack to figure out which value assignment should apply, pi = 3 would affect the call to area just after it.