Tim Bray’s Wide Finder: a minimalist implementation

Several weeks ago, Tim Bray posted his Wide Finder project: take the Ruby script that parses an Apache log file and report the top 10 hits, and parallelize it in your language of choice.

It occurred to me a while back that this is a perfect job for a C multi-process program, taking advantage of Linux’s cheap copy-on-write fork, if not for the need to merge the result. So it would probably be easier to write it using the pthreads library instead. One would want to reduce inter-thread communication as much as possible, though.

I did not have time to touch the code until today, but now it’s done. Two key insights:

  • The input file can be partitioned cleanly into multiple chunks of roughly identical sizes without communication, as long as each thread follows the same protocol. i.e. a common chunk size is used, and all but the first thread needs to check if the character before their starting offset is the end-of-line. If not, they need to skip until the end of line, yielding that line to the previous thread.
  • When merging the hash maps, the first thread to acquire the lock does not need to insert its items one by one, but can just set the main hash map to point to its own

On a data set size of 200MB, with a single thread, performs slightly slower as Ruby by wall clock:

Ruby timing

2.99user 1.88system 0:05.26elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
152inputs+0outputs (3major+853minor)pagefaults 0swaps
3.04user 1.84system 0:05.33elapsed 91%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+856minor)pagefaults 0swaps
2.84user 2.01system 0:05.19elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+856minor)pagefaults 0swaps

C++ timing, one thread

2.70user 5.21system 0:07.95elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+655minor)pagefaults 0swaps
2.78user 5.14system 0:07.96elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+655minor)pagefaults 0swaps
2.72user 5.12system 0:07.90elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+654minor)pagefaults 0swaps

With two threads, though, it runs faster, almost at Ruby speed.

C++ timing, two threads

3.33user 5.40system 0:05.51elapsed 158%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+675minor)pagefaults 0swaps
3.40user 5.33system 0:05.86elapsed 148%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+672minor)pagefaults 0swaps
3.41user 5.36system 0:05.60elapsed 156%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+674minor)pagefaults 0swaps

So, as Tim observed, the problem is not entirely IO-bound. More testing is needed, but will probably need to be done on a machine with faster IO (and more CPUs). Like Tim’s new Niagara T2 testbed.

Will edit this post later — need to run now, and wanted to get this out as soon as possible. Code is available here… now I just need to find Tim’s email, since his blog is down. Argh!

Update: switched to the new std::tr1::unsorted_map. Performance seems identical with a single thread, but slightly higher (within margin of error) on two threads — which makes sense: a multithreaded run exercises the map more, because each thread’s map has to be merged into the final map.

4 responses to “Tim Bray’s Wide Finder: a minimalist implementation

  1. You can certainly see your skills within the work you write.
    The arena hopes for more passionate writers such as you
    who are not afraid to say how they believe. At all times follow your heart.

  2. of course like your web-site however you have to test the spelling on quite a few of your posts.
    Many of them are rife with spelling problems and I to find it very bothersome to tell the truth nevertheless I will definitely come back again.

  3. My partner and I stumbled over here different website
    and thought I might as well check things out.
    I like what I see so now i’m following you. Look forward to looking into your web page repeatedly.

  4. Awesome blog you have here but I was curious about
    if you knew of any discussion boards that cover the same topics
    talked about in this article? I’d really like to be a part of group where I can get advice from other experienced individuals that share the same interest. If you have any suggestions, please let me know. Thanks a lot!