Simply Jonathan

What’s the Cheapest Empty Iterable in Python?

From the department of ‘optimisations that are so premature that if you ever find yourself actually caring about them, you need to find better problems to solve’, I recently had this thought: What’s the cheapest/fastest empty iterable in Python?

Reminder: iterable and iterator are not the same thing.
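As a quick illustration of that distinction, here is a minimal sketch (the variable names are only for illustration): an iterable is something you can pass to iter(), while an iterator is the one-shot cursor that iter() hands back.

```python
empty = ()            # an iterable: can be looped over as many times as you like
cursor = iter(empty)  # an iterator: a one-shot cursor over that iterable

# An iterator returns itself from iter(); a plain iterable does not.
iterable_is_own_iterator = iter(empty) is empty
iterator_is_own_iterator = iter(cursor) is cursor

# Asking an exhausted iterator for a value falls back to the default.
first = next(cursor, "exhausted")

print(iterable_is_own_iterator, iterator_is_own_iterator, first)
```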

The need for an empty iterable occasionally comes up, like when you need to provide a default value to a missing key in a dictionary, and you need it to be something you can iterate over without running the risk of a TypeError. The beautiful thing, of course, is that you can iterate over an empty iterable and just have nothing happen, so the actual type or contents don’t matter.
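A minimal sketch of that situation, assuming hypothetical post dictionaries with an optional "tags" key: passing an empty iterable as the default to dict.get() lets the loop run zero times, whereas the implicit default of None would raise.

```python
def tags_for(post):
    # "tags" is a hypothetical key; () is the empty-iterable default
    return [tag.upper() for tag in post.get("tags", ())]

with_tags = {"title": "Merge", "tags": ["wordpress", "django"]}
without_tags = {"title": "Untitled"}

shouted = tags_for(with_tags)
nothing = tags_for(without_tags)  # the loop body simply never runs

# Without a default, .get() returns None, which cannot be iterated over.
try:
    for tag in without_tags.get("tags"):
        pass
    raised = False
except TypeError:
    raised = True
```

Any empty iterable would do as the default here; the tuple is just one choice.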

So I set out to test it. Again: you should never need to actually care about this. If you can live with the actual overhead of iterating over something, you can live with the overhead if that something is empty, no matter the actual type of iterable.

I evaluated strings, lists, tuples, dictionaries and sets. My hypothesis was that the fastest would be a string or maybe a tuple.

The test was performed on a late-2016 13″ MacBook Pro with a 3.3 GHz Intel Core i7 and I timed it using the timeit module.

First I tested out simply declaring the different types of iterables:

kweli:~ j$ python -m timeit -c '""'
100000000 loops, best of 3: 0.00672 usec per loop
kweli:~ j$ python -m timeit -c '[]'
100000000 loops, best of 3: 0.0187 usec per loop
kweli:~ j$ python -m timeit -c '()'
100000000 loops, best of 3: 0.0119 usec per loop
kweli:~ j$ python -m timeit -c '{}'
10000000 loops, best of 3: 0.0305 usec per loop
kweli:~ j$ python -m timeit -c 'set()'
10000000 loops, best of 3: 0.0924 usec per loop

So far so good: strings are the fastest, followed by tuples, lists and dicts, with sets trailing far behind.

Then to actually iterating over them:

kweli:~ j$ python -m timeit -c 'for i in "": pass'
10000000 loops, best of 3: 0.0433 usec per loop
kweli:~ j$ python -m timeit -c 'for i in []: pass'
10000000 loops, best of 3: 0.0514 usec per loop
kweli:~ j$ python -m timeit -c 'for i in (): pass'
10000000 loops, best of 3: 0.0438 usec per loop
kweli:~ j$ python -m timeit -c 'for i in {}: pass'
10000000 loops, best of 3: 0.0707 usec per loop
kweli:~ j$ python -m timeit -c 'for i in set(): pass'
10000000 loops, best of 3: 0.136 usec per loop

And again, the hypothesis is confirmed, but interestingly the relative difference between lists and strings/tuples is much smaller when iterating than when just declaring.
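For anyone who wants to repeat the iteration test without retyping the commands, here is a rough sketch using the timeit module from a script instead of the command line. The loop count and the output formatting are my own choices for the sketch, and the numbers will of course differ by machine and Python version.

```python
import timeit

# The same five empty iterables as above, as source snippets.
candidates = ['""', "[]", "()", "{}", "set()"]
loops = 100000  # assumption: far fewer loops than the CLI picks, to keep it quick

results = {}
for expr in candidates:
    stmt = "for i in %s: pass" % expr
    # repeat() returns the total seconds for each run; like the CLI's
    # "best of 3", keep the minimum and convert to usec per loop.
    best = min(timeit.repeat(stmt, repeat=3, number=loops))
    results[expr] = best / loops * 1e6

for expr, usec in sorted(results.items(), key=lambda kv: kv[1]):
    print("{:>6}: {:.4f} usec per loop".format(expr, usec))
```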

So in conclusion, use a string as an empty iterable, unless you have any reason at all not to. The difference is infinitesimal.

Merge

The day has arrived at last; I have merged Simply Jonathan and holst.notes at the simplyjonathan.com domain. The reasons for this are many, but the primary one is that it seemed unnecessary to have them separated. So, starting some time last week, I began redesigning Simply Jonathan to accommodate new, shorter entries.

A long time ago, I had planned to redesign Simply Jonathan and move it to WordPress. The things I said back then are still true; I just realised that they did not really matter. No, blog systems are still not geared towards the longer entries I want to write, but they are not hostile either, and so it seemed strange to hold on to a publishing engine that did not really cut it. The problem was that, after I had launched it, I did not really want to do any more work on it. This left me with a raw Django admin that, albeit pretty, lacked all the automatic processes I need to write efficiently. WordPress had those, which made the horrible code throughout the system and the stupid template system matter less. I still do not like WordPress, but it does as labelled, and that is fine for me at the moment.

In the midst of it, I also decided to merge my del.icio.us postings into the blog, to emulate a Gruber-esque Linked List approach. A, shall we say, interesting experience. More on that in a bit.

So, after I realised that I would move it into one WordPress installation, I did as follows:

  1. I manually entered the three Simply Jonathan entries into the system. It seemed like too much work to write a conversion script from a custom format to WordPress, since it was only about three entries. Copy/paste in all its glory.
  2. I then wanted to import all the 100-something notes into the system. Now, suffice it to say, this was too much manual labour. And since they were both WordPress, I figured I could just export the holst.notes posts table, and import it into the Simply Jonathan one. Not so. Apparently, I had used ISO-8859-1 with holst.notes, but I opted for UTF-8 on Simply Jonathan. Oh well, this is not really anything I can blame the WordPress team for, this was my own fault.

    So I decided to investigate the ex-/import possibilities I had discovered in the Simply Jonathan interface. However, holst.notes was running an ancient version of WordPress, one from before the time they realised that this sort of behaviour could happen. Bugger. Oh well, I would just have to use the WordPress-to-WordPress plugin. So I did. But when the time came to upload the export, WordPress gave me an error. It turns out, upon code investigation, that WordPress believes it can safely write to whatever location PHP stores its temporary upload files in. On my host, it could not. So I had to FTP the file to the server, and then hardcode the location of that file. Not pretty, but at least it worked. Then I did a bit of hacking regarding categories, as I decided to slightly change formats, and holst.notes was done.

  3. Importing del.icio.us posts proved to be quite tricky, although that was to be expected, given the rather custom format I wanted. I wound up writing a hacked-up parsing mechanism for the bookmark format del.icio.us exports into (which seems to be a browser-parseable format). Using WordPress’ quite excellent custom fields, I managed to hack up a solution that works quite well.
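To give an idea of what parsing that bookmark export can look like, here is a rough sketch in Python. It is not the script I used: it assumes the export is the browser-style bookmark HTML, with one `<A>` element per link carrying `HREF` and a comma-separated `TAGS` attribute, and the sample data, the class name and the dictionary keys are made up for illustration.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collects one dict per <A> element in a bookmark export (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link currently being read, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            attrs = dict(attrs)
            raw_tags = attrs.get("tags") or ""
            self._current = {
                "url": attrs.get("href") or "",
                "tags": [t for t in raw_tags.split(",") if t],
                "title": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.links.append(self._current)
            self._current = None


# Made-up sample in the assumed export format.
sample = """<DL><p>
<DT><A HREF="https://www.python.org/" ADD_DATE="1136073600" TAGS="python,programming">Python</A>
<DT><A HREF="https://wordpress.org/" ADD_DATE="1136160000" TAGS="">WordPress</A>
</DL><p>"""

parser = BookmarkParser()
parser.feed(sample)
parser.close()
links = parser.links
```

Each resulting dictionary could then be turned into a post, with the URL stored in a custom field, though how that mapping is done depends entirely on the blog system.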

So now I have a new style blog with three different types of content: links, notes and essays. I hope it will be great.

This is Simply Jonathan, a blog written by Jonathan Holst. It's mostly about technical topics (and mainly the Web at that), but an occasional post on clothing, sports, and general personal life topics can be found.

Jonathan Holst is a programmer, language enthusiast, sports fan, and appreciator of good design, living in Copenhagen, Denmark, Europe. He is also someone pretentious enough to call himself the 'author' of a blog. And talk about himself in the third person.