Simply Jonathan

Portable EPUBs 

Permanent location of 'Portable EPUBs'

Compelling proposal from Will Crichton: using EPUB as a basis for a format to replace PDF.

The second most important reason I don’t read more academic literature than I do (the first of course being other commitments) is that they’re all published as PDF, so can’t really be read on a phone. Reflowing text just fits so much better, and if they were published as HTML (possibly EPUB) I’d have a much easier time getting into them.

From URL to Interactive 

Permanent location of 'From URL to Interactive'

A List Apart’s From URL to Interactive series just concluded, and I think it’s worth a read for any web developer.

It’s structured in a way that reminds me of one of my favourite books, Charles Petzold‘s Code, moving from the bottom of the stack to the top.

Essential image optimization 

An eBook by Addy Osmani on everything you could want to know, and then some, about the most efficient ways to serve images in browsers.

Learning Clojure

Over the last month or so, I’ve begun learning Clojure. I don’t do much blogging, let alone technical, but I realise I’ve actually had this blog for so long that my first impressions of Python (the language I spend most of my day job writing in), are documented.

I read through Kyle Kingsbury’s Clojure from the ground up series, and found it an easy learning process. Although I consider myself somewhat of a polyglot, I realised that I hadn’t actually learned any new programming languages in almost ten years, aside from various JavaScript type annotation supersets. (I’ve tried learning Haskell, but have been largely unsuccessful.)

All in all, I like it. I like its functional paradigm, making functions pure by default, but not having to ask permission to get side-effects (which is the feeling I get a bit with Haskell). Leiningen is a great tool to get up and running, and it takes care of a lot of the minutae, like installing dependencies and running tests. I really like being able to name functions almost anything, including non-ASCII characters and characters normally reserved (so now I can use possessive in a function name).

One thing I find about that I really dislike is the abbreviations. Now, this might simply be because it’s the first language I’ve picked up in a while, and I simply haven’t paid attention to it in other languages, but the incessant abbreviating every conceivable name drives me up the wall. Why in the world does it have to be conj and assoc, what’s wrong with conjoin and associate‽ The fact that abbreviations are applied so randomly proved a stumbling block for me, which I feel it really shouldn’t have to be. This is my first foray into Lisp, so I don’t know how much (if any) is simply convention, but space-saving concerns one might have had in the 50s can surely be ignored today. I get more riled up than I justifiably should be, but it irks me. (And again, I realise this might simply just be my internalisation of some abbreviations: I have no problem with str, concat and def.)

I find the destructuring syntax in a lot of cases to be greatly confusing and emanating magic. I have come to terms with let taking a vector of alternating key value pairs, but the sprinkling of keywords to imbue bindings with special properties means I’m still at a copy & paste–stage for some use cases. I’m not very far into macros yet (I have yet to write my first one), but from what I can sense, it leads to a lot of poorly designed APIs. But it might just take some getting used to. (I thought the self argument for Python methods was stupid at first, and now I don’t think about it.)

I also really miss Python’s named, any-order parameters. I realise something similar can be achieved in Clojure using keys destructuring, but that can’t be combined with arity overloading, which I also really like. (Yes, this might be a case of wanting to have a cake and eating it too.)

The lack of a good date and time library is also unfortunate (at least for the apps I tend to do). I’ve been using clj-time, which seems to be a pretty thin wrapper around Joda Time, and while it does its job, it has some odd shortcomings, the primary being its incapability of representing date-less times. I’ve resorted to vectors of hours, minutes, etc., but when you’re used to Python’s datetime library, specifically datetime.time in this instance, you find yourself wanting.

I have found one library that I really like, though: Enlive. It’s an unconventional templating library, in that it doesn’t make a DSL for templating (or, indeed, give access to the whole language, as in PHP), letting the templates instead be pure HTML, and doing the transformations in Clojure. It took me a little while to get the hang of doing things such as loops, but I think it makes for a clean separation of concerns, and I’ll definitely investigate the concept in Python. (There is a Python port, although it doesn’t seem to get much attention these days.)

All in all, I’m really excited about Clojure. For web development it lacks some of the maturity and cohesiveness that I’m used to with Django, but as a language it has a lot of interesting concepts and libraries.

In defence of the (documented) API

Way back in November (I realise this puts my comment in the category of insanely untimely responses, but so be it) Ruben Verborgh wrote an article called The lie of the API, which I got to via Jeremy Keith.

Let me, before I lay out my disagreement, start by saying that I agree with the basic gist of both their arguments: There is little reason why an HTML representation of a piece of content is freely available, but a JSON (or XML or YAML) representation requires an OAuthenticated API.

Although Keith disagrees that content negotiation is the way forward, his waryness seems to be of a nature that could be URL hacked away — instead of using actual content negotiation (Accept: application/json), one could simply append .json to the end of the URL, or other similar measure. It may not be the absolute cleanest, most native HTTP implementation, but I’m sure Verborgh and Keith could both live with this.

This doesn’t pave over the reliability of the API though, which I’ll argue is an, albeit maybe not explicitly communicated, major reason why sites choose to offer their data via an actual API.

The thing is, when you visit an HTML page, you do not need to know anything about the structure of the data. The browser and the developer have an agreement that HTML structured in a certain way, with CSS and JavaScript structured in their certain ways, will be displayed in a certain way (save of course for browser-inconsistencies, but that’s the general idea). That is why an HTML page can be viewed, and made sense of, by a human being, whereas a JSON representation for the far majority of people will make essentially no sense.

When you make a JSON representation of some data, the structure will not be self-explanatory, and you will be forced to choose a structure for the data. Unless you’re dealing with Platonic ideals (and are very good at achieving those ideals on the first try) this structure is bound to change. You might add a field, remove a field, or change the semantics of a field.

If you do this with an HTML document, you can go about the change any way you like – so long as the new structure of the data still conforms to the browser’s expectation of how an HTML page should be structured, you’ll still have something usable. If you do this with JSON – assuming the JSON representation is only read by a machine, which will almost always be the case – the consumer will quite possibly break, unless you’ve changed the consumer accordingly, or notified them in advance if they’re an external entity. This last case is where the (versioned) API comes into play: If one can make changes that don’t break existing implementations, by somehow working a versioning scheme into the API, that’s a bonus. HTTP Accept will not let you do this.

For representations that are only intended to be used internally, a changing, non-versioned JSON one may suffice; if one has control of the entire stack, one doesn’t need to maintain backwards compatibility to the same degree. But those sorts of APIs wouldn’t be subject to OAuth restrictions anyway.

I agree that an OAuth token shouldn’t be necessary to get a JSON representation of one’s Twitter stream, when an HTML representation is freely available – but I do think that the nature of intended-for-machines representations are so substantially different from intended-for-humans representations that some sort of agreement (and documentation) is required. If you can find an existing format that fits (Accept: application/atom+xml for a Twitter stream), by all means use it, but that’s also locking yourself into a model that may not fit your data exactly as you’d like it to – and unlike HTML, you have no way of telling the consumer what to do with your seemingly arbitrarily structured data, the way you do with CSS.

Insofar as it’s possible, you should make representations of your data that fit the user agent’s Accept header; but if you don’t commit to the structure you choose, it will be unreliable and essentially unusable for machines parsing it.

This is Simply Jonathan, a blog written by Jonathan Holst. It's mostly about technical topics (and mainly the Web at that), but an occasional post on clothing, sports, and general personal life topics can be found.

Jonathan Holst is a programmer, language enthusiast, sports fan, and appreciator of good design, living in Copenhagen, Denmark, Europe. He is also someone pretentious enough to call himself the 'author' of a blog. And talk about himself in the third person.