Facebook employees, with a few individual exceptions, don’t believe their company has crossed a line yet. Twitter employees, again with a few individual exceptions, don’t believe their company has crossed a line yet. We know this because they haven’t put down the tools. And by continuing to aid the companies making those decisions by selling them their labor, they’ve become complicit in their actions. They haven’t organized. They haven’t made a stand.
And they won’t.
Way back in November (I realise this puts my comment in the category of insanely untimely responses, but so be it) Ruben Verborgh wrote an article called The lie of the API, which I got to via Jeremy Keith.
Let me, before I lay out my disagreement, start by saying that I agree with the basic gist of both their arguments: There is little reason why an HTML representation of a piece of content is freely available, but a JSON (or XML or YAML) representation requires an OAuthenticated API.
Although Keith disagrees that content negotiation is the way forward, his waryness seems to be of a nature that could be URL hacked away — instead of using actual content negotiation (
Accept: application/json), one could simply append .json to the end of the URL, or other similar measure. It may not be the absolute cleanest, most native HTTP implementation, but I’m sure Verborgh and Keith could both live with this.
This doesn’t pave over the reliability of the API though, which I’ll argue is an, albeit maybe not explicitly communicated, major reason why sites choose to offer their data via an actual API.
When you make a JSON representation of some data, the structure will not be self-explanatory, and you will be forced to choose a structure for the data. Unless you’re dealing with Platonic ideals (and are very good at achieving those ideals on the first try) this structure is bound to change. You might add a field, remove a field, or change the semantics of a field.
If you do this with an HTML document, you can go about the change any way you like – so long as the new structure of the data still conforms to the browser’s expectation of how an HTML page should be structured, you’ll still have something usable. If you do this with JSON – assuming the JSON representation is only read by a machine, which will almost always be the case – the consumer will quite possibly break, unless you’ve changed the consumer accordingly, or notified them in advance if they’re an external entity. This last case is where the (versioned) API comes into play: If one can make changes that don’t break existing implementations, by somehow working a versioning scheme into the API, that’s a bonus. HTTP Accept will not let you do this.
For representations that are only intended to be used internally, a changing, non-versioned JSON one may suffice; if one has control of the entire stack, one doesn’t need to maintain backwards compatibility to the same degree. But those sorts of APIs wouldn’t be subject to OAuth restrictions anyway.
I agree that an OAuth token shouldn’t be necessary to get a JSON representation of one’s Twitter stream, when an HTML representation is freely available – but I do think that the nature of intended-for-machines representations are so substantially different from intended-for-humans representations that some sort of agreement (and documentation) is required. If you can find an existing format that fits (
Accept: application/atom+xml for a Twitter stream), by all means use it, but that’s also locking yourself into a model that may not fit your data exactly as you’d like it to – and unlike HTML, you have no way of telling the consumer what to do with your seemingly arbitrarily structured data, the way you do with CSS.
Insofar as it’s possible, you should make representations of your data that fit the user agent’s
Accept header; but if you don’t commit to the structure you choose, it will be unreliable and essentially unusable for machines parsing it.