Simply Jonathan

Strings do too many things 


This is not really Hillel’s point, but my main gripe with the “static types will save us” philosophy of some is that so many things are represented as just strings, yet have de facto constraints that the type system can’t represent. That means you either have to rely on runtime validation (in which case you haven’t turned them into compile-time errors) or just expect people to adhere to the constraints.
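To make the gripe concrete, here is a minimal Python sketch. The `EmailAddress` type, the `parse_email` helper and the deliberately simplistic pattern are all hypothetical, chosen only for illustration. A type checker will treat `EmailAddress` as distinct from `str`, but the constraint itself is still only enforced by a check at runtime.

```python
import re
from typing import NewType

# Hypothetical type: to a static checker this is distinct from str,
# but at runtime it is still just a plain string.
EmailAddress = NewType("EmailAddress", str)

# Deliberately simplistic pattern, for illustration only.
_EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def parse_email(raw: str) -> EmailAddress:
    """Return raw as an EmailAddress, or raise ValueError at runtime."""
    if not _EMAIL_PATTERN.fullmatch(raw):
        raise ValueError(f"not a valid email address: {raw!r}")
    return EmailAddress(raw)
```

Nothing stops someone from calling `EmailAddress("whatever")` directly, which is the "just expect people to adhere to the constraints" half of the problem.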

Programmer migration patterns 

Super interesting, albeit completely unscientific, look at the history of programming languages, and the way different categories of programmers have switched from one to the other.

Effective Mental Models for Code and Systems 

Some great advice from Cindy Sridharan on how to write code optimised for others to read, that greatest of Knuthian pursuits.

Checking for missing migrations in Django

Back in the day, Django didn’t have a built-in way to change model schemas. You either had to figure out and apply the changes yourself or use a third party tool like South.

After a successful Kickstarter campaign, Andrew Godwin, Django core member and the author of South, added a native migration tool to Django. It solved problems such as conflicting merge names, and added the ability to squash migrations once there are too many of them in an app.

One thing it also did was introduce what I have dubbed ‘cosmetic migrations’: these are migrations that make no changes to the database schema, but only record internal Django changes, such as changing a field’s choices or the ordering of a model. I’m sure the change is for good reason, but it annoys me to no end, because the lack of schema impact means I’m unlikely to notice that I haven’t made them until at some point I do make a schema-altering change, and that migration is then flooded with an untold number of cosmetic changes. This is a problem because commits should be atomic.

Django will occasionally notify you that ‘[y]our models have changes that are not yet reflected in a migration’, but I found that I would only see those when it was too late.

The solution

Thankfully, it is possible to make these checks yourself, although I have never seen it advertised anywhere. Executing this command will give you what you need:

$ python manage.py makemigrations --check

This command seems wholly counter-intuitive to me, but it does what I want: Exit with a code 1 if there are unreflected changes.

You can plug the above into your Continuous Integration system or possibly a pre-commit hook. If you do so, I recommend that you also make it --dry-run, like so:

$ python manage.py makemigrations --check --dry-run

This will ensure that no migrations are actually created, which just seems the saner option if you want to check.
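As a rough sketch of what such a CI step or pre-commit hook could look like in Python: the helper name `has_missing_migrations` is made up for this example, and the default command assumes the script is run from the directory containing `manage.py`. Note that any non-zero exit code, including one from an unrelated Django error, is treated as a failure here.

```python
import subprocess
import sys

# Assumed invocation: run from the directory that contains manage.py.
CHECK_COMMAND = (sys.executable, "manage.py", "makemigrations", "--check", "--dry-run")


def has_missing_migrations(command=CHECK_COMMAND) -> bool:
    """Return True if the check command exits with a non-zero status."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode != 0
```

A hook script could then end with `sys.exit(1 if has_missing_migrations() else 0)` so that the commit or build fails when changes are not reflected in a migration.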

The Elements of Python Style 

A proposed style guide for Python, not quite as specific as PEP8, but dealing with some things that PEP8 doesn’t.

I agree with most of this, but this one in particular stood out to me because it echoes what I said in my first impressions of Clojure:

No one wins any points for shortening “response” to “rsp”.

(Via Python Bytes, episode #14)

Learning Clojure

Over the last month or so, I’ve begun learning Clojure. I don’t do much blogging, let alone technical, but I realise I’ve actually had this blog for so long that my first impressions of Python (the language I spend most of my day job writing in), are documented.

I read through Kyle Kingsbury’s Clojure from the ground up series, and found it an easy learning process. Although I consider myself somewhat of a polyglot, I realised that I hadn’t actually learned any new programming languages in almost ten years, aside from various JavaScript type annotation supersets. (I’ve tried learning Haskell, but have been largely unsuccessful.)

All in all, I like it. I like its functional paradigm, making functions pure by default, but not having to ask permission to get side-effects (which is the feeling I get a bit with Haskell). Leiningen is a great tool to get up and running, and it takes care of a lot of the minutiae, like installing dependencies and running tests. I really like being able to name functions almost anything, including non-ASCII characters and characters normally reserved (so now I can use possessive in a function name).

One thing I find that I really dislike is the abbreviations. Now, this might simply be because it’s the first language I’ve picked up in a while, and I simply haven’t paid attention to it in other languages, but the incessant abbreviating of every conceivable name drives me up the wall. Why in the world does it have to be conj and assoc, what’s wrong with conjoin and associate‽ The fact that abbreviations are applied so randomly proved a stumbling block for me, which I feel it really shouldn’t have to be. This is my first foray into Lisp, so I don’t know how much (if any) is simply convention, but space-saving concerns one might have had in the 50s can surely be ignored today. I get more riled up than I justifiably should be, but it irks me. (And again, I realise this might simply be my internalisation of some abbreviations: I have no problem with str, concat and def.)

I find the destructuring syntax in a lot of cases to be greatly confusing and emanating magic. I have come to terms with let taking a vector of alternating key value pairs, but the sprinkling of keywords to imbue bindings with special properties means I’m still at a copy & paste–stage for some use cases. I’m not very far into macros yet (I have yet to write my first one), but from what I can sense, it leads to a lot of poorly designed APIs. But it might just take some getting used to. (I thought the self argument for Python methods was stupid at first, and now I don’t think about it.)

I also really miss Python’s named, any-order parameters. I realise something similar can be achieved in Clojure using keys destructuring, but that can’t be combined with arity overloading, which I also really like. (Yes, this might be a case of wanting to have my cake and eat it too.)
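For comparison, this is the sort of thing I mean in Python; `schedule_reminder` and its parameters are made up for the example. Keyword arguments can be passed in any order, and default values cover much of what arity overloading would otherwise do.

```python
def schedule_reminder(message, *, hour=9, minute=0, repeat=False):
    """Hypothetical function: keyword-only parameters with defaults."""
    suffix = " (repeating)" if repeat else ""
    return f"{hour:02d}:{minute:02d} {message}{suffix}"


# Same call, with the named arguments given in a different order.
a = schedule_reminder("Stand-up", hour=10, minute=30)
b = schedule_reminder("Stand-up", minute=30, hour=10)

# Omitting arguments falls back to the defaults, much like a lower arity.
c = schedule_reminder("Stand-up")
```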

The lack of a good date and time library is also unfortunate (at least for the apps I tend to do). I’ve been using clj-time, which seems to be a pretty thin wrapper around Joda Time, and while it does its job, it has some odd shortcomings, the primary being its incapability of representing date-less times. I’ve resorted to vectors of hours, minutes, etc., but when you’re used to Python’s datetime library, specifically datetime.time in this instance, you find yourself wanting.
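For reference, this is what I'm used to from Python's standard library; the variable names are just for illustration.

```python
import datetime

# A time of day with no date attached.
opening = datetime.time(hour=9, minute=30)
closing = datetime.time(hour=17)

# Times compare naturally and format without any date component.
is_open_at_noon = opening <= datetime.time(12, 0) < closing
label = opening.isoformat()
```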

I have found one library that I really like, though: Enlive. It’s an unconventional templating library, in that it doesn’t make a DSL for templating (or, indeed, give access to the whole language, as in PHP), letting the templates instead be pure HTML, and doing the transformations in Clojure. It took me a little while to get the hang of doing things such as loops, but I think it makes for a clean separation of concerns, and I’ll definitely investigate the concept in Python. (There is a Python port, although it doesn’t seem to get much attention these days.)

All in all, I’m really excited about Clojure. For web development it lacks some of the maturity and cohesiveness that I’m used to with Django, but as a language it has a lot of interesting concepts and libraries.

In defence of the (documented) API

Way back in November (I realise this puts my comment in the category of insanely untimely responses, but so be it) Ruben Verborgh wrote an article called The lie of the API, which I got to via Jeremy Keith.

Let me, before I lay out my disagreement, start by saying that I agree with the basic gist of both their arguments: There is little reason why an HTML representation of a piece of content is freely available, but a JSON (or XML or YAML) representation requires an OAuthenticated API.

Although Keith disagrees that content negotiation is the way forward, his wariness seems to be of a nature that could be URL hacked away: instead of using actual content negotiation (Accept: application/json), one could simply append .json to the end of the URL, or use some similar measure. It may not be the absolute cleanest, most native HTTP implementation, but I’m sure Verborgh and Keith could both live with this.
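A minimal sketch of how the two approaches could coexist on the server side, assuming only HTML and JSON representations exist. The function name `choose_media_type` is hypothetical, and the Accept handling is deliberately naive: it takes the listed types in order and ignores q-values, which a real implementation should not.

```python
SUFFIX_TYPES = {".json": "application/json", ".html": "text/html"}
SUPPORTED = ("text/html", "application/json")


def choose_media_type(path: str, accept: str = "") -> str:
    """Pick a representation from a URL suffix, else from the Accept header."""
    # URL hack: an explicit suffix wins over the header.
    for suffix, media_type in SUFFIX_TYPES.items():
        if path.endswith(suffix):
            return media_type
    # Naive Accept handling: first supported type in listed order, no q-values.
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in SUPPORTED:
            return media_type
    # Default to the human-readable representation.
    return "text/html"
```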

This doesn’t address the reliability of the API though, which I’ll argue is a major, albeit maybe not explicitly communicated, reason why sites choose to offer their data via an actual API.

The thing is, when you visit an HTML page, you do not need to know anything about the structure of the data. The browser and the developer have an agreement that HTML structured in a certain way, with CSS and JavaScript structured in their certain ways, will be displayed in a certain way (save of course for browser inconsistencies, but that’s the general idea). That is why an HTML page can be viewed, and made sense of, by a human being, whereas a JSON representation will make essentially no sense to the vast majority of people.

When you make a JSON representation of some data, the structure will not be self-explanatory, and you will be forced to choose a structure for the data. Unless you’re dealing with Platonic ideals (and are very good at achieving those ideals on the first try) this structure is bound to change. You might add a field, remove a field, or change the semantics of a field.

If you do this with an HTML document, you can go about the change any way you like – so long as the new structure of the data still conforms to the browser’s expectation of how an HTML page should be structured, you’ll still have something usable. If you do this with JSON – assuming the JSON representation is only read by a machine, which will almost always be the case – the consumer will quite possibly break, unless you’ve changed the consumer accordingly, or notified them in advance if they’re an external entity. This last case is where the (versioned) API comes into play: If one can make changes that don’t break existing implementations, by somehow working a versioning scheme into the API, that’s a bonus. HTTP Accept will not let you do this.
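To illustrate, here is a hedged sketch of what committing to a structure per version might look like. The `serialize_post` function, the field names and the version numbers are all invented for the example; how the version is communicated (URL, parameter, or otherwise) is left open.

```python
def serialize_post(post: dict, version: int = 1) -> dict:
    """Return a machine-readable representation for the requested version."""
    if version == 1:
        # Original structure: existing consumers rely on these exact keys.
        return {"id": post["id"], "text": post["body"]}
    if version == 2:
        # Later structure: "text" renamed to "body", "author" added.
        return {"id": post["id"], "body": post["body"], "author": post["author"]}
    raise ValueError(f"unsupported API version: {version}")


post = {"id": 1, "body": "Hello", "author": "jonathan"}
v1 = serialize_post(post, version=1)
v2 = serialize_post(post, version=2)
```

A consumer written against version 1 keeps reading `text` and never notices the rename, whereas an unversioned representation would have broken it the moment the field changed.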

For representations that are only intended to be used internally, a changing, non-versioned JSON one may suffice; if one has control of the entire stack, one doesn’t need to maintain backwards compatibility to the same degree. But those sorts of APIs wouldn’t be subject to OAuth restrictions anyway.

I agree that an OAuth token shouldn’t be necessary to get a JSON representation of one’s Twitter stream, when an HTML representation is freely available – but I do think that the nature of intended-for-machines representations is so substantially different from that of intended-for-humans representations that some sort of agreement (and documentation) is required. If you can find an existing format that fits (Accept: application/atom+xml for a Twitter stream), by all means use it, but that’s also locking yourself into a model that may not fit your data exactly as you’d like it to – and unlike HTML, you have no way of telling the consumer what to do with your seemingly arbitrarily structured data, the way you do with CSS.

Insofar as it’s possible, you should make representations of your data that fit the user agent’s Accept header; but if you don’t commit to the structure you choose, it will be unreliable and essentially unusable for machines parsing it.

This is Simply Jonathan, a blog written by Jonathan Holst. It's mostly about technical topics (and mainly the Web at that), but an occasional post on clothing, sports, and general personal life topics can be found.

Jonathan Holst is a programmer, language enthusiast, sports fan, and appreciator of good design, living in Copenhagen, Denmark, Europe. He is also someone pretentious enough to call himself the 'author' of a blog. And talk about himself in the third person.