Put your API on a JSON diet

30.03.2015
Last week I discussed design considerations for APIs, given that APIs aren't applications and shouldn't be treated as such. At small scales, APIs that come along for the ride with bulky Web frameworks might be fine, but beyond that you're asking for trouble. If you're building an API that will serve a large number of clients, your API code should be thin and tight, and it should make liberal use of caching. Otherwise, the future headaches will be crippling.

But this doesn't pertain to the foundation of your API only; it's relevant within the API itself. Here, putting your JSON on a diet can be crucial.

Older APIs might still use XML, but given the overwhelming trend toward JSON, they should either have or be developing a JSON format as well. Within that JSON, you might find certain design decisions that at scale can make a big difference.

Let's say for our API a client needs to issue a POST request containing JSON data listing a series of coordinates and the length of time spent at those coordinates. In our API, we call these stops, and we don't necessarily know the number of stops that will be sent in any given report. This could be dealt with thus:

{
    "stops" : [{
        "stop_latitude" : "nn.nnnnn",
        "stop_longitude" : "-nn.nnnn",
        "stop_duration" : 500
    }, {
        "stop_latitude" : "nn.nnnnn",
        "stop_longitude" : "-nn.nnnn",
        "stop_duration" : 500
    }, {
        "stop_latitude" : "nn.nnnnn",
        "stop_longitude" : "-nn.nnnn",
        "stop_duration" : 500
    }]
}

This JSON snippet would accurately deliver the data we need, showing us the coordinates at each stop and how long each stop at the given coordinates lasted. But the snippet is redundant in several ways. Instead, the same data could be delivered this way:

{
    "stops" : [{
        "coords" : { "lat": "nn.nnnnn", "lng": "-nn.nnnn" },
        "duration" : 500
    }, {
        "coords" : { "lat": "nn.nnnnn", "lng": "-nn.nnnn" },
        "duration" : 500
    }, {
        "coords" : { "lat": "nn.nnnnn", "lng": "-nn.nnnn" },
        "duration" : 500
    }]
}

This snippet delivers the same data, parsable in essentially the same way, but with a nested object for related items (lat/lng in this case) instead of individually prefixed keys. Also, we eliminated the superfluous stop_duration in favor of duration because we already know it is a stop given the key of the parent array. In this tiny example, that change means effectively nothing on the API development side, but saves 57 bytes of data. Sure, that's a pittance, but at scale, it can be huge -- and that's from one little snippet. Across a significant JSON data set, the savings will be many times larger.
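For anyone who wants to check the savings on their own payloads, here is a minimal Python sketch that serializes both layouts and compares their sizes. The helper name wire_size is made up for illustration, and the coordinates are the same "nn.nnnnn" placeholders used in the snippets. The byte count depends entirely on how much whitespace the serializer emits, so the compact output measured here will not necessarily match the 57-byte figure quoted above.

```python
import json

# Placeholder values, copied from the article's snippets (not real coordinates).
LAT, LNG, DURATION = "nn.nnnnn", "-nn.nnnn", 500

# First layout: every key carries the redundant "stop_" prefix.
verbose = {
    "stops": [
        {"stop_latitude": LAT, "stop_longitude": LNG, "stop_duration": DURATION}
        for _ in range(3)
    ]
}

# Second layout: related values grouped under "coords", prefix dropped.
slim = {
    "stops": [
        {"coords": {"lat": LAT, "lng": LNG}, "duration": DURATION}
        for _ in range(3)
    ]
}


def wire_size(payload: dict) -> int:
    """Return the UTF-8 byte length of the payload serialized without optional whitespace."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


verbose_bytes = wire_size(verbose)
slim_bytes = wire_size(slim)

print(f"verbose: {verbose_bytes} bytes")
print(f"slim:    {slim_bytes} bytes")
print(f"saved:   {verbose_bytes - slim_bytes} bytes")
```

Running the same comparison against a real report with many stops gives a better sense of the savings than any three-stop example.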

Here we're talking only about transmission considerations. Regardless of whether the data is stored at the API side, the bandwidth used to deliver that data matters -- on both the client and server side. In an age where mobile users and even some unfortunate wired broadband users have data caps, every little bit of savings can go a long way.
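To put "at scale" into rough numbers, the back-of-the-envelope calculation below multiplies the 57 bytes saved per snippet by a traffic level. The figure of 10 million reports per day is purely hypothetical and chosen only to make the arithmetic easy; substitute your own volume.

```python
# Savings per three-stop report, as quoted in the article.
BYTES_SAVED_PER_REPORT = 57

# Hypothetical traffic level, assumed here only for illustration.
REPORTS_PER_DAY = 10_000_000


def daily_savings_bytes(saved_per_report: int, reports_per_day: int) -> int:
    """Bytes no longer sent per day once every report uses the slimmer layout."""
    return saved_per_report * reports_per_day


saved_per_day = daily_savings_bytes(BYTES_SAVED_PER_REPORT, REPORTS_PER_DAY)

# Decimal units: 1 MB = 1,000,000 bytes and 1 GB = 1,000,000,000 bytes.
print(f"{saved_per_day / 1_000_000:.0f} MB per day")             # 570 MB per day
print(f"{saved_per_day * 30 / 1_000_000_000:.1f} GB per 30 days")  # 17.1 GB per 30 days
```

That is the saving from renaming three keys in one small request body, and the same traffic is saved again on the client side of every connection.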

Of course, it might be tempting to get carried away and render your JSON as functional nonsense, abbreviating like so:

{
    "st" : [{
        "cds" : { "la": "nn.nnnnn", "ln": "-nn.nnnn" },
        "dur" : 500
    }, {
        "cds" : { "la": "nn.nnnnn", "ln": "-nn.nnnn" },
        "dur" : 500
    }, {
        "cds" : { "la": "nn.nnnnn", "ln": "-nn.nnnn" },
        "dur" : 500
    }]
}

This saves a further 33 bytes per snippet, but at the significant expense of readability, especially if you're mapping database column names to their JSON equivalents. It pays to be judicious about where you apply a slimming strategy because the second example above is as readable as the first, but still saves on bandwidth. The third is mostly unintelligible unless you know exactly what it represents. That knowledge comes at another price.
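The mapping cost is easier to judge with a concrete translation layer in front of you. The sketch below assumes, purely for illustration, that the database columns carry the verbose names from the first snippet (stop_latitude, stop_longitude, stop_duration) and that the API speaks the second, slimmer layout. The function names row_to_stop and stop_to_row are hypothetical.

```python
from typing import Any, Dict


def row_to_stop(row: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a database row (assumed column names) into the slimmed JSON stop."""
    return {
        "coords": {"lat": row["stop_latitude"], "lng": row["stop_longitude"]},
        "duration": row["stop_duration"],
    }


def stop_to_row(stop: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a slimmed JSON stop back into the assumed database column names."""
    return {
        "stop_latitude": stop["coords"]["lat"],
        "stop_longitude": stop["coords"]["lng"],
        "stop_duration": stop["duration"],
    }


# Placeholder values, as in the article's snippets.
row = {"stop_latitude": "nn.nnnnn", "stop_longitude": "-nn.nnnn", "stop_duration": 500}
stop = row_to_stop(row)
print(stop)
print(stop_to_row(stop) == row)
```

With keys like lat and duration, a reader can follow this mapping at a glance. Swap in la, ln, and dur and the same code needs a lookup table or a comment on every line to stay maintainable.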

A decent comparison could be made to Unix in this regard. One can only imagine the number of keystrokes and the amount of bandwidth saved over the decades because the copy command is cp, not copy, and move is mv, not move. Being concise is not bad, but it can go too far.

There's a fine line between slimming down your JSON and turning your API into alphabet soup, but if you are expecting to service anything at a significant scale, definitely look at where you can put your API on a diet. As with so very many facets of IT and life, moderation is key.

(www.infoworld.com)

Paul Venezia
