I’ve spent the last several months working with an API from a major fitness brand, and have found myself entrenched in a kind of Kafkaesque nightmare. Programmers love challenges, and good programmers refuse to be defeated by them. I like to think that I’m a pretty good programmer. So although each night I go to sleep defeated, something in me won’t let me give up, and I wake up with new ideas and new drive. I’ve continued like this for months now. It’s like a dream, day in and day out. The task is simple. Get the data. Save the data. But here I am surrounded by brittle code, chasing bugs I will never catch.
So, how do you build an API that has technical hurdles so high that you can defeat a 20-year veteran who, up until this moment, has always found a way? How do you drive him slowly insane and drive all joy from a vocation he loves? Well, I’ll tell you.
Let’s set the playing field.
For authorization we’ll use OAuth 1.0a. Sure, it’s a very fine standard, but signing requests will add a nice little bit of complexity, so you’re never quite sure if it’s the signing that’s causing the requests to fail.
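To make the signing step concrete, here is a minimal sketch of an OAuth 1.0a HMAC-SHA1 signature using only the Python standard library. It is an illustration under assumptions, not this vendor’s client: the function names, the example URL and the parameter values are all hypothetical. Keeping the signature base string as its own function means it can be logged when a request fails, which is one way to rule the signing in or out.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # OAuth 1.0a encodes everything except the RFC 3986 unreserved characters.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # params must hold every oauth_* parameter plus any query/body parameters;
    # url is the request URL without its query string.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(method: str, url: str, params: dict,
         consumer_secret: str, token_secret: str = "") -> str:
    # The signing key is the two encoded secrets joined by "&".
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

The value returned by `sign` would go into the `oauth_signature` parameter. If the server rejects a request, comparing the logged base string against what the provider expects is usually the quickest check.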
Server-to-server communication. Instead of responding to your API requests, we’ll call you back. You ask for some data, and then at some point between right now and infinity (or never), we’ll respond. Usually the response will take milliseconds, but sometimes it will take minutes, and on rare occasions hours. But most importantly, if we don’t have any data, we won’t respond at all. Inconsistency is a programmer’s kryptonite.
What does this setup mean in practice? It means you’ll need to set up an endpoint behind a server or a proxy. It also means that in your unit testing, if there’s no response, you’re going to have to put in some work. Write some code, test it, and if it fails, check your webserver logs. But maybe while you’re looking through those logs the request comes through. It worked after all, it just took a while. But of course the JSON wasn’t formatted the way you expected, so do it all over again.
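One way to make a test wait on a callback that may never arrive is to hand the webhook body to a thread-safe inbox and block on it with a timeout. The sketch below is a hypothetical helper, not anything the API provides: `CallbackInbox`, `deliver` and `wait_for` are invented names, and the assumption is that whatever HTTP handler receives the callback passes the raw request body to `deliver`.

```python
import json
import queue
from typing import Optional


class CallbackInbox:
    """Holds callback payloads until a test (or worker) asks for them."""

    def __init__(self) -> None:
        self._payloads: queue.Queue = queue.Queue()

    def deliver(self, raw_body: bytes) -> None:
        # Called from the webhook handler; raises if the body is not valid JSON.
        self._payloads.put(json.loads(raw_body))

    def wait_for(self, timeout_seconds: float) -> Optional[dict]:
        # Blocks until a payload arrives or the timeout passes.
        try:
            return self._payloads.get(timeout=timeout_seconds)
        except queue.Empty:
            return None
```

A `None` here is exactly as ambiguous as the API itself: it can mean “no data”, “not yet”, or “the request failed”, so any timeout a test picks is a guess.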
Duplicate data prohibition. Here’s a doozy. You can only ask for any piece of data exactly one time. Not one time a minute, or an hour or a day – one time forever. This effectively means no unit test is repeatable without registering a new account by hand and populating it with data. And let’s say the code’s live and a user asks for data for a certain time period. That data may or may not exist, and you may or may not receive it, but you still need to track the request, because if a future request overlaps with it by just one day you’ll receive an error. And of course, what happens when a request in production fails? There’s no mitigation, nothing to be done. That data is irretrievable – forever.
I could go on about the implications of this one all day. But just picture writing your first line of code. You successfully execute it. And then when you run it again it doesn’t work. Then imagine that happening for the rest of the project.
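The bookkeeping this forces on the client can be sketched as a ledger of date ranges already requested per user, consulted before every call. This is a rough illustration only: `RequestLedger` is a hypothetical name, the ranges are assumed to be inclusive on both ends, and a real version would have to persist to a database, since losing the ledger means losing track of what can never be asked for again.

```python
from collections import defaultdict
from datetime import date


class RequestLedger:
    """Remembers which inclusive date ranges were already requested per user."""

    def __init__(self) -> None:
        self._claimed = defaultdict(list)  # user_id -> [(start, end), ...]

    def overlaps(self, user_id: str, start: date, end: date) -> bool:
        # Two inclusive ranges overlap when each starts on or before the other's end.
        return any(start <= claimed_end and claimed_start <= end
                   for claimed_start, claimed_end in self._claimed[user_id])

    def claim(self, user_id: str, start: date, end: date) -> bool:
        # Returns True and records the range only if it is safe to request.
        if start > end:
            raise ValueError("start must not be after end")
        if self.overlaps(user_id, start, end):
            return False
        self._claimed[user_id].append((start, end))
        return True
```

The range has to be claimed before the request goes out, whether or not a response ever comes back, because the server counts the ask and not the answer.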
Dev and QA rate limits. I guess you can argue that there might be some reason to set the QA rate limit far below production. But how about setting the limit at 100 days of data (not requests) per minute? This one is particularly devious. First, it sets a max limit for your development test cycles. Write some code, run a test. Wait. But what’s really effective is that it adds yet another reason that the code might fail inexplicably.
But the best part is that you’re trying to develop a highly scalable application, how do you even begin to test with any kind of load? Well, the simple answer is you can test in production or not at all.
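A limit counted in days of data rather than requests needs its own kind of throttle. Below is a minimal sliding-window sketch, assuming the 100-days-per-minute figure above and assuming the window slides (the real enforcement may well use fixed windows or something else entirely). `DayBudget` and its methods are hypothetical names, and the clock is injectable so the logic can be tested without waiting.

```python
import time
from collections import deque


class DayBudget:
    """Tracks days of data requested inside a sliding time window."""

    def __init__(self, max_days: int = 100, window_seconds: float = 60.0,
                 clock=time.monotonic) -> None:
        self.max_days = max_days
        self.window_seconds = window_seconds
        self.clock = clock
        self._spent = deque()  # (timestamp, days) in arrival order

    def seconds_until_allowed(self, days: int) -> float:
        if days > self.max_days:
            raise ValueError("request is larger than the whole budget")
        now = self.clock()
        # Forget spending that has aged out of the window.
        while self._spent and now - self._spent[0][0] >= self.window_seconds:
            self._spent.popleft()
        used = sum(d for _, d in self._spent)
        wait = 0.0
        # Walk from the oldest entry until enough budget would be freed.
        for timestamp, spent_days in self._spent:
            if used + days <= self.max_days:
                break
            used -= spent_days
            wait = timestamp + self.window_seconds - now
        return max(wait, 0.0)

    def record(self, days: int) -> None:
        # Call this when the request is actually sent.
        self._spent.append((self.clock(), days))
```

A caller would sleep for `seconds_until_allowed(n)`, send the request, then call `record(n)`. It removes one source of inexplicable failures in QA, at the cost of even slower test cycles.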
Application-level rate limits in PROD. The beauty of an application-level rate limit is that you’re going to need a message queue. Sure, there might be other techniques, but if it’s a web app and you don’t want to have a bunch of long-running threads kicking around, a queue is pretty much your only bet.
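The shape of the queue approach can be sketched as follows: web requests only enqueue jobs, and a single worker drains them no faster than the limit allows. This is a toy under stated assumptions: the in-process `queue.Queue` stands in for a durable message broker, the fixed spacing between sends is an invented stand-in for whatever the real limit is, and `drain` and `send` are hypothetical names.

```python
import queue
import time


def drain(jobs: queue.Queue, send, min_interval_seconds: float,
          sleep=time.sleep, clock=time.monotonic) -> int:
    """Send every queued job, spacing sends at least min_interval_seconds apart."""
    sent = 0
    next_allowed = clock()
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return sent  # queue is empty; a real worker would block and wait here
        delay = next_allowed - clock()
        if delay > 0:
            sleep(delay)
        send(job)
        sent += 1
        next_allowed = clock() + min_interval_seconds
```

Because every outgoing call passes through one worker, the application as a whole stays under the limit no matter how many users click at once. The price is that nothing a user asks for happens while they wait.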
They’ve thrown in other things of course, but those are hardly worth mentioning because they’re just the typical data structure challenges or poor parameterization options. Those are your everyday challenges, not the ones that will bring even the most seasoned developer to their knees.
I’ve spent a fair bit of time over the last few months trying to imagine the team behind this and their motivations. No matter how you look at it, the motivation must be to prevent usage of the API. Perhaps they’re interfacing with a system with decades-old infrastructure that falls over with the slightest disturbance. Maybe they have a lot of new developers who have written some very inefficient code, and they view the solution to that as throwing up as many roadblocks as possible? Or maybe they’re just inconsiderate. Perhaps when they weigh design decisions usability is rated at 0, and so all other concerns are always preeminent.
But I think there’s another possibility. Corporations like to wall off their gardens. In the fitness space, however, data portability is a selling point. So, how do you achieve both? How do you wall off your garden, but point everyone to the door? In that case this development nightmare is the whole point. And in that case, the only fool is the one earnestly and diligently trying to write code to work with it.