I am one of those people who, of late, are not able to play Pokémon Go often because, as it turns out, I’m using a Pokémon Trainer Club (PTC) account to play, not a Google account, and I can’t get in. I’m frustrated by my inability to “authenticate” and log in to play, while my husband, who chose to use his Google account, is not having the same issues. And that’s significant.
This got me digging into some technical concerns that should really resonate with every company, regardless of the app they’re launching. Those concerns revolve around the Application Programming Interface (API), authentication, and availability.
I was reading an article in Forbes about tracking Pokémon in Pokémon Go, and that led to another article and another, with one speculating that the reason tracking was broken is due to a game update in which an API key was inadvertently left out of the tracking calls back to Niantic’s servers.
Whether this is the case or not, such a faux pas would, indeed, break APIs. But the thing I kept coming back to was this: if I couldn’t log in to my PTC account and play, why could I switch to my Google account and get in easily?
The API-Authentication Connection
Digging around GitHub and pawing through Pokémon Go APIs finally made the ‘aha’ light go on, since just about every API call in those repositories handles the same exception: LoginFailedException.
In other words, even a simple call to find nearby Pokémon may result in a LoginFailedException, which is not really surprising.
Monolithic web applications often track authenticated users via sessions, which often means a cookie that contains a session ID or some other token that the application checks before actually doing anything else. APIs aren’t that much different, in that each API call has to have a way to ensure the calling application (the user) is actually authorized to make the call in the first place. They have to be “logged in”.
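To make the per-call check concrete, here is a minimal sketch in Python. It is not taken from Niantic’s code or from the GitHub repositories mentioned above; `LoginFailedError`, `VALID_KEYS`, `requires_auth`, and `nearby_pokemon` are all hypothetical names used only for illustration. It shows how a check wrapped around every handler means that even a harmless read can fail at the authentication step.

```python
import functools


class LoginFailedError(Exception):
    """Hypothetical stand-in for the LoginFailedException seen in client libraries."""


# Hypothetical key store. A real service would ask its identity backend
# instead, which is exactly the extra load discussed in this article.
VALID_KEYS = {"key-123": "trainer_ash"}


def requires_auth(handler):
    """Wrap an API handler so the caller's key is verified on every call."""
    @functools.wraps(handler)
    def wrapper(api_key, *args, **kwargs):
        user = VALID_KEYS.get(api_key)
        if user is None:
            # The request never reaches the handler if the check fails.
            raise LoginFailedError("unknown or expired API key")
        return handler(user, *args, **kwargs)
    return wrapper


@requires_auth
def nearby_pokemon(user, lat, lng):
    # Placeholder response; the point is the check that runs before this line.
    return {"user": user, "lat": lat, "lng": lng, "nearby": []}
```

Calling `nearby_pokemon("key-123", 37.77, -122.42)` succeeds, while the same call with an unknown key raises `LoginFailedError` before any game logic runs.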
APIs often use API keys to achieve this. The key is generally checked against a user profile to ensure the call (every call) is authorized. There are various reasons for such a decision, including the ability to rate-limit calls, which is a big deal. Apigee’s State of APIs 2016 report noted that 68% of APIs were taking advantage of quota management, also known as rate limiting, metering, etc. In order to do that technical trick, one has to first know how many calls have been made in the past minute, hour or day, and thus that count must be tracked somewhere safe so users can’t manipulate it and trick the application into allowing more calls per time period.
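As a rough illustration of that bookkeeping, the following Python sketch implements a simple fixed-window counter keyed by API key. It is an assumption about how such a quota might be kept, not a description of any particular product; `FixedWindowLimiter` and its parameters are hypothetical. The counts live on the server, so the client cannot rewrite them.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Counts calls per API key in fixed time windows, held server-side."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit              # max calls allowed per key per window
        self.window = window_seconds    # window length in seconds
        self.clock = clock              # injectable so the logic can be tested
        # (api_key, window_index) -> calls seen; old windows are never purged
        # in this sketch, which a real service would need to handle.
        self._counts = defaultdict(int)

    def allow(self, api_key):
        """Return True and record the call if the key is under its quota."""
        window_index = int(self.clock() // self.window)
        bucket = (api_key, window_index)
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True
```

With `FixedWindowLimiter(limit=2, window_seconds=60)`, the third call from the same key inside one minute is refused, while other keys are unaffected. Note that every `allow` call is one more lookup and write against shared state, on top of the authorization check itself.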
In other words, APIs can be very taxing on authentication infrastructure because they have to verify status, authorization, and potentially apply rate limiting. That’s a lot of work.
Yet we often don’t consider the impact of those extra calls on capacity. Those extra calls to verify and authorize, even if made on a periodic basis to “refresh” a session, are going to put considerable stress on the authentication infrastructure, which is the same infrastructure that is supporting login. It’s the same kind of stress that was seen when the browser limitations on connections per user were increased from two to eight. A single user can now consume up to four times the connection resources to access an application.
Thus, when considering the capacity needs for an app that relies on authorization on a per-API call basis, one has to do some math and figure out just how many more resources an individual user is going to consume. Failure to do so leads to angry gamers when overwhelmed login services stand between them and the Pikachu they desperately want to catch.
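That math can be sketched in a few lines of Python. Every number below is an invented assumption for illustration, not a measurement of Pokémon Go or any real service, and both function names are hypothetical. The sketch compares the authentication load when users only authenticate at login with the load when every API call is checked.

```python
def login_only_checks_per_second(concurrent_users, session_length_seconds):
    """Auth load if each user authenticates once per session."""
    return concurrent_users / session_length_seconds


def per_call_checks_per_second(concurrent_users, calls_per_user_per_second,
                               checks_per_call=1.0):
    """Auth load if every API call triggers one or more verification checks."""
    return concurrent_users * calls_per_user_per_second * checks_per_call


# Assumed figures: 100,000 concurrent players, 30-minute sessions,
# and one API call every two seconds per player.
users = 100_000
login_only = login_only_checks_per_second(users, session_length_seconds=1800)
per_call = per_call_checks_per_second(users, calls_per_user_per_second=0.5)

print(f"login-only: {login_only:.1f} checks/sec")
print(f"per-call:   {per_call:.1f} checks/sec")
print(f"multiplier: {per_call / login_only:.0f}x")
```

Under these assumptions the identity tier goes from roughly 56 checks per second to 50,000, a 900-fold increase, without a single additional user signing up.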
Scaling Identity and Access Is Critical for Availability
Identity and access are critical app services. We’ve seen their importance rising in our State of Application Delivery surveys for the past two years. And it’s not just because of apps; it’s because of the Internet of Things, too, and the growing need to scale out the entire breadth of identity services infrastructure to support more things, more users, and more apps using APIs to interact with back-end applications.
Availability is often based solely on a measure of downtime. If the servers are up and working correctly, they’re available. It’s an inside-out perspective. But like security, we need to turn that measurement around and view it from the outside-in. Capacity counts, and merely being “up” and “available” isn’t enough. Services need to be “up” and “available” to everyone who wants to consume them. That means scaling as fast as your Python scripts can execute.
It also means understanding the relationship between the various back-end services that actually implement the functionality presented by your APIs. Identity and access services are as critical to availability as the actual application itself. Availability, like security, is only as good as its weakest link. And if your identity services aren’t as scalable (or more scalable if your model is per-API call authentication) as the rest of your application, you’re going to find that availability is a significant problem, even if all your dashboards read “green” inside.
This is because from the outside, we’re seeing red, literally and figuratively.