Dustin LeBlanc


Defensive development

Keep the Ship Running

Modern web apps are a marvelous spider web of powerful features wired to cutting edge services maintained by our collaborators, vendors, and some folks we've never met but who put up a cool thing we wanted to use.


The raw power of what we can do with a few hours of coding and access to HTTP is somewhat mind-boggling. We can aggregate and process data that would have taken a team of analysts working for the war department of a large country during World War II months to handle in mere milliseconds, on-demand, thousands of times a minute, and then beam the results to just about anywhere on the globe, where the visitor's machine will run more software than the entire Apollo moon lander to ingest, parse, and visualize that data.

As I am sure your uncle Ben has already told you:

With great power, comes great responsibility.

The thing about fancy web services is that they can be as reliable as your college roommate when it was time to split the pizza check. They make all kinds of promises, but sometimes they just don't follow through.

When they work, the power they provide is unreal, but when they fail, if you haven't taken the necessary precautions to develop defensively, your entire project can be left stranded when they don't show up.

A really simple call to an API in a PHP application may look something like this:

<?php

use GuzzleHttp\Client;

class EventGetterThing
{
    public function getAllEvents()
    {
        $client = new Client(['base_uri' => 'https://foo.com/api/']);
        // Note: a leading slash here would discard the /api/ path from
        // base_uri, so we request a relative path instead.
        $response = $client->get('events');
        $payload = json_decode($response->getBody()->getContents());
        return array_map(function ($event) {
            return [
              'date' => $event->date,
              'location' => $event->location,
            ];
        }, $payload->data);
    }
}

We're just grabbing the data from the API using Guzzle (a popular HTTP client library for PHP), decoding the JSON body, and plucking out the parts of the response that we want. The foo.com API is made up, so don't expect to copy this line for line and have it work; it's just an idea to illustrate something fairly typical.

There are a lot of things in this quick and dirty example that we'd want to improve, but I'd like to focus on a couple of things that aren't related to code structure. Instead, let's home in on some problems that can arise if the foo.com API has something go wrong.

What if the API Data is a no-show?

APIs are made by imperfect humans, who are operating in an imperfect world. Even the best companies sometimes have massive performance degradation or even complete outages from time to time. Sometimes you'll make a request to an API and it will either never respond, or respond so slowly that visitors will have long since given up and moved on to something else.

When your code is being called during a server request, this means it can't complete the request and respond to your user until this API call has resolved. The application process is tied up waiting for the API to respond, leaving the user waiting. Add a few of these requests together, and suddenly all of your processes, threads, workers, containers, and servers will be consumed, waiting indefinitely for responses that never arrive, or until the server configuration itself kills the process.

This can cause the entire application to go down when enough users attempt to request a route blocked by the failing API request.

Defensive measure #1: Timeouts

The first and simplest countermeasure to put in place here is a timeout on the API request:

<?php

use GuzzleHttp\Client;

class EventGetterThing
{
    public function getAllEvents()
    {
        $client = new Client([
          'base_uri' => 'https://foo.com/api/',
          // Add a two second timeout (pretty generous)...
          'timeout' => 2.0,
        ]);
        $response = $client->get('events');
        $payload = json_decode($response->getBody()->getContents());
        return array_map(function ($event) {
            return [
              'date' => $event->date,
              'location' => $event->location,
            ];
        }, $payload->data);
    }
}

This tells Guzzle to wait no more than two seconds for the request to complete. When the two seconds are up, Guzzle will throw an exception if it hasn't heard back from the API service. You can just stop here, and your app will still crash when the API is down, but at least it will happen in a reasonable timeframe.
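Guzzle's `timeout` option caps the whole request, but it also supports a separate `connect_timeout` option if you want to fail even faster when the remote host can't be reached at all. A sketch of combining the two (same made-up foo.com API as above):

```php
$client = new GuzzleHttp\Client([
    'base_uri' => 'https://foo.com/api/',
    // Fail fast if we can't even open a connection...
    'connect_timeout' => 1.0,
    // ...and cap the total time for the whole request.
    'timeout' => 2.0,
]);
```

The connection timeout usually trips when DNS fails or the host is down, while the total timeout catches a server that accepts the connection but drags its feet on the response.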

It sure would be nice if the app just didn't crash though right?

Defensive measure #2: Catching the exception

When the APIs we depend on are down, it'd be great to be able to continue serving our application, and simply let users know that there is a problem with one of the services we use:

<?php

use GuzzleHttp\Client;
use GuzzleHttp\Exception\GuzzleException;
use Illuminate\Support\Facades\Log;

class EventGetterThing
{
    public function getAllEvents()
    {
        try {
            $client = new Client([
              'base_uri' => 'https://foo.com/api/',
              // Add a two second timeout (pretty generous)...
              'timeout' => 2.0,
            ]);
            $response = $client->get('events');
            $payload = json_decode($response->getBody()->getContents());
            return array_map(function ($event) {
                return [
                  'date' => $event->date,
                  'location' => $event->location,
                ];
            }, $payload->data);
        } catch (GuzzleException $exception) {
            // Laravel-style log statement
            Log::error($exception->getMessage());
            // You'd probably want a more robust return here that matches
            // the type returned during a good request.
            return "We're sorry, we're having trouble reaching the foo.com API";
        }
    }
}

Now we're still serving the user a response, and we're explicitly letting them know what's wrong. Trust in our software is maintained, even though they may not be able to use the feature they wanted right now.

With just these two simple changes, we've completely removed the danger that an external service can take our application down. I've worked on a lot of applications in my career that didn't implement this level of safety.

Other techniques

There are some other things you can do here as well to guard yourself against an external API ruining your day:

  • Cache the results of the call and if a subsequent attempt to fetch it doesn't work, serve a stale cache.
  • Make API calls in a background job (very easy to do in Laravel) so that they don't block user requests, even if something goes wrong.
  • Move some API calls to the client if you can. This completely takes the load off your server and instead allows the individual user's machine to asynchronously attempt to fetch the data. You should still use timeouts and try/catch here; after all, we don't want to tie up a remote machine's resources or crash our JavaScript!
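The first bullet, serving a stale cache, can be sketched like this. `CachedEventGetter` and its in-memory cache array are made up for illustration (a real app would use something persistent like Laravel's Cache facade or Redis), and the fetcher callable stands in for the Guzzle call from earlier:

```php
<?php

// Sketch of a stale-cache fallback. The fetcher callable stands in for
// the real API call; the array is a stand-in for a real cache backend.
class CachedEventGetter
{
    /** @var array<string, mixed> naive in-memory cache for illustration */
    private array $cache = [];

    public function __construct(private $fetchEvents)
    {
    }

    public function getAllEvents(): array
    {
        try {
            // e.g. the Guzzle call from the earlier examples
            $events = ($this->fetchEvents)();
            // Refresh the cache on every successful fetch...
            $this->cache['events'] = $events;
            return $events;
        } catch (\Exception $exception) {
            // ...and serve the last good result if the fetch fails.
            if (isset($this->cache['events'])) {
                return $this->cache['events'];
            }
            // No stale copy to fall back on; let the caller handle it.
            throw $exception;
        }
    }
}
```

The first successful response gets remembered, so a later outage degrades to slightly old data instead of an error page.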

Why is this important?

When we're running an application on the web, we've made a sort of contract with our users that if we can help it, this service is going to remain available for use, day and night, whenever they need it. What matters here is our user's trust in us, not their trust in the API of our vendors or collaborators.

When you remove the possibility of an external service taking your own application down, you build stronger trust with users, which helps reduce churn and increase customer lifetime value. For non-profits, it means you get to help more people by ensuring they have access to the information and services you provide.

Show up for your users and they'll show up for you.


© 2022 Unrealist Technologies, LLC. All rights reserved.