4 Anatomy of an API

This chapter is going to be a huge infodump. Sorry, everyone. I want to make sure we’ve got all our basics covered before we get to the table-flipping portion, so this is essentially a primer. If you’re super 100% extra positive that you know what you’re doing and are familiar with REST and HTTP, you have my permission to skip this chapter and the next one, and get right to the good stuff: crimes against APIs.

Still with me? Sweet. Away we go.

Limiting Our Scope

An API is a really hard thing to define. Technically, an API (which, by the way, stands for “Application Programming Interface”) just describes how software components should interact with each other. That means that client libraries and even programming languages have APIs. But, for the sake of my sanity, we’re going to limit our discussion to APIs that interact with services, specifically services over HTTP.

There are lots of ways to build APIs on top of HTTP. There are RPC (“remote procedure call”) APIs, REST (“representational state transfer”) APIs, SOAP (“simple object access protocol”) APIs, and IMTUAIG (“I’m making things up as I go”) APIs. I can’t talk about all of them in this book, because your attention span isn’t that long and I don’t have years to devote to researching and writing a comprehensive compendium of every variation on HTTP APIs that anybody has ever concocted.

Instead, we’re going to focus on the most popular: REST. Because if you’re not using REST, you should have good, strong reasons for not doing so. And if that’s the case, you’re probably already thinking a lot about your API and its design, and therefore won’t get much use out of this book. Your API is not bad, you do not need to feel bad. Carry on being awesome.

For the rest of us mere mortals, we have one final scope-limiter to add: REST is a controversial term. Technically, for an API to be RESTful it needs to support HATEOAS (“Hypermedia As The Engine Of Application State”). This is a really fancy way of saying that your API should be self-describing; given a single endpoint and knowing nothing else about your API, client libraries should be able to access all of its functionality. This is Really Hard™, and most people don’t bother. It’s also really, really good API practice, so if you have the time and interest to pursue it, definitely go for it. But in the interest of being inclusive, we’re going to be talking about the most common type of RESTful API; that is, an API that applies 99% of the properties of RESTful APIs, but totally ignores HATEOAS. It’s the dirty little secret of APIs today: most implement the resource part of REST, apply most of the principles, and say “good enough”, ignoring HATEOAS. Which is sad, but there you have it.

We’re going to be talking about HATEOAS in a later chapter, so don’t be sad if you were all excited to hear about it. Patience.

To be entirely fair, this really isn’t REST anymore. This is just using the concept of resources, and manipulating them using the semantics of HTTP. But we don’t have a catchy, shorthand name for that yet, so we’re going to stick with REST.

Now that we’ve properly narrowed our scope to something manageable, let’s get to the good stuff.

Resources, Operations, and Representations

REST deals with three core concepts that drive everything about the API:

Resources are the conceptual model of the data you want to expose with the API. Think of them as the “classes” or “types” of the API.

Operations are the things you do to the resources. Think of them as the “methods” or “functions” of the API.

Representations are expressions or instances of a specific resource. This is a bit abstract, but think of it this way: APIs can return responses in different formats, right? You can respond with JSON, or XML, or a Protocol Buffer, or with a URL-encoded string. Each format expresses the same data, so each is its own representation. To make this even more confusing, suppose you have a Book resource. The Book has a Title property and an Author property. When you retrieve a list of Books, the response contains the Title and the Author. When you retrieve a list of Books filtered by a specific Author, the response contains just the Title (you already know the Author). These are different presentations of the same conceptual objects, even though one presents more data about the object, so they’re both representations of a resource, instead of being independent resources.
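If that's still fuzzy, here's a rough sketch in Python. The Book data and the function names (as_json, as_form, title_only) are made up for illustration; the only point is that one resource can be handed back in several shapes without becoming a different resource.

```python
import json
from urllib.parse import urlencode

# A hypothetical Book resource; the field values are placeholders.
book = {"title": "Example Title", "author": "Example Author"}


def as_json(resource):
    """Represent the resource as a JSON document."""
    return json.dumps(resource, sort_keys=True)


def as_form(resource):
    """Represent the same resource as a URL-encoded string."""
    return urlencode(sorted(resource.items()))


def title_only(resource):
    """A slimmer representation that leaves out the author."""
    return {"title": resource["title"]}


print(as_json(book))              # full Book, as JSON
print(as_form(book))              # full Book, URL-encoded
print(as_json(title_only(book)))  # same Book, fewer fields
```

All three outputs describe the same Book. None of them is a new resource.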

HTTP: A Woefully Inadequate Primer

Not to shave a yak, but we also need to talk really briefly about HTTP. I’m going to skim horribly, touching only briefly on a lot of Really Cool Things™ in this section, so please read up on HTTP. Wikipedia has some great information about it, and I bet a Google search turns up a lot of literature on it. It has been documented for over 20 years and it’s fundamental to the internet as we know it today, so lots of people have written and are writing about it.

In the interest of being awesome, let’s make this section illustrative.

    GET / HTTP/1.1
    User-Agent: curl/7.30.0
    Host: yourapiisbad.com
    Accept: */*

    HTTP/1.1 200 OK
    Content-Type: text/plain; charset=utf-8
    Content-Length: 12
    Date: Mon, 28 Oct 2013 21:44:37 GMT

    Hello, world

That looks really complicated, but it’s actually pretty simple: the first five lines (counting the blank one) are the request we sent to the server, and the other six are the response the server sent back.

Requests

You’ll notice that the first thing we sent was GET / HTTP/1.1, after we connected to yourapiisbad.com. By the way, yourapiisbad.com is what’s called a “domain”, but in this instance it’s also our “host”. The domain/host will be important later, when we discuss endpoints, but for now it’s enough to know that it means “where the request should be sent”.

GET is just an HTTP verb, which is used to describe the operation that should be performed against the resource. There are currently 9 HTTP verbs you might run into:

  • GET: request a representation of a resource or group of resources.
  • POST: request that the server create a new resource using the body of the request (… sort of. POST has some other uses, which we’ll talk about below, but creating resources is the main one).
  • PUT: request that the server set the specified resource to the body of the request.
  • DELETE: request that the server remove the specified resource.
  • HEAD: request a response identical to the one a GET would receive, but without any body; typically, this is used to get response headers (which we’ll discuss shortly).
  • TRACE: echo back the received request as a response.
  • OPTIONS: request a list of HTTP verbs that the server supports for a specific URL.
  • CONNECT: request that the server (usually a proxy) turn the connection into a TCP tunnel.
  • PATCH: request that the server update part of the specified resource using the body of the request.

Of these 9 HTTP verbs, you’ll mostly just use the first four: GET, POST, PUT, and DELETE.

After the verb, the path is specified (/, often called “the root”, in our example above). This is how a request identifies which resource it would like to operate on.

After the path, you’ll see the HTTP version (HTTP/1.1, in our example). This just tells the server which version of HTTP the request is in. You generally don’t need to concern yourself with this.

The next series of lines, each beginning with a word, followed by a colon, followed by some seemingly random stuff, are called “request headers”, and they’re useful for specifying metadata about the request itself. In our example, we have User-Agent, Host, and Accept. The User-Agent is just an identifier for the client that made the request, useful for tailoring responses to specific clients, debugging, and recording analytics. It should be noted that there is no guarantee the User-Agent has not been tampered with, so you should never trust it. Host just identifies the host (or “server” or “domain”) the request was sent to. Accept is where it gets interesting. Accept allows a request to specify the representation it will accept in a response. In our example, */* is specified, meaning any type of representation is acceptable. An Accept header of application/json would mean only JSON-encoded representations should be returned. This is the client’s way of saying “Here’s what I know how to handle.”

The next part would be our request body, the information we want to send with the request, but we didn’t send any information with the request, so there’s no body.

GET requests tend not to have a body, and generally should not have one. Things may not work as expected if you try to send a body with GET.

All said and done, pretty basic: verb (what you want to do), path (what you want to do it to), headers (information about your request), body (the data needed to do what you want). With these four components, we can make magic.
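To see how little there is to it, here's a small Python sketch that glues those four components into the text that goes over the wire. The build_request function is a hypothetical helper of mine, not part of any library, and it skips plenty that a real client handles (validation, encodings, and so on).

```python
def build_request(verb, path, host, headers=None, body=b""):
    """Assemble the bytes of an HTTP/1.1 request from its parts."""
    lines = [f"{verb} {path} HTTP/1.1", f"Host: {host}"]
    all_headers = dict(headers or {})
    if body:
        # A request with a body announces how many bytes to expect.
        all_headers["Content-Length"] = str(len(body))
    lines.extend(f"{name}: {value}" for name, value in all_headers.items())
    # A blank line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "GET", "/", "yourapiisbad.com",
    headers={"User-Agent": "curl/7.30.0", "Accept": "*/*"},
)
print(request.decode("ascii"))
```

The headers come out in a different order than in our curl example, which is fine: header order doesn't change the meaning of the request.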

Responses

I’ll reproduce our sample response again, in case you forgot what it looked like:

    HTTP/1.1 200 OK
    Content-Type: text/plain; charset=utf-8
    Content-Length: 12
    Date: Mon, 28 Oct 2013 21:44:37 GMT

    Hello, world

This looks familiar. We’re starting off with the HTTP version of the response. Again, you generally don’t need to concern yourself with this. After that, we see 200 OK. This is the status code and status message of the response. It’s a short generalisation of how the server is answering your request. A 200 response means “your request succeeded, everything’s fine, good job!”

There are 5 levels of HTTP response codes:

  • 1XX: informational; this should probably never be used in your APIs.
  • 2XX: success; your request succeeded, good job.
  • 3XX: redirection; the client needs to do something else before the request can be handled. Generally used to point the client at a different URL.
  • 4XX: client error; your request was bad and you should feel bad.
  • 5XX: server error; oops, there’s something preventing the server from serving your request. Try again later.

Again, that’s painting with a broad brush stroke, but it should be enough to give you a general idea.
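Since the level is just the first digit of the code, a client can sort responses into those five buckets with almost no work. Here's a minimal Python sketch; status_class and STATUS_CLASSES are names I made up, while HTTPStatus comes from Python's standard library.

```python
from http import HTTPStatus

# The five levels, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Return the broad level of a status code based on its first digit."""
    try:
        return STATUS_CLASSES[code // 100]
    except KeyError:
        raise ValueError(f"not a valid HTTP status code: {code}") from None


for code in (200, 301, 404, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```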

Next, we have a bunch of headers again. Request headers give metadata about the request; response headers give metadata about the response. In this case, we have Date, Content-Length, and Content-Type.

Date just tells the client when the response was sent from the server.

Content-Length tells the client how many bytes it should expect in the response body. If our request had had a body, it would have had a Content-Length header, too.

Content-Type is the twin to Accept. It describes the representation the server actually responded with, which should be a member of the list of representations passed by Accept in the request. This gives the client information about how to decode the response.

There are actually really cool and powerful rules for whether a content type matches an Accept header. Definitely read up on them.
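To give you a taste, here's a simplified Python sketch of that matching. It handles wildcards and the q ("quality") weights a client can attach to each entry, and it lets the most specific matching entry win. It is not the full algorithm from the HTTP specification: it ignores media type parameters and case-insensitivity, and every function name in it is my own invention.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, quality) pairs."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range, quality = pieces[0], 1.0
        if not media_range:
            continue
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                quality = float(value)
        ranges.append((media_range, quality))
    return ranges


def matches(media_range, content_type):
    """True if a media range such as text/* covers the content type."""
    range_type, _, range_sub = media_range.partition("/")
    ctype, _, csub = content_type.partition("/")
    return range_type in ("*", ctype) and range_sub in ("*", csub)


def quality_for(content_type, ranges):
    """Quality the client assigns to a content type (0 if unacceptable)."""
    best_specificity, best_quality = -1, 0.0
    for media_range, quality in ranges:
        if matches(media_range, content_type):
            # An exact type beats text/*, which beats */*.
            specificity = 2 - media_range.count("*")
            if specificity > best_specificity:
                best_specificity, best_quality = specificity, quality
    return best_quality


def choose(accept_header, available):
    """Pick the available content type the client prefers, or None."""
    ranges = parse_accept(accept_header)
    best = max(available, key=lambda ct: quality_for(ct, ranges), default=None)
    if best is None or quality_for(best, ranges) == 0:
        return None
    return best


print(choose("*/*", ["application/json", "application/xml"]))
print(choose("text/*;q=0.5, application/json", ["text/plain", "application/json"]))
print(choose("application/json", ["text/plain"]))
```

When two types tie, this sketch picks whichever the server listed first, on the assumption that the server lists its own favourites first.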

Finally, we have the response body. The response body is where the data of the response will live; in this case, the data just says “Hello, world”. In real APIs, this would be the information you were trying to retrieve.
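If you want to poke at those pieces yourself, here's a rough Python sketch that pulls our sample response apart into status, headers, and body. The parse_response function is a hypothetical helper; in real code you'd let an HTTP library do this, since it copes with edge cases this sketch ignores (repeated headers, chunked bodies, and so on).

```python
def parse_response(raw):
    """Split raw HTTP response bytes into status, headers, and body."""
    # The first blank line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, message = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return int(code), message, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Length: 12\r\n"
    b"Date: Mon, 28 Oct 2013 21:44:37 GMT\r\n"
    b"\r\n"
    b"Hello, world"
)
code, message, headers, body = parse_response(raw)
print(code, message)
print(headers["content-type"])
print(body.decode("utf-8"))
```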

Again, pretty basic tools: status (success/fail indicator), headers (information about the response), body (actual response). From these three components, you can give really powerful, useful responses.

Which, of course, doesn’t necessarily mean that most people do.

Back on Track: Conventions

Our long foray into HTTP finished, let’s get back to the REST stuff. Some conventions have taken hold and become de-facto standards. You should follow them unless you have a good reason not to, if only because that’s what your users will expect. And as any designer can tell you, if your user expects your software to do something, that better be what your software does.

Endpoints

Endpoints identify where a resource lives. They’re just URLs used to interact with APIs. “http://api.yourapiisbad.com/” is an endpoint, and “http://yourapiisbad.com/my/awesome/endpoint” is an endpoint. All endpoints provide is a location that a resource can be found at.

Resources are addressed by endpoints, but how do we build those endpoints?

First, a brain exploding moment: a collection of resources is a resource.

Got that? So if I’ve got a list of awesome things (a unicorn, my puppy Roxy, Call Me Maybe), each item in that list is a resource. But the list itself is also a resource. Weird, right?

But that makes building endpoints simple: the endpoint is just a resource hierarchy. We start with the host or the domain, which is the context of the request. This is where the request should be sent, but it also defines the conceptual boundaries of your API. After all, my API and your API may both define User resources, but these resources may not be the same and may not be interchangeable. The domain namespaces resources to the context you’re defining.

Then we start with a base resource. So in our list example, the base resource would be the entire list of awesome things. We could call that collection awesome_things, or things_that_make_paddy_smile. It doesn’t really matter what we call it, it just needs to be unique and should be a good description of the resources in the collection.

After our base resource, we can specify a specific member of that resource to return. So, for example, if we wanted just information about my puppy Roxy (and, let’s face it, everyone wants information about my puppy. You can follow her on Twitter at @RoxyThePuppy, by the way) we’d need her Resource Identifier. Let’s say it’s roxy, though your API may use a database ID or a username or something else. Again, the ID only needs to be unique—non-collection resources don’t even need a descriptive ID, though it’s always nice if they can have one. Now we want to say that roxy is a member of the awesome_things resource (which is actually a collection of resources), so our endpoint would be awesome_things/roxy. And awesome_things is not a member of any resource (it’s a base resource), so the final endpoint is /awesome_things/roxy.

If it helps, think of collection resources as database tables, and regular resources as a specific row in that table. You need to specify the table name and the row ID to get a specific row. If you specify just the table name, you’re really saying “all the rows in the table”.

Of course, you can (and should) have more complex hierarchies than that. A blog, for example, may have /posts/1/comments/4, meaning the fourth comment on the first post. /posts/1/comments would mean all the comments on the first post. /posts/1 would mean the first post. /posts would mean all the posts.
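Here's a small Python sketch of reading an endpoint as a hierarchy. It assumes paths strictly alternate between a collection name and a member ID, which is true for the examples above but is my assumption, not a rule of HTTP; parse_endpoint is a made-up name.

```python
def parse_endpoint(path):
    """Turn /posts/1/comments/4 into a list of (collection, id) pairs."""
    segments = [s for s in path.split("/") if s]
    pairs = []
    for i in range(0, len(segments), 2):
        collection = segments[i]
        # A trailing collection name with no ID means "the whole collection".
        member_id = segments[i + 1] if i + 1 < len(segments) else None
        pairs.append((collection, member_id))
    return pairs


for path in ("/posts", "/posts/1", "/posts/1/comments",
             "/posts/1/comments/4", "/awesome_things/roxy"):
    print(path, "->", parse_endpoint(path))
```

A pair whose ID is None refers to the collection itself, which is the "a collection of resources is a resource" idea in code form.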

Verbs

Though you can theoretically use HTTP verbs however the heck you want when making requests, it’s best to respect the semantic meaning of the verbs. You’ll have less pain that way.

GET is meant to retrieve a representation of a resource. It’s also considered a “safe” verb, meaning that it can be called without any side-effects. So if you’re creating, updating, or deleting things, you really shouldn’t be using GET.

POST is reserved for any non-idempotent action: that is, any action where sending the same request twice does something different from sending it once, because its side-effects happen again every time it’s performed. Generally, POST requests are used to create objects. For example, you’d use a POST request to /awesome_things if you wanted to add another resource as a “child” of the awesome_things resource. A POST request should specify the resource to be created in the body. You should generally respond with the created resource.

Note: keep in mind that POST requests can also just mean “change this resource”. That’s a powerful definition, and it gets forgotten a lot.

PUT is meant to replace a resource. Semantically, a PUT request’s body should overwrite everything about the resource except for its ID. Think of it as knocking down the eastern wall of your house and building a new wall in its place; totally new wall, but it’s still the eastern wall of your house. If you want to update just part of a resource, you should be using PATCH. PATCH is the equivalent of painting the eastern wall of your house; the wall is a different colour, but is still the eastern wall and everything else about it is the same.

PATCH is a fairly new verb in HTTP. A lot of APIs and clients will use PUT or POST for updates instead.
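The wall analogy, sketched in Python. The put and patch functions here are toy stand-ins operating on plain dictionaries, and the patch is a simple shallow merge that I picked for illustration; real APIs often use a defined patch format such as JSON Merge Patch or JSON Patch instead.

```python
def put(resource, body):
    """Replace everything about the resource except its ID."""
    replaced = {"id": resource["id"]}
    replaced.update((k, v) for k, v in body.items() if k != "id")
    return replaced


def patch(resource, body):
    """Update only the fields named in the body; keep the rest."""
    changes = {k: v for k, v in body.items() if k != "id"}
    return {**resource, **changes}


wall = {"id": "eastern", "material": "brick", "colour": "white"}

# PUT: a totally new wall. The colour is gone because the body didn't include it.
print(put(wall, {"material": "stone"}))

# PATCH: same wall, new paint.
print(patch(wall, {"colour": "blue"}))
```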

DELETE is meant to destroy a resource.
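Putting the verbs together, here's a toy in-memory collection in Python that respects those semantics. Everything about it is hypothetical: the Collection class, the handle method, and the particular status codes (201, 204, 404, 405) are common choices I've assumed for the sketch, not something this chapter has prescribed.

```python
import itertools


class Collection:
    """A toy stand-in for a collection resource such as /awesome_things."""

    def __init__(self):
        self.members = {}
        self._next_id = itertools.count(1)

    def handle(self, verb, member_id=None, body=None):
        """Return a (status_code, payload) pair for a request."""
        if verb == "GET":
            # Safe: reading never changes anything.
            if member_id is None:
                return 200, list(self.members.values())
            if member_id in self.members:
                return 200, self.members[member_id]
            return 404, None
        if verb == "POST" and member_id is None:
            # Not idempotent: every call creates another member.
            new_id = str(next(self._next_id))
            self.members[new_id] = {**body, "id": new_id}
            return 201, self.members[new_id]
        if verb == "PUT" and member_id is not None:
            # Idempotent: repeating it leaves the same state behind.
            self.members[member_id] = {**body, "id": member_id}
            return 200, self.members[member_id]
        if verb == "DELETE" and member_id is not None:
            if self.members.pop(member_id, None) is None:
                return 404, None
            return 204, None
        return 405, None  # verb not supported here


things = Collection()
print(things.handle("POST", body={"name": "unicorn"}))
print(things.handle("GET", "1"))
print(things.handle("DELETE", "1"))
print(things.handle("GET", "1"))
```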

Data Formatting

Once every six months or so, I come across an API that doesn’t support JSON representations. This never fails to confuse and bewilder me; it’s like the API designers haven’t been paying attention for the last decade or so.

Don’t be that API.

I’m not here to tell you one data format is empirically better than another. If, to escape the decadent luxury of self-flagellation, you decide to punish yourself by working with some XML, who am I to stop you?

I am going to tell you, however, that pretty much every user expects you to be returning JSON for them to parse. JSON has tooling readily available for it, it’s easy to express data structures in it, it’s easy to read. Which is not to say that JSON is better, just that JSON is massively popular. If you are ignoring the expectations of your users, you should have a very good reason for it.

“I need to send and receive binary data.”

That’s totally a good reason. Use your protocol buffers or your msgpack or your BSON or whatever it is your use case requires. You’ve thought this through. Gold star.

“I’m serving mobile apps for users with low bandwidth, and JSON is too verbose.”

That’s totally a good reason. You also have thought this through. You considered the needs of the user, so choose an appropriate data format for that job.

“But XML is better.”

Your reasoning is bad and you should feel bad.

This isn’t to say that you can’t return XML or whatever data format floats your boat. By all means, go for it. Just also be able to return JSON, and respect the Accept header. If your users are used to working with JSON (protip: in 99.999% of all cases, your users are used to working with JSON) that’s what you should let them work with.
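In practice, "also be able to return JSON, and respect the Accept header" can be as small as the Python sketch below. It's deliberately naive: it walks the Accept header in order and ignores q weights, the XML layout is something I invented for the example, and render and RENDERERS are hypothetical names.

```python
import json
import xml.etree.ElementTree as ET


def to_json(resource):
    """Render a flat dictionary as JSON."""
    return json.dumps(resource)


def to_xml(resource):
    """Render a flat dictionary as a simple, made-up XML layout."""
    root = ET.Element("resource")
    for key, value in resource.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


RENDERERS = {"application/json": to_json, "application/xml": to_xml}


def render(resource, accept="*/*"):
    """Return (content_type, body), falling back to JSON for wildcards."""
    for entry in accept.split(","):
        media_type = entry.split(";")[0].strip()
        if media_type in ("*/*", "application/*"):
            media_type = "application/json"
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](resource)
    # Nothing the client accepts; a real server might answer 406 here.
    return None, None


thing = {"name": "Roxy", "kind": "puppy"}
print(render(thing))
print(render(thing, "application/xml"))
```

The content type that comes back is what you'd put in the response's Content-Type header.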

For the rest of this book, we’re going to be using JSON for the examples. Just mentally translate it into whatever data format you prefer.

Wrapping Up

That’s the end of our horribly brief overview of HTTP. Definitely dig more into it. Read the specifications, check out Wikipedia, Google around for some blog posts. HTTP is incredibly powerful and able to create very flexible semantics when you know how to use it, so developing good intuition around it will definitely work out in your favour.

In the next chapter, we’re going to talk a bit about how to go about designing APIs, and it’ll be the last chapter of primer information, I swear.

To recap:

  • HTTP is made up of requests and responses.
  • Requests have a verb, a path, a host, some headers, and a body. The verb is what you’re doing, the path is what you’re doing it to, the host is the namespace or context for what you’re doing it to, the headers are metadata about the request, and the body is the data you want to use.
  • Responses have a status code & message, some headers, and a body. The status code & message tell you whether the request succeeded or not, the headers are metadata about the response, and the body is the data you’re receiving.
  • Requests can tell the server how they want the data represented using the Accept header, and the server can tell clients how the data is represented in the response by using the Content-Type header.

If you’ve got all that down, we’re ready to talk about how to go about designing your API.