
6 Requests and Responses

As we just covered in Chapter 1, requests and responses are how you interact with APIs. They’re kind of the paradigm that everything else rests on. So getting your requests and responses right is Really Important™.

Let’s talk about how easy it is to get your requests and responses wrong.

Polymorphism

Polymorphism is just the idea that the same concept can appear in many forms. If you remember, resources are polymorphic—they can be returned as XML or JSON, they can have some fields omitted or included, etc.

Polymorphism, by itself, is not inherently evil. By supporting different data formats, you enable multiple environments to interact with your APIs really easily, for example.

But when your API starts supporting polymorphism in internally conflicting ways, that’s when you’re going to start making people Hulk out on your API.

What I mean by internally conflicting is that the property or resource can be one of any number of things that need to be handled differently, and there’s no way of knowing what it will be ahead of time. For example, say you have an Article resource, which looks like this:

{
"title": "My Title",
"author": "Paddy Foran"
}

But sometimes, your Article resource can also look like this:

{
"title": "My Title",
"author": {
	"name": "Paddy Foran",
	"user_id": 123
}
}

See how author is sometimes a string and is sometimes an object? Trying to treat it as a string when it’s returned as an object or trying to treat it as an object when it’s returned as a string is going to make code break, so now developers have to keep careful track of which API call returns which, and need separate code to handle them. Especially for developers working in strongly-typed languages, not knowing the type of the response ahead of time is going to cause a lot of code bloat and waste a lot of developer time writing boilerplate around this, when it could easily be solved by just always returning an object or string for the author field.
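To make that boilerplate concrete, here is a minimal sketch of the kind of shim a client ends up writing when author can arrive in either form. The function name normalize_author is hypothetical and not part of any real client; it only illustrates the extra branch every consumer is forced to carry.

```python
from typing import Any, Dict, Optional, Union


def normalize_author(author: Union[str, Dict[str, Any]]) -> Dict[str, Optional[Any]]:
    """Coerce either form of the author field into one predictable shape."""
    if isinstance(author, str):
        # String form: only the name is known, so user_id is left empty.
        return {"name": author, "user_id": None}
    if isinstance(author, dict):
        # Object form: copy across the two fields shown in the example.
        return {"name": author.get("name"), "user_id": author.get("user_id")}
    # Anything else is a third shape the client was never told about.
    raise TypeError(f"unexpected author type: {type(author).__name__}")
```

Every client of the API has to write and maintain something like this, which is exactly the cost that a single, consistent author type would remove.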

There are a few instances where polymorphism is hard to avoid. The one I hear most often is the news feed: that is, a resource like Facebook’s news feed or GitHub’s news feed; basically, a collection of other resources that have been changed, sorted in descending order. The argument is that each item of the feed is polymorphic, because it could be any of the resources that are tracked on the news feed.

I disagree. Each item of the feed is a news feed item resource, and that shares all the same properties. Maybe a timestamp, a title, a summary, and a link to the resource it’s describing. Maybe a timestamp, a title, a summary, the type of change that occurred, the user that made the change, and a link to the resource it’s describing. No matter how elaborate or simple, these are still just the same type of resource, and should be exposed as such. Then there’s no polymorphism, it is a lot easier for developers in any language to work with your API, and your developers always know what they’re going to get in a response.
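As a rough sketch of that idea, a feed item can be modelled as one type whose fields never change, no matter what it points at. The field names below (timestamp, title, summary, link) follow the list above; the sample values and URLs are made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FeedItem:
    """One news feed entry; the same shape whatever resource it describes."""
    timestamp: str
    title: str
    summary: str
    link: str  # points at the changed resource instead of embedding it


def parse_feed(raw_items: List[Dict[str, Any]]) -> List[FeedItem]:
    # A single code path handles every entry, because every entry is a FeedItem.
    return [
        FeedItem(
            timestamp=item["timestamp"],
            title=item["title"],
            summary=item["summary"],
            link=item["link"],
        )
        for item in raw_items
    ]
```

An entry about an Article and an entry about a Comment differ only in their link, so the client never has to branch on type.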

If your API is a contract between you and your developers, polymorphism is basically saying “I’m going to respond with whatever I feel like.” It’s a really unfair and untenable position to put developers into, and it’s something you should avoid.

Response Structures

It’s sometimes tempting to recreate the structure of your response for each and every resource, or even each and every action for each and every resource. After all, if the user is asking for a list of Articles, you should respond with a list of Articles, right? So it should look like this:

[
{ "title": "Article 1" },
{ "title": "Article 2" },
{ "title": "Article 3" }
]

But the reality of the situation is a little more complicated than that.

Consider the Experience

When you’re writing an API, whose experience are you trying to optimise? For most APIs, it’s the client library authors who will see the gains of a better design. Optimising for the experience those authors will have is going to be one of the more important things you do in making your API usable.

Writing a client library consists of two main steps:

Mirroring the resources in local data types (whether they be objects, classes, structs, or what have you) and adding helper methods to manipulate those resources, so API internals like endpoints and methods aren’t strewn around your codebase. Basically, you create the Article type, then an Article.create("my title") function that makes the relevant API call.
Writing the networking and deserialization code it takes to tie your API and the local data types together. This means making the actual HTTP request, serializing the request data before it’s sent, and deserializing the response data when it returns.

Note that serializing and deserializing the data are separated from the data type. That means you need to be able to serialize and deserialize things without knowing what they are in advance. In this way, requests and responses are almost like resources themselves. The developer isn’t asking for a list of Articles, they’re asking for a Response that contains a list of Articles. If you give them a response structure that is consistent across all your resources, they only need to write the code to deserialize things once, and can just write it with the same code that makes the network requests. It makes their lives unbelievably easier, especially when you start to consider retrying requests and handling errors. The more complex the networking code, the more helpful it is to be able to think of whatever the server passes you as a Response resource containing the resources you requested.

Property-Per-Resource

For a lot of programming languages (Go, Python, JavaScript, and pretty much every other language I’ve used personally), the easiest way to do this is to create a Response resource that contains a property for each resource in your API:

{
"articles": [],
"users": [],
"comments": []
}

These are then optionally filled out per response, and the empty ones are stripped from the JSON before it’s sent. So a single response may look like this:

{
"articles": [
	{"title": "Article 1"},
	{"title": "Article 2"},
	{"title": "Article 3"}
]
}

A response containing only a single Article would look like this:

{
"articles": [
	{"title": "Article 1"}
]
}

Note that the Article is still in the array, even though there will only ever be one. This is because we want to avoid polymorphism; if the articles attribute contains an array of Article objects for one response, it should contain an array of Article objects for every response.
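Here is a minimal sketch of what this buys a client: one Response type and one decode function that work for every endpoint. The class and function names are illustrative assumptions, and users and comments are left as raw lists to keep the example short.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Article:
    title: str


@dataclass
class Response:
    """One property per resource; each is always a list, possibly empty."""
    articles: List[Article] = field(default_factory=list)
    users: List[Dict[str, Any]] = field(default_factory=list)
    comments: List[Dict[str, Any]] = field(default_factory=list)


def decode_response(body: str) -> Response:
    raw = json.loads(body)
    # Properties stripped by the server simply decode to empty lists.
    return Response(
        articles=[Article(title=a["title"]) for a in raw.get("articles", [])],
        users=raw.get("users", []),
        comments=raw.get("comments", []),
    )
```

Because a single Article still arrives inside an array, the caller can always loop over response.articles without first checking what kind of thing it received.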

Requests are a little trickier. The same format applies, but it’s rare to have multiple resources in a single request. If you can, you should support this, as one is just a special case of many, but this can sometimes be prohibitive to support. As always, do what is right for your API.

But it can be confusing for clients to need to construct an array to create a single item, especially when you don’t support creating multiple items. The array may lead clients to believe that you can create multiple items, which is then an expectation you’ve broken. For requests, therefore, I have two properties for each resource: one, like responses, containing an array of that resource’s type; the other is a singular version that contains the resource directly. For example, the request object may look something like:

{
"articles": [ { "title": "New article" } ]
}

Or it could look like this:

{
"article": { "title": "New article" }
}

Note that the property is now “article”, the singular, to avoid polymorphism.

My recommendation, and what I do, is to support both formats. If the plural is not populated, then the singular is the fallback, as a crutch for clients. There comes a cost with supporting two ways to do something, but the Robustness Principle comes into play here. Make affordances for your developers, and they will love you for it.
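On the server side, the fallback described above might look something like the sketch below. The function name articles_from_request is hypothetical; the only behaviour it assumes is the one stated here, namely that the plural wins and the singular is consulted only when the plural is not populated.

```python
from typing import Any, Dict, List


def articles_from_request(body: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the Articles in a request body as a list, whichever form was used."""
    plural = body.get("articles")
    if plural:
        # The plural property is populated, so it takes precedence.
        return list(plural)
    singular = body.get("article")
    if singular is not None:
        # Fall back to the singular property and wrap it, so the rest of
        # the server only ever deals with a list.
        return [singular]
    return []
```

Everything downstream of this function sees a list, so the two accepted request formats never leak into the rest of the code.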

Polymorphic “Data” Responses

As I was getting technical feedback on this, one of the reviewers (John Sheehan, CEO of Runscope) brought something to my attention—the attribute-per-resource method actually makes some people’s lives actively more difficult. The extremely illustrative example is the Twilio C# client. At the time of this writing, the client had to define 79 classes for the sole purpose of being able to map response attributes to classes. Essentially, when parsing the response, they needed a class for each response attribute, which led to the creation of 79 classes.

I’ve never written C#, so I can’t say if there’s an easier way to do this, but Twilio’s a company full of smart people—who, incidentally, are some of the best API evangelists in the world—and I feel pretty confident that if there was a better way, they’d be using it.

None of this looks fun. If it seems likely that your API will have an audience with C# developers, I’d strongly recommend against the property-per-resource approach. In that case, I’d actually recommend embracing polymorphism, as it would offer a better user experience.

So in this case, your response may actually look something like this:

{
"data": [
	{"title": "Article 1"},
	{"title": "Article 2"},
	{"title": "Article 3"}
]
}

Notice that the data attribute has replaced our resource name, so we can now get to the JSON through data every time. The trade-off is that your client now needs to know what’s coming from the API, or needs their software to be able to intelligently detect it. In this case, it may be a good idea to supply a new content type (for use in Accept and Content-Type headers) for each of your resources, so the client can determine the type of your resource without parsing the JSON.
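A rough sketch of how a client could use a per-resource content type to pick the right decoder follows. The media type names (application/vnd.example.article+json and so on) are invented for this example; a real API would define and document its own.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Article:
    title: str


@dataclass
class User:
    name: str


# Hypothetical media types, one per resource.
DECODERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "application/vnd.example.article+json": lambda d: Article(title=d["title"]),
    "application/vnd.example.user+json": lambda d: User(name=d["name"]),
}


def decode_data(content_type: str, body: str) -> List[Any]:
    # Drop parameters such as "; charset=utf-8" before looking up the type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    decoder = DECODERS[media_type]  # raises KeyError for an unknown type
    return [decoder(item) for item in json.loads(body)["data"]]
```

The header tells the client what data contains before any JSON is parsed, so the data attribute can stay the same across every resource.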

A request under this scheme, similarly, can be polymorphic as well. In this case, we don’t even need a wrapper in JSON. Just send the data:

{
"title": "My title"
}

For requests containing multiple resources, send the array:

[ { "title": "Article 1" },
{ "title": "Article 2" },
{ "title": "Article 3" }]

Supporting a Mixed Crowd

What if you have an audience that is a mix of Go developers and C# developers? Someone has to lose, right?

Not necessarily. The Go developers don’t have to put up with the polymorphism, and the C# developers don’t need to spend entire days stubbing out classes just to get at JSON data.

These are, after all, simply two different representations of the same data. How do we support two different representations of the same data?

That’s right, headers. You can use a request header to signal that polymorphism is preferred or not supported, and a response header to signal that polymorphism is or is not present.
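As a sketch only, the server-side switch could be as small as the function below. The header name X-Polymorphism and its values (preferred, present, absent) are made up for this example; nothing here is a standard header, and header-name case handling is left to whatever HTTP library you use.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical header name, used for both the request and the response.
HEADER = "X-Polymorphism"


def render(
    resource_name: str,
    items: List[Dict[str, Any]],
    request_headers: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, Any]]:
    """Return (response_headers, body) in the representation the client asked for."""
    preference = request_headers.get(HEADER, "unsupported").lower()
    if preference == "preferred":
        # The client opted in: wrap everything in the generic data attribute.
        return {HEADER: "present"}, {"data": items}
    # Default: the property-per-resource representation.
    return {HEADER: "absent"}, {resource_name: items}
```

The same list of Articles goes in either way; only the wrapper and the response header change.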

In the end, do what’s best for the developers in your audience. If you have 100 developers in your audience, and you can spend an hour saving each of them 15 minutes, that’s 25 hours saved for one hour spent: technology just advanced 24 hours for free.

Paging Responses

It’s very rare for a client to want all the information you ever stored in the history of ever. It’s even rarer that you want to supply all the information you ever stored in the history of ever in a single request/response cycle.

Which is why APIs decided to page things, offering a single “page” of the response—returning the first N results, and telling you to request page 2 for the next N results, and so on.

Which is all well and good. This saves bandwidth and computation for everyone involved. I shudder to think what it would do to my phone to parse the JSON for every tweet in my stream every time I wanted to check for new tweets. That’s clearly more work than we need to be doing.

Note: be wary of race conditions while paging responses; make sure that you’re returning your resources sorted in such a manner that a new resource being created doesn’t affect paging already in process. A good way to do this is, instead of saying “show me page 2”, say “show me the 20 results following this resource ID”. That way, even when new resources are created, your clients still get the results they asked for, with no repetitions or skips. The downside of this approach is that it makes caching harder, as the “page” links change every time a new resource is created or a resource changes its position in the list. It also makes it harder to “jump to” certain pages; you can only say “show me the next (or previous) page”, not “show me the fifth page”.
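The “results following this resource ID” idea can be sketched in a few lines. This is an in-memory illustration under the assumption that resources are held newest first and each has a unique id; a real server would do the equivalent in its data store, and the function name page_after is hypothetical.

```python
from typing import Any, Dict, List, Optional


def page_after(
    resources: List[Dict[str, Any]],
    after_id: Optional[int] = None,
    limit: int = 20,
) -> List[Dict[str, Any]]:
    """Return up to `limit` resources that follow the one with id `after_id`."""
    start = 0
    if after_id is not None:
        for index, resource in enumerate(resources):
            if resource["id"] == after_id:
                start = index + 1
                break
        else:
            # The loop finished without finding the anchor resource.
            raise KeyError(f"no resource with id {after_id}")
    return resources[start:start + limit]
```

Because the page is anchored to a resource rather than to an offset, a new resource arriving at the top of the list does not shift what the next page contains.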

We talked about how you can specify which page you want and the range of resources on each page in the last chapter, and the tradeoffs in that decision. But how do you return that information to the client in the response?

For the love of your developers, by default sort your data in the manner it is most often consumed. Don’t order tweets alphabetically or sort them by timestamp ascending. Let your sort order reflect the common use cases of the data.

A lot of the APIs that use query parameters to specify page ranges will then use a “page” parameter in the body to reflect that information in the response.

GET /things?page=3

{
 "things": [{...}, {...}],
 "page": {
   "current": 3,
   "max": 12
 }
}

Much like using the query parameters for specifying the page, this approach optimises for browser compatibility. You can view the requests and responses entirely in a browser.

My preferred approach is to relegate meta information to the headers. Just like a request header is used to specify which page and range to return, a response header is used to specify which page and range were returned:

Page: 4
TotalPages: 12
Count: 40

Again, just make that tradeoff with your audience in mind.
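For the header approach, a client-side sketch might read the three headers shown above into a small structure. The header names Page, TotalPages, and Count come from the example; the function names are hypothetical, and the code assumes the HTTP library hands over the headers with exactly that casing.

```python
from typing import Dict


def paging_from_headers(headers: Dict[str, str]) -> Dict[str, int]:
    """Pull the paging meta information out of the response headers."""
    return {
        "page": int(headers["Page"]),
        "total_pages": int(headers["TotalPages"]),
        "count": int(headers["Count"]),
    }


def has_next_page(paging: Dict[str, int]) -> bool:
    # There is another page to request as long as we are not on the last one.
    return paging["page"] < paging["total_pages"]
```

The response body is then nothing but the resources themselves, and the paging logic can live next to the rest of the networking code.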

Don’t Ignore My Accept Header

If a client explicitly specifies an Accept header, denoting the type of information they’re capable of handling, you should abide by it. This allows the client to tell you which data format they prefer, but you should never default to a format the client has not specifically asked for. If a client sends an Accept header of application/json and you, for whatever reason, can only respond with XML, you should not just serve the XML. That’s a waste of everyone’s time and bandwidth, because the client likely can’t do anything with that data you just gave them.

Instead, use the 406 Not Acceptable status code in response. It’s a status code that is specifically intended for requests that the server is unable to fulfill because their Accept header asks for a data format the server doesn’t use or can’t provide.
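A deliberately simplified sketch of that check is below. It ignores q-values and media type parameters, which a production server should honour, and the function name negotiate is an assumption made for this example.

```python
from typing import Optional, Sequence, Tuple


def negotiate(accept_header: str, supported: Sequence[str]) -> Tuple[int, Optional[str]]:
    """Return (status_code, content_type) for an Accept header value."""
    if not accept_header.strip():
        # No Accept header means the client will take anything.
        return 200, supported[0]
    for part in accept_header.split(","):
        media_range = part.split(";", 1)[0].strip().lower()
        if not media_range:
            continue
        for candidate in supported:
            exact = media_range in ("*/*", candidate)
            # "application/*" matches any "application/..." type.
            wildcard = media_range.endswith("/*") and candidate.startswith(media_range[:-1])
            if exact or wildcard:
                return 200, candidate
    # Nothing the client listed is something we can produce.
    return 406, None
```

A server that can only produce XML would answer an Accept of application/json with 406 here instead of sending a body the client cannot use.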

Don’t Misuse Status Codes

On that note, you should be generally aware of the HTTP status codes you have available for your use, and their meanings. Don’t return a 200 when a resource is created; return a 201. That’s what the separate code exists for. These codes are surfaced in pretty much every language and HTTP library, and tell the client a lot about how a response should be handled before its body is even parsed. Using them correctly gives clients a lot of information and requires very little effort on your part.
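To illustrate how much a client can decide from the status code alone, here is a small sketch using the HTTPStatus enum from Python’s standard library. The classify function and the labels it returns are hypothetical; the point is only that none of these branches needs to look at the response body.

```python
from http import HTTPStatus


def classify(status_code: int) -> str:
    """Decide how to treat a response using nothing but its status code."""
    status = HTTPStatus(status_code)  # raises ValueError for unknown codes
    if status is HTTPStatus.CREATED:
        return "created"
    if status is HTTPStatus.NOT_ACCEPTABLE:
        return "unsupported-format"
    if 200 <= status < 300:
        return "ok"
    if 400 <= status < 500:
        return "client-error"
    if status >= 500:
        return "server-error"
    return "other"
```

If the server returned 200 for a newly created resource, the "created" branch could never fire, and the client would have to dig through the body to find out what happened.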

Wrapping Up

These are the basic necessities of serving requests and responses. You don’t need to do everything my way, but you should avoid the pitfalls we discussed above. These are not abstract, theoretical principles I derived from meditation or a very expensive degree; they’re hard lessons I’ve learned while writing API clients. If you ever think “Ah, nobody will care if this response type is polymorphic” (as I’m sure one just idly thinks all the time), I’m here as proof against your assertion. It hurts. Please don’t do these things, they make me cry.

To recap:

Polymorphism can make the lives of your developers much easier or much harder, and should not be undertaken lightly.
Requests and responses should be moulded to make life easier for your developers. Remember, supporting multiple representations of the same data can save hours for people.
Meta information in the response is undesirable, but may be unavoidable, depending on your preference for browser compatibility.
Accept headers are important and should be obeyed.
Status codes are powerful and can signify very specific things, so be aware of them and use them appropriately and specifically.

In the next chapter, we’re going to talk about errors for a bit. They’re one of the things that can make or break an API experience, so definitely stick around for that.
