Table of Contents
- Our use case
- The state of the art
- Maybe there is another way
- Impure implementation
- Pure implementation
- What about tests?
- Adding benchmarks
- Documenting your API
Thank you for your interest in this book. I hope you’ll have a good time reading it and learn something from it.
This book is intended for the intermediate Scala programmer who is interested in functional programming and works mainly on the web service backend side. Ideally she has experience with libraries like Akka HTTP and Slick which are in heavy use in that area.
However, maybe you have wondered whether we can do better, even though the aforementioned projects are battle tested and proven.
The answer to this can be found in this book which is intended to be read from cover to cover in the given order. Within the book the following libraries will be used: Cats, Cats Effect, http4s, Doobie, Refined, fs2 and probably others. ;-)
This book uses the Creative Commons Attribution ShareAlike 4.0 International (CC BY-SA 4.0) license. The code snippets in this book are licensed under CC0 which means you can use them without restriction. Excerpts from libraries maintain their license.
I would like to thank my beloved wife and family who bear with me and make all of this possible.
Also I send a big thank you to all the nice people from the Scala community whom I’ve had the pleasure to meet.
For better understanding we will implement a small use case in both an impure and a pure way. The following section will outline the specification.
First we need to specify the exact scope and API of our service. We’ll design a service with a minimal API to keep things simple. It shall fulfil the following requirements.
The service shall provide HTTP API endpoints for:
- the creation of a product data type identified by a unique id
- adding translations for a product name by language code and unique id
- returning the existing translations for a product
- returning a list of all existing products with their translations
We will keep the model very simple to avoid going overboard with the implementation.
- A language code shall be defined by ISO 639-1 (i.e. a two-letter code).
- A translation shall contain a language code and a product name (non-empty string).
- A product shall contain a unique id (UUID version 4) and a list of translations.
The data will be stored in a relational database (RDBMS). Therefore we need to define the tables and relations within the database.
products must contain only the unique id which is also the primary key.
names must contain a column for the product id, one for the language code and one for the name. Its primary key is the combination of the product id and the language code. All columns must not be null. The relation to the products is realised by a foreign key constraint to the products table via the product id.
The HTTP API shall provide the following endpoints on the given paths:
||POST||Create a product.|
||GET||Get all products and translations.|
||GET||Get all translations for the product.|
The data shall be encoded in JSON using the following specification:
This should be enough to get us started.
Within the Scala ecosystem the Akka-HTTP library is a popular choice for implementing server side backends for HTTP APIs. Another quite popular option is the Play framework but using a full blown web framework to just provide a thin API is overkill in most cases. As most services need a database the Slick library is another popular choice which completes the picture.
However, while all of the mentioned libraries are battle tested and proven, they still have problems.
In the domain of functional programming we want referential transparency which we will define in the following way:
Building on that we need pure functions which
- depend only on their input
- have no side effects
This means in turn that our functions will be referentially transparent.
But the mentioned libraries are built upon the Future from Scala which uses eager evaluation and breaks referential transparency. Let’s look at an example.
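A minimal sketch of such an example, using only the Scala standard library. The object name `FutureTwice` and the `runs` counter are assumptions added here to make the behaviour observable; they are not part of the original listing.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureTwice {
  // Counts how often the side effect runs (illustration only).
  val runs = new AtomicInteger(0)

  def sayHi(): Unit = {
    runs.incrementAndGet()
    println("Hi there!")
  }

  // Two separate Future expressions: each one starts its own computation.
  def program(): Future[Unit] =
    for {
      _ <- Future(sayHi())
      _ <- Future(sayHi())
    } yield ()

  def main(args: Array[String]): Unit =
    Await.result(program(), 5.seconds)
}
```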
The code above will print the text Hi there! two times. But how about the following one?
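Again a minimal standard library sketch; only the name `printF` is taken from the text, while `FutureOnce` and the `runs` counter are assumptions for illustration. The only change is that the `Future` is bound to a value first and that value is then used twice.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureOnce {
  // Counts how often the side effect runs (illustration only).
  val runs = new AtomicInteger(0)

  def program(): Future[Unit] = {
    // The Future starts running as soon as it is created,
    // whether or not printF is used afterwards.
    val printF: Future[Unit] = Future {
      runs.incrementAndGet()
      println("Hi there!")
    }
    // Both steps refer to the same, already started computation.
    for {
      _ <- printF
      _ <- printF
    } yield ()
  }

  def main(args: Array[String]): Unit =
    Await.result(program(), 5.seconds)
}
```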
Instead of printing the text two times it will print it only once, and it does so even when there is no usage of printF at all (try omitting the for comprehension). This means that Future breaks referential transparency!
If we want referential transparency, we must push the side effects to the boundaries of our system (program) which can be done by using lazy evaluation. Let’s repeat the previous example in a different way.
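To keep the sketch free of dependencies, the following uses a tiny hand-written `IO` class as a stand-in for a real IO data type. It is a hypothetical simplification and not the Cats Effect API; only the names `effect` and `unsafeRunSync` follow the text, the rest (including the `runs` counter) is assumed for illustration.

```scala
import java.util.concurrent.atomic.AtomicInteger

object LazyEffect {
  // Hypothetical, heavily simplified stand-in for an IO data type:
  // it only stores a description of the computation and runs it on demand.
  final class IO[A](private val thunk: () => A) {
    def unsafeRunSync(): A = thunk()
    def map[B](f: A => B): IO[B] = new IO[B](() => f(thunk()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO[B](() => f(thunk()).unsafeRunSync())
  }
  object IO {
    def apply[A](body: => A): IO[A] = new IO[A](() => body)
  }

  // Counts how often the side effect runs (illustration only).
  val runs = new AtomicInteger(0)

  // Nothing is printed here: printIO merely describes the effect.
  val printIO: IO[Unit] = IO {
    runs.incrementAndGet()
    println("Hi there!")
  }

  // Re-using the same value twice describes the effect twice.
  val effect: IO[Unit] =
    for {
      _ <- printIO
      _ <- printIO
    } yield ()
}
```

Only a call like `LazyEffect.effect.unsafeRunSync()` triggers the output, and because `printIO` is a description rather than a running computation, using it twice prints twice.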
The above code will produce no output. Only if we evaluate the variable effect, which is of type IO[Unit], will the output be generated (try effect.unsafeRunSync in the REPL). Also the second approach works as expected.
Suddenly we can much more easily reason about our code! And why is that? Well, we don’t have unexpected side effects caused by code running even when it doesn’t need to. This is a sneak peek at what pure code looks like. Now we only need to implement pure libraries for our use, or do we?
Luckily for us there are by now several pure options available in the Scala ecosystem. We will stick to the Cats family of libraries, namely http4s and Doobie, as replacements for Akka-HTTP and Slick. They build upon the Cats Effect library which is an implementation of an IO monad for Scala. Some other options exist but we’ll stick to the one from Cats.
To be able to contrast both ways of implementing a service we will first implement it using Akka-HTTP and Slick and will then migrate to http4s and Doobie.
We’ll be using the following libraries for the impure version of the service:
- Akka (including Akka-HTTP and Akka-Streams)
- Slick (as database layer)
- Flyway for database migrations (or evolutions)
- Circe for JSON codecs and akka-http-json as wrapper
- Refined for using refined types
- the PostgreSQL JDBC driver
I’ll spare you the sbt setup as you can look that up in the code repository (i.e. the impure folder in the book repo).
First we’ll implement our models which are simple and straightforward. At first we need a class to store our translations or better a single translation.
Technically it is okay but we have a bad feeling about it. Using Option[String] is of no use because both fields have to be set. But a String can always be null and contain a lot of unexpected stuff (literally anything).
So let us define some refined types which we can use later on. At first we need a language code which obeys the restrictions of ISO-639-1 and we need a stronger definition for a product name. For the former we use a regular expression and for the latter we simply expect a string which is not empty.
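The constraints themselves can be sketched without any library by using smart constructors. This is a swapped-in technique for illustration and not the refined library: validation happens at runtime and returns an `Either`, whereas refined types carry the predicate in the type. The regular expression `^[a-z]{2}$` and all names except `Translation`, `LanguageCode` and `ProductName` are assumptions.

```scala
object RefinedSketch {
  // A value that can only be created via LanguageCode.from.
  final class LanguageCode private (val value: String)
  object LanguageCode {
    // Assumed shape check: exactly two lowercase letters.
    def from(s: String): Either[String, LanguageCode] =
      if (s != null && s.matches("^[a-z]{2}$")) Right(new LanguageCode(s))
      else Left(s"Not a two-letter language code: $s")
  }

  // A value that can only be created via ProductName.from.
  final class ProductName private (val value: String)
  object ProductName {
    def from(s: String): Either[String, ProductName] =
      if (s != null && s.nonEmpty) Right(new ProductName(s))
      else Left("A product name must not be empty.")
  }

  // The model can no longer hold null or malformed values.
  final case class Translation(lang: LanguageCode, name: ProductName)
}
```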
Now we can give our translation model another try.
Much better and while we’re at it we can also write the JSON codecs using the refined module of the Circe library. We put them into the companion object of the model.
Now onwards to the product model. Because we already know of refined types we can use them from the start here.
If we look closely we realise that a List may be empty. This is valid for the list but not for our product because we need at least one entry. Luckily for us the Cats library has us covered with the NonEmptyList data type. Including the JSON codecs this leads us to our final implementation.
Last but not least we really should be using the existing UUID data type instead of rolling our own refined string version - even when it is cool. ;-)
We kept the type name ProductId by using a type alias. This is convenient but remember that a type alias does not add extra type safety (e.g. type Foo = String will still be a String).
Well, maybe because a list may contain duplicate entries but the database will surely not because of unique constraints! So, let’s switch to a NonEmptySet which is also provided by Cats.
Now we have the models covered and can move on to the database layer.
The database layer should provide programmatic access to the database, but it should also manage changes in the database. The latter is called migrations or evolutions. From the available options we chose Flyway as the tool to manage our database schema.
Flyway uses raw SQL scripts which have to be put into a certain location, being /db/migration (under the resources folder) in our case. Also the files have to be named like VXX__some_name.sql (XX being a number) starting with V1. Please note that there are two underscores between the version prefix and the rest of the name! Because our database schema is very simple we’re done quickly:
In the code you’ll see that we additionally set comments which are omitted from the code snippet above. This might be overkill here but it is a very handy feature to have and I advise you to use it for more complicated database schemas, because the right comment (read: information) in the right place might save a lot of time when trying to understand things.
Next we move on to the programmatic part which at first needs a configuration of our database connection. With Slick you have a multitude of options but we’ll use the “Typesafe Config” approach.
After we have this in place we can run the migrations via the API of Flyway. For this we have to load the configuration (we do it by creating an actor system), extract the needed information, create a JDBC URL and use that with username and password to obtain a Flyway instance. On that one we simply call the method migrate() which will do the right thing. Basically it will check if the schema exists and decide to either create it, apply pending migrations or simply do nothing. The method will return the number of applied migrations.
Let us continue to dive into the Slick table definitions.
Slick offers several options for approaching the database. For our example we will be using the lifted embedding but if needed Slick also provides the ability to perform plain SQL queries.
For the lifted embedding we have to define our tables in a way Slick can understand. While this can be tricky under certain circumstances our simple model is straightforward to implement.
As you can see above we’re using simple data types (not the refined ones) to have an easier Slick implementation. However we can also use refined types at the price of using either the slick-refined library or writing custom column mappers.
Next we’ll implement the table for the translations which will also need some constraints.
As you can see the definition of constraints is also pretty simple. Now our repository needs some functions for a more convenient access to the data.
The last two functions are helpers which enable us to create queries that we can compose. They are used in the updateProduct functions to create a list of queries that are executed as bulk while the call to transactionally ensures that they will run within a transaction. When updating a product we first delete all existing translations to allow the removal of existing translations via an update. To be able to do so we use the andThen helper from Slick.
The loadProduct function simply returns a list of database rows from the needed join. Therefore we need a function which builds a Product type out of that.
But oh no! The compiler refuses to build it:
It seems we have to provide an instance of Order for our Translation model to make Cats happy. So we have to think of an ordering for our model. A simple approach would be to simply order by the language code. Let’s try this:
You might have noticed the explicit call to .value to get the underlying string instance of our refined type. This is needed because the other option (using x.compare(y)) will compile but bless you with stack overflow errors. The reason is probably that the latter is compiled into code calling OrderOps#compare which is recursive.
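The effect of ordering by language code can be sketched with the standard library alone. The following uses `scala.math.Ordering` and `SortedSet` as stand-ins for the `Order` type class and `NonEmptySet` from Cats; the plain `String` fields and the sample data are assumptions for illustration.

```scala
import scala.collection.immutable.SortedSet

object OrderingSketch {
  // Simplified model with plain strings instead of refined types.
  final case class Translation(lang: String, name: String)

  // Compare translations by the underlying language code only.
  implicit val byLanguage: Ordering[Translation] =
    Ordering.by((t: Translation) => t.lang)

  // A sorted set treats elements that compare as equal as duplicates,
  // so at most one translation per language code is kept.
  val names: SortedSet[Translation] = SortedSet(
    Translation("en", "Chair"),
    Translation("de", "Stuhl"),
    Translation("de", "Sessel")
  )
}
```

Ordering by language code alone mirrors the primary key of the names table, where the combination of product id and language code must be unique.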
So far we should have everything in place to make use of our database. Now we need to wire it all together.
Defining the routes is pretty simple if you’re used to the Akka-HTTP routing DSL syntax.
We will fill in the details later on. But now for starting the actual server to make use of our routes.
The code will fire up a server using the defined routes together with the hostname and port from the configuration. It will run until you press enter and then terminate. Let us now visit the code for each routing endpoint. We will start with the one for returning a single product.
We load the raw product data from the repository and convert it into a proper product model. But to make the types align we have to wrap the second call in a Future, otherwise we would get a compiler error. We don’t need to marshal the response because we are using the akka-http-json library which provides for example an ErrorAccumulatingCirceSupport import that handles this. Unless of course you do not have circe codecs defined for your types.
The route for updating a product is also very simple. We’re extracting the product entity via the entity(as[T]) directive from the request body and simply give it to the appropriate repository function. Now onwards to creating a new product.
As you can see the function is basically the same except that we’re calling a different function from the repository. Last but not least let us take a look at the return all products endpoint.
This looks more complicated than the other endpoints. So what exactly are we doing here?
Well first we load the raw product data from the repository. Afterwards we convert it into the proper data model or to be more exact into a list of product entities.
The first thing that comes to mind is that we’re performing operations in memory. This is not different from the last time when we converted the data for a single product. Now however we’re talking about all products which may be a lot of data. Another obvious point is that we get a list of Option[Product] which we explicitly flatten at the end.
Maybe we should consider streaming the results. But we still have to group and combine the rows which belong to a single product into a product entity. Can we achieve that with streaming? Well, let’s look at our data flow.
We receive a list of 3 columns from the database in the following format: product id, language code, name. The tricky part is that multiple rows (list entries) can belong to the same product, recognizable by the same value for the first column product id. At first we should simplify our problem by ensuring that the list will be sorted by the product id. This is done by adjusting the function loadProducts in the repository.
Now we can rely on the fact that we have seen all entries for one product if the product id in our list changes. Let’s adjust our code in the endpoint to make use of streaming now. Because Akka-HTTP is based on Akka-Streams we can simply use that.
Wow, this may look scary but let’s break it apart piece by piece. At first we need an implicit value which provides streaming support for JSON. Next we create a Source from the database stream. Now we implement the processing logic via the high level streams API. We collect every defined output of our helper function fromDatabase which leads to a stream of Product entities. But we have created way too many (each product will be created as often as it has translations). So we group our stream by the product id which creates a new stream for each product id holding only the entities for the specific product. We fold over each of these streams by merging together the list of translations (names). Afterwards we merge the streams back together and run another collect function to simply get a result stream of Product and not of Option[Product]. Last but not least the stream is passed to the complete function which will do the right thing.
The solution has two problems:
- The number of individual streams (and thus products) is limited to
- The groupBy operator holds the references to these streams in memory, opening a possible out of memory issue here.
As the first problem is simply related to the usage of groupBy we may say that we only have one problem: The usage of groupBy.
For a limited amount of data the proposed solution is perfectly fine so we will leave it as is for now.
Regarding the state of our service we have a working solution, so congratulations and let’s move on to the pure implementation.
Like in the previous section I will spare you the details of the sbt setup. We will be using the following set of libraries:
- Doobie (as database layer)
- Flyway for database migrations (or evolutions)
- Circe for JSON codecs
- Refined for using refined types
- the PostgreSQL JDBC driver
- pureconfig (for proper configuration loading)
Last time we simply loaded our configuration via the typesafe config library but can’t we do a bit better here? The answer is yes, by using the pureconfig library. First we start by implementing the necessary parts of our configuration as data types.
As we can see the code is pretty simple. The implicits in the companion objects are needed for pureconfig to actually map from a configuration to your data types. As you can see we are using a function deriveReader which will derive (like in mathematics) the codec (yes, it is similar to a JSON codec, thus the name) for us.
Below is an example of deriving an Order instance using the kittens library. It uses shapeless under the hood and provides automatic and semi automatic derivation for a lot of type class instances from Cats like Functor and so on.
Because we have already written our models we just re-use them here. The only thing we change is the semi automatic derivation of the JSON codecs. We just need to import the appropriate circe package and call the derive functions.
In general the same applies to the database layer as we have already read in the “impure” section.
For the sake of simplicity we will stick to Flyway for our database migrations. However we will wrap the migration code in a different way (read: encapsulate it properly within an IO to defer side effects). While we’re at it we may just as well write our migration code using the interpreter pattern (it became famous under the name “tagless final” in Scala).
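A minimal sketch of such an algebra, assuming the parameter names `url`, `user` and `pass`. To stay free of dependencies it comes with a toy interpreter for `Try` (the hypothetical `InMemoryMigrator`) that performs no real migration; the interpreter used in the book targets Flyway and `IO` instead.

```scala
import scala.util.Try

object MigratorSketch {
  // The algebra: F[_] is the effect type chosen by the interpreter.
  trait DatabaseMigrator[F[_]] {
    // Returns the number of applied migrations wrapped in F.
    def migrate(url: String, user: String, pass: String): F[Int]
  }

  // Toy interpreter for illustration only: it pretends that `pending`
  // migrations were applied and fails for anything that is not a JDBC URL.
  final class InMemoryMigrator(pending: Int) extends DatabaseMigrator[Try] {
    override def migrate(url: String, user: String, pass: String): Try[Int] =
      Try {
        require(url.startsWith("jdbc:"), "Not a JDBC URL!")
        pending
      }
  }
}
```

Code written against `DatabaseMigrator[F]` does not need to know which effect type is used, which is the flexibility the pattern is meant to provide.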
We define a trait which describes the functionality desired by our interpreter and use a higher kinded type parameter to be able to abstract over the type. But now let’s continue with our Flyway interpreter.
As we can see, the implementation is pretty simple and we just wrap our code into an IO monad to constrain the effect. Having the migration code settled we can move on to the repository.
If we take a closer look at the method definition of Flyway.migrate, we see this:
While IO will gladly defer side effects for us, it won’t stop enclosed code from throwing exceptions. This is not that great. So what can we do about it?
Having an instance of MonadError in scope we could just use the .attempt function provided by it. But is this enough or, better, does this provide a sensible solution for us? Let’s play a bit on the REPL.
This looks like we just have to use MonadError then. Hurray, we don’t need to change our code in the migrator. As model citizens of the functional programming camp we just defer the responsibility upwards to the calling site.
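What an attempt buys us can be sketched with a hand-written stand-in. The `IO` class below is hypothetical and is not the Cats Effect or `MonadError` API; it only illustrates how a deferred computation that throws can be turned into a value of type `Either` at the calling site.

```scala
import scala.util.control.NonFatal

object AttemptSketch {
  // Hypothetical, heavily simplified stand-in for an IO data type.
  final class IO[A](thunk: () => A) {
    def unsafeRunSync(): A = thunk()

    // Still deferred, but a thrown non-fatal exception becomes a Left.
    def attempt: IO[Either[Throwable, A]] =
      new IO[Either[Throwable, A]](() =>
        try Right(thunk())
        catch { case NonFatal(e) => Left(e) }
      )
  }
  object IO {
    def apply[A](body: => A): IO[A] = new IO[A](() => body)
  }

  // Deferring alone does not prevent the exception when the effect is run.
  val failing: IO[Int] = IO[Int](throw new IllegalStateException("Migration failed!"))

  // The caller decides how to deal with the error.
  val safe: IO[Either[Throwable, Int]] = failing.attempt
}
```

Running `AttemptSketch.failing` throws, while running `AttemptSketch.safe` returns a `Left` holding the exception.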
As we already started with using a tagless final approach we might as well continue with it and define a base for our repository.
There is nothing exciting here except that we feel brave now and try to use proper refined types in our database functions. This is possible due to the usage of the doobie-refined module. To be able to map the UUID data type (and others) we also need to include the doobie-postgresql module. For convenience we are still using ProductId instead of UUID in our definition. In addition we wire the return type of loadProducts to be a fs2.Stream because we want to achieve pure functional streaming here. :-)
So let’s see what a repository using doobie looks like.
We keep our higher kinded type as abstract as we can but we want it to be able to suspend our side effects. Therefore we require an implicit
If we look at the detailed function definitions further below, the first big difference is that with doobie you write plain SQL queries. You can do this with Slick too but with doobie it is the only way. If you’re used to object relational mapping (ORM) or other forms of query compilers then this may seem strange at first. But: “In data processing it seems, all roads eventually lead back to SQL!” ;-)
We won’t discuss the benefits or drawbacks here but in general I also lean towards the approach of using the de facto lingua franca for database access because it was made for this and so far no query compiler was able to beat hand crafted SQL in terms of performance. Another benefit is that if you ask a database guru for help, she will be much more able to help you with plain SQL queries than with some meta query which is compiled into something that you have no idea of.
The loadProduct function simply returns all rows for a single product from the database like its Slick counterpart in the impure variant. The parameter will be correctly interpolated by Doobie, therefore we don’t need to worry about SQL injections here. We specify the type of the query, instruct Doobie to transform it into a sequence and give it to the transactor.
The loadProducts function is equivalent to the first one but it returns the data for all products sorted by product and as a stream using the fs2 library which provides pure functional streaming.
When saving a product we use monadic notation for our program to have it short circuit in the case of failure. Doobie will also put all commands into a database transaction. The function itself will try to create the “master” entry into the products table and save all translations afterwards.
The updateProduct function also uses monadic notation like the saveProduct function we talked about before. The difference is that it first deletes all known translations before saving the given ones.
The routing DSL of http4s differs from the one of Akka-HTTP. Although I like the latter one more, it poses no problem to model out a base for our routes.
As we can see the DSL is closer to Scala syntax and quite easy to read. But before we move on to the details of each route let’s think about how we can model this a bit more abstractly. While it is fine to have our routes bound to IO it would be better to have more flexibility here. We have several options here but for starters we just extract our routes into their own classes like in the following schema.
So far they only need the repository to access and manipulate data. Now let’s take on the single route implementations.
First we need to bring JSON codecs in scope for http4s, thus the implicit definitions on top of the file. In the route for loading a single product we simply load the database rows which we pipe through our helper function to construct a proper Product and return that.
The update route (via PUT) transforms the request body into a Product and gives that to the update function of the repository. Finally a NoContent response is returned.
Our first take on the routes for products looks pretty complete already. Again we need implicit definitions for our JSON codecs to be able to serialize and de-serialize our entities. The POST route for creating a product is basically the same as the update route from the previous part. We create a Product from the request body, pass it to the save function of the repository and return a 205
The GET route for returning all products calls the appropriate repository function which returns a stream which we map over using our helper function. Afterwards we use collect to convert our stream from Option[Product] to a stream of Product which we pass to the Ok function of http4s.
To solve this we need to dive into the fs2 API and leverage its power to merge our products back together. So let’s see how we do.
Because we believe ourselves to be clever we pick the simple sledge hammer approach and just run some accumulator on the stream. So what do we need? A helper function and some code changes on the stream (e.g. in the route).
So this function will take a list (that may be empty) and a product and will merge the top most element (the head) of the list with the given one. It will return an updated list that either contains an updated head element or a new head. Leaving aside the question of who guarantees that the relevant list element will always be the head, we may use it.
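The accumulator idea can be sketched on a plain `List` instead of a stream. The model below is simplified (plain strings, a list of names) and the function names `merge` and `mergeAll` are assumptions, not the names used in the book.

```scala
import java.util.UUID

object MergeSketch {
  // Simplified models for illustration.
  final case class Translation(lang: String, name: String)
  final case class Product(id: UUID, names: List[Translation])

  // Merges `p` into the head of `acc` if both share the same id,
  // otherwise `p` becomes the new head.
  def merge(acc: List[Product])(p: Product): List[Product] =
    acc match {
      case head :: tail if head.id == p.id =>
        head.copy(names = head.names ++ p.names) :: tail
      case _ => p :: acc
    }

  // Folds over all entries; this only works if entries of the same
  // product are adjacent, i.e. the input is sorted by product id.
  // The result is reversed to restore the input order.
  def mergeAll(rows: List[Product]): List[Product] =
    rows.foldLeft(List.empty[Product])((acc, p) => merge(acc)(p)).reverse
}
```

Note that the fold returns one big list, so everything ends up in memory before anything can be emitted.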
Looks so simple, doesn’t it? Just a simple fold which uses our accumulator and we should be settled. But life is not that simple…
The compiler complains that we have changed the type of the stream and rightly so. So let’s fix that compiler error.
Let’s take a look again and think about what it means to change a stream of products into a stream of a list of products. It means that we will be building the whole thing in memory! Well if we wanted that we could have skipped streaming at all. So back to the drawing board.
We need to process our stream of database columns (or products if we use the converter like before) in such a way that all related entities will be grouped into one product and emitted as such. After browsing the documentation of fs2 we stumble upon a function called groupAdjacentBy so we try that one.
Okay, this does not look complicated and it even compiles - Hooray! :-)
So let’s break it apart piece by piece. The group function of fs2 will partition the input depending on the given function into chunks. A Chunk is used internally by fs2 for all kinds of stuff. You may compare it to a sub-stream of Akka-Streams. However the documentation labels it as: Strict, finite sequence of values that allows index-based random access of elements.
Having our chunks we can map over each one converting it into a list which is then passed to our helper function fromDatabase to create proper products. Last but not least we need to collect our entities to get from an Option[Product] to a stream of Product.
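The idea behind grouping adjacent elements can be sketched on a plain `Iterator` without fs2. This is not the fs2 implementation, just a standard library illustration of the technique; it relies on the input being sorted by the key, and it only keeps the current group in memory.

```scala
object AdjacentGrouping {
  // Lazily groups consecutive elements that share the same key.
  def groupAdjacentBy[A, K](it: Iterator[A])(key: A => K): Iterator[(K, List[A])] = {
    val buffered = it.buffered
    new Iterator[(K, List[A])] {
      def hasNext: Boolean = buffered.hasNext

      def next(): (K, List[A]) = {
        val k     = key(buffered.head)
        val group = List.newBuilder[A]
        // Consume elements as long as the key does not change.
        while (buffered.hasNext && key(buffered.head) == k)
          group += buffered.next()
        (k, group.result())
      }
    }
  }
}
```

Applied to rows sorted by product id, each emitted group holds exactly the rows of one product, which can then be converted into a single entity.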
Now that we have a proper streaming solution we try it out but what do we get when we expect a list of products?
Well, whatever this is, it is not JSON! It might look like it, but it isn’t. However quite often you can see such things in the wild (read in production).
If we think about it then this sounds like a bug in http4s and indeed we find an issue for it. Because the underlying problem is not as trivial as it first sounds maybe we should try to work around the issue.
The fs2 API offers concatenation of streams and the nifty intersperse function to insert elements between emitted ones. So let’s give it a try.
First we create streams for the first and last JSON that we need to emit. Please note that we cannot simply use a String here but have to lift it into our HKT F. The usage of pure is okay because we simply lift a fixed value. Then we extend our original stream processing by explicitly converting our products to JSON and inserting the delimiter (a comma) manually using the intersperse function. In the end we simply concatenate our streams and return the result.
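The framing logic can be sketched on a plain `Iterator[String]`, assuming the elements are already encoded as JSON. The helper names `intersperse` and `jsonArray` are local to this sketch; it does not use the fs2 API and only illustrates the order in which the pieces are emitted.

```scala
object JsonArrayFraming {
  // Puts `sep` between the elements of `it` without materialising them.
  def intersperse[A](it: Iterator[A], sep: A): Iterator[A] =
    it.zipWithIndex.flatMap {
      case (a, 0) => Iterator(a)
      case (a, _) => Iterator(sep, a)
    }

  // Opening bracket, comma separated elements, closing bracket.
  def jsonArray(encoded: Iterator[String]): Iterator[String] =
    Iterator("[") ++ intersperse(encoded, ",") ++ Iterator("]")
}
```

For example, `JsonArrayFraming.jsonArray(Iterator("1", "2")).mkString` yields `[1,2]` and an empty input yields `[]`.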
Our solution is quite simple, having the downside that we need to suppress a warning from the wartremover tool. This is somewhat annoying but can happen. If we remove the annotation, we’ll get a compiler error:
So let’s check if we have succeeded:
This looks good, so congratulations: We are done with our routes!
Within our main entry point we simply initialise all needed components and wire them together. We’ll step through each part in this section. The first thing you’ll notice is that we use the IOApp provided by the Cats effect library.
Yet again we need to suppress a warning from wartremover here. But let’s continue to initialising the database connection.
We create our database migrator explicitly wired to the IO data type. Now we start with a for comprehension in which we load our configuration via pureconfig, yet again within an IO. After successful loading of the configuration we continue with migrating the database. Finally we create the transactor needed by Doobie and the database repository.
Here we create our routes via the classes, combine them (via the <+> operator) and create the http4s app explicitly using an IO, thus wiring our abstract routes to IO. The service will - like the impure one - run until you press enter. But it won’t run yet. ;-)
If you remember playing around with MonadError then you’ll recognize the attempt here. We attempt to run our program and execute possible side effects via the unsafeRunSync method from Cats effect. But to provide a proper return type for the IOApp we need to evaluate the return value which is either an error or a proper exit code. In case of an error we print it out on the console (no fancy logging here) and explicitly set an error code as the return value.
As it seems we are done with our pure service! Or are we? Let’s see what we need to add to both services if we want to test them.
In the domain of strong static typing (not necessarily functional) you might hear phrases like “It compiles therefore it must be correct thus we don’t need tests!”. While there is a point that certain kinds of tests can be omitted in favour of strong static typing, such stances overlook that even a correctly typed program may produce the wrong output. The other extreme (coming from dynamically typed land) is to substitute typing with testing - which is even worse. Remember that testing is usually a probabilistic approach and cannot guarantee the absence of bugs. If you have ever refactored a large code base in both paradigms then you’ll very likely come to esteem a good type system.
However, we need tests, so let’s write some. But before let us think a bit about what kinds of tests we need. :-)
Our service must read and create data in the JSON format. This format should be fixed and changes to it should raise some red flag because: Hey, we just broke our API! Furthermore we want to unleash the power of ScalaCheck to benefit from property based testing. But even when we’re not using that we can still use it to generate test data for us.
Besides the regular unit tests there should be integration tests if a service is written. We can test a lot of things on the unit test side but in the end the integration of all our moving parts is what matters and often (usually on the not so pure side of things) you would have to trick (read mock) a lot to test things in isolation.
We will start with writing some data generators using the ScalaCheck library.
ScalaCheck already provides several generators for primitives but for our data models we have to do some more plumbing. Let’s start with generating a language code.
With the Gen.oneOf helper from the library the code becomes dead simple. Generating a UUID is nothing special either.
You might be tempted to use Gen.const here but please don’t because that one will be memoized and thus never change. Another option is using a list of randomly generated UUID values from which we then choose one. That would be sufficient for generators which only generate a single product but if we want to generate lists of them we would have duplicate ids sooner rather than later.
So what do we have here? We want to generate a non empty string (because that is a requirement for our ProductName) but we also want to return a properly typed entity. First we let ScalaCheck generate a non empty list of random characters which we give to a utility function of refined. However we need a fallback value in case the validation done by refined fails. Therefore we defined a general default product name beforehand.
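Two of the generator details discussed so far can be sketched without ScalaCheck or refined. Everything below is a hypothetical stand-in: `Gen` is just a function producing a value on demand, `const` and `delay` only mimic the difference between a fixed and a re-evaluated value, and `validateName` and `DefaultName` are assumed names for the validation and the fallback value.

```scala
import java.util.UUID
import scala.util.Random

object GeneratorSketch {
  // Hypothetical mini generator: a function producing a value on demand.
  type Gen[A] = () => A

  // Captures an already computed value: every sample is the same.
  def const[A](a: A): Gen[A] = () => a

  // Re-evaluates its by-name argument on every sample.
  def delay[A](a: => A): Gen[A] = () => a

  val sameId: Gen[UUID]  = const(UUID.randomUUID())
  val freshId: Gen[UUID] = delay(UUID.randomUUID())

  // Stand-in for a validation that may reject a candidate value.
  def validateName(s: String): Either[String, String] =
    if (s.nonEmpty) Right(s) else Left("A product name must not be empty.")

  // Assumed fallback value, used only if the validation fails.
  val DefaultName = "I am a product name!"

  def productName(rnd: Random): Gen[String] =
    delay {
      val candidate = rnd.alphanumeric.take(1 + rnd.nextInt(10)).mkString
      validateName(candidate).getOrElse(DefaultName)
    }
}
```

Sampling `sameId` repeatedly always returns the id computed at definition time, whereas `freshId` creates a new one per sample, which is what a generator for lists of products needs.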
Now that we have generators for language codes and product names we can write a generator for our Translation.
As we can see the code is also quite simple. Additionally we create an implicit arbitrary value which will be used automatically by the forAll test helper if it is in scope. To be able to generate a Product we will need to provide a non empty list of translations.
The first generator will create a non empty list of translations but will be typed as a simple
List. Therefore we create a second generator which uses the
fromList helper of the non empty list from Cats. Because that helper returns an
Option (read is a safe function) we need to fall back to using a simple
of function at the end.
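The two step construction can be sketched as follows. The Translation class here is a simplified stand-in using plain strings instead of refined types, and all names are assumptions made for this example.

```scala
import cats.data.NonEmptyList
import org.scalacheck.Gen

object TranslationListGenerator {
  // Simplified stand-in for the real model (plain strings, no refined types).
  final case class Translation(lang: String, name: String)

  val DefaultTranslation: Translation = Translation("en", "I am a product name!")

  val genTranslation: Gen[Translation] = for {
    lang <- Gen.oneOf("de", "en", "fr")
    name <- Gen.nonEmptyListOf(Gen.alphaNumChar).map(_.mkString)
  } yield Translation(lang, name)

  // Step 1: non empty, but only typed as a plain List.
  val genTranslationList: Gen[List[Translation]] = Gen.nonEmptyListOf(genTranslation)

  // Step 2: fromList returns an Option, so we fall back to NonEmptyList.of.
  val genNonEmptyTranslationList: Gen[NonEmptyList[Translation]] =
    genTranslationList.map { ts =>
      NonEmptyList.fromList(ts).getOrElse(NonEmptyList.of(DefaultTranslation))
    }
}
```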
With all these in place we can finally create our
The code is basically the same as for
Translation - the arbitrary implicit included.
To avoid repeating the construction of our unit test classes we will implement a base class for tests which is quite simple.
Feel free to use other test styles - after all ScalaTest offers a lot of them. I tend to lean towards the more verbose ones like
WordSpec. Maybe that is because I spent a lot of time with RSpec in the Ruby world. ;-)
The code above is a very simple test of our helper function
fromDatabase which works in the following way:
- forAll will generate a lot of Product entities using the generator.
- From each entity a list of “rows” is constructed like they would appear in the database.
- These constructed rows are given to the fromDatabase function.
- The returned Option must then contain the generated value.
Because we construct the input for the function from a valid generated instance the function must always return a valid output.
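To make the idea tangible without any library, here is a reduced sketch of such a round trip. The model uses plain strings, and both the row layout (id, language code, name) and the body of fromDatabase are assumptions for illustration, not the actual helper.

```scala
import java.util.UUID

object FromDatabaseSketch {
  final case class Translation(lang: String, name: String)
  final case class Product(id: UUID, names: List[Translation])

  // Hypothetical simplified helper: builds a product from database rows.
  // Returns None if there are no rows at all.
  def fromDatabase(rows: Seq[(UUID, String, String)]): Option[Product] =
    rows.headOption.map { case (id, _, _) =>
      val names = rows.filter(_._1 == id).map { case (_, lang, name) => Translation(lang, name) }
      Product(id, names.toList)
    }

  // Flattens a product into the rows it would occupy in the database.
  def toRows(p: Product): Seq[(UUID, String, String)] =
    p.names.map(t => (p.id, t.lang, t.name))
}
```

For every product with at least one translation, fromDatabase(toRows(p)) must then be Some(p), which is the property a forAll based test would check.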
Now let’s continue with testing our JSON codec for
We need our JSON codec to provide several guarantees:
- It must fail to decode invalid JSON input format (read garbage).
- It must fail to decode valid JSON input format with invalid data (read wrong semantics).
- It must succeed to decode completely valid input.
- It must encode JSON which contains all fields included in the model.
- It must be able to decode JSON that it encoded itself.
The first one is pretty simple and to be honest: You don’t have to write a test for this because that should be guaranteed by the Circe library. Things look a bit different for very simple JSON representations though (read when encoding to numbers or strings).
I’ve seen people arguing about point 5 and there may be applications for it but implementing encoders and decoders in a non-reversible way will make your life way more complicated.
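A small sketch of these guarantees, using Circe's semi-automatic derivation on a simplified model with plain strings (an assumption, the real models use refined types), might look like this:

```scala
import io.circe.{Decoder, Encoder, Error}
import io.circe.generic.semiauto.{deriveDecoder, deriveEncoder}
import io.circe.parser.decode
import io.circe.syntax._

object CodecRoundTrip {
  // Simplified stand-in model for illustration only.
  final case class Translation(lang: String, name: String)

  implicit val translationDecoder: Decoder[Translation] = deriveDecoder[Translation]
  implicit val translationEncoder: Encoder[Translation] = deriveEncoder[Translation]

  // Decodes arbitrary input, failing with a Left on garbage or wrong data.
  def parse(input: String): Either[Error, Translation] = decode[Translation](input)

  // Encodes a value and decodes the result again (guarantee number 5).
  def roundTrip(t: Translation): Either[Error, Translation] = parse(t.asJson.noSpaces)
}
```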
There is not much to say about the test above: It will generate a lot of random strings which will be passed to the decoder which must fail.
This test will generate random instances for
id which are all wrong because it must be a UUID and not a string. Also the instances for
names will mostly (but maybe not always) be wrong because there might be empty strings or even an empty list. So the decoder is given a valid JSON format but invalid values, therefore it must fail.
In this case we manually construct a valid JSON input using values from a generated valid
Product entity. This is passed to the decoder and the decoder must not only succeed but return an instance equal to the generated one.
The test will generate again a lot of entities and we construct a JSON string from each. We then expect the string to include several field names and their correctly encoded values. You might ask why we do not check for more things like: Are these fields the only ones within the JSON string? Well, this would be more cumbersome to test and a JSON containing more fields than we specify won’t matter for the decoder because it will just ignore them.
Here we encode a generated entity and pass the result to the decoder which must return the same entity.
So this is basically what we do for models. Because we have more than one we will have to write tests for each of the others. I will spare you the JSON tests for
Translation but that one also has a helper function called
fromUnsafe so let’s take a look at the function.
This function simply tries to create a valid
Translation entity from unsafe input values using the helpers provided by refined. As we can see it is a total function (read is safe to use). To cover all corner cases we must test it with safe and unsafe input.
Here we generate two random strings which we explicitly check to be invalid using the
whenever helper. Finally the function must return
None for such values.
The test for valid input is very simple because we simply use the values from our automatically generated valid instances. :-)
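To illustrate what “total” means here, the following sketch uses plain checks in place of the refined helpers. The validation rules (two lowercase letters for the language code, a non empty name) are assumptions for this example.

```scala
object FromUnsafeSketch {
  final case class Translation(lang: String, name: String)

  // Assumed rule: a language code consists of exactly two lowercase letters.
  private val LanguageCode = "^[a-z]{2}$".r

  // Total function: every input, including null, yields either Some or None.
  def fromUnsafe(lang: String)(name: String): Option[Translation] =
    for {
      l <- Option(lang).filter(s => LanguageCode.pattern.matcher(s).matches)
      n <- Option(name).filter(_.nonEmpty)
    } yield Translation(l, n)
}
```

Calling fromUnsafe("de")("Stuhl") yields a value while fromUnsafe("deu")("Stuhl") and fromUnsafe("de")("") both yield None.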
So far we have no tests for our
Repository class which handles all the database work. Neither do we have tests for our routes. We have several options for testing here but before we can test either of them we have to do some refactoring. For starters we should move our routes out of our main application into separate classes to be able to test them more easily.
Yes, we should. There are of course limits and pros and cons to that but in general this makes sense. Also this has nothing to do with being “impure” or “pure” but with clean structure.
Moving the routes into separate classes poses no big problem: we simply create a
ProductRoutes and a
ProductsRoutes class which will hold the appropriate routes. As a result our somewhat messy main application code becomes more readable.
We simply create our instances from our routing classes and construct our global routes directly from them. This is good but if we want to test the routes in isolation we still have the problem that they are hard-wired to our
Repository class which is implemented via Slick. Several options exist to handle this:
- Use an in-memory test database with according configuration.
- Abstract further and use a trait instead of the concrete repository implementation.
- Write integration tests which will require a working database.
Using option 1 is tempting but think about it some more. While the benefit is that we can use our actual implementation and just have to fire up an in-memory database (for example h2), there are also some drawbacks:
- You have to handle evolutions for the in-memory database.
- Your evolutions have to be completely portable SQL (read ANSI SQL). Otherwise you’ll have to write each of your evolution scripts twice (once for production, once for testing).
- Your code has to be database agnostic. This sounds easier than it is. Even the tools you’re using may use database specific features under the hood.
- Several features are simply not implemented in some databases. Think of things like cascading deletion via foreign keys.
Taking option 2 is a valid choice but it will result in more code. Also you must pay close attention to the “test repository” implementation to avoid introducing bugs there. Going for the most simple approach is usually feasible. Think of a simple test repository implementation that will just return hard coded values or values passed to it via constructor.
However we will go with option 3 in this case. It has the drawback that you’ll have to provide a real database environment (and maybe more) for testing. But it is as close to production as you can get. Also you will need these either way to test your actual repository implementation, so let’s get going.
First we need to configure our test database because we do not want to accidentally wipe a production database. For our case we leave everything as is and just change the database name.
Another thing we should do is provide a test configuration for our logging framework. We use the logback library and Slick will produce a lot of logging output on the
DEBUG level so we should fix that. It is nice to have logging if you need it but it also clutters up your log files. We create a file
logback-test.xml in the directory
src/it/resources which should look like this:
Due to the nature of integration tests we want to use production or “production like” settings and environment. But really starting our application or service for each test will be quite cumbersome so we should provide a base class for our tests. In this class we will start our service, migrate our database and provide an opportunity to shut it down properly after testing.
Because we do not want to get into trouble when running an Akka-HTTP server on the same port, we first create a helper function which will determine a free port number.
The code is quite simple and very useful for such cases. Please note that it is important to use
setReuseAddress because otherwise the found socket will be blocked for a certain amount of time. But now let us continue with our base test class.
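One way to sketch such a port lookup with nothing but the JDK is shown below. The function name is an assumption, and this variant sets the reuse flag before binding because the JDK documentation leaves the behaviour undefined if the flag is changed on an already bound socket.

```scala
import java.net.{InetSocketAddress, ServerSocket}

object FreePort {
  // Binds to port 0 so that the operating system picks an unused port,
  // remembers that port and releases the socket again.
  def findAvailablePort(): Int = {
    val socket = new ServerSocket() // created unbound on purpose
    try {
      socket.setReuseAddress(true) // allow quick rebinding of the port
      socket.bind(new InetSocketAddress(0))
      socket.getLocalPort
    } finally socket.close()
  }
}
```

Keep in mind that another process could grab the port between this lookup and the actual start of the server, which is usually acceptable for tests.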
As you can see we are using the Akka-Testkit to initialise an actor system. This is useful because there are several helpers available which you might need. We configure the actor system with our free port using the loaded configuration as fallback. Next we globally create an actor materializer which is needed by Akka-HTTP and Akka-Streams. Also we create a globally available Flyway instance to make cleaning and migrating the database easier.
The base class also implements the beforeAll and
afterAll methods which will be run before and after all tests. They are used to initially migrate the database and to shut down the actor system properly in the end.
Now that we have our parts in place we can write an integration test for our repository implementation.
First we need to do some things globally for the test scope.
We create one
Repository instance for all our tests here. The downside is that if one test crashes it then the others will be affected too. On the other hand we avoid running into database connection limits and severe code limbo to ensure closing a repository connection after each test no matter the result.
Also we clean and migrate before each test and clean also after each test. This ensures having a clean environment.
Onwards to the test for loading a single product.
Loading a non existing product must not produce any result and is simple to test if our database is empty. Testing the loading of a real product is not that much more complicated. We use the ScalaCheck generators to create one, save it and load it again. The loaded product must of course be equal to the saved one.
Testing the loading of all products if none exist is trivial like the one for a non existing single product. For the case of multiple products we generate a list of them which we save. Afterwards we load them and use the same transformation logic as in the routes to be able to construct proper
Product instances. One thing you might notice is the explicit sorting which is due to the fact that we want to ensure that our product lists are both sorted before comparing them.
Here we test the saving which in the first case should simply write the appropriate data into the database. If the product already exists however this should not happen. Our database constraints will ensure that this does not happen (or so we hope ;-)). Slick will throw an exception which we catch in the test code using the
recover method from
Future to return a zero indicating no affected database rows. In the end we test for this zero and also check if the originally saved product has not been changed.
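The recover technique itself needs nothing but the standard library. In the following sketch a failed Future stands in for the repository call that violates a database constraint, and the names are made up for illustration.

```scala
import java.sql.SQLException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object RecoverSketch {
  // Stand-in for saving a product whose id already exists in the database.
  def saveDuplicate(): Future[Int] =
    Future.failed(new SQLException("duplicate key value violates unique constraint"))

  // Maps the failure to "zero affected rows" so a test can compare plain numbers.
  def affectedRows(): Int =
    Await.result(saveDuplicate().recover { case _: SQLException => 0 }, 5.seconds)
}
```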
For testing an update we generate two samples, save one to the database, change the id of the other to the one from the first and execute an update. This update should proceed without problems and the data in the database must have been changed correctly.
If the product does not exist then we use the same
recover technique like in the
Congratulations, we have made a check mark on our first integration test using a real database and randomly generated data!
Regarding the route testing there are several options as always. For this one we will define some use cases and then develop some more helper code which will allow us to fire up our routes and do real HTTP requests and then check the database and the responses. We build upon our
BaseSpec class and call it
BaseUseCaseSpec. In it we will do some more things like define a global base URL which can be used from the tests to make correct requests. Additionally we will write a small actor which simply starts an Akka-HTTP server.
As you can see, the actor is quite simple. Upon receiving the
Start command it will initialise the routes and start an Akka-HTTP server. To be able to do this it needs the database repository and an actor materializer which are passed via the constructor. Regarding the
BaseUseCaseSpec we will concentrate on the code that differs from the base class.
Here we create our base URL and our database repository while we use the
beforeAll function to initialise our actor before any test is run. Please note that this has the same drawback as sharing the repository across tests: If a test crashes your service then the others will be affected too.
But let’s write a test for our first use case: loading a product!
Again we will use the beforeEach and
afterEach helpers to clean up our database. Now let’s take a look at a test for loading a product that does not exist.
Maybe a bit verbose but we do a real HTTP request here and check the response status code. So how does it look if we run it?
That is not cool. What happened? Let’s take a look at the code.
Well, no wonder - we are simply returning the output of our
fromDatabase helper function which may be empty. This will result in an HTTP status code 200 with an empty body. If you don’t believe me just fire up the service and do a request by hand via curl or httpie.
Luckily for us Akka-HTTP has us covered with the
rejectEmptyResponse directive which we can use.
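The effect of the directive can be sketched on a reduced route. The findName function below is a made up stand-in for the repository call combined with fromDatabase, and an integer id is used to keep the example short.

```scala
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route

object RejectEmptySketch {
  // Hypothetical lookup which may or may not find something.
  def findName(id: Int): Option[String] =
    if (id == 1) Some("Chair") else None

  // Without rejectEmptyResponse a None would be answered with 200 and an empty body.
  // With it the empty response becomes a rejection which the default
  // rejection handler turns into a 404 Not Found.
  val route: Route =
    path("product" / IntNumber) { id =>
      get {
        rejectEmptyResponse {
          complete(findName(id))
        }
      }
    }
}
```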
Cool, it seems we’re set with this one. So onward to testing the loading of an existing product via the API.
Here we simply save our generated product into the database before executing our request. We also check if the product has actually been written to be on the safe side. Additionally we also check if the decoded response body matches the product we expect. In contrast to our first test this one works instantly so not all hope is lost for our coding skills. ;-)
We’ll continue with the use case of saving (or creating) a product via the API. This time we will make the code snippets shorter.
Here we test posting garbage instead of valid JSON to the endpoint which must result in a “400 Bad Request” returned to us.
This one executes a bit more but basically we try to save (or better create) an already existing product. Therefore the constraints of our database should produce an error which in turn must return a “500 Internal Server Error” to us. Additionally we verify that the existing product in the database was not changed.
Last but not least we are testing the saving of a valid product that does not exist yet into the database. Again we check for the expected status code of “200 OK” and verify that the product saved into the database is the one we sent to the API. Let’s move on to testing the loading of all products now.
Our first test case is loading all products if no product exists so we expect an empty list here and the appropriate status code.
This one is also straightforward: We save our generated list of products to the database and query the API which must return a “200 OK” status code and a correct list of products in JSON format. Looks like we have one more use case to tackle: Updating a product via the API.
First we test with garbage JSON in the request. Before doing the request we actually create a product to avoid getting an error caused by a possibly missing product. Afterwards we check for the expected “400 Bad Request” status code and verify that our product has not been updated.
Next we test updating an existing product using valid JSON. We check the status code and if the product has been correctly updated within the database.
Finally we test updating a non existing product which should produce the expected status code and not save into the database. Wow, it seems we are done for good with our impure implementation. Well except for some benchmarking but let’s save that for later.
If we look at our test code coverage (which is a metric that you should use) then things look pretty good. We are missing some parts but in general we should have things covered.
We will skip the explanation of the ScalaCheck generators because they only differ slightly from the ones used in the impure part.
The model tests are omitted here because they are basically the same as in the impure section. If you are interested in them just look at the source code. In contrast to the impure part we will now write unit tests for our routes. Meaning we will be able to test our routing logic without spinning up a database.
To be able to test our routes we will have to implement a
TestRepository first which we will use instead of the concrete implementation which is wired to a database.
As you can see we try to stay abstract (using our HKT
F[_] here) and we have basically left out the implementation of
loadProducts because it will just return an empty stream. We will get back to it later on. Aside from that the class can be initialised with a (potentially empty) list of
Product entities which will be used as a “database”. The save and update functions won’t change any data, they will just return a
0 or a
1 depending on the product being present in the seed data list.
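A reduced sketch of such a test repository is shown below. The Repository trait is a simplified assumption (it returns an Option instead of database rows and omits loadProducts), but the idea of answering 0 or 1 depending on the seed data is the same.

```scala
import java.util.UUID
import cats.Applicative
import cats.syntax.applicative._

object TestRepositorySketch {
  final case class Product(id: UUID, names: List[String])

  // Simplified repository algebra for illustration only.
  trait Repository[F[_]] {
    def loadProduct(id: UUID): F[Option[Product]]
    def saveProduct(p: Product): F[Int]
    def updateProduct(p: Product): F[Int]
  }

  // The seed data acts as a read-only "database".
  final class TestRepository[F[_]: Applicative](data: Seq[Product]) extends Repository[F] {
    // 1 if the product is part of the seed data, 0 otherwise. Nothing is changed.
    private def affectedRows(p: Product): Int = if (data.exists(_.id == p.id)) 1 else 0

    override def loadProduct(id: UUID): F[Option[Product]] = data.find(_.id == id).pure[F]
    override def saveProduct(p: Product): F[Int]           = affectedRows(p).pure[F]
    override def updateProduct(p: Product): F[Int]         = affectedRows(p).pure[F]
  }
}
```

Because only an Applicative is required, the class can be used with IO in the route tests or with something as simple as cats.Id.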
Above you can see the test for querying a non existing product which must return an empty response using a “404 Not Found” status code. First we try to create a valid URI from our generated
ProductId. If that succeeds we create a small service wrapper for our routes in which we inject our empty
TestRepository. Finally we create a response using this service and a request that we construct. Because we are in the land of
IO we have to actually execute it (via
unsafeRunSync) to get any results back. Finally we validate the status code and the response body.
It seems that we just pass an empty response if we do not find the product. This is not nice, so let’s fix this. The culprit is the following line in our
Okay, this looks easy. Let’s try this one:
Oh no, the compiler complains:
Right, before we had an
Option[Product] here for which we created an implicit JSON encoder. So if we create one for the
Product itself then we should be fine.
Now back to our test:
Great! Seems like we are doing fine, so let’s continue. Next in line is testing a query for an existing product.
This time we generate a whole product, again try to create a valid URI and continue as before. But this time we inject our
TestRepository containing a list with our generated product. In the end we test our expected status code and the response body must contain our product. For the last part to work we must have an implicit
EntityDecoder in scope.
Here we are trying to update a product which doesn’t have to exist because we send totally garbage JSON with the request which should result in a “400 Bad Request” status code. However if we run our test we get an exception instead:
So let’s take a deep breath and look at our code:
As we can see we do no error handling at all. So maybe we can rewrite this a little bit.
Now we explicitly handle any error which occurs when decoding the request entity. But in fact: I am lying to you. We only handle the invalid message body failure here. On the other hand it is enough to make our test happy. :-)
Onwards to our next test cases in which we use a valid JSON payload for our request.
We expect a “404 Not Found” if we try to update a valid product which does not exist. But what do we get in the tests?
Well not exactly what we planned for but it is our own fault. We used the
*> operator which ignores the value from the previous operation. So we need to fix that.
We rely on the return value of our update function which contains the number of affected database rows. If it is zero then nothing has been done implying that the product was not found. Otherwise we return our “204 No Content” response as before. Still we miss one last test for our product routes.
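The difference between the two approaches can be shown in isolation. In this sketch a small Status type stands in for the real http4s responses (an assumption to keep it free of http4s), and the function names are invented.

```scala
import cats.Monad
import cats.implicits._

object UpdateResponseSketch {
  sealed trait Status
  case object NoContent extends Status
  case object NotFound  extends Status

  // Using *> runs the update but throws its result away,
  // so the answer is always NoContent.
  def ignoring[F[_]: Monad](update: F[Int]): F[Status] =
    update *> (NoContent: Status).pure[F]

  // Using flatMap lets us look at the number of affected rows.
  def inspecting[F[_]: Monad](update: F[Int]): F[Status] =
    update.flatMap {
      case 0 => (NotFound: Status).pure[F]
      case _ => (NoContent: Status).pure[F]
    }
}
```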
Basically this is the same test as before with the exception that we now give our routes a properly seeded test repository. Great, we have tested our
ProductRoutes and without having to spin up a database! But we still have work to do, so let’s move on to testing the
ProductsRoutes implementation. Before we do that we adapt the code for creating a product using our gained knowledge from our update test.
Now to our tests, we will start with sending garbage JSON via the POST request.
There is nothing special here, the test is the same as for the
ProductRoutes except for the changed URI and HTTP method. Also the code is a bit simpler because we do not need to generate a dynamic request URI like before.
Saving a product using valid JSON payload should succeed and in fact it does because of the code we have in our
TestRepository instance. If you remember we use the following code for
This code will return a
0 if the product we try to save does not exist in the seed data set and only a
1 if it can be found within the aforementioned set. This code clearly doesn’t make any sense except for our testing. This way we can ensure the behaviour of the save function without having to create a new
Repository instance with hard coded behaviour. :-)
We use the empty repository this time to ensure that the
saveProduct function will return a zero, triggering the desired logic in our endpoint. Almost done, so let’s check the endpoint for returning all products.
We simply expect an empty list if no products exist. It is as simple as that and works right out of the box. Last but not least we need to test the return of existing products. But before we do this let’s take a look at our
Uh, oh, that does not bode well! So we will need to fix that first. Because we were wise to choose fs2 as our streaming library the solution is as simple as this.
Now we can write our last test.
To decode the response correctly an implicit
EntityDecoder of the appropriate type is needed in scope. But the rest of the test should look pretty familiar to you by now.
It seems the only parts left to test are the
FlywayDatabaseMigrator and our
DoobieRepository classes. Testing them will require a running database so we are leaving the cosy world of unit tests behind and venture forth into integration test land. But fear not, we already have some - albeit impure - experience here.
As usual we start up by implementing a base class that we can use to provide common settings and functions across our tests.
You can see that we keep it simple here and only load the database configuration and ensure that it has indeed been loaded correctly in the
To ensure the basic behaviour of our
FlywayDatabaseMigrator we write a simple test.
Within this test we construct an invalid database configuration and expect that the call to
migrate throws an exception. If you remember, we had this issue already and chose not to handle any exceptions but let the calling site do this - for example via a
The other two tests are also quite simple, we just expect it to return either zero or the number of applied migrations depending on the state of the database. It goes without saying that we of course use the beforeEach and
afterEach helpers within the test to prepare and clean our database properly.
Last but not least we take a look at testing our actual repository implementation which uses Doobie. To avoid trouble we need to define a globally available
ContextShift in our test which is as simple as this:
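A sketch of such a definition, assuming Cats Effect 2.x (ContextShift no longer exists in Cats Effect 3) and the global execution context, might read:

```scala
import cats.effect.{ContextShift, IO}
import scala.concurrent.ExecutionContext

object TestContext {
  // Assumption: the global execution context is sufficient for the tests.
  implicit val contextShift: ContextShift[IO] = IO.contextShift(ExecutionContext.global)
}
```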
Now we can start writing our tests.
Here we simply test that the
loadProduct function returns an empty list if the requested product does not exist in the database.
From now on we’ll omit the transactor and repository creation from the code examples. As you can see a generated product is saved to the database and loaded again and verified in the end.
Testing that loadProducts returns an empty stream if no products exist is as simple as the code above. :-)
In contrast the test code for checking the return of existing products is a bit more involved. But let’s step through it together. First we save the list of generated products to the database which we do using the
traverse function provided by Cats. In impure land we used
Future.sequence here if you remember - but now we want to stay pure. ;-)
Next we call our
loadProducts function and apply a part of the logic from our
ProductsRoutes to it, namely we construct a proper stream of products which we turn into a list via
list in the end. Finally we check that the list is not empty and equal to our generated list.
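The saving step via traverse can be sketched independently of the database. The helper below is an assumption written against any Applicative, so it works with IO in the real test or with a simpler effect when trying it out.

```scala
import cats.Applicative
import cats.implicits._

object TraverseSketch {
  // Applies the effectful save function to every item and collects the
  // results inside a single effect, much like Future.sequence after a map.
  def saveAll[F[_]: Applicative, A](items: List[A])(save: A => F[Int]): F[List[Int]] =
    items.traverse(save)
}
```

Nothing runs until the resulting effect is executed, which is the main difference to the eager Future based variant.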
The code for testing
saveProduct is nearly identical to the
loadProduct test as you can see. We simply check additionally that the function returns the number of affected database rows.
Updating a non existing product must return a zero and save nothing to the database, which is what we test above.
Finally we test updating a concrete product by generating two of them, saving the first into the database and running an update using the second with the id from the first.
Wow, it seems we are finished! Congratulations, we can now check mark the point “write a pure http service in Scala” on our list. :-)
Now that we have our implementations in place we can start comparing them. We will start with implementing some benchmarks to test the performance of both implementations.
There are several applications available to perform load tests and benchmarks. Regarding the latter the Apache JMeter project is a good starting point. It is quite easy to get something running. Like the documentation says: For the real stuff you should only use the command line application and use the GUI to create and test your benchmark.
We’ll skip a long introduction and tutorial for JMeter because you can find a lot within the documentation and there are lots of tutorials online.
Within the book repository you’ll find a folder named jmeter which contains several things:
- Several files ending with .jmx, like Pure-Create-Products.jmx and so on.
- A CSV file containing 100.000 valid product IDs named product-ids.csv.
- A file named benchmarks.md.
The .jmx files are the configuration files for JMeter which can be used to run the benchmarks. They are hopefully named understandably and are expected to be run in the following order:
- Create products
- Load products
- Update products
- Load all products
The file product-ids.csv is expected in the
/tmp folder, so you’ll have to copy it there or adjust the benchmark configurations. Finally the file
benchmarks.md holds detailed information about the benchmark runs (each one was done three times in a row).
Service and testing software (Apache JMeter) were run on different workstations connected via 100 MBit/s network connection.
| CPU | Core i5-9600K, 6 Cores, 3.7 GHz |
| HDD | 2x Samsung SSD 860 PRO 512GB, SATA |
| OS | FreeBSD 12 (HT disabled) |

| CPU | AMD Ryzen Threadripper 2950X |
| HDD | 2x Samsung SSD 970 PRO 512GB, M.2 |
| OS | FreeBSD 12 (HT disabled) |
Apache JMeter version 5.1.1 was used to run the benchmark and if not noted otherwise 10 threads were used with a 10 seconds ramp up time for each benchmark.
So let’s start with comparing the results. As mentioned more details can be found in the file
benchmarks.md. We’ll stick to using the average of the metrics across all three benchmark runs. The following abbreviations will be used in the tables and legends.
- The average response time in milliseconds.
- The median response time in milliseconds.
- 90 percent of all requests were handled within the response time in milliseconds or less.
- 95 percent of all requests were handled within the response time in milliseconds or less.
- 99 percent of all requests were handled within the response time in milliseconds or less.
- The minimum response time in milliseconds.
- The maximum response time in milliseconds.
- The error rate in percent.
- The number of requests per second that could be handled.
- The maximum amount of memory used by the service during the benchmark in MB.
- The average system load on service machine during the benchmark.
Wow, I honestly have to say that I didn’t expect that. Usually the world has come to believe that the impure approach might be dirty but is definitely always faster. Well it seems we’re about to correct that. I don’t know about you but I’m totally fine with that. ;-)
But now let’s break it apart piece by piece. The first thing that catches the eye is that the pure service seems to be about seven times faster than the impure one! The average 100 requests per second on the impure side stand against an average 765 requests per second on the pure side. Also the metrics regarding the response times support that. Regarding the memory usage we can see that the pure service needed about 13% more memory than the impure one. Living in times in which memory is cheap I consider this a small price to pay for a significant performance boost.
Last but not least I found it very interesting that the average system load was much higher (nearly twice as high) in the impure implementation. While it is okay to make use of your resources a lower utilisation allows more “breathing room” for other tasks (operating system, database, etc.).
The loading benchmark provides a more balanced picture. While the pure service is still slightly ahead (about 7% faster), it uses about 6% more memory than the impure one. Overall both implementations deliver nearly the same results. But again the pure one causes significantly lower system load like in the first benchmark.
Updating existing products results in nearly the same picture as the “create products” benchmark. Interestingly the impure service performs about 20% better on an update than on a create. I have no idea why but it caught my eye. The other metrics are as said nearly identical to the first benchmark. The pure service uses a bit more memory (around 8%) but is around 6 times faster than the impure one causing only half of the system load.
For our last benchmark we load all existing products via the
GET /products route. Because this causes a lot of load we reduce the number of threads in our JMeter configuration from 10 to 2 and only use 50 iterations. But enough talk, here are the numbers.
As you can see the difference in system load is way smaller this time. While it is still 25%, a load of 4 versus 5 on a machine like the test machine makes almost no difference. However the pure service is again faster (about 25%). Looking at the memory footprint we can see that the impure one uses nearly seven times as much memory as the pure one.
But before we burst into cheers about that let’s remember what we did in the impure implementation! Yes, we used the
groupBy operator of Akka which keeps a lot of stuff in memory so the fault for this is ours. ;-)
Because I’m not in the mood to mess with Akka until we’re on the memory safe side here, we’ll just ignore the memory footprint for this benchmark. In summary, the pure service is again faster than the impure one.
If you look around on the internet (and also in the literature) you’ll find a lot of sources stating that “functional programming is slow” or “functional programming does not perform” and so on. Well I would argue that we have proven that this is not the case! Although we cannot generalise our findings because we only took a look at a specific niche within the corner of a specific environment, I think this is pretty exciting!
Not only do you benefit from having code that can be reasoned about more easily but you gain better testing possibilities and in the end your application also performs better! :-)
We do pay some price (an increased memory footprint) for it because there is no free lunch, but it seems that it is worth it to work in a clean and pure fashion. So next time someone argues in favour of some dirty impure monstrosity because it is faster, just remember and tell ‘em that this might not be true!
Before we celebrate ourselves we have to tackle one missing point: We have to document our API.
No, it is not! Leaving the issues of properly documented code aside here, we will concentrate on documenting the API. The de facto standard these days seems to be using Swagger for this. To keep things simple we will stick to it. Besides that it won’t hurt to have some documentation in text form (a small file could be enough) which should explain the quirks of our API. The bigger the project the earlier you may encounter flaws in the logic which might not be changeable for whatever reasons there are. ;-)
Swagger provides a bit of tooling and there are likely lots of projects trying to bring it to your favourite web framework or tool kit. In many cases it might be a good idea to bundle the Swagger UI with your service and make it available on a specific path. Depending on your needs and environment you’ll want to protect that path via authentication and make it configurable for turning it off in production.
Having the UI you must decide which path you want to go down:
- Write a swagger.yml file which describes your API, create JSON from it and deliver that as an asset.
- Create the JSON dynamically via some library at runtime.
The first point has the benefit that your description will be more or less set in stone and you avoid possible performance impacts and other quirks at runtime. However you must think of a way to test that your service actually fulfils the description (read specification) that you deliver. The most common tool for writing your API description is probably the Swagger Editor3.
Taking the second path will result in a description which reflects your actual code. However none of the available tools I’ve seen so far has fulfilled that promise to 100 percent. You’ll very likely have to make extensive use of annotations to make your API description usable, resulting also in a decoupling of code and description. Also you might encounter “funny” deduced data types like Future1 and so on, which are annoying and confusing for the user. For those favouring the impure approach there is the swagger-akka-http library4.
The answer is yes! We can describe our API using static typing and have the compiler check it and can deduce server and client code from it.
Describing your API using types is not bleeding edge academic research stuff as you might have guessed. Several libraries exist for it! :-)
Personally I stumbled upon such things first some years ago when seeing the talk “Using object algebras to design embedded DSLs” (Curry On 2016)5. The related project is the library endpoints6. However there are other projects too including the rho library7 included in the http4s project. Another one is tapir8 which we will be using in our example.
First I wanted to use something which allows us to generate a http4s server. This already narrowed down the options a bit. Also it should be able to generate API documentation (which nowadays means Swagger/OpenAPI support). Furthermore it should support not only http4s but more options. So after playing around a bit I decided to use the tapir library.
We basically cloned our pure folder into the tapir folder and started to apply our changes to the already pure implementation. But first some theory.
The tapir library assumes that you describe your API using the Endpoint type which is more concretely defined as follows:
Endpoint[I, E, O, S]
- The type I defines the input given into the endpoint.
- The type E defines the error (or errors) which may be returned by the endpoint.
- The type O defines the possible output of the endpoint.
- The type S specifies the type of streams which are used for in- and output.
An endpoint can have attributes like description which will be used in the generated documentation. You can also map input and output parameters into case classes.
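To make the role of the four type parameters more tangible, here is a minimal sketch in plain Scala. Please note that EndpointSketch is a hypothetical stand-in written only for this illustration and is not tapir’s Endpoint. The types ProductId, Product and StatusCode are simplified placeholders as well.

```scala
import java.util.UUID

object EndpointShapeSketch {
  // Simplified placeholder types (assumptions, not the real models).
  type ProductId  = UUID
  type StatusCode = Int
  final case class Product(id: ProductId, name: String)

  // Hypothetical stand-in for an endpoint description. The type parameters
  // are only carried along: I = input, E = error, O = output, S = streams.
  final case class EndpointSketch[I, E, O, S](
      method: String,
      path: String,
      description: Option[String] = None
  ) {
    // Attach an attribute which a documentation generator could pick up.
    def describedAs(text: String): EndpointSketch[I, E, O, S] =
      EndpointSketch[I, E, O, S](method, path, Some(text))

    // The types force any server logic into the shape I => Either[E, O].
    def handle(logic: I => Either[E, O])(input: I): Either[E, O] = logic(input)
  }

  // No streaming is involved here, so the last parameter is Nothing.
  val getProduct: EndpointSketch[ProductId, StatusCode, Product, Nothing] =
    EndpointSketch[ProductId, StatusCode, Product, Nothing]("GET", "product/{id}")
      .describedAs("Returns a single product.")
}
```

The point of the sketch is that the description is an ordinary value and that the compiler rejects any logic passed to handle which does not match the declared input, error and output types.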
Having the basics settled we can try to write our first endpoint. Let’s refactor our product routes. We will define our endpoints in the companion object of the class.
So what do we have here? First we specify the HTTP method by using the get function of the endpoint. Now we need to define our path and inputs. We do this using the in helper which accepts path fragments separated by slashes and also a path[T] helper which allows us to extract a type directly from a path fragment. This way we define our entry point product/id in which id must match our ProductId type.
To be more flexible about our returned status codes we use the errorOut function which in our case just receives a status code (indicated by passing statusCode to it).
Finally we define that the endpoint will return the JSON representation of a product by using the jsonBody helper. This all is reflected in the actual type signature of our endpoint which reads Endpoint[ProductId, StatusCode, Product, Nothing]. If we remember the basics then we know that this amounts to an endpoint which takes a ProductId as input, produces a StatusCode as possible error and returns a Product upon success.
Our endpoint alone won’t do us any good so we need an actual server side implementation of it. While we could have used the serverLogic function to directly attach our logic onto the endpoint definition this would have nailed us down to a concrete server implementation.
So we’re going to implement it in the
We use the toRoutes helper of tapir which expects a function with the actual logic. As you can see the implementation is straightforward and only differs slightly from our original one. Currently there is no other way to handle our “not found” case than using the fold at the end. But if you remember, we did the same thing in the original code.
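The fold used for the “not found” case can be illustrated without any library. The following is a minimal sketch under stated assumptions: the in-memory map stands in for the repository, the status code is a plain Int, and the names loadProduct and getProductLogic are made up for this example.

```scala
object NotFoundFoldSketch {
  // Placeholder types for illustration only (assumptions).
  type StatusCode = Int
  val NotFound: StatusCode = 404
  final case class Product(id: Int, name: String)

  // Hypothetical in-memory lookup standing in for the repository.
  private val products = Map(1 -> Product(1, "Chair"))
  def loadProduct(id: Int): Option[Product] = products.get(id)

  // A missing product becomes the error (left) side, an existing one
  // becomes the success (right) side of the Either.
  def getProductLogic(id: Int): Either[StatusCode, Product] =
    loadProduct(id).fold[Either[StatusCode, Product]](Left(NotFound))(p => Right(p))
}
```

In plain Scala the same result could also be written as loadProduct(id).toRight(NotFound), which may read a bit nicer than the explicit fold.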
That was not that difficult, for something which some people like to talk about as “academic fantasies from fairy tale land”. ;-)
Onward to our next route: updating an existing product. First we need to define our endpoint.
This is only slightly more code than our first endpoint. We use the put method this time and the same logic as before to extract our product id from the path. But we also need our product input which we expect as JSON in the request body. The jsonBody function used is also extended with the description helper here which will provide data for a possibly generated documentation. We’ll come to generated API docs later on.
We also restrict our errors to status codes via the errorOut(statusCode) directive. Last but not least we have to define our output. By default a status code of “200 OK” will be used which is why we override it in the out function with the “204 No Content” preferred by us.
The implementation is again very similar to our original one, except that it is even a bit simpler. This is because we do not have to worry about wrongly encoded input (remember our handleErrorWith directive?). The tapir library will by default return a “400 Bad Request” status if any provided input cannot be decoded.
Within the pattern match we use the status codes provided by tapir and map the returned values to the correct type which is an Either[StatusCode, Unit]. This results from our endpoint type signature being Endpoint[(ProductId, Product), StatusCode, Unit, Nothing], which translates to having an input of both a ProductId and a Product and returning a StatusCode in the error case or Unit upon success.
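The interplay of input decoding and our own logic can be sketched in plain Scala as follows. This is only a model under assumptions: status codes are plain Ints, the product id is taken to be a UUID, a row count of zero is mapped to “404 Not Found”, and decodeId as well as handleUpdate are hypothetical names for steps which the library would perform for us.

```scala
import java.util.UUID
import scala.util.Try

object UpdateLogicSketch {
  // Placeholder status codes (assumption: plain Ints instead of a library type).
  type StatusCode = Int
  val BadRequest: StatusCode = 400
  val NotFound: StatusCode   = 404

  // Step normally done by the library: undecodable input never reaches our logic.
  def decodeId(raw: String): Either[StatusCode, UUID] =
    Try(UUID.fromString(raw)).toOption.toRight(BadRequest)

  // Our logic: map the number of affected database rows to the endpoint types.
  def updateLogic(affectedRows: Int): Either[StatusCode, Unit] =
    affectedRows match {
      case 0 => Left(NotFound) // nothing was updated
      case _ => Right(())      // success, to be rendered as "204 No Content"
    }

  // Hypothetical glue: decode first, then run the logic.
  def handleUpdate(rawId: String, affectedRows: Int): Either[StatusCode, Unit] =
    decodeId(rawId).flatMap(_ => updateLogic(affectedRows))
}
```

Because the decoding failure is produced before updateLogic is ever called, the logic itself only has to deal with well typed values.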
Now we only need to combine both routes and we’re set.
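Combining routes boils down to “try the first one and fall back to the second one”. The sketch below models this idea with partial functions from the standard library. This is only an analogy: http4s routes are not partial functions, and there they are typically combined with the <+> operator from Cats, which is an assumption about the code used here. All names in the sketch are made up.

```scala
object CombineRoutesSketch {
  // Hypothetical model: a route maps (method, path) to a status code.
  type Route = PartialFunction[(String, String), Int]

  val getRoute: Route    = { case ("GET", "product") => 200 }
  val updateRoute: Route = { case ("PUT", "product") => 204 }

  // Try the first route and fall back to the second one.
  val routes: Route = getRoute.orElse(updateRoute)

  // Requests matched by no route end up as "404 Not Found".
  def serve(method: String, path: String): Int =
    routes.lift((method, path)).getOrElse(404)
}
```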
So, let us run our tests and see what happens.
Well, not what we expected, or is it? To be honest I personally expected more errors but maybe I’m just doing this stuff for too long. ;-)
If we look into the error we find that, because the encoding problems are now handled for us, the response not only contains a status code of “400 Bad Request” but also an error message: “Invalid value for: body”. Because I’m fine with that I just adjust the test and leave it at that. :-)
Pretty awesome, we have already half of our endpoints done. So let’s move on to the remaining ones and finally see how to generate documentation and also a client for our API.
As we can see the endpoint definition for creating a product does not differ from the one that was used to update one, except that we have a different path here and do not need to extract our ProductId from the URL path.
The implementation is again pretty simple. If the saveProduct function returns zero we output a “500 Internal Server Error” because the product has not been saved into the database.
Finally we have our streaming endpoint left, so let’s see how we can do this via tapir.
The first thing we can see is that we use a def instead of a val this time. This is caused by some necessities on the Scala side. If we want to abstract over a type parameter then we need to use a def.
We have also set the last type parameter not to Nothing but to something concrete this time. This is because we actually want to stream something. ;-)
It is a bit annoying that we have to define it two times (once for the output type and once for the “stream” type). Much nicer would be something like Endpoint[I, E, Byte, Stream[F, _]] but currently this is not the way we can do it.
So we again specify the HTTP method (via get) and the path (which is “products”). The errorOut helper once again restricts our error output to the status code. Finally we set the output of the endpoint by declaring a streaming entity (via
But it is sufficient and we also directly specify the returned media type to be JSON.
Again our implementation is quite the same compared to the original one, except that in the end we convert our stream of String into a stream of Byte using the utf8Encode helper from the fs2 library.
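What such a conversion from strings to bytes does can be sketched with the standard library alone. The following is a minimal model and not the fs2 implementation: a lazily consumed Iterator stands in for the stream and utf8Bytes is a hypothetical name.

```scala
import java.nio.charset.StandardCharsets

object Utf8EncodeSketch {
  // Plain Scala stand-in for a stream: an Iterator is consumed lazily,
  // one element at a time (assumption: this is NOT the fs2 Stream type).
  def utf8Bytes(strings: Iterator[String]): Iterator[Byte] =
    strings.flatMap(s => s.getBytes(StandardCharsets.UTF_8).iterator)
}
```

Keep in mind that the number of bytes is not the number of characters: an ASCII letter is encoded as one byte while a character like “é” takes two bytes in UTF-8.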
Damn, so close. But let’s keep calm and think. Or ask around on the internet. Which is totally fine. Actually it is all in the compiler error message.
We first convert our response explicitly into the right side of an Either because the left side is used for the error case. Afterwards we provide the needed Unit => ... function in which we lift our response value via pure into the context of
So let’s go crazy and simply combine our routes like in the previous part and run the test via testOnly *.ProductsRoutesTest on the sbt console.
Yes! Very nice, it seems like we are done with implementing our routes via tapir endpoints.
So, we can now look at documenting our API via OpenAPI using the tooling provided by tapir. But first we should actually modify our main application entry point to provide the documentation for us.
We use the toOpenAPI helper provided by tapir which generates a class structure describing our API from a list of given endpoints. Additionally we use the SwaggerHttp4s helper which includes the Swagger UI for simple documentation browsing. All of it is made available under the /docs path. So calling http://localhost:57344/docs with your browser should open the UI and the correct documentation.
But while browsing there we can see that it provides our models and endpoints but the documentation could be better. So what can we do about it?
The answer is simple: Use the helpers provided by tapir to add additional information to our endpoints.
Besides functions like name tapir also provides example which will result in having concrete examples in the documentation. To use this we must construct example values of the needed type. A Product example could look like this.
We can now use it in our product endpoint description.
As you can see we make use of example here. Also the path parameter id is described that way.
Here we also add a description to the simple status code output explaining explicitly that no content will be returned upon success. While the 204 status code should be enough to say this you can never be sure enough. ;-)
We’ll skip the create endpoint because it looks nearly the same as the update endpoint. Instead let’s take a look at our streaming endpoint.
This time we need to provide our example as a string because of the nature (read type) of our endpoint. We use the non empty list of examples that we created (you can look it up in ProductsRoutes.scala) and convert it into a JSON string.