Pure functional HTTP APIs in Scala
Jens Grassel

Foreword

Thank you for your interest in this book. I hope you’ll have a good time reading it and learn something from it.

About this book

This book is intended for the intermediate Scala programmer who is interested in functional programming and works mainly on the web service backend side. Ideally she has experience with libraries like Akka HTTP and Slick, which are in heavy use in that area.

However, maybe you have wondered whether we can do better, even though the aforementioned projects are battle tested and proven.

The answer to this can be found in this book, which is intended to be read from cover to cover in the given order. Within the book the following libraries will be used: Cats, Cats Effect, http4s, Doobie, Refined, fs2, tapir, Monocle and probably others. ;-)

This edition includes a chapter about migrating the project to Scala 3, which covers all the nasty issues that we tend to run into when we touch code after a longer time.

Code and book source can be found in the following repository: https://github.com/jan0sch/pfhais

Copyleft Notice

This book uses the Creative Commons Attribution ShareAlike 4.0 International (CC BY-SA 4.0) license. The code snippets in this book are licensed under CC0 which means you can use them without restriction. Excerpts from libraries maintain their license.

Thanks

I would like to thank my beloved wife and family who bear with me and make all of this possible.
Also I send a big thank you to all the nice people from the Scala community whom I’ve had the pleasure to meet. Special thanks go to Adam Warski (tapir), Frank S. Thomas (refined), Julien Truffaut (Monocle) and Ross A. Baker (http4s) for their help, advice and patience.

Our use case

For a better understanding we will implement a small use case in both an impure and a pure way. The following section will outline the specification.

Service specification

First we need to specify the exact scope and API of our service. We’ll design a service with a minimal API to keep things simple. It shall fulfil the following requirements.

The service shall provide HTTP API endpoints for:

  1. the creation of a product data type identified by a unique id
  2. adding translations for a product name by language code and unique id
  3. returning the existing translations for a product
  4. returning a list of all existing products with their translations

Data model

We will keep the model very simple to avoid going overboard with the implementation.

  1. A language code shall be defined by ISO 639-1 (i.e. a two letter code).
  2. A translation shall contain a language code and a product name (non-empty string).
  3. A product shall contain a unique id (UUID version 4) and a list of translations.

Database

The data will be stored in a relational database (RDBMS). Therefore we need to define the tables and relations within the database.

The products table

The table products must contain only the unique id which is also the primary key.

The names table

The table names must contain a column for the product id, one for the language code and one for the name. Its primary key is the combination of the product id and the language code. All columns must not be null. The relation to the products is realised by a foreign key constraint to the products table via the product id.

HTTP API

The HTTP API shall provide the following endpoints on the given paths:

Path              HTTP method   Function
/products         POST          Create a product.
/products         GET           Get all products and translations.
/product/{UUID}   PUT           Add translations.
/product/{UUID}   GET           Get all translations for the product.

The data shall be encoded in JSON using the following specification:

JSON for a translation
1 {
2   "lang": "ISO-639-1 Code",
3   "name": "A non empty string."
4 }
JSON for a product
1 {
2   "id": "The-UUID-of-the-product",
3   "names": [
4     // A list of translations.
5   ]
6 }

This should be enough to get us started.

The state of the art

Within the Scala ecosystem the Akka-HTTP library is a popular choice for implementing server side backends for HTTP APIs. Another quite popular option is the Play framework, but using a full blown web framework just to provide a thin API is overkill in most cases. As most services need a database, the Slick library is another popular choice which completes the picture.

However while all mentioned libraries are battle tested and proven they still have problems.

Problems

In the domain of functional programming we want referential transparency, which we will define in the following way: an expression is referentially transparent if it can be replaced by its evaluated result (and vice versa) without changing the behaviour of the program.

Building on that we need pure functions, which

  1. depend only on their input and
  2. have no side effects.

This means in turn that our functions will be referentially transparent.
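
Before we look at the libraries, a tiny illustration of the difference (hypothetical functions, not from the book code):

Pure versus impure functions
// Pure: the result depends only on the inputs and nothing else happens.
def add(a: Int, b: Int): Int = a + b

// Impure: besides computing the result it performs I/O (printing), so
// replacing a call with its result value changes the behaviour of the program.
def addAndLog(a: Int, b: Int): Int = {
  println(s"adding $a and $b")
  a + b
}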

But the mentioned libraries are built upon the Future type from Scala, which uses eager evaluation and breaks referential transparency. Let’s look at an example.

Future example 1
1 import scala.concurrent.Future
2 import scala.concurrent.ExecutionContext.Implicits.global
3 
4 for {
5   _ <- Future { println("Hi there!") }
6   _ <- Future { println("Hi there!") }
7 } yield ()

The code above will print the text Hi there! two times. But how about the following one?

Future example 2
1 import scala.concurrent.Future
2 import scala.concurrent.ExecutionContext.Implicits.global
3 
4 val printF = Future { println("Hi there!") }
5 
6 for {
7   _ <- printF
8   _ <- printF
9 } yield ()

Instead of printing the text two times it will print it only once, and it does so even if printF is never used at all (try omitting the for comprehension). This means that Future breaks referential transparency!
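
As a side note: if we bind the Future with a def instead of a val (a small sketch, not from the book code), each use re-evaluates the right hand side and we are back to two printed lines. This shows that the val and the expression it is bound to are not interchangeable, which is exactly what referential transparency would require.

Future example with def
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// A def re-evaluates its body on every call, so each use
// schedules a new (eagerly running) Future and prints again.
def printF = Future { println("Hi there!") }

for {
  _ <- printF
  _ <- printF
} yield ()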

Maybe there is another way

If we want referential transparency, we must push the side effects to the boundaries of our system (program) which can be done by using lazy evaluation. Let’s repeat the previous example in a different way.

IO example 1
1 import cats.effect.IO
2 import cats.implicits._
3 
4 val effect = for {
5   _ <- IO(println("Hi there!"))
6   _ <- IO(println("Hi there!"))
7 } yield ()

The above code will produce no output. Only if we evaluate the variable effect, which is of type IO[Unit], will the output be generated (try effect.unsafeRunSync in the REPL). Also the second approach now works as expected.

IO example 2
1 import cats.effect.IO
2 import cats.implicits._
3 
4 val printF = IO(println("Hi there!"))
5 
6 val effect = for {
7   _ <- printF
8   _ <- printF
9 } yield ()

Suddenly we can reason about our code much more easily! And why is that? Well, we don’t have unexpected side effects caused by code running even when it doesn’t need to. This is a sneak peek at what pure code looks like. Now we only need to implement pure libraries for our use, or do we?

Luckily for us there are meanwhile several pure options available in the Scala ecosystem. We will stick to the Cats family of libraries, namely http4s and Doobie, as replacements for Akka-HTTP and Slick. They build upon the Cats Effect library, which is an implementation of an IO monad for Scala. Some other options exist but we’ll stick to the one from Cats.

To be able to contrast both ways of implementing a service we will first implement it using Akka-HTTP and Slick and will then migrate to http4s and Doobie.

Impure implementation

We’ll be using the following libraries for the impure version of the service:

  1. Akka (including Akka-HTTP and Akka-Streams)
  2. Slick (as database layer)
  3. Flyway for database migrations (or evolutions)
  4. Circe for JSON codecs and akka-http-json as wrapper
  5. Refined for using refined types
  6. the PostgreSQL JDBC driver

I’ll spare you the sbt setup as you can look that up in the code repository (i.e. the impure folder in the book repo).

Models

First we’ll implement our models, which are simple and straightforward. To begin with we need a class to store our translations, or rather a single translation.

1 final case class Translation(lang: String, name: String)

Technically it is okay but we have a bad feeling about it. Using Option[String] is of no use because both fields have to be set. But a String can always be null and contain a lot of unexpected stuff (literally anything).

So let us define some refined types which we can use later on. First we need a language code which obeys the restrictions of ISO 639-1, and we need a stronger definition for a product name. For the former we use a regular expression and for the latter we simply expect a string which is not empty.

Refined types for models
1 type LanguageCode = String Refined MatchesRegex[W.`"^[a-z]{2}$"`.T]
2 type ProductName = String Refined NonEmpty
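
To get a feeling for how such refined values behave at runtime, here is a small sketch (assuming the usual refined imports; not part of the book code). Invalid input is rejected via an Either instead of an exception:

Validating values with refined
import eu.timepit.refined.api.Refined
import eu.timepit.refined.collection.NonEmpty
import eu.timepit.refined.refineV

// Right(...): the string satisfies the NonEmpty predicate.
val ok: Either[String, String Refined NonEmpty] = refineV[NonEmpty]("Apple")

// Left(...): the empty string is rejected with an error message.
val bad: Either[String, String Refined NonEmpty] = refineV[NonEmpty]("")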

Now we can give our translation model another try.

Translation model using refined types
1 final case class Translation(lang: LanguageCode, name: ProductName)

Much better and while we’re at it we can also write the JSON codecs using the refined module of the Circe library. We put them into the companion object of the model.

1 object Translation {
2   implicit val decode: Decoder[Translation] =
3     Decoder.forProduct2("lang", "name")(Translation.apply)
4 
5   implicit val encode: Encoder[Translation] =
6     Encoder.forProduct2("lang", "name")(t => (t.lang, t.name))
7 }
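
A quick usage sketch with hypothetical values (assuming the codecs for the refined types from Circe’s refined module are in scope, as mentioned above):

Using the Translation JSON codec
import io.circe.parser.decode
import io.circe.syntax._

// Decoding valid JSON yields a Right with the model...
val ok = decode[Translation]("""{"lang":"de","name":"Apfel"}""")

// ...while an invalid language code is rejected by the refined decoder.
val bad = decode[Translation]("""{"lang":"deutsch","name":"Apfel"}""")

// Encoding goes the other way round.
val json = ok.map(_.asJson.noSpaces)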

Now onwards to the product model. Because we already know of refined types we can use them from start here.

1 type ProductId = String Refined Uuid
2 final case class Product(id: ProductId, names: List[Translation])

If we look closely we realise that a List may be empty, which is valid for a list but not for our product, because we need at least one entry. Luckily for us the Cats library has us covered with the NonEmptyList data type. Including the JSON codecs this leads us to our final implementation.
Last but not least we really should be using the existing UUID data type instead of rolling our own refined string version - even if the latter is cool. ;-)

Product model using UUID type and NeL
 1 type ProductId = java.util.UUID
 2 final case class Product(id: ProductId, names: NonEmptyList[Translation])
 3 
 4 object Product {
 5   implicit val decode: Decoder[Product] =
 6     Decoder.forProduct2("id", "names")(Product.apply)
 7 
 8   implicit val encode: Encoder[Product] =
 9     Encoder.forProduct2("id", "names")(p => (p.id, p.names))
10 }

We kept the type name ProductId by using a type alias. This is convenient, but remember that a type alias does not add extra type safety (e.g. type Foo = String is just a String).
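
A minimal illustration of that caveat (hypothetical code):

Type aliases add no safety
type Foo = String

// Compiles fine: any String is accepted where a Foo is expected.
val f: Foo = "definitely not a foo"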

But are we really done? Well, a list may contain duplicate entries while the database surely will not, because of its unique constraints. So, let’s switch to a NonEmptySet which is also provided by Cats.

Product model using UUID and NeS
1 type ProductId = java.util.UUID
2 final case class Product(id: ProductId, names: NonEmptySet[Translation])

Now we have the models covered and can move on to the database layer.

Database layer

The database layer should provide programmatic access to the database, but it should also manage changes to the database structure. The latter is called migrations or evolutions. From the available options we chose Flyway as the tool to manage our database schema.

Migrations

Flyway uses raw SQL scripts which have to be put into a certain location, in our case /db/migration (under the resources folder). Also the files have to be named like VXX__some_name.sql (XX being a number), starting with V1. Please note that there are two underscores between the version prefix and the rest of the name! Because our database schema is very simple we’re done quickly:

Flyway migration for creating the database
 1 CREATE TABLE "products" (
 2   "id" UUID NOT NULL,
 3   CONSTRAINT "products_pk" PRIMARY KEY ("id")
 4 );
 5 
 6 CREATE TABLE "names" (
 7   "product_id" UUID       NOT NULL,
 8   "lang_code"  VARCHAR(2) NOT NULL,
 9   "name"       TEXT       NOT NULL,
10   CONSTRAINT "names_pk" 
11     PRIMARY KEY ("product_id", "lang_code"),
12   CONSTRAINT "names_product_id_fk" 
13     FOREIGN KEY ("product_id") 
14     REFERENCES "products" ("id") 
15     ON DELETE CASCADE ON UPDATE CASCADE
16 );

In the code you’ll see that we additionally set comments, which are omitted from the snippet above. This might be overkill here but it is a very handy feature and I advise you to use it for more complicated database schemas, because the right comment (read: information) in the right place might save a lot of time when trying to understand things.

Next we move on to the programmatic part, which first needs a configuration of our database connection. With Slick you have a multitude of options but we’ll use the “Typesafe Config” approach.

Database configuration in application.conf
 1 database {
 2   profile = "slick.jdbc.PostgresProfile$"
 3   db {
 4     connectionPool = "HikariCP"
 5     dataSourceClass = "org.postgresql.ds.PGSimpleDataSource"
 6     properties {
 7       serverName = "localhost"
 8       portNumber = "5432"
 9       databaseName = "impure"
10       user = "impure"
11       password = "secret"
12     }
13     numThreads = 10
14   }
15 }

After we have this in place we can run the migrations via the API of Flyway. For this we have to load the configuration (we do it by creating an actor system), extract the needed information, create a JDBC url and use that together with username and password to obtain a Flyway instance. On that instance we simply call the method migrate() which will do the right thing: it will check if the schema exists and decide to either create it, apply pending migrations or simply do nothing. The method returns the number of applied migrations.

Apply database migrations via Flyway
 1 implicit val system: ActorSystem    = ActorSystem()
 2 implicit val mat: ActorMaterializer = ActorMaterializer()
 3 implicit val ec: ExecutionContext   = system.dispatcher
 4 
 5 val url = "jdbc:postgresql://" +
 6 system.settings.config.getString("database.db.properties.serverName") +
 7 ":" + system.settings.config.getString("database.db.properties.portNumber") +
 8 "/" + system.settings.config.getString("database.db.properties.databaseName")
 9 val user = system.settings.config.getString("database.db.properties.user")
10 val pass = system.settings.config.getString("database.db.properties.password")
11 val flyway = Flyway.configure().dataSource(url, user, pass).load()
12 val _ = flyway.migrate()

Let us continue to dive into the Slick table definitions.

Slick tables

Slick offers several options for approaching the database. For our example we will be using the lifted embedding, but if needed Slick also provides the ability to perform plain SQL queries.
For the lifted embedding we have to define our tables in a way Slick can understand. While this can be tricky under certain circumstances, our simple model is straightforward to implement.

Slick product table definition
1 final class Products(tag: Tag) extends Table[(UUID)](tag, "products") {
2   def id = column[UUID]("id", O.PrimaryKey)
3 
4   def * = (id)
5 }
6 val productsTable = TableQuery[Products]

As you can see above we’re using simple data types (not the refined ones) to keep the Slick implementation easier. However, we could also use refined types, at the price of either pulling in the slick-refined library or writing custom column mappers.
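
For illustration, a hedged sketch of what such a custom column mapper could look like for our LanguageCode type (this is an assumption, not the book’s code; the slick-refined library ships comparable mappings):

Custom Slick column mapper for LanguageCode (sketch)
import eu.timepit.refined.api.Refined
import slick.jdbc.PostgresProfile.api._

// Store the refined type in a plain String column. Reading back uses
// unsafeApply, i.e. we trust that the database only contains valid codes.
implicit val languageCodeColumnType: BaseColumnType[LanguageCode] =
  MappedColumnType.base[LanguageCode, String](
    _.value,
    s => Refined.unsafeApply(s)
  )
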
Next we’ll implement the table for the translations which will also need some constraints.

Slick translations table definition
 1 final class Names(tag: Tag) extends Table[(UUID, String, String)](tag, "names") {
 2   def productId = column[UUID]("product_id")
 3   def langCode  = column[String]("lang_code")
 4   def name      = column[String]("name")
 5 
 6   def pk = primaryKey("names_pk", (productId, langCode))
 7   def productFk =
 8     foreignKey("names_product_id_fk", productId, productsTable)(
 9       _.id,
10       onDelete = ForeignKeyAction.Cascade,
11       onUpdate = ForeignKeyAction.Cascade
12     )
13 
14   def * = (productId, langCode, name)
15 }
16 val namesTable = TableQuery[Names]

As you can see the definition of constraints is also pretty simple. Now our repository needs some functions for a more convenient access to the data.

Slick repository functions
 1 def loadProduct(id: ProductId): Future[Seq[(UUID, String, String)]] = {
 2   val program = for {
 3     (p, ns) <- productsTable
 4       .filter(_.id === id)
 5       .join(namesTable)
 6       .on(_.id === _.productId)
 7   } yield (p.id, ns.langCode, ns.name)
 8   dbConfig.db.run(program.result)
 9 }
10 
11 def loadProducts(): DatabasePublisher[(UUID, String, String)] = {
12   val program = for {
13     (p, ns) <- productsTable.join(namesTable)
14                  .on(_.id === _.productId).sortBy(_._1.id)
15   } yield (p.id, ns.langCode, ns.name)
16   dbConfig.db.stream(program.result)
17 }
18 
19 def saveProduct(p: Product): Future[List[Int]] = {
20   val cp      = productsTable += (p.id)
21   val program = DBIO.sequence(
22     cp :: saveTranslations(p).toList
23   ).transactionally
24   dbConfig.db.run(program)
25 }
26 
27 def updateProduct(p: Product): Future[List[Int]] = {
28   val program = namesTable
29     .filter(_.productId === p.id)
30     .delete
31     .andThen(DBIO.sequence(saveTranslations(p).toList))
32     .transactionally
33   dbConfig.db.run(program)
34 }
35 
36 protected def saveTranslations(p: Product): NonEmptyList[DBIO[Int]] = {
37   val save = saveTranslation(p.id)(_)
38   p.names.toNonEmptyList.map(t => save(t))
39 }
40 
41 /**
42   * Create a query to insert or update a given translation in the database.
43   *
44   * @param id The unique ID of the product.
45   * @param t  The translation to be saved.
46   * @return A composable sql query for Slick.
47   */
48 protected def saveTranslation(id: ProductId)(t: Translation): DBIO[Int] =
49   namesTable.insertOrUpdate((id, t.lang, t.name))

The last two functions are helpers which enable us to create queries that we can compose. They are used in the saveProduct and updateProduct functions to create a list of queries which is executed in bulk, while the call to transactionally ensures that they run within a transaction. When updating a product we first delete all existing translations, to allow the removal of translations via an update. To be able to do so we use the andThen helper from Slick.
The loadProduct function simply returns a list of database rows from the needed join. Therefore we need a function which builds a Product out of those rows.

Helper function to create a Product
 1 def fromDatabase(rows: Seq[(UUID, String, String)]): Option[Product] = {
 2   val po = for {
 3     (id, c, n) <- rows.headOption
 4     t          <- Translation.fromUnsafe(c)(n)
 5     p          <- Product(
 6                     id = id,
 7                     names = NonEmptySet.one[Translation](t)
 8                   ).some
 9   } yield p
10   po.map(
11     p =>
12       rows.drop(1).foldLeft(p) { (a, cols) =>
13         val (id, c, n) = cols
14         Translation.fromUnsafe(c)(n).fold(a)(t =>
15           a.copy(names = a.names.add(t))
16         )
17     }
18   )
19 }

But oh no! The compiler refuses to build it:

Missing cats.Order
1 [error] .../impure/models/Product.scala:45:74:
2   could not find implicit value for parameter A: 
3     cats.kernel.Order[com.wegtam.books.pfhais.impure.models.Translation]
4 [error] p <- Product(id = id, names = NonEmptySet.one[Translation](t)).some
5 [error]                                                            ^

It seems we have to provide an instance of Order for our Translation model to make Cats happy. So we have to think of an ordering for our model. A simple approach would be to order by the language code. Let’s try this:

Providing Order for LanguageCode
1 import cats._
2 import cats.syntax.order._
3 
4 implicit val orderLanguageCode: Order[LanguageCode] = 
5   new Order[LanguageCode] {
6     def compare(x: LanguageCode, y: LanguageCode): Int =
7       x.value.compare(y.value)
8   }

You might have noticed the explicit call to .value to get the underlying string instance of our refined type. This is needed because the other option (using x.compare(y)) will compile but bless you with stack overflow errors. The reason is probably that the latter is compiled into code calling OrderOps#compare which is recursive.

Providing Order for Translation
1 import cats._
2 import cats.syntax.order._
3 
4 implicit val order: Order[Translation] =
5   new Order[Translation] {
6     def compare(x: Translation, y: Translation): Int =
7       x.lang.compare(y.lang)
8   }
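
With the Order instance in scope the NonEmptySet from the compiler error above can finally be constructed. A small sketch with hypothetical values (assuming eu.timepit.refined.auto._ is imported so the string literals are refined at compile time):

Using NonEmptySet with the Order instance
import cats.data.NonEmptySet
import eu.timepit.refined.auto._

val t = Translation(lang = "de", name = "Apfel")

// Duplicates are collapsed because the set relies on Order[Translation].
val names = NonEmptySet.of(t, t)
// names.length == 1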

So far we should have everything in place to make use of our database. Now we need to wire it all together.

Akka-HTTP routes

Defining the routes is pretty simple if you’re used to the Akka-HTTP routing DSL syntax.

Basic routes with Akka-HTTP
 1 val route = path("product" / JavaUUID) { id: ProductId =>
 2   get {
 3     ???
 4   } ~ put {
 5     ???
 6   }
 7 } ~ path("products") {
 8   get {
 9     ???
10   } ~
11   post {
12     ???
13   }
14 }

We will fill in the details later on. But now for starting the actual server to make use of our routes.

Starting an Akka-HTTP server
1 val host       = system.settings.config.getString("api.host")
2 val port       = system.settings.config.getInt("api.port")
3 val srv        = Http().bindAndHandle(route, host, port)
4 val pressEnter = StdIn.readLine()
5 srv.flatMap(_.unbind()).onComplete(_ => system.terminate())

The code will fire up a server using the defined routes, with hostname and port taken from the configuration. It will run until you press enter and then terminate. Let us now visit the code for each routing endpoint. We will start with the one returning a single product.

Returning a single product
 1 path("product" / JavaUUID) { id: ProductId =>
 2   get {
 3     complete {
 4       for {
 5         rows <- repo.loadProduct(id)
 6         prod <- Future { Product.fromDatabase(rows) }
 7       } yield prod
 8     }
 9   }
10 }

We load the raw product data from the repository and convert it into a proper product model. To make the types align we have to wrap the second call in a Future, otherwise we would get a compiler error. We also don’t need to marshal the response ourselves because we are using the akka-http-json library, which provides for example an ErrorAccumulatingCirceSupport import that handles this, unless of course you do not have Circe codecs defined for your types.

Updating a single product
1 val route = path("product" / JavaUUID) { id: ProductId =>
2   put {
3     entity(as[Product]) { p =>
4       complete {
5         repo.updateProduct(p)
6       }
7     }
8   }
9 }

The route for updating a product is also very simple. We extract the product entity from the request body via the entity(as[T]) directive and simply pass it to the appropriate repository function. Now onwards to creating a new product.

Creating a product
1 path("products") {
2   post {
3     entity(as[Product]) { p =>
4       complete {
5         repo.saveProduct(p)
6       }
7     }
8   }
9 }

As you can see the function is basically the same except that we’re calling a different function from the repository. Last but not least let us take a look at the return all products endpoint.

Return all products
 1 path("products") {
 2   get {
 3     complete {
 4       val products = for {
 5         rows <- repo.loadProducts()
 6         ps <- Future {
 7           rows.toList.groupBy(_._1).map {
 8             case (_, cols) => Product.fromDatabase(cols)
 9           }
10         }
11       } yield ps
12       products.map(_.toList.flatten)
13     }
14   }
15 }

This looks more complicated than the other endpoints. So what exactly are we doing here?
Well, first we load the raw product data from the repository. Afterwards we convert it into the proper data model, or to be more exact, into a list of product entities.

The first thing that comes to mind is that we’re performing operations in memory. This is not different from the last time when we converted the data for a single product. Now however we’re talking about all products which may be a lot of data. Another obvious point is that we get a list of Option[Product] which we explicitly flatten at the end.

Maybe we should consider streaming the results. But we still have to group and combine the rows which belong to a single product into a product entity. Can we achieve that with streaming? Well, let’s look at our data flow.
We receive a list of three columns from the database in the following format: product id, language code, name. The tricky part is that multiple rows (list entries) can belong to the same product, recognizable by the same value in the first column (the product id). First we should simplify our problem by ensuring that the list is sorted by the product id. This is done by adjusting the function loadProducts in the repository.

Sort the returned list of entries.
1 def loadProducts(): DatabasePublisher[(UUID, String, String)] = {
2   val program = for {
3     (p, ns) <- productsTable.join(namesTable).on(_.id === _.productId)
4                .sortBy(_._1.id)
5   } yield (p.id, ns.langCode, ns.name)
6   dbConfig.db.stream(program.result)
7 }

Now we can rely on the fact that we have seen all entries for one product once the product id in our list changes. Let’s adjust our code in the endpoint to make use of streaming. Because Akka-HTTP is based on Akka-Streams we can simply use that.

Return all products as stream
 1 path("products") {
 2   get {
 3     implicit val jsonStreamingSupport: JsonEntityStreamingSupport =
 4       EntityStreamingSupport.json()
 5 
 6     val src = Source.fromPublisher(repo.loadProducts())
 7     val products: Source[Product, NotUsed] = src
 8       .collect(
 9         cs =>
10           Product.fromDatabase(Seq(cs)) match {
11             case Some(p) => p
12         }
13       )
14       .groupBy(Int.MaxValue, _.id)
15       .fold(Option.empty[Product])(
16         (op, x) => op.fold(x.some)(p =>
17           p.copy(names = p.names ::: x.names).some
18         )
19       )
20       .mergeSubstreams
21       .collect(
22         op =>
23           op match {
24             case Some(p) => p
25         }
26       )
27     complete(products)
28   }
29 }

Wow, this may look scary but let’s break it apart piece by piece. First we need an implicit value which provides streaming support for JSON. Next we create a Source from the database stream. Then we implement the processing logic via the high level streams API. We collect every defined output of our helper function fromDatabase, which gives us a stream of Product entities. But we have created way too many (each product is emitted as often as it has translations). So we group our stream by the product id, which creates a new sub-stream per product id holding only the entities for that specific product. We fold over each of these streams, merging together the lists of translations (names). Afterwards we merge the sub-streams back together and run another collect to get a result stream of Product instead of Option[Product]. Last but not least the stream is passed to the complete function which will do the right thing.

Problems with the solution

The solution has two problems:

  1. The number of individual streams (and thus products) is limited to Int.MaxValue.
  2. The groupBy operator holds the references to these streams in memory opening a possible out of memory issue here.

As the first problem is simply related to the usage of groupBy we may say that we only have one problem: The usage of groupBy. ;-)
For a limited amount of data the proposed solution is perfectly fine so we will leave it as is for now.

Regarding the state of our service we have a working solution, so congratulations and let’s move on to the pure implementation.

Pure implementation

Like in the previous section I will spare you the details of the sbt setup. We will be using the following set of libraries:

  1. http4s
  2. Doobie (as database layer)
  3. Flyway for database migrations (or evolutions)
  4. Circe for JSON codecs
  5. Refined for using refined types
  6. the PostgreSQL JDBC driver
  7. pureconfig (for proper configuration loading)

Pure configuration handling

Last time we simply loaded our configuration via the Typesafe Config library, but can’t we do a bit better here? The answer is yes, by using the pureconfig library. We start by implementing the necessary parts of our configuration as data types.

Configuration data types
 1 final case class ApiConfig(host: NonEmptyString, port: PortNumber)
 2 
 3 object ApiConfig {
 4   implicit val configReader: ConfigReader[ApiConfig] =
 5     deriveReader[ApiConfig]
 6 }
 7 
 8 final case class DatabaseConfig(driver: NonEmptyString,
 9                                 url: DatabaseUrl,
10                                 user: DatabaseLogin,
11                                 pass: DatabasePassword)
12 
13 object DatabaseConfig {
14   implicit val configReader: ConfigReader[DatabaseConfig] =
15     deriveReader[DatabaseConfig]
16 }

As we can see the code is pretty simple. The implicits in the companion objects are needed for pureconfig to actually map from a configuration to our data types. We use the function deriveReader, which will derive (like in mathematics) the reader for us (yes, it is similar to deriving a JSON codec, hence the wording).
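
A short usage sketch, assuming an application.conf with matching api and database sections (the same call shows up again in the main entry point later on):

Loading the configuration with pureconfig
import com.typesafe.config.ConfigFactory
import pureconfig.loadConfigOrThrow

val cfg       = ConfigFactory.load
val apiConfig = loadConfigOrThrow[ApiConfig](cfg, "api")
val dbConfig  = loadConfigOrThrow[DatabaseConfig](cfg, "database")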

Below is an example of deriving an Order instance using the kittens library. It uses shapeless under the hood and provides automatic and semi-automatic derivation for a lot of type class instances from Cats, like Eq, Order, Show, Functor and so on.

Deriving Order via kittens
1 import cats._
2 import cats.derived
3 
4 implicit val order: Order[Translation] = {
5   import derived.auto.order._
6   derived.semi.order[Translation]
7 }

Models

Because we have already written our models we just re-use them here. The only thing we change is switching to semi-automatic derivation of the JSON codecs. We just need to import the appropriate Circe package and call the derive functions.

Derive JSON codecs
1 import io.circe._
2 import io.circe.generic.semiauto._
3 
4 implicit val decode: Decoder[Product] = deriveDecoder[Product]
5 implicit val encode: Encoder[Product] = deriveEncoder[Product]
6 implicit val decode: Decoder[Translation] = deriveDecoder[Translation]
7 implicit val encode: Encoder[Translation] = deriveEncoder[Translation]

Database layer

In general the same applies to the database layer as we have already read in the “impure” section.

Migrations

For the sake of simplicity we will stick to Flyway for our database migrations. However we will wrap the migration code in a different way (read: encapsulate it properly within an IO to defer side effects). While we’re at it we may just as well write our migration code using the interpreter pattern, which became famous in Scala under the name “tagless final”.

Database migrator base
1 trait DatabaseMigrator[F[_]] {
2   def migrate(url: DatabaseUrl,
3               user: DatabaseLogin,
4               pass: DatabasePassword): F[Int]
5 }

We define a trait which describes the functionality desired from our interpreter and use a higher kinded type parameter to be able to abstract over the effect type. But now let’s continue with our Flyway interpreter.

Flyway migrator interpreter
 1 final class FlywayDatabaseMigrator extends DatabaseMigrator[IO] {
 2   override def migrate(url: DatabaseUrl,
 3                        user: DatabaseLogin,
 4                        pass: DatabasePassword): IO[Int] =
 5     IO {
 6       val flyway: Flyway = Flyway.configure()
 7         .dataSource(url, user, pass)
 8         .load()
 9       flyway.migrate()
10     }
11 }

As we can see, the implementation is pretty simple and we just wrap our code into an IO monad to constrain the effect. Having the migration code settled we can move on to the repository.
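
One benefit of keeping the effect type abstract is that other interpreters can be plugged in later, for example in tests. A purely hypothetical no-op interpreter (not part of the book code) could look like this:

A no-op migrator for tests (sketch)
import cats.Applicative
import cats.syntax.applicative._

final class NoOpDatabaseMigrator[F[_]: Applicative] extends DatabaseMigrator[F] {
  // Pretend that zero migrations were applied, without touching any database.
  override def migrate(url: DatabaseUrl,
                       user: DatabaseLogin,
                       pass: DatabasePassword): F[Int] = 0.pure[F]
}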

If we take a closer look at the method definition of Flyway.migrate, we see this:

Method definition of Flyway.migrate
1 public int migrate() throws FlywayException

While IO will gladly defer side effects for us, it won’t stop the enclosed code from throwing exceptions. This is not that great, so what can we do about it?
Having an instance of MonadError in scope we could just use the .attempt function provided by it. But is this enough, or rather, does it provide a sensible solution for us? Let’s play a bit in the REPL.

MonadError on the REPL
 1 @ import cats._, cats.effect._, cats.implicits._
 2 @ val program = for {
 3            _ <- IO(println("one"))
 4            _ <- IO(println("two"))
 5            x <- IO.pure(42)
 6            } yield x
 7 @ program.attempt.unsafeRunSync match {
 8            case Left(e) =>
 9              println(e.getMessage)
10              -1
11            case Right(r) => r
12            }
13 one
14 two
15 res3: Int = 42
16 @ val program = for {
17            _ <- IO(println("one"))
18            _ <- IO(throw new Error("BOOM!"))
19            x <- IO.pure(42)
20            } yield x
21 @ program.attempt.unsafeRunSync match {
22            case Left(e) =>
23              println(e.getMessage)
24              -1
25            case Right(r) => r
26            }
27 one
28 BOOM!
29 res5: Int = -1

This looks like we just have to use MonadError then. Hurray, we don’t need to change our code in the migrator. As model citizens of the functional programming camp we just defer the responsibility upwards to the calling site.

Doobie

As we already started with using a tagless final approach we might as well continue with it and define a base for our repository.

Base trait for the repository
1 trait Repository[F[_]] {
2   def loadProduct(id: ProductId): F[Seq[(ProductId, LanguageCode, ProductName)]]
3 
4   def loadProducts(): Stream[F, (ProductId, LanguageCode, ProductName)]
5 
6   def saveProduct(p: Product): F[Int]
7 
8   def updateProduct(p: Product): F[Int]
9 }

There is nothing exciting here except that we feel brave now and try to use proper refined types in our database functions. This is possible due to the usage of the doobie-refined module. To be able to map the UUID data type (and others) we also need to include the doobie-postgresql module. For convenience we are still using ProductId instead of UUID in our definition. In addition we wire the return type of loadProducts to be a fs2.Stream because we want to achieve pure functional streaming here. :-)
So let’s see what a repository using doobie looks like.

The doobie repository.
 1 final class DoobieRepository[F[_]: Sync](tx: Transactor[F])
 2   extends Repository[F] {
 3 
 4   override def loadProduct(id: ProductId) = ???
 5 
 6   override def loadProducts() = ???
 7 
 8   override def saveProduct(p: Product) = ???
 9 
10   override def updateProduct(p: Product) = ???
11 }

We keep our higher kinded type as abstract as we can, but we want it to be able to suspend our side effects. Therefore we require an implicit Sync instance.
If we look at the detailed function definitions further below, the first big difference is that with doobie you write plain SQL queries. You can do this with Slick too, but with doobie it is the only way. If you’re used to object relational mapping (ORM) or other forms of query compilers then this may seem strange at first. But: “In data processing it seems, all roads eventually lead back to SQL!” ;-)
We won’t discuss the benefits or drawbacks here, but in general I also lean towards using the de facto lingua franca for database access because it was made for this, and so far no query compiler has been able to beat hand crafted SQL in terms of performance. Another benefit is that if you ask a database guru for help, she will be much better able to help you with plain SQL queries than with some meta query which is compiled into something you have no idea of.

Loading a product.
1 override def loadProduct(id: ProductId) = 
2   sql"""SELECT products.id, names.lang_code, names.name 
3         FROM products
4         JOIN names ON products.id = names.product_id
5         WHERE products.id = $id"""
6     .query[(ProductId, LanguageCode, ProductName)]
7     .to[Seq]
8     .transact(tx)

The loadProduct function simply returns all rows for a single product from the database, like its Slick counterpart in the impure variant. The id parameter will be correctly interpolated by Doobie, therefore we don’t need to worry about SQL injection here. We specify the type of the query, instruct Doobie to collect the results into a sequence and hand the whole thing to the transactor.

Load all products
1 override def loadProducts() =
2   sql"""SELECT products.id, names.lang_code, names.name
3       FROM products
4       JOIN names ON products.id = names.product_id
5       ORDER BY products.id"""
6     .query[(ProductId, LanguageCode, ProductName)]
7     .stream
8     .transact(tx)

Our loadProducts function is equivalent to the first one but it returns the data for all products sorted by product and as a stream using the fs2 library which provides pure functional streaming.

Save a product
 1 override def saveProduct(p: Product): F[Int] = {
 2   val namesSql = 
 3     "INSERT INTO names (product_id, lang_code, name) VALUES (?, ?, ?)"
 4   val namesValues = p.names.map(t => (p.id, t.lang, t.name))
 5   val program = for {
 6     pi <- sql"INSERT INTO products (id) VALUES(${p.id})".update.run
 7     ni <- Update[(ProductId, LanguageCode, ProductName)](namesSql)
 8             .updateMany(namesValues)
 9   } yield pi + ni
10   program.transact(tx)
11 }

When saving a product we use monadic notation for our program to have it short circuit in case of failure. Doobie will also put all commands into a single database transaction. The function itself will first create the “master” entry in the products table and save all translations afterwards.

Update a product
 1 override def updateProduct(p: Product): F[Int] = {
 2   val namesSql =
 3     "INSERT INTO names (product_id, lang_code, name) VALUES (?, ?, ?)"
 4   val namesValues = p.names.map(t => (p.id, t.lang, t.name))
 5   val program = for {
 6     dl <- sql"DELETE FROM names WHERE product_id = ${p.id}".update.run
 7     ts <- Update[(ProductId, LanguageCode, ProductName)](namesSql)
 8             .updateMany(namesValues)
 9   } yield dl + ts
10   program.transact(tx)
11 }

The updateProduct function also uses monadic notation, like the saveProduct function we talked about before. The difference is that it first deletes all known translations before saving the given ones.

http4s routes

The routing DSL of http4s differs from the one of Akka-HTTP. Although I like the latter more, it poses no problem to model out a base for our routes.

Base for http4s routes
 1 val productRoutes: HttpRoutes[IO] = HttpRoutes.of[IO] {
 2   case GET -> Root / "product" / id =>
 3     ???
 4   case PUT -> Root / "product" / id =>
 5     ???
 6 }
 7 val productsRoutes: HttpRoutes[IO] = HttpRoutes.of[IO] {
 8   case GET -> Root / "products" =>
 9     ???
10   case POST -> Root / "products" =>
11     ???
12 }

As we can see the DSL is closer to plain Scala syntax and quite easy to read. But before we move on to the details of each route, let’s think about how we can model this a bit more abstractly. While it is fine to have our routes bound to IO, it would be better to have more flexibility. We have several options, but for starters we just extract our routes into their own classes, like in the following schema.

Routing classes
 1 final class ProductRoutes[F[_]: Sync](repo: Repository[F])
 2   extends Http4sDsl[F] {
 3 
 4   val routes: HttpRoutes[F] = HttpRoutes.of[F] {
 5     case GET -> Root / "product" / UUIDVar(id) =>
 6       ???
 7     case req @ PUT -> Root / "product" / UUIDVar(id) =>
 8       ???
 9   }
10 }
11 
12 final class ProductsRoutes[F[_]: Sync](repo: Repository[F]) 
13   extends Http4sDsl[F] {
14 
15   val routes: HttpRoutes[F] = HttpRoutes.of[F] {
16     case GET -> Root / "products" =>
17       ???
18     case req @ POST -> Root / "products" =>
19       ???
20   }
21 }

So far they only need the repository to access and manipulate data. Now let’s take on the single route implementations.

Product routes
 1 final class ProductRoutes[F[_]: Sync](repo: Repository[F])
 2   extends Http4sDsl[F] {
 3   implicit def decodeProduct = jsonOf
 4   implicit def encodeProduct = jsonEncoderOf
 5 
 6   val routes: HttpRoutes[F] = HttpRoutes.of[F] {
 7     case GET -> Root / "product" / UUIDVar(id) =>
 8       for {
 9         rows <- repo.loadProduct(id)
10         resp <- Ok(Product.fromDatabase(rows))
11       } yield resp
12     case req @ PUT -> Root / "product" / UUIDVar(id) =>
13       for {
14         p <- req.as[Product]
15         _ <- repo.updateProduct(p)
16         r <- NoContent()
17       } yield r
18   }
19 }

First we need to bring the JSON codecs into scope for http4s, hence the implicit definitions at the top. In the route for loading a single product we simply load the database rows, pipe them through our helper function to construct a proper Product and return that.
The update route (via PUT) decodes the request body into a Product and gives that to the update function of the repository. Finally a NoContent response is returned.

Products routes (1st try)
 1 final class ProductsRoutes[F[_]: Sync](repo: Repository[F])
 2   extends Http4sDsl[F] {
 3   implicit def decodeProduct = jsonOf
 4   implicit def encodeProduct = jsonEncoderOf
 5 
 6   val routes: HttpRoutes[F] = HttpRoutes.of[F] {
 7     case GET -> Root / "products" =>
 8       val ps: Stream[F, Product] = repo.loadProducts
 9         .map(cs => Product.fromDatabase(List(cs)))
10         .collect {
11           case Some(p) => p
12         }
13       Ok(ps)
14     case req @ POST -> Root / "products" =>
15       for {
16         p <- req.as[Product]
17         _ <- repo.saveProduct(p)
18         r <- NoContent()
19       } yield r
20   }
21 }

Our first take on the routes for products looks pretty complete already. Again we need implicit definitions for our JSON codecs to be able to serialize and de-serialize our entities. The POST route for creating a product is basically the same as the update route from the previous part: we create a Product from the request body, pass it to the save function of the repository and return a 204 No Content response.
The GET route for returning all products calls the appropriate repository function, which returns a stream that we map over using our helper function. Afterwards we use collect to turn the stream of Option[Product] into a stream of Product, which we pass to the Ok function of http4s.

But there is a catch: like in the impure version, each database row produces its own Product, so a product with several translations will be emitted multiple times. To solve this we need to dive into the fs2 API and leverage its power to merge our products back together. So let’s see how we do.

Streaming - Take 1

Because we believe ourselves to be clever we pick the simple sledgehammer approach and just run an accumulator over the stream. So what do we need? A helper function and some code changes on the stream (e.g. in the route).

Merging products
1 def merge(ps: List[Product])(p: Product): List[Product] =
2   ps.headOption.fold(List(p)) { h =>
3     if (h.id === p.id)
4       h.copy(names = h.names ::: p.names) :: ps.drop(1)
5     else
6       p :: ps
7   }

So this function takes a list (that may be empty) and a product, and merges the top most element (the head) of the list with the given product. It returns an updated list that contains either an updated head element or a new head. Leaving aside the question of who guarantees that the relevant list element will always be the head, we may use it.

Adapted streaming route
1 case GET -> Root / "products" =>
2   val ps: Stream[F, Product] = repo.loadProducts
3     .map(cs => Product.fromDatabase(List(cs)))
4     .collect {
5       case Some(p) => p
6     }
7     .fold(List.empty[Product])((acc, p) => Product.merge(acc)(p))
8   Ok(ps)

Looks simple, doesn’t it? Just a fold which uses our accumulator and we should be settled. But life is not that simple…

Compiler error
1 found   : fs2.Stream[F,List[com.wegtam.books.pfhais.pure.models.Product]]
2 required: fs2.Stream[F,com.wegtam.books.pfhais.pure.models.Product]
3          .fold(List.empty[Product])((acc, p) => Product.merge(acc)(p))
4               ^

The compiler complains that we have changed the type of the stream and rightly so. So let’s fix that compiler error.

Let’s take a look again and think about what it means to change a stream of products into a stream of a list of products. It means that we will be building the whole thing in memory! Well if we wanted that we could have skipped streaming at all. So back to the drawing board.

Streaming - Take 2

We need to process our stream of database columns (or products, if we use the converter like before) in such a way that all related entries are grouped into one product and emitted as such. After browsing the documentation of fs2 we stumble upon a function called groupAdjacentBy, so we try that one.

Proper streaming of products
 1 case GET -> Root / "products" =>
 2   val ps = repo.loadProducts
 3     .groupAdjacentBy(_._1)
 4     .map {
 5       case (id, rows) => Product.fromDatabase(rows.toList)
 6     }
 7     .collect {
 8       case Some(p) => p
 9     }
10   Ok(ps)

Okay, this does not look complicated and it even compiles - Hooray! :-)
So let’s break it apart piece by piece. The groupAdjacentBy function of fs2 will partition the input into chunks depending on the given discriminator function. A Chunk is used internally by fs2 for all kinds of things. You may compare it to a sub-stream of Akka-Streams, although the documentation labels it as: strict, finite sequence of values that allows index-based random access of elements.
Having our chunks we can map over each one, converting it into a list which is then passed to our helper function fromDatabase to create proper products. Last but not least we need to collect our entities to get from Option[Product] to a stream of Product.
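
To get a feeling for groupAdjacentBy, here is a tiny standalone sketch with hypothetical data (independent of our route code):

groupAdjacentBy on a pure stream
import cats.implicits._
import fs2.Stream

val rows = Stream(("a", 1), ("a", 2), ("b", 3))

// Adjacent elements with the same key end up in one chunk.
val grouped = rows
  .groupAdjacentBy(_._1)
  .map { case (key, chunk) => (key, chunk.toList.map(_._2)) }

// grouped.toList == List(("a", List(1, 2)), ("b", List(3)))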

JSON trouble

Now that we have a proper streaming solution we try it out but what do we get when we expect a list of products?

Broken JSON
 1 % http :53248/products
 2 HTTP/1.1 200 OK
 3 Content-Type: application/json
 4 Transfer-Encoding: chunked
 5 
 6 {
 7   "id":"8773899b-fcfa-401f-af3e-b188ebb0c00c",
 8   "names":[
 9     {"lang":"de","name":"Erdbeere"},
10     {"lang":"en","name":"Strawberry"}
11   ]
12 }
13 {
14   "id":"983aaf86-abe4-44af-9896-d8f2d2c5f82c",
15   "names":[
16     {"lang":"de","name":"Gurke"},
17     {"lang":"en","name":"Cucumber"}
18   ]
19 }

Well, whatever this is, it is not JSON! It might look like it, but it isn’t. However quite often you can see such things in the wild (read in production).

If we think about it, this sounds like a bug in http4s, and indeed we find an issue for it. Because the underlying problem is not as trivial as it first sounds, maybe we should try to work around the issue.
The fs2 API offers concatenation of streams and the nifty intersperse function to insert elements between the emitted ones. So let’s give it a try.

Fix JSON encoding issues
 1 case GET -> Root / "products" =>
 2   val prefix = Stream.eval("[".pure[F])
 3   val suffix = Stream.eval("]".pure[F])
 4   val ps = repo.loadProducts
 5     .groupAdjacentBy(_._1)
 6     .map {
 7       case (id, rows) => Product.fromDatabase(rows.toList)
 8     }
 9     .collect {
10       case Some(p) => p
11     }
12     .map(_.asJson.noSpaces)
13     .intersperse(",")
14   @SuppressWarnings(Array("org.wartremover.warts.Any"))
15   val result: Stream[F, String] = prefix ++ ps ++ suffix
16   Ok(result)

First we create streams for the opening and closing brackets of the JSON that we need to emit. Please note that we cannot simply use a String here but have to lift it into our HKT F. The usage of pure is okay because we simply lift a fixed value. Then we extend our original stream processing by explicitly converting our products to JSON and inserting the delimiter (a comma) manually using the intersperse function. In the end we simply concatenate our streams and return the result.
Our solution is quite simple, with the downside that we need to suppress a warning from the wartremover tool. This is somewhat annoying but can happen. If we remove the annotation, we’ll get a compiler error:

Wartremover error
1 [error] ... [wartremover:Any] Inferred type containing Any
2 [error] val result: Stream[F, String] = prefix ++ ps ++ suffix
3 [error]                                        ^
4 [error] ... [wartremover:Any] Inferred type containing Any
5 [error] val result: Stream[F, String] = prefix ++ ps ++ suffix
6 [error]                                              ^
7 [error] two errors found

So let’s check if we have succeeded:

Checking our JSON
 1 % http :53248/products
 2 HTTP/1.1 200 OK
 3 Content-Type: text/plain; charset=UTF-8
 4 Transfer-Encoding: chunked
 5 
 6 [
 7     {
 8         "id": "8773899b-fcfa-401f-af3e-b188ebb0c00c", 
 9         "names": [
10             {
11                 "lang": "de",
12                 "name": "Erdbeere"
13             },
14             {
15                 "lang": "en",
16                 "name": "Strawberry"
17             }
18         ]
19     },
20     {
21         "id": "983aaf86-abe4-44af-9896-d8f2d2c5f82c", 
22         "names": [
23             {
24                 "lang": "de",
25                 "name": "Gurke"
26             },
27             {
28                 "lang": "en",
29                 "name": "Cucumber"
30             }
31         ]
32     }
33 ]

This looks good, so congratulations: we are done with our routes!

Starting the application

Within our main entry point we simply initialise all needed components and wire them together. We’ll step through each part in this section. The first thing you’ll notice is that we use the IOApp provided by the Cats Effect library.

Main application
1 object Pure extends IOApp {
2   @SuppressWarnings(Array("org.wartremover.warts.Any"))
3   def run(args: List[String]): IO[ExitCode] = ???
4 }

Yet again we need to suppress a warning from wartremover here. But let’s continue with initialising the database connection.

Database initialisation
 1 val migrator: DatabaseMigrator[IO] = new FlywayDatabaseMigrator
 2 
 3 val program = for {
 4   (apiConfig, dbConfig) <- IO {
 5     val cfg = ConfigFactory.load
 6     (loadConfigOrThrow[ApiConfig](cfg, "api"),
 7      loadConfigOrThrow[DatabaseConfig](cfg, "database"))
 8   }
 9   ms <- migrator.migrate(dbConfig.url, dbConfig.user, dbConfig.pass)
10   tx = Transactor
11     .fromDriverManager[IO](dbConfig.driver,
12       dbConfig.url,
13       dbConfig.user,
14       dbConfig.pass)
15   repo = new DoobieRepository(tx)

We create our database migrator explicitly wired to the IO data type. Then we start a for comprehension in which we load our configuration via pureconfig, again within an IO. After successfully loading the configuration we continue with migrating the database. Finally we create the transactor needed by Doobie and the database repository.

Routes and http4s server
 1 val program = for {
 2   // ...
 3   productRoutes  = new ProductRoutes(repo)
 4   productsRoutes = new ProductsRoutes(repo)
 5   routes         = productRoutes.routes <+> productsRoutes.routes
 6   httpApp        = Router("/" -> routes).orNotFound
 7   server         = BlazeServerBuilder[IO].bindHttp(apiConfig.port,
 8                      apiConfig.host).withHttpApp(httpApp)
 9   fiber          = server.resource.use(_ => IO(StdIn.readLine()))
10                      .as(ExitCode.Success)
11 } yield fiber

Here we create our routes via the classes, combine them (via the <+> operator) and create the http4s app, explicitly using IO and thus wiring our abstract routes to IO. The service will - like the impure one - run until you press enter. But it won’t run yet. ;-)

Running the service
 1 program.attempt.unsafeRunSync match {
 2   case Left(e) =>
 3     IO {
  4       println("*** An error occurred! ***")
 5       if (e != null) {
 6         println(e.getMessage)
 7       }
 8       ExitCode.Error
 9     }
10   case Right(r) => r
11 }

If you remember playing around with MonadError then you’ll recognize the attempt here. We attempt to run our program and execute possible side effects via the unsafeRunSync method from Cats Effect. To provide a proper return type for the IOApp we need to evaluate the return value, which is either an error or a proper exit code. In case of an error we print it to the console (no fancy logging here) and explicitly set an error code as the return value.

As it seems we are done with our pure service! Or are we? Let’s see what we need to add to both services if we want to test them.

What about tests?

In the domain of strong static typing (not necessarily functional) you might hear phrases like “It compiles, therefore it must be correct, thus we don’t need tests!”. While there is a point that certain kinds of tests can be omitted in favour of strong static typing, such stances overlook that even a correctly typed program may produce wrong output. The other extreme (coming from dynamically typed land) is to substitute typing with testing, which is even worse. Remember that testing is usually a probabilistic approach and cannot guarantee the absence of bugs. If you have ever refactored a large code base in both paradigms then you’ll very likely have come to appreciate a good type system.

However, we need tests, so let’s write some. But first let us think a bit about what kinds of tests we need. :-)
Our service must read and create data in the JSON format. This format should be fixed and changes to it should raise a red flag because: hey, we just broke our API! Furthermore we want to unleash the power of ScalaCheck to benefit from property based testing. But even where we’re not using properties we can still use it to generate test data for us.
Besides the regular unit tests there should be integration tests when a service is written. We can test a lot of things on the unit test side, but in the end the integration of all our moving parts is what matters, and often (usually on the not so pure side of things) you would have to trick (read: mock) a lot to test things in isolation.

Testing the impure service

We will start with writing some data generators using the ScalaCheck library.

Generators

ScalaCheck already provides several generators for primitives but for our data models we have to do some more plumbing. Let’s start with generating a language code.

ScalaCheck generate LanguageCode
1 val genLanguageCode: Gen[LanguageCode] = Gen.oneOf(LanguageCodes.all)

Using the Gen.oneOf helper from the library the code becomes dead simple. Generating a UUID is nothing special either.

ScalaCheck generate UUID
1 val genUuid: Gen[UUID] = Gen.delay(UUID.randomUUID)

You might be tempted to use Gen.const here but please don’t, because that one will be memoised and thus never change. Another option is using a list of randomly generated UUID values from which we then choose one. That would be sufficient for generators which only generate a single product, but if we want to generate lists of them we would have duplicate ids sooner rather than later.

ScalaCheck generate ProductName
1 val DefaultProductName: ProductName = "I am a product name!"
2 val genProductName: Gen[ProductName] = for {
3   cs <- Gen.nonEmptyListOf(Gen.alphaNumChar)
4   name = RefType.applyRef[ProductName](cs.mkString)
5            .getOrElse(DefaultProductName)
6 } yield name

So what do we have here? We want to generate a non empty string (because that is a requirement for our ProductName) but we also want to return a properly typed entity. First we let ScalaCheck generate a non empty list of random characters which we give to a utility function of refined. However, we need a fallback value in case the validation done by refined fails. Therefore we defined a general default product name beforehand.
Alternatively, since we know the generated string cannot be empty, we can skip the validation and use Refined.unsafeApply directly:

ScalaCheck using Refined unsafeApply
1 val genProductName: Gen[ProductName] =
2   Gen.nonEmptyListOf(Gen.alphaNumChar).map(cs => 
3     Refined.unsafeApply(cs.mkString)
4   )

Now that we have generators for language codes and product names we can write a generator for our Translation type.

ScalaCheck generate Translation
 1 val genTranslation: Gen[Translation] = for {
 2   c <- genLanguageCode
 3   n <- genProductName
 4 } yield
 5   Translation(
 6     lang = c,
 7     name = n
 8   )
 9 
10 implicit val arbitraryTranslation: Arbitrary[Translation] =
11   Arbitrary(genTranslation)

As we can see the code is also quite simple. Additionally we create an implicit Arbitrary value which will be picked up automatically by the forAll test helper if it is in scope. To be able to generate a Product we will need to provide a non empty list of translations.

ScalaCheck generate lists of translations
1 val genTranslationList: Gen[List[Translation]] = for {
2   ts <- Gen.nonEmptyListOf(genTranslation)
3 } yield ts
4 
5 val genNonEmptyTranslationList: Gen[NonEmptyList[Translation]] = for {
6   t  <- genTranslation
7   ts <- genTranslationList
8   ns = NonEmptyList.fromList(ts)
9 } yield ns.getOrElse(NonEmptyList.of(t))

The first generator will create a non-empty list of translations but it will be typed as a simple List. Therefore we create a second generator which uses the fromList helper of the NonEmptyList from Cats. Because that helper returns an Option (read is a safe function) we need to fall back to using the simple of constructor at the end.
With all these in place we can finally create our Product instances.

ScalaCheck generate Product
 1 val genProduct: Gen[Product] = for {
 2   id <- genProductId
 3   ts <- genNonEmptyTranslationList
 4 } yield
 5   Product(
 6     id = id,
 7     names = ts
 8   )
 9 
10 implicit val arbitraryProduct: Arbitrary[Product] =
11   Arbitrary(genProduct)

The code is basically the same as for Translation - the arbitrary implicit included.

Unit Tests

To avoid repeating the construction of our unit test classes we will implement a base class for tests which is quite simple.

Base class for unit tests
1 abstract class BaseSpec extends WordSpec 
2   with MustMatchers with ScalaCheckPropertyChecks {}

Feel free to use other test styles - after all ScalaTest offers a lot2 of them. I tend to lean towards the more verbose ones like WordSpec. Maybe that is because I spent a lot of time with RSpec3 in the Ruby world. ;-)

Testing Product#fromDatabase
1 import com.wegtam.books.pfhais.impure.models.TypeGenerators._
2 
3 forAll("input") { p: Product =>
4   val rows = p.names.map(t => (p.id, t.lang.value, t.name.value)).toList
5   Product.fromDatabase(rows) must contain(p)
6 }

The code above is a very simple test of our helper function fromDatabase which works in the following way:

  1. The forAll will generate a lot of Product entities using the generator.
  2. From each entity a list of “rows” is constructed like they would appear in the database.
  3. These constructed rows are given to the fromDatabase function.
  4. The returned Option must then contain the generated value.

Because we construct the input for the function from a valid generated instance the function must always return a valid output.

Now let’s continue with testing our JSON codec for Product.

Testing a JSON codec

We need our JSON codec to provide several guarantees:

  1. It must fail to decode invalid JSON input format (read garbage).
  2. It must fail to decode valid JSON input format with invalid data (read wrong semantics).
  3. It must succeed to decode completely valid input.
  4. It must encode JSON which contains all fields included in the model.
  5. It must be able to decode JSON that it encoded itself.

The first one is pretty simple and to be honest: You don’t have to write a test for this because that should be guaranteed by the Circe library. Things look a bit different for very simple JSON representations though (read when encoding to numbers or strings).
I’ve seen people arguing about point 5 and there may be applications for it but implementing encoders and decoders in a non-reversible way will make your life way more complicated.

Testing decoding garbage input
1 forAll("input") { s: String =>
2   decode[Product](s).isLeft must be(true)
3 }

There is not much to say about the test above: It will generate a lot of random strings which will be passed to the decoder which must fail.

Testing invalid input values
1 forAll("id", "names") { (id: String, ns: List[String]) =>
2   val json = """{
3     |"id":""" + id.asJson.noSpaces + """,
4     |"names":""" + ns.asJson.noSpaces + """
5     |}""".stripMargin
6   decode[Product](json).isLeft must be(true)
7 }

This test will generate random instances for id which are all wrong because it must be a UUID and not an arbitrary string. Also the instances for names will mostly (but maybe not always) be wrong because there might be empty strings or even an empty list. So the decoder is given a valid JSON format but invalid values, therefore it must fail.

Testing valid input
 1 forAll("input") { i: Product =>
 2   val json = s"""{
 3     |"id": ${i.id.asJson.noSpaces},
 4     |"names": ${i.names.asJson.noSpaces}
 5     |}""".stripMargin
 6   withClue(s"Unable to decode JSON: $json") {
 7     decode[Product](json) match {
 8       case Left(e)  => fail(e.getMessage)
 9       case Right(v) => v must be(i)
10     }
11   }
12 }

In this case we manually construct a valid JSON input using values from a generated valid Product entity. This is passed to the decoder and the decoder must not only succeed but return an instance equal to the generated one.

Test included fields
1 forAll("input") { i: Product =>
2   val json = i.asJson.noSpaces
3   json must include(s""""id":${i.id.asJson.noSpaces}""")
4   json must include(s""""names":${i.names.asJson.noSpaces}""")
5 }

The test will again generate a lot of entities and we construct a JSON string from each. We then expect the string to include several field names and their correctly encoded values. You might ask why we do not check for more things like: Are these fields the only ones within the JSON string? Well, this would be more cumbersome to test and a JSON document containing more fields than we specify won’t matter for the decoder because it will just ignore them.

Decoding encoded JSON
1 forAll("input") { p: Product =>
2   decode[Product](p.asJson.noSpaces) match {
3     case Left(_)  => fail("Must be able to decode encoded JSON!")
4     case Right(d) => withClue("Must decode the same product!")(d must be(p))
5   }
6 }

Here we encode a generated entity and pass the result to the decoder which must return the same entity.

More tests

So this is basically what we do for models. Because we have more than one we will have to write tests for each of the others. I will spare you the JSON tests for Translation but that one also has a helper function called fromUnsafe so let’s take a look at the function.

Translation#fromUnsafe
1 def fromUnsafe(lang: String)(name: String): Option[Translation] =
2   for {
3     l <- RefType.applyRef[LanguageCode](lang).toOption
4     n <- RefType.applyRef[ProductName](name).toOption
5   } yield Translation(lang = l, name = n)

This function simply tries to create a valid Translation entity from unsafe input values using the helpers provided by refined. As we can see it is a total function (read is safe to use). To cover all corner cases we must test it with safe and unsafe input.

Testing Translation#fromUnsafe (1)
 1 forAll("lang", "name") { (l: String, n: String) =>
 2   whenever(
 3     RefType
 4       .applyRef[LanguageCode](l)
 5       .toOption
 6       .isEmpty || RefType.applyRef[ProductName](n).toOption.isEmpty
 7   ) {
 8     Translation.fromUnsafe(l)(n) must be(empty)
 9   }
10 }

Here we generate two random strings which we explicitly check to be invalid using the whenever helper. For such values the function must return an empty Option, i.e. None.

Testing Translation#fromUnsafe (2)
1 forAll("input") { t: Translation =>
2   Translation.fromUnsafe(t.lang.value)(t.name.value) must contain(t)
3 }

The test for valid input is very simple because we simply use the values from our automatically generated valid instances. :-)

So far we have no tests for our Repository class which handles all the database work. Neither do we have tests for our routes. We have several options for testing here but before we can test either of them we have to do some refactoring. For starters we should move our routes out of our main application into separate classes to be able to test them more easily.

Yes, we should. There are of course limits and pros and cons to that but in general this makes sense. Also this has nothing to do with being “impure” or “pure” but with clean structure.

Some refactoring

Moving the routes into separate classes poses no big problem: we simply create a ProductRoutes and a ProductsRoutes class which hold the appropriate routes. As a result our somewhat messy main application code becomes more readable.
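
To give an idea of the shape such a class can take, here is a rough sketch (an illustration only; the real classes also contain the JSON marshalling and the streaming logic for the product list):

ProductRoutes skeleton (sketch)
 1 import akka.http.scaladsl.server.Directives._
 2 import akka.http.scaladsl.server.Route
 3 
 4 // Sketch only: the route bodies are left out.
 5 final class ProductRoutes(repo: Repository) {
 6   val routes: Route = path("product" / JavaUUID) { id =>
 7     get {
 8       ??? // load the product via the repository and complete the request
 9     } ~ put {
10       ??? // decode the payload and update the product via the repository
11     }
12   }
13 }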

New Impure main application
 1 def main(args: Array[String]): Unit = {
 2   implicit val system: ActorSystem    = ActorSystem()
 3   implicit val mat: ActorMaterializer = ActorMaterializer()
 4   implicit val ec: ExecutionContext   = system.dispatcher
 5 
 6   val url = ???
 7   val user           = ???
 8   val pass           = ???
 9   val flyway: Flyway = ???
10   val _              = flyway.migrate()
11 
12   val dbConfig: DatabaseConfig[JdbcProfile] =
13     DatabaseConfig.forConfig("database", system.settings.config)
14   val repo = new Repository(dbConfig)
15 
16   val productRoutes  = new ProductRoutes(repo)
17   val productsRoutes = new ProductsRoutes(repo)
18   val routes         = productRoutes.routes ~ productsRoutes.routes
19 
20   val host       = system.settings.config.getString("api.host")
21   val port       = system.settings.config.getInt("api.port")
22   val srv        = Http().bindAndHandle(routes, host, port)
23   val pressEnter = StdIn.readLine()
24   srv.flatMap(_.unbind()).onComplete(_ => system.terminate())
25 }

We simply create our instances from our routing classes and construct our global routes directly from them. This is good but if we want to test the routes in isolation we still have the problem that they are hard-wired to our Repository class which is implemented via Slick. Several options exist to handle this:

  1. Use an in-memory test database with according configuration.
  2. Abstract further and use a trait instead of the concrete repository implementation.
  3. Write integration tests which will require a working database.

Using option 1 is tempting but think about it some more. While the benefit is that we can use our actual implementation and just have to fire up an in-memory database (for example h2), there are also some drawbacks:

  1. You have to handle evolutions for the in-memory database.
  2. Your evolutions have to be completely portable SQL (read ANSI SQL). Otherwise you’ll have to write each of your evolution scripts twice (one for production, one for testing).
  3. Your code has to be database agnostic. This sounds easier than it is. Even the tools you’re using may use database specific features under the hood.
  4. Several features are simply not implemented in some databases. Think of things like cascading deletion via foreign keys.

Taking option 2 is a valid choice but it will result in more code. Also you must pay close attention to the “test repository” implementation to avoid introducing bugs there. Going for the simplest approach is usually feasible. Think of a test repository implementation that just returns hard coded values or values passed to it via its constructor.
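
A rough sketch of that idea (the trait and its signature are assumptions for illustration, not the book's actual code) could look like this - and it is, by the way, exactly the route we will take later for the pure implementation:

Sketch of a repository abstraction with a trivial test implementation
 1 import java.util.UUID
 2 
 3 import scala.concurrent.Future
 4 
 5 // A minimal trait the routes could depend on instead of the concrete
 6 // Slick based Repository.
 7 trait ProductRepositoryLike {
 8   def loadProduct(id: UUID): Future[Seq[(UUID, String, String)]]
 9 }
10 
11 // A trivial test implementation returning only the rows passed in.
12 final class TestProductRepository(data: Seq[(UUID, String, String)])
13     extends ProductRepositoryLike {
14   override def loadProduct(id: UUID): Future[Seq[(UUID, String, String)]] =
15     Future.successful(data.filter(_._1 == id))
16 }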

However we will go with option 3 in this case. It has the drawback that you’ll have to provide a real database environment (and maybe more) for testing. But it is as close to production as you can get. Also you will need these either way to test your actual repository implementation, so let’s get going.

Integration Tests

First we need to configure our test database because we do not want to accidentally wipe a production database. For our case we leave everything as is and just change the database name.

Configuration file for integration tests
 1 api {
 2   host = "localhost"
 3   port = 49152
 4 }
 5 
 6 database {
 7   profile = "slick.jdbc.PostgresProfile$"
 8   db {
 9     connectionPool = "HikariCP"
10     dataSourceClass = "org.postgresql.ds.PGSimpleDataSource"
11     properties {
12       serverName = "localhost"
13       portNumber = "5432"
14       databaseName = "impure_test"
15       user = "impure"
16       password = "secret"
17     }
18     numThreads = 10
19   }
20 }

Another thing we should do is provide a test configuration for our logging framework. We use the logback library and Slick will produce a lot of logging output on the DEBUG level so we should fix that. It is nice to have logging if you need it but it also clutters up your log files. We create a file logback-test.xml in the directory src/it/resources which should look like this:

Logging configuration for integration tests
 1 <?xml version="1.0" encoding="UTF-8"?>
 2 <configuration debug="false">
 3   <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
 4     <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
 5       <level>WARN</level>
 6     </filter>
 7     <encoder>
 8       <pattern>%date %highlight(%-5level) %cyan(%logger{0}) - %msg%n</pattern>
 9     </encoder>
10   </appender>
11 
12   <appender name="async-console" class="ch.qos.logback.classic.AsyncAppender">
13     <appender-ref ref="console"/>
14     <queueSize>5000</queueSize>
15     <discardingThreshold>0</discardingThreshold>
16   </appender>
17 
18   <logger name="com.wegtam.books.pfhais.impure" level="INFO" additivity="false">
19     <appender-ref ref="console"/>
20   </logger>
21 
22   <root>
23     <appender-ref ref="console"/>
24   </root>
25 </configuration>

Due to the nature of integration tests we want to use production or “production like” settings and environment. But really starting our application or service for each test will be quite cumbersome so we should provide a base class for our tests. In this class we will start our service, migrate our database and provide an opportunity to shut it down properly after testing.

Because we do not want to get into trouble by running several Akka-HTTP servers on the same port, we first create a helper function which will determine a free port number.

Finding a free port number for tests
1 import java.net.ServerSocket
2 
3 def findAvailablePort(): Int = {
4   val serverSocket = new ServerSocket(0)
5   val freePort     = serverSocket.getLocalPort
6   serverSocket.setReuseAddress(true)
7   serverSocket.close()
8   freePort
9 }

The code is quite simple and very useful for such cases. Please note that it is important to use setReuseAddress because otherwise the found socket will be blocked for a certain amount of time. But now let us continue with our base test class.

BaseSpec for integration tests
 1 abstract class BaseSpec
 2     extends TestKit(
 3       ActorSystem(
 4         "it-test",
 5         ConfigFactory
 6           .parseString(s"api.port=${BaseSpec.findAvailablePort()}")
 7           .withFallback(ConfigFactory.load())
 8       )
 9     )
10     with AsyncWordSpecLike
11     with MustMatchers
12     with ScalaCheckPropertyChecks
13     with BeforeAndAfterAll
14     with BeforeAndAfterEach {
15 
16   implicit val materializer: ActorMaterializer = ActorMaterializer()
17 
18   private val url  = ???
19   private val user = ???
20   private val pass = ???
21   protected val flyway: Flyway = 
22     Flyway.configure().dataSource(url, user, pass).load()
23 
24   override protected def afterAll(): Unit =
25     TestKit.shutdownActorSystem(system, FiniteDuration(5, SECONDS))
26 
27   override protected def beforeAll(): Unit = {
28     val _ = flyway.migrate()
29   }
30 }

As you can see we are using the Akka-Testkit to initialise an actor system. This is useful because there are several helpers available which you might need. We configure the actor system with our free port using the loaded configuration as fallback. Next we globally create an actor materializer which is needed by Akka-HTTP and Akka-Streams. Also we create a globally available Flyway instance to make cleaning and migrating the database easier.
The base class also implements the beforeAll and afterAll methods which will be run before and after all tests. They are used to initially migrate the database and to shut down the actor system properly in the end.

Testing the repository

Now that we have our parts in place we can write an integration test for our repository implementation.

First we need to do some things globally for the test scope.

Repository test: global stuff
 1 private val dbConfig: DatabaseConfig[JdbcProfile] =
 2   DatabaseConfig.forConfig("database", system.settings.config)
 3 private val repo = new Repository(dbConfig)
 4 
 5 override protected def beforeEach(): Unit = {
 6   flyway.clean()
 7   val _ = flyway.migrate()
 8   super.beforeEach()
 9 }
10 
11 override protected def afterEach(): Unit = {
12   flyway.clean()
13   super.afterEach()
14 }
15 
16 override protected def afterAll(): Unit = {
17   repo.close()
18   super.afterAll()
19 }

We create one Repository instance for all our tests here. The downside is that if one test crashes it, then the others will be affected too. On the other hand we avoid running into database connection limits and the code gymnastics needed to ensure that the repository connection is closed after each test no matter the result.
Also we clean and migrate the database before each test and clean it again after each test. This ensures a clean environment.

Onwards to the test for loading a single product.

Repository test: loadProduct
 1 "#loadProduct" when {
 2   "the ID does not exist" must {
 3     "return an empty list of rows" in {
 4       val id = UUID.randomUUID
 5       for {
 6         rows <- repo.loadProduct(id)
 7       } yield {
 8         rows must be(empty)
 9       }
10     }
11   }
12 
13   "the ID exists" must {
14     "return a list with all product rows" in {
15       genProduct.sample match {
16         case None => fail("Could not generate data sample!")
17         case Some(p) =>
18           for {
19             _    <- repo.saveProduct(p)
20             rows <- repo.loadProduct(p.id)
21           } yield {
22             Product.fromDatabase(rows) match {
23               case None => fail("No product created from database rows!")
24               case Some(c) =>
25                 c.id must be(p.id)
26                 c mustEqual p
27             }
28           }
29       }
30     }
31   }
32 }

Loading a non existing product must not produce any result and is simple to test if our database is empty. Testing the loading of a real product is not that much more complicated. We use the ScalaCheck generators to create one, save it and load it again. The loaded product must of course be equal to the saved one.

Repository test: loadProducts
 1 "#loadProducts" when {
 2   "no products exist" must {
 3     "return an empty stream" in {
 4       val src = Source.fromPublisher(repo.loadProducts())
 5       for {
 6         ps <- src.runWith(Sink.seq)
 7       } yield {
 8         ps must be(empty)
 9       }
10     }
11   }
12 
13   "some products exist" must {
14     "return a stream with all product rows" in {
15       genProducts.sample match {
16         case None => fail("Could not generate data sample!")
17         case Some(ps) =>
 18           val expected = ps.flatMap(p =>
 19             p.names.toNonEmptyList.toList.map(n =>
 20               (p.id, n.lang, n.name)
 21             )
 22           )
23           for {
24             _ <- Future.sequence(ps.map(p => repo.saveProduct(p)))
25             src = Source
26               .fromPublisher(repo.loadProducts())
27               // more code omitted here
28             rows <- src.runWith(Sink.seq)
29           } yield {
30             rows must not be (empty)
31             rows.size mustEqual ps.size
32             rows.toList.sorted mustEqual ps.sorted
33           }
34       }
35     }
36   }
37 }

Testing the loading of all products if none exist is as trivial as the test for a non-existing single product. For the case of multiple products we generate a list of them which we save. Afterwards we load them and use the same transformation logic as in the routes to be able to construct proper Product instances. One thing you might notice is the explicit sorting: we want to ensure that both product lists are sorted before comparing them.
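
One way to express the omitted transformation after collecting the rows - a sketch only, assuming rows are the raw tuples returned by loadProducts and that Product.fromDatabase accepts them as shown above - is to group the rows by their product id:

Sketch: grouping loaded rows into products
1 // Group the loaded rows by product id and build a Product from each group.
2 val products: List[Product] =
3   rows.toList
4     .groupBy(_._1)
5     .values
6     .flatMap(group => Product.fromDatabase(group))
7     .toList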

Repository test: saveProduct
 1 "#saveProduct" when {
 2   "the product does not already exist" must {
 3     "save the product to the database" in {
 4       genProduct.sample match {
 5         case None => fail("Could not generate data sample!")
 6         case Some(p) =>
 7           for {
 8             cnts <- repo.saveProduct(p)
 9             rows <- repo.loadProduct(p.id)
10           } yield {
11             withClue("Data missing from database!")(
12               cnts.fold(0)(_ + _) must be(p.names.toNonEmptyList.size + 1))
13             Product.fromDatabase(rows) match {
14               case None => fail("No product created from database rows!")
15               case Some(c) =>
16                 c.id must be(p.id)
17                 c mustEqual p
18             }
19           }
20       }
21     }
22   }
23 
24   "the product does already exist" must {
25     "return an error and not change the database" in {
26       (genProduct.sample, genProduct.sample) match {
27         case (Some(a), Some(b)) =>
28           val p = b.copy(id = a.id)
29           for {
30             cnts <- repo.saveProduct(a)
31             nosv <- repo.saveProduct(p).recover {
32               case _ => 0
33             }
34             rows <- repo.loadProduct(a.id)
35           } yield {
36             withClue("Saving a duplicate product must fail!")(nosv must be(0))
37             Product.fromDatabase(rows) match {
38               case None => fail("No product created from database rows!")
39               case Some(c) =>
40                 c.id must be(a.id)
41                 c mustEqual a
42             }
43           }
44         case _ => fail("Could not create data sample!")
45       }
46     }
47   }
48 }

Here we test the saving which in the first case should simply write the appropriate data into the database. If the product already exists however this should not happen. Our database constraints will ensure that this does not happen (or so we hope ;-)). Slick will throw an exception which we catch in the test code using the recover method from Future to return a zero indicating no affected database rows. In the end we test for this zero and also check if the originally saved product has not been changed.

Repository test: updateProduct
 1 "#updateProduct" when {
 2   "the product does exist" must {
 3     "update the database" in {
 4       (genProduct.sample, genProduct.sample) match {
 5         case (Some(a), Some(b)) =>
 6           val p = b.copy(id = a.id)
 7           for {
 8             cnts <- repo.saveProduct(a)
 9             upds <- repo.updateProduct(p)
10             rows <- repo.loadProduct(a.id)
11           } yield {
12             withClue("Already existing product was not created!")(
13               cnts.fold(0)(_ + _) must be(a.names.toNonEmptyList.size + 1)
14             )
15             Product.fromDatabase(rows) match {
16               case None => fail("No product created from database rows!")
17               case Some(c) =>
18                 c.id must be(a.id)
19                 c mustEqual p
20             }
21           }
22         case _ => fail("Could not create data sample!")
23       }
24     }
25   }
26 
27   "the product does not exist" must {
28     "return an error and not change the database" in {
29       genProduct.sample match {
30         case None => fail("Could not generate data sample!")
31         case Some(p) =>
32           for {
33             nosv <- repo.updateProduct(p).recover {
34               case _ => 0
35             }
36             rows <- repo.loadProduct(p.id)
37           } yield {
38             withClue("Updating a non-existing product must fail!")(
39               nosv must be(0))
40             withClue("Product must not exist in database!")(
41               rows must be(empty))
42           }
43       }
44     }
45   }
46 }

For testing an update we generate two samples, save one to the database, change the id of the other to the one from the first and execute an update. This update should proceed without problems and the data in the database must have been changed correctly.
If the product does not exist then we use the same recover technique like in the saveProduct test.

Congratulations, we can check off our first integration test using a real database and randomly generated data!

Testing the routes

Regarding the route testing there are several options, as always. For this one we will define some use cases and then develop some more helper code which allows us to fire up our routes, do real HTTP requests and then check the database and the responses. We build upon our BaseSpec class and call it BaseUseCaseSpec. In it we will do some more things like defining a global base URL which can be used from the tests to make correct requests. Additionally we will write a small actor which simply starts an Akka-HTTP server.

An actor for starting our routes
 1 final class BaseUseCaseActor(repo: Repository, mat: ActorMaterializer)
 2   extends Actor with ActorLogging {
 3   import context.dispatcher
 4 
 5   implicit val system: ActorSystem             = context.system
 6   implicit val materializer: ActorMaterializer = mat
 7 
 8   override def receive: Receive = {
 9     case BaseUseCaseActorCmds.Start =>
10       val productRoutes  = new ProductRoutes(repo)
11       val productsRoutes = new ProductsRoutes(repo)
12       val routes         = productRoutes.routes ~ productsRoutes.routes
13       val host           = context.system.settings.config.getString("api.host")
14       val port           = context.system.settings.config.getInt("api.port")
15       val _              = Http().bindAndHandle(routes, host, port)
16     case BaseUseCaseActorCmds.Stop =>
17       context.stop(self)
18   }
19 }
20 
21 object BaseUseCaseActor {
22   def props(repo: Repository, mat: ActorMaterializer): Props =
23     Props(new BaseUseCaseActor(repo, mat))
24 
25   sealed trait BaseUseCaseActorCmds
26 
27   object BaseUseCaseActorCmds {
28     case object Start extends BaseUseCaseActorCmds
29     case object Stop extends BaseUseCaseActorCmds
30   }
31 }

As you can see, the actor is quite simple. Upon receiving the Start command it will initialise the routes and start an Akka-HTTP server. To be able to do this it needs the database repository and an actor materializer which are passed via the constructor. Regarding the BaseUseCaseSpec we will concentrate on the code that differs from the base class.

Base class for use case testing
 1 abstract class BaseUseCaseSpec
 2   extends TestKit(
 3     ActorSystem(
 4       "it-test",
 5       ConfigFactory
 6         .parseString(s"api.port=${BaseUseCaseSpec.findAvailablePort()}")
 7         .withFallback(ConfigFactory.load())
 8     )
 9   )
10   with AsyncWordSpecLike
11   with MustMatchers
12   with ScalaCheckPropertyChecks
13   with BeforeAndAfterAll
14   with BeforeAndAfterEach {
15   // ...
16   final val baseUrl: String = s"""http://${system.settings.config
17     .getString("api.host")}:${system.settings.config
18     .getInt("api.port")}"""
19 
20   protected val dbConfig: DatabaseConfig[JdbcProfile] =
21     DatabaseConfig.forConfig("database", system.settings.config)
22   protected val repo = new Repository(dbConfig)
23 
24   override protected def beforeAll(): Unit = {
25     val _ = flyway.migrate()
26     val a = system.actorOf(BaseUseCaseActor.props(repo, materializer))
27     a ! BaseUseCaseActorCmds.Start
28   }
29   // ...
30 }

Here we create our base URL and our database repository while we use the beforeAll function to initialise our actor before any test is run. Please note that this has the same drawback as sharing the repository across tests: If a test crashes your service then the others will be affected too.
But let’s write a test for our first use case: loading a product!

Again we will use the beforeEach and afterEach helpers to clean up our database. Now let’s take a look at a test for loading a product that does not exist.

Use case: load a non existing product
 1 "Loading a Product by ID" when {
 2   "the ID does not exist" must {
 3     val expectedStatus = StatusCodes.NotFound
 4 
 5     s"return $expectedStatus" in {
 6       val id = UUID.randomUUID
 7 
 8       for {
 9         resp <- http.singleRequest(
10                   HttpRequest(
11                     method = HttpMethods.GET,
12                     uri = s"$baseUrl/product/$id",
13                     headers = Seq(),
14                     entity = HttpEntity(
15                       contentType = ContentTypes.`application/json`,
16                       data = ByteString("")
17                     )
18                   )
19                 )
20       } yield {
21         resp.status must be(expectedStatus)
22       }
23     }
24   }
25 }

Maybe a bit verbose but we do a real HTTP request here and check the response status code. So how does it look if we run it?

1 [info] LoadProduct:
2 [info] Loading a Product by ID
3 [info]   when the ID does not exist
4 [info]   - must return 404 Not Found *** FAILED ***
5 [info]     200 OK was not equal to 404 Not Found (LoadProduct.scala:88)

That is not cool. What happened? Let’s take a look at the code.

1 get {
2   complete {
3     for {
4       rows <- repo.loadProduct(id)
5       prod <- Future { Product.fromDatabase(rows) }
6     } yield prod
7   }
8 }

Well, no wonder - we are simply returning the output of our fromDatabase helper function which may be empty. This will result in an HTTP status code 200 with an empty body. If you don’t believe me just fire up the service and do a request by hand via curl or httpie.

Luckily for us Akka-HTTP has us covered with the rejectEmptyResponse directive which we can use.

 1 get {
 2   rejectEmptyResponse {
 3     complete {
 4       for {
 5         rows <- repo.loadProduct(id)
 6         prod <- Future { Product.fromDatabase(rows) }
 7       } yield prod
 8     }
 9   }
10 }

Cool, it seems we’re set with this one. So onward to testing the loading of an existing product via the API.

Use case: load a product
 1 "Loading a Product by ID" when {
 2   "the ID does exist" must {
 3     val expectedStatus = StatusCodes.OK
 4 
 5     s"return $expectedStatus and the Product" in {
 6       genProduct.sample match {
 7         case None => fail("Could not generate data sample!")
 8         case Some(p) =>
 9           for {
10             _    <- repo.saveProduct(p)
11             rows <- repo.loadProduct(p.id)
12             resp <- http.singleRequest(
13                       HttpRequest(
14                         method = HttpMethods.GET,
15                         uri = s"$baseUrl/product/${p.id}",
16                         headers = Seq(),
17                         entity = HttpEntity(
18                           contentType = ContentTypes.`application/json`,
19                           data = ByteString("")
20                         )
21                       )
22                     )
23             body <- resp.entity.dataBytes.runFold(ByteString(""))(_ ++ _)
24           } yield {
25             withClue("Seeding product data failed!")(rows must not be(empty))
26             resp.status must be(expectedStatus)
27             decode[Product](body.utf8String) match {
28               case Left(e)  => fail(s"Could not decode response: $e")
29               case Right(d) => d mustEqual p
30             }
31           }
32       }
33     }
34   }
35 }

Here we simply save our generated product into the database before executing our request. We also check that the product has actually been written, to be on the safe side. Additionally we check if the decoded response body matches the product we expect. In contrast to our first test this one works right away so not all hope is lost for our coding skills. ;-)
We’ll continue with the use case of saving (or creating) a product via the API. This time we will make the code snippets shorter.

Use case: save a product (1)
 1 val expectedStatus = StatusCodes.BadRequest
 2 
 3 s"return $expectedStatus" in {
 4   for {
 5     resp <- http.singleRequest(
 6       HttpRequest(
 7         method = HttpMethods.POST,
 8         uri = s"$baseUrl/products",
 9         headers = Seq(),
10         entity = HttpEntity(
11           contentType = ContentTypes.`application/json`,
12           data = ByteString(
13             scala.util.Random.alphanumeric.take(256).mkString
14           )
15         )
16       )
17     )
18   } yield {
19     resp.status must be(expectedStatus)
20   }
21 }

Here we test posting garbage instead of valid JSON to the endpoint which must result in a “400 Bad Request” returned to us.

Use case: save a product (2)
 1 val expectedStatus = StatusCodes.InternalServerError
 2 
 3 s"return $expectedStatus and not save the Product" in {
 4   (genProduct.sample, genProduct.sample) match {
 5     case (Some(a), Some(b)) =>
 6       val p = b.copy(id = a.id)
 7       for {
 8         _    <- repo.saveProduct(a)
 9         rows <- repo.loadProduct(a.id)
10         resp <- http.singleRequest(
11           HttpRequest(
12             method = HttpMethods.POST,
13             uri = s"$baseUrl/products",
14             headers = Seq(),
15             entity = HttpEntity(
16               contentType = ContentTypes.`application/json`,
17               data = ByteString(p.asJson.noSpaces)
18             )
19           )
20         )
21         rows2 <- repo.loadProduct(a.id)
22       } yield {
23         withClue("Seeding product data failed!")(rows must not be(empty))
24         resp.status must be(expectedStatus)
25         Product.fromDatabase(rows2) match {
26           case None    =>
27             fail("Seeding product was not saved to database!")
28           case Some(s) =>
29             withClue("Existing product must not be changed!")(s mustEqual a)
30         }
31       }
32     case _ => fail("Could not generate data sample!")
33   }
34 }

This one does a bit more but basically we try to save (or rather create) an already existing product. Therefore the constraints of our database should produce an error which in turn must result in a “500 Internal Server Error” returned to us. Additionally we verify that the existing product in the database was not changed.

Use case: save a product (3)
 1 val expectedStatus = StatusCodes.OK
 2 
 3 s"return $expectedStatus and save the Product" in {
 4   genProduct.sample match {
 5     case None => fail("Could not generate data sample!")
 6     case Some(p) =>
 7       for {
 8         resp <- http.singleRequest(
 9           HttpRequest(
10             method = HttpMethods.POST,
11             uri = s"$baseUrl/products",
12             headers = Seq(),
13             entity = HttpEntity(
14               contentType = ContentTypes.`application/json`,
15               data = ByteString(p.asJson.noSpaces)
16             )
17           )
18         )
19         rows <- repo.loadProduct(p.id)
20       } yield {
21         resp.status must be(expectedStatus)
22         Product.fromDatabase(rows) match {
23           case None    => fail("Product was not saved to database!")
24           case Some(s) => s mustEqual p
25         }
26       }
27   }
28 }

Last but not least we test saving a valid product that does not already exist into the database. Again we check for the expected status code of “200 OK” and verify that the product saved into the database is the one we sent to the API. Let’s move on to testing the loading of all products now.

Use case: load all products (1)
 1 val expectedStatus = StatusCodes.OK
 2 
 3 s"return $expectedStatus and an empty list" in {
 4   for {
 5     resp <- http.singleRequest(
 6               HttpRequest(
 7                 method = HttpMethods.GET,
 8                 uri = s"$baseUrl/products",
 9                 headers = Seq(),
10                 entity = HttpEntity(
11                   contentType = ContentTypes.`application/json`,
12                   data = ByteString("")
13                 )
14               )
15             )
16     body <- resp.entity.dataBytes.runFold(ByteString(""))(_ ++ _)
17   } yield {
18     resp.status must be(expectedStatus)
19     decode[List[Product]](body.utf8String) match {
20       case Left(e)  => fail(s"Could not decode response: $e")
21       case Right(d) => d must be(empty)
22     }
23   }
24 }

Our first test case is loading all products if no product exists so we expect an empty list here and the appropriate status code.

Use case: load all products (2)
 1 val expectedStatus = StatusCodes.OK
 2 
 3 s"return $expectedStatus and a list with all products" in {
 4   genProducts.sample match {
 5     case None => fail("Could not generate data sample!")
 6     case Some(ps) =>
 7       for {
 8         _    <- Future.sequence(ps.map(p => repo.saveProduct(p)))
 9         resp <- http.singleRequest(
10                   HttpRequest(
11                     method = HttpMethods.GET,
12                     uri = s"$baseUrl/products",
13                     headers = Seq(),
14                     entity = HttpEntity(
15                       contentType = ContentTypes.`application/json`,
16                       data = ByteString("")
17                     )
18                   )
19                 )
20         body <- resp.entity.dataBytes.runFold(ByteString(""))(_ ++ _)
21       } yield {
22         resp.status must be(expectedStatus)
23         decode[List[Product]](body.utf8String) match {
24           case Left(e)  => fail(s"Could not decode response: $e")
25           case Right(d) => d.sorted mustEqual ps.sorted
26         }
27       }
28   }
29 }

This one is also straightforward: We save our generated list of products to the database and query the API which must return a “200 OK” status code and a correct list of products in JSON format. Looks like we have one more use case to tackle: Updating a product via the API.

Use case: update a product (1)
 1 val expectedStatus = StatusCodes.BadRequest
 2 
 3 s"return $expectedStatus" in {
 4   genProduct.sample match {
 5     case None => fail("Could not generate data sample!")
 6     case Some(p) =>
 7       for {
 8         _    <- repo.saveProduct(p)
 9         rows <- repo.loadProduct(p.id)
10         resp <- http.singleRequest(
11           HttpRequest(
12             method = HttpMethods.PUT,
13             uri = s"$baseUrl/product/${p.id}",
14             headers = Seq(),
15             entity = HttpEntity(
16               contentType = ContentTypes.`application/json`,
17               data = ByteString(scala.util.Random.alphanumeric.take(256).mkString)
18             )
19           )
20         )
21         rows2 <- repo.loadProduct(p.id)
22       } yield {
23         withClue("Seeding product data failed!")(rows must not be(empty))
24         resp.status must be(expectedStatus)
25         Product.fromDatabase(rows2) match {
26           case None    =>
27             fail("Seeding product was not saved to database!")
28           case Some(s) =>
29             withClue("Existing product must not be changed!")(s mustEqual p)
30         }
31       }
32   }
33 }

First we test with garbage JSON in the request. Before doing the request we actually create a product to avoid getting an error caused by a possibly missing product. Afterwards we check for the expected “400 Bad Request” status code and verify that our product has not been updated.

Use case: update product (2)
 1 val expectedStatus = StatusCodes.OK
 2 
 3 s"return $expectedStatus and update the Product" in {
 4   (genProduct.sample, genProduct.sample) match {
 5     case (Some(a), Some(b)) =>
 6       val p = b.copy(id = a.id)
 7       for {
 8         _    <- repo.saveProduct(a)
 9         rows <- repo.loadProduct(a.id)
10         resp <- http.singleRequest(
11           HttpRequest(
12             method = HttpMethods.PUT,
13             uri = s"$baseUrl/product/${p.id}",
14             headers = Seq(),
15             entity = HttpEntity(
16               contentType = ContentTypes.`application/json`,
17               data = ByteString(p.asJson.noSpaces)
18             )
19           )
20         )
21         rows2 <- repo.loadProduct(p.id)
22       } yield {
23         withClue("Seeding product data failed!")(rows must not be(empty))
24         resp.status must be(expectedStatus)
25         Product.fromDatabase(rows2) match {
26           case None    => fail("Seeding product was not saved to database!")
27           case Some(s) => s mustEqual p
28         }
29       }
30     case _ => fail("Could not generate data sample!")
31   }
32 }

Next we test updating an existing product using valid JSON. We check the status code and if the product has been correctly updated within the database.

Use case: update product (3)
 1 val expectedStatus = StatusCodes.InternalServerError
 2 
 3 s"return $expectedStatus" in {
 4   genProduct.sample match {
 5     case None => fail("Could not generate data sample!")
 6     case Some(p) =>
 7       for {
 8         resp <- http.singleRequest(
 9           HttpRequest(
10             method = HttpMethods.PUT,
11             uri = s"$baseUrl/product/${p.id}",
12             headers = Seq(),
13             entity = HttpEntity(
14               contentType = ContentTypes.`application/json`,
15               data = ByteString(p.asJson.noSpaces)
16             )
17           )
18         )
19         rows <- repo.loadProduct(p.id)
20       } yield {
21         resp.status must be(expectedStatus)
22         rows must be(empty)
23       }
24   }
25 }

Finally we test updating a non-existing product which must produce the expected status code and not write anything to the database. Wow, it seems we are done with our impure implementation. Well, except for some benchmarking, but let’s save that for later.
If we look at our test code coverage (which is a metric that you should use) then things look pretty good. We are missing some parts but in general we should have things covered.

Testing the pure service

We will skip the explanation of the ScalaCheck generators because they only differ slightly from the ones used in the impure part.

Unit Tests

The model tests are omitted here because they are basically the same as in the impure section. If you are interested in them just look at the source code. In contrast to the impure part we will now write unit tests for our routes. Meaning we will be able to test our routing logic without spinning up a database.

Testing our routes

To be able to test our routes we will have to implement a TestRepository first which we will use instead of the concrete implementation which is wired to a database.

TestRepository implementation - take 1
 1 class TestRepository[F[_]: Effect](data: Seq[Product]) extends Repository[F] {
 2   override def loadProduct(id: ProductId) = {
 3     data.find(_.id === id) match {
 4       case None => Seq.empty.pure[F]
 5       case Some(p) =>
 6         val ns = p.names.toNonEmptyList.toList.to[Seq]
 7         ns.map(n => (p.id, n.lang, n.name)).pure[F]
 8     }
 9   }
10 
11   override def loadProducts() = {
12     Stream.empty
13   }
14 
15   override def saveProduct(p: Product): F[Int] =
16     data.find(_.id === p.id).fold(0.pure[F])(_ => 1.pure[F])
17 
18   override def updateProduct(p: Product): F[Int] =
19     data.find(_.id === p.id).fold(0.pure[F])(_ => 1.pure[F])
20 
21 }

As you can see we try to stay abstract (using our HKT F[_] here) and we have basically left out the implementation of loadProducts because it will just return an empty stream. We will get back to it later on. Aside from that the class can be initialised with a (potentially empty) list of Product entities which will be used as a “database”. The save and update functions won’t change any data, they will just return a 0 or a 1 depending on the product being present in the seed data list.

Unit test: Product routes (1)
 1 val emptyRepository: Repository[IO] = new TestRepository[IO](Seq.empty)
 2 val expectedStatusCode = Status.NotFound
 3 
 4 s"return $expectedStatusCode" in {
 5   forAll("id") { id: ProductId =>
 6     Uri.fromString("/product/" + id.toString) match {
 7       case Left(_) => fail("Could not generate valid URI!")
 8       case Right(u) =>
 9         def service: HttpRoutes[IO] =
10           Router("/" -> new ProductRoutes(emptyRepository).routes)
11         val response: IO[Response[IO]] = service.orNotFound.run(
12           Request(method = Method.GET, uri = u)
13         )
14         val result = response.unsafeRunSync
15         result.status must be(expectedStatusCode)
16         result.body.compile.toVector.unsafeRunSync must be(empty)
17     }
18   }
19 }

Above you can see the test for querying a non-existing product which must return an empty response with a “404 Not Found” status code. First we try to create a valid URI from our generated ProductId. If that succeeds we create a small service wrapper for our routes into which we inject our empty TestRepository. Then we create a response using this service and a request that we construct. Because we are in the land of IO we have to actually execute it (via unsafeRunSync) to get any results back. Finally we validate the status code and the response body.

1 [info]   when GET /product/ID
2 [info]     when product does not exist
3 [info]     - must return 404 Not Found *** FAILED ***
4 [info]       TestFailedException was thrown during property evaluation.
5 [info]         Message: 200 OK was not equal to 404 Not Found
6 [info]         Location: (ProductRoutesTest.scala:46)
7 [info]         Occurred when passed generated values (
8 [info]           id = 3f298ae4-4f7b-415b-9888-c2e9b34a4883
9 [info]         )

It seems that we just pass an empty response if we do not find the product. This is not nice, so let’s fix this. The culprit is the following line in our ProductRoutes file:

1 for {
2   // ...
3   resp <- Ok(Product.fromDatabase(rows))
4 } yield resp

Okay, this looks easy. Let’s try this one:

1 for {
2   // ...
3   resp <- Product.fromDatabase(rows).fold(NotFound())(p => Ok(p))
4 } yield resp

Oh no, the compiler complains:

1 Cannot convert from Product to an Entity, because no 
2   EntityEncoder[F, com.wegtam.books.pfhais.pure.models.Product] 
3   instance could be found.

Right, before we had an Option[Product] here for which we created an implicit JSON encoder. So if we create one for the Product itself then we should be fine.

1 implicit def encodeProduct[A[_]: Applicative]: EntityEncoder[A, Product] = 
2   jsonEncoderOf

Now back to our test:

1 [info]   when GET /product/ID
2 [info]     when product does not exist
3 [info]     - must return 404 Not Found

Great! Seems like we are doing fine, so let’s continue. Next in line is testing a query for an existing product.

Unit test: Product routes (2)
 1 implicit def decodeProduct: EntityDecoder[IO, Product] = jsonOf
 2 val expectedStatusCode = Status.Ok
 3 
 4 s"return $expectedStatusCode and the product" in {
 5   forAll("product") { p: Product =>
 6     Uri.fromString("/product/" + p.id.toString) match {
 7       case Left(_) => fail("Could not generate valid URI!")
 8       case Right(u) =>
 9         val repo: Repository[IO] = new TestRepository[IO](Seq(p))
10         def service: HttpRoutes[IO] =
11           Router("/" -> new ProductRoutes(repo).routes)
12         val response: IO[Response[IO]] = service.orNotFound.run(
13           Request(method = Method.GET, uri = u)
14         )
15         val result = response.unsafeRunSync
16         result.status must be(expectedStatusCode)
17         result.as[Product].unsafeRunSync must be(p)
18     }
19   }
20 }

This time we generate a whole product, again try to create a valid URI and continue as before. But this time we inject our TestRepository containing a list with our generated product. In the end we test our expected status code and the response body must contain our product. For the last part to work we must have an implicit EntityDecoder in scope.

Unit test: Product routes (3)
 1 val expectedStatusCode = Status.BadRequest
 2 
 3 s"return $expectedStatusCode" in {
 4   forAll("id") { id: ProductId =>
 5     Uri.fromString("/product/" + id.toString) match {
 6       case Left(_) => fail("Could not generate valid URI!")
 7       case Right(u) =>
 8         def service: HttpRoutes[IO] =
 9           Router("/" -> new ProductRoutes(emptyRepository).routes)
10         val payload = scala.util.Random.alphanumeric.take(256).mkString
11         val response: IO[Response[IO]] = service.orNotFound.run(
12           Request(method = Method.PUT, uri = u)
13             .withEntity(payload.asJson.noSpaces)
14         )
15         val result = response.unsafeRunSync
16         result.status must be(expectedStatusCode)
17         result.body.compile.toVector.unsafeRunSync must be(empty)
18     }
19   }
20 }

Here we are trying to update a product which doesn’t even have to exist because we send total garbage instead of JSON with the request, which should result in a “400 Bad Request” status code. However if we run our test we get an exception instead:

1 [info]   when PUT /product/ID
2 [info]     when request body is invalid
3 [info]     - must return 400 Bad Request *** FAILED ***
4 [info]       InvalidMessageBodyFailure was thrown during property
5 [info]         Occurred when passed generated values (
6 [info]           id = 932682c9-1f9a-463a-84db-a0992d466aa3
7 [info]         )

So let’s take a deep breath and look at our code:

1 case req @ PUT -> Root / "product" / UUIDVar(id) =>
2   for {
3     p <- req.as[Product]
4     _ <- repo.updateProduct(p)
5     r <- NoContent()
6   } yield r

As we can see we do no error handling at all. So maybe we can rewrite this a little bit.

1 case req @ PUT -> Root / "product" / UUIDVar(id) =>
2   req
3     .as[Product]
4     .flatMap { p =>
5       repo.updateProduct(p) *> NoContent()
6     }
7     .handleErrorWith {
8       case InvalidMessageBodyFailure(_, _) => BadRequest()
9     }

Now we explicitly handle any error which occurs when decoding the request entity. But in fact: I am lying to you. We only handle the invalid message body failure here. On the other hand it is enough to make our test happy. :-)
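
If we wanted to be a bit more thorough we could also handle the other body decoding failure that http4s can raise. A slightly broader handler might look like this sketch:

1 .handleErrorWith {
2   // Treat both kinds of message body failures as a bad request.
3   case InvalidMessageBodyFailure(_, _)   => BadRequest()
4   case MalformedMessageBodyFailure(_, _) => BadRequest()
5 }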

Onwards to our next test cases in which we use a valid JSON payload for our request.

Unit test: Product routes (4)
 1 val expectedStatusCode = Status.NotFound
 2 
 3 s"return $expectedStatusCode" in {
 4   forAll("product") { p: Product =>
 5     Uri.fromString("/product/" + p.id.toString) match {
 6       case Left(_) => fail("Could not generate valid URI!")
 7       case Right(u) =>
 8         def service: HttpRoutes[IO] =
 9           Router("/" -> new ProductRoutes(emptyRepository).routes)
10         val response: IO[Response[IO]] = service.orNotFound.run(
11           Request(method = Method.PUT, uri = u)
12             .withEntity(p)
13         )
14         val result = response.unsafeRunSync
15         result.status must be(expectedStatusCode)
16         result.body.compile.toVector.unsafeRunSync must be(empty)
17     }
18   }
19 }

We expect a “404 Not Found” if we try to update a valid product which does not exist. But what do we get in the tests?

1 Message: 204 No Content was not equal to 404 Not Found

Well, not exactly what we planned for but it is our own fault. We used the *> operator which discards the result of the preceding operation. So we need to fix that.

 1 case req @ PUT -> Root / "product" / UUIDVar(id) =>
 2   req
 3     .as[Product]
 4     .flatMap { p =>
 5       for {
 6         cnt <- repo.updateProduct(p)
 7         res <- cnt match {
 8           case 0 => NotFound()
 9           case _ => NoContent()
10         }
11       } yield res
12     }
13     .handleErrorWith {
14       case InvalidMessageBodyFailure(_, _) => BadRequest()
15     }

We rely on the return value of our update function which contains the number of affected database rows. If it is zero then nothing has been done implying that the product was not found. Otherwise we return our “204 No Content” response as before. Still we miss one last test for our product routes.

Unit test: Product routes (5)
 1 val expectedStatusCode = Status.NoContent
 2 
 3 s"return $expectedStatusCode" in {
 4   forAll("product") { p: Product =>
 5     Uri.fromString("/product/" + p.id.toString) match {
 6       case Left(_) => fail("Could not generate valid URI!")
 7       case Right(u) =>
 8         val repo: Repository[IO] = new TestRepository[IO](Seq(p))
 9         def service: HttpRoutes[IO] =
10           Router("/" -> new ProductRoutes(repo).routes)
11         val response: IO[Response[IO]] = service.orNotFound.run(
12           Request(method = Method.PUT, uri = u)
13             .withEntity(p)
14         )
15         val result = response.unsafeRunSync
16         result.status must be(expectedStatusCode)
17         result.body.compile.toVector.unsafeRunSync must be(empty)
18     }
19   }
20 }

Basically this is the same test as before with the exception that we now give our routes a properly seeded test repository. Great, we have tested our ProductRoutes and without having to spin up a database! But we still have work to do, so let’s move on to testing the ProductsRoutes implementation. Before we do that we adapt the code for creating a product using our gained knowledge from our update test.

ProductsRoutes: adapted create endpoint
 1 case req @ POST -> Root / "products" =>
 2   req
 3     .as[Product]
 4     .flatMap { p =>
 5       for {
 6         cnt <- repo.saveProduct(p)
 7         res <- cnt match {
 8           case 0 => InternalServerError()
 9           case _ => NoContent()
10         }
11       } yield res
12     }
13     .handleErrorWith {
14       case InvalidMessageBodyFailure(_, _) => BadRequest()
15     }

Now to our tests, we will start with sending garbage JSON via the POST request.

Unit test: Products routes (1)
 1 val expectedStatusCode = Status.BadRequest
 2 
 3 s"return $expectedStatusCode" in {
 4   def service: HttpRoutes[IO] =
 5     Router("/" -> new ProductsRoutes(emptyRepository).routes)
 6   val payload = scala.util.Random.alphanumeric.take(256).mkString
 7   val response: IO[Response[IO]] = service.orNotFound.run(
 8     Request(method = Method.POST, uri = Uri.uri("/products"))
 9       .withEntity(payload.asJson.noSpaces)
10   )
11   val result = response.unsafeRunSync
12   result.status must be(expectedStatusCode)
13   result.body.compile.toVector.unsafeRunSync must be(empty)
14 }

There is nothing special here, the test is the same as for the ProductRoutes except for the changed URI and HTTP method. Also the code is a bit simpler because we do not need to generate a dynamic request URI like before.

Unit test: Products routes (2)
 1 val expectedStatusCode = Status.NoContent
 2 
 3 s"return $expectedStatusCode" in {
 4   forAll("product") { p: Product =>
 5     val repo: Repository[IO] = new TestRepository[IO](Seq(p))
 6     def service: HttpRoutes[IO] =
 7       Router("/" -> new ProductsRoutes(repo).routes)
 8     val response: IO[Response[IO]] = service.orNotFound.run(
 9       Request(method = Method.POST, uri = Uri.uri("/products"))
10         .withEntity(p)
11     )
12     val result = response.unsafeRunSync
13     result.status must be(expectedStatusCode)
14     result.body.compile.toVector.unsafeRunSync must be(empty)
15   }
16 }

Saving a product using valid JSON payload should succeed and in fact it does because of the code we have in our TestRepository instance. If you remember we use the following code for saveProduct:

1 override def saveProduct(p: Product): F[Int] =
2   data.find(_.id === p.id).fold(0.pure[F])(_ => 1.pure[F])

This code will return a 0 if the product we try to save does not exist in the seed data set and only a 1 if it can be found within aforementioned set. This code clearly doesn’t make any sense except for our testing. This way we can ensure the behaviour of the save function without having to create a new Repository instance with hard coded behaviour. :-)

Unit test: Products routes (3)
 1 val expectedStatusCode = Status.InternalServerError
 2 
 3 s"return $expectedStatusCode" in {
 4   forAll("product") { p: Product =>
 5     def service: HttpRoutes[IO] =
 6       Router("/" -> new ProductsRoutes(emptyRepository).routes)
 7     val response: IO[Response[IO]] = service.orNotFound.run(
 8       Request(method = Method.POST, uri = Uri.uri("/products"))
 9         .withEntity(p)
10     )
11     val result = response.unsafeRunSync
12     result.status must be(expectedStatusCode)
13     result.body.compile.toVector.unsafeRunSync must be(empty)
14   }
15 }

We use the empty repository this time to ensure that the saveProduct function will return a zero, triggering the desired logic in our endpoint. Almost done, so let’s check the endpoint for returning all products.

Unit test: Products routes (4)
 1 val expectedStatusCode = Status.Ok
 2 
 3 s"return $expectedStatusCode and an empty list" in {
 4   def service: HttpRoutes[IO] =
 5     Router("/" -> new ProductsRoutes(emptyRepository).routes)
 6   val response: IO[Response[IO]] = service.orNotFound.run(
 7     Request(method = Method.GET, uri = Uri.uri("/products"))
 8   )
 9   val result = response.unsafeRunSync
10   result.status must be(expectedStatusCode)
11   result.as[List[Product]].unsafeRunSync mustEqual List.empty[Product]
12 }

We simply expect an empty list if no products exist. It is as simple as that and works right out of the box. Last but not least we need to test the return of existing products. But before we do this let’s take a look at our TestRepository implementation.

TestRepository: stubbed loadProducts
1 override def loadProducts() = Stream.empty

Uh, oh, that does not bode well! So we will need to fix that first. Because we wisely chose fs2 as our streaming library the solution is as simple as this.

TestRepository: fixed loadProducts
1 override def loadProducts() = {
2   val rows = data.flatMap { p =>
3     val ns = p.names.toNonEmptyList.toList.to[Seq]
4     ns.map(n => (p.id, n.lang, n.name))
5   }
6   Stream.emits(rows)
7 }

Now we can write our last test.

Unit test: Products routes (5)
 1 implicit def decodeProducts: EntityDecoder[IO, List[Product]] = jsonOf
 2 val expectedStatusCode = Status.Ok
 3 
 4 s"return $expectedStatusCode and a list of products" in {
 5   forAll("products") { ps: List[Product] =>
 6     val repo: Repository[IO] = new TestRepository[IO](ps)
 7     def service: HttpRoutes[IO] =
 8       Router("/" -> new ProductsRoutes(repo).routes)
 9     val response: IO[Response[IO]] = service.orNotFound.run(
10       Request(method = Method.GET, uri = Uri.uri("/products"))
11     )
12     val result = response.unsafeRunSync
13     result.status must be(expectedStatusCode)
14     result.as[List[Product]].unsafeRunSync mustEqual ps
15   }
16 }

To decode the response correctly an implicit EntityDecoder of the appropriate type is needed in scope. But the rest of the test should look pretty familiar to you by now.

It seems the only parts left to test are the FlywayDatabaseMigrator and our DoobieRepository classes. Testing them will require a running database, so we are leaving the cosy world of unit tests behind and venturing forth into integration test land. But fear not, we already have some - albeit impure - experience here.

Integration Tests

As usual we start by implementing a base class that we can use to provide common settings and functions across our tests.

BaseSpec for pure integration tests
 1 abstract class BaseSpec extends WordSpec 
 2     with MustMatchers
 3     with ScalaCheckPropertyChecks
 4     with BeforeAndAfterAll
 5     with BeforeAndAfterEach {
 6 
 7   protected val config = ConfigFactory.load()
 8   protected val dbConfig = loadConfig[DatabaseConfig](config, "database")
 9 
10   override def beforeAll(): Unit = {
11     val _ = withClue("Database configuration could not be loaded!") {
12       dbConfig.isRight must be(true)
13     }
14   }
15 }

You can see that we keep it simple here and only load the database configuration and ensure in the beforeAll function that it has indeed been loaded correctly.

Testing the FlywayDatabaseMigrator

To ensure the basic behaviour of our FlywayDatabaseMigrator we write a simple test.

Integration test: database migrator (1)
 1 "the database is not available" must {
 2   "throw an exception" in {
 3     val cfg = DatabaseConfig(
 4       driver = "This is no driver name!",
 5       url = "jdbc://some.host/whatever",
 6       user = "no-user",
 7       pass = "no-password"
 8     )
 9     val migrator: DatabaseMigrator[IO] = new FlywayDatabaseMigrator
10     val program = migrator.migrate(cfg.url, cfg.user, cfg.pass)
11     an[FlywayException] must be thrownBy program.unsafeRunSync
12   }
13 }

Within this test we construct an invalid database configuration and expect that the call to migrate throws an exception. If you remember, we had this issue already and chose not to handle any exceptions but let the calling site do this - for example via a MonadError instance.
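If we wanted to handle the failure at the call site instead, a minimal sketch using the attempt combinator from cats-effect (reusing the cfg and migrator values from the test above) could look like this:

// Sketch: `attempt` moves the error into the value, giving us an
// Either[Throwable, Int] we can inspect without the IO blowing up.
val safeProgram: IO[Either[Throwable, Int]] =
  migrator.migrate(cfg.url, cfg.user, cfg.pass).attempt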

Integration test: database migrator (2)
 1 dbConfig.map { cfg =>
 2   val migrator: DatabaseMigrator[IO] = new FlywayDatabaseMigrator
 3   val program = migrator.migrate(cfg.url, cfg.user, cfg.pass)
 4   program.unsafeRunSync must be > 0
 5 }
 6 // ---
 7 dbConfig.map { cfg =>
 8   val migrator: DatabaseMigrator[IO] = new FlywayDatabaseMigrator
 9   val program = migrator.migrate(cfg.url, cfg.user, cfg.pass)
10   val _ = program.unsafeRunSync
11   program.unsafeRunSync must be(0)
12 }

The other two tests are also quite simple: we just expect migrate to return either zero or the number of applied migrations, depending on the state of the database. It goes without saying that we of course use the beforeEach and afterEach helpers within the test to prepare and clean our database properly.
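A minimal sketch of such helpers, assuming the test keeps a configured Flyway instance around (called flyway here), could look like this:

// Wipe everything Flyway created so each test starts from an empty schema
// and the first call to migrate reports the applied migrations again.
override protected def beforeEach(): Unit = {
  val _ = flyway.clean()
}

override protected def afterEach(): Unit = {
  val _ = flyway.clean()
}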

Last but not least we take a look at testing our actual repository implementation which uses Doobie. To avoid trouble we need to define a globally available ContextShift in our test which is as simple as this:

1 implicit val cs = IO.contextShift(ExecutionContexts.synchronous)

Now we can start writing our tests.

Integration test: repository (1)
 1 val tx = Transactor
 2   .fromDriverManager[IO](c.driver, c.url, c.user, c.pass)
 3 val repo = new DoobieRepository(tx)
 4 forAll("ID") { id: ProductId =>
 5   for {
 6     rows <- repo.loadProduct(id)
 7   } yield {
 8     rows must be(empty)
 9   }
10 }

Here we simply test that the loadProduct function returns an empty list if the requested product does not exist in the database.

Integration test: repository (2)
1 forAll("product") { p: Product =>
2   for {
3     _    <- repo.saveProduct(p)
4     rows <- repo.loadProduct(p.id)
5   } yield {
6     rows must not be(empty)
7     Product.fromDatabase(rows) must contain(p)
8   }
9 }

From now on we’ll omit the transactor and repository creation from the code examples. As you can see a generated product is saved to the database, loaded again and verified in the end.

Integration test: repository (3)
1 val rows = repo.loadProducts().compile.toList
2 rows.unsafeRunSync must be(empty)

Testing that loadProducts returns an empty stream if no products exist is as simple as the code above. :-)

Integration test: repository (4)
 1 forAll("products") { ps: List[Product] =>
 2   for {
 3     _    <- ps.traverse(repo.saveProduct)
 4     rows = repo.loadProducts()
 5       .groupAdjacentBy(_._1)
 6       .map {
 7         case (id, rows) => Product.fromDatabase(rows.toList)
 8       }
 9       .collect {
10         case Some(p) => p
11       }
12       .compile
13       .toList
14   } yield {
15     val products = rows.unsafeRunSync
16     products must not be(empty)
17     products mustEqual ps
18   }
19 }

In contrast the test code for checking the return of existing products is a bit more involved. But let’s step through it together. First we save the list of generated products to the database, which we do using the traverse function provided by Cats. In impure land we used Future.sequence here if you remember - but now we want to stay pure. ;-)
Next we call our loadProducts function and apply a part of the logic from our ProductsRoutes to it, namely we construct a proper stream of products which we turn into a list via compile and toList in the end. Finally we check that the list is not empty and equal to our generated list.
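As a side note, traverse is essentially a fused map plus sequence. A small sketch of both spellings for the save step, assuming cats.implicits._ is in scope and the effect type is IO as in the test:

// Both expressions produce a single effect persisting the whole list.
val viaTraverse: IO[List[Int]] = ps.traverse(repo.saveProduct)
val viaSequence: IO[List[Int]] = ps.map(repo.saveProduct).sequence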

Integration test: repository (5)
 1 forAll("product") { p: Product =>
 2   for {
 3     cnt  <- repo.saveProduct(p)
 4     rows <- repo.loadProduct(p.id)
 5   } yield {
 6     cnt must be > 0
 7     rows must not be(empty)
 8     Product.fromDatabase(rows) must contain(p)
 9   }
10 }

The code for testing saveProduct is nearly identical to the loadProduct test as you can see. We simply check additionally that the function returns the number of affected database rows.

Integration test: repository (6)
1 forAll("product") { p: Product =>
2   for {
3     cnt  <- repo.updateProduct(p)
4     rows <- repo.loadProduct(p.id)
5   } yield {
6     cnt must be(0)
7     rows must be(empty)
8   }
9 }

Updating a non-existing product must return a zero and save nothing to the database, which is what we test above.

Integration test: repository (7)
 1 forAll("productA", "productB") { (a: Product, b: Product) =>
 2   val p = b.copy(id = a.id)
 3   for {
 4     _    <- repo.saveProduct(a)
 5     cnt  <- repo.updateProduct(p)
 6     rows <- repo.loadProduct(p.id)
 7   } yield {
 8     cnt must be > 0
 9     rows must not be(empty)
10     Product.fromDatabase(rows) must contain(p)
11   }
12 }

Finally we test updating a concrete product by generating two of them, saving the first into the database and running an update using the second with the id from the first.

Wow, it seems we are finished! Congratulations, we can now check mark the point “write a pure http service in Scala” on our list. :-)

Adding benchmarks

Now that we have our implementations in place we can start comparing them. We will start with implementing some benchmarks to test the performance of both implementations.

There are several applications available to perform load tests and benchmarks. Regarding the latter the Apache JMeter1 project is a good starting point. It is quite easy to get something running. As the documentation says: for the real runs you should only use the command line application and use the GUI just to create and test your benchmark.

We’ll skip a long introduction and tutorial for JMeter because you can find a lot within the documentation and there are lots of tutorials online.

Our environment

Within the book repository you’ll find a folder named jmeter which contains several things:

  1. Several files ending with .jmx like Pure-Create-Products.jmx and so on.
  2. A CSV file containing 100.000 valid product IDs named product-ids.csv.
  3. A file named benchmarks.md.

The .jmx files are the configuration files for JMeter which can be used to run the benchmarks. Their names are hopefully self-explanatory and they are expected to be run in the following order:

  1. Create products
  2. Load products
  3. Update products
  4. Load all products

The file product-ids.csv is expected in the /tmp folder, so you’ll have to copy it there or adjust the benchmark configurations. Finally the file benchmarks.md holds detailed information about the benchmark runs (each one was done three times in a row).

System environment

Service and testing software (Apache JMeter) were run on different workstations connected via a 100 MBit/s network connection.

Service workstation

CPU Core i5-9600K, 6 Cores, 3,7 GHz
RAM 32 GB
HDD 2x Samsung SSD 860 PRO 512GB, SATA
OS FreeBSD 12 (HT disabled)
JDK 11.0.4+11-2
DB PostgreSQL 11.3

Client workstation

CPU AMD Ryzen Threadripper 2950X
RAM 32 GB
HDD 2x Samsung SSD 970 PRO 512GB, M.2
OS FreeBSD 12 (HT disabled)
JDK 11.0.4+11-2

Apache JMeter version 5.1.1 was used to run the benchmark and, if not noted otherwise, 10 threads were used with a 10 second ramp-up time for each benchmark.

Comparison

So let’s start with comparing the results. As mentioned more details can be found in the file benchmarks.md. We’ll stick to using the average of the metrics across all three benchmark runs. The following abbreviations will be used in the tables and legends.

AVG
The average response time in milliseconds.
MED
The median response time in milliseconds.
90%
90 percent of all requests were handled within this response time in milliseconds or less.
95%
95 percent of all requests were handled within this response time in milliseconds or less.
99%
99 percent of all requests were handled within this response time in milliseconds or less.
MIN
The minimum response time in milliseconds.
MAX
The maximum response time in milliseconds.
ERR
The error rate in percent.
R/S
The number of requests per second that could be handled.
R/M
The number of requests per minute that could be handled (used in the bulk load benchmark).
MEM
The maximum amount of memory used by the service during the benchmark in MB.
LD
The average system load on the service machine during the benchmark.

Create 100.000 products

Metric Impure Pure
AVG 98 12
MED 95 11
90% 129 15
95% 143 18
99% 172 30
MIN 53 5
MAX 1288 675
ERR 0% 0%
R/S 100.56 765.33
MEM 1158 1308
LD 16 9

Wow, I honestly have to say that I didn’t expect that. Usually the world has come to believe that the impure approach might be dirty but is definitely always faster. Well it seems we’re about to correct that. I don’t know about you but I’m totally fine with that. ;-)
But now let’s break it apart piece by piece. The first thing that catches the eye is that the pure service seems to be about seven times faster than the impure one! The average 100 requests per second on the impure side stand against an average 765 requests per second on the pure side. Also the metrics regarding the response times support that. Regarding the memory usage we can see that the pure service needed about 13% more memory than the impure one. Living in times in which memory is cheap I consider this a small price to pay for a significant performance boost.
Last but not least I found it very interesting that the average system load was much higher (nearly twice as high) in the impure implementation. While it is okay to make use of your resources, a lower utilisation allows more “breathing room” for other tasks (operating system, database, etc.).

Load 100.000 products

Metric Impure Pure
AVG 7 7
MED 8 7
90% 10 9
95% 11 10
99% 14 19
MIN 4 2
MAX 347 118
ERR 0% 0%
R/S 1162.20 1248.63
MEM 1449 1538
LD 13 8

The loading benchmark provides a more balanced picture. While the pure service is still slightly ahead (about 7% faster), it uses about 6% more memory than the impure one. Overall both implementations deliver nearly the same results. But again the pure one causes significantly lower system load, as in the first benchmark.

Update 100.000 products

Metric Impure Pure
AVG 78 12
MED 75 11
90% 104 16
95% 115 20
99% 140 34
MIN 42 5
MAX 798 707
ERR 0% 0%
R/S 125.66 765.26
MEM 1176 1279
LD 16 8

Updating existing products results in nearly the same picture as the “create products” benchmark. Interestingly the impure service performs about 20% better on an update than on a create. I have no idea why but it caught my eye. The other metrics are, as said, nearly identical to the first benchmark. The pure service uses a bit more memory (around 8%) but is around six times faster than the impure one while causing only half of the system load.

Bulk load all 100.000 products

For our last benchmark we load all existing products via the GET /products route. Because this causes a lot of load we reduce the number of threads in our JMeter configuration from 10 to 2 and only use 50 iterations. But enough talk, here are the numbers.

Metric Impure Pure
AVG 19061 14496
MED 19007 14468
90% 19524 14689
95% 19875 14775
99% 20360 14992
MIN 17848 14008
MAX 21315 16115
ERR 0% 0%
R/M 6.30 8.30
MEM 7889 1190
LD 5 4

As you can see the difference in system load is way smaller this time. While it is still 25%, a load of 4 versus 5 on a machine like the test one makes almost no difference. However the pure service is again faster (by about 25%). Looking at the memory footprint we can see that the impure one uses nearly seven times as much memory as the pure one.
But before we burst into cheers about that let’s remember what we did in the impure implementation! Yes, we used the groupBy operator of Akka which keeps a lot of stuff in memory so the fault for this is ours. ;-)
Because I’m not in the mood to mess with Akka until we’re on the memory safe side here, we’ll just ignore the memory footprint for this benchmark. Summarising: the pure service is again faster than the impure one.

Summary

If you look around on the internet (and also in the literature) you’ll find a lot of sources stating that “functional programming is slow” or “functional programming does not perform” and so on. Well, I would argue that we have proven that this is not the case! Although we cannot generalise our findings because we only looked at a specific niche within a specific environment, I think this is pretty exciting!

Not only do you benefit from having code that can more easily be reasoned about, but you also gain better testing possibilities and in the end your application performs better! :-)

We pay some price for it (an increased memory footprint) because there is no free lunch, but it seems worth it to work in a clean and pure fashion. So the next time someone argues in favour of some dirty impure monstrosity because it is faster, just remember and tell ‘em that this might not be true!

Documenting your API

Before we celebrate ourselves we have to tackle one missing point: We have to document our API.

You might argue that the code itself is documentation enough, but no, it is not! Leaving the issue of properly documented code aside here, we will concentrate on documenting the API. The de facto standard these days seems to be Swagger1, and to keep things simple we will stick to it. Besides that it won’t hurt to have some documentation in text form (a small file could be enough) which explains the quirks of our API. The bigger the project the earlier you may encounter flaws in the logic which might not be changeable for whatever reasons there are. ;-)

The lay of the land

Swagger provides a bit of tooling and there are likely lots of projects trying to bring it to your favourite web framework or tool kit. In many cases it might be a good idea to bundle the Swagger UI2 with your service and make it available on a specific path. Depending on your needs and environment you’ll want to protect that path via authentication and make it configurable so it can be turned off in production.
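A minimal sketch of the “turn it off in production” part, assuming an http4s setup with a hypothetical docsRoutes value serving the Swagger UI and a hypothetical apiDocsEnabled flag in the service configuration:

// Only mount the documentation routes if they are enabled in the config.
val httpApp =
  if (serviceConfig.apiDocsEnabled)
    Router("/" -> apiRoutes, "/docs" -> docsRoutes).orNotFound
  else
    Router("/" -> apiRoutes).orNotFound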

Having the UI in place, you must decide which path you want to go down:

  1. Write a swagger.yml file which describes your API, create JSON from it and deliver that as an asset.
  2. Create the JSON dynamically via some library at runtime.

The first point has the benefit that your description will be more or less set in stone and you avoid possible performance impacts and other quirks at runtime. However you must think of a way to test that your service actually fulfils the description (read: specification) that you deliver. The most common tool for writing your API description is probably the Swagger Editor3.

Taking the second path will result in a description which reflects your actual code. However none of the tools I’ve seen so far has fulfilled that promise to 100 percent. You’ll very likely have to make extensive use of annotations to make your API description usable, which also results in a decoupling of code and description. Also you might encounter “funny” deduced data types like Future1 and so on - which are annoying and confusing for the user. For those favouring the impure approach there is the swagger-akka-http library4.

But can we do better? The answer is yes! We can describe our API using static typing, have the compiler check it and derive server and client code from it.

Using types to describe an API

Describing your API using types is not the bleeding edge academic research stuff you might have guessed it to be. Several libraries already exist for it! :-)

I personally first stumbled upon such things some years ago when seeing the talk “Using object algebras to design embedded DSLs” (Curry On 2016)5. The related project is the endpoints6 library. However there are other projects too, for example the rho library7 included in the http4s project. Another one is tapir8 which we will be using in our example.
In the Haskell camp there is the beautiful Servant library9.

First I wanted to use something which allows us to generate an http4s server. This already narrowed down the options a bit. Also it should be able to generate API documentation (which nowadays means Swagger/OpenAPI support). Furthermore it should support not only http4s but also other backends. So after playing around a bit I decided to use the tapir library.

A pure implementation using tapir.

We basically clone our pure folder into the tapir folder and start applying our changes to the already pure implementation. But first some theory.

Basics

The tapir library assumes that you describe your API using the Endpoint type which is more concretely defined as follows: Endpoint[I, E, O, S]

  • The type I defines the input given into the endpoint.
  • The type E defines the error (or errors) which may be returned by the endpoint.
  • The type O defines the possible output of the endpoint.
  • The type S specifies the type of streams which are used for in- and output.

An endpoint can have attributes like name or description which will be used in the generated documentation. You can also map input and output parameters into case classes.

Regarding the encoding and decoding of data we will need our Circe codecs but additionally some schema definitions required by tapir. Concretely, we will need to define implicit SchemaFor[T] instances for each of our models. We will start with the Translation model.

Tapir schema for Translation
1 implicit val schemaFor: SchemaFor[Translation] = SchemaFor(
2   Schema.SProduct(
3     Schema.SObjectInfo("Translation"),
4     List(("lang", Schema.SString), ("name", Schema.SString)),
5     List("lang", "name")
6   )
7 )

As you can see this is quite straightforward and not very complicated. In the future we might be able to derive such things but for now we have to define them. The schema is defined as a product type which is further described by the “object info” type containing a name, the field names with their schemas and finally a list of field names which are required to construct the type.

Tapir schema for Product (1)
 1 implicit val schemaFor: SchemaFor[Product] = SchemaFor(
 2   Schema.SProduct(
 3     Schema.SObjectInfo("Product"),
 4     List(
 5       ("id", Schema.SString),
 6       ("names", Schema.SArray(Translation.schemaFor.schema))
 7     ),
 8     List("id", "names")
 9   )
10 )

This is basically the same thing as before except that we rely on the existing schema definition of Translation and use that here. You might notice that we explicitly define our NonEmptySet as an SArray here. This is because we want a list-like representation on the JSON side.

But can we do better? Yes, we can! :-) To gain more flexibility we can provide a generic schema for our NonEmptySet type.

Tapir schema for NonEmptySet from Cats
1 implicit def schemaForNeS[T](implicit a: SchemaFor[T]): 
2   SchemaFor[NonEmptySet[T]] =
3     SchemaFor(Schema.SArray(a.schema))

Here we define that a NonEmptySet will be an array using the schema of whatever type it contains. Now we rewrite our previous schema definition as follows.

Tapir schema for Product (2)
 1 implicit val schemaFor: SchemaFor[Product] = SchemaFor(
 2   Schema.SProduct(
 3     Schema.SObjectInfo("Product"),
 4     List(
 5       ("id", Schema.SString),
 6       ("names", schemaForNeS[Translation].schema)
 7     ),
 8     List("id", "names")
 9   )
10 )

It is not necessarily shorter but we have gained some more flexibility and can reuse our schema for the non-empty set in several places.

Product routes

Having the basics settled we can try to write our first endpoint. Let’s refactor our product routes. We will define our endpoints in the companion object of the class.

Tapir endpoint for loading a product
1 // Our type is Endpoint[ProductId, StatusCode, Product, Nothing]
2 val getProduct = endpoint.get
3   .in("product" / path[ProductId]("id"))
4   .errorOut(statusCode)
5   .out(jsonBody[Product])

So what do we have here? First we specify the HTTP method by using the get function of the endpoint. Now we need to define our path and inputs. We do this using the in helper which accepts path fragments separated by slashes and also a path[T] helper which allows us to extract a type directly from a path fragment. This way we define our entry point product/id in which id must match our ProductId type.
To be able to be more flexible about our returned status codes we use the errorOut function which in our case just receives a status code (indicated by passing statusCode to it).
Finally we define that the endpoint will return the JSON representation of a product by using the out and jsonBody helpers. This all is reflected in the actual type signature of our endpoint which reads Endpoint[ProductId, StatusCode, Product, Nothing]. If we remember the basics then we know that this amounts to an endpoint which takes a ProductId as input, produces a StatusCode as possible error and returns a Product upon success.

Our endpoint alone won’t do us any good so we need an actual server side implementation of it. While we could have used the serverLogic function to directly attach our logic onto the endpoint definition this would have nailed us down to a concrete server implementation.
So we’re going to implement it in the ProductRoutes class.

http4s implementation of the load product endpoint
1 val getRoute: HttpRoutes[F] = ProductRoutes.getProduct.toRoutes { id =>
2   for {
3     rows <- repo.loadProduct(id)
4     resp = Product
5       .fromDatabase(rows)
6       .fold(StatusCodes.NotFound.asLeft[Product])(_.asRight[StatusCode])
7   } yield resp
8 }

We use the toRoutes helper of tapir which expects a function with the actual logic. As you can see the implementation is straightforward and only differs slightly from our original one. Currently there is no other way to handle our “not found” case than using the fold at the end. But if you remember, we did the same thing in the original code.

That was not that difficult, for something which some people like to talk about as “academic fantasies from fairy tale land”. ;-)

Onward to our next route: updating an existing product. First we need to define our endpoint.

Tapir endpoint for updating a product
 1 // Our type is Endpoint[(ProductId, Product), StatusCode, Unit, Nothing]
 2 val updateProduct =
 3   endpoint.put
 4     .in("product" / path[ProductId]("id"))
 5     .in(
 6       jsonBody[Product]
 7         .description("The updated product data which should be saved.")
 8     )
 9     .errorOut(statusCode)
10     .out(statusCode(StatusCodes.NoContent))

This is only slightly more code than our first endpoint. We use the put method this time and the same logic as before to extract our product id from the path. But we also need our product input which we expect as JSON in the request body. The jsonBody function used here is also extended with the description helper which will provide data for the generated documentation. We’ll come to generated API docs later on.
We also restrict our errors to status codes via the errorOut(statusCode) directive. Last but not least we have to define our output. By default a status code of “200 OK” will be used, which is why we override it in the out function with the “204 No Content” we prefer.

http4s implementation of the update product endpoint
 1 private val updateRoute: HttpRoutes[F] = 
 2   ProductRoutes.updateProduct.toRoutes {
 3     case (id, p) =>
 4       for {
 5         cnt <- repo.updateProduct(p)
 6         res = cnt match {
 7           case 0 => StatusCodes.NotFound.asLeft[Unit]
 8           case _ => ().asRight[StatusCode]
 9         }
10       } yield res
11   }

The implementation is again very similar to our original one, except that it is even a bit simpler. This is because we do not have to worry about wrongly encoded input (remember our handleErrorWith directive?). The tapir library will by default return a “400 Bad Request” status if any provided input cannot be decoded.
Within the pattern match we use the status codes provided by tapir and map the returned values to the correct type, which is an Either[StatusCode, Unit]. This results from our endpoint type signature being Endpoint[(ProductId, Product), StatusCode, Unit, Nothing], which translates to having an input of both a ProductId and a Product and returning a StatusCode in the error case or Unit upon success.

Now we only need to combine both routes and we’re set.

combined product routes
1 @SuppressWarnings(Array("org.wartremover.warts.Any"))
2 val routes: HttpRoutes[F] = getRoute <+> updateRoute

So, let us run our tests and see what happens.

1 [info]   when PUT /product/ID
2 [info]     when request body is invalid
3 [info]     - must return 400 Bad Request *** FAILED ***
4 [info]       TestFailedException was thrown during property evaluation.
5 [info]         Message: Vector(...) was not empty
6 [info]         Location: (ProductRoutesTest.scala:100)
7 [info]         Occurred when passed generated values (
8 [info]           id = b55df341-a165-40c4-87ba-3d1c5cfb2f0c
9 [info]         )

Well, not what we expected, or is it? To be honest I personally expected more errors, but maybe I’ve just been doing this stuff for too long. ;-)
If we look into the error we find that because the encoding problems are now handled for us the response not only contains a status code of “400 Bad Request” but also an error message: “Invalid value for: body”. Because I’m fine with that I just adjust the test and let it be good. :-)

Pretty awesome, we already have half of our endpoints done. So let’s move on to the remaining ones and finally see how to generate documentation and also a client for our API.

Products routes

Tapir endpoint for creating a product
1 val createProduct: Endpoint[Product, StatusCode, Unit, Nothing] =
2   endpoint.post
3     .in("products")
4     .in(
5       jsonBody[Product]
6         .description("The product data which should be created.")
7     )
8     .errorOut(statusCode)
9     .out(statusCode(StatusCodes.NoContent))

As we can see the endpoint definition for creating a product does not differ much from the one used to update one, except that we have a different path here and do not need to extract our ProductId from the URL path.

http4s implementation of the create product endpoint
 1 val createRoute: HttpRoutes[F] =
 2   ProductsRoutes.createProduct.toRoutes { product =>
 3     for {
 4       cnt <- repo.saveProduct(product)
 5       res = cnt match {
 6         case 0 => StatusCodes.InternalServerError.asLeft[Unit]
 7         case _ => ().asRight[StatusCode]
 8       }
 9     } yield res
10   }

The implementation is again pretty simple. In case the saveProduct function returns a zero we output a “500 Internal Server Error” because the product has not been saved into the database.

Finally we have our streaming endpoint left, so let’s see how we can do this via tapir.

Tapir endpoint for loading all products
1 // Our type is Endpoint[Unit, StatusCode, Stream[F, Byte], Stream[F, Byte]]
2 def getProducts[F[_]] =
3   endpoint.get
4     .in("products")
5     .errorOut(statusCode)
6     .out(
7       streamBody[Stream[F, Byte]](schemaFor[Byte], tapir.MediaType.Json())
8     )

The first thing we can see is that we use a def instead of a val this time. This is caused by some necessities on the Scala side. If we want to abstract over a type parameter then we need to use a def here.
We also have set the last type parameter not to Nothing but to something concrete this time. This is because we actually want to stream something. ;-)
It is a bit annoying that we have to define it twice (once for the output type and once for the “stream” type). Much nicer would be something like Endpoint[I, E, Byte, Stream[F, _]] but currently this is not the way we can do it.
So we again specify the HTTP method (via get) and the path (which is “products”). The errorOut helper once again restricts our error output to the status code. Finally we set the output of the endpoint by declaring a streaming entity (via streamBody).

Having to pass a schema for Byte here may look odd, but it is sufficient, and we also directly specify the returned media type to be JSON.

http4s implementation of the load all products endpoint (1)
 1 @SuppressWarnings(Array("org.wartremover.warts.Any"))
 2 val getRoute: HttpRoutes[F] = ProductsRoutes.getProducts.toRoutes {
 3   val prefix = Stream.eval("[".pure[F])
 4   val suffix = Stream.eval("]".pure[F])
 5   val ps = repo.loadProducts
 6     .groupAdjacentBy(_._1)
 7     .map {
 8       case (id, rows) => Product.fromDatabase(rows.toList)
 9     }
10     .collect {
11       case Some(p) => p
12     }
13     .map(_.asJson.noSpaces)
14     .intersperse(",")
15   val result: Stream[F, String] = prefix ++ ps ++ suffix
16   val bytes: Stream[F, Byte]    = result.through(fs2.text.utf8Encode)
17   bytes
18 }

Again our implementation is almost the same as the original one, except that in the end we convert our stream of String into a stream of Byte using the utf8Encode helper from the fs2 library.

1   found   : fs2.Stream[F,Byte]
2   required: Unit => F[Either[tapir.model.StatusCode,fs2.Stream[F,Byte]]]
3      (which expands to)  Unit => F[Either[Int,fs2.Stream[F,Byte]]]
4      bytes
5      ^

Damn, so close. But let’s keep calm and think. Or ask around on the internet. Which is totally fine. Actually it is all in the compiler error message.

http4s implementation of the load all products endpoint (2)
1 @SuppressWarnings(Array("org.wartremover.warts.Any"))
2 val getRoute: HttpRoutes[F] = ProductsRoutes.getProducts.toRoutes {
3   // ...
4   val result: Stream[F, String] = prefix ++ ps ++ suffix
5   val bytes: Stream[F, Byte]    = result.through(fs2.text.utf8Encode)
6   val response: Either[StatusCode, Stream[F, Byte]] = Right(bytes)
7   (_: Unit) => response.pure[F]
8 }

We first convert our response explicitly into the right side of an Either because the left side is used for the error case. Afterwards we provide the needed Unit => ... function in which we lift our response value via pure into the context of F.
So let’s go crazy and simply combine our routes like in the previous part and run the test via testOnly *.ProductsRoutesTest on the sbt console.

1 [info] All tests passed.

Yes! Very nice, it seems like we are done with implementing our routes via tapir endpoints.

Documentation via OpenAPI

So, we can now look at documenting our API via OpenAPI using the tooling provided by tapir. But first we should actually modify our main application entry point to provide the documentation for us.

Provide Swagger UI and documentation
 1 // ...
 2   productRoutes  = new ProductRoutes(repo)
 3   productsRoutes = new ProductsRoutes(repo)
 4   docs = List(
 5     ProductRoutes.getProduct,
 6     ProductRoutes.updateProduct,
 7     ProductsRoutes.getProducts,
 8     ProductsRoutes.createProduct
 9   ).toOpenAPI("Pure Tapir API", "1.0.0")
10   docsRoutes = new SwaggerHttp4s(docs.toYaml)
11   routes     = productRoutes.routes <+> productsRoutes.routes
12   httpApp    = Router(
13     "/" -> routes,
14     "/docs" -> docsRoutes.routes
15   ).orNotFound
16 // ...

We use the toOpenAPI helper provided by tapir which generates a class structure describing our API from a list of given endpoints. Additionally we use the SwaggerHttp4s helper which includes the Swagger UI for simple documentation browsing. All of it is made available under the /docs path. So calling http://localhost:57344/docs with your browser should open the UI and the correct documentation.
But while browsing there we can see that although it provides our models and endpoints, the documentation could be better. So what can we do about it?
The answer is simple: Use the helpers provided by tapir to add additional information to our endpoints.

Providing example data

Besides functions like description or name tapir also provides example which will result in having concrete examples in the documentation. To use this we must construct example values of the needed type. A Product example could look like this.

Example Product type for the API documentation
 1 val example = Product(
 2   id = java.util.UUID.randomUUID,
 3   names = NonEmptySet.one(
 4       Translation(
 5         lang = "de",
 6         name = "Das ist ein Name."
 7       )
 8     ) ++
 9     NonEmptySet.one(
10       Translation(
11         lang = "en",
12         name = "That's a name."
13       )
14     ) ++
15     NonEmptySet.one(
16       Translation(
17         lang = "es",
18         name = "Ese es un nombre."
19       )
20     )
21 )

We can now use it in our product endpoint description.

Documented tapir endpoints (1)
 1 val getProduct = endpoint.get
 2   .in(
 3     "product" / path[ProductId]("id")
 4       .description("The ID of a product which is a UUID.")
 5       .example(example.id)
 6   )
 7   .errorOut(statusCode)
 8   .out(
 9     jsonBody[Product]
10     .description("The product associated with the given ID.")
11     .example(example)
12   )
13   .description(
14     "Returns the product specified by the ID given in the URL path.
15     If the product does not exist then a HTTP 404 error is returned."
16   )

As you can see we make use of description and example here. Also the path parameter id is described that way.

Documented tapir endpoints (2)
 1 val updateProduct =
 2   endpoint.put
 3     .in(
 4       "product" / path[ProductId]("id")
 5         .description("The ID of a product which is a UUID.")
 6         .example(example.id)
 7     )
 8     .in(
 9       jsonBody[Product]
10         .description("The updated product data which should be saved.")
11         .example(example)
12     )
13     .errorOut(statusCode)
14     .out(
15       statusCode(StatusCodes.NoContent)
16         .description("Upon successful product update no content is returned.")
17     )
18     .description(
19       "Updates the product specified by the ID given in the URL path.
20       The product data has to be passed encoded as JSON in the request body.
21       If the product does not exist then a HTTP 404 error is returned."
22     )

Here we also add a description to the simple status code output explaining explicitly that no content will be returned upon success. While the 204 status code should be enough to say this you can never be sure enough. ;-)
We’ll skip the create endpoint because it looks nearly the same as the update endpoint. Instead let’s take a look at our streaming endpoint.

Documented tapir endpoints (3)
 1 def getProducts[F[_]] =
 2   endpoint.get
 3     .in("products")
 4     .errorOut(statusCode)
 5     .out(
 6       streamBody[Stream[F, Byte]](schemaFor[Byte], tapir.MediaType.Json())
 7         .example(examples.toList.asJson.spaces2)
 8     )
 9     .description(
10       "Return all existing products in JSON format as a stream of bytes."
11     )

This time we need to provide our example as a string because of the nature (read type) of our endpoint. We use the non empty list of examples that we created (you can look it up in ProductsRoutes.scala) and convert it into a JSON string.
If we now visit our swagger endpoint we’ll see nice examples included in the documentation. Pretty cool, especially because many people will look at the examples not at the specification. This might be because (too) many of us have seen an API not fulfilling its specification. This shouldn’t happen in our case because we’re deriving it, yeah! But nonetheless examples are very nice to have. :-)

If we take a closer look at our model descriptions then we might see that we could do better in some cases. I’m thinking of our ID fields being simple strings instead of UUIDs and the language code which is also defined as a simple string. So let’s get going and clean that up!

Refining the generated documentation

Looking at the intermediate model for our API documentation (see the OpenAPI class structure in the tapir library) we realise that modifying such a deeply nested case class structure might result in some really messy code. I mean we have probably all been there at some point in our lives as developers. ;-)

Can we do better? Yes, we can! Confronted with big, nested structures we should pick a tool from our functional programming toolbox which is called optics10.
Don’t be scared by the name or all that mathematics, there exist some usable libraries for it. In our case we will pick Monocle11 which provides profunctor optics for Scala. The basic idea of optics is to provide pure functional abstractions for the manipulation of immutable objects. Because of their pure nature they are composable, which results in code that is more flexible and can more easily be reasoned about.
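Before we point them at the OpenAPI structure, here is a tiny self-contained sketch of composing two lenses with Monocle, using the same GenLens macro and composeLens syntax we will rely on below (the Street and Address types are made up for the example):

import monocle.Lens
import monocle.macros.GenLens

final case class Street(name: String)
final case class Address(street: Street)

val street: Lens[Address, Street] = GenLens[Address](_.street)
val name: Lens[Street, String]    = GenLens[Street](_.name)

// Composing both lenses updates the nested field in one immutable step.
(street composeLens name).set("New Street")(Address(Street("Old Street")))
// => Address(Street("New Street"))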

Now that we have cleared that up, let’s make a plan of what we actually want to do.

  1. Adjust the URL parameter descriptions of {id} to mark them as kind of UUID.
  2. Adjust the id attribute of our Product model to mark it as kind of UUID.
  3. Adjust the lang attribute of our Translation model to mark it as an ISO-639-1 language code.

If we take a look at the code within the tapir library, we see that the Schema and SchemaFor code which is used for codecs does not yet support a dedicated UUID type. There is a detour for it using Schema.SString.

Now we look a bit further into the OpenAPI code and find that it supports several interesting attributes which we might use. For now we will stick to the pattern attribute of the Schema class. It is intended to hold a pattern (read: regular expression) which describes the format of a string type.
Okay, so we need to define some (or rather exactly two) regular expressions. But hey wait, we already have one for our language code! :-)

Well, using some shapeless12 magic we might use an implicit Witness which should be provided by our refined type.

Idea to reuse regular expression from refined type
1 def extractRegEx[S <: String](implicit ws: Witness.Aux[S]): String =
2   ws.value

The code above is a rough idea so don’t count on it. We’ll see later on if I was right or did suffer from the hallucination of actually understanding what I am doing. ;-)
In the hope that this is settled we need one additional regular expression for a UUID. These are defined in RFC-412213 and ignoring the special edge case of a “NIL UUID” we come up with the following solution.

Regular expression for matching UUIDs
1 ^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[089ab][0-9a-f]{3}-[0-9a-f]{12}$
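A quick sanity check of the expression in the REPL (a sketch; java.util.UUID.randomUUID generates lowercase version 4 UUIDs, so they should match):

val uuidPattern =
  "^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[089ab][0-9a-f]{3}-[0-9a-f]{12}$".r
// Should print true for any freshly generated version 4 UUID.
println(uuidPattern.pattern.matcher(java.util.UUID.randomUUID().toString).matches())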

According to the OpenAPI specification we are nailed down to Javascript regular expressions (read the ECMA 262 regular expression dialect14). Oh why cruel fate? Well, let us deal with that when we have our other puzzle pieces in order.

Before we dive right in let’s play around a bit to get used to this fancy optics thing.

1 val docs: OpenAPI = ???
2 val paths: Lens[OpenAPI, ListMap[String, PathItem]] =
3   GenLens[OpenAPI](_.paths)
4 val test = (paths composeLens at("/product/{id}")).get(docs)

We defined our first Lens via the GenLens macro. It is supposed to give us the path definitions from an OpenAPI object. In the last line we use the compose functionality to query a specific item from the paths, which we address via a string because it is a ListMap with string keys.

1 [error] ...scala: ambiguous implicit values:
2 [error]  both method atMap in object At of type 
3            [K, V]=> monocle.function.At[Map[K,V],K,Option[V]]
4 [error]  and method atSet in object At of type 
5            [A]=> monocle.function.At[Set[A],A,Boolean]
6 [error]  match expected type monocle.function.At[S,String,A]
7 [error]     val x  = (paths composeLens at("/product/{id}")).get(d)
8 [error]                                   ^
9 [error] one error found

Oh no, the compiler yells at us! Some investigation leads to the conclusion that there is a type class instance missing for ListMap. So we build one. Luckily for us this is practically identical to the one for Map.

At type class instance for ListMap
1 implicit def atListMap[K, V]: At[ListMap[K, V], K, Option[V]] = At(
2   i => Lens((_: ListMap[K, V]).get(i))(optV => map => 
3     optV.fold(map - i)(v => map + (i -> v))
4   )
5 )

And now our code compiles, hooray! :-)
While we’re at it we also provide an instance for Index on ListMap.

Index type class instance for ListMap
1 implicit def listMapIndex[K, V]: Index[ListMap[K, V], K, V] = Index.fromAt

Okay, back to work. Thinking a bit about our problem we realise that we need a couple of lenses which we can compose to modify the needed parts of the documentation structure.

Generating lenses
 1 val paths: Lens[OpenAPI, ListMap[String, PathItem]] =
 2   GenLens[OpenAPI](_.paths)
 3 val getOps: Lens[PathItem, Option[Operation]] =
 4   GenLens[PathItem](_.get)
 5 val putOps: Lens[PathItem, Option[Operation]] =
 6   GenLens[PathItem](_.put)
 7 val operationParams: Lens[Operation, List[OpenAPI.ReferenceOr[Parameter]]] =
 8   GenLens[Operation](_.parameters)
 9 val pathParams: Lens[PathItem, List[OpenAPI.ReferenceOr[Parameter]]] =
10   GenLens[PathItem](_.parameters)
11 val parameterSchema: Lens[Parameter, OpenAPI.ReferenceOr[Schema]] =
12   GenLens[Parameter](_.schema)
13 val schemaPattern: Lens[Schema, Option[String]] =
14   GenLens[Schema](_.pattern)

Well this is quite a lot but let’s break it apart piece by piece. In general we use the GenLens macro of Monocle to create the lenses for us.
First we create a lens which returns the defined path items which is a ListMap. We later use the at function of the type class to grab a concrete entry from it. This “concrete” entry will be a PathItem that contains more information. Next are some lenses which will return an Operation from the aforementioned PathItem. Depending on the type of the operation (GET, POST, etc.) we return the appropriate entry.
Now we need to grab the parameters used for the endpoints which can either be collected directly from a PathItem or an Operation. These are both lists of the type Parameter. Okay, I’m lying straight to your face here. In fact they are ReferenceOr[Parameter] which means an Either[Reference, Parameter]. But in our use case we only have them as parameters so we’ll ignore this for now.
Last but not least we need to grab the Schema of a parameter and from that one the pattern field. The schema describing the parameter is also an Either[Reference, Schema] which we will ignore too in this case. Now we can play around with our lenses and various combinations.

Some examples
 1 // Delete the entry at "/product/{id}"
 2 (paths composeLens at("/product/{id}")).set(None)(docs)
 3 // Replace the path parameters with an empty list at "/product/{id}"
 4 (paths composeLens at("/product/{id}") composeOptional possible 
 5   composeLens pathParams)
 6     .set(List.empty)(docs)
 7 // Traverse through all schemas in all path parameters at "/product/{id}"
 8 (paths composeLens at("/product/{id}") composeOptional possible 
 9   composeLens pathParams composeTraversal each composeOptional 
10   possible composeLens parameterSchema)
11     .getAll(docs)
12 // Set the pattern field in all schemas in all path parameters
13 // at "/product/{id}"
14 (paths composeLens at("/product/{id}") composeOptional possible 
15   composeLens pathParams composeTraversal each composeOptional 
16   possible composeLens parameterSchema composeOptional possible 
17   composeLens schemaPattern)
18     .set(Option("Optics are soo cool!"))(docs)

This can be confusing at first look but it is actually very powerful and clean for modifying deeply nested structures. The current API is a bit verbose but there is hope.

So what we can take away from this is that we can update our pattern field within all parameters using the following code. Before you ask: We can update all parameters because we have only one. ;-)

Update the pattern attribute in the API documentation
1 (paths composeLens at("/product/{id}") composeOptional possible 
2   composeLens pathParams composeTraversal each composeOptional 
3   possible composeLens parameterSchema composeOptional possible 
4   composeLens schemaPattern)
5     .set(Option("Fancy UUID regex here!"))(docs)

Seems we can (as good as) check off the first point on our list, leaving us with modifying the pattern field of the actual model schemas. To get to these we have to define some more lenses.

Additional lenses for content schemas
1 val components: Lens[OpenAPI, Option[Components]] =
2   GenLens[OpenAPI](_.components)
3 // type Lens[Components, ListMap[String, OpenAPI.ReferenceOr[Schema]]]
4 val componentsSchemas =
5   GenLens[Components](_.schemas)
6 // type Lens[Schema, ListMap[String, OpenAPI.ReferenceOr[Schema]]]
7 val schemaProperties =
8   GenLens[Schema](_.properties)

The power of lenses allows us to traverse all of our structures and modify all affected models at once. So we will use the functionality provided by the Each type class here. Let’s try something like the following code.

Traverse each path item
1 (paths composeTraversal each composeLens getOps).getAll(docs)

Looks good so far but it results in a compiler error.

1 [error] ...: diverging implicit expansion for type cats.kernel.Order[A]
2 [error] starting with method catsKernelStdOrderForSortedSet in trait 
3         LowPrioritySortedSetInstancesBinCompat1
4 [error]       (paths composeTraversal each composeLens getOps).getAll(docs)
5 [error]                               ^

This looks like an error from a binary incompatible cats version. But hey, this time the compiler is lying to us. Maybe you have guessed it already: we’re missing another type class instance. This time the one for Each which we need to traverse our ListMap structure. So let’s write one! :-)

Each type class instance for ListMap
 1 implicit def listMapTraversal[K, V]: Traversal[ListMap[K, V], V] = 
 2   new Traversal[ListMap[K, V], V] {
 3     def modifyF[F[_]: Applicative](f: V => F[V])(s: ListMap[K, V]): 
 4       F[ListMap[K, V]] =
 5         s.foldLeft(Applicative[F].pure(ListMap.empty[K, V])) {
 6           case (acc, (k, v)) =>
 7             Applicative[F].map2(f(v), acc)((head, tail) => 
 8               tail + (k -> head))
 9       }
10   }
11 
12 implicit def listMapEach[K, V]: Each[ListMap[K, V], V] =
13   Each(listMapTraversal)

As noted above these things will hopefully be in the next Monocle release together with a much nicer API. :-D
The information we need to modify is within the components field of the generated documentation. While we could traverse all the other structures (paths, operations), these only hold references which are of no use to us. So let’s update our Product model.

Update Product model description via lenses
1 (components composeOptional possible composeLens componentsSchemas 
2  composeLens at("Product") composeOptional possible composeOptional 
3  possible composeLens schemaProperties composeLens at("id") 
4  composeOptional possible composeOptional possible composeLens 
5  schemaPattern)
6   .set(Option("Our UUID regex here!"))(docs)

This is quite a lot but we actually only instruct our optics how to traverse down the structure. In general we need to take care of our possible field types here. Due to the nature of the generated structure having a lot of Option and Either fields we need way more boilerplate here. But as mentioned it is really not that complicated. We compose our lenses via composeLens but may need things like composeOptional possible to compose on a defined Option[T], which we sometimes have to duplicate when there are nested occurrences of these. The same instructions can be used to zoom in on the right side of an Either.

Update Translation model description via lenses
1 (components composeOptional possible composeLens componentsSchemas 
2  composeLens at("Translation") composeOptional possible composeOptional 
3  possible composeLens schemaProperties composeLens at("lang") 
4  composeOptional possible composeOptional possible composeLens 
5  schemaPattern)
6   .set(Option("Our language code regex here!"))(docs)

The modification for the Translation model looks quite the same, so we could make a function out of it which takes some parameters and returns the updated structure, as sketched below.
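A sketch of such a helper (a hypothetical setPropertyPattern), simply parameterising the optic chain we just used with the model name, the field name and the regular expression to set:

def setPropertyPattern(model: String, field: String, regex: String)(
    docs: OpenAPI
): OpenAPI =
  (components composeOptional possible composeLens componentsSchemas
    composeLens at(model) composeOptional possible composeOptional
    possible composeLens schemaProperties composeLens at(field)
    composeOptional possible composeOptional possible composeLens
    schemaPattern)
    .set(Option(regex))(docs)

// Usage (sketch):
// setPropertyPattern("Translation", "lang", "Our language code regex here!")(docs)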

Oh yeah, right, we thought about extracting the regular expression directly from the refined type to avoid code duplication. So let’s try to use the function defined earlier like this: val langRegex = extractRegEx[LanguageCode].

1 [error] : type arguments [com.wegtam.books.pfhais.tapir.models.LanguageCode] 
2   do not conform to method extractRegEx's type parameter bounds [S <: String]
3 [error]     val langRegex = extractRegEx[LanguageCode]
4 [error]                                 ^

Okay, seems like I really don’t understand what I’m doing. ;-) Luckily for us the Scala community has nice, smart and helpful people in it. The solution is to write a type class which will support us in extracting the desired parameter.

Type class to extract the regular expression
 1 import eu.timepit.refined.api._
 2 import eu.timepit.refined.string._
 3 import shapeless.Witness
 4 
 5 trait RefinedExtract[T] {
 6   def regex: String
 7 }
 8 
 9 object RefinedExtract {
10   implicit def instance[T, S <: String](
11       implicit ev: String Refined MatchesRegex[S] =:= T,
12       ws: Witness.Aux[S]
13   ): RefinedExtract[T] = new RefinedExtract[T] { val regex = ws.value }
14 }

This allows us to have the desired effect in just this small piece of code.

Extract and transform regular expression from LanguageCode
1 val typeRegex = implicitly[RefinedExtract[LanguageCode]].regex
2 // convert to Javascript regular expression
3 val langRegex = "/" + typeRegex + "/"

Now if we take a look at our API documentation it looks better, although we can see that the URL parameter pattern information is not used. But as mentioned before there is a lot of work going on on the tapir side currently, so this will get fixed as well.

However don’t forget about optics because their usage goes far beyond what we have done here.

Moving to Scala 3…

When I wrote this book, Scala 2.12 was the thing and 2.13 was not yet released. While I did enable cross-building for 2.13 after it was released, the next big version was still far away. But then it happened and Scala 3 was released. However it took some time until libraries and frameworks moved to support it. As I am writing this, some still have not added support for it because in some areas this involves a significant amount of work.

However, I always wanted to do an update to Scala 3 and while I might still find time to write a complete book about that, I chose not to wait longer although some libraries I would like to use are still not ported. Therefore the topic of this chapter will be to update our small service (the tapir version) to Scala 3 while dropping some not yet ported libraries.

But instead of jumping right into the middle of it we might be better off looking at our options and planning our migration accordingly.

The first step should be to switch to Scala 2.13 as major version and update all dependencies to their latest versions. This alone will be quite some work but it will ease the migration to Scala 3 for which we will try to use some tools which are available.

Since version 1.5 the sbt build tool supports Scala 3 directly so there is no more need to add the sbt-dotty plugin to your build. Additionally it supports a new syntax for dependencies which will allow us to use 2.13 libraries in 3 and vice versa.

Using Scala 2.13 libraries in Scala 3 projects
1 libraryDependencies +=
2   ("my.domain" %% "my-lib" % "x.y.z").cross(CrossVersion.for3Use2_13)

The example above instructs sbt to use a Scala 2.13 library for Scala 3. If you want to do the opposite then you have to use CrossVersion.for2_13Use3 instead which will make sbt use a Scala 3 library for Scala 2.13.
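For completeness, the opposite direction written out, reusing the placeholder coordinates from the snippet above:

// Use a library published for Scala 3 from a Scala 2.13 build.
libraryDependencies +=
  ("my.domain" %% "my-lib" % "x.y.z").cross(CrossVersion.for2_13Use3)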

Furthermore there is the Scala-3-Migrate plugin for sbt which supports you on a variety of topics when migrating a project to Scala 3.

So the second step would be to use the Scala-3-Migrate plugin to guide our migration to Scala 3. During this phase we will see what can be kept, what can be used with some restrictions and what has to be dropped.

Step 1: Updating to 2.13.x

The currently recommended version to start a migration from is 2.13.7 so we will target this Scala version for updating our project. In the source code you can see that I simply copied the tapir folder of our project and named it tapir-scala-3 to not mess with our existing code.

First steps include updating sbt to a recent version as well as updating the sbt-plugins that we are using to their latest versions. Also some changes are made in regard to the compiler plugins. The kind-projector plugin needs a different way to be specified (see cross CrossVersion.full in build.sbt) and the monadic-for plugin stays for now but will have to be removed when we’re on Scala 3. And while we are at it the migration plugin to support us is added as well:

Add the scala3-migrate plugin to sbt
1 addSbtPlugin("ch.epfl.scala" % "sbt-scala3-migrate" % "0.5.0")

Now we switch the default Scala version to 2.13.7 and try to compile the project. We run into some missing dependency errors which force our hand into upgrading several dependencies. In addition we stumble upon the fact that the compiler flag -Xlint:nullary-override has been dropped, so we remove it or comment it out.

Furthermore to reduce the clutter in our build file we remove support for Scala 2.12 and the related compiler options. In the case that you have to support older versions of Scala (cross compilation) things get more complicated. In our case we can move completely to Scala 3. :-)

Details and some compiling issues

So what was done until now?

  1. include kind-projector plugin via CrossVersion.full
  2. switch to Scala 2.13.7 as default version
  3. remove Scala 2.12 and related settings
  4. update doobie to 0.8.8
  5. update http4s to 0.21.31
  6. update tapir to 0.11.11
  7. update circe to 0.14.1
  8. remove dropped compiler flags (for 2.13!)
  9. disable Xfatal-warnings

So far, compiling our main code nags us to fix several issues around auto-application, i.e. calling methods that are defined with an empty parameter list without writing out the parentheses. The main culprit here is unsafeRunSync, which has to become unsafeRunSync(). Also some unused variable issues pop up and are fixed easily too.
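
A minimal sketch of the kind of change involved, assuming cats-effect 2 where unsafeRunSync is available directly on IO; the value program is just an example.

Explicit empty argument lists under Scala 2.13
1 import cats.effect.IO
2 
3 val program: IO[Int] = IO.pure(42)
4 // Scala 2.13 warns about auto-application, so the empty parameter
5 // list has to be written out:
6 val result: Int = program.unsafeRunSync()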

Now onwards to compiling the tests and we have some more issues. So far the integration tests compile fine but the unit tests spill out an error:

Compiler errors in the tests
 1 [error] .../TestRepository.scala:26:51: trait Seq takes type parameters
 2 [error]         val ns = p.names.toNonEmptyList.toList.to[Seq]
 3 [error]                                                   ^
 4 [error] .../TestRepository.scala:26:50: missing argument list for method to 
 5           in trait IterableOnceOps
 6 [error] Unapplied methods are only converted to functions when a function type
 7           is expected.
 8 [error] You can make this conversion explicit by writing `to _` or `to(_)`
 9           instead of `to`.
10 [error]         val ns = p.names.toNonEmptyList.toList.to[Seq]
11 [error]                                                  ^
12 [error] .../TestRepository.scala:32:49: trait Seq takes type parameters
13 [error]       val ns = p.names.toNonEmptyList.toList.to[Seq]
14 [error]                                                 ^
15 [error] .../TestRepository.scala:32:48: missing argument list for method to
16           in trait IterableOnceOps
17 [error] Unapplied methods are only converted to functions when a function type
18           is expected.
19 [error] You can make this conversion explicit by writing `to _` or `to(_)`
20           instead of `to`.
21 [error]       val ns = p.names.toNonEmptyList.toList.to[Seq]
22 [error]                                                ^
23 [error] four errors found

This looks big at first but let us stay calm and read the error messages. So, “trait Seq takes type parameters”, eh? The second one says something about “unapplied methods” but isn’t exactly helpful either.

Well, we fire up a REPL of course and as we are (or should be) in sbt we can simply use the console command. The sbt console will only work if we fixed all our compilation errors in the main code.

Check for possible function for converting List
1 scala> List(1, 2, 3).to<TAB>
2 // This should show a list of possible functions.

So it seems our plain old .to[T] is gone. While there is still a .to(...) function, it now requires a collection factory. So what about .toSeq? We did not use it in the past because it did not guarantee an immutable sequence. But what about now?

Test List conversion in the REPL
 1 scala> val a: scala.collection.immutable.Seq[Int] = List(1, 2, 3).toSeq
 2 val a: Seq[Int] = List(1, 2, 3)
 3 scala> a.getClass
 4 val res0: Class[_ <: Seq[Int]] = class scala.collection.immutable.$colon$colon
 5 
 6 scala> a.getClass.getCanonicalName
 7 val res1: String = scala.collection.immutable.$colon$colon
 8 
 9 scala> a.getClass.getName
10 val res2: String = scala.collection.immutable.$colon$colon

Well, well, this looks pretty good I’d say, so let’s adjust the code. We quickly get a big type error, but the gist of it is:

Compiler error about invariant types
1 [error] Note: List[...] <: Seq[...], but type F is invariant in type _.
2 [error] You may wish to define _$$1 as +_$$1 instead. (SLS 4.5)
3 [error]         ns.map(n => (p.id, n.lang, n.name)).pure[F]
4 [error]                                                 ^
5 [error] one error found

Good news first: The original error is gone and we can even simplify the code around the second error source by removing the toSeq completely. But the remaining one is heavier. So let’s take a step back and take a deep breath. If we take a look at our function signature we can see that it requires a Seq, but what if we simply change it to List?

So let us try it and see how far we get. First we have to change the type signature of the function loadProduct in Repository to have a List instead of a Seq in its return type. Afterwards the compiler will tell us exactly in which places we have to make changes. Furthermore we can also remove some imports (scala.collection.immutable.Seq) which are no longer needed.
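
As a hedged sketch, the change boils down to something like the following; the trait and row type here are simplified, hypothetical stand-ins, our real Repository of course uses the refined model types.

Sketch of the changed return type for loadProduct
1 import java.util.UUID
2 
3 // Simplified, hypothetical repository signature:
4 trait SimplifiedRepository[F[_]] {
5   // before: def loadProduct(id: UUID): F[Seq[(UUID, String, String)]]
6   def loadProduct(id: UUID): F[List[(UUID, String, String)]]
7 }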

Okay, onwards to… Did you run the tests? ;-)

While executing the tests we discover that some unit tests are failing, while the integration tests look good. Additionally I get a warning that the Flyway library should be updated, but we save this for later. Let us take a look at our failing tests first. We can see that they error out because of an exception:

The dreaded NoSuchMethodError after a dependency upgrade
1 java.lang.NoSuchMethodError: 'cats.data.Kleisli
2   org.http4s.HttpRoutes$.apply(scala.Function1, cats.effect.Sync)'

This does not look good, and it is not caught by the compiler. We should start our service to see if it happens there too. After an sbt run we can see that it affects our main code as well.

Nice we just broke our service. Welcome to the world of software development! :-)

Most likely we are in trouble due to updating dependencies and running into some binary incompatibility issues. The error message indicates that it might either be cats or http4s related. To get some more insights we should issue the sbt evicted command and take a look at the output. We find some messages about replaced versions.

Output snippets from the sbt evicted command
1 * org.http4s:http4s-dsl_2.13:0.21.31 is selected over 0.21.0-M5
2 ...
3 * org.typelevel:cats-effect_2.13:2.5.1 is selected over {2.0.0, ...}
4 ...
5 * org.typelevel:cats-core_2.13:2.6.1 is selected over {2.0.0, ...}
6 ...
7 * org.http4s:http4s-blaze-server_2.13:0.21.31 is selected over 0.21.0-M5
8 ...

Now we need to perform some investigations regarding the libraries, which means digging into changelog entries, release notes and bug reports which might support our suspicion that something was broken. The cats part of the equation looks fine, but there were some changes in the http4s library which might be the cause of our problem here. As the older version (0.21.0-M5) is a pre-release this is something that is totally valid and should always be on our radar. The older version is a dependency of tapir, so that means we have to upgrade tapir as well, which means “More breaking changes, yeah!” ;-)

But before we tackle this problem we might as well quickly update the Flyway library to get rid of the warning in our tests. Brave as we are, we jump to the most recent release and update the PostgreSQL driver as well. But what is this?

Type error after Flyway upgrade
1 [error] .../FlywayDatabaseMigrator.scala:35:21: type mismatch;
2 [error]  found   : org.flywaydb.core.api.output.MigrateResult
3 [error]  required: Int
4 [error]       flyway.migrate()
5 [error]                     ^
6 [info] org.flywaydb.core.api.output.MigrateResult <: Int?
7 [info] false
8 [error] one error found

You didn’t expect this to be easy, did you? ;-) But this doesn’t look like a big issue. The return type of the migrate function was changed upstream, and the only decision we have to make is whether we change our function’s return type accordingly and simply pass the information onwards, or whether we change the function a bit and still only return the number of applied migrations. I pick the lazy route this time and simply append .migrationsExecuted to the call to .migrate(), and we’re done with it.
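
A minimal sketch of that lazy route, assuming we keep the Int-returning shape of our migrator:

Keeping an Int result with the new Flyway API
1 import org.flywaydb.core.Flyway
2 
3 // Sketch: read the number of executed migrations from the MigrateResult.
4 def migrate(url: String, user: String, pass: String): Int = {
5   val flyway = Flyway.configure().dataSource(url, user, pass).load()
6   flyway.migrate().migrationsExecuted
7 }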

Now onwards to our tapir update. Before we simply upgrade to the latest version we should give it some more thought. Tapir is still in a heavy development phase and might depend on pre-release versions of other libraries again, so we had better look some things up. The file project/Versions.scala within the tapir source repository gives us the needed insights. If we do not want to upgrade http4s even higher then it seems we will have to pick a tapir 0.17.x release. Such a jump will likely include lots of breaking changes, so another option would be to pick the lowest possible tapir release with a compatible http4s dependency.

We can either upgrade to the highest tapir version with a still compatible http4s dependency. Or we try to do the “minimum viable upgrade” and pick the lowest possible tapir version with a compatible http4s dependency to reduce our changes to a minimum. Last but not least we have the option to upgrade to the latest tapir version and upgrade all other dependencies as well.

The last option might be tempting but it will force us to upgrade not only http4s but other dependencies as well and we will likely head straight into “upgrade dependency hell” and may not even succeed.

Of our other options we can pick either version 0.17.20 or something from the 0.12.x line of the tapir releases. Please note that the artefact organisation name for tapir has changed! If you simply change the version number you will get unresolved dependency errors.

Upgrading software is not for the faint hearted so let’s be brave and try to update to 0.17.20. The line between bravery and stupidity is a bit hazy but we’ll see how we do. :-)

The first thing we stumble upon is of course a ton of errors because the namespace for tapir changed. Because changing it is simple but tedious it screams for automation and therefore we’ll use a shell script1.

Shell script to fix import namespace errors
1 % for i in `find tapir-scala-3 -name "*.scala"`; do
2 %   sed -i '' -e s/'import tapir'/'import sttp.tapir'/g $i
3 % done

This script is specific to the sed version used in the BSD operating systems! (GNU sed users would write -i without the empty string argument.) It is a simple loop being fed from a find command and uses sed to perform a search and replace operation directly in the file. The -i '' parameter ensures that no backup is saved (we are under version control anyway).

Okay, after fixing that and the move of StatusCode and StatusCodes from tapir to sttp we still get a lot of errors which look quite intimidating. Deciding that bravery only gets us so far, we turn to plan B and switch the tapir version to 0.12.28. ;-)

We still get a bunch of errors now, but they are fewer in number and seem mostly related to schema creation and derivation. Also not the easiest topic, but as we get the same errors on 0.17.x plus a load more, we might as well try to fix them. The first guess is that some code has been moved, and indeed it seems that our Schema.SWhatever types are now under SchemaType.SWhatever, so this should be fixed easily. Additionally we need to make small adjustments regarding changed signatures and use Schema(SchemaType.SWhatever) instead of Schema.SWhatever in some places.

Total error count so far
1 ...
2 [warn] two warnings found
3 [error] 6 errors found

Nice! We are down to a single digit number of errors. It looks like I didn’t fix the StatusCodes issue correctly so after some changes we are down to one error:

Last remaining compiler error after the tapir upgrade
1 [error] .../ProductsRoutes.scala:117:54: not found: value tapir
2 [error] streamBody[Stream[F, Byte]](schemaFor[Byte], tapir.MediaType.Json())
3 [error]                                              ^
4 [error] one error found

After digging a bit through the tapir code we can see that we simply have to pass a CodecFormat.Json() here now. Hooray, it compiles! But before we become too confident, let us run some tests.

All our tests passed
1 [info] All tests passed.

This is good news and furthermore starting our service via sbt run looks good also. :-)

Now we could move on to the next step or we might try updating some more dependencies. For starters we remove the wartremover plugin because it isn’t available for Scala 3 anyway. Besides the plugin we must remove the settings in the build.sbt and the annotations within the code (see the @SuppressWarnings annotations). As a bonus we get rid of some warnings about Any type inference which are false positives anyway. Next is the move to the Ember server for http4s from the Blaze one because Ember is the new default and recommended one. For this our main entry point in Tapir.scala has to be adjusted a bit.

First we change from IOApp to IOApp.WithContext and implement the executionContextResource function. In addition we adjust our blocking thread pool to use half of the available processors, but at least 2 threads.

Adjustments in the main entry point
 1 object Tapir extends IOApp.WithContext {
 2   val availableProcessors: Int =
 3     Runtime.getRuntime().availableProcessors() / 2
 4   val blockingCores: Int =
 5     if (availableProcessors < 2) 2 else availableProcessors
 6   val blockingPool: ExecutorService =
 7     Executors.newFixedThreadPool(blockingCores)
 8   val ec: ExecutionContext =
 9     ExecutionContext.global
10 
11   override protected def executionContextResource: 
12     Resource[SyncIO, ExecutionContext] = Resource.eval(SyncIO(ec))
13 
14   def run(args: List[String]): IO[ExitCode] = {
15     val blocker = Blocker.liftExecutorService(blockingPool)
16     val migrator: DatabaseMigrator[IO] = new FlywayDatabaseMigrator
17     // ...
18       resource = EmberServerBuilder
19         .default[IO]
20         .withBlocker(blocker)
21         .withHost(apiConfig.host)
22         .withPort(apiConfig.port)
23         .withHttpApp(httpApp)
24         .build
25       fiber = resource.use(_ => IO(StdIn.readLine())).as(ExitCode.Success)
26     // ...

The update of the refined library requires us to update pureconfig as well in one step but it just works after increasing the version numbers. The same can be said about logback, cats and kittens. For the latter we make some small adjustment to get rid of a deprecation warning.

Some more changes are required for updating ScalaTest and ScalaCheck, but they boil down to changing some imports and names (e.g. Matchers instead of MustMatchers) and including the ScalaTestPlus library, which now acts as a bridge to ScalaCheck.
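
As a rough sketch, the changed imports look something like this; the exact traits depend on the spec style in use.

Adjusted test imports for ScalaTest 3.x and ScalaTestPlus
1 // MustMatchers became Matchers under a new package:
2 import org.scalatest.matchers.must.Matchers
3 import org.scalatest.wordspec.AnyWordSpec
4 // The ScalaCheck integration now lives in the ScalaTestPlus bridge:
5 import org.scalatestplus.scalacheck.ScalaCheckPropertyChecks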

The things left to update look like they might be a bit more involved:

  1. Doobie (database layer)
  2. http4s (might not be that difficult as we switched to Ember already)
  3. Monocle (version 3.x brings huge improvements but will require many changes)
  4. tapir (contains breaking changes and might introduce more dependency trouble)

As mentioned before we shouldn’t simply dive in but check what is really needed. To gather the necessary information we move on to the next step.

Step 2: Migrating to Scala 3

We have prepared our battle ground and already included the sbt plugin, so we can just issue the migrate-libs tapir command to get some output. This is quite a lot, so let’s concentrate on the important parts. First there is some explanation at the top:

Possible status flags for dependencies for Scala 3 migration
1 [info] X             : Cannot be updated to scala 3
2 [info] Valid         : Already a valid version for Scala 3
3 [info] To be updated : Need to be updated to the following version

We should not see the X mark (usually red in the terminal) but here we are and I count two of them. So what do we have?

  1. The better-monadic-for plugin.
  2. The pureconfig library.

The first one is no problem because we can simply drop it; the underlying problem is supposed to be solved in Scala 3. But what about pureconfig? Well, let’s worry about that later and process the output further. We have quite a few Valid marks, which is great! Several others have notes on them, so onwards to take a closer look.

The following dependencies are supposed to work with CrossVersion.for3Use2_13:

  1. Monocle
  2. Refined

Last but not least some dependencies need to be updated further to support Scala 3:

  1. Doobie
  2. http4s
  3. kittens
  4. tapir

So it looks like we won’t get away without doing major upgrades anyway. While we’re at it we might as well add Monocle to our upgrade list because it looks like it will be quite some work either way.

The attentive reader will have noted that the recommended dependency updates have pre-release version numbers, and she’ll ask if we really should upgrade them or wait until proper releases have been published. And yes, she is right: Thou shalt not use pre-release software in production!

For our example here however we do it for demonstrating the upgrade process. In production I would advise you to wait or maybe upgrade and test in a separate environment.

So, tapir has dependencies on http4s and also cats-effect, therefore it will surely influence http4s and also doobie, which also uses cats-effect. So the first candidate should be kittens because it doesn’t affect the other dependencies. The next one will be Monocle because, although maybe not necessary, it also doesn’t mess up the other dependencies. While updating kittens is done by simply increasing the version number, the Monocle part will likely be more involved. After increasing the version number for Monocle, also changing the artefact group and removing the laws package, we are greeted by a number of deprecation warnings and one error upon compilation. This doesn’t look too bad, so maybe we are lucky after all, are we?

For the deprecations there is an open issue for providing Scalafix rules for automatic code rewrites, but it is not yet done2, therefore we have to do it ourselves. But in the issue we find a nice list of deprecated methods and their replacements! As for the error message:

Compiler error related to Monocle
1 [error] .../Tapir.scala:117:11: object creation impossible.
2         Missing implementation for:
3 [error]   def modifyA[F[_]](f: V => F[V])
4             (s: scala.collection.immutable.ListMap[K,V])
5             (implicit evidence$1: cats.Applicative[F]):
6             F[scala.collection.immutable.ListMap[K,V]]
7             // inherited from trait PTraversal
8 [error]       new Traversal[ListMap[K, V], V] {
9 [error]           ^

This might look intimidating but actually it is just complaining about a missing implementation so we will have to adjust or rewrite the one we are providing. But wait! Didn’t we provide patches to Monocle for the missing instances for ListMap? Yes we did! So how about removing our custom instances?

Monocle error is gone and only warnings remain
1 [warn] 62 warnings found
2 [success] ...

Nice! Always remember: It pays off to provide your custom extensions and patches upstream!

However, now we get a lot of errors if we try to compile our tests. So we need to investigate. But before we do that let’s fix all these deprecation warnings to get our code clean. Some things are pretty trivial, but if we remove the deprecated possible, which has no replacement, then our code no longer compiles. We could just ignore it because it is only a deprecation warning, but it will definitely come back to bite us later. However, we ignore it for now and look at the weird compile error in the tests.

Weird error after upgrading another dependency
1 [error] ... object scalatestplus is not a member of package org

This is strange, not least because Monocle has no apparent connection to our testing libraries. But doing our research we find an issue in the bugtracker of scalatestplus3, and applying the workaround from there (manually including a dependency on discipline-scalatest, sketched below) solves our issue. Hooray! But to be honest: I have no idea what is going on behind the scenes here. Likely some dependency issue which cannot be resolved and is silently dropped. While we’re at it we simply upgrade our Scala version to 2.13.8.
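
A minimal build.sbt sketch of that workaround; the version number is illustrative and should be checked against the issue.

Manually adding discipline-scalatest as a workaround
1 // Workaround from the scalatestplus issue tracker: add the dependency
2 // explicitly (version number is illustrative).
3 libraryDependencies += "org.typelevel" %% "discipline-scalatest" % "2.1.5" % Test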

Ignoring the remaining deprecation warnings our tests are running fine, and we take another look at the output of migrate-libs tapir within sbt. It seems we have to upgrade to at least tapir 0.18.x. The current stable version is 0.19.x, and we can see that it depends on http4s 0.23.x, which in turn depends on cats-effect 3. Being a major rewrite, version 3 of cats-effect will clash with our doobie version, so we will have to switch to doobie’s current pre-release version. But at least it is close to being released. :-)

Because the dependencies are so tightly interwoven we have no choice but to update them all in one step. We won’t have compiling code either way and will likely get misleading error messages if we do them step by step. So let’s increase some version numbers, take a deep breath and parse some compiler errors. To summarise: We update doobie to 1.0.0-RC2, http4s to 0.23.10 and tapir to 0.19.4. Additionally we have to adjust the tapir swagger ui package because some packaging changed.

Total error count in the main code after major upgrades
1 [warn] 5 warnings found
2 [error] 33 errors found
3 [error] (Compile / compileIncremental) Compilation failed

Okay, that doesn’t look too bad. Remember, we fixed similar numbers already. But where to start?

One of the libraries in the background that all others are using is cats-effect so maybe we should start with that one. Reading the migration guide we realise that there is a Scalafix migration which we could use for automatic conversion of our code. But it says: “Remember to run it before making any changes to your dependencies’ versions.” ;-)

So then let us roll back our versions and take a stab at the migration to cats-effect 3 via Scalafix. The guide to manually applying the migration is straightforward, however the results are a bit underwhelming as nearly nothing is changed. But the migration guide has lots of additional information about changed type hierarchies and so on, so we are not left in the dark. Therefore we re-apply our version upgrades and go on fixing the compilation errors. For convenience we start at our main entry point, which is in Tapir.scala.

Concentrating on cats-effect first, we need to change our IOApp.WithContext into a simple IOApp (basically reverting our changes from a few pages back) and can remove some code which is no longer needed. Afterwards some errors are fixed, and the ones that still show up seem to be related to tapir and http4s. On the http4s side it seems that we now need a proper Host class instead of our non-empty string. So one option would be to do something like this:

Possible solution for the configuration type problem
1 host <- IO(
2   com.comcast.ip4s.Host
3     .fromString(apiConfig.host)
4     .getOrElse(throw new RuntimeException("Invalid hostname!"))
5 )

It will work, but why did we introduce properly typed configuration then? On the other hand we might have to drop pureconfig because the migrate plugin told us that there is no version of it for Scala 3 yet. However, looking at the repository and bugtracker4 we can see that basic Scala 3 support is supposed to be there. So let’s try to do it the proper way first!

While we’re at it we realise that we also need a Port type instead of our custom PortNumber, and of course pureconfig needs to be provided with type class instances which can read these types.

A cleaner solution for the configuration type problem
 1 import com.comcast.ip4s.{ Host, Port }
 2 import pureconfig._
 3 import pureconfig.generic.semiauto._
 4 
 5 final case class ApiConfig(host: Host, port: Port)
 6 
 7 object ApiConfig {
 8   implicit val hostReader: ConfigReader[Host] =
 9     ConfigReader.fromStringOpt[Host](Host.fromString)
10   implicit val portReader: ConfigReader[Port] =
11     ConfigReader.fromStringOpt[Port](Port.fromString)
12 
13   implicit val configReader: ConfigReader[ApiConfig] =
14     deriveReader[ApiConfig]
15 }

This is our new ApiConfig class (comments removed from the snippet) and it looks like it works because we have even less compiler errors now. :-)

However there is a new one now for the last part of our for comprehension returning the fiber:

Cats effect related type error in the main entry point
1 found   : cats.effect.IO[cats.effect.ExitCode]
2 required: cats.effect.ExitCode

This is fixed easily though by just changing the fiber = ... to fiber <- ... within the for comprehension. After that we take a look at the errors we get from tapir. They are related to the swagger UI and API documentation stuff. Referring to the tapir documentation the changes are quite simple: we just change an import and the way we construct our documentation structure.

Fix the swagger UI problems in the code
 1 import sttp.tapir.swagger.SwaggerUI
 2 //...
 3 docs = OpenAPIDocsInterpreter().toOpenAPI(
 4   List(
 5     ProductRoutes.getProduct,
 6     ProductRoutes.updateProduct,
 7     ProductsRoutes.getProducts,
 8     ProductsRoutes.createProduct
 9   ),
10   "Pure Tapir API",
11   "1.0.0"
12 )
13 updatedDocs = updateDocumentation(docs)
14 docsRoutes  = Http4sServerInterpreter[IO]()
15                 .toRoutes(SwaggerUI[IO](updatedDocs.toYaml))
16 //...
17 httpApp     = Router("/" -> routes, "/docs" -> docsRoutes).orNotFound

Another step done, nice! Let’s enjoy the moment and move on to the other errors, which are in the optics part of the file where we define lenses on the OpenAPI structure of tapir, which seems to have changed quite a lot. For starters there is the ReferenceOr structure, which simply moved to another place, so we can just add an import and reference it directly instead of OpenAPI.ReferenceOr, and some errors are gone. Others are about the internal structure, for example we now get a Paths type instead of a ListMap on some attributes. But before we dive too deep into this one we might as well think about refactoring our optics part a bit more, because so far we basically kept our old approach and only changed it enough to compile with the latest Monocle library. But what about actually utilising the shiny new features? ;-)

But let’s save this for later because the really nice features are Scala 3 only. So we just stub our function out and make it simply return the parameter it receives to make it compile again.

Workaround for the needed Monocle adjustments
1 private def updateDocumentation(docs: OpenAPI): OpenAPI = docs

Some of the errors left are related to tapir schemas so let’s try them first because they are few and directly related to our data models. Instead of specifying everything manually we try the semi-automatic derivation this time. We soon realise that we still have to specify some instances but the code we need to make it compile looks cleaner than the one before:

Changes to the tapir schema definitions for our models
 1 object Translation {
 2   //...
 3   implicit val schemaForLanguageCode: Schema[LanguageCode] =
 4     Schema.string
 5   implicit val schemaForProductName: Schema[ProductName] =
 6     Schema.string
 7   implicit val schemaFor: Schema[Translation] =
 8     Schema.derived[Translation]
 9 }
10 object Product {
11   //...
12   implicit val schemaForProductId: Schema[ProductId] = Schema.string
13 
14   implicit def schemaForNeS[T](implicit a: Schema[T]):
15     Schema[NonEmptySet[T]] = Schema(SchemaType.SArray(a)(_.toIterable))
16 
17   implicit val schemaFor: Schema[Product] = Schema.derived[Product]
18 }

So far so good. If we really nailed it we will only know later when it might blow up in our faces or not. ;-)

Further on we need to replace the Sync type in our routing classes with Async to fix two more errors. The route creation changed, so we now have to use the Http4sServerInterpreter[F]().toRoutes(...) function to create our routes. It still gives some errors, but let’s look at our endpoint definitions first. The type signature for endpoints changes from [I, E, O, S] to [A, I, E, O, R]. Sorry for the abbreviation overkill here. The details can be looked up in the tapir docs, but the gist is that we have an additional type (“security input”) at the start, and instead of the “streaming type” we now have a “capabilities type” at the end. Because we don’t use the security input type we can set it to Unit, or we could use the PublicEndpoint type alias which is provided by tapir. In the case of streaming endpoints the output type stays as before (Stream[F, Byte]) and the capabilities type at the end becomes Fs2Streams[F], or Any for all non-streaming endpoints.

For our streaming endpoint (getProducts) we get an error about the streamBody specification so we can adjust that one or replace it with the new streamTextBody directive. Both ways should work.

Adjust the stream return type for our tapir endpoint
1 streamTextBody(Fs2Streams[F])(CodecFormat.Json(),
2   Option(StandardCharsets.UTF_8))

We are down to a single digit number of compiler errors for our main code. This doesn’t look bad, so let’s head on. The main issue now seems to stem from the toRoutes functionality.

Type errors about the toRoutes function from tapir
 1 [error] overloaded method toRoutes with alternatives:
 2 [error]   (serverEndpoints: List[sttp.tapir.server.ServerEndpoint[
 3             sttp.capabilities.fs2.Fs2Streams[F],F]])
 4               org.http4s.HttpRoutes[F] <and>
 5 [error]   (se: sttp.tapir.server.ServerEndpoint[
 6             sttp.capabilities.fs2.Fs2Streams[F],F])
 7               org.http4s.HttpRoutes[F]
 8 [error]  cannot be applied to (sttp.tapir.Endpoint[
 9            Unit,ProductId,StatusCode,Product,Any])
10 [error]     Http4sServerInterpreter[F]().toRoutes(...) { id =>
11 [error]                                  ^

For me this looks like the function will only accept streaming endpoints which doesn’t make sense and would have surely been mentioned in the documentation or some release notes of the tapir project. But we take a closer look and we see that it actually expects ServerEndpoint instances here, not Endpoint ones. Or to quote from the documentation:

To interpret a single endpoint, or multiple endpoints as a server, the endpoint descriptions must be coupled with functions which implement the server logic. The shape of these functions must match the types of the inputs and outputs of the endpoint.

So server logic is added to an endpoint via one of the functions starting with serverLogic of course. ;-)

For our purpose we will use the default one (simply serverLogic). Let’s test it out on one route:

Fix the toRoutes errors using the serverLogic functionality
 1 final class ProductRoutes[F[_]: Async] ... {
 2   //...
 3   private val getRoute: HttpRoutes[F] =
 4     Http4sServerInterpreter[F]().toRoutes(ProductRoutes.getProduct
 5       .serverLogic { id =>
 6         for {
 7           rows <- repo.loadProduct(id)
 8           resp = Product
 9             .fromDatabase(rows)
10             .fold(StatusCode.NotFound.asLeft[Product])(_.asRight[StatusCode])
11         } yield resp
12       })
13 
14   private val updateRoute: HttpRoutes[F] =
15     Http4sServerInterpreter[F]().toRoutes(ProductRoutes.updateProduct
16       .serverLogic {
17         case (_, p) =>
18           for {
19             cnt <- repo.updateProduct(p)
20             res = cnt match {
21               case 0 => StatusCode.NotFound.asLeft[Unit]
22               case _ => ().asRight[StatusCode]
23             }
24           } yield res
25       })
26   //...
27 }

And it compiles fine! So we just need to move our logic into the serverLogic function part and we are set. Pretty cool but once we fix it we get another error from our main entry point:

A new error from our main entry point
1 Tapir.scala:42:26: method executionContextResource overrides nothing

However we can simply remove it and are done. Oh wait! We skipped two things: first, there is still the optics implementation left, and second, we want to fix the example problem in one endpoint definition. Besides that we also get a lot of compilation errors for our tests. We turn there first and can very quickly fix our integration tests by removing the no longer used IO.contextShift from our DoobieRepositoryTest and by providing an implicit IORuntime within our BaseSpec class.

It turns out that for our regular tests we can apply the same fix to the BaseSpec there, remove some obsolete code and adjust our imports, because the http4s library now has a type called ProductId which clashes with our own. After changing the effect type in our TestRepository to Async, the only thing left seems to be the ScalaCheck generators for our ApiConfig, but these are also fixed easily (a possible sketch follows).
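
A hedged sketch of how such generators could look for the ip4s types; the concrete values and fallbacks are arbitrary.

Possible ScalaCheck generators for Host and Port
1 import com.comcast.ip4s.{ Host, Port }
2 import org.scalacheck.Gen
3 
4 // Fall back to fixed values if a generated candidate cannot be converted.
5 val genPort: Gen[Port] =
6   Gen.choose(1, 65535).map(i => Port.fromInt(i).getOrElse(Port.fromInt(80).get))
7 val genHost: Gen[Host] =
8   Gen.oneOf("127.0.0.1", "localhost")
9     .map(s => Host.fromString(s).getOrElse(Host.fromString("localhost").get))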

So we fixed the compilation errors in our tests, but alas some tests are failing. :-(

At least the integration tests look fine, so let’s take a look at the possible reasons for our failing tests. The failing ones are: ApiConfigTest, ProductRoutesTest and ProductsRoutesTest. The first one spills out the following message:

Equality issue causing a failing test
1 ApiConfig(127.0.0.1,34019) was not equal to ApiConfig(127.0.0.1,34019)

I don’t know about you, dear reader, but I have stumbled into equality issues frequently (not always, but regularly), so this should be resolvable by changing the c must be(expected) line in the test.

Fix for the equality issue in the ApiConfig test
1 ConfigSource.fromConfig(config).at("api").load[ApiConfig] match {
2   case Left(e)  => fail(s"Parsing a valid configuration must succeed! ($e)")
3   case Right(c) => withClue("Config must be equal!")(c === expected)
4 }

In addition we add an implicit instance of the cats Eq type class to the companion object of the ApiConfig class.

Instance for cats Eq and ApiConfig
1 implicit val eqApiConfig: Eq[ApiConfig] = Eq.instance { (a, b) =>
2   a.host === b.host && a.port === b.port
3 }

This fixes it and we can turn to the two remaining ones. We soon find that the error message returned by tapir (which we check in the tests) has changed and is now more detailed, so we simply adjust the corresponding line in both tests and we are done here.

Nice, we have a compiling project again, and if we ignore our stubbed-out Monocle function for now, the migrate-libs task outputs some good looking results. So we execute the migrate-scalacOptions task next and take a look at the results. It uses the same notation as the migrate-libs task to highlight problems. So far we have a couple of flags that the plugin could not recognise, quite a lot which are no longer valid and some which we can keep or have to rename.

First we change our compilerSettings function and add a case for Scala 3.

Compiler settings for Scala 3 in build.sbt
 1 //...
 2 case Some((3, _)) =>
 3   Seq(
 4     "-deprecation",
 5     "-explain-types",
 6     "-feature",
 7     "-language:higherKinds",
 8     "-unchecked",
 9     //"-Xfatal-warnings", // Disable for migration
10     "-Ycheck-init",
11     "-Ykind-projector"
12   )
13 //...

Also we add a libraryDependencies setting into our commonSettings block to only activate plugins for Scala 2.

Load compiler plugins only for Scala 2 in build.sbt
 1 //...
 2 libraryDependencies ++= (
 3   if (scalaVersion.value.startsWith("2")) {
 4     Seq(
 5       compilerPlugin("com.olegpy"    %% "better-monadic-for" % "0.3.1"),
 6       compilerPlugin("org.typelevel" % "kind-projector"      % "0.13.2" cross CrossV\
 7 ersion.full)
 8     )
 9   } else {
10     Seq()
11   }
12 ),
13 //...

Next is setting some flags for dependencies of which we want the 2.13 version.

Using the 2.13 version for some dependencies in build.sbt
1 library.pureConfig.cross(CrossVersion.for3Use2_13),
2 library.refinedCats.cross(CrossVersion.for3Use2_13),
3 library.refinedCore.cross(CrossVersion.for3Use2_13),
4 library.refinedPureConfig.cross(CrossVersion.for3Use2_13),

Last but not least the big topic “new syntax” is lurking around the corner. But we will first add two more compiler flags which should make our 2.13 code compile in Scala 3 and add Scala 3 to the crossScalaVersions setting.

Two migration compiler flags and the 3 version in build.sbt
 1 //...
 2 case Some((3, _)) =>
 3   Seq(
 4     "-deprecation",
 5     "-explain-types",
 6     "-feature",
 7     "-language:higherKinds",
 8     "-unchecked",
 9     //"-Xfatal-warnings", // Disable for migration
10     "-Ycheck-init",
11     "-Ykind-projector",
12     // Gives warnings instead of errors on most syntax changes.
13     "-source:3.0-migration",
14     // Resolve warnings via the compiler if possible.
15     "-rewrite",
16   )
17 //..
18 crossScalaVersions := Seq(scalaVersion.value, "3.1.1"),
19 //...

Time for a first test run! We switch to Scala 3 by using ++3.1.1 in the sbt shell and issue a clean followed by a compile. Please note that you should have reloaded or restarted your sbt instance beforehand to make it pick up all of our changes.

Errors about conflicting dependencies versions
 1 [error] Modules were resolved with conflicting cross-version suffixes in 
 2   ProjectRef(uri(".../tapir-scala-3/"), "tapir"):
 3 [error]    org.scala-lang.modules:scala-xml _2.13, _3
 4 [error]    org.typelevel:simulacrum-scalafix-annotations _3, _2.13
 5 [error]    org.typelevel:cats-kernel _3, _2.13
 6 [error]    eu.timepit:refined _2.13, _3
 7 [error]    org.typelevel:cats-core _3, _2.13
 8 [error] stack trace is suppressed; run last update for the full output
 9 [error] (update) Conflicting cross-version suffixes in:
10   org.scala-lang.modules:scala-xml,
11   org.typelevel:simulacrum-scalafix-annotations,
12   org.typelevel:cats-kernel,
13   eu.timepit:refined,
14   org.typelevel:cats-core
15 [error]

Looks like we have a problem here. The libraries that we include in the 2.13 version depend on others which collide with the transitive dependencies of libraries using Scala 3. Digging through Maven Central we can see that there are artefacts published for Scala 3 for pureconfig and refined, so what about removing our cross-version settings and trying them directly?

We soon find out that the refined module for pureconfig is not available for Scala 3 and in addition it seems that the generic derivation module is also not there. :-(

We could call it a day and stick with Scala 2 for the time being. However, maybe there is a strong need for an upgrade: a library which only works partially under Scala 2, or other reasons. So imagine that we refactor our service a bit by removing some dependencies and adding some more boilerplate to make up for it. At first we simply remove the CrossVersion settings and use only libraries which are released for Scala 3 (a sketch follows). This move leaves us with hundreds of compiler errors, many of which seem to be related to refined.
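
A hedged build.sbt sketch of that move; the artefact names and version numbers are illustrative and have to be checked against what is actually published.

Using Scala 3 artefacts directly instead of crossing to 2.13
1 // Drop the CrossVersion.for3Use2_13 flags and use modules which are
2 // published for Scala 3 (versions are illustrative).
3 libraryDependencies ++= Seq(
4   "com.github.pureconfig" %% "pureconfig-core" % "0.17.1",
5   "eu.timepit"            %% "refined"         % "0.9.28",
6   "eu.timepit"            %% "refined-cats"    % "0.9.28"
7 )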

Because we have to start somewhere we try to create our ConfigReader instances for pureconfig manually to avoid the dependency on the derivation module. Luckily there are some helpers like the ConfigReader.forProductX methods which make this quite easy.

A manually constructed reader for pureconfig
1 implicit val configReader: ConfigReader[DatabaseConfig] =
2   ConfigReader.forProduct4("driver", "url", "user", "pass")
3     (DatabaseConfig(_, _, _, _))

The code compiles again (on 2.13!) and the tests are looking good. On Scala 3 we again run into the problem that the refined pureconfig module is not available. So we either drop our beloved refined types or we write manual readers for them. In Scala 3 we could use opaque type aliases5 to get more type safety, but they are much less powerful than refined types. Well, we’ll see how it goes, so at first we define companion objects for our refined types to gain some more functionality, e.g. the from method to convert arbitrary values into refined ones.

A companion object for a refined type providing more functionality
1 type DatabaseLogin = String Refined NonEmpty
2 object DatabaseLogin extends RefinedTypeOps[DatabaseLogin, String]
3   with CatsRefinedTypeOpsSyntax

As you can see this is pretty simple, we don’t even need to implement anything ourselves. Getting the instances for ConfigReader looks pretty simple too.

ConfigReader instances for refined types - Take 1
1 implicit val loginReader: ConfigReader[DatabaseLogin] =
2   ConfigReader.fromStringOpt(s => DatabaseLogin.from(s).toOption)
3 implicit val passReader: ConfigReader[DatabasePassword] =
4   ConfigReader.fromStringOpt(s => DatabasePassword.from(s).toOption)
5 implicit val urlReader: ConfigReader[DatabaseUrl] =
6   ConfigReader.fromStringOpt(s => DatabaseUrl.from(s).toOption)

But wait, now we get the dreaded “ambiguous implicit values” compiler error. Since our types are all refined from String under the hood, the compiler cannot tell the instances apart and throws an error at us.

Well, it seems that we don’t need to write these readers ourselves, because the pureconfig module of refined is a thing. But we cannot use it as it is, because its usage of reflection is a no-go with Scala 3. We can, however, try to copy the base function and make some adjustments.

Copied and adjusted converter function from refined-pureconfig
 1 implicit def refTypeConfigConvert[F[_, _], T, P](
 2     implicit configConvert: ConfigConvert[T],
 3     refType: RefType[F],
 4     validate: Validate[T, P]
 5 ): ConfigConvert[F[T, P]] =
 6   new ConfigConvert[F[T, P]] {
 7     override def from(cur: ConfigCursor): ConfigReader.Result[F[T, P]] =
 8       configConvert.from(cur) match {
 9         case Left(es) => Left(es)
10         case Right(t) =>
11           refType.refine[P](t) match {
12             case Left(because) =>
13               Left(
14                 ConfigReaderFailures(
15                   ConvertFailure(
16                     reason = CannotConvert(
17                       value = cur.valueOpt.map(_.render()).getOrElse("none"),
18                       toType = "a refined type",
19                       because = because
20                     ),
21                     cur = cur
22                   )
23                 )
24               )
25             case Right(refined) => Right(refined)
26           }
27       }
28     override def to(t: F[T, P]): ConfigValue =
29       configConvert.to(refType.unwrap(t))
30   }

So we copied it and simply omitted the type tag stuff which is based on reflection. This way we lose some information for our error message, but we gain a hopefully Scala 3 compatible refined type reader for pureconfig. Many thanks again at this point to Frank S. Thomas, the creator of the wonderful refined library, and all the contributors!

Good, let’s switch to Scala 3 (++3.1.1 in the sbt shell) and try a clean compile, which will still give us a lot of errors. However, first things first. We notice some refined related ones at the top and fix them. The notation changed and we can finally use literal types, so there is no longer a need for these weird W.andsoon.T constructs.
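
A small sketch of what that notation change means; the MaxSize predicate is just an example and not one of our actual types.

Literal types instead of the old Witness syntax
1 import eu.timepit.refined.api.Refined
2 import eu.timepit.refined.collection.MaxSize
3 
4 // Old style needed the shapeless Witness: String Refined MaxSize[W.`16`.T]
5 // With literal types the size can be written directly:
6 type ShortName = String Refined MaxSize[16]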

After fixing them we get 266 errors. Oh why, cruel fate? But looking at them we can see that a lot of them come from our LanguageCodes class. As we remember, due to macro issues several things are not yet available in libraries under Scala 3, so we are again faced with the decision to abandon refined types or find yet another workaround. To keep this chapter from growing exponentially I’ll pick the lazy (and dirty) workaround this time. A simple nudge in your favourite editor should convert the code in the file to something like this:

Workaround for missing refined macros
1 val all: Seq[LanguageCode] = Seq(
2   LanguageCode.unsafeFrom("ad"),
3   LanguageCode.unsafeFrom("ae"),
4   LanguageCode.unsafeFrom("af"),
5   //...
6 )

This is not the nicest solution but it leaves us with only 17 errors to fix and they look like they have the same cause. So after fixing the same thing in our routing examples we are down to one error:

1 [error] -- [E008] Not Found Error: .../Tapir.scala:47:31
2 [error] 47 |      (apiConfig, dbConfig) <- IO {
3 [error]    |                               ^
4 [error]    |value withFilter is not a member of 
5 [error]    |  cats.effect.IO[(com.wegtam.books.pfhais.tapir.config.ApiConfig,
6 [error]    |  com.wegtam.books.pfhais.tapir.config.DatabaseConfig
7 [error]    |)]
8 [error] one error found
9 [error] one error found

Well, the only time I got that one was when I removed the better-monadic-for compiler plugin, and it won’t be available for Scala 3 because it shouldn’t be needed there. But we can solve it by decomposing our code into several steps.

Fixing the withFilter error in Tapir.scala
1 for {
2   cfg       <- IO(ConfigFactory.load(getClass().getClassLoader()))
3   apiConfig <- IO(ConfigSource.fromConfig(cfg).at("api")
4     .loadOrThrow[ApiConfig])
5   dbConfig  <- IO(ConfigSource.fromConfig(cfg).at("database")
6     .loadOrThrow[DatabaseConfig])
7   //...

Awesome! We have our main code compiling under Scala 3 now! =)

Compiling the test suites greets us with a couple of errors though, so no celebrations just yet. Some of them are refined related like in the main code (missing macros), so we apply our unsafeFrom workaround and get rid of them. Then we have an error about implicit values needing an explicit type, which is good practice anyway. The integration tests look similar and are easy to fix too. However, we have two failing tests after compilation, one within each route test and both related to checking the error response for malformed requests. So we simply adjust the error message we test for and we are done.

This is awesome but according to the official migration guide we should also try to migrate our syntax. So let’s run the migrate-syntax tapir command in the sbt shell.

Result of the migrate-syntax command
 1 [info]
 2 [info] The syntax incompatibilities have been fixed in tapir / Test
 3 [info]
 4 [info]
 5 [info] You can now commit the change!
 6 [info] Then you can run the next command:
 7 [info]
 8 [info] migrate tapir
 9 [info]
10 [info]

Looking at the source we can see that only some annotations have been added. And finally we run migrate tapir and it bails out with an error. I’ll spare you the gory details, but I had no clue about what was going wrong. However, we know that everything is fine under Scala 3 anyway. So what about simply ignoring the tool’s lamentations and moving on with our lives? Sounds good? Yeah, to me too. :-)

Before calling it a day we should switch back to 2.13 and do a clean compile to see some unused import warnings being printed out. These are easy to fix, and cleaner code is easier to maintain. Finally we adjust our build.sbt to make the switch permanent.

Switch to Scala 3 by default
1   //...
2   scalaVersion := "3.1.1",
3   crossScalaVersions := Seq(scalaVersion.value),
4   //...

We can also remove some code related to Scala 2 (plugins and compiler settings).

Well, we should change our scalafmt configuration because we surely do not want to get our shiny new Scala 3 code formatted according to Scala 2 syntax. ;-)

A new configuration for scalafmt
 1 version        = 3.4.3
 2 runner.dialect = scala3
 3 style          = defaultWithAlign
 4 # Other options...
 5 danglingParentheses.preset = true
 6 maxColumn                  = 120
 7 newlines.forceBeforeMultilineAssign = def
 8 project.excludeFilters     = [".*\\.sbt"]
 9 rewrite.rules              = [Imports, RedundantBraces, RedundantParens]
10 rewrite.imports.sort       = ascii
11 rewriteTokens              = {
12   ...
13 }
14 spaces.inImportCurlyBraces = true
15 unindentTopLevelOperators  = true

We used the opportunity to upgrade to the latest scalafmt and to use slightly different settings. Running scalafmtAll rewrites a lot of code, so we need to check if it still compiles. This looks fine, so we are done, are we?

Wait, we’ve been here before, haven’t we? Hint: remember that updateDocumentation function using optics which we stubbed out? Sometimes I wonder how many smaller (or bigger) things get dropped under the table during such migrations, only to be re-implemented anew later with great effort. We start by adding the needed import for the new Monocle (import monocle.syntax.all._) and try some things out. First there is this nice implicit extractor for the regular expression which defines our refined LanguageCode type; it doesn’t seem to work any longer. To keep things simple, we just write the expressions down in the code (see the sketch below).
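
A hedged sketch of what “writing it down” could look like; both patterns are purely illustrative and must of course match the actual refined predicates.

Writing the regular expressions down directly
1 // Illustrative patterns only; use the exact regular expressions from
2 // the refined predicates of LanguageCode and ProductId.
3 val langRegex: String = "^[a-z]{2}$"
4 val uuidRegex: String =
5   "^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"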

Onwards to the .focus macro which is a killer feature of the new Monocle release. Instead of writing and combining all our custom Lens instances we should now be able to do something like the following:

The Monocle focus in action
1 docs.focus(_.components.some.parameters.each.schema.pattern)
2   .replace(uuidRegex.some)

Looks neat, but we get an error that the function components is overloaded, which it seems is not supported. So it won’t be that easy.

After poking around and asking for help (please ask for help if you need it, there is absolutely nothing wrong with that) we realise that we will have to generate lenses like before, because the shiny focus macro cannot (yet) solve this for us. Sadly the GenLens macro also has problems with overloaded methods, therefore we try to define Getter and Setter functionality for these manually. Some lenses can still be generated via GenLens, like GenLens[Operation](_.parameters) for example. For others we define code like the following:

Manually define Getter and Setter
1 val componentsGetter = Getter[OpenAPI, Option[Components]](_.components)
2 val componentsSetter = Setter[OpenAPI, Option[Components]](
3   f => o => o.copy(components = f(o.components))
4 )

However, this is even more cumbersome than the macro-generated lenses before, so maybe there is another way… And it turns out that there is another optics library for Scala called Quicklens6. The fact that it comes from the same people that make the tapir project looks promising, and indeed we quickly find a working implementation.

Using optics from Quicklens to update our documentation
 1 private def updateDocumentation(docs: OpenAPI): OpenAPI = {
 2   // Our regular expressions.
 3   val langRegex = ???
 4   val uuidRegex = ???
 5   // Update the documentation structure.
 6   val updateProductId = docs
 7     .modify(_.paths.pathItems.at("/product/{id}").parameters.each
 8     .eachRight.schema.at.eachRight.pattern)
 9     .using(_ => uuidRegex.some)
10   val updateModelProduct = updateProductId
11     .modify(_.components.at.schemas.at("Product").eachRight
12     .properties.at("id").eachRight.pattern)
13     .using(_ => uuidRegex.some)
14   val updateModelTranslation = updateModelProduct
15     .modify(_.components.at.schemas.at("Translation").eachRight
16     .properties.at("lang").eachRight.pattern)
17     .using(_ => langRegex.some)
18   updateModelTranslation
19 }

The .modify functionality looks very similar to .focus from Monocle, and apart from the fact that the modifiers have different names we do the same as in the Monocle code (read: traverse through our structure). While Monocle would allow us to use more advanced features, we are happy with Quicklens here because it fulfils our needs.
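
To illustrate the syntax in isolation, here is a tiny, hypothetical Quicklens example which is not part of our service code.

A minimal Quicklens example
1 import com.softwaremill.quicklens._
2 
3 final case class Address(city: String)
4 final case class Person(name: String, address: Address)
5 
6 // Traverse into the nested field and replace its value.
7 val person = Person("Jane", Address("Berlin"))
8 val moved  = person.modify(_.address.city).using(_ => "Hamburg")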

So what is left to do? Nothing! Go forth, ship your release and treat yourself for having ported your project to Scala 3. :-)

Of course I cheated, because there is always something left, like the non-working example for one of our ProductsRoutes endpoints. Furthermore, we have used package objects in our code, which are slated for removal in Scala 3, so we should refactor these as well. Feel free to do the required changes as a final exercise.

Epilogue

I personally hope that you had some fun reading this, have learned something and got some ideas about what is possible in the realm of programming. For me it was fun to write this book and I also learned a lot while writing it. I want to thank all the people that got back to me with questions, hints and ideas about what might be missing.

The next time some person points out that all this “academic functional nonsense” is for naught please remember some of the things you read here. We have seen several things that imply that we can use pure functional programming in our daily work:

  1. better understandable code
  2. better testable code
  3. better performing code (Yes!)
  4. better type safety using refined types
  5. deriving boilerplate code from our statically typed models and logic
  6. easier working with deeply nested structures using optics

There really is no excuse to stick to impure and messy stuff that is hard to maintain. But please also remember that functional programming is not done for its own sake. It is another, and in my opinion better, way to deliver value. In our field of work this value usually means some working software / program / application / whatever.

Also, please do not consider this book a complete knowledge trove. Several things might already be outdated, some topics were only touched superficially and other things are missing completely.

So, people of the baud and the electron: Go out there, venture forth, cross boundaries, use the knowledge which is there and last but not least - Have fun!

Notes

Foreword

1https://akka.io

2http://slick.lightbend.com

3https://typelevel.org/cats/

4https://typelevel.org/cats-effect/

5https://http4s.org/

6https://tpolecat.github.io/doobie/

7https://github.com/fthomas/refined

8https://fs2.io/

9https://github.com/softwaremill/tapir

10https://github.com/julien-truffaut/Monocle

Maybe there is another way

Impure implementation

1http://slick.lightbend.com/doc/3.3.1/database.html

Pure implementation

1https://pureconfig.github.io/

2https://github.com/typelevel/kittens

3https://typelevel.org/cats-effect/typeclasses/sync.html

4http://slick.lightbend.com/doc/3.3.1/sql.html

5https://blog.acolyer.org/2019/07/03/one-sql-to-rule-them-all/

6https://httpie.org/

7https://github.com/http4s/http4s/issues/2371

8https://www.wartremover.org/

9https://typelevel.org/cats-effect/datatypes/ioapp.html

What about tests?

1http://www.scalacheck.org/

2http://www.scalatest.org/user_guide/selecting_a_style

3https://rspec.info/

4https://github.com/scalatest/scalatest/issues/1370

Adding benchmarks

1https://jmeter.apache.org/

2https://github.com/http4s/http4s/issues/2855

Documenting your API

1https://swagger.io/

2https://swagger.io/tools/swagger-ui/

3https://swagger.io/tools/swagger-editor/

4https://github.com/swagger-akka-http/swagger-akka-http

5https://www.youtube.com/watch?v=snbsYyBS4Bs

6http://julienrf.github.io/endpoints/

7https://github.com/http4s/rho

8https://github.com/softwaremill/tapir

9https://www.servant.dev/

10https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/poptics.pdf

11https://github.com/julien-truffaut/Monocle

12https://github.com/milessabin/shapeless

13https://tools.ietf.org/html/rfc4122

14https://www.ecma-international.org/ecma-262/5.1/#sec-7.8.5

Moving to Scala 3…

1We could have used scalafix here but I consider this overkill for such a thing.

2https://github.com/optics-dev/Monocle/issues/1001

3https://github.com/scalatest/scalatestplus-scalacheck/issues/36

4https://github.com/pureconfig/pureconfig/issues/970

5https://docs.scala-lang.org/scala3/reference/other-new-features/opaques.html

6https://github.com/softwaremill/quicklens