2. For Comprehensions
Scala’s for comprehension is the ideal FP abstraction for sequential
programs that interact with the world. Since we’ll be using it a lot,
we’re going to relearn the principles of for and how scalaz can help
us to write cleaner code.
This chapter doesn’t try to write pure programs and the techniques are applicable to non-FP codebases.
2.1 Syntax Sugar
Scala’s for is just a simple rewrite rule, also called syntax
sugar, that doesn’t have any contextual information.
To see what a for comprehension is doing, we use the show and
reify feature in the REPL to print out what code looks like after
type inference.
scala> import scala.reflect.runtime.universe._
scala> val a, b, c = Option(1)
scala> show { reify {
for { i <- a ; j <- b ; k <- c } yield (i + j + k)
} }
res:
$read.a.flatMap(
((i) => $read.b.flatMap(
((j) => $read.c.map(
((k) => i.$plus(j).$plus(k)))))))
There is a lot of noise due to additional sugarings (e.g. + is
rewritten to $plus). We’ll skip the show and reify for brevity
when the REPL line is reify>, and manually clean up the generated
code so that it doesn’t become a distraction.
reify> for { i <- a ; j <- b ; k <- c } yield (i + j + k)
a.flatMap {
i => b.flatMap {
j => c.map {
k => i + j + k }}}
The rule of thumb is that every <- (called a generator) is a
nested flatMap call, with the final generator a map containing the
yield body.
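As a quick check of the rule of thumb, the sketch below compares a for comprehension with its hand-written expansion. The object name DesugarCheck and the values are illustrative only.

```scala
object DesugarCheck {
  val a, b, c = Option(1)

  // The sugared form.
  val sugared: Option[Int] =
    for { i <- a; j <- b; k <- c } yield i + j + k

  // The hand-written expansion: one flatMap per generator,
  // and a map for the final generator holding the yield body.
  val desugared: Option[Int] =
    a.flatMap { i => b.flatMap { j => c.map { k => i + j + k } } }
}
```

Both values should be equal, since the compiler produces the second form from the first.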
2.1.1 Assignment
We can assign values inline like ij = i + j (a val keyword is not
needed).
reify> for {
i <- a
j <- b
ij = i + j
k <- c
} yield (ij + k)
a.flatMap {
i => b.map { j => (j, i + j) }.flatMap {
case (j, ij) => c.map {
k => ij + k }}}
A map over the b introduces the ij which is flat-mapped along
with the j, then the final map for the code in the yield.
Unfortunately we cannot assign before any generators. It has been requested as a language feature but has not been implemented: https://github.com/scala/bug/issues/907
scala> for {
initial = getDefault
i <- a
} yield initial + i
<console>:1: error: '<-' expected but '=' found.
We can work around the limitation by defining a val outside the for
scala> val initial = getDefault
scala> for { i <- a } yield initial + i
or create an Option out of the initial assignment
scala> for {
initial <- Option(getDefault)
i <- a
} yield initial + i
2.1.2 Filter
It is possible to put if statements after a generator to filter
values by a predicate
reify> for {
i <- a
j <- b
if i > j
k <- c
} yield (i + j + k)
a.flatMap {
i => b.withFilter {
j => i > j }.flatMap {
j => c.map {
k => i + j + k }}}
Older versions of Scala used filter, but Traversable.filter
creates new collections for every predicate, so withFilter was
introduced as the more performant alternative.
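The same rewrite can be checked on a standard library List. This is a small sketch with illustrative names, not taken from the compiler output above.

```scala
object FilterCheck {
  val xs = List(1, 2, 3, 4)

  // A guard inside the for comprehension ...
  val sugared: List[Int] =
    for { x <- xs if x % 2 == 0 } yield x * 10

  // ... corresponds to withFilter followed by map. withFilter does not
  // build an intermediate List; the predicate is applied while map runs.
  val desugared: List[Int] =
    xs.withFilter(x => x % 2 == 0).map(x => x * 10)
}
```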
We can accidentally trigger a withFilter by providing type
information: it is actually interpreted as a pattern match.
reify> for { i: Int <- a } yield i
a.withFilter {
case i: Int => true
case _ => false
}.map { case i: Int => i }
Like in assignment, a generator can use a pattern match on the left
hand side. But unlike assignment (which throws MatchError on
failure), generators are filtered and will not fail at runtime.
However, there is an inefficient double application of the pattern.
2.1.3 For Each
Finally, if there is no yield, the compiler will use foreach
instead of flatMap, which is only useful for side-effects.
reify> for { i <- a ; j <- b } println(s"$i $j")
a.foreach { i => b.foreach { j => println(s"$i $j") } }
2.1.4 Summary
The full set of methods supported by for comprehensions does not share
a common super type; each generated snippet is independently compiled.
If there were a trait, it would roughly look like:
trait ForComprehensible[C[_]] {
def map[A, B](f: A => B): C[B]
def flatMap[A, B](f: A => C[B]): C[B]
def withFilter[A](p: A => Boolean): C[A]
def foreach[A](f: A => Unit): Unit
}
If the context (C[_]) of a for comprehension doesn’t provide its
own map and flatMap, all is not lost. If an implicit
scalaz.Bind[T] is available for T, it will provide map and
flatMap.
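Because there is no common super type, any class that happens to define the right methods can be used in a for comprehension. The following sketch uses a hypothetical Box container (not part of Scala or scalaz) that defines only map and flatMap.

```scala
object BoxExample {
  // A hypothetical container with no special supertype: it only
  // defines the methods that the for rewrite will call.
  final case class Box[A](value: A) {
    def map[B](f: A => B): Box[B] = Box(f(value))
    def flatMap[B](f: A => Box[B]): Box[B] = f(value)
  }

  // Compiles because the rewritten code only needs flatMap and map.
  val result: Box[Int] =
    for {
      i <- Box(1)
      j <- Box(2)
    } yield i + j
}
```

Adding an if guard or dropping the yield would additionally require withFilter or foreach on Box.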
2.2 Unhappy path
So far we’ve only looked at the rewrite rules, not what is happening
in map and flatMap. Let’s consider what happens when the for
context decides that it can’t proceed any further.
In the Option example, the yield is only called when i,j,k are
all defined.
for {
i <- a
j <- b
k <- c
} yield (i + j + k)
If any of a,b,c are None, the comprehension short-circuits with
None but it doesn’t tell us what went wrong.
If we use Either, then a Left will cause the for comprehension
to short circuit with extra information, much better than Option for
error reporting:
scala> val a = Right(1)
scala> val b = Right(2)
scala> val c: Either[String, Int] = Left("sorry, no c")
scala> for { i <- a ; j <- b ; k <- c } yield (i + j + k)
Left(sorry, no c)
And lastly, let’s see what happens with a Future that fails:
scala> import scala.concurrent._
scala> import ExecutionContext.Implicits.global
scala> val f = for {
i <- Future.failed[Int](new Throwable)
j <- Future { println("hello") ; 1 }
} yield (i + j)
scala> Await.result(f, duration.Duration.Inf)
caught java.lang.Throwable
The Future that prints to the terminal is never called because, like
Option and Either, the for comprehension short circuits.
Short circuiting for the unhappy path is a common and important theme.
for comprehensions cannot express resource cleanup: there is no way
to try / finally. This is good: in FP it places clear ownership of
responsibility for unexpected error recovery and resource cleanup on
the context (which is usually a Monad as we’ll see later), not the
business logic.
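To illustrate the split of responsibilities, here is a minimal sketch using only the standard library Future. The names step1, step2 and program are hypothetical stand-ins for real calls; the recovery is applied to the context with Future.recover, outside the for comprehension.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object RecoveryExample {
  // Hypothetical steps: the first one always fails.
  def step1: Future[Int] = Future.failed(new RuntimeException("step1 failed"))
  def step2: Future[Int] = Future.successful(1)

  // The business logic only describes the happy path.
  def program: Future[Int] =
    for {
      i <- step1
      j <- step2
    } yield i + j

  // Error recovery is handled by the context, not inside the for.
  def recovered: Future[Int] =
    program.recover { case _: RuntimeException => -1 }

  // Blocking here is only for demonstration purposes.
  def run(): Int = Await.result(recovered, 5.seconds)
}
```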
2.3 Gymnastics
Although it is easy to rewrite simple sequential code as a for
comprehension, sometimes we’ll want to do something that appears to
require mental somersaults. This section collects some practical
examples and how to deal with them.
2.3.1 Fallback Logic
Let’s say we are calling out to a method that returns an Option and
if it is not successful we want to fall back to another method (and so
on and so on), like when we’re using a cache:
def getFromRedis(s: String): Option[String]
def getFromSql(s: String): Option[String]
getFromRedis(key) orElse getFromSql(key)
If we have to do this for an asynchronous version of the same API
def getFromRedis(s: String): Future[Option[String]]
def getFromSql(s: String): Future[Option[String]]
then we have to be careful not to do extra work because
for {
cache <- getFromRedis(key)
sql <- getFromSql(key)
} yield cache orElse sql
will run both queries. We can pattern match on the first result, but the type is wrong: cache is an Option[String], not a Future[Option[String]]
for {
cache <- getFromRedis(key)
res <- cache match {
case Some(_) => cache // !!! wrong type !!!
case None => getFromSql(key)
}
} yield res
We need to create a Future from the cache
for {
cache <- getFromRedis(key)
res <- cache match {
case Some(_) => Future.successful(cache)
case None => getFromSql(key)
}
} yield res
Future.successful creates a new Future, much like an Option or
List constructor.
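To confirm that the second query is skipped on a cache hit, the fallback can be exercised against stubs. This is a sketch under assumptions: the stub bodies, the key "hit" and the sqlCalls counter are all hypothetical and only exist to make the behaviour observable.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FallbackExample {
  // Counts how often the stubbed SQL lookup is invoked.
  val sqlCalls = new AtomicInteger(0)

  // Hypothetical stubs: the cache only knows the key "hit".
  def getFromRedis(s: String): Future[Option[String]] =
    Future.successful(if (s == "hit") Some("cached") else None)

  def getFromSql(s: String): Future[Option[String]] =
    Future { sqlCalls.incrementAndGet(); Some("from sql") }

  // The fallback logic from the text, wrapped in a method.
  def lookup(key: String): Future[Option[String]] =
    for {
      cache <- getFromRedis(key)
      res   <- cache match {
                 case Some(_) => Future.successful(cache)
                 case None    => getFromSql(key)
               }
    } yield res

  // Blocking here is only for demonstration purposes.
  def run(key: String): Option[String] = Await.result(lookup(key), 5.seconds)
}
```

Calling FallbackExample.run("hit") should leave the counter untouched, whereas a miss falls through to the SQL stub.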
If functional programming were like this all the time, it’d be a nightmare. Thankfully these tricky situations are the corner cases.
2.3.2 Early Exit
Let’s say we have some condition that should cause an early exit.
If we want to exit early with an error, it is standard practice in OOP to throw an exception
def getA: Int = ...
val a = getA
require(a > 0, s"$a must be positive")
a * 10
which can be rewritten for asynchronous code as
def getA: Future[Int] = ...
def error(msg: String): Future[Nothing] =
Future.failed(new RuntimeException(msg))
for {
a <- getA
b <- if (a <= 0) error(s"$a must be positive")
else Future.successful(a)
} yield b * 10
But if we want to exit early with a successful return value, the simple synchronous code:
def getB: Int = ...
val a = getA
if (a <= 0) 0
else a * getB
translates into a nested for comprehension when our dependencies are
asynchronous:
def getB: Future[Int] = ...
for {
a <- getA
c <- if (a <= 0) Future.successful(0)
else for { b <- getB } yield a * b
} yield c
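A runnable version of the successful early exit is sketched below. The stubs are hypothetical: getA takes a parameter here so that both branches can be exercised, and getB returns a fixed value.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EarlyExitExample {
  // Hypothetical async dependencies.
  def getA(n: Int): Future[Int] = Future.successful(n)
  def getB: Future[Int] = Future.successful(7)

  def compute(n: Int): Future[Int] =
    for {
      a <- getA(n)
      // Exit early with 0, otherwise continue in a nested for.
      c <- if (a <= 0) Future.successful(0)
           else for { b <- getB } yield a * b
    } yield c

  // Blocking here is only for demonstration purposes.
  def run(n: Int): Int = Await.result(compute(n), 5.seconds)
}
```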
2.4 Incomprehensible
The context we’re comprehending over must stay the same: we can’t mix contexts.
scala> def option: Option[Int] = ...
scala> def future: Future[Int] = ...
scala> for {
a <- option
b <- future
} yield a * b
<console>:23: error: type mismatch;
found : Future[Int]
required: Option[?]
b <- future
^
Nothing can help us mix arbitrary contexts in a for comprehension
because the meaning is not well defined.
But when we have nested contexts, the intention is usually obvious, yet the compiler still doesn’t accept our code.
scala> def getA: Future[Option[Int]] = ...
scala> def getB: Future[Option[Int]] = ...
scala> for {
a <- getA
b <- getB
} yield a * b
<console>:30: error: value * is not a member of Option[Int]
} yield a * b
^
Here we want for to take care of the outer context and let us write
our code on the inner Option. Hiding the outer context is exactly
what a monad transformer does, and scalaz provides implementations
for Option and Either named OptionT and EitherT respectively.
The outer context can be anything that normally works in a for
comprehension, but it needs to stay the same throughout.
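To get a feel for what such a wrapper does, here is a minimal sketch specialised to Future and written with the standard library only. FutureOption is a hypothetical name and a simplification, not the scalaz OptionT (which is generic in the outer context); the stubbed getA, getB and getNone are illustrative.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TransformerSketch {
  // Hypothetical, Future-only stand-in for a transformer over Option.
  final case class FutureOption[A](run: Future[Option[A]]) {
    def map[B](f: A => B)(implicit ec: ExecutionContext): FutureOption[B] =
      FutureOption(run.map(_.map(f)))

    def flatMap[B](f: A => FutureOption[B])(implicit ec: ExecutionContext): FutureOption[B] =
      FutureOption(run.flatMap {
        case Some(a) => f(a).run                          // continue on the inner value
        case None    => Future.successful(Option.empty[B]) // short circuit
      })
  }

  // Illustrative stubs.
  def getA: Future[Option[Int]] = Future.successful(Some(2))
  def getB: Future[Option[Int]] = Future.successful(Some(3))
  def getNone: Future[Option[Int]] = Future.successful(None)

  // The for body now works directly on the Int values.
  def product: Future[Option[Int]] =
    (for {
      a <- FutureOption(getA)
      b <- FutureOption(getB)
    } yield a * b).run

  def missing: Future[Option[Int]] =
    (for {
      a <- FutureOption(getA)
      b <- FutureOption(getNone)
    } yield a * b).run

  // Blocking here is only for demonstration purposes.
  def runProduct(): Option[Int] = Await.result(product, 5.seconds)
  def runMissing(): Option[Int] = Await.result(missing, 5.seconds)
}
```

The wrapper's map and flatMap deal with both layers, so the for comprehension sees a single context.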
We create an OptionT from each method call. This changes the context
of the for from Future[Option[_]] to OptionT[Future, _].
scala> val result = for {
a <- OptionT(getA)
b <- OptionT(getB)
} yield a * b
result: OptionT[Future, Int] = OptionT(Future(<not completed>))
.run returns us to the original context
scala> result.run
res: Future[Option[Int]] = Future(<not completed>)
Alternatively, OptionT[Future, Int] has getOrElse and getOrElseF
methods, taking Int and Future[Int] respectively, returning a
Future[Int].
The monad transformer also allows us to mix Future[Option[_]] calls with
methods that just return plain Future via .liftM[OptionT] (provided by
scalaz):
scala> def getC: Future[Int] = ...
scala> val result = for {
a <- OptionT(getA)
b <- OptionT(getB)
c <- getC.liftM[OptionT]
} yield a * b / c
result: OptionT[Future, Int] = OptionT(Future(<not completed>))
and we can mix with methods that return plain Option by wrapping
them in Future.successful (.pure[Future]) followed by OptionT
scala> def getD: Option[Int] = ...
scala> val result = for {
a <- OptionT(getA)
b <- OptionT(getB)
c <- getC.liftM[OptionT]
d <- OptionT(getD.pure[Future])
} yield (a * b) / (c * d)
result: OptionT[Future, Int] = OptionT(Future(<not completed>))
It is messy again, but it is better than writing nested flatMap and
map by hand. We can clean it up with a DSL that handles all the
required conversions into OptionT[Future, _]
def liftFutureOption[A](f: Future[Option[A]]) = OptionT(f)
def liftFuture[A](f: Future[A]) = f.liftM[OptionT]
def liftOption[A](o: Option[A]) = OptionT(o.pure[Future])
def lift[A](a: A) = liftOption(Option(a))
combined with the |> operator, which applies the function on the
right to the value on the left, to visually separate the logic from
the transformers
scala> val result = for {
a <- getA |> liftFutureOption
b <- getB |> liftFutureOption
c <- getC |> liftFuture
d <- getD |> liftOption
e <- 10 |> lift
} yield e * (a * b) / (c * d)
result: OptionT[Future, Int] = OptionT(Future(<not completed>))
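For readers unfamiliar with |>, its behaviour can be sketched with a small implicit class. This is a hypothetical stand-in for illustration; scalaz ships its own definition, which should be used in real code.

```scala
object PipeSketch {
  // Hypothetical stand-in for the |> operator.
  implicit class PipeOps[A](private val self: A) extends AnyVal {
    // Applies the function on the right to the value on the left.
    def |>[B](f: A => B): B = f(self)
  }

  def double(i: Int): Int = i * 2

  val direct: Int = double(21)
  val piped: Int = 21 |> double
}
```

Both forms compute the same value; the piped form simply moves the function to the right of its argument.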
This approach also works for EitherT (and others) as the inner
context, but their lifting methods are more complex and require
parameters. Scalaz provides monad transformers for a lot of its own
types, so it is worth checking if one is available.
Implementing a monad transformer is an advanced topic. Although
ListT exists, it should be avoided because it can unintentionally
reorder flatMap calls according to
https://github.com/scalaz/scalaz/issues/921. A better alternative is
StreamT, which we will visit later.