
JIT: IO and Algebraic Substitution (Scala 3 Video)

The first key about IO types — such as ZIO — is that they let you continue to “reason about your code” using the concepts of referential transparency (RT) and algebraic substitution (AS).

Referential transparency is good

To demonstrate this, here’s a new function named stringLength:

def stringLength(s: String): Int = s.length

Technically the Scala String type already has that length method, but ignoring that for now, the important part of stringLength is that it’s a pure function. Therefore, in this next example, fooLength and stringLength("foo") will always and forever have the exact same value (the integer 3):

val fooLength = stringLength("foo")

So now anywhere in your code you can use fooLength, stringLength("foo"), and the value 3, and they will always and forever have the same value and mean the same thing.
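As a quick sanity check, here’s a small sketch of what algebraic substitution looks like in practice. The three values a, b, and c are names I made up for this example; the point is only that swapping fooLength, stringLength("foo"), and 3 for one another never changes the result:

```scala
def stringLength(s: String): Int = s.length

val fooLength = stringLength("foo")

@main def substitutionDemo(): Unit =
  // The same expression, written three ways by substituting equals for equals
  val a = fooLength + fooLength
  val b = stringLength("foo") + stringLength("foo")
  val c = 3 + 3
  assert(a == b)
  assert(b == c)
```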

Now contrast that with an impure method like io.StdIn.readLine():

val input = io.StdIn.readLine()

Because people can type anything at the command line, the return value of io.StdIn.readLine() can be different every time it’s called.

This means that readLine is NOT referentially transparent, and we cannot use algebraic substitution with it. In the FP world this is bad, because we want to write our code as algebra.
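To see why substitution breaks down, here’s a rough sketch that you can run without typing anything. The fakeReadLine function is a hypothetical stand-in for io.StdIn.readLine() that I made up for this example: it returns a different “typed” line on every call. Replacing the input value with the call that produced it changes the result, which is exactly what can’t happen with a pure function:

```scala
// Hypothetical stand-in for io.StdIn.readLine(): every call returns a new line
val typedLines: Iterator[String] = Iterator.from(1).map(n => s"line $n")

def fakeReadLine(): String = typedLines.next()

@main def notReferentiallyTransparent(): Unit =
  val input = fakeReadLine()

  // Using the value twice ...
  val viaValue = (input, input)

  // ... is NOT the same as substituting the expression that produced it
  val viaCall = (fakeReadLine(), fakeReadLine())

  assert(viaValue._1 == viaValue._2)
  assert(viaCall._1 != viaCall._2)
  assert(viaValue != viaCall)
```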

So, what can we do?

This is where the Haskell people invented the IO type.

IO to the rescue

The way you use an IO type is similar to Try, looking something like these two examples in the most basic case:

val input: IO[String] = IO(io.StdIn.readLine())

def printLater(s: String): IO[Unit] = IO {
    println(s)
}

A key here is that IO is like a wrapper, and a lazy wrapper at that: the code that you pass into IO isn’t executed now, it will be executed some time later. This is even true in the first case, where I define input as a val field.

Because IO is lazy, we’re still writing code as a blueprint. Right now we describe what we want, and then some time later that code is actually run.
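If it helps to see the laziness in action, here’s a minimal toy IO. To be clear, this is only a sketch for illustration: it is not how Cats Effect or ZIO implement their types, and the unsafeRun method name is my own assumption. The idea is that the code you pass in is stored as a function, and nothing runs until unsafeRun is called:

```scala
// A toy IO for illustration only (not the Cats Effect or ZIO implementation)
final class IO[A](thunk: () => A):
  // Hypothetical method name: runs the stored code and returns its result
  def unsafeRun(): A = thunk()

object IO:
  // `body` is a by-name parameter, so it is stored here, not evaluated
  def apply[A](body: => A): IO[A] = new IO(() => body)

def printLater(s: String): IO[Unit] = IO {
  println(s)
}

@main def lazyDemo(): Unit =
  var ran = false
  val io: IO[Int] = IO { ran = true; 42 }

  assert(!ran)                  // creating the IO did not run the code
  assert(io.unsafeRun() == 42)  // now the code runs
  assert(ran)
```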

Instead of thinking of IO as a wrapper, you can think of it as a box — and a bit of a time machine as well. Right now it doesn’t contain anything, it’s just like a box with a label on it that describes what will be in it some time later. You can imagine a label being on the box that says, “Some time later this box will contain either (a) a String or (b) an error.”

TIP: An IO is like getting a receipt from Amazon that “your package will be delivered,” and some time later it will either (a) be delivered just fine (the Either happy case), or (b) be damaged (Either’s unhappy case). (It may also get lost, but that doesn’t fit with my analogy.)

But right now, as a mathematician or architect, you just think of it as an IO:

+------+
|  IO  |
+------+

If you’re familiar with a Scala Future, it’s similar to a Future, with one big difference: a Future begins running immediately, but IO is lazy, so nothing happens right now.

Laziness

We say that code like this is lazy, or lazily evaluated. Haskell is famous for being a language where evaluation is lazy by default. You can write all the Haskell code you want, but nothing will happen until you issue the magic incantation that says, “Run the code now.”

Conversely, all of the code I used to write in Java was strict, or strictly evaluated. For example, when you write Java code like this, it runs right now:

String input = getInput();

Frankly, I wrote Java code like this for almost fifteen years, and didn’t even know there was another way to write code.

Laziness helps with RT

What laziness does for us is that it lets us keep thinking of our code as algebra or a blueprint. We know that we’re working with an IO of a specific type — such as an IO[String] — and then we merrily keep creating our blueprint, because nothing happens now.

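For instance, given a readInputLater function that returns an IO[String] (much like the input value shown earlier), we can also write a for expression using our lazy IO values, and again, nothing happens yet: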

val result = for
    input <- readInputLater()
    _     <- printLater(input)
yield
    ()

As I mentioned, Haskell has a magic incantation that lets you say, “Okay, I’m ready, run the blueprint now.” The Cats Effect and ZIO libraries have a similar incantation, and historically that incantation has been a function named something like runUnsafe. This means that when you’re ready to run your entire application — or in this example, just this for expression — you run it like this:

runUnsafe(result)

The actual function names vary, but the word “unsafe” is an FP way of saying, “Hang on everyone, we’re going to execute the blueprint now, including all those unsafe side effects.” Until now everything was all equations and blueprints, and nothing happened, but now it’s action time.
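Here’s a sketch of the whole blueprint-then-run flow that you can run yourself. Everything in it is an assumption made for illustration: the toy IO class (with just enough map and flatMap to support a for expression), the runUnsafe function, and a readInputLater that returns a canned string instead of reading from the command line. A log buffer stands in for real side effects so you can see when they happen:

```scala
import scala.collection.mutable.ListBuffer

// A toy IO for illustration only (not the Cats Effect or ZIO implementation)
final class IO[A](thunk: () => A):
  def unsafeRun(): A = thunk()
  // map and flatMap build a new, bigger blueprint; they run nothing themselves
  def map[B](f: A => B): IO[B] = IO(f(thunk()))
  def flatMap[B](f: A => IO[B]): IO[B] = IO(f(thunk()).unsafeRun())

object IO:
  def apply[A](body: => A): IO[A] = new IO(() => body)

// Hypothetical "incantation"; real libraries use different names
def runUnsafe[A](io: IO[A]): A = io.unsafeRun()

// Records each side effect so we can check when it actually happened
val log = ListBuffer.empty[String]

// Canned input instead of io.StdIn.readLine(), so the example needs no typing
def readInputLater(): IO[String] = IO { log += "read"; "hello" }
def printLater(s: String): IO[Unit] = IO { log += s"print: $s"; () }

@main def blueprintDemo(): Unit =
  val result: IO[Unit] =
    for
      input <- readInputLater()
      _     <- printLater(input)
    yield ()

  assert(log.isEmpty)   // still just a blueprint
  runUnsafe(result)     // action time
  assert(log.toList == List("read", "print: hello"))
```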

And now, to wrap up this lesson, I want to add two notes.

Note 1: Future is not like this

First, it’s important to be clear that the Scala Future does not work like this. For example, if the earlier input value was written like this:

val input: Future[String] = Future(doSomeSideEffect())

the doSomeSideEffect function is scheduled to run as soon as the Future is created. This is how Future works: it’s eager, so it starts running without waiting to be asked.

From an FP standpoint, this is bad; it’s a design flaw. The flaw is that Future combines the blueprint and its action in the same step. As FPers, we need to be able to lay out the blueprint without it starting to run while we’re still working. As mathematicians and architects, we need to be able to separate the what from the when, the description from the execution: the action should only be triggered some time later, specifically when something like runUnsafe is invoked.
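You can see this eagerness with a small experiment. In this sketch (the function name and the two-second timeout are my own choices), the Future body flips a latch, and the calling code only watches the latch. It never calls anything on the Future itself, and yet the body runs:

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Returns true if the Future's body ran even though nobody asked it to
def futureBodyRanOnItsOwn(): Boolean =
  val started = new CountDownLatch(1)

  // Just *defining* the Future hands its body to the execution context
  val input: Future[String] = Future {
    started.countDown()   // the side effect
    "some input"
  }

  // We never touch `input`; we only wait (up to 2 seconds) for the side effect
  started.await(2, TimeUnit.SECONDS)

@main def futureIsEager(): Unit =
  assert(futureBodyRanOnItsOwn())
```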

Note 2: Call by-name parameters

If you’re wondering how IO can be lazy, and you’re not familiar with how call by-name parameters work in Scala, please see the Call By-Name lesson in the Appendix. It explains how this works.
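As a very short preview, here’s a sketch of the difference between a regular (by-value) parameter and a by-name parameter. All of the names here are made up for this example. A by-value argument is evaluated before the function is called, while a by-name argument is only evaluated when, and each time, the function body refers to it:

```scala
// Counts how many times the side-effecting expression is evaluated
var evaluations = 0

def sideEffect(): Int =
  evaluations += 1
  42

def byValue(x: Int): String = "x was ignored"      // x is evaluated before the call
def byName(x: => Int): String = "x was ignored"    // x is never evaluated here
def byNameTwice(x: => Int): Int = x + x            // x is evaluated twice

@main def byNameDemo(): Unit =
  val start = evaluations

  val r1 = byValue(sideEffect())
  assert(evaluations == start + 1)   // evaluated even though it was ignored

  val r2 = byName(sideEffect())
  assert(evaluations == start + 1)   // not evaluated at all

  val r3 = byNameTwice(sideEffect())
  assert(r3 == 84)
  assert(evaluations == start + 3)   // evaluated once per reference
```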

Attribution

The “box” icon in this video comes from this icons8.com link.

Update: All of my new videos are now on LearnScala.dev