JIT: IO and Algebraic Substitution (Scala 3 Video)
The first key point about IO types, such as ZIO, is that they let you continue to “reason about your code” using the concepts of referential transparency (RT) and algebraic substitution (AS).
Referential transparency is good
To demonstrate this, here’s a new function named stringLength:
def stringLength(s: String): Int = s.length
Technically the Scala String type already has that length method, but ignoring that for now, the important part of stringLength is that it’s a pure function. Therefore, in this next example, fooLength and stringLength("foo") will always and forever have the exact same value (the integer 3):
val fooLength = stringLength("foo")
So now anywhere in your code you can use fooLength, stringLength("foo"), and the value 3 interchangeably, and they will always and forever have the same value and mean the same thing.
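As a small sketch of what that substitution looks like in practice, the three expressions below are built from the name, the function call, and the literal. The value names viaName, viaCall, and viaLiteral are made up for this illustration.

```scala
// A pure function: the result depends only on the argument.
def stringLength(s: String): Int = s.length

val fooLength = stringLength("foo")

// Swap the name, the call, and the literal freely; the meaning never changes.
val viaName    = fooLength + fooLength
val viaCall    = stringLength("foo") + stringLength("foo")
val viaLiteral = 3 + 3
// All three values are 6.
```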
Now contrast that with an impure method like io.StdIn.readLine():
val input = io.StdIn.readLine()
Because people can type an infinite number of things at the command line, the return value of io.StdIn.readLine() will almost always be different.
This means that readLine is NOT referentially transparent, and we cannot use algebraic substitution with it. In the FP world this is bad, because we want to write our code as algebra.
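To see the substitution break without needing a real keyboard, here is a rough sketch that uses a hypothetical FakeConsole class as a stand-in for io.StdIn.readLine(). It is an assumption made for this example: it simply hands back the next line from a fixed list, the way a user might type two different lines.

```scala
// Hypothetical stand-in for io.StdIn.readLine(): every call returns the next line.
final class FakeConsole(lines: List[String]):
  private var remaining = lines

  def readLine(): String =
    val next = remaining.head
    remaining = remaining.tail
    next

def substitutionBreaks(): (String, String) =
  // Version 1: read once, then reuse the name.
  val console1 = FakeConsole(List("hello", "world"))
  val input = console1.readLine()
  val viaName = input + input                              // "hellohello"

  // Version 2: "substitute" the call for the name.
  val console2 = FakeConsole(List("hello", "world"))
  val viaCall = console2.readLine() + console2.readLine()  // "helloworld"

  (viaName, viaCall)
```

With a pure function the two versions would be equal. Here they are not, which is exactly what “not referentially transparent” means.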
So, what can we do?
This is where the Haskell people invented the IO type.
IO to the rescue
The way you use an IO type is similar to Try, looking something like these two examples in the most basic case:
val input: IO[String] = IO(io.StdIn.readLine())
def printLater(s: String): IO[Unit] = IO {
    println(s)
}
A key here is that IO is like a wrapper, and a lazy wrapper at that: the code that you pass into IO isn’t executed now; it will be executed some time later. This is even true in the first case, where I define input as a val field.
Because IO is lazy, we’re still writing code as a blueprint. Right now we describe what we want, and then some time later that code is actually run.
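To make the “lazy wrapper” idea concrete, here is a minimal sketch of a toy type. MiniIO and its unsafeRun method are names invented for this example; this is not how ZIO or Cats Effect are implemented, just the smallest thing that shows a val being defined without its code running.

```scala
// Hypothetical, minimal IO: it stores the code without running it.
final class MiniIO[A](thunk: () => A):
  // Runs the stored code. Nothing happens until this is called.
  def unsafeRun(): A = thunk()

object MiniIO:
  // `body` is a by-name parameter, so it is NOT evaluated at the call site.
  def apply[A](body: => A): MiniIO[A] = new MiniIO(() => body)

def lazinessDemo(): (Int, String, Int) =
  var runs = 0
  val input: MiniIO[String] = MiniIO { runs += 1; "typed text" }
  val runsAfterDefinition = runs     // still 0: defining the val ran nothing
  val value = input.unsafeRun()      // the stored code runs here
  (runsAfterDefinition, value, runs) // (0, "typed text", 1)
```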
Instead of thinking of IO as a wrapper, you can think of it as a box, and a bit of a time machine as well. Right now it doesn’t contain anything; it’s just like a box with a label on it that describes what will be in it some time later. You can imagine a label on the box that says, “Some time later this box will contain either (a) a String or (b) an error.”
TIP: An IO is like getting a receipt from Amazon that says “your package will be delivered,” and some time later it will either (a) be delivered just fine (the Either happy case), or (b) be damaged (Either’s unhappy case). (It may also get lost, but that doesn’t fit with my analogy.)
But right now, as a mathematician or architect, you just think of it as an IO:
+------+
|  IO  |
+------+
If you’re familiar with a Scala Future, it’s similar to a Future, with one big difference: a Future begins running immediately, but IO is lazy, so nothing happens right now.
Laziness
We say that code like this is lazy, or lazily evaluated. Haskell is famous for being a language where everything is evaluated lazily. You can write all the Haskell code you want, but nothing will happen until you issue the magic incantation that says, “Run the code now.”
Conversely, all of the code I used to write in Java was strict, or strictly evaluated. For example, when you write Java code like this, it runs right now:
String input = getInput();
Frankly, I wrote Java code like this for almost fifteen years, and didn’t even know there was another way to write code.
Laziness helps with RT
What laziness does for us is that it lets us keep thinking of our code as algebra or a blueprint. We know that we’re working with an IO of a specific type, such as an IO[String], and then we merrily keep creating our blueprint, because nothing happens now.
For instance, we can also write a for expression using our lazy IO values, and again, nothing happens yet:
val result = for
    input <- readInputLater()
    _     <- printLater(input)
yield
    ()
As I mentioned, Haskell has a magic incantation that lets you say, “Okay, I’m ready, run the blueprint now.” The Cats Effect and ZIO libraries have a similar incantation, and historically that incantation has been a function named something like runUnsafe. This means that when you’re ready to run your entire application (or in this example, just this for expression) you run it like this:
runUnsafe(result)
The actual function names vary, but the word “unsafe” is an FP way of saying, “Hang on everyone, we’re going to execute the blueprint now, including all those unsafe side effects.” Until now everything was all equations and blueprints, and nothing happened, but now it’s action time.
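Here is a rough, self-contained sketch of that whole flow. ToyIO, its delay constructor, and the runUnsafe helper are hypothetical names made up for this example, and the “input” is a hard-coded string rather than real keyboard input. The point is only that building the for expression records nothing in the log, while runUnsafe triggers both steps in order.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical toy IO with just enough methods for a `for` expression.
final class ToyIO[A](thunk: () => A):
  def run(): A = thunk()
  def map[B](f: A => B): ToyIO[B] = ToyIO.delay(f(run()))
  def flatMap[B](f: A => ToyIO[B]): ToyIO[B] = ToyIO.delay(f(run()).run())

object ToyIO:
  // By-name parameter: `body` is stored, not evaluated.
  def delay[A](body: => A): ToyIO[A] = new ToyIO(() => body)

// Stand-in for the "magic incantation" that executes the blueprint.
def runUnsafe[A](io: ToyIO[A]): A = io.run()

def blueprintDemo(): (List[String], List[String]) =
  val log = ListBuffer.empty[String]
  def readInputLater(): ToyIO[String] = ToyIO.delay { log += "read"; "hello" }
  def printLater(s: String): ToyIO[Unit] = ToyIO.delay { log += s"print $s"; () }

  // Only a description so far; no side effect has happened.
  val result: ToyIO[Unit] =
    for
      input <- readInputLater()
      _     <- printLater(input)
    yield ()

  val logBeforeRun = log.toList   // empty
  runUnsafe(result)               // now the side effects happen
  (logBeforeRun, log.toList)      // (List(), List("read", "print hello"))
```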
And now, to wrap up this lesson, I want to add two notes.
Note 1: Future is not like this
First, it’s important to be clear that the Scala Future does not work like this. For example, if the earlier input value was written like this:
val input: Future[String] = Future(doSomeSideEffect())
the doSomeSideEffect function will be run immediately. This is how Future works: it starts running as soon as it is created.
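You can observe this eagerness with a small sketch. The flag sideEffectRan and the function name futureIsEager are made up for the example. Notice that the code never asks the Future to start; it only waits for it to finish, and the side effect has happened anyway.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*

def futureIsEager(): Boolean =
  val sideEffectRan = AtomicBoolean(false)

  // Creating the Future schedules its body right away.
  val input: Future[String] = Future { sideEffectRan.set(true); "done" }

  // This only waits for completion; there is no separate "run" step.
  Await.ready(input, 5.seconds)
  sideEffectRan.get()   // true
```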
From an FP standpoint, this is bad; it’s a design flaw. The flaw is that Future combines the blueprint and its action in the same step. As FPers, we need to be able to lay out the blueprint without it starting to run while we’re still working. As mathematicians/architects, we need to be able to separate the what from the when, the description from the execution: ideally, Future’s action would only be triggered some time later, specifically when runUnsafe is invoked.
Note 2: Call by-name parameters
If you’re wondering how IO can be lazy, and you’re not familiar with how call by-name parameters work in Scala, please see the Call By-Name lesson in the Appendix. It explains how this works.
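As a quick preview, here is a small sketch of the difference between a regular (by-value) parameter and a by-name parameter. The function names byValue, byName, byNameTwice, and callByNameDemo are invented for this example.

```scala
// By-value: the argument is evaluated before the function body runs.
def byValue(x: Int): String = "ignored"

// By-name (note the `=>`): the argument is evaluated each time `x` is used,
// and never if `x` is not used.
def byName(x: => Int): String = "ignored"
def byNameTwice(x: => Int): Int = x + x

def callByNameDemo(): (Int, Int, Int, Int) =
  var evaluations = 0
  def expensive(): Int =
    evaluations += 1
    42

  byValue(expensive())
  val afterByValue = evaluations      // 1: evaluated even though unused
  byName(expensive())
  val afterByName = evaluations       // still 1: never evaluated
  val sum = byNameTwice(expensive())  // evaluated twice: 42 + 42
  (afterByValue, afterByName, sum, evaluations)   // (1, 1, 84, 3)
```

A lazy IO-style wrapper can take its code as a by-name parameter like this and hold on to it until someone asks for it to be run.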
Attribution
The “box” icon in this video comes from this icons8.com link.