Algebraic Data Type: “A type defined by providing several alternatives, each of which comes with its own constructor. It usually comes with a way to decompose the type through pattern matching. The concept is found in specification languages and functional programming languages. Algebraic data types can be emulated in Scala with case classes.”

## Introduction

I debated for a long time about how to introduce Algebraic Data Types (ADTs) in this book. I finally decided that ADTs aren’t a way of proactively *designing* FP code; instead, they’re a way of *categorizing* the FP code you’ve already written, specifically your FP data models. That is, you don’t sit down and say, “I want to write my `Customer`

class as a ‘sum of the products’ ADT.” Instead, if you do anything, you might look at some code and say, “Hey, Mary, did you notice that the `Customer`

class is a ‘sum of the products’ ADT?”

Because I look at ADTs as a way of categorizing or classifying code rather than designing code, I decided it was best to include this topic as an appendix in this book, rather than as a lesson in the front of the book.

The “TL;DR” version of this lesson goes like this: If you create your data models using (a) `case`

classes with immutable fields and (b) `case`

objects, and (c) those data types have no methods, you’re already writing ADTs.

Surprise: Almost every

`case`

class and`case`

object in this book is some form of ADT. I just didn’t think it was important to categorize them when I wrote them.

## Goals, motivation

The goals of this lesson are:

- To define basic ADT terminology
- To show examples of the different ADT types
- To help demonstrate another way to see your Scala/FP code as algebra

In regards to that last point, a main benefit of being aware of ADTs is that you begin to see your code as being even more like algebra. You specifically begin to see your `case`

classes and `case`

objects as *sets of objects*, and the functions that operate on them as *operators* on those data sets. To understand what that means, it will help to formally define *algebra*.

## What is “Algebra”?

To understand ADTs, you first have to understand what is meant by the word “algebra.” Informally, an *algebra* can be thought of as consisting of two things:

- A set of objects
- The operations that can be applied to those objects to create new objects

Technically an algebra also consists of a third item, the

lawsthat govern the algebra, but I’m not going to cover laws in this lesson.

If you’re like me, you never thought of algebra as anything other than high school algebra, but it turns out that any concept that can be thought of as (a) a set of objects, and (b) operators on those objects is a form of algebra. Here are a few examples.

### Numeric algebra

“High school algebra” is the algebra we learned back in high school (or possibly earlier). It’s more formally known as “numeric algebra,” the algebra of numbers. You can think of it like this:

- A set of objects, such as whole numbers
- The operations that can be used on those objects:
`+`

,`-`

,`*`

(and`/`

)

One thing I never thought about is that the *operators* are used on existing numbers (objects) to create new numbers (objects):

```
1 + 1 = 2
3 * 3 = 9
```

Notice that even in basic math, the numbers `2`

and `9`

are “created” from the numbers `1`

and `3`

by using the `+`

and `*`

operators.

### Relational algebra

Another type of algebra is known as “relational algebra.” This is the algebra of relational databases, and in this algebra the database tables are the “set of objects,” and query commands like `SELECT`

, `UPDATE`

, and `JOIN`

are the “operators” that let you create new objects from the existing objects.

### Algebra in programming

Throughout this book you’ve been using algebra, possibly without knowing it. (I certainly didn’t know it when I started working with FP.) For example, take a look at this `case`

class:

```
case class Pair (
a: Int,
b: Int
)
```

This code creates a new type `Pair`

from two instances of the existing type `Int`

. The *class constructor* itself is an “operator” that lets you create new types from existing Scala types, just like `+`

and `*`

let you create new numbers from existing numbers.

Here’s another example of how you can use Scala operators to create new data types from existing ones:

```
sealed trait Direction
case object North extends Direction
case object South extends Direction
case object East extends Direction
case object West extends Direction
```

To learn about the “algebra” of these two examples, read on ...

## Three types of Algebraic Data Types

ADTs fall into three main categories:

*Sum*type*Product*type- Hybrid types

You actually just saw the first two types, and I’ll explain all three of them in the following sections.

## The Sum type

The `Direction`

example I just showed is called a “Sum type,” or Sum ADT. The Sum type is also referred to as an “enumerated type” because you simply enumerate all of the possible instances of the type. A few important points about this are:

- Sum types are typically created with a
`sealed trait`

as the base type, with instances created as`case`

objects. You use a sealed trait because you don’t want them to be extended. - The number of enumerated types you list are the only possible instances of the base type. In this example,
`Direction`

has four possible values:`North`

,`South`

,`East`

, or`West`

. - We use the phrases “is a” and “or” when talking about Sum types. For example,
`North`

*is a*type of`Direction`

, and`Direction`

is a`North`

*or*a`South`

or an`East`

or a`West`

.

People use different names for the concrete instances in a Sum type, including value constructors, alternates, and cases.

### Another example

As another example, imagine that you need to write your own “boolean” type for Scala. You can write them as a Sum type like this:

```
sealed trait Bool
case object True extends Bool
case object False extends Bool
```

Just like the `Direction`

example, the base type is defined as a `sealed trait`

and the two possible values are defined as `case object`

. Also notice that this approach uses the `sealed trait`

and `case object`

syntax as *operators* to create new data types.

### Why use `sealed trait`

?

A great feature of using `sealed trait`

is that it lets the compiler perform “exhaustiveness checking.” What happens is that a sealed trait can only be extended in the file in which it was defined; because it can’t be extended *anywhere* else, the compiler knows all of the subtypes of the trait that can possibly exist. Because of this, the compiler can exhaustively check the possible cases in `match`

expressions, and it will emit a warning if the `match`

expression isn’t exhaustive. This makes your programming life easier, and your code safer.

### Why use `case object`

?

The reason Sum types use the `case object`

declaration is that they only require *singleton* instances, and the Scala `object`

provides that functionality. For instance, with the `Bool`

example it makes sense to have only one `True`

instance in *all* of your code. There’s no need to create new `True`

and `False`

instances every time you work with boolean values. Scala’s `object`

gives you this singleton functionality.

As Scala/FP developers, we further use `case object`

— as opposed to `object`

— because it provides important additional functionality, with the most important feature being support for pattern matching; its automatically-generated `unapply`

method lets case objects work easily in `match`

expressions. (`case object`

also provides default `equals`

and `hashCode`

methods, extends `Serializable`

, has a good default `toString`

method, etc.)

## The Product type

The second type of ADT is known as a “Product type.” It’s name comes from the fact that you use the Scala `case`

class constructor to create a data type whose number of possible concrete instances can be determined by multiplying the number of possibilities of all of its constructor fields.

Take this class for example:

```
case class DoubleBoo (
b1: Bool,
b2: Bool
)
```

How many possible instances of this class can you have? Well, each field can either be `True`

or `False`

, so the possibilities are:

```
DoubleBoo(True, True)
DoubleBoo(True, False)
DoubleBoo(False, True)
DoubleBoo(False, False)
```

Therefore, the correct answer is that there are four possible instances. You can also derive this answer mathematically:

`b1`

has two possibilities`b2`

has two possibilities- The total number of possible instances is determined by multiplying the number of possibilities of each constructor field, and
`2`

multiplied by`2`

is`4`

Because the number of possible instances of Product ADTs can be calculated by multiplying the number of possible values of every constructor parameter, what do you think the number of possibilities of this `Pair`

type are:

```
case class Pair (
a: Int,
b: Int
)
```

If you answered, “A lot,” that’s close enough. An `Int`

has 2^32 possible values, so if you multiply the number of possible `Int`

values by itself, you get a very large number.

Next, what do you think the number of possibilities are for this class:

```
case class Person (
firstName: String,
lastName: String,
mother: Person,
father: Person
)
```

If you answered, “Infinite,” that’s a good answer. Because a `String`

has an infinite number of possibilities, `Person`

can have an infinite number of concrete instances.

While I don’t concern myself with ADTs too much, this particular point had a significant impact on me. When I first saw this, I realized that any time a function accepted a

`String`

, that`String`

had an infinite number of possibilities. That’s a lot to account for. Similarly, a boolean value has two possibilities, a`Byte`

has 256 possible values, and the`Direction`

Sum type has four possibilities. The lesson for me is that the fewer possibilities you have to deal with, the simpler your code will be. (At the very least, this was the last time I ever used a series of string constants instead of enumerations.)

Before we move on, here are a few important points about the Product type:

- Writing
`case class`

and defining the constructor parameters is essentially the “product” operator. - The number of possible values of a Product type is the product of all possible combinations of the constructor parameters (i.e., a Cartesian product).
- We use the phrases “has a” and “and” when talking about Product types.
`Pair`

*has a*`a`

*and*a`b`

;`Person`

has a`firstName`

,`lastName`

,`mother`

, and`father`

. - As shown in the
`Person`

example, Product types can be recursive;`mother`

and`father`

are declared as`Person`

types inside the`Person`

definition.

## Hybrid types

The Sum and Product types are the two base ADTs; all other ADTs are hybrids created from those base types. As the book Essential Scala states, “An algebraic data type is any data that uses the above two patterns” (Sum and Product).

One formally-defined hybrid type is known as the “Sum of Products” type. With a few minor changes to reflect modern Scala practices, Mario Gleichmann created a good example of this in 2011:

```
sealed trait Shape
final case class Circle(radius: Double) extends Shape
final case class Rectangle(width: Double, height: Double) extends Shape
```

These types represent a Sum type because `Shape`

*is a* `Circle`

*or* a `Rectangle`

; `Circle`

is a Product type because it *has a* `radius`

; and `Rectangle`

is also Product type because it *has a* `width`

*and* a `height`

.

There are other variations of these possibilities, which is why I refer to all other combinations as “hybrid” types. For instance, the `Pizza`

class in the domain modeling lessons is a Product type that contains three Sum types:

```
case class Pizza (
crustSize: CrustSize,
crustType: CrustType,
toppings: Seq[Topping]
)
```

Sum and Product types can be combined in any ways that are needed to solve the problem at hand. Hopefully this demonstrates the point I made at the beginning of this lesson: ADTs are just a way of formally categorizing the data types in your data model.

## Pattern matching

A great benefit of ADTs is that they simplify and encourage the use of pattern matching in your code. For instance, given these `Shape`

types:

```
sealed trait Shape
final case class Circle(radius: Double) extends Shape
final case class Rectangle(width: Double, height: Double) extends Shape
```

you can easily write an `isRound`

function using pattern matching:

```
def isRound(s: Shape): Boolean = s match {
case Circle(_) => true
case _ => false
}
```

Similarly, using the `Bool`

type I created earlier:

```
sealed trait Bool
case object True extends Bool
case object False extends Bool
```

you can define `and`

and `or`

functions with pattern matching:

```
def and(a: Bool, b: Bool): Bool = (a,b) match {
case (True, True) => True
case (False, False) => True
case (True, False) => False
case (False, True) => False
}
def or(a: Bool, b: Bool): Bool = (a,b) match {
case (True, _) => True
case (_, True) => True
case (_, _) => False
}
```

This is what those last two functions look like in the Scala REPL:

`scala> `**or(True,False)**
res0: Bool = True
scala> **and(True,False)**
res1: Bool = False

As demonstrated in these examples, using pattern matching with ADTs is a common programming pattern ... a Scala/FP idiom.

## Key points

The key points of this lesson are:

- If you create your data models using (a)
`case`

classes with immutable fields and (b)`case`

objects, and (c) those data types have no methods, you’re already writing ADTs (whether you knew it or not). - I view ADTs as a way of categorizing or observing code, not designing code.
- An “algebra” is a set of objects, the operators that can be used on those objects, and laws governing their behavior.
- The two main types of ADTs are the Sum type and Product type. Other hybrid ADTs are derived from these base types.
- ADTs encourage a pattern-matching style of programming.

## See also

- If you don’t mind a little Haskell, Chris Taylor’s The Algebra of ADTs is an excellent resource, and discusses ADT “laws”
- Mario Gleichmann wrote a series of articles on ADTs in Scala, starting with this one
- Daniel Eklund wrote a good article, What the heck are Algebraic Data Types?
- Susan Potter has a nice example of ADTs in Scala
- Tim Perrett uses ADTs well in this “traffic signal” example
- ADTs are described here on Wikipedia
- ADTs are described here on Haskell.org
- Wikipedia has this definition of Universal Algebra
- Martin Odersky mentions ADTs in this acm.org article
- In underscore.io’s article on sealed traits they have a nice discussion of ADTs
- The best book I know on Scala and functional programming :)