How to Enable Filtering in a Scala for-expression

Next, let’s see if we can use a filtering clause inside of a for expression with the Sequence code we have so far.

Trying to use a filtering expression

When I paste the current Sequence class and this code into the Scala REPL:

val ints = Sequence(1,2,3,4,5)

val res = for {
    i <- ints
    if i > 2
} yield i*2

I see the following error message:

<console>:14: error: value filter is not a member of Sequence[Int]
           i <- ints
                ^

Again the bad news is that I can’t use a filtering clause like if i > 2, and the good news is that the REPL tells me why it won’t work. So let’s fix this problem.

Before we continue ...

One note before we continue: I ran this example with the Scala 2.11.7 REPL, and that error message isn’t 100% accurate. As I mentioned a few lessons ago, the current rule for how to get a custom collection class to work in a for expression is this:

  • If a class defines withFilter, it allows for filter expressions starting with an if within the for expression

In versions of Scala up to 2.7 the filter error message shown in the REPL was correct, but starting with Scala 2.8 the preferred solution is to implement a withFilter method rather than a filter method. Therefore, in the following code I’ll implement withFilter.

I’ll write more about this shortly, but for the purposes of this lesson you can think of withFilter as being just like a filter method.

Writing withFilter’s type signature

At this point a good question is, “How does a filter method work?” This image shows the Scaladoc for the filter method of the Scala List class:

The Scaladoc for the `filter` method of the Scala `List` class

By looking at that figure — and from your knowledge of the Scala collection methods — you know these things about how a typical filter method works:

1) It takes a function input parameter (FIP). That FIP must be able to be applied to the type of elements in the collection, and must return a Boolean value.

2) filter loops over the elements in its collection, and returns a new collection that contains the elements for which the passed-in function evaluates to true.

For instance, you can pass the anonymous function _ > 2 into a List[Int]:

scala> val res = List(1,2,3,4,5).filter(_ > 2)
res: List[Int] = List(3, 4, 5)

Note: A function that returns a Boolean value is known as a predicate.

3) Unlike map, filter doesn’t transform elements in the collection; it just returns a subset of the elements in the collection. For instance, when _ > 2 is applied, all elements in the collection that are greater than 2 are returned. This tells us that filter’s return type will be a Sequence that holds the same element type the original Sequence contains.
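The three points above can be sketched as a small standalone function. This is only an illustration of how a typical filter behaves, not the code the standard library uses; the names FilterSketch and keepMatching are hypothetical.

```scala
import scala.collection.mutable.ListBuffer

object FilterSketch {
  // Hypothetical helper: keeps the elements of `xs` for which `p` is true.
  def keepMatching[A](xs: List[A])(p: A => Boolean): List[A] = {
    val kept = ListBuffer[A]()
    for (x <- xs) {
      // the predicate is applied to each element and returns a Boolean
      if (p(x)) kept += x   // elements are copied as-is, never transformed
    }
    kept.toList             // same element type A as the input
  }
}
```

Calling FilterSketch.keepMatching(List(1,2,3,4,5))(_ > 2) gives the same elements as the List filter example above.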

Put together, these bullet points tell us that a filter method for Sequence will have this type signature:

def filter(p: A => Boolean): Sequence[A] = ???

In that code, p stands for the predicate that filter takes as an input parameter. Because Sequence contains elements of type A, the predicate takes a value of type A and returns a Boolean, and filter itself returns a Sequence[A].

When that method body is implemented you’ll be able to write code like this:

val ints = Sequence(1,2,3,4,5).filter(i => i > 2)

Because Scala for expressions prefer withFilter, I’ll go ahead and rename filter to withFilter at this point:

def withFilter(p: A => Boolean): Sequence[A] = ???

Given this type signature, all I need to do now is implement withFilter’s body.

Implementing withFilter’s body

As with foreach and map, I’ll implement withFilter’s body by calling a method on Sequence’s private ArrayBuffer. A true withFilter method works differently in the real world (more on that below), so the easiest thing to do here is to simply call filter on the ArrayBuffer:

def withFilter(p: A => Boolean): Sequence[A] = {
    val tmpArrayBuffer = elems.filter(p)
    Sequence(tmpArrayBuffer: _*)
}

When I add this code to the existing implementation of the Sequence class I get this:

case class Sequence[A](initialElems: A*) {

    private val elems = scala.collection.mutable.ArrayBuffer[A]()

    elems ++= initialElems

    def withFilter(p: A => Boolean): Sequence[A] = {
        val tmpArrayBuffer = elems.filter(p)
        Sequence(tmpArrayBuffer: _*)
    }

    def map[B](f: A => B): Sequence[B] = {
        val abMap = elems.map(f)
        new Sequence(abMap: _*)
    }

    def foreach(block: A => Unit): Unit = {
        elems.foreach(block)
    }

}

Will this let us use a filtering clause in a for expression? Let’s see.

When I paste the Sequence class source code into the REPL and then paste in this code:

val ints = Sequence(1,2,3,4,5)

val res = for {
    i <- ints
    if i > 2
} yield i*2

I see the following result:

scala> val res = for {
     |     i <- ints
     |     if i > 2
     | } yield i*2
res: Sequence[Int] = Sequence(ArrayBuffer(6, 8, 10))

Excellent, it works as desired. I can now use if clauses inside for expressions with the Sequence class.
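The same withFilter method also covers an if clause in a for loop that has no yield, because that form is translated to withFilter followed by foreach. Here is a minimal sketch of that case. It uses a trimmed-down copy of Sequence so it stands alone; the .toSeq call and the GuardedLoopDemo object are my additions (the .toSeq is there as an assumption that it helps the varargs call compile on newer Scala versions).

```scala
import scala.collection.mutable.ArrayBuffer

// Trimmed-down copy of the lesson's class so this snippet stands alone.
case class Sequence[A](initialElems: A*) {
  private val elems = ArrayBuffer[A]()
  elems ++= initialElems

  def withFilter(p: A => Boolean): Sequence[A] =
    Sequence(elems.filter(p).toSeq: _*)   // .toSeq is an addition, not in the lesson

  def foreach(block: A => Unit): Unit = elems.foreach(block)
}

object GuardedLoopDemo {
  // Collects the values the guarded loop visits, so the result can be inspected.
  def collectLarge(ints: Sequence[Int]): List[Int] = {
    val seen = ArrayBuffer[Int]()
    // translated to: ints.withFilter(i => i > 2).foreach(i => seen += i)
    for (i <- ints if i > 2) seen += i
    seen.toList
  }
}
```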

I’ll implement more functionality in the next lesson, but it’s worth pausing for a few moments here to learn more about the differences between implementing withFilter or filter in a class that you want to use in a for expression.

filter vs withFilter

You can read more about how for/yield expressions are translated in a post on the official Scala website titled, “How does yield work?,” but the short story is this:

  • for comprehensions with if filters are translated to withFilter method calls
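  • If withFilter does not exist on the class being used in the for comprehension, the compiler will fall back and use the class’s filter method instead (this is the behavior of the Scala 2.11 compiler used in this lesson; Scala 2.12 and later require withFilter)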
  • If neither method exists, the compilation attempt will fail
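To make the first rule concrete, here is a small sketch that uses the built-in List class rather than Sequence, so it runs on its own. It shows a for expression with an if clause next to the method calls it is roughly rewritten to. The object name DesugarDemo is just for illustration.

```scala
object DesugarDemo {
  val xs = List(1, 2, 3, 4, 5)

  // The for expression as you would normally write it
  val sugared: List[Int] = for {
    i <- xs
    if i > 2
  } yield i * 2

  // Roughly what the compiler turns it into:
  // the `if` becomes withFilter, the `yield` becomes map
  val desugared: List[Int] = xs.withFilter(i => i > 2).map(i => i * 2)
}
```

Both values contain the same elements, which is why a class only needs withFilter and map to support this kind of for expression.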

If I had implemented filter in this lesson (rather than withFilter), in the next lesson you’d start to see compiler warning messages like this:

Warning:(31, 14) `withFilter' method does not yet exist on Sequence[A], using `filter' method instead
        p <- peeps
             ^

To avoid those warning messages, I implemented withFilter here.

However — and that’s a big however — it’s important to know that my withFilter method is not exactly what the Scala compiler is expecting.

If you’re not familiar with the difference between filter and withFilter on the built-in Scala collection classes, I wrote about them in a blog post titled, “A good example to show the differences between strict and lazy evaluation in Scala.” What I wrote there can be summarized by what you find in the withFilter Scaladoc on Scala collection classes like List:

withFilter creates a non-strict filter of this traversable collection. Note: the difference between c filter p and c withFilter p is that the former creates a new collection, whereas the latter only restricts the domain of subsequent map, flatMap, foreach, and withFilter operations.
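One way to see that difference is to log the order in which the predicate and the mapping function run on a built-in List. The following is a small sketch for experimentation; the StrictVsLazyDemo object and its trace method are hypothetical names, not part of the lesson’s code.

```scala
import scala.collection.mutable.ArrayBuffer

object StrictVsLazyDemo {
  // Records the order in which the predicate and the mapping function are called.
  def trace(useWithFilter: Boolean): List[String] = {
    val log = ArrayBuffer[String]()
    val xs = List(1, 2, 3)
    val p = (i: Int) => { log += s"test $i"; i > 1 }
    val f = (i: Int) => { log += s"map $i"; i * 2 }

    if (useWithFilter) xs.withFilter(p).map(f)  // predicate runs during map
    else xs.filter(p).map(f)                    // filter builds a full List first

    log.toList
  }
}
```

With filter, all three elements are tested before any element is mapped. With withFilter, each element is tested and, if it passes, mapped right away, so the "test" and "map" entries are interleaved in the log.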

There are probably ways that I could cheat to create a withFilter method that meets that definition, but I think that obscures the main purpose of this lesson:

  • If you implement a withFilter or filter method in your custom class, you’ll be able to use that class with an if clause in a for expression. (This assumes that you also implement other methods like foreach and map.)
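For readers who are curious, here is a rough sketch of what a non-strict withFilter might look like. It is not the approach used in this lesson, and it is not how the Scala collections implement it; the LazySequence class and its inner WithFilter class are hypothetical names. The idea is that withFilter only remembers the predicate, and the predicate is applied later when map or foreach runs.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical variant of the lesson's class, kept separate from Sequence.
class LazySequence[A](initialElems: A*) {
  private val elems = ArrayBuffer[A]()
  elems ++= initialElems

  def toList: List[A] = elems.toList

  def map[B](f: A => B): LazySequence[B] =
    new LazySequence[B](elems.map(f).toSeq: _*)

  def foreach(block: A => Unit): Unit = elems.foreach(block)

  // Builds no new collection; it only wraps the predicate.
  def withFilter(p: A => Boolean): WithFilter = new WithFilter(p)

  // Applies the stored predicate only when map or foreach is called.
  class WithFilter(p: A => Boolean) {
    def map[B](f: A => B): LazySequence[B] = {
      val out = ArrayBuffer[B]()
      for (a <- elems) if (p(a)) out += f(a)
      new LazySequence[B](out.toSeq: _*)
    }

    def foreach(block: A => Unit): Unit =
      for (a <- elems) if (p(a)) block(a)

    // A second `if` clause just combines the two predicates.
    def withFilter(q: A => Boolean): WithFilter =
      new WithFilter(a => p(a) && q(a))
  }
}
```

A for expression such as for { i <- new LazySequence(1,2,3,4,5); if i > 2 } yield i * 2 works with this class the same way it does with Sequence, but no intermediate collection is created for the if clause.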

Summary

I can summarize what I accomplished in this lesson and the previous lessons with these lines of code:

// (1) a single generator works because `foreach` is defined
for (p <- peeps) println(p)

// (2) `yield` works because `map` is defined
val res: Sequence[Int] = for {
    i <- ints
} yield i * 2
res.foreach(println)

// (3) `if` works because `withFilter` is defined
val res = for {
    i <- ints
    if i > 2
} yield i*2

What’s next

Now that I have Sequence working in all of these ways, there’s just one more thing to learn: how to modify it so we can use multiple generators in a for expression. We’ll accomplish that in the next lesson.
