How to define an `equals` method in a Scala class (object equality)

Scala problem: You want to define an equals method for your class so you can compare object instances to each other.

Solution

If you’re new to Scala, a first thing to know is that object instances are compared with ==:

"foo" == "foo"   // true
"foo" == "bar"   // false
"foo" == null    // false
null == "foo"    // false
1 == 1           // true
1 == 2           // false
1d == 1.0d       // true

case class Person(name: String)
Person("Jess") == Person("Jessie")   // false

This is different from Java, where == compares primitive values (and object references), and equals is used to compare objects by value.
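As a small sketch of that difference, the following snippet (written for the Scala REPL or a script; the variable names are only for illustration) compares two distinct String instances with ==, which checks value equality, and with eq, which AnyRef provides for reference equality:

```scala
// Two distinct String objects with the same contents
val a = new String("foo")
val b = new String("foo")

val sameValue     = a == b    // true: == delegates to equals
val sameReference = a eq b    // false: eq compares references, like Java's == on objects
```

In other words, eq is the closest Scala counterpart to Java’s == on objects, and it is rarely what you want when comparing values.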

A second thing to know is that properly writing an equals method turns out to be a difficult problem, so much so that Programming in Scala, Third Edition, by Odersky, Venners, and Spoon (Artima Press) takes 23 pages to discuss it, and Effective Java, Third Edition, by Joshua Bloch (Addison-Wesley) takes 17 pages to cover object equality. Effective Java begins its treatment with the statement, “Overriding the equals method seems simple, but there are many ways to get it wrong, and the consequences can be dire.” Despite this, I’ll attempt to demonstrate a solid solution to the problem, and also share references for further reading.

Effective Java tip: Don’t implement equals unless necessary

Before jumping into the “How to implement an equals method” solution, it’s worth noting that Effective Java states that not implementing an equals method is the correct solution for the following situations:

  • Each instance of a class is inherently unique. Instances of a Thread class are given as an example.

  • There is no need for the class to provide a “logical equality” test. The Java Pattern class is given as an example, where the designers didn’t think that people would want or need this functionality, so it simply inherits its behavior from the Java Object class.

  • A superclass has already overridden equals, and its behavior is appropriate for this class.

  • The class is private or package-private (in Java), and you are certain its equals method will never be invoked.

Those are four situations where you won’t want to write a custom equals method for your class, and those rules make sense for Scala as well. The rest of this recipe focuses on how to properly implement an equals method.

The solution: A seven-step process

Programming in Scala recommends a seven-step process for implementing an equals method for non-final classes:

  1. Create a canEqual method with the proper signature, taking an Any parameter and returning a Boolean.

  2. canEqual should return true if the argument passed into it is an instance of the current class, false otherwise. (The current class is especially important with inheritance.)

  3. Implement the equals method with the proper signature, taking an Any parameter and returning a Boolean.

  4. Write the body of equals as a single match expression.

  5. The match expression should have two cases. The first case should be a typed pattern for the current class.

  6. In the body of this case, use a series of logical “and” tests for all of the conditions in this class that must be true. If this class extends anything other than AnyRef, you’ll want to invoke your superclass equals method as part of these tests. One of the “and” tests must also be a call to canEqual.

  7. For the other case, just specify a wildcard pattern that yields false.

Any time you implement an equals method you should also implement a hashCode method, so you might say that’s Step 8 in this process.

The following example demonstrates these steps.

A Scala `equals` method example

Here’s an example class that demonstrates how to properly write an equals method for a small Scala class. In this example I’ll create a Person class with two fields:

class Person (var name: String, var age: Int) { ...

Given those two constructor parameters, here’s the complete source code for a Person class that implements an equals method and a corresponding hashCode method. The comments show which steps in the solution the code refers to:

class Person (var name: String, var age: Int) {

    // Step 1 - proper signature for `canEqual`
    // Step 2 - compare `a` to the current class
    def canEqual(a: Any) = a.isInstanceOf[Person]

    // Step 3 - proper signature for `equals`
    // Steps 4 thru 7 - implement a `match` expression
    override def equals(that: Any): Boolean =
        that match {
            case that: Person => {
                that.canEqual(this) &&
                this.name == that.name &&
                this.age == that.age
            }
            case _ => false
        }

    // Step 8 - implement a corresponding hashCode method
    override def hashCode: Int = {
        val prime = 31
        var result = 1
        result = prime * result + age
        result = prime * result + (if (name == null) 0 else name.hashCode)
        result
    }

}

If you compare that code to the seven steps previously described, you’ll see that they match those definitions. A key to the solution is this code inside the first case expression:

that.canEqual(this) &&
this.name == that.name &&
this.age == that.age

While the first part of the expression (the typed pattern case that: Person) tests whether that is an instance of Person, the expression that.canEqual(this) tests the opposite direction: whether the class of that accepts the current instance (this) as a possible equal. This is particularly important when inheritance is involved, because every Employee is a Person, but a Person is not necessarily an Employee. The remaining tests after canEqual compare the individual fields in the Person class.
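To see why that extra check matters, here is a minimal sketch of what can go wrong without it. NaivePerson and NaiveEmployee are hypothetical classes invented for this illustration; they follow the same pattern as the solution but leave out canEqual:

```scala
// Hypothetical classes for illustration only: equals WITHOUT canEqual
class NaivePerson(val name: String) {
    override def equals(that: Any): Boolean = that match {
        case that: NaivePerson => this.name == that.name
        case _ => false
    }
    override def hashCode: Int = name.hashCode
}

class NaiveEmployee(name: String, val role: String) extends NaivePerson(name) {
    override def equals(that: Any): Boolean = that match {
        case that: NaiveEmployee => super.equals(that) && this.role == that.role
        case _ => false
    }
    override def hashCode: Int = 31 * super.hashCode + role.hashCode
}

val p = new NaivePerson("Leonard Nimoy")
val e = new NaiveEmployee("Leonard Nimoy", "Actor")

val forward  = p == e   // true: NaivePerson.equals only compares the name
val backward = e == p   // false: NaiveEmployee.equals rejects a plain NaivePerson
```

Because p == e and e == p disagree, this version is not symmetric. The canEqual call lets the subclass veto the comparison in both directions.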

With the equals method defined, you can compare instances of a Person with ==, as demonstrated in the following ScalaTest unit tests:

import org.scalatest.FunSuite

class PersonTests extends FunSuite {

    // these first two instances should be equal
    val nimoy = new Person("Leonard Nimoy", 82)
    val nimoy2 = new Person("Leonard Nimoy", 82)
    val shatner = new Person("William Shatner", 82)
    val stewart = new Person("Patrick Stewart", 47)

    // all tests pass
    test("nimoy   != null")    { assert(nimoy != null) }

    // these should be equal
    test("nimoy   == nimoy")   { assert(nimoy == nimoy) }
    test("nimoy   == nimoy2")  { assert(nimoy == nimoy2) }
    test("nimoy2  == nimoy")   { assert(nimoy2 == nimoy) }

    // these should not be equal
    test("nimoy   != shatner") { assert(nimoy != shatner) }
    test("shatner != nimoy")   { assert(shatner != nimoy) }
    test("nimoy   != String")  { assert(nimoy != "Leonard Nimoy") }
    test("nimoy   != stewart") { assert(nimoy != stewart) }

}

All of these tests pass as desired. The Discussion explains the reflexive and symmetric properties that these tests exercise, and a second example shows how this formula works when an Employee class extends Person.

At the time of this writing, when given a Person class with name and age fields, IntelliJ IDEA generates an equals method that is almost identical to the code shown in this solution.

Discussion

The way == works in Scala is that when it’s invoked on a class instance, as in nimoy == shatner, the equals method on nimoy is called. == is a bit of syntactic sugar, and this code:

nimoy == shatner

is the same as this code:

nimoy.==(shatner)

which is the same as this code:

nimoy.equals(shatner)

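As shown, == is essentially syntactic sugar for calling equals, with one addition: it first checks whether the left-hand side is null, so it doesn’t throw a NullPointerException. You could write nimoy.equals(shatner), but nobody does that because == is much easier for humans to read.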
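One practical difference between the two forms shows up with null references. The following sketch (variable names are only for illustration) compares a null String both ways:

```scala
val missing: String = null

// == checks the left-hand side for null before delegating to equals
val viaOperator = missing == "foo"   // false, and no exception is thrown

// Calling equals directly on a null reference throws instead
val threw =
    try { missing.equals("foo"); false }
    catch { case _: NullPointerException => true }
```

This is another reason to prefer == in everyday Scala code.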

The `equals` contract

The Scaladoc for the equals method of the Any class essentially specifies the contract for how equals methods should be implemented. It begins by stating, “any implementation of this method should be an equivalence relation.” It further states that an equivalence relation should have these three properties:

  • It is reflexive: for any instance x of type Any, x.equals(x) should return true.

  • It is symmetric: for any instances x and y of type Any, x.equals(y) should return true if and only if y.equals(x) returns true.

  • It is transitive: for any instances x, y, and z of type Any, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.

Therefore, if you override the equals method, you should verify that your implementation remains an equivalence relation. The Person example meets those criteria. Now let’s look at how to handle this when inheritance is involved.
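As a rough way to spot-check those three properties, the helpers below test them by brute force over a handful of sample values. The function names and the Point case class are assumptions made up for this sketch, and passing on a few samples does not prove that an equals method is correct in general:

```scala
// Hypothetical helpers: brute-force checks of the three properties over a sample
def isReflexive(xs: Seq[Any]): Boolean =
    xs.forall(x => x.equals(x))

def isSymmetric(xs: Seq[Any]): Boolean =
    xs.forall(x => xs.forall(y => x.equals(y) == y.equals(x)))

def isTransitive(xs: Seq[Any]): Boolean =
    xs.forall { x => xs.forall { y => xs.forall { z =>
        !(x.equals(y) && y.equals(z)) || x.equals(z)
    }}}

// Sample values; a case class gets a compiler-generated equals method
case class Point(x: Int, y: Int)
val samples: Seq[Any] = Seq(Point(1, 2), Point(1, 2), Point(3, 4), "Point(1,2)")

val allHold = isReflexive(samples) && isSymmetric(samples) && isTransitive(samples)
```

The same helpers can be pointed at instances of your own classes, including a mix of superclass and subclass instances.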

Example 2: A Scala `equals` method with inheritance

An important benefit of this approach is that you can continue to use it when you use inheritance in classes. For instance, in the following code, the Employee class extends the Person class that’s shown in the Solution. It uses the same formula that was shown in the first example, with additional tests to (a) test the new role field in Employee, and (b) call super.equals(that) to verify that equals in Person is also true:

class Employee(name: String, age: Int, var role: String)
extends Person(name, age)
{
    override def canEqual(a: Any) = a.isInstanceOf[Employee]

    override def equals(that: Any): Boolean =
        that match {
            case that: Employee => {
                that.canEqual(this) &&
                this.role == that.role &&
                super.equals(that)
            }
            case _ => false
        }

    override def hashCode: Int = {
        val prime = 31
        var result = 1
        result = prime * result + (if (role == null) 0 else role.hashCode)
        result + super.hashCode
    }

}

Note in this code:

  • canEqual checks for an instance of Employee (not Person).

  • The first case expression also tests for Employee (not Person).

  • The Employee case calls canEqual, tests the field(s) in its class (as Person did), and also calls super.equals(that) to use the equals code in Person to use its equality tests. This ensures that the fields in Person as well as the new role field in Employee are all equal.

The following ScalaTest unit tests verify that the equals method in Employee is implemented correctly:

import org.scalatest.FunSuite

class EmployeeTests extends FunSuite {

    // these first two instances should be equal
    val eNimoy1 = new Employee("Leonard Nimoy", 82, "Actor")
    val eNimoy2 = new Employee("Leonard Nimoy", 82, "Actor")
    val pNimoy = new Person("Leonard Nimoy", 82)
    val eShatner = new Employee("William Shatner", 82, "Actor")

    // equality tests
    test("eNimoy1 == eNimoy1") { assert(eNimoy1 == eNimoy1) }
    test("eNimoy1 == eNimoy2") { assert(eNimoy1 == eNimoy2) }
    test("eNimoy2 == eNimoy1") { assert(eNimoy2 == eNimoy1) }

    // non-equality tests
    test("eNimoy1  != pNimoy")   { assert(eNimoy1  != pNimoy) }
    test("pNimoy   != eNimoy1")  { assert(pNimoy   != eNimoy1) }
    test("eNimoy1  != eShatner") { assert(eNimoy1  != eShatner) }
    test("eShatner != eNimoy1")  { assert(eShatner != eNimoy1) }

}

All the tests pass, including the comparisons of the eNimoy1 and pNimoy objects, which are instances of the Employee and Person classes, respectively.
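One consequence of this design is that a subclass which adds no new state, and therefore has no reason to override canEqual or equals, can still be equal to a plain Person. The sketch below uses a condensed copy of the Person class so it runs on its own; the anonymous subclass is only an illustration:

```scala
// Condensed copy of Person from the Solution, so this snippet runs on its own
class Person(var name: String, var age: Int) {
    def canEqual(a: Any): Boolean = a.isInstanceOf[Person]
    override def equals(that: Any): Boolean = that match {
        case that: Person =>
            that.canEqual(this) && this.name == that.name && this.age == that.age
        case _ => false
    }
    override def hashCode: Int =
        31 * (31 + age) + (if (name == null) 0 else name.hashCode)
}

val plain = new Person("Leonard Nimoy", 82)

// Anonymous subclass that adds no state and does not override canEqual
val anon = new Person("Leonard Nimoy", 82) {
    override def toString = s"Person($name, $age)"
}

val plainEqAnon = plain == anon   // true
val anonEqPlain = anon == plain   // true
```

A comparison based on getClass would treat these two objects as unequal; canEqual leaves that decision to each subclass.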

Discussion

As a warning, while these examples demonstrate a solid formula for implementing equals and hashCode methods, the Artima document, How to Write an Equality Method in Java, explains that when equals and hashCode algorithms depend on mutable state, i.e., var fields like name, age, and role, this can cause problems when instances are stored in collections. They write:

“If they (users of your class) put such objects into collections, they have to be careful never to modify the depended-on state, and this is tricky. If you need a comparison that takes the current state of an object into account, you should usually name it something else, not equals.”

The problem is easily demonstrated in Scala. First, create an Employee instance like this:

val eNimoy = new Employee("Leonard Nimoy", 81, "Actor")

Then add that instance to a Set:

val set = scala.collection.mutable.Set[Employee]()
set += eNimoy

When you run this code, you’ll see that it returns true, as expected:

set.contains(eNimoy)   // true

But now if you modify the eNimoy instance and then run the same test, you’ll find that it (probably) returns false:

eNimoy.age = 82
set.contains(eNimoy)   // false

With regard to handling this problem, the Artima blog post (which uses a Point(x,y) class for its example) suggests:

“Considering the last definition of Point, it would have been preferable to omit a redefinition of hashCode and to name the comparison method equalContents, or some other name different from equals. Point would then have inherited the default implementation of equals and hashCode. So p would have stayed locatable in the collection even after the modification to its x field.”
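A minimal sketch of that suggestion in Scala might look like the following. MutablePoint is a hypothetical class written for this illustration; it keeps the inherited equals and hashCode methods and exposes the state-based comparison under the name equalContents:

```scala
import scala.collection.mutable

// Hypothetical class: inherits the default equals/hashCode and compares
// current state through a separately named method
class MutablePoint(var x: Int, var y: Int) {
    def equalContents(that: MutablePoint): Boolean =
        this.x == that.x && this.y == that.y
}

val p = new MutablePoint(1, 2)
val points = mutable.Set(p)

p.x = 10                                                        // mutate after insertion
val stillFound   = points.contains(p)                           // true: the default hashCode ignores x and y
val sameContents = p.equalContents(new MutablePoint(10, 2))     // true
val sameObject   = p == new MutablePoint(10, 2)                 // false: default equals is reference-based
```

The trade-off is that == no longer means “same field values” for this class, so callers have to remember to use equalContents when that is what they need.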

Implementing hashCode

I won’t discuss hashCode algorithms in depth, but Effective Java states that the following statements comprise the contract for hashCode algorithms (which Joshua Bloch adapted from the Java Object documentation):

  • When hashCode is invoked on an object repeatedly within an application, it must consistently return the same value, provided that no information in the equals method comparison has changed.

  • If two objects are equal according to their equals methods, their hashCode values must be the same.

  • If two objects are unequal according to their equals methods, it is not required that their hashCode values be different. However, producing distinct results for unequal objects may improve the performance of hash tables.
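The second and third rules can be seen with plain strings. This short sketch (variable names are only for illustration) shows equal objects sharing a hash code, and two unequal strings that happen to collide:

```scala
// Equal objects must report equal hash codes
val s1 = new String("Leonard Nimoy")
val s2 = new String("Leonard Nimoy")
val equalObjectsSameHash = (s1 == s2) && (s1.hashCode == s2.hashCode)   // true

// Unequal objects are allowed to collide: both of these strings hash to 2112
val collision = ("Aa" != "BB") && ("Aa".hashCode == "BB".hashCode)      // true
```

A collision like this is legal; it only means a hash table has to fall back on equals to tell the two keys apart.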

As a brief survey of hashCode algorithms, the algorithm I used in the Person class is consistent with the suggestions in Effective Java:

override def hashCode: Int = {
    val prime = 31
    var result = 1
    result = prime * result + age
    result = prime * result + (if (name == null) 0 else name.hashCode)
    result
}

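Next, this is the hashCode method produced by making Person a case class, then compiling its code with the Scala 3 scalac command, and decompiling it with JAD. (The class decompiled here has firstName, lastName, and age fields rather than the name and age fields used above.)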

public int hashCode() {
    int i = 0x8e488775;
    i = Statics.mix(i, Statics.anyHash(firstName()));
    i = Statics.mix(i, Statics.anyHash(lastName()));
    i = Statics.mix(i, age());
    return Statics.finalizeHash(i, 3);
}

The “generate code” option of IntelliJ IDEA generates this code for the Person class when I tell it to use the name and age fields in its algorithm:

override def hashCode(): Int = {
    val state = Seq(name, age)
    state.map(_.hashCode()).foldLeft(0)((a, b) => 31 * a + b)
}

Finally, using the same approach with the Scala IDE for Eclipse produces this algorithm:

override def hashCode() = {
    val prime = 41
    prime * (prime + name.hashCode) + age.hashCode
}
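As one more option, and not something taken from the tools above, a hashCode method can delegate to a tuple of the same fields that equals compares, using the ## method that Scala defines on Any. Contact is a hypothetical class for this sketch, and canEqual is left out to keep it short:

```scala
// Hypothetical class for this sketch; hashCode delegates to a tuple of the fields
class Contact(val name: String, val age: Int) {
    override def equals(that: Any): Boolean = that match {
        case that: Contact => this.name == that.name && this.age == that.age
        case _ => false
    }
    // Hash exactly the fields that equals compares
    override def hashCode: Int = (name, age).##
}

val c1 = new Contact("Leonard Nimoy", 82)
val c2 = new Contact("Leonard Nimoy", 82)

val equalAndSameHash = c1 == c2 && c1.hashCode == c2.hashCode   // true

// A null name does not throw here, unlike calling name.hashCode directly
val nullName = new Contact(null, 1).hashCode == new Contact(null, 1).hashCode   // true
```

This allocates a tuple on every call, so it trades a little performance for brevity.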

See Also