algorithm

How to drop the first matching element in a Scala sequence

Summary: This blog post shows one way to drop/filter the first matching element from a Scala sequence (Seq, List, Vector, Array, etc.). I don’t claim that the algorithm is efficient, but it does work.

Background

While creating some Scala test code earlier today I had an immutable list of toppings for a pizza, and I got into a situation where I wanted to remove the first instance of a topping.

How to define an `equals` method in a Scala class (object equality)

Table of Contents1 - Solution2 - A Scala `equals` method example3 - Discussion4 - Example 2: A Scala `equals` method with inheritance5 - Implementing hashCode6 - See Also

Scala problem: You want to define an equals method for your class so you can compare object instances to each other.

Back to top

Solution

If you’re new to Scala, a first thing to know is that object instances are compared with ==:

"foo" == "foo"   // true
"foo" == "bar"   // false
"foo" == null    // false
null == "foo"    // false
1 == 1           // true
1 == 2           // false
1d == 1.0d       // true

case class Person(name: String)
Person("Jess") == Person("Jessie")   // false

This is different than Java, which uses == for primitive values and equals for object comparisons.

A Kotlin Adler-32 checksum algorithm

As a short post today, here’s an example of a Kotlin implementation of the Adler-32 checksum algorithm:

Scala code to find (and move or remove) duplicate files

My MacBook recently told me I was running out of disk space. I knew that the way I was backing up my iPhone was resulting in me having multiple copies of photos and videos, so I finally decided to fix that problem by getting rid of all of the duplicate copies of those files.

So I wrote a little Scala program to find all the duplicates and move them to another location, where I could check them before deleting them. The short story is that I started with over 28,000 photos and videos, and the code shown below helped me find nearly 5,000 duplicate photos and videos under my ~/Pictures directory that were taking up over 18GB of storage space. (Put another way, deleting those files saved me 18GB of storage.)

Scala version of Collective Intelligence Euclidean distance algorithm

While reading the excellent book, Programming Collective Intelligence recently, I decided to code up the first algorithm in the book using Scala instead of Python (which the book uses). This is a Euclidean distance algorithm, and it provides one way to compare two sets of data to each other, and attempts to score the similarity between the data sets.

Without any further introduction (and assuming you have the Collective Intelligence book), here's the Scala source code for the Euclidean distance algorithm as described in the book:

A Scala method to create an MD5 hash of a string

If you happen to need Scala method to perform an MD5 hash on a string, here you go:

def md5HashString(s: String): String = {
    import java.security.MessageDigest
    import java.math.BigInteger
    val md = MessageDigest.getInstance("MD5")
    val digest = md.digest(s.getBytes)
    val bigInt = new BigInteger(1,digest)
    val hashedString = bigInt.toString(16)
    hashedString
}

A Java method to round a float value to the nearest one-half value

As a quick note, here’s a Java method that will round a float to the nearest half value, such as 1.0, 1.5, 2.0, 2.5, etc.:

/**
 * converts as follows:
 * 1.1  -> 1.0
 * 1.3  -> 1.5
 * 2.1  -> 2.0
 * 2.25 -> 2.5
 */
public static float roundToHalf(float f) {
    return Math.round(f * 2) / 2.0f;
}

The comments show how this function converts the example float values to their nearest half value, so I won’t add any more details here. I don’t remember the origin of this algorithm — I just found it in some old code, thought it was clever, and thought I’d share it here.

AlphaZero generalized to learn more games by itself

From this Cornell University page, Google’s AlphaZero algorithm has been generalized to learn new games given only the game rules: “In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains.