checksum

Scala code to find (and move or remove) duplicate files

My MacBook recently told me I was running out of disk space. I knew that the way I was backing up my iPhone was leaving me with multiple copies of photos and videos, so I finally decided to fix the problem by getting rid of the duplicates.

So I wrote a little Scala program to find all the duplicates and move them to another location, where I could check them before deleting them. The short story is that I started with over 28,000 photos and videos, and the code shown below helped me find nearly 5,000 duplicates under my ~/Pictures directory. Those duplicates were taking up over 18GB of storage space, all of which I got back by deleting them.
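One common way to approach this kind of cleanup is to group files by a checksum of their contents and treat every file after the first in each group as a duplicate. The following is only a minimal sketch of that idea, not the program described in this post. The object name `DuplicateFinderSketch`, its method names, and the choice of SHA-256 are all assumptions made for illustration, and the `scala.jdk.CollectionConverters` import assumes Scala 2.13 or later.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest
import scala.jdk.CollectionConverters._

// Hypothetical helper object; every name here is illustrative only.
object DuplicateFinderSketch {

  // Hex-encoded SHA-256 digest of a file, read in 8 KB chunks so that
  // large videos are never loaded into memory all at once.
  def sha256(file: Path): String = {
    val digest = MessageDigest.getInstance("SHA-256")
    val in = Files.newInputStream(file)
    try {
      val buffer = new Array[Byte](8192)
      var n = in.read(buffer)
      while (n != -1) {
        digest.update(buffer, 0, n)
        n = in.read(buffer)
      }
    } finally in.close()
    digest.digest().map(b => String.format("%02x", Byte.box(b))).mkString
  }

  // All regular files at or below `root`.
  def listFiles(root: Path): Seq[Path] = {
    val stream = Files.walk(root)
    try stream.iterator().asScala.filter(p => Files.isRegularFile(p)).toList
    finally stream.close()
  }

  // Groups files by checksum. Within each group the first path (in sorted
  // order) is kept, and the remaining paths are reported as duplicates.
  def findDuplicates(files: Seq[Path]): Seq[Path] =
    files.groupBy(sha256).values.toSeq.flatMap(_.sortBy(_.toString).drop(1))

  // Moves the duplicates into a holding directory so they can be reviewed
  // before deletion. An index prefix avoids file name collisions.
  def moveDuplicates(duplicates: Seq[Path], holdingDir: Path): Unit = {
    Files.createDirectories(holdingDir)
    duplicates.zipWithIndex.foreach { case (dup, i) =>
      Files.move(dup, holdingDir.resolve(s"$i-${dup.getFileName}"))
    }
  }
}
```

With something like this, `findDuplicates(listFiles(picturesDir))` would return the candidate duplicates, which `moveDuplicates` could then park in a separate folder for a manual check. Matching checksums are a very strong hint that two files are identical, but a cautious version might also compare file sizes or bytes before deleting anything.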

How to convert an array of bytes to a hex string in Scala

If you need to convert an array of bytes to a hex string in Scala, I can confirm that this code works:

def convertBytesToHex(bytes: Seq[Byte]): String = {
    val sb = new StringBuilder
    for (b <- bytes) {
        sb.append(String.format("%02x", Byte.box(b)))
    }
    sb.toString
}

I just used this code to print the output of a checksum algorithm (SHA-1, SHA-256, etc.), and I tested the results against command line checksum commands to verify that it works properly.
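To show how the hex conversion fits into a checksum workflow, here is a small sketch that feeds the output of `java.security.MessageDigest` into `convertBytesToHex`. The wrapper object `ChecksumSketch` and the `checksum` method are hypothetical names added for this example. The function is repeated inside the object only so that the block runs on its own.

```scala
import java.security.MessageDigest

// Hypothetical wrapper object, used only to make the example self-contained.
object ChecksumSketch {

  // Same conversion as above: each byte becomes two lowercase hex digits.
  def convertBytesToHex(bytes: Seq[Byte]): String = {
    val sb = new StringBuilder
    for (b <- bytes) {
      sb.append(String.format("%02x", Byte.box(b)))
    }
    sb.toString
  }

  // Hex-encoded digest of `data`, where `algorithm` is a name that
  // MessageDigest understands, such as "SHA-1" or "SHA-256".
  def checksum(data: Array[Byte], algorithm: String): String = {
    val digest = MessageDigest.getInstance(algorithm).digest(data)
    convertBytesToHex(digest.toSeq)
  }
}

object ChecksumSketchDemo extends App {
  // The well-known SHA-256 test vector for "abc":
  // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
  println(ChecksumSketch.checksum("abc".getBytes("UTF-8"), "SHA-256"))
}
```

On macOS, the same value should come back from a command like `printf abc | shasum -a 256`, which is one way to do the kind of command line comparison mentioned above.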

A Scala Adler-32 checksum algorithm

While fooling around recently with various computer programming algorithms, I ended up writing an implementation of the Adler-32 checksum algorithm in Scala. There isn’t too much to say about it, other than I hope I got it right. My results for the simple test below matched the results shown on the Adler-32 Wikipedia page, so that’s encouraging. :)
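For readers who want a quick reference point, the algorithm as described on the Wikipedia page can be sketched in a few lines. This is an independent sketch and not necessarily the implementation from this post. The names `Adler32Sketch`, `adler32`, and `jdkAdler32` are made up for the example, and the JDK's own `java.util.zip.Adler32` class is included only as a cross-check.

```scala
import java.util.zip.Adler32

// Hypothetical object name; an independent sketch of the algorithm.
object Adler32Sketch {

  // Largest prime smaller than 2^16, as used by Adler-32.
  private val ModAdler = 65521

  // `a` is 1 plus the running sum of all bytes and `b` is the running sum
  // of the successive `a` values, both modulo 65521. The checksum packs
  // `b` into the high 16 bits and `a` into the low 16 bits.
  def adler32(data: Array[Byte]): Long = {
    var a = 1
    var b = 0
    for (byte <- data) {
      a = (a + (byte & 0xff)) % ModAdler // mask so the byte is read as unsigned
      b = (b + a) % ModAdler
    }
    ((b.toLong << 16) | a) & 0xffffffffL
  }

  // The same checksum computed by the JDK, for comparison.
  def jdkAdler32(data: Array[Byte]): Long = {
    val checksum = new Adler32
    checksum.update(data, 0, data.length)
    checksum.getValue
  }
}

object Adler32SketchDemo extends App {
  val bytes = "Wikipedia".getBytes("US-ASCII")
  // The Wikipedia example gives 0x11E60398 for the ASCII string "Wikipedia".
  println(f"${Adler32Sketch.adler32(bytes)}%08x")
  println(f"${Adler32Sketch.jdkAdler32(bytes)}%08x")
}
```

Comparing a hand-written version against `java.util.zip.Adler32` on a few inputs is a cheap way to gain confidence beyond the single Wikipedia example.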

Here's the Scala source code for my Adler-32 checksum implementation: