sha-1

Scala code to find (and move or remove) duplicate files

My MacBook recently told me I was running out of disk space. I knew that the way I was backing up my iPhone was resulting in me having multiple copies of photos and videos, so I finally decided to fix that problem by getting rid of all of the duplicate copies of those files.

So I wrote a little Scala program to find all the duplicates and move them to another location, where I could check them before deleting them. The short story is that I started with over 28,000 photos and videos, and the code shown below helped me find nearly 5,000 duplicate photos and videos under my ~/Pictures directory that were taking up over 18GB of storage space. (Put another way, deleting those files saved me 18GB of storage.)

How to convert an array of bytes to a hex string in Scala

If you need to convert an array of bytes to a hex string in Scala, I can confirm that this code works:

def convertBytesToHex(bytes: Seq[Byte]): String = {
    val sb = new StringBuilder
    for (b <- bytes) {
        sb.append(String.format("%02x", Byte.box(b)))
    }
    sb.toString
}

I just used this code as part of a checksum algorithm (SHA-1, SHA-256, etc.), and I tested it against command line checksum commands to verify that it works properly.