Scala Quicksort algorithms: FP/recursive, imperative (and performance)

As a short “note to self,” the quickSort function in the following code is from the “Scala By Example” PDF, and it shows a Scala version of a Quicksort algorithm:

object QuickSortFP extends App {

    // create an array of random 10m random ints
    val r = scala.util.Random
    val randomArray = (for (i <- 1 to 10000000) yield r.nextInt(10000)).toArray

    // do the sorting
    val sortedArray = quickSort(randomArray)

    // the fp/recursive algorithm
    def quickSort(xs: Array[Int]): Array[Int] = {
        if (xs.length <= 1) xs
        else {
            val pivot = xs(xs.length / 2)
            Array.concat(
                quickSort(xs filter (pivot >)),
                xs filter (pivot ==),
                quickSort(xs filter (pivot <)))
        }
    }

}

The quickSort function shows a nice, clean version of a functional programming (FP), recursive solution to the Quicksort algorithm.

The code above that is what I added to test the function, and it creates an Array of ten million integers int the range from 0 <= n < 10,000. You can verify that by adding a line of code like this to print out some values:

randomArray.take(20).foreach(println)

While this quickSort function is nice and clean, (a) it unfortunately uses a lot of RAM, and (b) it’s also slower than an imperative algorithm, in part (or maybe in whole) due to the fact that an imperative algorithm can modify the array elements in place.

Memory use and performance

To see what sort of memory hit this algorithm runs into, I wrapped that code in some ugly print statements and memory checks:

object QuickSortFP extends App {

    val r = scala.util.Random
    val arr = (for (i <- 1 to 10000000) yield r.nextInt(10000)).toArray

    println("=== BEFORE SORT ===")
    printRam
    Thread.sleep(500)
    val sortedArray = quickSort(arr)

    println("=== AFTER SORT ===")
    Thread.sleep(500)
    printRam

    println("=== AFTER GC ===")
    System.gc
    Thread.sleep(500)
    printRam

    def quickSort(xs: Array[Int]): Array[Int] = {
        if (xs.length <= 1) xs
        else {
            printRam
            val pivot = xs(xs.length / 2)
            Array.concat(
                quickSort(xs filter (pivot >)),
                xs filter (pivot ==),
                quickSort(xs filter (pivot <)))
        }
    }

    /**
     * freeMemory  - the amount of free memory in the JVM (an approximation to the total amount of memory currently available for future allocated objects, measured in bytes)
     * totalMemory - the total amount of memory in the JVM
     * maxMemory   - the maximum amount of memory the JVM will attempt to use
     */
    def printRam {
        println("")
        val mb = 1024*1024
        val runtime = Runtime.getRuntime
        println("** Used Memory:  " + (runtime.totalMemory - runtime.freeMemory) / mb + " MB")
        println("** Free Memory:  " + runtime.freeMemory / mb + " MB")
        println("** Total Memory: " + runtime.totalMemory / mb + " MB")
        println("** Max Memory:   " + runtime.maxMemory / mb + " MB")
    }

}

When I run that code with ten million random integers, the output generally looks like this:

 === BEFORE SORT ===
 ** Used Memory:  138 MB
 ** Free Memory:  171 MB
 ** Total Memory: 310 MB
 ** Max Memory:   3641 MB

 === IN THE MIDDLE ===
 ** Used Memory:  798 MB
 ** Free Memory:  333 MB
 ** Total Memory: 1131 MB

 === NEAR THE END ===
 ** Used Memory:  255 MB
 ** Free Memory:  882 MB
 ** Total Memory: 1138 MB

 === AFTER SORT ===
 ** Used Memory:  515 MB
 ** Free Memory:  623 MB
 ** Total Memory: 1138 MB

 === AFTER GC ===
 ** Used Memory:  87 MB
 ** Free Memory:  1044 MB
 ** Total Memory: 1132 MB

The output varies from run to run, but this is representative of the values you’ll see.

The “IN THE MIDDLE” output represents some values that I picked from the middle part of the output, which is where the memory use peaked. As you can see, Used Memory increased to almost 800 MB at that time, and increase of 660 MB over the initial value. The JVM showed that the memory use was 515 MB after the sort, and then dropped back to 87 MB when I ran the garbage collector after that. You can also see that the JVM also increased the necessary “Total Memory” from 310 MB at the beginning to more than 1 GB as the algorithm ran.

Scala’s built-in quickSort function

As a point of contrast, you can also sort the Array using Scala’s built-in quickSort function, like this:

scala.util.Sorting.quickSort(testArray)

When I did this, the “Used Memory” was identical before and after the sort. On my current laptop this solution runs in about four seconds (wall time), while the FP/recursive solution takes about 16 seconds (not including the sleep calls, which I removed for these tests). Please note that those times are not exact by any means; I just looked at a timer as I ran each program, but four seconds vs 16 seconds is clearly significant.

Current summary:

  • I don’t know how much memory Scala’s built-in function used, but immediately after using it, the used memory was the same as before calling it. Conversely, the FP algorithm started with 138 MB used memory, increased to 798 MB while the algorithm was running, was 515 MB after the algorithm was completed, and then dropped to 87 MB after I ran the garbage collector.
  • The FP algorithm took about 16 seconds to complete, while the built-in quickSort function took 4 seconds.

It’s important to note that I haven’t done anything to optimize my Scala/FP quickSort algorithm. I just looked at how a basic quicksort algorithm should be written in a functional style, and wrote it that way.

Notes

Here’s a slightly more formal summary of my notes:

  • I didn’t use any sort of benchmarking tools or do anything else to “warm up” the JVM; I just ran my code one time.
  • More importantly, there are ways to make the functional algorithm run faster.
  • Functional algorithms that run in parallel can be faster than non-FP algorithms.
  • Depending on the problem at hand, “lazy” approaches can also help improve performance.

Summary

I don’t have any sort of “conclusion” here; at the moment I’m just taking a first look at these algorithms and wanted to make these notes. My background/degree is in Aerospace Engineering, so I never wrote these sorts of algorithms in college, and I was digging into this as part of my research into recursive algorithms.

Please note that regarding memory use and performance, I’m not writing this blog post to diss the FP/recursive approach. In fact, a few nights ago I saw in a Haskell book that there is a way to improve the performance of this basic FP/recursive algorithm, and one of my goals is to see how to do something similar in Scala.

I’d give credit to that book if I could find it right now ... while other FP books sweep this sort of thing under the rug, this book clearly stated, “This algorithm is a known memory hog, but, there’s a better way to do this, although the faster code doesn’t look as clean.” When I find that book I may come back and modify the FP/recursive code to see how well that approach works with Scala.