Source code for Scala “line count” functions

I can’t remember exactly why I wrote the following Scala “line count” code, but I thought I’d share it here without further introduction:

object LineCount extends App {

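    // The structural type on A is called reflectively; this import
    // silences the Scala 2 feature warning for that.
    import scala.language.reflectiveCalls

    // Runs f on the resource and always closes the resource afterwards.
    def using[A <: { def close(): Unit }, B](resource: A)(f: A => B): B = {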
        try {
            f(resource)
        } finally {
            resource.close()
        }
    }

    // Returns the block's result together with the elapsed time in milliseconds.
    def timer[A](blockOfCode: => A): (A, Double) = {
        val startTime = System.nanoTime
        val result = blockOfCode
        val stopTime = System.nanoTime
        val delta = stopTime - startTime
        (result, delta/1000000d)
    }

    def countLines(filename: String): Long = {
        val NEWLINE = 10
        var newlineCount = 0L
        using(io.Source.fromFile(filename)) { source =>
            for {
                char <- source
                // compare the Char itself; toByte truncates, so a character
                // such as U+010A would also have matched 10
                if char == NEWLINE
            } newlineCount += 1
            newlineCount
        }
    }
 
    // took 87 secs (10M lines)
    def countLines2(filename: String): Option[Long] = {
        val NEWLINE = 10
        var newlineCount = 0L
        var source: io.BufferedSource = null
        try {
            source = io.Source.fromFile(filename)
            for {
                char <- source
                if char == NEWLINE
            } newlineCount += 1
            Some(newlineCount)
        } catch {
            case e: Exception => None
        } finally {
            if (source != null) source.close()
        }
    }

    // took 27 secs (10M lines)
    def countLines3(filename: String): Option[Long] = {
        var newlineCount = 0L
        var source: io.BufferedSource = null
        try {
            source = io.Source.fromFile(filename)
            for (line <- source.getLines()) {
                newlineCount += 1
            }
            Some(newlineCount)
        } catch {
            case e: Exception => None
        } finally {
            if (source != null) source.close()
        }
        }
    }
 
    val (lines, time) = timer{ countLines3("tenmillionlines.txt") }
    println(s"Counted $lines in $time ms")

}

If I remember right, I was comparing the performance of various line count functions. The comments above countLines2 and countLines3 show how long each took on a 10-million-line file when run on a very old iMac. I don’t remember the performance of the first function.
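One thing worth keeping in mind when comparing these functions: the first two count newline characters, while the third counts lines returned by getLines, so the results can differ by one when a file does not end with a newline. Here is a minimal sketch of that difference using in-memory strings instead of files. The object and method names (CountDifference, newlineChars, lines) are made up for this illustration.

```scala
import scala.io.Source

object CountDifference {

    // Counts '\n' characters, the approach taken by countLines and countLines2.
    def newlineChars(text: String): Long =
        Source.fromString(text).count(_ == '\n').toLong

    // Counts the lines produced by getLines, the approach taken by countLines3.
    def lines(text: String): Long =
        Source.fromString(text).getLines().size.toLong
}
```

For the text "a\nb" (no trailing newline), newlineChars returns 1 and lines returns 2. For "a\nb\n" both return 2.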

Notes: This code also shows the using and timer methods, which I use quite a bit.
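If you are on Scala 2.13 or newer, the standard library's scala.util.Using covers much the same ground as the hand-written using method. The following is only a sketch under that assumption, and I have not timed it against the numbers in the comments. UsingSketch and countLinesUsing are hypothetical names for this example; timer is the same method as in the listing.

```scala
import scala.io.Source
import scala.util.{Try, Using}

object UsingSketch {

    // Hypothetical helper: counts lines the way countLines3 does, but lets
    // scala.util.Using close the source and wrap any exception in a Try.
    def countLinesUsing(filename: String): Try[Long] =
        Using(Source.fromFile(filename)) { source =>
            source.getLines().foldLeft(0L)((count, _) => count + 1)
        }

    // Same timer as in the post: result plus elapsed milliseconds.
    def timer[A](blockOfCode: => A): (A, Double) = {
        val startTime = System.nanoTime
        val result = blockOfCode
        val stopTime = System.nanoTime
        (result, (stopTime - startTime) / 1000000d)
    }
}
```

Usage looks much like the original: val (lines, time) = UsingSketch.timer { UsingSketch.countLinesUsing("tenmillionlines.txt") }. The result is a Try[Long] rather than an Option[Long], so a failure keeps its exception instead of collapsing to None.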

Update: I just found my original post, which I titled “Scala file reading performance.”
