Out of curiosity about Scala’s file-reading performance, I decided to write a “line count” program in Scala. One obvious approach was to count the newline characters in the file:
import scala.io.Source

// took 101 secs (10M lines)
// work on one character at a time
// (NEWLINE is assumed to be defined elsewhere as the newline byte value, 10)
def countLines1(source: Source): Long = {
    var newlineCount = 0L
    for {
        c <- source
        if c.toByte == NEWLINE
    } newlineCount += 1
    newlineCount
}
As the comment shows, this took 101 seconds to read a file with 10M lines (an Apache access log file for this website).
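For anyone who wants to reproduce a measurement like this, here is a minimal sketch of a timing harness around the same character-at-a-time loop. It is not the exact code behind the 101-second figure: the object name `LineCountTiming`, the `timed` helper, the `NEWLINE` value of 10, and the default `access.log` path are all assumptions made for illustration.

```scala
import scala.io.Source

object LineCountTiming {
  // Assumption: NEWLINE is the byte value of '\n' (10).
  val NEWLINE: Byte = 10

  // Same character-at-a-time approach as countLines1.
  def countLines(source: Source): Long = {
    var newlineCount = 0L
    for (c <- source if c.toByte == NEWLINE) newlineCount += 1
    newlineCount
  }

  // Hypothetical helper: runs a block once and returns its result
  // together with the elapsed wall-clock time in seconds.
  def timed[A](block: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = block
    (result, (System.nanoTime() - start) / 1e9)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical default path; pass a real file as the first argument.
    val path = args.headOption.getOrElse("access.log")
    val source = Source.fromFile(path)
    try {
      val (lines, seconds) = timed(countLines(source))
      println(f"$lines%d lines in $seconds%.1f secs")
    } finally source.close()
  }
}
```

A single run like this includes JVM warm-up and disk caching effects, so the number it prints is only a rough figure. Note also that `Source.fromFile` decodes the file with the default codec, so a log containing bytes that are invalid in that encoding may throw while reading.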