How to read a binary file with Scala (FileInputStream, BufferedInputStream)

As a brief note today, if you need to read a binary file with Scala, here’s an approach I just tested and used. It uses the Java FileInputStream and BufferedInputStream classes along with Iterator.continually:

package file_binary

import java.io.{FileInputStream, BufferedInputStream}

@main def readBinaryFile(): Unit =
    val filename = "access.log"
    val bis = BufferedInputStream(FileInputStream(filename))
    try
        Iterator.continually(bis.read())
            .takeWhile(_ != -1)
            .foreach(b => ())  // do whatever you want with each byte
    finally
        bis.close()  // always release the file handle, even if reading fails

(As a note to self) this code is a replacement for reading a file with a while loop in Scala.
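For comparison, here’s a rough sketch of what the while loop version might look like. The countBytesWithWhile name is just something I made up for this example, and it counts the bytes only so the loop has something to do:

```scala
import java.io.{BufferedInputStream, InputStream}

// Hypothetical helper: counts the bytes in a stream using a plain while loop.
def countBytesWithWhile(in: InputStream): Long =
    val bis = BufferedInputStream(in)
    var count = 0L
    try
        var b = bis.read()      // read() returns -1 at the end of the stream
        while b != -1 do
            count += 1          // do whatever you want with each byte
            b = bis.read()
    finally
        bis.close()
    count
```

The loop needs a mutable variable and two calls to read, which is the bookkeeping that Iterator.continually and takeWhile hide.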

Discussion

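This example uses Scala 3 syntax: significant indentation (optional braces), a @main method, and creating instances without the new keyword. It’s easily converted to Scala 2 by adding curly braces, putting the code in a main method inside an object, and using new when you create the two stream objects.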
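As a sketch of that conversion, a Scala 2 style version might look like the following. The ReadBinaryFile object and the countBytes helper are names I’m assuming for illustration, and counting the bytes is only a stand-in for whatever you want to do with each one:

```scala
import java.io.{BufferedInputStream, FileInputStream}

object ReadBinaryFile {

  // Hypothetical helper (not in the original example): returns the number of bytes read.
  def countBytes(filename: String): Long = {
    val bis = new BufferedInputStream(new FileInputStream(filename))
    try {
      var count = 0L
      Iterator.continually(bis.read())
        .takeWhile(_ != -1)
        .foreach(_ => count += 1)  // do whatever you want with each byte
      count
    } finally {
      bis.close()
    }
  }

  def main(args: Array[String]): Unit =
    println(countBytes("access.log"))
}
```

This brace-based form also compiles with Scala 3, so it can be a reasonable choice if the code needs to build under both versions.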

The Iterator.continually approach lets you loop over each byte in the file. When the end of file is reached a -1 value is returned by the read method, so that’s why the takeWhile method is used as shown.

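This is something of an unusual problem because the read method doesn’t return a Byte. It returns an Int: a value from 0 to 255 for each byte in the file, and -1 once the end of the file is reached. (Call toByte on the value if you need an actual Byte.) Because read keeps returning values as long as bytes exist in the file, the Iterator.continually method is a good way to handle this particular problem.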
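A small sketch shows why the return type of read matters. It uses an in-memory ByteArrayInputStream instead of a file so it’s easy to run, and the readAllInts name is hypothetical:

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical helper: collects every value returned by read() until the -1 marker.
def readAllInts(in: InputStream): List[Int] =
    Iterator.continually(in.read()).takeWhile(_ != -1).toList

// Two bytes: 0x41 ('A') and 0xFF, which is -1 when viewed as a signed Byte.
val sampleInts: List[Int] =
    readAllInts(ByteArrayInputStream(Array[Byte](0x41, 0xFF.toByte)))

// sampleInts is List(65, 255): the 0xFF byte comes back as 255, not -1,
// so it is not confused with the end-of-stream marker.
val sampleBytes: List[Byte] = sampleInts.map(_.toByte)
// sampleBytes is List(65, -1) once the values are narrowed back to Byte.
```

If read returned a Byte, a 0xFF byte in the file would look exactly like the end of the file, and takeWhile(_ != -1) would stop too early.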

Note that you can use LazyList.continually instead of Iterator.continually, if you prefer. (LazyList is a replacement for the older Scala Stream class.)
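Here’s a minimal sketch of the LazyList variant. The sumBytesLazily name is an assumption for this example, and summing the values is just a placeholder for real work:

```scala
import java.io.{BufferedInputStream, InputStream}

// Hypothetical helper: sums the unsigned byte values of a stream via LazyList.
def sumBytesLazily(in: InputStream): Long =
    val bis = BufferedInputStream(in)
    try
        LazyList.continually(bis.read())
            .takeWhile(_ != -1)
            .foldLeft(0L)(_ + _)   // forces the LazyList while the stream is still open
    finally
        bis.close()
```

Because a LazyList is evaluated lazily, it needs to be fully consumed before the stream is closed, which is why the foldLeft call sits inside the try block. A LazyList also memoizes its elements, so holding a reference to its head while reading a large file can keep every element in memory. Iterator.continually doesn’t have that issue.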

A performance note

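Also note that I wrap FileInputStream with BufferedInputStream. If you only use FileInputStream, it takes about 181 seconds on my laptop to read an Apache access log file that has 650,000 lines, but it only takes about 1.6 seconds to read the same file if you wrap that with BufferedInputStream. The difference comes from how the bytes are fetched: each read call on a bare FileInputStream goes to the operating system for a single byte, while BufferedInputStream reads the file in larger chunks and serves most read calls from memory.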
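If you want to try a comparison like this on your own machine, a rough timing sketch might look like the following. The countBytes, timedMillis, and compareReads names are hypothetical, and a single System.nanoTime measurement is only a ballpark figure, not a proper benchmark:

```scala
import java.io.{BufferedInputStream, FileInputStream, InputStream}

// Hypothetical helper: reads every byte from `in`, closes it, and returns the count.
def countBytes(in: InputStream): Long =
    var count = 0L
    try
        Iterator.continually(in.read())
            .takeWhile(_ != -1)
            .foreach(_ => count += 1)
    finally
        in.close()
    count

// Hypothetical helper: runs `body` once and returns its result with the elapsed milliseconds.
def timedMillis[A](body: => A): (A, Long) =
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1_000_000
    result -> elapsedMs

// Pass the file to read as the first command-line argument.
@main def compareReads(filename: String): Unit =
    val (n1, slow) = timedMillis(countBytes(FileInputStream(filename)))
    val (n2, fast) = timedMillis(countBytes(BufferedInputStream(FileInputStream(filename))))
    println(s"unbuffered: $n1 bytes in $slow ms")
    println(s"buffered:   $n2 bytes in $fast ms")
```

The numbers will vary with the file size, the disk, and operating system caching, so it’s worth running it more than once before drawing conclusions.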

The Iterator and Stream objects

For your convenience and reading pleasure, here are links to those objects: