Scala: How to download and process XML data (such as an RSS feed)

I was looking for a good way to access XML resources (such as RSS feeds) in Scala. My current approach is to use ScalaJ-HTTP to download the XML content from a URL, and then use the Scala XML library to process the string I get back.

This example Scala program shows my current approach:

import scalaj.http.{Http, HttpResponse}
import scala.xml.XML

object GetXml extends App
{
    // get the xml content using scalaj-http
    val response: HttpResponse[String] = Http("http://www.chicagotribune.com/sports/rss2.0.xml")
                                        .timeout(connTimeoutMs = 2000, readTimeoutMs = 5000)
                                        .asString
    val xmlString = response.body

    // convert the `String` to a `scala.xml.Elem`
    val xml = XML.loadString(xmlString)

    // handle the xml as desired ...
    val titleNodes = (xml \\ "item" \ "title")
    val headlines = for {
        t <- titleNodes
    } yield t.text
    headlines.foreach(println)

}

A few notes about this application:

  • I like using ScalaJ-HTTP to download the content as an HTTP GET request, in part because I like to be able to easily set timeout values on the GET request.
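  • Once I get the XML from the URL, it’s easy to convert that string to a scala.xml.Elem using XML.loadString.
  • Once I have the Elem, I can query it with the \ (direct children) and \\ (all descendants) operators, as the titleNodes lookup shows, or process it however else I want to.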
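
To make the last point a little more concrete, here is a minimal sketch that pulls both the title and the link out of each item. It assumes a typical RSS 2.0 layout, and it uses a small hand-written sample string in place of the downloaded content so it runs without a network connection. The FeedItem case class and the RssItems object are names I made up for this illustration; they are not part of any library.

```scala
import scala.xml.{Elem, XML}

// Hypothetical container for the fields of interest; not part of any library.
final case class FeedItem(title: String, link: String)

object RssItems {
  // A tiny hand-written RSS document standing in for the downloaded string.
  val sampleRss: String =
    """<rss version="2.0">
      |  <channel>
      |    <title>Sample feed</title>
      |    <item>
      |      <title>First headline</title>
      |      <link>http://example.com/1</link>
      |    </item>
      |    <item>
      |      <title>Second headline</title>
      |      <link>http://example.com/2</link>
      |    </item>
      |  </channel>
      |</rss>""".stripMargin

  // `\\ "item"` finds every <item> at any depth;
  // `\ "title"` then selects only the direct <title> children of that item,
  // so the channel-level <title> is not picked up.
  def parseItems(xml: Elem): Seq[FeedItem] =
    (xml \\ "item").map { item =>
      FeedItem((item \ "title").text.trim, (item \ "link").text.trim)
    }

  def main(args: Array[String]): Unit = {
    val xml = XML.loadString(sampleRss)
    parseItems(xml).foreach(i => println(s"${i.title} -> ${i.link}"))
  }
}
```

In the real program you would pass the Elem created from response.body to parseItems instead of the sample string. If a feed leaves out the link element, (item \ "link").text simply returns an empty string, so this sketch does not fail on sparse items.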

The build.sbt file

If you want to test this on your own computer, the only other thing you need (besides having Scala and SBT installed) is a build.sbt file to go along with it. Here’s mine:

name := "ScalajHttpXml"

version := "1.0"

scalaVersion := "2.11.7"

resolvers += "Typesafe Repository" at "https://repo.typesafe.com/typesafe/releases/"

libraryDependencies ++= Seq(
    "org.scalaj" %% "scalaj-http" % "2.3.0",
    "org.scala-lang.modules" %% "scala-xml" % "1.0.3"
)

scalacOptions += "-deprecation"

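Once you have that Scala source code and build.sbt file, you can test this Scala/HTTP/XML solution on your system by running sbt run from the project directory. (Note that as of Scala 2.11 the Scala XML library is a separate module from the base Scala libraries, which is why scala-xml is listed as a dependency.)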
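
One thing the program does not handle is a request that fails. A timeout throws an exception, and an error page returned with a non-2xx status would be handed straight to XML.loadString. Here is a rough sketch of one way to guard against both, assuming the same ScalaJ-HTTP 2.x and scala-xml dependencies as above. The SafeGetXml object and its parseResponse and fetchXml methods are hypothetical names I chose for this example, and returning an Either is just one possible design.

```scala
import scalaj.http.Http
import scala.util.{Failure, Success, Try}
import scala.xml.{Elem, XML}

// Hypothetical helper object; the names are illustrative only.
object SafeGetXml {

  // Turn a status code and body into either an error message or parsed XML.
  def parseResponse(code: Int, body: String): Either[String, Elem] =
    if (code < 200 || code > 299) Left(s"HTTP status $code")
    else Try(XML.loadString(body)) match {
      case Success(xml) => Right(xml)
      case Failure(e)   => Left(s"Could not parse XML: ${e.getMessage}")
    }

  // Download the URL. Timeouts and connection failures surface as
  // exceptions, which Try converts into a Left with a short message.
  def fetchXml(url: String): Either[String, Elem] =
    Try(Http(url).timeout(connTimeoutMs = 2000, readTimeoutMs = 5000).asString) match {
      case Success(response) => parseResponse(response.code, response.body)
      case Failure(e)        => Left(s"Request failed: ${e.getMessage}")
    }
}
```

A caller can then pattern match on the result of SafeGetXml.fetchXml(url), printing the headlines in the Right case and the error message in the Left case. Keeping parseResponse separate from the download also makes it possible to test the status and parsing logic without touching the network.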