Scala: How to download and process XML data (such as an RSS feed)

I was looking for a good way to access XML resources (like RSS feeds) in Scala. My current approach is to use ScalaJ-HTTP to download the XML content from a URL, and then use the Scala XML library to process the string I get back.

This example Scala program shows my current approach:

import scalaj.http.{Http, HttpResponse}
import scala.xml.XML

object GetXml extends App
{
    // get the xml content using scalaj-http
    val response: HttpResponse[String] = Http("http://www.chicagotribune.com/sports/rss2.0.xml")
                                        .timeout(connTimeoutMs = 2000, readTimeoutMs = 5000)
                                        .asString
    val xmlString = response.body

    // convert the `String` to a `scala.xml.Elem`
    val xml = XML.loadString(xmlString)

    // handle the xml as desired ...
    val titleNodes = (xml \\ "item" \ "title")
    val headlines = for {
        t <- titleNodes
    } yield t.text
    headlines.foreach(println)

}

A few notes about this application:

  • I like using ScalaJ-HTTP to download the content as an HTTP GET request, in part because I like to be able to easily set timeout values on the GET request.
  • Once I get the XML from the URL, it’s easy to convert that string to a scala.xml.Elem using XML.loadString.
  • Once I have the Elem, I can process it however I want to, for example with the \\ and \ operators used above to pull out each item's title.
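
As a minimal sketch of the "process it however I want" step, the example below pulls both the title and the link out of each item. It parses a small hand-written feed instead of a live download, so it runs without network access. The FeedItem case class, the FeedItems object, and the sample feed are all hypothetical names and data made up for illustration; only XML.loadString and the \\ and \ operators come from the Scala XML library.

```scala
import scala.xml.XML

// Hypothetical container for the fields of interest; not part of any library.
final case class FeedItem(title: String, link: String)

object FeedItems {
  // A small hand-written RSS fragment, used here instead of a live download.
  val sampleFeed: String =
    """<rss version="2.0">
      |  <channel>
      |    <title>Sample Feed</title>
      |    <item><title>First headline</title><link>http://example.com/1</link></item>
      |    <item><title>Second headline</title><link>http://example.com/2</link></item>
      |  </channel>
      |</rss>""".stripMargin

  // `\\ "item"` finds every <item> at any depth;
  // `\ "title"` and `\ "link"` look only at an item's direct children.
  def parseItems(xmlString: String): Seq[FeedItem] = {
    val xml = XML.loadString(xmlString)
    (xml \\ "item").map { item =>
      FeedItem((item \ "title").text.trim, (item \ "link").text.trim)
    }
  }

  def main(args: Array[String]): Unit =
    parseItems(sampleFeed).foreach(i => println(s"${i.title} -> ${i.link}"))
}
```

Because the channel's own title element is not inside an item, it is skipped; only the two item entries end up in the result.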

The build.sbt file

If you want to test this on your own computer, the only other thing you need (besides having Scala and SBT installed) is a build.sbt file to go along with it. Here’s mine:

name := "ScalajHttpXml"

version := "1.0"

scalaVersion := "2.11.7"

resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"

libraryDependencies ++= Seq(
    "org.scalaj" %% "scalaj-http" % "2.3.0",
    "org.scala-lang.modules" %% "scala-xml" % "1.0.3"
)

scalacOptions += "-deprecation"

Once you have that Scala source code and build.sbt file, you can test this Scala/HTTP/XML solution on your system. (Note that as of Scala 2.11 the Scala XML project is a separate module from the base Scala libraries, which is why scala-xml is listed as a dependency.)
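
If you want to try the parsing step before pointing the program at a live feed, one option is to wrap XML.loadString in scala.util.Try, since it throws an exception when the text is not well-formed XML (a truncated download, for instance). This is only a sketch under that assumption: the SafeHeadlines object and its headlines helper are hypothetical names, and the two input strings are made-up test data.

```scala
import scala.util.{Failure, Success, Try}
import scala.xml.XML

object SafeHeadlines {
  // Hypothetical helper: returns the <item> titles,
  // or a Failure if the text is not well-formed XML.
  def headlines(xmlString: String): Try[Seq[String]] =
    Try(XML.loadString(xmlString)).map { xml =>
      (xml \\ "item" \ "title").map(_.text)
    }

  def main(args: Array[String]): Unit = {
    val good = "<rss><channel><item><title>Hello</title></item></channel></rss>"
    val bad  = "<rss><channel><item>" // truncated, so not well-formed

    Seq(good, bad).foreach { input =>
      headlines(input) match {
        case Success(titles) => titles.foreach(println)
        case Failure(e)      => println(s"Could not parse feed: ${e.getMessage}")
      }
    }
  }
}
```

The same helper could be handed response.body from the ScalaJ-HTTP call, so a bad download produces a Failure value rather than an uncaught exception.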
