How to read Atom and RSS feeds using Scala

If you ever need to read an Atom or RSS feed in Scala, this example shows how to use the Java ROME library from your Scala code:

import java.net.URL
import com.rometools.rome.feed.synd.SyndFeed
import com.rometools.rome.io.SyndFeedInput
import com.rometools.rome.io.XmlReader
import scala.collection.JavaConverters._

object AtomAndRssReader extends App {

    // NOTE: this code can throw exceptions (bad URL, network errors, unparseable XML)
    val feedUrl = new URL("https://www.npr.org/rss/rss.php?id=100")
    val input = new SyndFeedInput
    val feed: SyndFeed = input.build(new XmlReader(feedUrl))
    //println(feed)


    // `feed.getEntries` has type `java.util.List[SyndEntry]`;
    // `.asScala` (from JavaConverters) wraps it as a Scala Buffer
    val entries = feed.getEntries.asScala.toVector

    for (entry <- entries) {
        println("Title: " + entry.getTitle)
        println("URI:   " + entry.getUri)
        println("Date:  " + entry.getUpdatedDate)

        // java.util.List[SyndLink]
        val links = entry.getLinks.asScala.toVector
        for (link <- links) {
            println("Link: " + link.getHref)
        }

        // java.util.List[SyndContent]
        val contents = entry.getContents.asScala.toVector
        for (content <- contents) {
            println("Content: " + content.getValue)
        }

        // java.util.List[SyndCategory]
        val categories = entry.getCategories.asScala.toVector
        for (category <- categories) {
            println("Category: " + category.getName)
        }

        println("")

    }

}
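
As the comment in the code says, it can throw exceptions: a malformed URL, a network failure, or XML that ROME can't parse will all surface as Java exceptions. One way to deal with that, sketched here on the assumption that you'd rather get a value back than let the exception escape, is to wrap each throwing call in scala.util.Try. The SafeCalls object and its attempt and parseUrl methods are hypothetical names used only for illustration, and only the URL step is shown so the sketch runs without ROME on the classpath:

```scala
import java.net.URL
import scala.util.{Failure, Success, Try}

object SafeCalls {

    // Hypothetical helper: run a block that may throw, and return
    // Right(value) on success or Left(error message) on failure.
    def attempt[A](label: String)(block: => A): Either[String, A] =
        Try(block) match {
            case Success(value) => Right(value)
            case Failure(e)     => Left(s"$label failed: ${e.getMessage}")
        }

    // `new URL(...)` throws MalformedURLException on bad input
    def parseUrl(s: String): Either[String, URL] =
        attempt("URL parsing")(new URL(s))
}
```

With ROME available, the same helper could presumably wrap the download and parse step as well, for example SafeCalls.attempt("feed download")(input.build(new XmlReader(feedUrl))).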

A few notes:

  • The code uses Scala’s JavaConverters object to convert the java.util.List instances into Scala collections
  • You need to add the ROME dependency to your build.sbt file ("com.rometools" % "rome" % "1.8.1")
  • Of course the ROME library does all the heavy lifting; I just show how to use it in Scala, in particular with the JavaConverters object
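
For reference, the dependency from the second note would be written like this in build.sbt. The version is the one named above; a newer release may well exist, so treat the number as a starting point:

```scala
// build.sbt: pulls in the ROME feed-parsing library
libraryDependencies += "com.rometools" % "rome" % "1.8.1"
```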

The output

Here’s an abridged version of what the output from this code looks like today:

Title: Episode 820: P Is For Phosphorus 
URI:   https://www.npr.org/sections/money/2018/01/26/581156723/episode-820-p-is-for-phosphorus?utm_medium=RSS&utm_campaign=storiesfromnpr
Date:  null
Content: <img src='https://media.npr.org/assets/img/2018/01/26/gettyimages-168997967_wide-3e6007bd49a94a161553cba256335550e12cfb37.jpg?s=600' /><p>Phosphate is a crucial element, for farming, and for life ...

Title: The 10 Events You Need To Know To Understand The Almost-Firing Of Robert Mueller
URI:   https://www.npr.org/2018/01/26/580964814/the-10-events-you-need-to-know-to-understand-the-almost-firing-of-robert-mueller?utm_medium=RSS&utm_campaign=storiesfromnpr
Date:  null
Content: <img src='https://media.npr.org/assets/img/2018/01/26/mueller-2013_wide-ea9e74cdb89431d2e2ebd476acc67cce3cd67167.jpg?s=600' /><p>Everything about this story revolves around obstruction of justice ...

Note that the Date output is null. I haven’t looked into that yet because it isn’t important for my needs, but as you can see, the rest of the output looks fine, and the code works as intended.
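
One possible explanation for the null date, offered as an assumption rather than something verified against this feed: RSS items carry a pubDate element, which ROME exposes through getPublishedDate, while getUpdatedDate corresponds to the Atom updated element. If that is the cause, a null-safe fallback might look like the following sketch. EntryDates and bestDate are hypothetical names, and the function takes plain java.util.Date values so it runs without ROME:

```scala
import java.util.Date

object EntryDates {

    // Hypothetical helper: prefer the updated date, fall back to the
    // published date, and return None when both are null.
    def bestDate(updated: Date, published: Date): Option[Date] =
        Option(updated).orElse(Option(published))
}
```

Inside the loop it would be called as EntryDates.bestDate(entry.getUpdatedDate, entry.getPublishedDate), which gives you an Option[Date] to print or pattern match on instead of a bare null.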