xpath

A collection of Scala XML tutorials

The following links are a collection of Scala XML tutorials I've written. Most of them come from the Scala Cookbook, while the others were written before I wrote the Cookbook.

Without any further ado, here are the links:

Scala: Deeper XML parsing, and extracting XML tag attributes

Problem: You need to perform deep XML searches, combining the \ and \\ methods, and possibly searching directly for tag attributes.

Solution

Combine the \\ and \ methods as needed to search the XML. When you need to extract tag attributes, place an @ character before the attribute name.

Given this simplified version of the Yahoo Weather RSS Feed:
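The feed itself is not reproduced in this excerpt, so here is a minimal sketch of the technique using a made-up weather document. The element and attribute names below are assumptions for illustration, not the real Yahoo format, and the code assumes the scala-xml module is on the classpath:

```scala
import scala.xml._

// Hypothetical weather document; names are invented for this sketch.
val weather =
  <rss>
    <channel>
      <title>Weather - Boulder, CO</title>
      <item>
        <title>Conditions for Boulder, CO</title>
        <forecast day="Thu" low="37" high="58"/>
        <forecast day="Fri" low="41" high="62"/>
      </item>
    </channel>
  </rss>

// \\ finds <item> at any depth, then \ narrows to its direct <title> child,
// so the <title> under <channel> is not included.
val itemTitle = (weather \\ "item" \ "title").text

// An @ prefix selects an attribute; here it is applied to one node at a time.
val days = (weather \\ "forecast").map(f => (f \ "@day").text)

// \\ with an @ name collects that attribute from every descendant element.
val highs = (weather \\ "@high").map(_.text)
```

One thing to watch for: in the scala-xml versions I know of, `\ "@name"` only does an attribute lookup when the left side is a single node, which is why the `days` example maps over each `forecast` element.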

Basic Scala XPath searching with \ and \\

Problem: When writing a Scala application, you want to search an XML tree for the data you need using XPath expressions.

Solution

Use the \ and \\ methods, which are analogous to the XPath / and // expressions. The \ method returns the matching elements that are direct children of the current node, and \\ returns the matching elements at any depth under the current node (all descendant nodes).

To demonstrate this difference, create this XML literal:
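The literal from the original tutorial is not included in this excerpt. As a rough stand-in, this sketch uses a hypothetical `fruits` document (the names are my own) to show the difference:

```scala
import scala.xml._

// Hypothetical XML literal, invented for this sketch.
val fruits =
  <fruits>
    <fruit><name>apple</name></fruit>
    <basket>
      <fruit><name>banana</name></fruit>
    </basket>
  </fruits>

// \ looks only at the direct children of <fruits>
val directNames = (fruits \ "fruit").map(f => (f \ "name").text)

// \\ looks at every descendant, so the <fruit> inside <basket> is found too
val allNames = (fruits \\ "fruit").map(f => (f \ "name").text)

// <name> is never a direct child of <fruits>, so this result is empty
val noNames = fruits \ "name"
```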

How to extract data from XML nodes in Scala

Problem: In a Scala application, you want to extract information from XML you receive, so you can use the data in your application.

Solution

Use the methods of the Scala Elem and NodeSeq classes to extract the data. The most commonly used methods of the Elem class are shown here:
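The list of methods is not reproduced in this excerpt. As a small sketch, here are a few members I know exist on `Elem` and `NodeSeq` (`label`, `text`, `child`, `attributes`), applied to a sample `person` element that is assumed for illustration:

```scala
import scala.xml._

// Sample element, assumed for this sketch.
val person = <person id="7"><name>Edward</name><age>42</age></person>

val tag      = person.label                  // the element's tag name
val allText  = person.text                   // text of all descendants, concatenated
val kids     = person.child.map(_.label)     // labels of the direct children
val id       = (person \ "@id").text         // a single attribute value
val attrMap  = person.attributes.asAttrMap   // all attributes as a Map[String, String]
val age      = (person \ "age").text.toInt   // text converted to an Int
```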

Scala: How to create XML literals

Problem: You want to create XML variables, and embed XML into your Scala code.

Solution

You can assign XML expressions directly to variables, as shown in these examples:

val hello = <p>Hello, world</p>
val p = <person><name>Edward</name><age>42</age></person>

In the REPL you can see that these variables are of type scala.xml.Elem:
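The REPL output is not shown in this excerpt. One way to confirm the types without the REPL is to add explicit type annotations, as in this sketch (the `name` lookup is my own addition, to show that a search yields a `NodeSeq` rather than an `Elem`):

```scala
import scala.xml.{Elem, NodeSeq}

// These compile only if the literals really are scala.xml.Elem values.
val hello: Elem = <p>Hello, world</p>
val p: Elem = <person><name>Edward</name><age>42</age></person>

// Searching an Elem returns a NodeSeq
val name: NodeSeq = p \ "name"
```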

A Scala XML XPath example

I'm not going to take any time to describe the following Scala XML/XPath example, other than to say that when it's run, it produces the following output, which is a simulated receipt for an order at a pizza store:

Scala XML examples: XML literals, mixing XML and Scala source code, XPath searching

A really terrific feature of Scala is that XML handling is built into the language. This means you don't have to deal with XML as String objects; you deal with it as XML objects.

Here are just a few examples of using XML in Scala. First, you can create an XML literal like this:

scala> val hello = <p>Hello, world</p>
hello: scala.xml.Elem = <p>Hello, world</p>

Again, note that this is not a String: there are no double quotes. We've just defined an XML literal in Scala.
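To illustrate the "mixing XML and Scala source code" part of the title, here is a minimal sketch that embeds Scala expressions in an XML literal with curly braces. The `order` document and its variable names are assumptions for illustration:

```scala
import scala.xml._

val customerName = "Edward"
val toppings = Seq("cheese", "olives")

// Curly braces embed a Scala expression inside an XML literal.
val order =
  <order customer={customerName}>
    <count>{toppings.size}</count>
    {toppings.map(t => <topping>{t}</topping>)}
  </order>

// The generated XML can be searched like any other XML literal.
val customer = (order \ "@customer").text
val count    = (order \ "count").text
val found    = (order \ "topping").map(_.text)
```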

Scala and XPath - get the first element of an array

Scala/XPath FAQ: How do I get the first element of an array in an XML document using Scala and XPath?

I recently needed to get the first array element from an XML document using Scala and XPath, and in short, I ended up writing some Scala/XPath code that looked like this:
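The original code is not included in this excerpt. As a rough sketch of one way to do it (not necessarily the approach the full article takes), a `NodeSeq` can be indexed like any other Scala sequence. The `forecasts` document here is hypothetical:

```scala
import scala.xml._

// Hypothetical document with a repeated element (the "array").
val forecasts =
  <forecasts>
    <day name="Thu"/>
    <day name="Fri"/>
  </forecasts>

val dayNodes = forecasts \ "day"

// Index 0 is the first match; this throws if there are no matches.
val firstByIndex = (dayNodes(0) \ "@name").text

// headOption avoids the exception when the search finds nothing.
val firstSafe = dayNodes.headOption.map(d => (d \ "@name").text)
val missing   = (forecasts \ "week").headOption
```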

Parsing “real world” HTML with Scala, HTMLCleaner, and StringEscapeUtils

While XML parsers work great for well-formed XML, out in the 'real world' internet, you can't count on HTML being XHTML, or even being well-formed. As a result, various 'HTML cleaner' libraries for Java have appeared. They attempt to clean up the HTML so you can parse it.