Scala: How to create XML literals

Problem: You want to create XML variables, and embed XML into your Scala code.

Solution

You can assign XML expressions directly to variables, as shown in these examples:

val hello = <p>Hello, world</p>
val p = <person><name>Edward</name><age>42</age></person>

In the REPL you can see that these variables are of type scala.xml.Elem:

scala> val hello = <p>Hello, world</p>
hello: scala.xml.Elem = <p>Hello, world</p>

scala> val p = <person><name>Edward</name><age>42</age></person>
p: scala.xml.Elem = <person><name>Edward</name><age>42</age></person>

As shown, you can embed XML directly into your Scala source code. There’s no need to embed the XML in double quotes to create a String; just assign an XML block to a variable.

Scala XML blocks can span multiple lines:

val foo = <p>Lorem ipsum 
          dolor sit amet, 
          consectetur adipisicing elit
          et cetera, et cetera</p>

A block can take as many lines as needed to solve the current problem. As shown in the following code, an XML literal can also be the result of a method or function:

def getXml =
  <pizza>
    <crust type="thin" size="14" />
    <topping>cheese</topping>
    <topping>sausage</topping>
  </pizza>

More often you’ll return dynamically generated XML from a method. I’ll demonstrate this in a future recipe.
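As a brief preview, here is a minimal sketch of what dynamically generated XML can look like. It assumes the scala-xml library is on the classpath; pizzaXml and its parameters are hypothetical names made up for this illustration. Scala expressions placed inside curly braces are evaluated and spliced into the literal:

```scala
import scala.xml.Elem

// Hypothetical helper: builds a <pizza> element from runtime values.
def pizzaXml(crust: String, toppings: Seq[String]): Elem =
  <pizza>
    <crust type={crust} />
    {toppings.map(t => <topping>{t}</topping>)}
  </pizza>

val pizza = pizzaXml("thin", Seq("cheese", "sausage"))

// Read the generated values back out of the element
val toppingNames = (pizza \ "topping").map(_.text)
val crustType = (pizza \ "crust" \ "@type").text
```

The curly braces work in both attribute values and element content, so the same literal can be reused with different data.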

Discussion

If you’re given a block of XML as a String, you can convert it into an Elem with the loadString method of the scala.xml.XML object:

scala> val dog = xml.XML.loadString("<dog><name>Rocky</name><age>12</age></dog>")
dog: scala.xml.Elem = <dog><name>Rocky</name><age>12</age></dog>

This results in an Elem object, as in the previous examples.

Note that passing a malformed XML string to loadString results in a SAXParseException:

scala> val x = scala.xml.XML.loadString("")
org.xml.sax.SAXParseException: Premature end of file.

scala> val x = scala.xml.XML.loadString("a")
org.xml.sax.SAXParseException: Content is not allowed in prolog.

scala> val x = scala.xml.XML.loadString("<a>")
org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
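If the string comes from an untrusted source, one way to keep the exception from propagating is to wrap the call in scala.util.Try. This is only a sketch of one possible approach, assuming scala-xml is on the classpath; parseXml is a hypothetical helper name, not part of the library:

```scala
import scala.util.{Failure, Success, Try}
import scala.xml.{Elem, XML}

// Hypothetical helper: returns a Failure instead of throwing on malformed input.
def parseXml(s: String): Try[Elem] = Try(XML.loadString(s))

parseXml("<dog><name>Rocky</name><age>12</age></dog>") match {
  case Success(elem) => println(s"parsed a <${elem.label}> element")
  case Failure(e)    => println(s"could not parse: ${e.getMessage}")
}

// The malformed strings shown above produce a Failure rather than an exception
val bad = parseXml("<a>")
```

Returning a Try lets the caller decide how to handle bad input, for example with getOrElse or a pattern match.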

The following figure shows the main classes in the scala.xml class hierarchy:

The Scaladoc states that the Elem class “extends the Node class, providing an immutable data object representing an XML element.” An Elem has a label, attributes, and children, as you’ll see in many examples in this chapter.
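To make the label, attributes, and children concrete, here is a small sketch that reuses the elements from the Solution. It assumes scala-xml is on the classpath, and the val names are only illustrative:

```scala
import scala.xml.Elem

val person: Elem = <person><name>Edward</name><age>42</age></person>
val crust: Elem = <crust type="thin" size="14" />

val label = person.label                     // "person"
val childLabels = person.child.map(_.label)  // labels of the child nodes
val crustAttrs = crust.attributes.asAttrMap  // attribute names mapped to their values
val crustType = (crust \ "@type").text       // value of a single attribute
```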

When parsing XML, the classes you’ll run into most often are Elem, Node, and NodeSeq. These classes are described here:

Name        Description
----        -----------

Elem        An immutable object that represents an XML element.

Node        An abstract class representing nodes in an XML tree.
            It provides XPath-like methods such as \ and \\.

NodeSeq     A wrapper around Seq[Node], with XPath and comprehension methods.
            Typically seen as the result of XPath searches.

Although the Elem class has more than 160 methods, don’t be intimidated; many of these come from the ability to treat an Elem as a sequential collection. Only a small handful of its methods are XML-specific and commonly used.

The NodeSeq class is simpler. Like the Elem class, it implements the \ and \\ methods for XPath searching, and then is primarily composed of common collection methods.
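The difference between the two search methods is easiest to see on a nested element. The following sketch assumes scala-xml is on the classpath; the order element is made-up sample data:

```scala
// Made-up sample data for illustration
val order =
  <order>
    <pizza>
      <topping>cheese</topping>
      <topping>sausage</topping>
    </pizza>
  </order>

val direct = order \ "topping"             // empty: \ only looks at direct children
val deep = order \\ "topping"              // \\ searches all descendants
val chained = order \ "pizza" \ "topping"  // \ can be chained, one level per step

// A NodeSeq can be processed with ordinary collection methods
val names = deep.map(_.text)
```

Both methods return a NodeSeq, which is why the result of one search can be searched again or mapped over like any other sequence.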

If you get deeply involved in XML parsing and creation, you’ll also run into some of the other Scala classes. The most common of those are listed here:

Name        Description
----        -----------

NodeBuffer  Extends ArrayBuffer[Node], and adds an &+ method that lets you build
            a sequence of XML nodes using a fluent style.

PCData      From the Scaladoc, “this class (which is not used by all XML parsers, but 
            always used by the XHTML one) represents parseable character data, which 
            appeared as CDATA sections in the input and is to be preserved as CDATA sections 
            in the output.” Example:

                scala> val x = PCData("<p>hello</p>")
                x: scala.xml.PCData = <![CDATA[<p>hello</p>]]>

Text        Implements an XML node for text (PCDATA). Programming in Scala describes it as
            “A node holding just text.” Example:

                scala> val x = Text("<p>Sundance</p>")
                x: scala.xml.Text = &lt;p&gt;Sundance&lt;/p&gt;

Unparsed    An XML node for unparsed content. Per the Scaladoc, it will output verbatim, and 
            “all bets are off regarding wellformedness etc.” Example:

                // intentional error
                scala> val x = Unparsed("</p>foo<p>")
                x: scala.xml.Unparsed = </p>foo<p>
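The practical difference between Text, PCData, and Unparsed shows up when the same string is embedded in an element and serialized. The following sketch also builds a small list with NodeBuffer. It assumes scala-xml is on the classpath, and the val names are only illustrative:

```scala
import scala.xml.{NodeBuffer, PCData, Text, Unparsed}

val raw = "<b>hi</b>"

// The same string wrapped three different ways
val asText = (<p>{Text(raw)}</p>).toString          // markup characters are escaped
val asPCData = (<p>{PCData(raw)}</p>).toString      // wrapped in a CDATA section
val asUnparsed = (<p>{Unparsed(raw)}</p>).toString  // written out verbatim

// NodeBuffer: &+ appends a node to the buffer and returns the buffer
val buf = new NodeBuffer
buf &+ <li>one</li>
buf &+ <li>two</li>
val list = <ul>{buf}</ul>
```

Because Unparsed performs no escaping, it is best reserved for content you already know to be well formed.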

See Also

See the following links for more information: