(Study Log) Reading XML Documents with Java

I needed to read XML (Extensible Markup Language) files in a Java program while working on a Monopoly simulator for my studies. In this blog post I share how I made the program configurable using XML.

Disclaimer: I am not at all an expert and this is only what I have learned during this study project.

The basics of XML

XML is a markup language that uses tags much like HTML. Each opening tag can contain attributes, and the space between the opening and closing tags can contain more tags or plain text. Every XML document needs exactly one root tag that holds all other tags.

<?xml version="1.0" encoding="utf-8" ?>
<root-tag> 
  <tag attribute="some value"> content </tag>
</root-tag>

How I read XML files with Java

I start out using the DocumentBuilderFactory class to instantiate a DocumentBuilder object.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

A DocumentBuilder can be used to parse XML into a Document object. After parsing, we normalize the document, which merges adjacent text nodes and removes empty ones, so each run of text is held in a single node. It does not remove any tags.

InputStream inputStream = XMLUtil.class.getResourceAsStream("/file.xml");
Document doc = builder.parse(inputStream);
doc.getDocumentElement().normalize();
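
The fragments so far leave out the imports and the checked exceptions that the parser can throw. Here is a minimal sketch that puts the steps together. To keep it self-contained it parses an XML string instead of a resource file, and the class name ParseSketch and the method name parse are my own choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, not part of the original project.
class ParseSketch {

    // Parses an XML string into a normalized DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // try-with-resources closes the stream once parsing is done.
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            Document doc = builder.parse(in);
            // Merge adjacent text nodes and drop empty ones.
            doc.getDocumentElement().normalize();
            return doc;
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root-tag><tag attribute=\"some value\"> content </tag></root-tag>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

When reading from the classpath instead, note that getResourceAsStream returns null if the resource is not found, and builder.parse rejects a null stream with an IllegalArgumentException, so a null check before parsing gives a clearer error message.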

Next we get the child nodes of the root tag as a NodeList.

NodeList nodes = doc.getElementsByTagName("root-tag").item(0).getChildNodes();

Now we can iterate through the child nodes and get the relevant data from each tag.

for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i);
    // Skip whitespace text nodes and comments; only tags are elements.
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        Element tag = (Element) node;
        String attribute = tag.getAttribute("attribute");
        String text = tag.getTextContent();
    }
}
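
The node type check matters because the line breaks and indentation between tags are stored as text nodes in the child list. The following is a small sketch of the same loop as a runnable program, using the XML from the basics section. The class name ChildNodesSketch, the method describeChildren, and the "attribute=text" output format are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class ChildNodesSketch {

    // Returns one "attribute=text" entry for every element directly under the root tag.
    static List<String> describeChildren(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getDocumentElement().getChildNodes();

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            // Whitespace between tags is a text node, so only elements are cast.
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element tag = (Element) node;
                // trim() strips the spaces around " content ".
                result.add(tag.getAttribute("attribute") + "=" + tag.getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root-tag>\n"
                + "  <tag attribute=\"some value\"> content </tag>\n"
                + "</root-tag>";
        System.out.println(describeChildren(xml));
    }
}
```

Without the check, the cast to Element would fail with a ClassCastException on the first whitespace text node.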

Those are the basics of reading XML using the DocumentBuilder found in javax.xml.parsers.

Example of a localisation XML file

<?xml version="1.0" encoding="utf-8" ?>
<locales>
    <locale lang="en_US" name="English">
        <sentence label="word1">This is a sentence.</sentence>
    </locale>

    <locale lang="da_DK" name="Danish">
        <sentence label="word1">Dette er en sætning.</sentence>
    </locale>
</locales>
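
A file like this could be read into nested maps with the same DocumentBuilder approach. This is only a sketch of one possible way to do it, not necessarily how the simulator does it: the class name LocaleSketch, the method load, and the map layout (language code to label to sentence) are assumptions. The XML is embedded as a string so the example runs on its own.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class LocaleSketch {

    // The example file from above; \u00e6 is the letter ae in the Danish sentence.
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"utf-8\" ?>"
            + "<locales>"
            + "<locale lang=\"en_US\" name=\"English\">"
            + "<sentence label=\"word1\">This is a sentence.</sentence>"
            + "</locale>"
            + "<locale lang=\"da_DK\" name=\"Danish\">"
            + "<sentence label=\"word1\">Dette er en s\u00e6tning.</sentence>"
            + "</locale>"
            + "</locales>";

    // Maps each lang attribute to a map of sentence label -> sentence text.
    static Map<String, Map<String, String>> load(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Map<String, Map<String, String>> locales = new LinkedHashMap<>();
        // getElementsByTagName only returns elements, so the casts are safe.
        NodeList localeNodes = doc.getElementsByTagName("locale");
        for (int i = 0; i < localeNodes.getLength(); i++) {
            Element locale = (Element) localeNodes.item(i);
            Map<String, String> sentences = new LinkedHashMap<>();
            NodeList sentenceNodes = locale.getElementsByTagName("sentence");
            for (int j = 0; j < sentenceNodes.getLength(); j++) {
                Element sentence = (Element) sentenceNodes.item(j);
                sentences.put(sentence.getAttribute("label"), sentence.getTextContent());
            }
            locales.put(locale.getAttribute("lang"), sentences);
        }
        return locales;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Map<String, String>> locales = load(SAMPLE);
        System.out.println(locales.get("en_US").get("word1"));
    }
}
```

Looking up a sentence is then a matter of locales.get("da_DK").get("word1"), which makes switching language a change of one key.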