Friday, August 1, 2025

Parsing an XML File in Java with SAX Parser

Parsing XML files is a common task in Java development, and a SAX parser is a popular choice for it. A SAX parser is typically faster and uses less memory than a DOM parser, because it doesn't load the entire XML document into memory or build an object representation of it. Instead, it reads the file from start to finish and uses callback methods to tell the client about the document's structure as it goes. The trade-off is that access is forward-only: the handler sees each event once and has to keep whatever state it needs.

This post walks through parsing an XML file with the SAX parser in Java, using a small sample file and a custom handler.


Understanding SAX Callback Methods

The SAX parser reports events to a handler object, usually a subclass of the DefaultHandler class. The key callback methods for following the XML document are:

  1. startDocument() and endDocument(): These methods are called at the beginning and end of the XML document, respectively.

  2. startElement() and endElement(): These methods are invoked at the start and end of each element within the document.

  3. characters(): This method is called to handle the text content found between the start and end tags of an XML element. The parser is allowed to deliver one element's text in several calls.
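
To make the order of these callbacks concrete, here is a minimal sketch that records each event as it fires. The class name SaxEventTrace and the trace helper are made up for illustration, and the XML is parsed from an in-memory string rather than a file so the snippet is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original example.
public class SaxEventTrace {

    // Parses the given XML text and returns the callbacks in the order they fired.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("startElement:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only chunks (indentation between tags).
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters:" + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>yong</firstname></staff></company>";
        trace(xml).forEach(System.out::println);
    }
}
```

Running main prints one line per callback: startDocument comes first, endDocument last, and each element's start event precedes its text and its end event.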

Step-by-Step Guide to Parsing an XML File

First, let's look at the example XML file that we'll be parsing. This file contains information about a company's staff:


<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff>
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>

Next, create a Java file, such as ReadXMLFile.java, that uses the SAX parser. The code sets up a DefaultHandler to process the XML file. Within the handler, boolean flags (e.g., bfname, blname) track which element's content is currently being read.

  • How startElement works: when the parser encounters an element such as <firstname>, the qName argument matches "FIRSTNAME" (the comparison is case-insensitive), and the bfname flag is set to true.

  • The characters method processes the text content. If a flag such as bfname is true, the code prints the text (e.g., "yong") and then resets the flag to false.

  • Finally, a SAXParser instance is created and used to parse the XML file, for example "c:\\file.xml", with the custom handler.

Java Code

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadXMLFile {
    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            DefaultHandler handler = new DefaultHandler() {
                // One flag per element whose text we want to print.
                boolean bfname = false;
                boolean blname = false;
                boolean bnname = false;
                boolean bsalary = false;

                // Called for every opening tag; raise the flag for the element just entered.
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
                    System.out.println("Start Element :" + qName);
                    if (qName.equalsIgnoreCase("FIRSTNAME")) {
                        bfname = true;
                    }
                    if (qName.equalsIgnoreCase("LASTNAME")) {
                        blname = true;
                    }
                    if (qName.equalsIgnoreCase("NICKNAME")) {
                        bnname = true;
                    }
                    if (qName.equalsIgnoreCase("SALARY")) {
                        bsalary = true;
                    }
                }

                // Called for every closing tag.
                @Override
                public void endElement(String uri, String localName, String qName) throws SAXException {
                    System.out.println("End Element :" + qName);
                }

                // Called with text content. Each flag is reset after the first call,
                // so if the parser splits an element's text into several chunks,
                // only the first chunk is printed.
                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {
                    if (bfname) {
                        System.out.println("First Name : " + new String(ch, start, length));
                        bfname = false;
                    }
                    if (blname) {
                        System.out.println("Last Name : " + new String(ch, start, length));
                        blname = false;
                    }
                    if (bnname) {
                        System.out.println("Nick Name : " + new String(ch, start, length));
                        bnname = false;
                    }
                    if (bsalary) {
                        System.out.println("Salary : " + new String(ch, start, length));
                        bsalary = false;
                    }
                }
            };
            // The String argument is the location (URI) of the XML content to parse.
            saxParser.parse("c:\\file.xml", handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output

When you run the Java program, the SAX parser processes the XML file and generates output that reflects the callback events. The output shows the start and end of each element and the text content (characters) within them.

Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
Start Element :salary
Salary : 100000
End Element :salary
End Element :staff
Start Element :staff
Start Element :firstname
First Name : low
End Element :firstname
Start Element :lastname
Last Name : yin fong
End Element :lastname
Start Element :nickname
Nick Name : fong fong
End Element :nickname
Start Element :salary
Salary : 200000
End Element :salary
End Element :staff
End Element :company
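
The flag-based handler prints text from the first characters() call for an element and then resets the flag. Because a SAX parser may split one element's text across several characters() calls, a common alternative is to append every chunk to a buffer and read it in endElement(). The sketch below shows that variant; it is not the code from the example above. The class name StaffCollector, its parse helper, and the choice to store each staff entry as a Map are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects each <staff> entry into a map of child element name to text.
public class StaffCollector extends DefaultHandler {
    private final List<Map<String, String>> staff = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current; // non-null only while inside a <staff> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // discard whitespace seen before this tag
        if (qName.equals("staff")) {
            current = new LinkedHashMap<>();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append every chunk; the full text is read in endElement().
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("staff")) {
            staff.add(current);
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim());
        }
        text.setLength(0);
    }

    // Parses XML held in a string and returns one map per <staff> element.
    static List<Map<String, String>> parse(String xml) throws Exception {
        StaffCollector handler = new StaffCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.staff;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>yong</firstname>"
                + "<lastname>mook kim</lastname></staff></company>";
        for (Map<String, String> entry : parse(xml)) {
            System.out.println(entry);
        }
    }
}
```

Collecting values at endElement() also removes the need for one boolean flag per element, at the cost of keeping the parsed entries in memory.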
