SAX (Simple API for XML) is an event-based, sequential-access parser API. It is a widely used specification that describes how an XML parser reads a document and passes its content to an application efficiently. SAX is one of the most popular choices among current XML parsing APIs. Unlike most other XML-related specifications, it did not come from a formal committee or a large company: it was developed on the XML-DEV mailing list, where all the discussion, documentation and implementation work around SAX took place. It has since gained industry-wide acceptance. This how-to is a quick-start tutorial for Java programmers who want to read XML with a SAX parser.
1. Prerequisites
Only a JDK is required. Strictly speaking, the SAX parser can run anywhere a JRE is installed; it does not depend on any additional third-party libraries.
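If you want to confirm this on your own machine, a minimal check such as the following should be enough. The class name SaxAvailabilityCheck and the helper newParser() are made up for this illustration; only the javax.xml.parsers classes come from the JDK.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial code.
public class SaxAvailabilityCheck {

    // Asks the JDK's factory for a SAX parser; no extra library is involved.
    static SAXParser newParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("No SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        SAXParser parser = newParser();
        // The implementation class name depends on the JDK in use.
        System.out.println("SAX parser implementation: " + parser.getClass().getName());
    }
}
```

If the program prints a class name instead of failing, the parser bundled with your Java installation is ready to use.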
2. The sample XML document
The XML document below describes a list of phases, each with an id attribute and a few child elements. Copy it and save it to your local disk as Project.xml for later testing.
<?xml version="1.0" encoding="UTF-8"?>
<project>
    <phase id="1">
        <name>Sax project Phase 1</name>
        <description>the phase 1 of an example that use SAX parser to read xml file</description>
        <owner>Steven</owner>
        <startDate>2012-04-04</startDate>
        <endDate>2012-04-05</endDate>
    </phase>
    <phase id="2">
        <name>Sax project Phase 2</name>
        <description>the phase 2 of an example that use SAX parser to read xml file</description>
        <owner>Tom</owner>
        <startDate>2012-04-05</startDate>
        <endDate>2012-04-06</endDate>
    </phase>
    <phase id="3">
        <name>Sax project Phase 3</name>
        <description>the phase 3 of an example that use SAX parser to read xml file</description>
        <owner>Jimmy</owner>
        <startDate>2012-04-06</startDate>
        <endDate>2012-04-07</endDate>
    </phase>
</project>
3. How to Read XML file via SAX Parser in Java
To start reading an XML document, first create a class that extends DefaultHandler, the default base class for SAX2 event handlers. Application writers can extend this class and override only the callbacks they need, or instantiate it directly when the application does not need a handler of its own.
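As a rough illustration of the handler idea, the sketch below subclasses DefaultHandler and simply records each callback it receives, which makes the order of SAX events visible. The names EventTrace, TracingHandler and trace(), and the inline XML string, are assumptions made for this example only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the tutorial code.
public class EventTrace {

    // Records every callback as a short string, in the order it is received.
    static class TracingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() {
            events.add("startDocument");
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            events.add("startElement:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip whitespace-only text so the trace stays readable.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("characters:" + text);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("endElement:" + qName);
        }

        @Override
        public void endDocument() {
            events.add("endDocument");
        }
    }

    // Parses the given XML string and returns the recorded events.
    static List<String> trace(String xml) {
        TracingHandler handler = new TracingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace("<phase id=\"1\"><owner>Steven</owner></phase>")) {
            System.out.println(event);
        }
    }
}
```

The trace begins with startDocument and ends with endDocument, and each element's start and end events bracket the events of its children.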
The five DefaultHandler methods you will most often override are:
startDocument():
Called at the start of an XML document. Override this method in a subclass to take specific actions at the beginning of a document (such as allocating the root node of a tree or creating an output file).
endDocument():
Called at the end of an XML document. Override this method in a subclass to take specific actions at the end of a document.
startElement() and endElement():
Called at the start and end of each element. Override these methods in a subclass to take specific actions at those points.
characters():
Called with the text content found between the start and end tags of an element. The parser may deliver the text of one element in more than one call.
4. Create the corresponding Java bean
As mentioned above, the XML defines a list of phases. We create a Java data object that maps the elements of a phase; the content of the XML is converted into these objects, which you can then use in your application or simply print out.
package com.asjava;

import java.util.Date;

public class Phase {
    private int id;
    private String name;
    private String description;
    private String owner;
    private Date startDate;
    private Date endDate;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public String getOwner() {
        return owner;
    }

    public void setOwner(String owner) {
        this.owner = owner;
    }

    public Date getStartDate() {
        return startDate;
    }

    public void setStartDate(Date startDate) {
        this.startDate = startDate;
    }

    public Date getEndDate() {
        return endDate;
    }

    public void setEndDate(Date endDate) {
        this.endDate = endDate;
    }
}
5. The code to Read XML file via SAX Parser in Java
We create the class XMLParserViaSax, which extends DefaultHandler:
package com.asjava;

import java.io.IOException;
import java.io.InputStream;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XMLParserViaSax extends DefaultHandler {
    private static final SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");

    // Used to save the list of phases.
    private List<Phase> phases;
    // The phase currently being read.
    private Phase phase;
    // Name of the element whose text is currently being read.
    private String tagName;

    public List<Phase> getPhases() {
        return phases;
    }

    public void setPhases(List<Phase> phases) {
        this.phases = phases;
    }

    public String getTagName() {
        return tagName;
    }

    public void setTagName(String tagName) {
        this.tagName = tagName;
    }

    @Override
    public void startDocument() throws SAXException {
        phases = new ArrayList<Phase>();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if (qName.equals("phase")) {
            phase = new Phase();
            // Set the id from the "id" attribute of the phase element.
            phase.setId(Integer.parseInt(attributes.getValue("id")));
        }
        this.tagName = qName;
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (qName.equals("phase")) {
            this.phases.add(this.phase);
        }
        this.tagName = null;
    }

    @Override
    public void endDocument() throws SAXException {
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // tagName is reset to null at each end tag, so the whitespace
        // between sibling elements is skipped. This simple version assumes
        // that the text of each element arrives in a single call.
        if (this.tagName != null) {
            String data = new String(ch, start, length);
            if (this.tagName.equals("name")) {
                this.phase.setName(data);
            } else if (this.tagName.equals("description")) {
                this.phase.setDescription(data);
            } else if (this.tagName.equals("owner")) {
                this.phase.setOwner(data);
            } else if (this.tagName.equals("startDate")) {
                this.phase.setStartDate(parseDate(data));
            } else if (this.tagName.equals("endDate")) {
                this.phase.setEndDate(parseDate(data));
            }
        }
    }

    // Returns null when the text is not a valid yyyy-MM-dd date.
    private Date parseDate(String data) {
        try {
            return df.parse(data);
        } catch (ParseException e) {
            return null;
        }
    }

    // Formats a date as yyyy-MM-dd, or returns an empty string for null.
    private static String formatDate(Date date) {
        return date == null ? "" : df.format(date);
    }

    public static void main(String[] args) {
        try {
            // Create an instance of SAXParser.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // Create an instance of our DefaultHandler subclass.
            XMLParserViaSax parseXml = new XMLParserViaSax();
            // Project.xml must be on the classpath.
            InputStream stream = XMLParserViaSax.class.getClassLoader()
                    .getResourceAsStream("Project.xml");
            if (stream == null) {
                System.err.println("Project.xml was not found on the classpath");
                return;
            }
            parser.parse(stream, parseXml);
            List<Phase> list = parseXml.getPhases();
            for (Phase phase : list) {
                StringBuilder buffer = new StringBuilder();
                buffer.append("Phase id:").append(phase.getId())
                        .append("\nname: ").append(phase.getName())
                        .append("\ndescription: ").append(phase.getDescription())
                        .append("\nowner: ").append(phase.getOwner())
                        .append("\nstartDate: ").append(formatDate(phase.getStartDate()))
                        .append("\nendDate: ").append(formatDate(phase.getEndDate()));
                System.out.println(buffer.toString());
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
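One caveat about characters(): the SAX contract allows a parser to deliver the text of a single element in several chunks, so a handler that assigns a field directly inside characters() may end up with only the last chunk. A common way around this is to buffer the text and use it in endElement(). The sketch below shows that pattern on the owner element only; the names BufferedTextDemo, OwnerCollector and owners() are hypothetical, and this is a variant for illustration rather than the code used in this tutorial.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the tutorial code.
public class BufferedTextDemo {

    // Collects the complete text of every <owner> element.
    static class OwnerCollector extends DefaultHandler {
        final List<String> owners = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            // Start with an empty buffer for each new element.
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // The buffer now holds the element's whole text.
            if (qName.equals("owner")) {
                owners.add(text.toString().trim());
            }
            text.setLength(0);
        }
    }

    // Parses the given XML string and returns the owner names found in it.
    static List<String> owners(String xml) {
        OwnerCollector handler = new OwnerCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.owners;
    }

    public static void main(String[] args) {
        String xml = "<project>"
                + "<phase id=\"1\"><owner>Steven</owner></phase>"
                + "<phase id=\"2\"><owner>Tom &amp; Jimmy</owner></phase>"
                + "</project>";
        System.out.println(owners(xml));
    }
}
```

Text that contains an entity reference, such as the second owner here, is a typical case in which a parser may split the content across calls; with buffering the handler still sees the whole value.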
6. Output after parsing XML via SAX
Phase id:1
name: Sax project Phase 1
description: the phase 1 of an example that use SAX parser to read xml file
owner: Steven
startDate: 2012-04-04
endDate: 2012-04-05
Phase id:2
name: Sax project Phase 2
description: the phase 2 of an example that use SAX parser to read xml file
owner: Tom
startDate: 2012-04-05
endDate: 2012-04-06
Phase id:3
name: Sax project Phase 3
description: the phase 3 of an example that use SAX parser to read xml file
owner: Jimmy
startDate: 2012-04-06
endDate: 2012-04-07
Conclusion
This article covers only a small part of what a SAX parser can do, with a code example included. SAX offers much more than this. It is a stream parser with an event-driven API: the application writer only defines a number of callback methods, which are called when events occur during parsing. Because it does not build an in-memory tree of the whole document, it generally uses less memory, and is often faster, than tree-based parsers such as DOM.