Java with XML Streaming API
XML is platform independent, so almost every technology needs to work with it. Java offers several ways to process XML: JAXB, StAX, SAX and DOM. This post focuses on the StAX parsing API.
StAX stands for Streaming API for XML.
- It is a streaming, pull-parsing Java API for reading and writing XML documents.
- The StAX parser is fast, easy to use and memory efficient.
- The primary goal of the StAX API is to give “parsing control to the developer by exposing a simple iterator based API.”
- StAX was created to address limitations in the two most prevalent parsing APIs, SAX and DOM. StAX is not as powerful or flexible as TrAX (Transformation API for XML) or JDOM.
StAX vs SAX (Simple API for XML)
StAX is a bidirectional API: it can both read and write XML documents. SAX is read-only, so another API is needed to write XML documents. StAX is a pull-parsing API, whereas SAX is a push-parsing API.
StAX vs DOM (Document Object Model)
- The DOM parser creates an in-memory tree of objects representing the entire XML document. Once loaded, the DOM tree can be navigated freely and accessed in any order, which gives developers maximum flexibility. The cost of this flexibility is a large memory footprint and more processor time.
- Streaming refers to a programming model in which XML documents are transmitted and parsed serially at application runtime. Stream-based parsers can start generating output immediately, and infoset elements can be discarded and garbage collected immediately after they are used. Streaming requires less memory, reduces processor requirements and provides good performance. Streaming models for XML processing are particularly useful when an application has strict memory limitations.
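To make the DOM side of this comparison concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name DomRandomAccessSketch, its helper methods and the sample XML are assumptions made for illustration. It shows the whole document being loaded first and then queried in arbitrary order.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of the original post.
final class DomRandomAccessSketch {

    /** Parses the complete document into an in-memory tree. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the text of the first element with the given tag name, or null. */
    static String textOf(Document document, String tag) {
        NodeList nodes = document.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<employee><id>1</id><name>GAURAV</name></employee>");
        // The tree stays in memory, so nodes can be read in any order, as often as needed.
        System.out.println(textOf(document, "name"));
        System.out.println(textOf(document, "id"));
    }
}
```

A streaming parser could not go back from name to id like this without re-reading the input; that is the flexibility DOM pays for with memory.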
XML parser API Feature details
Feature | TrAX
API Type | XSLT Rule
Ease of Use | Medium
XPath Capability | Yes
CPU and Memory Efficiency | Varies
Forward Only | No
Read XML | Yes
Write XML | Yes
Create, Read, Update, Delete | No
Note:- TrAX stands for Transformation API for XML. It is included in later versions of JAXP (Java API for XML Processing). JAXP adds two packages, one for XML parsing and the other for XML transformations (TrAX).
Processing any XML document requires three components:
- XML document
- Parser API
- Client code
There are two approaches to XML document parsing:
- pull-parsing model
- push-parsing model
What is pull-parsing model?
Answer:-
In pull parsing, the client application controls the parsing: it pulls events from the parser, so parsing happens according to the client's requirements. The client code invokes the parsing API's methods to read the data; the parser then reads the XML document and returns the requested data. In the push model, by contrast, the client is “pushed” data whether it actually needs it or not. Pull-parsing libraries are also smaller than push-parsing libraries.
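The client-driven control described above can be sketched as follows. This is only an illustration: the class name PullUntilFoundSketch, the method firstText and the sample XML are assumptions, and an in-memory string stands in for a file. The client asks for events one at a time and stops pulling as soon as it has what it needs.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class, not part of the original post.
final class PullUntilFoundSketch {

    /** Returns the text of the first element with the given local name, or null. */
    static String firstText(String xml, String localName) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // The client requests the next event; nothing is delivered unasked.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    // Stop pulling here: the rest of the document is never parsed.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<employees><employee><id>1</id><name>GAURAV</name></employee></employees>";
        System.out.println(firstText(xml, "name"));
    }
}
```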
What is push-parsing model?
Answer:-
In push parsing, the parser API reads the XML document and, whenever an event is generated, pushes the corresponding data to the client application and continues. SAX is a push-parsing API. When the SAX parser encounters the beginning of an XML element, it calls the startElement method on our handler object. It “pushes” the information from the XML into our object, hence the name push-parsing model.
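For comparison, a minimal push-model sketch using the JDK's SAX parser might look like this. The class name PushModelSketch and the sample XML are assumptions for illustration. Note that the client never asks for an event: it registers a handler, and the parser calls startElement for every element in the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class, not part of the original post.
final class PushModelSketch {

    /** Collects element names in the order the SAX parser pushes them to the handler. */
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) {
                // Called by the parser for every start tag, wanted or not.
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<employees><employee><id>1</id></employee></employees>"));
    }
}
```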
The StAX core API is separated into two categories:
- Cursor API
- Event Iterator API
StAX also offers an API for writing XML documents.
- It offers the same two styles for writing: a low-level, cursor-based API (XMLStreamWriter) and a higher-level, event-based API (XMLEventWriter).
- The cursor-based API is useful in data binding scenarios (for example, creating a document from application data). The event-based API is generally useful in pipelining scenarios, where a new document is constructed on the basis of the data provided by input documents.
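The pipelining scenario mentioned above can be sketched by connecting an XMLEventReader to an XMLEventWriter and dropping some events on the way. The class name EventPipelineSketch, the method dropElement and the sample XML are assumptions for illustration; the exact form of the XML declaration in the output depends on the StAX implementation in use.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical illustration class, not part of the original post.
final class EventPipelineSketch {

    /** Copies the input document to a string, leaving out every element named localName. */
    static String dropElement(String xml, String localName) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        int skipDepth = 0; // > 0 while we are inside an element being dropped
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (skipDepth > 0) {
                // Track nesting so the whole subtree is skipped.
                if (event.isStartElement()) {
                    skipDepth++;
                } else if (event.isEndElement()) {
                    skipDepth--;
                }
                continue;
            }
            if (event.isStartElement()
                    && localName.equals(event.asStartElement().getName().getLocalPart())) {
                skipDepth = 1;
                continue;
            }
            // Events read from the input are passed to the writer unchanged.
            writer.add(event);
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<employee><name>GAURAV</name><email>gaurav@yahoo.co.in</email></employee>";
        System.out.println(dropElement(xml, "email"));
    }
}
```

Because the reader hands out event objects and the writer accepts the same objects, no conversion step is needed between input and output, which is why the event API suits this kind of pipeline.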
Cursor API:-
This API traverses the XML document much like a JDBC ResultSet traverses rows. It always moves forward and, once it has moved on, never goes back.
The Cursor API has two main interfaces: XMLStreamReader and XMLStreamWriter.
The XMLStreamReader interface in the StAX cursor API reads XML documents in a forward direction only. It provides methods for pulling data from the stream or for skipping unwanted events, which can be used to:
- Get the value of an attribute specified in XML document
- Read XML content of the document
- Determine whether an element has content or it is empty
- Get indexed access to a collection of attributes
- Get indexed access to a collection of namespaces
- Get the name of the current event (if applicable)
- Get the content of the current event (if applicable)
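A few of these accessors can be seen together in the following sketch. The class name CursorAccessorsSketch, the method describe, the namespace URI urn:example and the sample XML are assumptions for illustration. It uses named attribute access (getAttributeValue with a null namespace, which skips the namespace check), indexed attribute access and the namespace count of the current element.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class, not part of the original post.
final class CursorAccessorsSketch {

    /** Returns one line per start element, listing its attributes and namespace count. */
    static String describe(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        StringBuilder out = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // skip events we are not interested in
                }
                out.append(reader.getLocalName());
                // Indexed access to the attributes of the current element.
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    out.append(' ').append(reader.getAttributeLocalName(i))
                       .append('=').append(reader.getAttributeValue(i));
                }
                // Number of namespaces declared on the current element.
                out.append(" (namespaces: ").append(reader.getNamespaceCount()).append(")");
                // Named access to a single attribute; returns null if it is absent.
                if (reader.getAttributeValue(null, "id") == null) {
                    out.append(" [no id]");
                }
                out.append('\n');
            }
        } finally {
            reader.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<e:employees xmlns:e=\"urn:example\">"
                + "<employee id=\"7\" name=\"A\"/></e:employees>";
        System.out.print(describe(xml));
    }
}
```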
Implementation of XMLStreamReader for Reading an XML file
XMLStreamReaderDemo.java
package com.gaurav.staxparsers;

import java.io.FileReader;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

/* CONSTANTS SPECIFIED FOR PARSING THE XML DOCUMENT.
   XMLStreamConstants.START_ELEMENT
   XMLStreamConstants.END_ELEMENT
   XMLStreamConstants.PROCESSING_INSTRUCTION
   XMLStreamConstants.CHARACTERS
   XMLStreamConstants.COMMENT
   XMLStreamConstants.SPACE
   XMLStreamConstants.START_DOCUMENT
   XMLStreamConstants.END_DOCUMENT
   XMLStreamConstants.ENTITY_REFERENCE
   XMLStreamConstants.ATTRIBUTE
   XMLStreamConstants.DTD
   XMLStreamConstants.CDATA
   XMLStreamConstants.NAMESPACE
   XMLStreamConstants.NOTATION_DECLARATION
   XMLStreamConstants.ENTITY_DECLARATION
*/
public class XMLStreamReaderDemo {
    public static void main(String[] args) throws Exception {
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        FileReader fileReader = new FileReader("C://employeeDetails.xml");
        XMLStreamReader xmlStreamReader = xmlInputFactory
                .createXMLStreamReader(fileReader);
        try {
            // A newly created reader is positioned on START_DOCUMENT.
            int eventType = xmlStreamReader.getEventType();
            while (true) {
                switch (eventType) {
                case XMLStreamConstants.START_DOCUMENT:
                    System.out.println("DOCUMENT READING STARTED");
                    System.out.println("******************************");
                    break;
                case XMLStreamConstants.START_ELEMENT:
                    System.out.println("Start Tag : "
                            + xmlStreamReader.getName());
                    // Print every attribute of the current element.
                    for (int i = 0, n = xmlStreamReader.getAttributeCount(); i < n; ++i)
                        System.out.println("Attribute : "
                                + xmlStreamReader.getAttributeName(i) + "="
                                + xmlStreamReader.getAttributeValue(i));
                    break;
                case XMLStreamConstants.CHARACTERS:
                    // Ignore whitespace-only text between elements.
                    if (xmlStreamReader.isWhiteSpace())
                        break;
                    System.out.println("Value : " + xmlStreamReader.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    System.out.println("End Tag :" + xmlStreamReader.getName());
                    break;
                case XMLStreamConstants.END_DOCUMENT:
                    System.out.println("******************************");
                    System.out.println("DOCUMENT READING COMPLETED.");
                    break;
                }
                if (!xmlStreamReader.hasNext())
                    break;
                // Pull the next event from the parser.
                eventType = xmlStreamReader.next();
            }
        } catch (Exception e) {
            Logger.getLogger(XMLStreamReaderDemo.class.getName()).log(
                    Level.SEVERE, null, e);
        } finally {
            // XMLStreamReader.close() does not close the underlying input source.
            xmlStreamReader.close();
            fileReader.close();
        }
    }
}
Note:- The content of the employeeDetails.xml file is as below:-
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<employees>
    <employee>
        <id>1</id>
        <name>GAURAV</name>
        <email>gaurav@yahoo.co.in</email>
    </employee>
    <recCount>1</recCount>
</employees>
Result:-
DOCUMENT READING STARTED
******************************
Start Tag : employees
Start Tag : employee
Start Tag : id
Value : 1
End Tag :id
Start Tag : name
Value : GAURAV
End Tag :name
Start Tag : email
Value : gaurav@yahoo.co.in
End Tag :email
End Tag :employee
Start Tag : recCount
Value : 1
End Tag :recCount
End Tag :employees
******************************
DOCUMENT READING COMPLETED.
Implementation of XMLStreamWriter for Writing an XML file
XMLStreamWriterDemo.java
package com.gaurav.staxparsers;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class XMLStreamWriterDemo {
    public static void main(String[] args) throws XMLStreamException {
        XMLOutputFactory xmlOutputFactory = XMLOutputFactory.newInstance();
        XMLStreamWriter xmlStreamWriter = null;
        try {
            xmlStreamWriter = xmlOutputFactory.createXMLStreamWriter(
                    new FileOutputStream("C://GeneratedXMLUsingXMLStreamWriter.xml"), "UTF-8");
            xmlStreamWriter.writeStartDocument("UTF-8", "1.0");
            /* To remove encoding="UTF-8" from the XML declaration, comment out the
             * two statements above and use the lines below instead
             * (this also needs an import of java.io.FileWriter).
             */
            /* xmlStreamWriter = xmlOutputFactory
                    .createXMLStreamWriter(new FileWriter(
                            "C://GeneratedXMLUsingXMLStreamWriter.xml"));
               xmlStreamWriter.writeStartDocument(); */
            xmlStreamWriter.writeStartElement("employees");
            xmlStreamWriter.writeStartElement("employee");
            // Attributes must be written directly after their start element.
            xmlStreamWriter.writeAttribute("id", "1234");
            xmlStreamWriter.writeAttribute("name", "KUMAR GAURAV");
            xmlStreamWriter.writeAttribute("designation", "SOFTWARE ENGINEER");
            xmlStreamWriter.writeEndElement(); // closes employee
            xmlStreamWriter.writeEndElement(); // closes employees
            xmlStreamWriter.writeEndDocument();
            // Flush first, so that success is only logged once the data is written out.
            xmlStreamWriter.flush();
            Logger.getLogger(XMLStreamWriterDemo.class.getName())
                    .info("Contents are written successfully in the specified XML file using XMLStreamWriter");
        } catch (XMLStreamException xmlse) {
            Logger.getLogger(XMLStreamWriterDemo.class.getName()).log(
                    Level.SEVERE, null, xmlse);
        } catch (IOException ioe) {
            Logger.getLogger(XMLStreamWriterDemo.class.getName()).log(
                    Level.SEVERE, null, ioe);
        } finally {
            // The writer is still null if the output file could not be opened.
            if (xmlStreamWriter != null) {
                xmlStreamWriter.close();
            }
        }
    }
}
Result:- content of GeneratedXMLUsingXMLStreamWriter.xml
<?xml version="1.0" encoding="UTF-8"?>
<employees><employee id="1234" name="KUMAR GAURAV" designation="SOFTWARE ENGINEER">
</employee>
</employees>
Event Iterator API:-
- This API has two main interfaces: XMLEventReader and XMLEventWriter.
- The Event Iterator API parses the XML document and returns event objects.
- The events represent elements, attributes, text, comments and so on.
- It works like a Java collection iterator. XMLEvent is the base interface, and XMLEventReader.nextEvent() is the key method, which returns the next event in the XML document.
- This is similar to the next() method of the Iterator interface in the collections framework.
- Available event types are START_DOCUMENT, START_ELEMENT, END_ELEMENT, CHARACTERS, PROCESSING_INSTRUCTION, COMMENT, SPACE, END_DOCUMENT, ENTITY_REFERENCE, ATTRIBUTE, DTD, CDATA, NAMESPACE, NOTATION_DECLARATION, ENTITY_DECLARATION.
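Since each event is an object, it can be inspected through typed accessors rather than through the reader itself. The sketch below illustrates this; the class name EventObjectsSketch, the method summarize and the sample XML are assumptions for illustration. IS_COALESCING is switched on so that adjacent character data is reported as a single event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical illustration class, not part of the original post.
final class EventObjectsSketch {

    /** Returns a short description of each start tag, text node and end tag. */
    static List<String> summarize(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
        List<String> lines = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                // Attributes belong to the StartElement event object.
                Attribute id = start.getAttributeByName(new QName("id"));
                lines.add("start:" + start.getName().getLocalPart()
                        + (id != null ? " id=" + id.getValue() : ""));
            } else if (event.isCharacters() && !event.asCharacters().isWhiteSpace()) {
                lines.add("text:" + event.asCharacters().getData());
            } else if (event.isEndElement()) {
                lines.add("end:" + event.asEndElement().getName().getLocalPart());
            }
        }
        reader.close();
        return lines;
    }

    public static void main(String[] args) throws XMLStreamException {
        summarize("<employee id=\"1234\"><name>GAURAV</name></employee>")
                .forEach(System.out::println);
    }
}
```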
Implementation of XMLEventReader for Reading an XML file
XMLEventReaderDemo.java
package com.gaurav.staxparsers;

import java.io.FileReader;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;

public class XMLEventReaderDemo {
    public static void main(String args[]) {
        XMLEventReader xmlEventReader = null;
        try {
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            FileReader fileReader = new FileReader("C://employeeDetails.xml");
            xmlEventReader = xmlInputFactory.createXMLEventReader(fileReader);
            while (xmlEventReader.hasNext()) {
                XMLEvent xmlEvent = xmlEventReader.nextEvent();
                // Skip whitespace-only text between elements.
                if (xmlEvent.isCharacters()
                        && ((Characters) xmlEvent).isWhiteSpace())
                    continue;
                System.out.println(xmlEvent);
            }
        } catch (Exception exception) {
            Logger.getLogger(XMLEventReaderDemo.class.getName()).log(
                    Level.SEVERE, null, exception);
        } finally {
            // The reader is still null if the input file could not be opened.
            if (xmlEventReader != null) {
                try {
                    xmlEventReader.close();
                } catch (XMLStreamException e) {
                    Logger.getLogger(XMLEventReaderDemo.class.getName()).log(
                            Level.SEVERE, null, e);
                }
            }
        }
    }
}
Result:-
<?xml version="1.0" encoding='null' standalone='yes'?>
<employees>
<employee>
<id>
1
</id>
<name>
GAURAV
</name>
<email>
gaurav@yahoo.co.in
</email>
</employee>
<recCount>
1
</recCount>
</employees>
ENDDOCUMENT
Implementation of XMLEventWriter for Writing an XML file
XMLEventWriterDemo.java
package com.gaurav.staxparsers;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class XMLEventWriterDemo {
    public static void main(String args[]) {
        XMLOutputFactory xmlOutputFactory = XMLOutputFactory.newInstance();
        XMLEventFactory xmlEventFactory = XMLEventFactory.newInstance();
        XMLEventWriter xmlEventWriter = null;
        try {
            xmlEventWriter = xmlOutputFactory.createXMLEventWriter(
                    new FileOutputStream("C://GeneratedXMLUsingXMLEventWriter.xml"), "UTF-8");
            XMLEvent event = xmlEventFactory.createStartDocument("UTF-8", "1.0");
            /* To remove encoding="UTF-8" from the XML declaration, comment out the
             * two statements above and use the lines below instead
             * (this also needs an import of java.io.FileWriter).
             */
            /* xmlEventWriter = xmlOutputFactory.createXMLEventWriter(new FileWriter(
                    "C://GeneratedXMLUsingXMLEventWriter.xml"));
               XMLEvent event = xmlEventFactory.createStartDocument(); */
            xmlEventWriter.add(event);
            // Arguments are (prefix, namespaceURI, localName).
            event = xmlEventFactory.createStartElement("employees",
                    "http://www.javatechtipssharedbygaurav.com", "employee");
            xmlEventWriter.add(event);
            // Note: this declares the prefix "gaurav", while the elements use the prefix
            // "employees". Declare "employees" here if the output must be
            // namespace-well-formed.
            event = xmlEventFactory.createNamespace("gaurav",
                    "http://www.javatechtipssharedbygaurav.com");
            xmlEventWriter.add(event);
            event = xmlEventFactory.createAttribute("id", "1234");
            xmlEventWriter.add(event);
            event = xmlEventFactory.createAttribute("name", "KUMAR GAURAV");
            xmlEventWriter.add(event);
            event = xmlEventFactory.createStartElement("employees",
                    "http://www.javatechtipssharedbygaurav.com", "company");
            xmlEventWriter.add(event);
            event = xmlEventFactory.createAttribute("software", "java");
            xmlEventWriter.add(event);
            event = xmlEventFactory.createEndElement("employees",
                    "http://www.javatechtipssharedbygaurav.com", "company");
            xmlEventWriter.add(event);
            event = xmlEventFactory.createEndElement("employees",
                    "http://www.javatechtipssharedbygaurav.com", "employee");
            xmlEventWriter.add(event);
            // Flush first, so that success is only logged once the data is written out.
            xmlEventWriter.flush();
            Logger.getLogger(XMLEventWriterDemo.class.getName())
                    .info("Contents are written successfully in the specified XML file using XMLEventWriter");
        } catch (XMLStreamException xmle) {
            Logger.getLogger(XMLEventWriterDemo.class.getName()).log(
                    Level.SEVERE, null, xmle);
        } catch (IOException ioe) {
            Logger.getLogger(XMLEventWriterDemo.class.getName()).log(
                    Level.SEVERE, null, ioe);
        } finally {
            // The writer is still null if the output file could not be opened.
            if (xmlEventWriter != null) {
                try {
                    xmlEventWriter.close();
                } catch (XMLStreamException xmle) {
                    Logger.getLogger(XMLEventWriterDemo.class.getName()).log(
                            Level.SEVERE, null, xmle);
                }
            }
        }
    }
}
Result:- content of GeneratedXMLUsingXMLEventWriter.xml
<?xml version="1.0" encoding="UTF-8"?>
<employees:employee xmlns:gaurav="http://www.javatechtipssharedbygaurav.com" id="1234" name="KUMAR GAURAV">
<employees:company software="java">
</employees:company>
</employees:employee>
Advantages of the pull-parsing model:-
- In the pull-parsing model, parsing happens according to the client's requirements.
- Pull-parsing libraries are smaller.
- The client code that interacts with the parser API is also smaller, even for more complex documents.
- Filtering of elements is simpler because the client knows when a specific element arrives and can decide at that point whether to process it.
- Pull clients can read multiple documents at one time with a single thread.
- A StAX pull parser can filter XML documents such that elements unnecessary to the client can be ignored, and it can support XML views of non-XML data.
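The filtering advantage listed above can be sketched with XMLInputFactory.createFilteredReader, which wraps a reader so that only accepted events reach the client. The class name FilteredReaderSketch, the method startElementNames and the sample XML are assumptions for illustration. Here the filter accepts start elements only, so text, end tags and the document events never reach the loop.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical illustration class, not part of the original post.
final class FilteredReaderSketch {

    /** Returns the local names of all start elements, in document order. */
    static List<String> startElementNames(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader raw = factory.createXMLEventReader(new StringReader(xml));
        // EventFilter has a single accept(XMLEvent) method, so a lambda can be used.
        XMLEventReader filtered = factory.createFilteredReader(raw,
                event -> event.isStartElement());
        List<String> names = new ArrayList<>();
        while (filtered.hasNext()) {
            // Every event that gets this far is a StartElement.
            names.add(filtered.nextEvent().asStartElement().getName().getLocalPart());
        }
        filtered.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<employees><employee><id>1</id><name>GAURAV</name></employee></employees>";
        System.out.println(startElementNames(xml));
    }
}
```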
Reference taken from Oracle and IBM.