SAX Adapter

The SAX Adapter is a utility that builds on the SAX interface and greatly simplifies the use of SAX while preserving most of its efficiency and all of its power. With a utility like the SAX Adapter, SAX truly becomes a "simple" XML API with a gentler learning curve than the more complex tree-based APIs such as DOM.

The SAX Adapter utility simplifies the org.xml.sax.ContentHandler interface and provides a parsing model that is a natural fit for most XML processing tasks. First, you instantiate a SAXAdapter, which functions as a SAX2 parser (an XMLReader implementation), or producer, of SAX events. Next, for each tag you are interested in handling, you register a callback object with the adapter. The callback interface is a simplified version of ContentHandler with just two methods, onStartTag and onEndTag. This simplifies the use of SAX in two ways. First, it pares down the ContentHandler interface from eleven methods to two and provides a simple mechanism for keeping track of the parsing state that is generally of interest. Second, it lets you partition handling code by tag, which is usually what you want to do. Instead of writing nested if-then-else structures, you provide different implementations of the callback interface and the adapter calls them for the appropriate tags.
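The idea of dispatching by tag can be illustrated with plain SAX. The following is a minimal sketch, not the SAX Adapter's actual API: the TagCallback interface, its method signatures, and the Dispatcher class are assumptions made for illustration, and the sketch delivers element text in onEndTag for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TagDispatchSketch {

    /** Hypothetical two-method callback; the real interface may differ. */
    interface TagCallback {
        void onStartTag(String tag, Attributes attrs);
        void onEndTag(String tag, String text);
    }

    /** Routes SAX events to the callback registered for each tag name. */
    static class Dispatcher extends DefaultHandler {
        private final Map<String, TagCallback> callbacks = new HashMap<>();
        private final StringBuilder text = new StringBuilder();

        void register(String tag, TagCallback cb) {
            callbacks.put(tag, cb);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes attrs) {
            text.setLength(0);
            TagCallback cb = callbacks.get(qName);
            if (cb != null) cb.onStartTag(qName, attrs);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            TagCallback cb = callbacks.get(qName);
            if (cb != null) cb.onEndTag(qName, text.toString().trim());
            text.setLength(0);
        }
    }

    /** Collects the text of every title element; other tags are ignored. */
    static List<String> titles(String xml) {
        List<String> out = new ArrayList<>();
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.register("title", new TagCallback() {
            public void onStartTag(String tag, Attributes attrs) { }
            public void onEndTag(String tag, String text) { out.add(text); }
        });
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), dispatcher);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<lib><book><title>A</title></book><book><title>B</title></book></lib>";
        System.out.println(titles(xml)); // prints [A, B]
    }
}
```

Only the handler for title contains application logic; everything else is routing, which is the part a utility like the adapter takes over.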

The SAX Adapter stores parsing state in three different ways to simplify application development.

A StringBuffer is passed to each callback that represents the accumulated calls to characters(), saving you from matching text with tags.
A Map is provided with each callback that is shared by all callbacks, which is an easy way to share application state.
A NamespaceContext interface object is passed into both callback methods, which provides context-specific namespace information about the parsed document.

With this simple infrastructure it is surprising how much work can be done with just a little application code.

The SAX Adapter creates an internal stack of callback objects as it parses an XML document. Every time a callback is identified for a particular tag, that callback is added to the stack. The attributes collected in startElement() and the element content collected in characters() are presented to the registered callback's onStartTag() method. The onStartTag() method is called either when the end tag of the original element is encountered or when a nested element is discovered, whichever comes first. The onEndTag() method is called when the end tag is encountered. Callbacks can be registered mid-parse, in which case the adapter scans the stack and replaces any callbacks previously associated with the specified tag with the new callback. If no callback is found for a given tag, nothing is added to the stack and that tag is ignored.
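The stack behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the adapter's source: the TagCallback signatures, the Frame class, and the depth counter used to match end tags to stack entries are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class CallbackStackSketch {

    /** Hypothetical callback shape; the real signatures are not given in the text. */
    interface TagCallback {
        void onStartTag(String tag, Attributes attrs, StringBuffer text);
        void onEndTag(String tag, StringBuffer text);
    }

    /** One stack entry: a matched tag plus the state held until onStartTag fires. */
    static final class Frame {
        final String tag;
        final int depth;
        final Attributes attrs;
        TagCallback callback;
        boolean started;

        Frame(String tag, int depth, Attributes attrs, TagCallback callback) {
            this.tag = tag;
            this.depth = depth;
            this.attrs = attrs;
            this.callback = callback;
        }
    }

    static class StackingHandler extends DefaultHandler {
        private final Map<String, TagCallback> registered = new HashMap<>();
        private final Deque<Frame> stack = new ArrayDeque<>();
        private final StringBuffer text = new StringBuffer();
        private int depth;

        /** Registers a callback and swaps it into frames already on the stack. */
        void register(String tag, TagCallback cb) {
            registered.put(tag, cb);
            for (Frame f : stack) {
                if (f.tag.equals(tag)) f.callback = cb;
            }
        }

        /** Delivers the deferred onStartTag for the innermost frame, if still pending. */
        private void fireStartIfPending() {
            Frame top = stack.peek();
            if (top != null && !top.started) {
                top.started = true;
                top.callback.onStartTag(top.tag, top.attrs, text);
                text.setLength(0);
            }
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes attrs) {
            fireStartIfPending(); // a nested element was discovered
            text.setLength(0);
            depth++;
            TagCallback cb = registered.get(qName);
            if (cb != null) {
                // Copy the attributes: the parser may reuse its Attributes object.
                stack.push(new Frame(qName, depth, new AttributesImpl(attrs), cb));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            Frame top = stack.peek();
            if (top != null && top.depth == depth) {
                fireStartIfPending(); // the end tag arrived before any nested element
                top.callback.onEndTag(top.tag, text);
                stack.pop();
            }
            text.setLength(0);
            depth--;
        }
    }

    /** Parses xml and records the order in which callbacks fire for item and sub. */
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        TagCallback recorder = new TagCallback() {
            public void onStartTag(String tag, Attributes attrs, StringBuffer text) {
                events.add("start:" + tag + ":" + text);
            }
            public void onEndTag(String tag, StringBuffer text) {
                events.add("end:" + tag);
            }
        };
        StackingHandler handler = new StackingHandler();
        handler.register("item", recorder);
        handler.register("sub", recorder);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }
}
```

For the input `<item id="1">hello<sub>x</sub></item>`, the sketch fires onStartTag for item when sub is discovered (with the text "hello"), and fires onStartTag for sub only when its end tag arrives.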

When a tag is encountered, the adapter searches for a callback using the following logic:

Return a registered callback that matches namespace and name; else
Return a registered default namespace callback that matches the namespace; else
Return a callback registered by qualified-name that matches the qName; else
Return the default callback, if one has been registered.
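The lookup order above amounts to a chain of fallbacks. Here is a minimal sketch of that chain; the Registry class, its method names, and the "{uri}localName" key format are assumptions for illustration and are not taken from the adapter itself.

```java
import java.util.HashMap;
import java.util.Map;

class CallbackLookupSketch {

    /** Holds callbacks of any type T under the four kinds of registration. */
    static class Registry<T> {
        private final Map<String, T> byNamespaceAndName = new HashMap<>();
        private final Map<String, T> byNamespace = new HashMap<>();
        private final Map<String, T> byQName = new HashMap<>();
        private T defaultCallback;

        void register(String uri, String localName, T cb) {
            byNamespaceAndName.put("{" + uri + "}" + localName, cb);
        }

        void registerForNamespace(String uri, T cb) {
            byNamespace.put(uri, cb);
        }

        void registerForQName(String qName, T cb) {
            byQName.put(qName, cb);
        }

        void registerDefault(T cb) {
            defaultCallback = cb;
        }

        /** Returns the first match in the order listed above, or null if there is none. */
        T find(String uri, String localName, String qName) {
            T cb = byNamespaceAndName.get("{" + uri + "}" + localName);
            if (cb == null) cb = byNamespace.get(uri);
            if (cb == null) cb = byQName.get(qName);
            if (cb == null) cb = defaultCallback;
            return cb;
        }
    }

    /** Looks up a tag in a small example registry; strings stand in for callbacks. */
    static String lookup(String uri, String localName, String qName) {
        Registry<String> registry = new Registry<>();
        registry.register("urn:books", "title", "exact");
        registry.registerForNamespace("urn:books", "namespace");
        registry.registerForQName("note", "qname");
        registry.registerDefault("default");
        return registry.find(uri, localName, qName);
    }

    public static void main(String[] args) {
        System.out.println(lookup("urn:books", "title", "b:title"));   // prints exact
        System.out.println(lookup("urn:books", "author", "b:author")); // prints namespace
        System.out.println(lookup("", "note", "note"));                // prints qname
        System.out.println(lookup("urn:other", "x", "o:x"));           // prints default
    }
}
```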

The adapter keeps track of namespace mappings using the SAX NamespaceSupport helper class and exposes this information through the NamespaceContext interface in the onStartTag() and onEndTag() callback methods.
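For readers unfamiliar with org.xml.sax.helpers.NamespaceSupport, the sketch below shows the scoping pattern it supports. It uses only the standard helper class; how the adapter wires it to its NamespaceContext is not shown here, and the resolve method is a hypothetical helper.

```java
import org.xml.sax.helpers.NamespaceSupport;

class NamespaceTrackingSketch {

    /**
     * Resolves qName inside a scope that declares a single prefix.
     * Returns { namespace URI, local name, qName }, or null if the
     * prefix used by qName is not declared.
     */
    static String[] resolve(String prefix, String uri, String qName) {
        NamespaceSupport ns = new NamespaceSupport();
        ns.pushContext();              // entering an element
        ns.declarePrefix(prefix, uri); // typically driven by startPrefixMapping()
        String[] parts = ns.processName(qName, new String[3], false);
        ns.popContext();               // leaving the element discards its declarations
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = resolve("dc", "urn:example:dc", "dc:title");
        System.out.println(parts[0] + " " + parts[1]); // prints urn:example:dc title
    }
}
```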

If no parent XMLReader is supplied to the SAXAdapter constructor, the adapter uses the SAX2 bootstrap mechanism in XMLReaderFactory to create a parent parser. If a certain system property is defined, its value is interpreted as the fully qualified class name of the desired XMLReader implementation. This feature is useful when third-party software sets a system default implementation for XMLReader that you want to bypass.
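A parent-parser bootstrap along these lines might look like the sketch below. The property name example.saxadapter.parent is purely hypothetical, since the text does not name the property the adapter actually reads. Note also that XMLReaderFactory is deprecated in recent JDKs, although it is still available.

```java
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

class ParentReaderSketch {

    /** Hypothetical property name; the real one is not given in the text. */
    static final String OVERRIDE_PROPERTY = "example.saxadapter.parent";

    /** Creates the parent parser, preferring an explicitly configured class. */
    @SuppressWarnings("deprecation")
    static XMLReader createParent() {
        try {
            String className = System.getProperty(OVERRIDE_PROPERTY);
            if (className != null && !className.isEmpty()) {
                // Bypass whatever default other software has configured.
                return XMLReaderFactory.createXMLReader(className);
            }
            // Fall back to the standard SAX2 bootstrap mechanism.
            return XMLReaderFactory.createXMLReader();
        } catch (SAXException e) {
            throw new IllegalStateException("No usable XMLReader found", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createParent().getClass().getName());
    }
}
```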

I must give credit to the SAX project authors, not only for their significant contributions to XML, but also for the design of this website. I have egregiously borrowed from the SAX website style and format and respectfully ask for their forgiveness.
