XML Namespaces Tutorial

Mark Priest

SAX2 adds support for the XML Namespaces specification to the SAX API. Newcomers to SAX2 are often confused by the startElement() and endElement() callbacks in the ContentHandler interface if they are unfamiliar with XML namespaces and how they are handled by SAX2.

The XML Namespaces specification defines a way to group element and attribute names so that schemas created by one organization will not conflict with those created by another. Just as two Java classes can have the same name as long as they are defined in separate packages, two XML elements can have the same name as long as they belong to different namespaces. Each namespace defined in an XML document must be associated with a distinct uniform resource identifier (URI), which is usually a URL. These URIs serve only as identifiers: they carry no semantic meaning and need not refer to actual web resources. To prevent naming conflicts, define namespace URIs using domains that you control, for the same reason that you should follow the reverse-domain naming convention for Java packages. Two URIs are considered distinct if they are distinct character strings, regardless of whether they would resolve to the same physical resource (for example, http://localhost and http://george are distinct URIs in the context of XML namespaces even on the host george).

Namespaces are associated with a prefix when they are declared and this prefix is used along with a local name to represent an element in an XML document. A namespace declaration looks like this:

<parent xmlns:a="http://url1" xmlns:b="http://url2"> ... </parent>

In this example the namespace http://url1 is bound to the prefix "a" and the namespace http://url2 is bound to the prefix "b". Three child elements of <parent>, namely <child>, <a:child>, and <b:child>, would have no namespace, a namespace of "http://url1", and a namespace of "http://url2", respectively, and all three would have the local name "child".
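The binding described above can be checked with a small SAX2 program. The following is only a sketch: the class name PrefixBindingDemo and the helper elementNames() are made up for illustration, and the parser is obtained through JAXP with namespace awareness switched on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX2 or JAXP.
class PrefixBindingDemo {

    // Returns one "{namespaceURI}name" entry per element, in document order.
    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // JAXP parsers ignore namespaces unless asked
        XMLReader reader = factory.newSAXParser().getXMLReader();

        final List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Fall back to qName in case the parser leaves localName empty.
                String name = localName.isEmpty() ? qName : localName;
                names.add("{" + uri + "}" + name);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parent xmlns:a=\"http://url1\" xmlns:b=\"http://url2\">"
                   + "<child/><a:child/><b:child/></parent>";
        for (String name : elementNames(xml)) {
            System.out.println(name);
        }
    }
}
```

With the parser bundled in the JDK, the three children should come out as {}child, {http://url1}child, and {http://url2}child: the same local name in three different namespaces.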

Namespaces have a scope associated with them. A namespace declared in a parent element is bound to a given prefix for that element as well as for all of its child elements, unless that prefix is "overridden" in a child element by being assigned to a different namespace. The association between a namespace and a prefix declared in an element does not apply to the siblings of that element. This is analogous to the scope of variable names within the Java programming language.
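The scoping rule can be observed through the startPrefixMapping() and endPrefixMapping() callbacks of ContentHandler. The sketch below uses a made-up class name (PrefixScopeDemo) and a made-up test document in which the prefix "p" is rebound in one child only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX2 or JAXP.
class PrefixScopeDemo {

    // Logs prefix bindings and the namespace each element resolves to.
    static List<String> events(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        final List<String> log = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                log.add("bind " + prefix + "=" + uri);   // a declaration comes into scope
            }

            @Override
            public void endPrefixMapping(String prefix) {
                log.add("unbind " + prefix);             // the declaration goes out of scope
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                String name = localName.isEmpty() ? qName : localName;
                log.add("element " + name + " in " + uri);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        // "p" is rebound inside <p:inner> only; <p:sibling> sees the outer binding.
        String xml = "<p:outer xmlns:p=\"http://url1\">"
                   + "<p:inner xmlns:p=\"http://url2\"/>"
                   + "<p:sibling/>"
                   + "</p:outer>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

The element <p:inner> resolves to http://url2 because it overrides the prefix, while <p:sibling> resolves to http://url1 again, since the override ended with <p:inner>.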

A default namespace can be defined by omitting the prefix mapping in the declaration as in "xmlns='http://url3'". At most one default namespace is in effect at any given point in an XML document. The default namespace is scoped just as the prefix mappings are. If a default namespace is in scope and an element appears with no prefix then it is associated with that default namespace. Attribute names never inherit the default namespace and must be explicitly mapped to a namespace.
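The different treatment of elements and attributes under a default namespace can be seen in the following sketch. The class name DefaultNamespaceDemo and the sample document are illustrative assumptions; only the xmlns='http://url3' declaration comes from the text above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX2 or JAXP.
class DefaultNamespaceDemo {

    // Reports the namespace of every element and of every ordinary attribute.
    static List<String> report(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        final List<String> out = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                String name = localName.isEmpty() ? qName : localName;
                out.add("element {" + uri + "}" + name);
                for (int i = 0; i < atts.getLength(); i++) {
                    String attName = atts.getLocalName(i).isEmpty()
                            ? atts.getQName(i) : atts.getLocalName(i);
                    out.add("attribute {" + atts.getURI(i) + "}" + attName);
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Unprefixed elements pick up the default namespace; the unprefixed
        // attribute "id" does not.
        String xml = "<item xmlns=\"http://url3\" id=\"1\"><sub/></item>";
        for (String line : report(xml)) {
            System.out.println(line);
        }
    }
}
```

Both <item> and <sub> are reported in http://url3, whereas the attribute id is reported with an empty namespace URI.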

The startElement() and endElement() callbacks in ContentHandler have three arguments for namespace and name as shown in listing 1. The argument "namespaceURI" contains the namespace URI associated with the element or the empty string if there is no namespace. The "localName" argument contains the local name of the element without the prefix and the colon. The "qName", or qualified name, argument contains the element name exactly as it appears in the XML document, including the prefix and colon, if appropriate. To determine whether an element has an associated namespace, compare namespaceURI with the empty string, not with null. SAX2 guarantees that the localName will be meaningful if the namespace is not the empty string, otherwise the qName will be meaningful. If the value for localName or qName is not meaningful it will be set to the empty string, not to null. Some XML parsers always supply both names, but this is not required. The "uri", "localName", and "qName" arguments in the Attributes interface, which is exposed in the "atts" argument, follow the same rules.
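A handler that follows these rules might look like the sketch below. The class name NameReportingHandler and the helper describe() are hypothetical; the callback signatures are those of the SAX2 ContentHandler interface as implemented by DefaultHandler.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that prints how each element name was reported.
class NameReportingHandler extends DefaultHandler {

    // Picks whichever name is meaningful and tests the namespace URI
    // against the empty string (never against null).
    static String describe(String namespaceURI, String localName, String qName) {
        String name = localName.isEmpty() ? qName : localName;
        if (namespaceURI.isEmpty()) {
            return name + " (no namespace)";
        }
        return name + " in " + namespaceURI;
    }

    @Override
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        System.out.println("start " + describe(namespaceURI, localName, qName));
        // The Attributes interface follows the same rules for each attribute.
        for (int i = 0; i < atts.getLength(); i++) {
            System.out.println("  attribute "
                    + describe(atts.getURI(i), atts.getLocalName(i), atts.getQName(i)));
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) {
        System.out.println("end " + describe(namespaceURI, localName, qName));
    }
}
```

Choosing localName when it is non-empty and qName otherwise keeps the handler working with parsers that supply only the name SAX2 requires.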

There are two SAX2 features related to namespaces that affect these rules: the namespaces feature (defined as "http://xml.org/sax/features/namespaces") and the namespace prefixes feature (defined as "http://xml.org/sax/features/namespace-prefixes"). When the namespaces feature is set to false then namespace URIs are not reported in the SAX callbacks even if they are defined for elements or attributes. If the namespace prefixes feature is set to false then the qName will not be meaningful if a namespace is associated with an element, and attributes that are XML namespace declarations will not be accessible from the "atts" argument. If the namespace prefixes feature is set to true then the qName will always be meaningful and attributes that are namespace declarations will be accessible from "atts". It is not legal to set both features to false. Some XML parsers may supply qName and namespaceURI regardless of the feature settings, but this is not required. The namespaces feature is set to true and the namespace prefixes feature is set to false by default when you get an XMLReader from the XMLReaderFactory. To make things somewhat more complicated, when you get an XMLReader using the Java API for XML Processing (JAXP), the namespaces feature is set to false and the namespace prefixes feature is set to true by default. Example 9 shows a sample XML document with various nested namespace declarations, and Table 1a and Table 1b show how a SAX2 parser would report the namespace information for tags and attributes, respectively, for valid namespace feature settings.
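Because the defaults depend on how the XMLReader was obtained, it is often simplest to set both features explicitly. The following sketch turns both features on for a JAXP-supplied reader; the class name NamespaceFeatureDemo, its constants, and its helper methods are illustrative assumptions, while the feature URIs are the ones given above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of SAX2 or JAXP.
class NamespaceFeatureDemo {
    static final String NAMESPACES =
            "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES =
            "http://xml.org/sax/features/namespace-prefixes";

    // Returns a reader with both namespace features switched on.
    static XMLReader newReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);              // JAXP-level switch
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setFeature(NAMESPACES, true);          // report URIs and local names
        reader.setFeature(NAMESPACE_PREFIXES, true);  // report qNames and xmlns attributes
        return reader;
    }

    // Collects the qName of every attribute, including namespace declarations.
    static List<String> attributeQNames(String xml) throws Exception {
        XMLReader reader = newReader();
        final List<String> qNames = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    qNames.add(atts.getQName(i));
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return qNames;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parent xmlns:a=\"http://url1\" a:kind=\"x\"/>";
        System.out.println(attributeQNames(xml));
    }
}
```

With the namespace prefixes feature on, the declaration xmlns:a appears in "atts" alongside a:kind. Setting that feature to false in newReader() should leave only the ordinary attribute.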

Example 9. Example Document for XML Namespaces Tutorial

  <a:Header xmlns="" xmlns:b="http://alturlb">
    <c:to xmlns:c="http://alturlc">Mark Priest</c:to>
    <from fromType="name">John Smith</from>
    <text xmlns="http://newdefault">Hello</text>

Table 1a. Namespace Information For Tags Reported by SAX2 for Document in Example 9

Tag Name | Namespaces | NS Prefixes | localName | qName | namespaceURI

Table 1b. Namespace Information For Attributes Reported by SAX2 for Document in Example 9

Attribute Name | Namespaces | NS Prefixes | localName | qName | namespaceURI

Note: a:encodingStyle is an attribute of a:Envelope and fromType is an attribute of from