A.5 XML Namespaces

XML Namespaces let you place a set of XML elements inside a separate "area" to avoid tag name clashes. This is an important feature because it allows XML documents to be extended and combined. Unfortunately, using XML namespaces is tricky. For something that initially seems very straightforward, there's a surprising amount of explanation required.

A.5.1 Why Use Namespaces?

Using XML Namespaces, developers can work together to define a common set of markup for different sets of data, such as RSS items, meta-information about pages on the Internet, or books. When programmers everywhere represent related information using the same set of elements in the same namespace, then everyone can create powerful applications based on a large set of shared data.

That's the theory, anyway.

On a more practical level, avoiding tag name clashes still matters because XML documents are frequently modified and extended. Clashes aren't a problem when everyone is working with a fixed set of elements. However, you can run into trouble if you allow others to extend a document by adding their own elements.

For example, you may decide to use <title> to refer to the title of a web page, but your friend used <title> as the title of a person, such as Mister or Doctor. With XML Namespaces, you can keep <html:title> distinct from <person:title>.

Some languages have a similar concept, where functions and objects belonging to a package can be namespaced together. Before version 5.3, PHP did not support namespaces, which is why you may see PHP function and class names prefixed with a unique string. For example, the PEAR::DB MySQL module is named DB_mysql. The leading DB_ means that this class will not conflict with a class named simply mysql.

Another example of namespaces is the domain name system: columbia.com is the Columbia Sportswear company, while columbia.edu is Columbia University. Both hosts are columbia, but one lives in the .com namespace and the other lives in .edu.

A.5.2 Syntax

In XML, a namespace name is a string that looks like a URL, for example, http://www.example.org/namespace/. This URL doesn't have to resolve to an actual web page that contains information about the namespace, but it can. A namespace is not a URL, but a string that is formatted the same way as a URL.

This URL-based naming scheme is just a way for people to easily create unique namespaces. Therefore, it's best only to create namespaces that point to a URL that you control. If everyone does this, there won't be any namespace conflicts. Technically, you can create a namespace that points at a location you don't own or use in any way, such as http://www.yahoo.com. This is not invalid, but it is confusing.

Unlike domain names, there's no official registration process required before you can use a new XML namespace. All you need to do is define the namespace inside an XML document. That "creates" the namespace. To do this, add an xmlns attribute to an XML element. For instance:

<tag xmlns:example="http://www.example.com/namespace/">

When an attribute name begins with the string xmlns, you're defining a namespace. The namespace's name is the value of that attribute. In this case, it's http://www.example.com/namespace/.

A.5.3 Namespace Prefixes

Since URLs are unwieldy, a namespace prefix is used as a substitute for the URL when referring to elements in a namespace (in an XML document or an XPath query, for example). This prefix comes after xmlns and a :. The prefix name in the previous example is example. Therefore, xmlns:example="http://www.example.com/namespace/" not only creates a namespace, but assigns the token example as a shorthand name for the namespace.

Namespace prefixes can contain letters, numbers, periods, underscores, and hyphens. They must begin with a letter or underscore, and they can't begin with the string xml. That sequence is reserved by XML for XML-related prefixes, such as xmlns.

When you create a namespace using xmlns, the element in which you place the attribute and any elements or attributes that live below it in your XML document are eligible to live in the namespace. However, these elements aren't placed there automatically. To actually place an element or attribute in the namespace, put the namespace prefix and a colon in front of the element name. For example, to put the element title inside of the http://www.example.com/namespace/ namespace, use an opening tag of <example:title>.

The entire string example:title is called a qualified name, since you're explicitly mentioning which element you want. The element or attribute name without the prefix and colon, in this case title, is called the local name.
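
To see how a parser separates a prefix from its namespace, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. Python is used only for illustration, and the sample document and variable names are assumptions, not part of the original examples. ElementTree reports each element name in the expanded form {namespace}localname, so the prefix itself disappears after parsing:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: declares the prefix "example" and uses it on
# the first child only. The second child has no prefix.
doc = """<tag xmlns:example="http://www.example.com/namespace/">
    <example:title>Upgrading</example:title>
    <title>Doctor</title>
</tag>"""

root = ET.fromstring(doc)
tags = [child.tag for child in root]

# Split the expanded name of the first child into namespace and local name.
namespace, local_name = tags[0][1:].split("}")

print(tags[0])      # {http://www.example.com/namespace/}title
print(tags[1])      # title
print(local_name)   # title
```

The second title element has the same local name as the first, but because it carries no prefix it stays outside the namespace, which is the behavior the preceding paragraphs describe.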

Note that while the xmlns:example syntax implies that xmlns is a namespace prefix, this is actually false. The XML specification forbids using any name or prefix that begins with xml, except as detailed in various XML and XML-related specifications. In this case, xmlns is merely a sign that the name following the colon (:) is a namespace prefix, not an indication that xmlns is itself a prefix.

A.5.4 Examples

Example A-2 updates the address book from Example A-1 and places all the elements inside the http://www.example.com/address-book/ namespace.

Example A-2. Simple address book in a namespace
<ab:address-book xmlns:ab="http://www.example.com/address-book/">
    <ab:person id="1">
        <ab:firstname>Rasmus</ab:firstname>
        <ab:lastname>Lerdorf</ab:lastname>
        <ab:city>Sunnyvale</ab:city>
        <ab:state>CA</ab:state>
        <ab:email>rasmus@php.net</ab:email>
    </ab:person>

    <!-- more entries here -->
</ab:address-book>

If two XML documents map the same namespace to different prefixes, the elements still live inside the same namespace. The URL string, not the prefix, defines the namespace. Also, two namespaces are equivalent only if their strings are identical, character for character, including case. Even if two URLs resolve to the same location, they're different namespaces.

Therefore, this document is considered equivalent to Example A-2:

<bigbird:address-book xmlns:bigbird="http://www.example.com/address-book/">
    <bigbird:person id="1">
        <bigbird:firstname>Rasmus</bigbird:firstname>
        <bigbird:lastname>Lerdorf</bigbird:lastname>
        <bigbird:city>Sunnyvale</bigbird:city>
        <bigbird:state>CA</bigbird:state>
        <bigbird:email>rasmus@php.net</bigbird:email>
    </bigbird:person>

    <!-- more entries here -->
</bigbird:address-book>

The ab prefix has been changed to bigbird, but the namespace is still http://www.example.com/address-book/. Therefore, an XML parser would treat these documents as if they were the same.
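
One way to check this claim is to parse both versions and compare the names the parser reports. The following is a rough sketch using Python's standard xml.etree.ElementTree module. The helper functions address_book() and expanded_names() are hypothetical names introduced for this illustration, and the document is shortened to a single field:

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.com/address-book/"

def address_book(prefix, namespace):
    """Build a one-entry address book using the given prefix and namespace."""
    return (
        f'<{prefix}:address-book xmlns:{prefix}="{namespace}">'
        f'<{prefix}:person id="1">'
        f'<{prefix}:firstname>Rasmus</{prefix}:firstname>'
        f'</{prefix}:person>'
        f'</{prefix}:address-book>'
    )

def expanded_names(xml_text):
    """Return the {namespace}localname of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

ab = expanded_names(address_book("ab", NS))
bigbird = expanded_names(address_book("bigbird", NS))

# Same prefix as the first document, but the namespace differs in case.
other = expanded_names(address_book("ab", "http://www.example.com/Address-Book/"))

print(ab == bigbird)   # True: different prefixes, same namespace
print(ab == other)     # False: same prefix, different namespace
```

The comparison looks only at element names, not at text or attributes, so it is a demonstration of how names are resolved rather than a full document-equivalence test.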

A.5.5 Default Namespaces

As you can see, prepending a namespace prefix to every name is not only tedious, it also clutters up your document. Therefore, XML lets you specify a default namespace. Wherever a default namespace is applied, nonprefixed elements automatically live inside the default namespace. Nonprefixed attributes are not affected by the default namespace; they remain outside any namespace.

A default namespace definition is similar to that of other namespaces, but you omit the colon and prefix name:

xmlns="http://www.example.com/namespace/"

This means there's yet another way to rewrite the example:

<address-book xmlns="http://www.example.com/address-book/">
    <person id="1">
        <firstname>Rasmus</firstname>
        <lastname>Lerdorf</lastname>
        <city>Sunnyvale</city>
        <state>CA</state>
        <email>rasmus@php.net</email>
    </person>

    <!-- more entries here -->
</address-book>
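
As a quick check of what the default namespace covers, this short Python sketch (standard library xml.etree.ElementTree; the shortened document and variable names are assumptions for illustration) parses a default-namespace version of the address book and prints the resulting names. The elements end up in the namespace even though they carry no prefix, while the nonprefixed id attribute is reported with no namespace, because default namespace declarations apply to element names only:

```python
import xml.etree.ElementTree as ET

# Shortened version of the address book that relies on a default namespace.
doc = """<address-book xmlns="http://www.example.com/address-book/">
    <person id="1">
        <firstname>Rasmus</firstname>
    </person>
</address-book>"""

root = ET.fromstring(doc)
person = root[0]

print(root.tag)        # {http://www.example.com/address-book/}address-book
print(person.tag)      # {http://www.example.com/address-book/}person
print(person.attrib)   # {'id': '1'}
```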

It is not uncommon to find a document that uses multiple namespaces. One is declared the default namespace, and the others are given prefixes.
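
A sketch of such a mixed document, and of querying it, might look like the following. It uses Python's standard xml.etree.ElementTree module. The second namespace, http://www.example.com/person-title/, and the prefixes pt, book, and honorific are made up for this illustration. Note that the prefixes used in the query do not need to match the ones in the document; only the namespace strings must agree:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the address book is the default namespace, and
# an invented "person title" namespace is bound to the prefix pt.
doc = """<address-book xmlns="http://www.example.com/address-book/"
              xmlns:pt="http://www.example.com/person-title/">
    <person id="1">
        <pt:title>Mister</pt:title>
        <firstname>Rasmus</firstname>
    </person>
</address-book>"""

root = ET.fromstring(doc)

# Prefix-to-namespace mapping used only by the queries below.
ns = {
    "book": "http://www.example.com/address-book/",
    "honorific": "http://www.example.com/person-title/",
}

firstname = root.find("book:person/book:firstname", ns).text
title = root.find("book:person/honorific:title", ns).text

# A query without any prefix looks for an element in no namespace,
# so it does not match the person element in the default namespace.
unqualified = root.find("person")

print(firstname)     # Rasmus
print(title)         # Mister
print(unqualified)   # None
```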

For more on XML Namespaces, read Chapter 4 of XML in a Nutshell by Elliotte Rusty Harold and W. Scott Means (O'Reilly) or see the W3C specification at http://www.w3.org/TR/REC-xml-names/.