Hack 26 Include External Documents with XInclude

Beyond entity inclusion, there is another mechanism for including external text and documents. It's called XInclude.

XML Inclusions, or XInclude, is still a working draft specification at W3C (http://www.w3.org/TR/xinclude/), but it is being implemented with reasonable confidence as a step up from external entities. XInclude allows on-the-spot replacement of text or markup, without a DTD or entity reference. It also has a fallback mechanism in case something goes haywire.

The main feature of XInclude is the include element. The namespace name for XInclude is http://www.w3.org/2003/XInclude, and the common prefix is xi.

This hack is based on the November 2003 working draft of XInclude (http://www.w3.org/TR/2003/WD-xinclude-20031110/). The candidate recommendation for XInclude was issued in April 2004, just as I was finishing this book. The candidate rec reverts to an earlier namespace name, http://www.w3.org/2001/XInclude. I am using the working draft's namespace URI, http://www.w3.org/2003/XInclude, so that it works with the software I'm using here, but it's likely that the software will catch up with the current version of the specification soon, and you'll then have to change the namespace URI to get it to work.


The include element has seven possible attributes: href, xpointer, parse, encoding, accept, accept-charset, and accept-language. The href attribute is mandatory unless xpointer is present (and vice versa), because you have to point to the included text or markup using one or the other. href retrieves a whole document, but xpointer (based on http://www.w3.org/TR/xptr-framework/) can supposedly pinpoint a location in a document and retrieve it. The parse attribute can have one of two possible values: text or xml. If the value is text, the included resource must be made up of characters; if xml, the resource must be well-formed XML.
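
To make the difference between parse="xml" and parse="text" concrete, here is a minimal Python sketch. It relies on the standard library's xml.etree.ElementInclude module, which recognizes only the http://www.w3.org/2001/XInclude namespace, so the sample document uses that URI rather than the 2003 draft one. The in-memory RESOURCES table, the memory_loader function, and the file names are hypothetical stand-ins for real files or URLs.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory resources standing in for real files or URLs.
RESOURCES = {
    "atomic.xml": '<atomic signal="true"/>',
    "note.txt": "checked against <NIST>",
}

def memory_loader(href, parse, encoding=None):
    """Return an Element for parse="xml" and a plain string for parse="text"."""
    data = RESOURCES[href]
    if parse == "xml":
        return ET.fromstring(data)
    return data

# Note the 2001 namespace: it is the one ElementInclude looks for.
DOC = """<time xmlns:xi="http://www.w3.org/2001/XInclude">
 <xi:include href="atomic.xml" parse="xml"/>
 <note><xi:include href="note.txt" parse="text"/></note>
</time>"""

root = ET.fromstring(DOC)
ElementInclude.include(root, loader=memory_loader)
result = ET.tostring(root, encoding="unicode")
print(result)
```

The XML resource arrives as an element in the tree, while the text resource arrives as character data, so its angle brackets are escaped when the result is serialized.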

What Is XPointer?

The XML Pointer Language (XPointer) is a language for creating fragment identifiers for URI references to XML documents (the media types text/xml and application/xml). It is specified in three W3C recommendations: the XPointer Framework (http://www.w3.org/TR/xptr-framework/), the XPointer element() scheme (http://www.w3.org/TR/xptr-element/), and the XPointer xmlns() scheme (http://www.w3.org/TR/xptr-xmlns/). Beyond element() and xmlns(), XPointer also has an xpointer() scheme; however, xpointer(), which is based on XPath, is not yet fully developed. In other words, XPointer is not quite ready for prime time. However, Mozilla currently offers partial XPointer support, with additional non-W3C schemes (http://www.mozilla.org/newlayout/xml/#linking).


The encoding attribute may be used only when the value of parse is text. It is intended to help an XInclude processor figure out the character encoding of the included text. Legal values are encoding="UTF-8", encoding="UTF-16", and others as specified by http://www.w3.org/TR/REC-xml.html/#charencoding.
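
The following sketch shows one way a processor might put the encoding attribute to work. It again leans on Python's xml.etree.ElementInclude (and therefore on the 2001 namespace URI), which passes the attribute's value to the loader as a third argument for text inclusions. The RAW_BYTES table, the decoding_loader function, and greeting.txt are hypothetical names invented for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical resource: the raw bytes of a UTF-16 text file, held in memory.
RAW_BYTES = {"greeting.txt": "Zürich".encode("utf-16")}

def decoding_loader(href, parse, encoding=None):
    """Decode a text resource with the declared encoding (UTF-8 if none)."""
    if parse != "text":
        raise ValueError("this sketch only handles parse='text'")
    return RAW_BYTES[href].decode(encoding or "utf-8")

DOC = """<p xmlns:xi="http://www.w3.org/2001/XInclude"><xi:include
    href="greeting.txt" parse="text" encoding="UTF-16"/></p>"""

p = ET.fromstring(DOC)
ElementInclude.include(p, loader=decoding_loader)
print(p.text)
```

Without the encoding hint, this loader would try to read the UTF-16 bytes as UTF-8 and fail, which is the kind of guesswork the attribute is meant to prevent.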

The last three attributes, accept, accept-charset, and accept-language, are used for content negotiation when the resource is transported over HTTP (see Section 14 of http://www.ietf.org/rfc/rfc2616.txt). Each of these attribute names matches an HTTP message header name: Accept, Accept-Charset, and Accept-Language, respectively. If any of these attributes is used, and a matching header exists in the HTTP request, the value of the attribute should match the header field value. For example, if accept-language="de" is on the include element, then the HTTP message header in the request should be Accept-Language: de.
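
As a rough illustration of that mapping, this Python sketch collects the request headers a processor could send for a given include element. The function negotiation_headers and the ATTR_TO_HEADER table are hypothetical names, not part of any XInclude implementation, and the sketch only builds the headers without making a request.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2003/XInclude"  # draft namespace used in this hack

# Map each negotiation attribute to the HTTP header it mirrors.
ATTR_TO_HEADER = {
    "accept": "Accept",
    "accept-charset": "Accept-Charset",
    "accept-language": "Accept-Language",
}

def negotiation_headers(include_elem):
    """Collect HTTP request headers from an include element's attributes."""
    return {
        header: include_elem.get(attr)
        for attr, header in ATTR_TO_HEADER.items()
        if include_elem.get(attr) is not None
    }

elem = ET.fromstring(
    '<xi:include xmlns:xi="%s" href="http://www.wyeast.net/rmt.ent" '
    'accept-language="de"/>' % XI_NS
)
headers = negotiation_headers(elem)
print(headers)
```

Attributes that are absent produce no header at all, so the server's defaults apply to whatever is left unspecified.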

include may have one and only one possible element child, fallback, which provides the fallback or fail-safe mechanism mentioned earlier. If the resource include is trying to reach is unavailable for some reason, the content of the fallback element is used instead. If the fallback element is empty, include is simply ignored in the output.
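
The fallback behavior can be sketched by hand in a few lines of Python. This is not how any particular processor is written; it is a simplified illustration that handles only href on include, assumes a hypothetical loader(href) that returns an element or raises OSError, and uses the 2003 draft namespace from this hack.

```python
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2003/XInclude}"  # draft namespace used in this hack

def append_text(parent, kept, text):
    """Attach text after the last kept child, or to the parent if none."""
    if not text:
        return
    if kept:
        kept[-1].tail = (kept[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text

def resolve(parent, loader):
    """Replace include children with loaded content or fallback content."""
    kept = []
    for child in list(parent):
        if child.tag == XI + "include":
            try:
                pieces, text = [loader(child.get("href"))], ""
            except OSError:
                fallback = child.find(XI + "fallback")
                if fallback is None:
                    raise  # no fallback child: the failure is fatal
                pieces, text = list(fallback), fallback.text or ""
            append_text(parent, kept, text)
            kept.extend(pieces)
            append_text(parent, kept, child.tail)
        else:
            resolve(child, loader)
            kept.append(child)
    parent[:] = kept

def failing_loader(href):
    """Hypothetical loader that simulates an unreachable resource."""
    raise OSError("cannot reach " + href)

DOC = ('<time xmlns:xi="http://www.w3.org/2003/XInclude"><hour>11</hour>'
       '<xi:include href="http://www.wyeast.net/rmt.xml">'
       '<xi:fallback>Oops!</xi:fallback></xi:include></time>')

root = ET.fromstring(DOC)
resolve(root, failing_loader)
output = ET.tostring(root, encoding="unicode")
print(output)
```

With an empty fallback element, the same function leaves nothing at the point of inclusion, which matches the "simply ignored" behavior described above.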

libxml2's (http://xmlsoft.org) utility xmllint (http://xmlsoft.org/xmllint.html) offers preliminary support for XInclude. Take the document include.xml (Example 2-19), part of the file archive that came with the book (note emphasis):

Example 2-19. include.xml
<?xml version="1.0" encoding="UTF-8"?>

<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <xi:include xmlns:xi="http://www.w3.org/2003/XInclude"
     parse="xml" href="http://www.wyeast.net/rmt.ent">
  <xi:fallback>Oops!</xi:fallback>
 </xi:include>
</time>

The properly namespaced include element is looking for a piece of well-formed XML at http://www.wyeast.net/rmt.ent. If it doesn't find it, the word "Oops!" is written at the point of inclusion in the result. Let's test it with xmllint using the --xinclude option:

xmllint --xinclude include.xml

If everything goes all right, the result should look like Example 2-20.

Example 2-20. xmllint output with XInclude processing
<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true" xml:base="http://www.wyeast.net/rmt.ent"/>
</time>

The XInclude processor preserves the original base URI [Hack #28] of the included atomic element rather than letting it inherit the base URI of include.xml, so it states that URI explicitly with the xml:base attribute (see http://www.w3.org/TR/xinclude/#base). For fun, change rmt.ent in the href attribute on the include element to rmt.xml (which does not exist) to see what happens. You could also make the fallback element empty to see what result you get.
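
To see why the xml:base attribute matters, consider how a relative reference inside the included content would be resolved. The Python sketch below reads xml:base from a trimmed copy of the result and resolves a relative reference against it; clock.xml is a hypothetical file name used only for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

RESULT = """<time timezone="PST">
 <atomic signal="true" xml:base="http://www.wyeast.net/rmt.ent"/>
</time>"""

atomic = ET.fromstring(RESULT).find("atomic")
base = atomic.get(XML_BASE)

# "clock.xml" is a hypothetical relative reference inside the included content.
resolved = urljoin(base, "clock.xml")
print(resolved)
```

The reference resolves against the site the included markup came from, not against wherever include.xml happens to live.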

2.17.1 See Also

  • The XMLmind XML Editor has XInclude support: http://www.xmlmind.com/xmleditor/

  • Apache's Xerces v2.6.1 and above now support XInclude: http://xml.apache.org/news.html#xerces-j-2.6.1

  • Elliotte Rusty Harold's Java XML Object Model (XOM) offers XInclude support with its XIncluder class: http://cafeconleche.org/XOM/

  • W3C's list of XInclude implementations: http://www.w3.org/XML/2002/09/xinclude-implementation



     