7.1 History

In the very early days of markup languages, the only way to transform documents was by writing a custom software application to do it. Before SGML and XML, this was excruciating at best. Presentational markup is quite difficult to interpret in any way other than the device-dependent behavior it encodes.

SGML made it much easier for applications to manipulate documents. However, any transformation process was tied to a particular programming platform, making it difficult to share with others. The SGML community really needed a portable language that was designed specifically for SGML transformations and that supported the nuances of print publishing (the major use of SGML at the time). The first solution to address these needs was the Document Style Semantics and Specification Language (DSSSL).

DSSSL (pronounced "dissel") was completed in 1996 under the auspices of the ISO working group for Document Description and Processing Languages. It laid out the fundamental rules for describing the parts of a formatted document, rules that inspired later efforts including XSL and CSS. Concepts such as bounding boxes and font properties are painstakingly defined in the standard.

If you look at a DSSSL script, you'll see that it is a no-fooling-around programming language. The syntax is Scheme, a dialect of Lisp. You have to be a pretty good programmer to be able to work with it, and the parentheses might drive some to distraction. There is really nothing you can't do with DSSSL, but for most transformations, it may be overly complex. I certainly don't miss it.

As XML gained prominence, the early adopters and developers began to map out a strategy for high-quality formatting. They looked at DSSSL and decided it suffered from the same problems as SGML: too big, too hard to learn, not easy to implement. James Clark, a pioneer on the text-processing frontier who was instrumental in DSSSL development, took what he had learned and began to work on a slimmed-down successor. Thus was born the Extensible Stylesheet Language (XSL).

XSL is really three technologies rolled into one:

  • XPath, for finding and handling document components.

  • XSL Transformations, for changing any XML document into a presentational XSL Formatting Object tree.

  • XSL Formatting Objects, a markup language for high-quality formatted text.

XSL Transformations (XSLT) is the subject of this chapter (we already covered XPath in the last chapter, and will visit XSL Formatting Objects in the next). First published as a recommendation by the W3C in 1999, it was originally designed just to transform an XML document into an XSL Formatting Object tree, which is why it retains the "XSL" in its name. So it was surprising to everybody involved when XSLT became the generic transformation language of choice.

In retrospect, it is not so surprising. XSLT is a brilliantly designed language. It is simple enough that its basics can be learned in an hour. XPath is elegant and intuitive. The idea of templates is natural and flexible enough to apply to a wide variety of situations. And because XSLT is itself an XML application, all the XML APIs and tools will happily dissect and manipulate XSLT stylesheets.
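
To make that last point concrete, here is a minimal sketch in Python that treats a stylesheet as ordinary XML and lists the match pattern of each template. The tiny stylesheet and the helper name template_patterns are hypothetical, made up for this illustration; only the standard xml.etree.ElementTree module and the XSLT 1.0 namespace URI are real.

```python
import xml.etree.ElementTree as ET

# Namespace URI that identifies XSLT elements.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# A tiny, hypothetical stylesheet used only for illustration.
STYLESHEET = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body><xsl:apply-templates/></body></html>
  </xsl:template>
  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
"""

def template_patterns(stylesheet_text):
    """Return the match pattern of every top-level xsl:template."""
    # The stylesheet parses like any other XML document.
    root = ET.fromstring(stylesheet_text)
    # ElementTree spells qualified names as {namespace-uri}local-name.
    return [t.get("match") for t in root.findall(f"{{{XSL_NS}}}template")]

patterns = template_patterns(STYLESHEET)
print(patterns)
```

Nothing here is specific to XSLT: the parser neither knows nor cares that the document is a stylesheet, which is exactly why generic XML tools can inspect, generate, or rewrite stylesheets.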

In this chapter, I will show you not only how to use XSLT to generate formatted (presentational) output, but also how to apply it to a wide variety of problems. Once you have the transformation mindset, you'll find that it's useful in so many ways. Things you used to write programs to do can be done much more succinctly and clearly in an XSLT stylesheet.