Hack 42 Create and Process SpreadsheetML

figs/expert.gif figs/hack42.gif

Since Excel XP, Excel has included an XML export option. SpreadsheetML provides an XML representation of your spreadsheets, complete with formatting and formula information.

Although there are several ways to read and write Excel spreadsheet files without using Excel, one of the easiest options is to import and export XML files that use Microsoft's SpreadsheetML vocabulary. SpreadsheetML isn't complete?most notably, charts and VBA code are omitted?but it does represent the core components of a spreadsheet, including formulas, named ranges, and formatting.

This hack uses Excel features that are available only in Excel XP and Excel 2003 on Windows. Earlier versions of Excel do not support this, and neither do current or announced Macintosh versions of Excel.

The easiest way to get started with SpreadsheetML is to save a spreadsheet as XML. The spreadsheet shown in Figure 3-20 includes data, formulas, named ranges and cells, and some simple formatting.

Figure 3-20. A test spreadsheet for SpreadsheetML
figs/xmlh_0320.gif


If you save the spreadsheet using the XML Spreadsheet (*.xml) format, which you can access by selecting File Save As, you'll get an XML document named SpreadsheetML.xml containing the markup shown in Example 3-22. Key portions are highlighted in bold.

Example 3-22. A SpreadsheetML document
<?xml version="1.0"?>

<?mso-application progid="Excel.Sheet"?>

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"

 xmlns:o="urn:schemas-microsoft-com:office:office"

 xmlns:x="urn:schemas-microsoft-com:office:excel"

 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

 xmlns:html="http://www.w3.org/TR/REC-html40">

 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">

  <Author>Simon St.Laurent</Author>

  <LastAuthor>Simon St.Laurent</LastAuthor>

  <Created>2003-12-03T15:48:38Z</Created>

  <LastSaved>2004-01-26T21:04:14Z</LastSaved>

  <Company>O'Reilly &amp; Associates</Company>

  <Version>11.5703</Version>

 </DocumentProperties>

 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">

  <WindowHeight>6150</WindowHeight>

  <WindowWidth>8475</WindowWidth>

  <WindowTopX>120</WindowTopX>

  <WindowTopY>30</WindowTopY>

  <ProtectStructure>False</ProtectStructure>

  <ProtectWindows>False</ProtectWindows>

 </ExcelWorkbook>

 <Styles>

  <Style ss:ID="Default" ss:Name="Normal">

   <Alignment ss:Vertical="Bottom"/>

   <Borders/>

   <Font/>

   <Interior/>

   <NumberFormat/>

   <Protection/>

  </Style>

  <Style ss:ID="s21">

   <NumberFormat ss:Format="mmm\-yy"/>

  </Style>

  <Style ss:ID="s22">

   <NumberFormat ss:Format="&quot;$&quot;#,##0.00"/>

  </Style>

  <Style ss:ID="s23">

   <Font x:Family="Swiss" ss:Bold="1"/>

  </Style>

 </Styles>

 <Names>

  <NamedRange ss:Name="Critters" 

              ss:RefersTo="=Sheet1!R4C2:R11C2"/>

  <NamedRange ss:Name="Date" ss:RefersTo="=Sheet1!R1C2"/>

  <NamedRange ss:Name="ID" ss:RefersTo="=Sheet1!R4C1:R11C1"/>

  <NamedRange ss:Name="Price" 

              ss:RefersTo="=Sheet1!R4C3:R11C3"/>

  <NamedRange ss:Name="Quantity" 

              ss:RefersTo="=Sheet1!R4C4:R11C4"/>

  <NamedRange ss:Name="Total" ss:RefersTo="=Sheet1!R12C5"/>

 </Names>

 <Worksheet ss:Name="Sheet1">

  <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="12" 

         x:FullColumns="1" x:FullRows="1">

   <Column ss:AutoFitWidth="0" ss:Width="73.5"/>

   <Column ss:AutoFitWidth="0" ss:Width="96.75"/>

   <Column ss:Index="5" ss:AutoFitWidth="0" ss:Width="56.25"/>

   <Row>

    <Cell ss:StyleID="s23"><Data ss:Type="String">Sales 

      for:</Data></Cell>

    <Cell ss:StyleID="s21"><Data ss:Type="DateTime">

      2004-01-01T00:00:00.000</Data><NamedCell

      ss:Name="Date"/></Cell>

   </Row>

   <Row ss:Index="3" ss:StyleID="s23">

    <Cell><Data ss:Type="String">ID Number</Data></Cell>

    <Cell><Data ss:Type="String">Critter</Data></Cell>

    <Cell><Data ss:Type="String">Price</Data></Cell>

    <Cell><Data ss:Type="String">Quantity</Data></Cell>

    <Cell><Data ss:Type="String">Total</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">4627</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Diplodocus</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data ss:Type="Number">22.5</Data>

      <NamedCell ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">127</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">2857.5</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">3912</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Brontosaurus</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data ss:Type="Number">17.5</Data>

      <NamedCell ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">74</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">1295</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">9845</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Triceratops</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data ss:Type="Number">12</Data>

      <NamedCell ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">91</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">1092</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">9625</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Vulcanodon</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data ss:Type="Number">19</Data>

      <NamedCell ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">108</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">2052</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">5903</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Stegosaurus</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data ss:Type="Number">18.5</Data>

      <NamedCell ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">63</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">1165.5</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">1824</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Monoclonius</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data 

      ss:Type="Number">16.5</Data><NamedCell

      ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">133</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">2194.5</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">9728</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Megalosaurus</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data 

      ss:Type="Number">23</Data><NamedCell

      ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">128</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">2944</Data></Cell>

   </Row>

   <Row>

    <Cell><Data ss:Type="Number">8649</Data><NamedCell 

      ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String">Barosaurus</Data><NamedCell 

      ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data 

      ss:Type="Number">17</Data><NamedCell

      ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number">91</Data><NamedCell 

      ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

      ss:Type="Number">1547</Data></Cell>

   </Row>

   <Row>

    <Cell ss:Index="4" ss:StyleID="s23"><Data 

      ss:Type="String">Total:</Data></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=SUM(R[-8]C:R[-1]C)">

      <Data ss:Type="Number">15147.5</Data><NamedCell

      ss:Name="Total"/></Cell>

   </Row>

  </Table>

  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">

   <Print>

    <ValidPrinterInfo/>

    <HorizontalResolution>600</HorizontalResolution>

    <VerticalResolution>600</VerticalResolution>

   </Print>

   <Selected/>

   <Panes>

    <Pane>

     <Number>3</Number>

     <ActiveRow>11</ActiveRow>

     <ActiveCol>4</ActiveCol>

    </Pane>

   </Panes>

   <ProtectObjects>False</ProtectObjects>

   <ProtectScenarios>False</ProtectScenarios>

  </WorksheetOptions>

 </Worksheet>

 <Worksheet ss:Name="Sheet2">

  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">

   <ProtectObjects>False</ProtectObjects>

   <ProtectScenarios>False</ProtectScenarios>

  </WorksheetOptions>

 </Worksheet>

 <Worksheet ss:Name="Sheet3">

  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">

   <ProtectObjects>False</ProtectObjects>

   <ProtectScenarios>False</ProtectScenarios>

  </WorksheetOptions>

 </Worksheet>

</Workbook>

The first highlighted line, <?mso-application progid="Excel.Sheet"?>, is an XML processing instruction that tells Windows (actually, a component Office 2003 adds to Windows) that this XML document is, in fact, an Excel spreadsheet. When Windows displays the file, it will have an Excel logo on it, and double-clicking it will open it in Excel.

The root element of the document, Worksheet, appears immediately after the processing instruction. Its attributes define namespaces used for various pieces of SpreadsheetML. The next few lines comprise mostly metadata, window presentation, and formatting information, and it isn't until you get to the Names and Worksheet elements that there's much worth examining closely.

The Names element identifies the named ranges and cells in the document. These two NamedRange elements define the Quantity named range?which extends from row 4, column 4, to row 11, column 4?and the Total named range, which is just the cell in row 12 of column 5:

  <NamedRange ss:Name="Quantity" ss:RefersTo="=Sheet1!R4C4:R11C4"/>

  <NamedRange ss:Name="Total" ss:RefersTo="=Sheet1!R12C5"/>

The meat of the spreadsheet is in the Worksheet element. It starts by defining how large the actual table of data is:

  <Worksheet ss:Name="Sheet1">

  <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="12" 

  x:FullColumns="1" x:FullRows="1">

This sheet, named Sheet1, used 5 columns and 12 rows. (The x:FullColumns and x:FullRows attributes are in another namespace that Excel won't use for layout.) The actual information in the table is stored in Row and Cell elements:

   <Row>

    <Cell ss:StyleID="s23"><Data ss:Type="String">Sales

      for:</Data></Cell>

    <Cell ss:StyleID="s21"><Data 

      ss:Type="DateTime">2004-01-01T00:00:00.000</Data>

      <NamedCell ss:Name="Date"/></Cell>

   </Row>

   <Row ss:Index="3" ss:StyleID="s23">

    <Cell><Data ss:Type="String">ID Number</Data></Cell>

This Row, the first in the spreadsheet, contains two Cell elements. The first, formatted as s23 (bold, in this spreadsheet) and using the datatype String, contains the text "Sales for:". The second cell is formatted as s21 (plain), and uses the datatype DateTime. Its contents are given in a verbose ISO 8601 format. This cell also is part of a named range, in this case, Date.

Most of the other Row elements follow similar patterns, but there are a few items worth extra attention. The second Row element has an extra attribute on it, ss:Index:

   <Row ss:Index="3" ss:StyleID="s23">

Excel doesn't represent empty rows or empty columns with empty Row or Cell elements. It just adds an ss:Index attribute to the next Row or Cell with content to tell you where you are. This requires programs that process this XML to pay a little more attention when assembling their tables. The other thing to watch is formulas:

    <Cell ss:StyleID="s22" ss:Formula="=SUM(R[-8]C:R[-1]C)">

      <Data ss:Type="Number">15147.5</Data><NamedCell

      ss:Name="Total"/></Cell>

In Figure 3-19 this cell had a name of Total, a value of $15,147.50, and a formula of =SUM(E4:E11). All of those parts are here. But you must assemble them from the style of s22 (defined earlier in the document as a monetary number format), the value 15147.5, and a formula that uses relative references to say "the sum of the values in the same column as this one from 8 rows up to 1 row up."

This might not seem like much fun to process, but it's actually not that hard to do once you have an XML toolkit. You can use C#, Java, Perl, Python, VB, or your favorite XML-enabled programming language to extract the information, but we'll use XSLT to demonstrate.

The stylesheet in Example 3-23, SpreadsheetML.xsl, run against the XML in Example 3-22, will produce the much simpler XML in Example 3-24.

Example 3-23. An XSLT stylesheet for extracting content from the SpreadsheetML in Example 3-22
<xsl:stylesheet version="1.0" 

  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

  xmlns="http://simonstl.com/ns/dinosaurs/"

  xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

 >

   

<xsl:output method="xml" omit-xml-declaration="yes" 

     indent="yes" encoding="US-ASCII"/>

   

<xsl:template match="/">

  <xsl:apply-templates select="ss:Workbook"/> 

</xsl:template>

   

<xsl:template match="ss:Workbook">

  <dinosaurs>

       <xsl:apply-templates 

            select="ss:Worksheet[@ss:Name = 'Sheet1']"/> 

  </dinosaurs>

</xsl:template>

   

<xsl:template match="ss:Worksheet">

   <date><xsl:value-of 

            select="ss:Table/ss:Row/ss:Cell[@ss:StyleID = 

              's21']" />

   </date>

   <xsl:apply-templates select="ss:Table" />

</xsl:template>

   

<xsl:template match="ss:Table">

   <xsl:apply-templates select="ss:Row[position( ) &gt; 2]" />

<!--Note that because Excel skips the blank row, 

the third row is in position 2-->

</xsl:template>

   

<xsl:template match="ss:Row[ss:Cell[4]]">

<sale>

   <IDnum><xsl:apply-templates select="ss:Cell[1]" /></IDnum>

   <critter><xsl:apply-templates select="ss:Cell[2]" 

     /></critter>

   <price><xsl:apply-templates select="ss:Cell[3]" /></price>

   <quantity><xsl:apply-templates select="ss:Cell[4]" 

     /></quantity>

   <total><xsl:apply-templates select="ss:Cell[5]" /></

total>

</sale>

</xsl:template>

   

<xsl:template match="ss:Row">

<total><xsl:apply-templates select="ss:Cell[2]" /></total>

</xsl:template>

   

</xsl:stylesheet>

The heart of the stylesheet is the template that matches all rows with four or more child cell elements. It extracts the information from the cells and puts it into XML elements that reflect the data, producing the result shown in Example 3-24, dinosaurs.xml.

Example 3-24. From SpreadsheetML to a custom XML vocabulary
<dinosaurs xmlns="http://simonstl.com/ns/dinosaurs/" 

xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">

<date>2004-01-01T00:00:00.000</date>

<sale>

<IDnum>4627</IDnum>

<critter>Diplodocus</critter>

<price>22.5</price>

<quantity>127</quantity>

<total>2857.5</total>

</sale>

<sale>

<IDnum>3912</IDnum>

<critter>Brontosaurus</critter>

<price>17.5</price>

<quantity>74</quantity>

<total>1295</total>

</sale>

<sale>

<IDnum>9845</IDnum>

<critter>Triceratops</critter>

<price>12</price>

<quantity>91</quantity>

<total>1092</total>

</sale>

<sale>

<IDnum>9625</IDnum>

<critter>Vulcanodon</critter>

<price>19</price>

<quantity>108</quantity>

<total>2052</total>

</sale>

<sale>

<IDnum>5903</IDnum>

<critter>Stegosaurus</critter>

<price>18.5</price>

<quantity>63</quantity>

<total>1165.5</total>

</sale>

<sale>

<IDnum>1824</IDnum>

<critter>Monoclonius</critter>

<price>16.5</price>

<quantity>133</quantity>

<total>2194.5</total>

</sale>

<sale>

<IDnum>9728</IDnum>

<critter>Megalosaurus</critter>

<price>23</price>

<quantity>128</quantity>

<total>2944</total>

</sale>

<sale>

<IDnum>8649</IDnum>

<critter>Barosaurus</critter>

<price>17</price>

<quantity>91</quantity>

<total>1547</total>

</sale>

<total>15147.5</total>

</dinosaurs>

It's the same data, but in a very different form. The formula information has been discarded in this case, but because Excel provides the values as well as the formulas, this particular application didn't need to understand the formulas.

It's also possible to round-trip this data back into Excel, again using XSLT. The stylesheet in Example 3-25, toSpreadsheetML.xsl, uses the original spreadsheet as a template, and will produce XML similar to the SpreadsheetML you saved from Excel originally. I've left out some formatting so that there's a visible difference.

Example 3-25. An XSLT stylesheet for converting the custom XML vocabulary back to SpreadsheetML
<xsl:stylesheet version="1.0" 

  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

  xmlns:d="http://simonstl.com/ns/dinosaurs/"

  xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

  xmlns="urn:schemas-microsoft-com:office:spreadsheet"

 >

   

<xsl:output method="xml" omit-xml-declaration="no" indent="yes" encoding="US-ASCII"/>

   

<xsl:template match="/">

  <xsl:apply-templates select="d:dinosaurs" />

</xsl:template>

   

<xsl:template match="d:dinosaurs">

   

<xsl:processing-instruction name="mso-application">progid="Excel.Sheet"

</xsl:processing-instruction>

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"

 xmlns:o="urn:schemas-microsoft-com:office:office"

 xmlns:x="urn:schemas-microsoft-com:office:excel"

 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

 xmlns:html="http://www.w3.org/TR/REC-html40">

 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">

  <Author>Simon St.Laurent</Author>

  <LastAuthor>Simon St.Laurent</LastAuthor>

  <Created>2003-12-03T15:48:38Z</Created>

  <LastSaved>2003-12-03T15:57:46Z</LastSaved>

  <Company>O'Reilly &amp; Associates</Company>

  <Version>11.5606</Version>

 </DocumentProperties>

 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">

  <WindowHeight>6150</WindowHeight>

  <WindowWidth>8475</WindowWidth>

  <WindowTopX>120</WindowTopX>

  <WindowTopY>30</WindowTopY>

  <ProtectStructure>False</ProtectStructure>

  <ProtectWindows>False</ProtectWindows>

 </ExcelWorkbook>

 <Styles>

  <Style ss:ID="Default" ss:Name="Normal">

   <Alignment ss:Vertical="Bottom"/>

   <Borders/>

   <Font/>

   <Interior/>

   <NumberFormat/>

   <Protection/>

  </Style>

  <Style ss:ID="s21">

   <NumberFormat ss:Format="mmm\-yy"/>

  </Style>

  <Style ss:ID="s22">

   <NumberFormat ss:Format="&quot;$&quot;#,##0.00"/>

  </Style>

 </Styles>

 <Worksheet ss:Name="Sheet1">

  <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="{count(d:sale)+4}" x:

FullColumns="1"

   x:FullRows="1">

   <Column ss:AutoFitWidth="0" ss:Width="73.5"/>

   <Column ss:AutoFitWidth="0" ss:Width="96.75"/>

   <Column ss:Index="5" ss:AutoFitWidth="0" ss:Width="56.25"/>

   <Row>

    <Cell><Data ss:Type="String">Sales for:</Data></Cell>

    <Cell ss:StyleID="s21"><Data ss:Type="DateTime">

      <xsl:value-of select="d:date"/></Data></Cell>

   </Row>

   <Row ss:Index="3">

    <Cell><Data ss:Type="String">ID Number</Data></Cell>

    <Cell><Data ss:Type="String">Critter</Data></Cell>

    <Cell><Data ss:Type="String">Price</Data></Cell>

    <Cell><Data ss:Type="String">Quantity</Data></Cell>

    <Cell><Data ss:Type="String">Total</Data></Cell>

   </Row>

   

<xsl:apply-templates select="d:sale" />

   

   <Row>

    <Cell ss:Index="4"><Data ss:Type="String">Total:</Data></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=SUM(R[-{count(d:sale)}]C:R[-1]C)"><Data 

ss:Type="Number"></Data></Cell>

   </Row>

  </Table>

  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">

   <Print>

    <ValidPrinterInfo/>

    <HorizontalResolution>600</HorizontalResolution>

    <VerticalResolution>600</VerticalResolution>

   </Print>

   <Selected/>

   <Panes>

    <Pane>

     <Number>3</Number>

     <ActiveRow>12</ActiveRow>

     <ActiveCol>1</ActiveCol>

    </Pane>

   </Panes>

   <ProtectObjects>False</ProtectObjects>

   <ProtectScenarios>False</ProtectScenarios>

  </WorksheetOptions>

 </Worksheet>

 <Worksheet ss:Name="Sheet2">

  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">

   <ProtectObjects>False</ProtectObjects>

   <ProtectScenarios>False</ProtectScenarios>

  </WorksheetOptions>

 </Worksheet>

 <Worksheet ss:Name="Sheet3">

  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">

   <ProtectObjects>False</ProtectObjects>

   <ProtectScenarios>False</ProtectScenarios>

  </WorksheetOptions>

 </Worksheet>

</Workbook>

</xsl:template>

   

<xsl:template match="d:sale">

   <Row>

    <Cell><Data ss:Type="Number"><xsl:value-of select="d:IDnum" /></

Data><NamedCell ss:Name="ID"/></Cell>

    <Cell><Data ss:Type="String"><xsl:value-of select="d:critter" /></

Data><NamedCell ss:Name="Critters"/></Cell>

    <Cell ss:StyleID="s22"><Data ss:Type="Number"><xsl:value-of select=

"d:price" /></Data><NamedCell

      ss:Name="Price"/></Cell>

    <Cell><Data ss:Type="Number"><xsl:value-of select="d:quantity" /></

Data><NamedCell ss:Name="Quantity"/></Cell>

    <Cell ss:StyleID="s22" ss:Formula="=RC[-2]*RC[-1]"><Data 

ss:Type="Number"><xsl:value-of select="d:total" /></Data></Cell>

   </Row>

</xsl:template>

   

<xsl:template match="d:date" />

<xsl:template match="d:total" />

   

</xsl:stylesheet>

A few pieces of this example are worth special attention. First, note that the SpreadsheetML is wrapped in XSLT; the SpreadsheetML becomes part of the stylesheet. There's one piece of the SpreadsheetML you can't re-create with this method: the processing instruction noted earlier that tells Windows that this is an Excel spreadsheet. For that, you have to manually insert the following:

<xsl:processing-instruction name="mso-application">progid="Excel.Sheet"</xsl:processing-instruction>

Because XPath 1.0 won't allow you to use the default namespace (no prefix) to refer to content that has a namespace, all the references to content in the source document now have the prefix d:, such as d:sale, d:date, etc.

Also, because the named ranges will vary depending on the number of sale elements in the original, this stylesheet won't generate the Names element and its contents. Excel will re-create the named ranges from the NamedCell elements in any case. The heart of this stylesheet is again the part that generates the Row and Cell elements, as shown in the following:

<xsl:template match="d:sale">

   <Row>

    <Cell><Data ss:Type="Number"><xsl:value-of select="d:IDnum" /></Data><NamedCell ss:Name="ID"/></Cell>

The xsl:template element will collect every sale element in the original and produce a Row element that contains Cell elements matching its contents. If you open in Excel the SpreadsheetML that this stylesheet produces (which looks much like that in Example 3-22, minus named ranges and some formatting), you get the result shown in Figure 3-21.

Figure 3-21. The test spreadsheet after its data has gone from SpreadsheetML to another vocabulary and back again
figs/xmlh_0321.gif


This process might not look very beautiful, but there are lots of reasons you might want to follow it. For one, saving as SpreadsheetML gives you better access to the XML map information described in [Hack #15] than Excel's GUI offers at present. More importantly in the long run, SpreadsheetML is portable, and you can process it and generate it on virtually any computer that has basic XML facilities.

?Simon St.Laurent



     
    ASPTreeView.com
     
    Evaluation has ФВјТУЧКїexpired.
    Info...