Hack 15 Work with XML in Microsoft Excel 2003

figs/moderate.gif figs/hack15.gif

Using table-structured data or spreadsheets? Open, format, and save XML documents with Excel 2003.

Microsoft Excel 2003 offers unprecedented XML support. As with Word, full XML features are not available except with the Microsoft Office Professional Edition 2003, Enterprise Edition 2003, and the individual version. Other versions (the Standard or Small Business editions) won't have XML support except for the ability to save a file in SpreadsheetML format.

Excel 2003 allows you to open an XML document and then save or export data as XML. Choose File Open and then navigate to the file time.xml in the Open dialog box. Select the file and then click Open. You can open the document in one of three ways: as an XML list, as a read-only workbook, or by using the XML Source task pane.

When you open an XML file as an XML list, Excel automatically creates an XML Schema that corresponds with the XML (it warns you of that). It also maps each of the attribute values and the content from each of the elements to a cell in the spreadsheet. The XML Source task pane lists each of the elements in the imported document in a tree view. As a cell is highlighted in the spreadsheet, the corresponding element or attribute is highlighted in the task pane.

If the task pane does not appear automatically, choose Data XML XML Source.

In Figure 2-12, notice that in the XML Source pane, the hour element is highlighted; it is associated through a mapping with cell B1, which is also highlighted. If you were to select cell C1, the minute element in the XML Source pane would be highlighted.

Figure 2-12. Mapping time.xml to fields in Excel 2003

If you open an XML document as a read-only workbook, no mappings or schema generation occurs, but cells are labeled with the names of elements and attributes automatically, with the labels resembling XPath location paths.

When an XML document is opened by using the Use the XML Source task pane option, a schema is created, and the elements appear in a tree view. However, no data is imported into the spreadsheet and mappings between cells, and elements or attributes must be done manually, but this gives you more control over where the data winds up.

Under the Data XML menu, you have several choices for working with XML. You can import or export data (you cannot export read-only worksheets, though), refresh XML data, adjust XML map properties, or open XML expansion packs. Expansion packs are used for converting workbooks into smart documents. Smart documents can help you complete tasks, fill in information, use boilerplates, and so forth.

If you are working with a regular Excel spreadsheet, like the loan calculator that appears in Figure 2-13, you can save it in Excel's SpreadsheetML format. (You can't save native XML documents in this format.) Choose File Save As, and in the "Save as type" pull-down menu, select XML. Enter the filename (loan.xml in this example) and click the Save button. Analysis of this file yields that it has over 18,000 lines, 25,667 elements, and 28,697 attributes?by no means a small specimen. A portion of this file follows. I find SpreadsheetML files more humanly readable than the WordprocessingML format. They are nicely indented and generally eschew unnecessary element name prefixing.

<?xml version="1.0"?>

<?mso-application progid="Excel.Sheet"?>

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"





 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">

  <Author>Microsoft Corporation</Author>

  <LastAuthor>Mike Fitzgerald</LastAuthor>




  <Company>Microsoft Corporation</Company>



 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
















 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">










Figure 2-13. loan.xls in Excel 2003

2.6.1 See Also

  • Office 2003 XML, by Evan Lenz, Mary McRae, and Simon St.Laurent (O'Reilly)

  • SpreadsheetML documentation and schema: http://www.microsoft.com/downloads/details.aspx?FamilyID=fe118952-3547-420a-a412-00a2662442d9&displaylang=en