Hack 56 Use Lookup Tables with XSLT to Translate FIPS Codes


With XSLT, translate data in a source file by looking up the translation in a lookup table, using FIPS codes as an example.

While writing XSLT transformations, sometimes you need to convert phrases or data elements from the source file. For example, you might be transforming data from one schema to another, and the target schema might use different enumerated values. The source data might contain event-time, while the target schema requires eventTime.

XSLT techniques to make these conversions are well known, and even though they may not exactly be hacks, they are well worth including here. The approach is to create a lookup table that pairs the input and output phrases. There are two variations:

  1. The lookup table is an external XML file.

  2. The lookup table is embedded into the XSLT stylesheet.

With either, the lookup can be done with or without the help of keys, which will often speed up access. These variations are illustrated in this hack.
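
As a small illustration of the idea for the event-time case, here is a minimal sketch that pairs source values with target values in a table embedded in the stylesheet (the embedded-table technique is explained later in this hack). The lu:terms table, the event-place entry, and the kind element are hypothetical names chosen for this sketch; they do not come from any real schema.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:lu="http://example.com/lookup"
    exclude-result-prefixes="lu">

<!-- Hypothetical lookup table: each term pairs a source value
     (from) with the value the target schema expects (to). -->
<lu:terms>
    <term from="event-time" to="eventTime"/>
    <term from="event-place" to="eventPlace"/>
</lu:terms>

<!-- document('') returns the stylesheet itself. -->
<xsl:variable name="terms"
    select="document('')/xsl:stylesheet/lu:terms/term"/>

<!-- Hypothetical source element <kind> that holds an enumerated
     value; its content is replaced by the paired target value. -->
<xsl:template match="kind">
    <xsl:variable name="value" select="string(.)"/>
    <kind><xsl:value-of select="$terms[@from = $value]/@to"/></kind>
</xsl:template>

</xsl:stylesheet>
```

If a source value has no entry in the table, the predicate selects nothing and the output element is empty, so a real stylesheet would probably want a fallback.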

3.27.1 The FIPS Code Example

For a concrete example, this hack translates FIPS (Federal Information Processing Standards) numerical codes into city and state names. FIPS codes are published by the United States government. For example, the state of Indiana has the FIPS code 18, and the city of Bethel Village, which is in Indiana, has a code of 5050. The hack changes these codes into their natural language names.

Here is part of the source document (fips_lu_data.xml in Example 3-49).

Example 3-49. fips_lu_data.xml
<places>
    <place>
        <state>17</state>
        <city>14000</city>
    </place>
    <place>
        <state>17</state>
        <city>57381</city>
    </place>
    <!-- ... -->
</places>

We use just a few cities and states in order to have a short example. Here is the lookup table (fips.xml in Example 3-50).

Example 3-50. fips.xml
<fips>
    <state fips="17" name="ILLINOIS">
        <city fips="57381" name="PALOS HEIGHTS"/>
        <city fips="35307" name="HINSDALE"/>
        <city fips="20149" name="DIXMOOR"/>
        <city fips="84090" name="YOUNGSDALE"/>
        <city fips="14000" name="CHICAGO"/>
        <city fips="70629" name="SOUTH CHICAGO HEIGHTS"/>
    </state>
    <state fips="18" name="INDIANA">
        <city fips="1810" name="ANTIOCH"/>
        <city fips="36000" name="INDIANAPOLIS"/>
        <city fips="5050" name="BETHEL VILLAGE"/>
        <city fips="17740" name="DENHAM"/>
    </state>
    <state fips="26" name="MICHIGAN">
        <city fips="74010" name="SIMMONS"/>
        <city fips="22000" name="DETROIT"/>
        <city fips="43180" name="KINCHELOE"/>
        <city fips="73260" name="SHERMAN TWP"/>
    </state>
</fips>

The easiest approach is to make the lookup table an external file, and not to use keys. The following stylesheet illustrates this variation (fips_no_keys.xsl in Example 3-51).

Example 3-51. fips_no_keys.xsl
<?xml version="1.0"?>
<!--=  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =
        fips_no_keys.xsl
        Purpose:
                Demonstrate using a lookup table located
                in an external document.
        Author: Thomas B Passin
        Creation date: 7 March 2004

        Demonstrates using a lookup table and the use of
        document() to refer to nodes in the table.
=  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =-->
<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!--
        indent='yes' is used just to try to get a more
        readable output.  It has no effect on the
        functionality of the output.
-->
<xsl:output encoding='utf-8' indent='yes'/>

<!--
        It is better to declare these global variables here
        rather than to just use the expressions inline.

        The lookup table is contained in the file "fips.xml".
-->
<xsl:variable name='cities'
     select='document("fips.xml")/fips/state/city'/>
<xsl:variable name='states'
     select='document("fips.xml")/fips/state'/>

<xsl:template match='places'>
<places>
        <xsl:apply-templates select='place'/>
</places>
</xsl:template>

<!--
        This template demonstrates two methods to specify
        which part of the lookup table to use.  Note the use
        of current(), which lets us get the context-derived
        value into the lookup table predicate.  Otherwise the
        use of "city" or "state" would be taken to be elements
        in the lookup table, not in the source document.

        The variable is another way to achieve the same
        thing.
-->
<xsl:template match='place'>
 <xsl:variable name='city-fips' select='city'/>
 <place>
  <state><xsl:value-of
     select='$states[@fips=current()/state]/@name'/></state>
  <city><xsl:value-of
     select='$cities[@fips=$city-fips]/@name'/></city>
 </place>
</xsl:template>

</xsl:stylesheet>
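
The keyed variation works with the same external fips.xml. The listing below is a minimal sketch of one way to write it, not the fips_keys.xsl file mentioned in this hack; the key names state-by-fips and city-by-fips and the lookup variable are made up for illustration. In XSLT 1.0, key() searches only the document that contains the context node, so the sketch uses xsl:for-each over the lookup document to switch context before calling key(). It also assumes that city codes are unique only within a state, and so builds the city key from the state and city codes together.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output encoding="utf-8" indent="yes"/>

<!-- Index the states in the lookup table by their FIPS code. -->
<xsl:key name="state-by-fips" match="state" use="@fips"/>

<!-- Index the cities by "statecode-citycode" (assumption: a city
     code alone may not be unique across states). -->
<xsl:key name="city-by-fips" match="state/city"
    use="concat(../@fips, '-', @fips)"/>

<!-- Root node of the external lookup table. -->
<xsl:variable name="lookup" select="document('fips.xml')"/>

<xsl:template match="places">
    <places>
        <xsl:apply-templates select="place"/>
    </places>
</xsl:template>

<xsl:template match="place">
    <!-- Capture the codes before the context changes. -->
    <xsl:variable name="state-fips" select="string(state)"/>
    <xsl:variable name="city-fips" select="string(city)"/>
    <place>
        <!-- for-each over a single node only switches the context
             to fips.xml, so that key() searches that document. -->
        <xsl:for-each select="$lookup">
            <state><xsl:value-of
                select="key('state-by-fips', $state-fips)/@name"/></state>
            <city><xsl:value-of
                select="key('city-by-fips',
                        concat($state-fips, '-', $city-fips))/@name"/></city>
        </xsl:for-each>
    </place>
</xsl:template>

</xsl:stylesheet>
```

Whether the keys pay off depends on the processor and on the size of the table; with a table as small as the one in this hack, the difference is unlikely to be noticeable.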

3.27.2 Putting the Lookup Table in the Stylesheet

If the lookup table is relatively short, you can put it into the stylesheet itself. The top-level element of the table must be in a namespace of its own, and that namespace must be declared on the stylesheet element. You refer to nodes within the stylesheet itself using document("") (note the empty string).

The complete stylesheet, with the changed stylesheet element and the embedded table, follows (fips_internal_codes.xsl in Example 3-52).

Example 3-52. fips_internal_codes.xsl
<?xml version="1.0" encoding='utf-8'?>
<!--=  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =
    fips_internal_codes.xsl
    Purpose:
        Demonstrate using a lookup table located
        within the stylesheet itself.
    Author: Thomas B Passin
    Creation date: 7 March 2004

    Demonstrates inserting the lookup table using a
    namespace, and the use of document("") to refer
    to nodes in the stylesheet itself.
=  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =  =-->
<!--
    Note the use of exclude-result-prefixes to prevent
    the "lu" namespace from appearing in the output
    document (where it would be harmless but mildly
    annoying).
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:lu='http://example.com/lookup'
    exclude-result-prefixes='lu'>

<!--
    indent='yes' is used just to try to get a more readable
    output.  It has no effect on the functionality of the
    output.
-->
<xsl:output encoding='utf-8' indent='yes'/>

<!--
    It is better to declare these global variables here rather
    than to just use the expressions inline.
-->
<xsl:variable name='cities' select='document("")/xsl:stylesheet/lu:fips/state/city'/>
<xsl:variable name='states' select='document("")/xsl:stylesheet/lu:fips/state'/>

<xsl:template match='places'>
<places>
    <xsl:apply-templates select='place'/>
</places>
</xsl:template>

<!--
    This template demonstrates two methods to specify which
    part of the lookup table to use.  Note the use of
    current(), which lets us get the context-derived value
    into the lookup table predicate.  Otherwise the use of
    "city" or "state" would be taken to be elements in the
    lookup table, not in the source document.

    The variable is another way to achieve the same
    thing.
-->
<xsl:template match='place'>
    <xsl:variable name='city-fips' select='city'/>
    <place>
        <state><xsl:value-of select='$states[@fips=current()/state]/@name'/></state>
        <city><xsl:value-of select='$cities[@fips=$city-fips]/@name'/></city>
    </place>
</xsl:template>

<!--
    The internal lookup table.  The exact namespace used does
    not matter as long as there is one.
-->
<lu:fips>
    <state fips="17" name="ILLINOIS">
        <city fips="57381" name="PALOS HEIGHTS"/>
        <city fips="35307" name="HINSDALE"/>
        <city fips="20149" name="DIXMOOR"/>
        <city fips="84090" name="YOUNGSDALE"/>
        <city fips="14000" name="CHICAGO"/>
        <city fips="70629" name="SOUTH CHICAGO HEIGHTS"/>
    </state>
    <state fips="18" name="INDIANA">
        <city fips="1810" name="ANTIOCH"/>
        <city fips="36000" name="INDIANAPOLIS"/>
        <city fips="5050" name="BETHEL VILLAGE"/>
        <city fips="17740" name="DENHAM"/>
    </state>
    <state fips="26" name="MICHIGAN">
        <city fips="74010" name="SIMMONS"/>
        <city fips="22000" name="DETROIT"/>
        <city fips="43180" name="KINCHELOE"/>
        <city fips="73260" name="SHERMAN TWP"/>
    </state>
</lu:fips>

</xsl:stylesheet>

3.27.3 Running the Hack

In the following, we use the Instant Saxon XSLT processor [Hack #32]. Assuming that Instant Saxon is on your path, and that both data and stylesheet are in the current working directory, type the following command:

saxon -o fips_out.xml fips_lu_data.xml fips_no_keys.xsl

Here the input data is in fips_lu_data.xml, and the external lookup table fips.xml is in the same directory as the stylesheet. If you move fips.xml elsewhere, adjust the path in the document() call in the stylesheet; a relative path there is resolved against the location of the stylesheet, not against the working directory. If you move the data file or the stylesheet, adjust their paths on the command line. If you have a large lookup table, you can use fips_keys.xsl instead of fips_no_keys.xsl to improve performance. To use the internal lookup table, type this command:

saxon -o fips_out.xml fips_lu_data.xml fips_internal_codes.xsl

All the variations give the same results (Example 3-53).

Example 3-53. fips_out.xml
<?xml version="1.0" encoding="utf-8"?>
<places>
   <place>
      <state>ILLINOIS</state>
      <city>CHICAGO</city>
   </place>
   <place>
      <state>ILLINOIS</state>
      <city>PALOS HEIGHTS</city>
   </place>
   <place>
      <state>MICHIGAN</state>
      <city>DETROIT</city>
   </place>
   <place>
      <state>INDIANA</state>
      <city>BETHEL VILLAGE</city>
   </place>
</places>

-Tom Passin



     