Hack 29 What's the Diff? Diff XML Documents


If you are handling many XML documents, sometimes you need to check the differences between two or more documents. You can perform diffs of XML documents with online and command-line tools.

When you manage a lot of XML documents, you will likely have similar files with different content, and you will likely need to keep track of changes to files within a given project. There are online tools (one from DecisionSoft, http://www.decisionsoft.com, and another from DeltaXML, http://www.deltaxml.com) that can help you quickly compare XML files to see how they differ. There are also several command-line tools available, such as IBM's XML Diff and Merge Tool (http://www.alphaworks.ibm.com/tech/xmldiffmerge). This hack walks you through the steps of using these tools.

2.20.1 DecisionSoft's xmldiff

DecisionSoft's xmldiff is an online tool that compares XML files you upload from your own computer. xmldiff makes line-by-line comparisons of XML documents, and is therefore most helpful for comparing similar documents that use the same structure and vocabulary.

To compare two similar documents, follow these steps:

  1. In a web browser, go to http://tools.decisionsoft.com/xmldiff.html (see Figure 2-28).

  2. Click the first Browse button, and the File Upload dialog box appears. Find the file time.xml in the working directory and click the Open button.

  3. Click the second Browse button and find time2.xml, and then click Open.

  4. Select the "Split attributes" checkbox.

  5. Click the "Show differences" button. The results are shown in Figure 2-29.

Figure 2-28. DecisionSoft's xmldiff


Figure 2-29. Results from xmldiff


Lines that are the same in both files are shown in gray. Where corresponding lines differ, the line from the first file is highlighted in red and the line from the second file is highlighted in green. Because you selected the "Split attributes" checkbox, each attribute is listed on its own line, which makes the differences easier to read.

Currently, xmldiff works only on local files. You cannot access remote files with URLs.
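If you want to experiment with the same idea offline, the following is a minimal Python sketch of a line-by-line XML comparison with an optional attribute-splitting step. It is not DecisionSoft's implementation; the function names (to_lines, line_diff) and the two sample documents are hypothetical stand-ins, and text that follows a child element (tail text) is ignored for brevity.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical sample documents; they stand in for time.xml and time2.xml.
OLD = '<time timezone="PST"><hour>11</hour><minute>59</minute></time>'
NEW = '<time timezone="EST"><hour>12</hour><minute>59</minute></time>'


def to_lines(xml_text, split_attributes=True):
    """Serialize XML with one item per line so a line diff is meaningful."""
    lines = []

    def walk(elem, depth):
        pad = "  " * depth
        attrs = sorted(elem.attrib.items())
        if split_attributes:
            # Put the tag name and each attribute on separate lines.
            lines.append(f"{pad}<{elem.tag}")
            for name, value in attrs:
                lines.append(f'{pad}    {name}="{value}"')
            lines.append(f"{pad}>")
        else:
            joined = "".join(f' {n}="{v}"' for n, v in attrs)
            lines.append(f"{pad}<{elem.tag}{joined}>")
        text = (elem.text or "").strip()
        if text:
            lines.append(f"{pad}  {text}")
        for child in elem:
            walk(child, depth + 1)
        lines.append(f"{pad}</{elem.tag}>")

    walk(ET.fromstring(xml_text), 0)
    return lines


def line_diff(old_xml, new_xml):
    """Return unified-diff lines comparing the two normalized documents."""
    return list(difflib.unified_diff(
        to_lines(old_xml), to_lines(new_xml),
        fromfile="first", tofile="second", lineterm=""))


if __name__ == "__main__":
    print("\n".join(line_diff(OLD, NEW)))
```

Lines prefixed with - come from the first document and lines prefixed with + from the second, roughly the plain-text counterpart of the red and green highlighting described above.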


2.20.2 DeltaXML's XML Comparator

You can also diff local XML files online by using DeltaXML's comparator utility, available at http://compare.deltaxml.com/. Like xmldiff, this utility makes line-by-line comparisons of XML documents, and so is likewise helpful for comparing similar documents using the same structure and vocabulary. It is also possible to paste XML documents into the two paste boxes provided on the DeltaXML comparator page (Figure 2-30).

To compare two similar documents using DeltaXML, follow these steps:

  1. In a web browser, go to http://compare.deltaxml.com/ (see Figure 2-30).

  2. Select all three checkboxes in the Options area.

  3. Click the first Browse button, and the File Upload dialog box appears. Find the file time.xml in the working directory and click the Open button.

  4. Click the second Browse button, find time2.xml, and then click Open.

  5. Click the Compare Files button. The results are displayed in the browser window (see Figure 2-31).

Figure 2-30. DeltaXML's XML Comparator


Figure 2-31. Results from DeltaXML


Changed lines are highlighted in blue italics. Unchanged lines are shown in plain text. The differences between the first and second files are shown by striking through the item from the first file in red and by underlining the item from the second file in green. I found the output of DeltaXML's utility more grokable than that of xmldiff.
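The strike-through and underline presentation can be approximated in plain text. The sketch below is only an illustration of that display style and not DeltaXML's algorithm: it compares two lines token by token with Python's difflib and wraps removed tokens in [- -] and added tokens in {+ +}. The function name inline_markup and the marker syntax are assumptions chosen for this example.

```python
import difflib


def inline_markup(old_line, new_line):
    """Mark removed tokens as [-...-] and added tokens as {+...+}."""
    old_tokens = old_line.split()
    new_tokens = new_line.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    parts = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            parts.extend(old_tokens[i1:i2])
            continue
        if i2 > i1:  # tokens present only in the first line
            parts.append("[-" + " ".join(old_tokens[i1:i2]) + "-]")
        if j2 > j1:  # tokens present only in the second line
            parts.append("{+" + " ".join(new_tokens[j1:j2]) + "+}")
    return " ".join(parts)


if __name__ == "__main__":
    print(inline_markup("the hour is 11", "the hour is 12"))
```

Because tokens are split on whitespace, a tag such as <hour>11</hour> is treated as a single token; a finer-grained tokenizer would be needed to highlight only the changed digits.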

2.20.3 IBM's XML Diff and Merge Tool

Download and install IBM's XML Diff and Merge Tool from http://www.alphaworks.ibm.com/tech/xmldiffmerge (you will be required to register on the IBM alphaWorks site). In the bin directory under the installation directory xmldiff, you will find a Windows batch file called xmldiff2.bat as well as a shell script called xmldiff2.sh. Edit the appropriate script file, depending on your environment, by setting or exporting environment variables for the location of Java and the xmldiff directory. After you complete these steps, you should be able to run this command at a prompt:

xmldiff2 time.xml time2.xml

This command produces the following output, which marks the elements that have changed in the second file:

java -DIVB_HOME="C:\temp\xmldiff" -Xnoclassgc -Xmx255m -Xms30m
  com.ibm.ivb.xmldiff.XMLDiffLauncher time.xml time2.xml
 Parsing time.xml ...
 Parsing time2.xml ...
Comparing ...
  <time timezone="PST"> --- CHANGED
    <hour> --- CHANGED
    </hour>
    <minute>
    </minute>
    <second>
    </second>
    <meridiem>
    </meridiem>
    <atomic signal="true"/> --- CHANGED
  </time>
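To get a feel for this kind of element-level report, here is a small Python sketch that walks two parsed trees in parallel and flags elements of the second document whose own tag, attributes, or text differ from their counterpart in the first. It is an assumption-laden illustration and not IBM's algorithm: children are paired purely by position, elements removed from the second document are not reported, and the function name changed_report is hypothetical.

```python
import xml.etree.ElementTree as ET


def changed_report(old, new, depth=0):
    """Return lines describing `new`, marking elements that differ from `old`."""
    pad = "  " * depth
    # An element counts as changed if it has no counterpart or if its own
    # tag, attributes, or stripped text differ; children are judged separately.
    differs = old is None or (
        (old.tag, old.attrib, (old.text or "").strip())
        != (new.tag, new.attrib, (new.text or "").strip())
    )
    attrs = "".join(f' {k}="{v}"' for k, v in new.attrib.items())
    mark = " --- CHANGED" if differs else ""
    lines = [f"{pad}<{new.tag}{attrs}>{mark}"]
    old_children = list(old) if old is not None else []
    for i, child in enumerate(new):
        # Pair children by position (a simplifying assumption).
        counterpart = old_children[i] if i < len(old_children) else None
        lines.extend(changed_report(counterpart, child, depth + 1))
    return lines


if __name__ == "__main__":
    # Hypothetical sample documents standing in for time.xml and time2.xml.
    first = ET.fromstring('<time timezone="PST"><hour>11</hour><minute>59</minute></time>')
    second = ET.fromstring('<time timezone="EST"><hour>12</hour><minute>59</minute></time>')
    print("\n".join(changed_report(first, second)))
```

Pairing by position keeps the sketch short, but it means that inserting one element near the top of a document marks every following sibling as changed; a real tool would align elements more carefully.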

2.20.4 See Also

  • xmlspy 2004 Professional and Enterprise Editions have diff capabilities: http://www.xmlspy.com and http://www.altova.com/matrix.html

  • Microsoft's XML Diff and Patch: http://apps.gotdotnet.com/xmltools/xmldiff/

  • Logilab's xmldiff, written in Python: http://www.logilab.org/projects/xmldiff/

  • Advanced Software's Docucomp: http://www.docucomp.com/



     