O'Reilly Hacks
XML Hacks
By Michael Fitzgerald
July 2004

HACK #29
What's the Diff? Diff XML Documents
If you handle many XML documents, you will sometimes need to check the differences between two or more of them. You can diff XML documents with both online and command-line tools.

When you manage a lot of XML documents, you are likely to have similar files with different content, and you will probably need to keep track of changes to files within a given project. Two online tools, one from DecisionSoft (http://www.decisionsoft.com) and another from DeltaXML (http://www.deltaxml.com), can help you quickly compare XML files to see how they differ. There are also several command-line tools available, such as IBM's XML Diff and Merge Tool (http://www.alphaworks.ibm.com/tech/xmldiffmerge). This hack walks you through the steps of using these tools.

DeltaXML's XML Comparator

You can also diff local XML files online by using DeltaXML's comparator utility, available at http://compare.deltaxml.com/. Like xmldiff, this utility makes line-by-line comparisons of XML documents, so it is likewise most helpful for comparing similar documents that use the same structure and vocabulary. Instead of uploading files, you can also paste XML documents into the two paste boxes provided on the DeltaXML comparator page.

To compare two similar documents using DeltaXML, follow these steps:

  1. In a web browser, go to http://compare.deltaxml.com/ (see Figure 3).

  2. Select all three checkboxes in the Options area.

  3. Click the first Browse button, and the File Upload dialog box appears. Find the file time.xml in the working directory and click the Open button.

  4. Click the second Browse button, find time2.xml, and then click Open.

  5. Click the Compare Files button. The results are displayed in the browser window (see Figure 4).

Figure 3. DeltaXML's XML Comparator

Figure 4. Results from DeltaXML

Changed lines are highlighted in blue italics. Unchanged lines are shown in plain text. The differences between the first and second files are shown by striking through the item from the first file in red and by underlining the item from the second file in green. I found the output of DeltaXML's utility more grokable than that of xmldiff.
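If you want to try the same kind of line-level comparison locally, the following is a minimal sketch that uses Python's standard difflib module. It is not how DeltaXML's service is implemented. The two sample documents are hypothetical stand-ins for time.xml and time2.xml, and the function name line_diff is made up for this illustration. Lines from the first file are prefixed with a minus sign (the counterpart of the red strikethrough) and lines from the second file with a plus sign (the counterpart of the green underline).

```python
import difflib

# Hypothetical stand-ins for time.xml and time2.xml (assumed content,
# used only to illustrate the comparison).
OLD = """<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
</time>
"""
NEW = """<time timezone="EST">
 <hour>12</hour>
 <minute>59</minute>
</time>
"""


def line_diff(old_text, new_text):
    """Return only the changed lines: '- ' from the first file, '+ ' from the second."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits '  ' (unchanged) and '? ' (hint) lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    for line in line_diff(OLD, NEW):
        print(line)
```

A line-based diff like this is sensitive to whitespace and attribute order, which is one reason it works best on documents that already share a layout.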

IBM's XML Diff and Merge Tool

Download and install IBM's XML Diff and Merge Tool from http://www.alphaworks.ibm.com/tech/xmldiffmerge (you will be required to register on the IBM alphaWorks site). In the bin directory under the installation directory xmldiff, you will find a Windows batch file called xmldiff2.bat as well as a shell script called xmldiff2.sh. Edit the appropriate script file, depending on your environment, by setting or exporting environment variables for the location of Java and the xmldiff directory. After you complete these steps, you should be able to run this command at a prompt:

xmldiff2 time.xml time2.xml

This command echoes the Java invocation and then produces the following output, which marks the lines that have changed in the second file:

java -DIVB_HOME="C:\temp\xmldiff" -Xnoclassgc -Xmx255m -Xms30m
  com.ibm.ivb.xmldiff.XMLDiffLauncher time.xml time2.xml
 Parsing time.xml ...
 Parsing time2.xml ...
Comparing ...
  <time timezone="PST"> --- CHANGED
    <hour> --- CHANGED
    </hour>
    <minute>
    </minute>
    <second>
    </second>
    <meridiem>
    </meridiem>
    <atomic signal="true"/> --- CHANGED
  </time>
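To make the idea behind this kind of report concrete, here is a minimal sketch in Python that walks two XML trees in parallel and flags elements whose tag, attributes, or text differ. It is an illustration only, not IBM's algorithm: it pairs children by position, prints start tags without attributes or end tags, and assumes both files share the same structure. The sample documents and the function name mark_changes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for time.xml and time2.xml (assumed content).
OLD_XML = '<time timezone="PST"><hour>11</hour><minute>59</minute><atomic signal="true"/></time>'
NEW_XML = '<time timezone="EST"><hour>12</hour><minute>59</minute><atomic signal="false"/></time>'


def mark_changes(old, new, depth=1, out=None):
    """Walk two element trees in parallel and flag elements that differ."""
    if out is None:
        out = []
    changed = (
        old.tag != new.tag
        or old.attrib != new.attrib
        or (old.text or "").strip() != (new.text or "").strip()
    )
    suffix = " --- CHANGED" if changed else ""
    out.append("  " * depth + "<" + new.tag + ">" + suffix)
    # zip() pairs children by position, so extra or reordered children
    # are not detected; this sketch assumes matching structure.
    for old_child, new_child in zip(old, new):
        mark_changes(old_child, new_child, depth + 1, out)
    return out


if __name__ == "__main__":
    report = mark_changes(ET.fromstring(OLD_XML), ET.fromstring(NEW_XML))
    print("\n".join(report))
```

Comparing parsed trees rather than raw lines means that differences in indentation or line breaks between the two files do not show up as changes.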
