How do you get your old stuff into XML? Legacy text files can be translated into XML with xmlspy.
Perhaps you have plain-text files that you'd like to convert to XML so that the data will interoperate with the latest applications. You can do it by hand with a text or XML editor or you can use a tool that will do it for you automatically. xmlspy (Professional or Enterprise edition) is one of those tools. It's easy to figure out xmlspy's text-to-XML interface, so that's the one I'll show you here. (I used the Enterprise edition when testing this.)
First, here is a little plain-text file, time.txt, that just contains data fields separated by semicolons:
timezone;hour;minute;second;meridiem;atomic
PST;11;59;59;p.m.; 
The first line defines fields that will be converted to XML markup; the second line defines the content of that markup. A semicolon (;) delimits each of the fields. The second line ends with a field containing a single space, which of course you can't see.
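To make the layout concrete, here is a minimal Python sketch (not part of xmlspy) that reads the file the way the description above suggests: the first line supplies field names and the second supplies values. The function name `read_fields` is hypothetical, and the sample string assumes the final field is a single space.

```python
import csv
import io

# Contents of time.txt as described above; the last field of the
# second line is assumed to be a single space.
TIME_TXT = "timezone;hour;minute;second;meridiem;atomic\nPST;11;59;59;p.m.; \n"


def read_fields(text, delimiter=";"):
    """Return one dict per data line, keyed by the names on the first line."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


if __name__ == "__main__":
    for record in read_fields(TIME_TXT):
        # repr() makes the otherwise invisible single-space field visible.
        for name, value in record.items():
            print(f"{name}: {value!r}")
```

Printing each value with `repr()` is a quick way to confirm that the trailing field really holds a space rather than being empty.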
Now open xmlspy and select Convert → Import Text file. The Text import dialog box is shown in Figure 2-18. Click the Choose File button and open the file time.txt. Make sure that the file encoding is Unicode UTF-8, the field delimiter is Semicolon, and that "First row contains field names" is checked.
Click the symbol to the left of the timezone field name in the first row so that it becomes an equals sign. This specifies that the timezone field will be interpreted as an attribute in the output. Then click OK.
Click the Text label at the bottom of the document pane to see the result in Figure 2-19. The XML declaration and the import and row elements were inserted by xmlspy; the remaining elements were derived from time.txt. You could change the new document by hand to match time.xml (from Chapter 1), or you could apply an XSLT stylesheet to it. XSLT hacks begin in earnest in Chapter 3, but I'll use an XSLT stylesheet here (without going into detail about the stylesheet itself) to show you how to shape this document up.
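If you want a feel for what the import produces without opening xmlspy, the following Python sketch builds a roughly equivalent structure. It is an approximation based only on the description above (an import element containing row elements, with timezone as an attribute); it is not xmlspy's own code, and its serialized output may differ in detail from Figure 2-19. The function name `text_to_xml` and its `attributes` parameter are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# time.txt as described earlier; the trailing field is assumed to be one space.
TIME_TXT = "timezone;hour;minute;second;meridiem;atomic\nPST;11;59;59;p.m.; \n"


def text_to_xml(text, delimiter=";", attributes=("timezone",)):
    """Wrap each data line in a row element under a single import element."""
    root = ET.Element("import")
    for record in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        row = ET.SubElement(root, "row")
        for name, value in record.items():
            if name in attributes:
                # Fields marked with the equals sign become attributes.
                row.set(name, value)
            else:
                # Every other field becomes a child element.
                ET.SubElement(row, name).text = value
    return root


if __name__ == "__main__":
    root = text_to_xml(TIME_TXT)
    ET.indent(root)  # requires Python 3.9 or later
    print(ET.tostring(root, encoding="unicode"))
```

Passing a different tuple for `attributes` mimics clicking the symbol next to other field names in the dialog box.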
Select XSL → XSL Transformation or press F10, and the dialog box in Figure 2-20 appears. Click the Browse button and open the stylesheet time.xsl. Then click OK. The imported text is then transformed by xmlspy's XSLT engine, according to time.xsl. Again click the Text label under the document pane and select Edit → Pretty-Print XML Text. The final result is shown in Figure 2-21. You can save this document with File → Save.
You can also convert text files whose data fields are separated by tabs, commas, or spaces, and you can handle fields whose text is enclosed in single or double quotes. I chose semicolons in the first example because they are easier to see than spaces and tabs. The text file time2.txt (Example 2-6) uses tabs as delimiters.
timezone	hour	minute	second	meridiem	atomic
PST	11	59	59	p.m.
MST	12	59	59	a.m.
CST	01	59	59	a.m.
EST	02	59	59	a.m.
AST	03	59	59	a.m.
BST	04	59	59	a.m.
FST	05	59	59	a.m.
AT	06	59	59	a.m.
UTC	07	59	59	a.m.
Run this file through the conversion steps, making sure to select Tab as the field delimiter in the Text import dialog box, as shown in Figure 2-22.
You can experiment with the other delimiters by changing the delimiters in time.txt or time2.txt to other kinds of delimiters and stepping through the conversion again. With some experimentation you will see that xmlspy can convert many kinds of text files.
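The same kind of experiment can be sketched outside xmlspy. The Python below splits shortened, made-up variants of the sample data on each delimiter mentioned above and shows a quoted field that contains the delimiter itself. It illustrates the behavior of Python's `csv` module, not of xmlspy, and the names `split_records` and `SAMPLES` are hypothetical.

```python
import csv
import io


def split_records(text, delimiter, quotechar='"'):
    """Split delimited text into rows of fields, honoring quoted fields."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    return [row for row in reader if row]  # skip blank lines


# Two-field variants of the sample data, one per delimiter.
SAMPLES = {
    "semicolon": ("timezone;hour\nPST;11\n", ";"),
    "tab": ("timezone\thour\nPST\t11\n", "\t"),
    "comma": ("timezone,hour\nPST,11\n", ","),
    "space": ("timezone hour\nPST 11\n", " "),
}

if __name__ == "__main__":
    for name, (text, delimiter) in SAMPLES.items():
        print(name, split_records(text, delimiter))
    # A single-quoted field may contain the delimiter itself.
    quoted = "zone,label\nPST,'Pacific, standard'\n"
    print(split_records(quoted, ",", quotechar="'"))
```

Each sample yields the same two rows of fields, which is why the choice of delimiter matters only at import time and not in the resulting XML.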
Sysonyx's xmlLinguist: http://www.sysonyx.com/Products/xmlLinguist/
For heavy-duty text-to-XML conversions, Xlipstream offers dedicated rack-mounted appliances: http://www.xlipstream.com/