XML Object Serialization

In the .NET Framework, object serialization is offered through the classes in the System.Runtime.Serialization namespace. These classes provide type fidelity and support deserialization. Deserialization is the reverse process of serialization. It takes in stored information and re-creates objects from it.

Object serialization in .NET can store public, protected, and private fields, and it automatically handles circular references. A circular reference occurs when a child object references a parent object, and the parent object also references the child object. The serialization classes detect these references and serialize each object in the graph only once. Serialization can generate output in multiple formats through pluggable formatter modules. The two system-provided formatters are represented by the BinaryFormatter and SoapFormatter classes, which write the object’s state in binary format and SOAP format, respectively.

Enabling Object Serialization

A .NET class is serializable when it is marked with the SerializableAttribute attribute. A class that also needs to control how it is serialized implements the ISerializable interface in addition to carrying the attribute. When you want to keep the serialization lean, prevent some fields from entering the serialization process by applying the NonSerializedAttribute attribute to them. You might not want to serialize fields that represent transient states or that are easily reproducible at run time. The following code snippet shows you how to mark .NET classes for serialization:

[Serializable]
class Order {
    int price, numItems;
    [NonSerialized] int total;
    ...
}

The whole Order class is serializable except for the total member. In our example, you don’t have a compelling reason to save total because its value can be easily reproduced at any time by multiplying price and numItems.

To serialize an object to a storage medium, you first select a serialization formatter, typically the binary formatter or the SOAP formatter. Each formatter has its own class:

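// BinaryFormatter: System.Runtime.Serialization.Formatters.Binary namespace
// SoapFormatter: System.Runtime.Serialization.Formatters.Soap namespace
IFormatter binFormatter = new BinaryFormatter();
IFormatter soapFormatter = new SoapFormatter();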

After you create an instance of the formatter, you call the Serialize method, passing the stream to write to and the object to save. Because the formatter writes to any stream, the destination can be a file, a memory buffer, or even a string. When you need to serialize a graph of objects, pass only the root object to the Serialize method:

soapFormatter.Serialize(stream, rootObject);

If you used the SOAP formatter, the following is a plausible output:

<SOAP-ENV:Envelope
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:a1="...">
<SOAP-ENV:Body>
    <a1:Order id="ref-1">
        <price>1</price>
        <numItems>2</numItems>
    </a1:Order>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Rebuilding objects from a storage medium is also easy. You call the Deserialize method on the specified formatter:

Order rootObject = (Order) soapFormatter.Deserialize(stream);

It goes without saying that you cannot serialize to, say, SOAP, and then expect to deserialize using the binary formatter.
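The serialize and deserialize steps described above can be put together in a short round trip. The following is a minimal sketch, not a definitive implementation. It assumes the .NET Framework, where BinaryFormatter is available (recent .NET releases restrict or remove it). The public constructor, the read-only properties, and the RoundTripDemo class are additions made here for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Order
{
    private int price, numItems;
    [NonSerialized] private int total;   // skipped by the formatter

    // Illustrative constructor (not part of the original snippet).
    public Order(int price, int numItems)
    {
        this.price = price;
        this.numItems = numItems;
        total = price * numItems;
    }

    public int Price { get { return price; } }
    public int NumItems { get { return numItems; } }
    public int Total { get { return total; } }
}

public static class RoundTripDemo
{
    // Serializes the order into a memory buffer and rebuilds a copy from it.
    public static Order RoundTrip(Order original)
    {
        IFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;   // rewind before reading back
            return (Order) formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Order copy = RoundTrip(new Order(1, 2));
        // Private fields survive the trip; the [NonSerialized] field is
        // left at its default value because no constructor runs.
        Console.WriteLine("{0} {1} {2}", copy.Price, copy.NumItems, copy.Total);
    }
}
```

Note that the non-serialized total comes back as 0, so a class like this needs some way to recompute it after deserialization.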

The ISerializable Interface

A class that needs to control its serialization and deserialization processes can do this by implementing the ISerializable interface. When a class implements the ISerializable interface, the formatter calls the GetObjectData method of the interface at serialization time. GetObjectData then populates the supplied SerializationInfo structure with all of the data that represents the object. A class that implements ISerializable must also provide a special constructor with the following signature:

private MyClass(SerializationInfo info, StreamingContext context) 

The formatter calls this constructor at deserialization time, after it has filled the SerializationInfo structure with the data read from the stream. The streaming context information lets the class know about the destination of its bytes. Depending on whether the class will be serialized in a cross-process or cross-machine way, some fields might require special treatment. In general, this constructor should be declared protected when the class is not explicitly marked as not inheritable, so that derived classes can call it. It can be private, as in the signature above, when the class is sealed.
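As an illustration of the pattern just described, here is a minimal sketch of an Order class that implements ISerializable. It assumes the .NET Framework and BinaryFormatter. The entry names "price" and "numItems", the public constructor, and the SerializableDemo class are choices made for this example only.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]   // still required when implementing ISerializable
public sealed class Order : ISerializable
{
    private readonly int price, numItems;
    private readonly int total;

    public Order(int price, int numItems)
    {
        this.price = price;
        this.numItems = numItems;
        total = price * numItems;
    }

    // Deserialization constructor: private because the class is sealed.
    private Order(SerializationInfo info, StreamingContext context)
    {
        price = info.GetInt32("price");
        numItems = info.GetInt32("numItems");
        total = price * numItems;   // rebuilt rather than stored
    }

    // Called by the formatter at serialization time.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("price", price);
        info.AddValue("numItems", numItems);
    }

    public int Total { get { return total; } }
}

public static class SerializableDemo
{
    // Round-trips an order through a memory buffer.
    public static Order RoundTrip(Order original)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Order) formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(new Order(2, 3)).Total);
    }
}
```

Because the constructor recomputes total, the deserialized copy is consistent even though the value never enters the stream.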

Controlling the Deserialization Process

The .NET serialization mechanism also allows you to control the post-deserialization processing and explicitly handle data being serialized and deserialized. In this way, you are given a chance to restore transient state and data which, for one reason or another, you decide not to serialize.

By implementing the IDeserializationCallback interface, a class indicates that it wants to be notified when the deserialization of the entire object is complete. The class can then complete the operation by re-creating parts of the state and adding any information not made serializable. The OnDeserialization method is called after the type has been deserialized. For example, you wouldn’t serialize a member that holds the current time; you can instead mark the field as non-serialized and implement the IDeserializationCallback interface to restore a consistent value at the end of the deserialization process:

[NonSerialized] 
public DateTime d = DateTime.Now;
public void OnDeserialization(Object sender) 
{
    d = DateTime.Now;
}

Type Binding

In .NET, you can bind a serialized type to a different class during deserialization. First you write a class that inherits from the SerializationBinder class. Then you override the BindToType method, which controls the binding of a serialized object to a .NET class:

public abstract Type BindToType(
    String assemblyName,
    String typeName
);

The second step is setting the Binder property of the formatter of choice, typically SoapFormatter or BinaryFormatter, to an instance of your binder class. Once the binder is in place, you serialize and deserialize as usual. The binder maps the involved types during deserialization, and the assembly for the original class doesn’t even have to be loadable. This is an excellent solution for mapping between different versions of the same class.
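The two steps can be sketched as follows. This is only an illustration under stated assumptions: it targets the .NET Framework with BinaryFormatter, and the OrderV1, OrderV2, OrderBinder, and BinderDemo names are hypothetical. The two classes deliberately declare fields with the same names and types so that the formatter can match them.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class OrderV1 { public int price; public int numItems; }

[Serializable]
public class OrderV2 { public int price; public int numItems; }

// Step 1: a binder that redirects the old type name to the new class.
public sealed class OrderBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        if (typeName == typeof(OrderV1).FullName)
            return typeof(OrderV2);

        // Every other type is resolved in the usual way.
        return Type.GetType(typeName + ", " + assemblyName);
    }
}

public static class BinderDemo
{
    // Writes an OrderV1 and reads it back as an OrderV2.
    public static OrderV2 Rebind(OrderV1 original)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;

            // Step 2: attach the binder before deserializing.
            formatter.Binder = new OrderBinder();
            return (OrderV2) formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        OrderV1 source = new OrderV1();
        source.price = 5;
        source.numItems = 3;
        OrderV2 target = Rebind(source);
        Console.WriteLine("{0} {1}", target.price, target.numItems);
    }
}
```

When the two versions differ in their fields, the target class would typically implement ISerializable and read only the entries it understands.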

Serializing to XML

A second, very special, type of .NET serialization is XML serialization. When compared with ordinary .NET object serialization, XML serialization is so different that you should not even consider it another type of formatter. It is similar to SOAP and binary formatters because it also persists and restores the object’s state, but when you examine the way each serializer works, you see many significant differences.

XML serialization is handled by using the XmlSerializer class, which enables you to control how objects are encoded into XML. You can use the xsd.exe tool to generate base classes that the serializer encodes to XML according to an XSD schema. The namespace for XML serialization is System.Xml.Serialization.

To some extent, you can configure the XML serialization process by applying attributes to the class and its members. For example, you can specify whether a property should be encoded as an attribute or an element and provide a name for that property when the default is inappropriate. You can also provide an XML namespace for elements.
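These configuration options map onto attributes in the System.Xml.Serialization namespace. The sketch below shows one possible arrangement. The element names, the "urn:example-orders" namespace, and the XmlDemo helper class are invented for this example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Root element name and XML namespace are illustrative choices.
[XmlRoot("order", Namespace = "urn:example-orders")]
public class Order
{
    [XmlAttribute("id")]          // encoded as an attribute
    public int Id;

    [XmlElement("unitPrice")]     // element with a custom name
    public int Price;

    public int NumItems;          // default: element named after the member

    [XmlIgnore]                   // left out of the XML entirely
    public int Total;
}

public static class XmlDemo
{
    // Serializes an order to an XML string.
    public static string ToXml(Order order)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Order));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }

    // Rebuilds an order from an XML string.
    public static Order FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Order));
        using (StringReader reader = new StringReader(xml))
        {
            return (Order) serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Order order = new Order();
        order.Id = 7;
        order.Price = 1;
        order.NumItems = 2;
        Console.WriteLine(ToXml(order));
    }
}
```

The class needs to be public and to have a public parameterless constructor, which is the compiler-supplied default here.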

Type Fidelity and Infidelity

The primary goal of .NET object serialization is fidelity between the original object and its serialized version. By contrast, XML serialization accepts type infidelity as the price of portability. XML serialization is a feature whose express purpose is to facilitate interaction and interoperability between complex .NET objects and non-.NET platforms. This declared goal brings to light a number of limitations. For example, XML serialization supports only public members and cannot serialize graphs of objects that reference each other (such as a doubly linked list). Circular references are not resolved: the serializer cannot persist them, and current versions of XmlSerializer throw an InvalidOperationException when they encounter one. Only the data contained in the class is serialized, and no type and assembly information is ever included.

Exporting Data to XML

Let’s review the typical process that leads to creating XML documents that reflect the contents of live instances of .NET classes. You start out with an XSD document that describes the structure of the resultant data. If you don’t have and can’t write an XSD document, you can use the xsd.exe tool to infer the schema from a sample XML file.

Based on the XSD schema, you generate a C# or Visual Basic .NET class, which has the ability to serialize itself in an XML format that adheres to the specified schema. The class is generated using the xsd.exe tool, which is well hidden under the covers of the Visual Studio .NET user interface. This class is a template class that you need to flesh out a bit by adding functionality. Manipulating data with such a class has a double advantage. When you work under the aegis of .NET, you enjoy the effectiveness of strongly typed programming. When you move out of .NET and route your classes toward other platforms, you enjoy the interoperability and the flexibility of an XML schema.