Storing Objects in Collection Classes

Storing Objects in Collection Classes

The .NET Framework includes specialized classes known as collection classes, or simply collections, which are used to store objects. Like arrays, collections enable you to place objects into a container for later retrieval. However, collections differ from arrays in the way they’re used, as well as in the techniques used for interaction. Some collection classes, such as the Stack and Queue classes, have specialized methods for inserting and removing items. Other collection classes, such as the StringCollection class, are dedicated for use with a certain type of object. In this section, you’ll see how the Queue, Hashtable, and Stack classes are used. You’ll also be introduced to the interfaces commonly implemented by all collection classes in the .NET Framework.

Commonly Used Collection Interfaces

A wide variety of collection classes are included in the .NET Framework. Each class has unique features that separate it from other collection classes, but a few interfaces (ICollection, IEnumerable, ICloneable, IDictionary, and IList) are consistently implemented by most collection classes. This section examines those interfaces in detail.

ICollection—The Root of All Collections

The ICollection interface is implemented by all collection classes in the .NET Framework. This interface defines the minimum contract for a collection class and is declared as follows:

interface ICollection
{
    int    Count{ get; }
    bool   IsSynchronized{ get; }
    object SynchRoot{ get; }
    void   CopyTo(Array array, int index);
}

The members of the ICollection interface are described here:

  • Count  Returns the number of objects stored in the collection

  • IsSynchronized  Returns true if the collection is thread-safe; otherwise, returns false

  • SynchRoot  Returns an object used to synchronize multithreaded access to the collection

  • CopyTo  Copies the objects in the collection to an array

By factoring the common properties and methods for collections into the ICollection interface, a basic contract is defined for all collection classes in the .NET Framework. All collections must implement the Count property so that clients can determine the number of objects in a collection. All collections expose the CopyTo method to provide a standard way for a collection to copy its contents to an array.

All collection classes also implement the IsSynchronized and SynchRoot properties to support basic multithreaded programming. The use of these properties is covered, along with other multithreaded programming topics, in Chapter 10.

Providing Enumerators with the IEnumerable Interface

The IEnumerable interface, which was discussed in Chapter 7, is implemented by all collection classes in the .NET Framework. To review, IEnumerable is the standard interface implemented by classes that support iteration over contained objects. IEnumerable is declared as follows:

interface IEnumerable
{
    IEnumerator GetEnumerator();
}

Some classes forego returning IEnumerator in favor of a more specialized interface suited for their particular type of collection. For example, the Hashtable class (discussed later in this chapter, in the section “Using the Hashtable Class”) returns an enumerator that implements the IDictionaryEnumerator interface, which is more suited for enumerating that type of collection. The IDictionaryEnumerator interface will be discussed later in this chapter, in the section “Mapping Keys to Values with the IDictionary Interface.”

Copying Objects with the ICloneable Interface

The ICloneable interface is used to provide a standard way to create copies of existing objects. The ICloneable interface is declared as follows:

interface ICloneable
{
    object Clone();
}

The Clone method returns a new instance that is of the same type as (or occasionally a derived type of) the current object. As the method’s name implies, the returned object is initialized with the same contents as the current object.

An interesting aspect of cloning is that the type that’s being cloned can provide a shallow copy or a deep copy of the current object. A shallow copy copies the current object; however, no new instances of any fields are created. Instead, the shallow copy maintains references to the same objects as the original, as shown in Figure 8-1.

Figure 8-1.
A shallow copy, which copies only contained references.

In contrast, a deep copy creates a new object that includes references to completely new objects, as shown in Figure 8-2.

Figure 8-2.
A deep copy, which creates new objects for field references.

Although a deep copy is more costly to execute, it results in a copy that’s more isolated from its original. Thus, if your code makes changes to an object’s deep copy, the original object (as well as any items that the original object references) remains the same. The execution cost for a deep copy can be especially significant for collections because many items can be stored in a collection object. For this reason, most collection classes provide shallow copy semantics.

Mapping Keys to Values with the IDictionary Interface

The IDictionary interface is exposed by collection classes that support mapping keys to values, much like the AssociativeArray class presented in Chapter 7. Classes that support IDictionary can be defined as having a fixed size, which prevents adding new keys to the collection while allowing existing keys to be associated with new values. Classes also can be specified as read-only, which prevents the contents of the collection from being modified in any way.

The IDictionary interface is declared as follows:

interface IDictionary
{
    bool IsFixedSize { get; }
    bool IsReadOnly { get; }
    ICollection Keys { get; }
    ICollection Values  { get; }
    object this[object key] { get; set; }
    void Add(object key, object value);
    void Clear();
    bool Contains(object key);
    IDictionaryEnumerator GetEnumerator();
    void Remove(object key);
}

IDictionary specifies an indexer that uses the collection key as the index parameter. A get operation on the indexer returns the value mapped to the key and a set operation overwrites the item currently associated with the key. If the collection exposing IDictionary doesn’t have a fixed size, a new item can be added to the collection using the indexer. If the collection has a fixed size, attempting to add a new item will result in a NotSupportedException exception being thrown.

The other methods and properties declared by IDictionary are listed here:

  • IsFixedSize  Returns true if the size of the collection is fixed or false if new keys can be added.

  • IsReadOnly  Returns true if the collection is read-only or false if items can be added to the collection.

  • Keys  Returns an ICollection interface that can be used to interact with all keys in this collection.

  • Values  Returns an ICollection interface that can be used to interact with all values in this collection.

  • Add  Adds a new key/value pair to the collection. If the key already exists, an ArgumentException exception is thrown.

  • Clear  Removes all key/value pairs from the collection.

  • Contains  Returns true if a specified key exists in the collection or false if the key isn’t found.

  • GetEnumerator  Returns an IDictionaryEnumerator interface for the collection. IDictionaryEnumerator provides a richer interface than IEnumerator and is discussed in more detail shortly.

  • Remove  Removes a specific key and its associated value from the collection.

Classes that implement IDictionary often have a nonlinear method of associating keys with values. For example, many data structures based on trees or hashing are difficult to iterate using the IEnumerator interface. The IDictionaryEnumerator interface, shown in the following code, is a specialized version of the IEnumerator interface that’s well-suited for enumerating dictionary-style collections:

interface IDictionaryEnumerator: IEnumerator
{
    DictionaryEntry Entry { get; }
    object Key { get; }
    object Value { get; }
}
Using the IList Interface

Collections that implement the IList interface offer listlike semantics, including access to individual items and insertion and removal of items at any point in the list. The IList interface is declared as follows:

interface IList
{
    bool IsFixedSize { get; }
    bool IsReadOnly { get; }
    object this[object key] { get; set; }
    int Add(object value);
    void Clear();
    bool Contains(object value);
    int IndexOf(object value);
    Insert(int index, object value);
    void Remove(object key);
    void RemoveAt(int index);
}

The IList interface is implemented by several of the collection classes included in the .NET Framework, including the following:

  • ArrayList

  • ListDictionary

  • StringCollection

  • StringDictionary

As discussed in Chapter 7, the ArrayList class combines the properties of an array and a list. The StringCollection, ListDictionary, and StringDictionary classes are discussed later in this chapter, in the section “Using the Specialized Collection Classes.”

Using the Queue Class

The Queue class implements a first-in, first-out data structure that places objects into a waiting list, or queue, as shown in Figure 8-3. Conceptually, objects are inserted into a queue at one end and removed at the other end. Queues are useful for processing objects sequentially because they can store objects (for later processing) in the order in which the objects were inserted.

Figure 8-3.
A queue, which is a first-in, first-out collection.
Queue Interfaces and Methods

Like most collection classes, the Queue class implements the three major collection interfaces: ICollection, IEnumerable, and ICloneable. In addition to the methods exposed by those interfaces, the most commonly used methods exposed by the Queue class are as follows:

  • Enqueue  Adds a new item to the back of the queue

  • Dequeue  Removes and returns an item from the front of the queue

  • Peek  Returns the object at the front of the queue but doesn’t remove it from the queue

  • Clear  Removes all objects from the queue

  • Contains  Returns true if a specified object exists in the queue or false if the object isn’t found

Internally, the items in a queue are stored as elements in a buffer, and the buffer expands as needed to accommodate additional items. Operations performed on a queue are relatively inexpensive, unless the internal buffer must be enlarged, which requires that a new, larger buffer be allocated and all existing items be copied to the new array.

If the performance cost of resizing the Queue object’s internal array is a concern, there are some steps you can take to reduce the likelihood of frequent resizing. The default size of the internal array allows it to store 32 objects. You can specify the initial size of the array by passing a value to the Queue constructor, as shown here:

Queue aQueue = new Queue(50);   // The initial capacity is 50 objects.

During construction, you also can define the growth factor for the Queue object, which is used when determining the size of the object’s new internal array. By default, the growth factor is 2.0, which means that the internal array is doubled whenever it must be grown. One of the constructors for the Queue class allows you to specify the initial capacity for the Queue as well as its growth factor, as shown here:

// Set the growth factor to 3.0.
Queue aQueue = new Queue(50, 3.0);

The growth factor can be as large as 10.0. By providing a growth factor larger than the default value, you can reduce the number of times your Queue objects are resized. However, this added runtime efficiency adds to the total memory footprint of your collection object.

A Queuing Example

To illustrate how the Queue collection class can be used in an application, this section presents a simple simulation of a ski lift, named SkiLiftQueue. The Ski­LiftQueue application places Skier objects into a queue that represents a ski lift and removes Skier objects from the queue to simulate the skiers reaching the end of the lift.

The Skier structure is a simple value type that demonstrates that nonprimitive types can be stored in a queue. The Skier structure is basically a struct wrapper around a single string field, as shown below.

struct Skier
{
    public Skier(string name)
    {
        Name = name;
    }
    public string Name;
}

Although the Skier structure is simplified to reduce the complexity of the example, keep in mind that arbitrarily complex objects can be stored in a queue.

The Main method for the SkiLift example is also quite basic—it simply creates an instance of the SkiLift class and calls its Run method, as shown here:

static void Main(string[] args)
{
    SkiLift lift = new SkiLift();
    lift.Run();
}

Most of the code in the SkiLiftQueue example is found in the SkiLift class, which is declared as follows:

public class SkiLift
{
    public SkiLift()
    {
        _theLift = new Queue();
    }

    public void Run()
    {
        bool done = false;
        while(!done)
        {
            DisplayStatus();
            SkiAction choice = GetNextAction();
            switch(choice)
            {
                case SkiAction.AddSkier:
                    string name;
                    do
                    {
                        Console.Write("Skier's name: ");
                        name = Console.ReadLine();
                    } while(name.Length == 0);
                    Skier newSkier = new Skier(name);
                    _theLift.Enqueue(newSkier);
                    break;

                case SkiAction.RemoveSkier:
                    if(_theLift.Count == 0)
                    {
                        Console.WriteLine("The lift is empty.");
                    }
                    else
                    {
                        Skier nextSkier = (Skier)_theLift.Dequeue();
                        Console.WriteLine("{0} has left the ski lift.",
                            nextSkier.Name);
                    }
                    break;

                case SkiAction.Quit:
                    Console.WriteLine("Goodbye.");
                    done = true;
                    break;

                default:
                    break;
            }
        }
    }

    protected void DisplayStatus()
    {
        Console.WriteLine("There are currently {0} skiers on the lift.",
            _theLift.Count);
        if(_theLift.Count > 0)
        {
            Skier nextSkier = (Skier)_theLift.Peek();
            Console.WriteLine("The next skier will be {0}.",
                nextSkier.Name);

            Console.WriteLine("Skiers on the lift:");
            Array skiers = _theLift.ToArray();
            foreach(Skier aSkier in skiers)
            {
                Console.WriteLine("\t" + aSkier.Name);
            }
        }
    }

    protected SkiAction GetNextAction()
    {
        SkiAction result = SkiAction.Quit;
        bool done = false;
        while(!done)
        {
            Console.WriteLine("A) Add a skier to the lift");
            Console.WriteLine("R) Remove a skier from the lift");
            Console.WriteLine("Q) Quit");
            Console.Write("Choice: ");

            switch(Console.ReadLine().ToUpper())
            {
                case "A":
                    result = SkiAction.AddSkier;
                    done = true;
                    break;

                case "R":
                    result = SkiAction.RemoveSkier;
                    done = true;
                    break;

                case "Q":
                    result = SkiAction.Quit;
                    done = true;
                    break;

                default:
                    break;
            }
        }
        return result;
    }
    protected enum SkiAction { AddSkier, RemoveSkier, Quit };
    protected Queue _theLift;
}

When an instance of the SkiLift class is constructed, it creates a Queue object that will be used to store Skier instances. The Run method calls the following two methods that provide the primary interaction with the user:

  • GetNextAction  Prompts the user for the next action to be performed by the ski lift

  • DisplayStatus  Displays the current state of the ski lift’s Queue object, including the current number of items in the collection as well as the names of Skier objects in the queue

If the user chooses to add a skier to the ski lift, a new Skier object is created and added to the collection with the Enqueue method. If the user chooses to remove a skier from the ski lift, a Skier object is removed by calling the Dequeue method.

Using the Stack Class

The Stack class implements a last-in, first-out data structure that stores its objects such that the last object inserted appears to be at the top of the stack, as shown in Figure 8-4.

Figure 8-4.
A stack, which is a last-in, first-out collection.

In addition to the ICollection, IEnumerable, and ICloneable interfaces, the most commonly used methods implemented by the Stack class are as follows:

  • Push  Adds a new object to the top of the stack

  • Pop  Removes and returns the object at the top of the stack

  • Peek  Returns the object at the top of the stack but doesn’t remove it from the stack

  • Clear  Removes all objects from the stack

  • Contains  Returns true if a specified object exists in the queue or false if the object isn’t found

Like the Queue class, the Stack class stores its objects in an array. As long as the array is large enough to store new objects, a call to the Push method is very efficient. If the internal array must be resized, however, a new array must be allocated and existing objects copied into it. To avoid this performance cost, you can take the same steps outlined in the previous section for the Queue class: either preallocate a large internal array, or define an appropriate growth factor that meets your performance needs.

The following source code provides an example of how the Stack class is used. The StackApp class uses a Stack object to store several strings representing various colors.

class StackApp
{
    static void Main(string[] args)
    {
        Stack colorStack = new Stack();

        colorStack.Push("Red");
        colorStack.Push("Green");
        colorStack.Push("Blue");
        colorStack.Push("Yellow");
        colorStack.Push("Orange");

        Console.WriteLine("Contents of stack...");
        foreach(string color in colorStack)
        {
            Console.WriteLine(color);
        }

        while(colorStack.Count > 0)
        {
            string color = (string)colorStack.Pop();
            Console.WriteLine("Popping {0}", color);
        }
        Console.WriteLine("Done");
    }
}

When the preceding code is compiled and executed, the output looks like this:

Contents of stack...
Orange
Yellow
Blue
Green
Red
Popping Orange
Popping Yellow
Popping Blue
Popping Green
Popping Red
Done
Using the Hashtable Class

The Hashtable class implements a data structure that supports associating objects, known as keys, with other objects, known as values. The Hashtable class gets its name from the way it stores objects internally, which is based on the hash code for keys stored in the collection.

Keys and objects stored in a Hashtable object are arranged in a series of smaller collections, as shown in Figure 8-5.

Figure 8-5.
A Hashtable structure, which contains multiple subcollections called buckets.

The hash bucket associated with a particular key is determined based on the hash code for the key. If two keys have the same hash code, the keys are always sent to the same hash bucket. When searching for an object, the Hashtable class uses the hash code of the object’s key to search for the object in only one bucket, thus making the search significantly more efficient.

The Fine Print of Hashing

It’s possible, and even likely, that some objects with different hash codes will be placed into the same hash bucket. For example, when a default Hashtable object is constructed, it has 11 hash buckets. As each key/value pair is added to the Hashtable object, the key is transformed from a 32-bit value into the index of one of the 11 hash buckets. If multiple keys are resolved into the same hash bucket index, a collision is said to occur and the entries are stored in an array. Although some collisions are inevitable, too many collisions degrade performance because they increase the likelihood of a linear search and increase the cost of inserting an item into the collection.

The Hashtable load factor controls the ratio of hash buckets to keys. When the number of keys exceeds the load factor, the number of hash buckets is increased. By default, the load factor is 1.0, which keeps the number of buckets and objects approximately equal and strikes a balance between unused hash buckets and collisions. The load factor can be defined when the Hashtable object is initially constructed, as follows:

Hashtable elements = new Hashtable(118, 0.1f);

The load factor must be between 0.1 and 1.0, with 0.1 offering the best possible performance and 1.0 providing the most compact memory footprint.

The implementation of the key’s GetHashCode method can greatly impact the performance of the collection. The worst possible case for Hashtable performance occurs when a poorly implemented hash code returns the same value for all keys in the collection, resulting in performance that’s worse than an array, and with much greater complexity.

Even worse than an inefficient hash code is a hash code that changes. If a key stored in a Hashtable is modified in such a way that its hash code varies, it might be impossible to retrieve the object from the collection at a later time. If you override GetHashCode in one of your classes, take care either to base your implementation of GetHashCode on a value that doesn’t change or to not change the value of keys while they’re stored in a Hashtable.

Interfaces Implemented by the Hashtable Class

The Hashtable class implements a number of interfaces. In addition to the commonly implemented collection interfaces ICollection, IEnumerable, and ICloneable, the Hashtable class implements the IDictionary interface. (The IDictionary interface was described earlier in this chapter, in the section “Mapping Keys to Values with the IDictionary Interface.”) The Hashtable class also implements the ISerializable and IDeserializationCallback interfaces, which are used when serializing instances of the class. Refer to MSDN for a detailed discussion about these two interfaces.

The Hashtable class offers a number of ways to interact with objects stored in the collection. Because the Hashtable class exposes an enumerator, the foreach statement can be used to iterate over all objects stored in the collection, as shown here:

struct Element
{
    public Element(string itsName, string itsSymbol)
    {
        Name = itsName;
        Symbol = itsSymbol;
    }
    public string Name;
    public string Symbol;
}

static void Main(string[] args)
{
    Hashtable elements = new Hashtable(118, 0.1f);
    Element sodium = new Element("Sodium", "Na");
    Element lead = new Element("Lead", "Pb");
    Element gold = new Element("Gold", "Au");
    elements.Add(sodium.Symbol, sodium);
    elements.Add(lead.Symbol, lead);
    elements.Add(gold.Symbol, gold);

    foreach(DictionaryEntry entry in elements)
    {
        string symbol = (string)entry.Key;
        Element elem = (Element)entry.Value;
        Console.WriteLine("{0} - {1}", symbol, elem.Name);
    }
} 

In this code, the DictionaryEntry type is used inside the foreach statement. DictionaryEntry is a structure that contains the key/value pair for a Hashtable entry.

Another way to access the objects in a Hashtable class is to use keys as indexers, as follows:

Element anElement = (Element)elements["Na"];
Console.WriteLine(anElement.Name);

The difficulty with this approach is that you must know the values of each key. Fortunately, the Hashtable class exposes the keys stored in the collection through its Keys property. The following code provides index-based access to every object in a Hashtable object by iterating over the collection returned by the Keys property and then using each key as an index to the Hashtable class:

foreach(object key in elements.Keys)
{
    Element e = (Element)elements[key];
    Console.WriteLine(e.Name);
}
Using the Specialized Collection Classes

The .NET Framework class library includes a number of specialized collection classes in the System.Collections.Specialized namespace. Some of these classes, such as the StringCollection class, are dedicated to holding specific types. Other classes solve a specific need, such as the ListDictionary class, which is optimized for small collection sets. This section introduces these two specialized classes and the StringDictionary class.

The StringCollection Class

The StringCollection class is a collection class that’s strongly typed to hold string objects. In addition to the ICollection and IEnumerable interfaces, the String­Collection class implements the IList interface, so it supports listlike access to the strings that it contains, as shown here:

static void Main(string[] args)
{
    StringCollection stringBag = new StringCollection();
    stringBag.Add("Mickey");
    stringBag.Add("Rene'");
    stringBag.Add("Ali");

    int index = stringBag.IndexOf("Ali");
    Console.WriteLine("Ali's index is {0}", index);
    stringBag.Insert(0, "Kenzie");
    foreach(string name in stringBag)
    {
        Console.WriteLine(name);
    }
}

The StringCollection class supports multiple strings with the same value, so using the Remove or IndexOf methods will affect only the first string encountered in the collection.

The ListDictionary Class

The ListDictionary class is a lightweight collection that implements IDictionary. Unlike the Hashtable class, ListDictionary is implemented as a simple linked list of items that offers improved performance for small dictionaries of 10 or fewer items. As shown in the following code, the ListDictionary class is used just like the Hashtable class presented earlier:

struct Element
{
    public Element(string itsName, string itsSymbol)
    {
        Name = itsName;
        Symbol = itsSymbol;
    }
    public string Name;
    public string Symbol;
}

    

ListDictionary dictionary = new ListDictionary();

Element sodium = new Element("Sodium", "Na");
Element lead = new Element("Lead", "Pb");
Element gold = new Element("Gold", "Au");
dictionary.Add(sodium.Symbol, sodium);
dictionary.Add(lead.Symbol, lead);
dictionary.Add(gold.Symbol, gold);

Element anElement = (Element)dictionary["Na"];
Console.WriteLine(anElement.Name);
foreach(DictionaryEntry entry in dictionary)
{
    string symbol = (string)entry.Key;
    Element elem = (Element)entry.Value;
    Console.WriteLine("{0} - {1}", symbol, elem.Name);
}

Because the ListDictionary class is manipulated using the same IDictionary interface exposed by the Hashtable class, the code used for both collections is similar. The ListDictionary class isn’t suitable for collections with more than ten items, however.

The StringDictionary Class

The StringDictionary class is similar to the Hashtable class except that it’s strongly typed to accept string keys and values. Internally, the StringDictionary class uses a private Hashtable field to contain keys and values in the collection. The StringDictionary class simply acts as a wrapper around the Hashtable class to provide strong typing.

As shown in the following code, the StringDictionary class works with string keys and values without the need for explicit casting:

StringDictionary elements = new StringDictionary();
elements.Add("Na", "Sodium");
elements.Add("Pb", "Lead");
elements.Add("Au", "Gold");

foreach(string key in elements.Keys)
{
    Console.WriteLine("{0} - {1}", key, elements[key]);
}

Although StringDictionary supports the same properties and methods found in interfaces such as ICollection and IDictionary, it doesn’t implement those interfaces. IEnumerable is the only interface StringDictionary implements. The keys in a StringDictionary collection are compared without regard to their case and are automatically converted to lowercase when they’re inserted into the collection. If you require high fidelity for your keys, you’ll need to use a different type of collection.



Part III: Programming Windows Forms