As mentioned, a variable declared as a reference type serves as a reference to an object located on the heap. As illustrated in Figure 3-3, multiple reference variables can be attached to a single object, and some reference variables might not be attached to any object. When a reference type variable is declared without assigning it to an object, its default value is null.
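The following minimal sketch illustrates the two points above: two variables attached to one object, and a reference that is attached to nothing. The Counter class and the field name unattached are hypothetical names introduced only for this illustration. Note that the null default applies to fields; a local variable must be assigned before it can be read.

```csharp
using System;

// Hypothetical reference type used only for this illustration.
class Counter
{
    public int Value;
}

class ReferenceDemo
{
    // A reference-type field that is never assigned holds null.
    static Counter unattached;

    static void Main()
    {
        Counter first = new Counter();
        Counter second = first;      // Both variables refer to the same object.

        second.Value = 42;           // The change is visible through either reference.
        Console.WriteLine(first.Value);          // 42
        Console.WriteLine(unattached == null);   // True
    }
}
```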
In Visual C#, objects are always created using the new keyword, but the objects aren’t explicitly released as they are in C or C++. The .NET Framework’s garbage collector will automatically free the memory used by unreferenced objects. In some situations, however, you must take explicit action when you’ve finished using an object—for example, when objects hold scarce resources and you can’t wait for the garbage collector to act. Such situations have implications for how you write code, as you’ll see later in this chapter, in the section “Reference Type Lifetime and Garbage Collection.”
An array is a reference type that contains a sequence of variables of a specific type. An array is declared by including index brackets between the type and the name of the array variable, as shown here:
int [] ages;
This example declares a variable named ages that’s an array of int, but it doesn’t attach that reference variable to an actual array object. To do so requires that the array be initialized, as shown here:
int [] ages = {5, 8, 39};
Arrays are reference types that the Visual C# .NET compiler automatically subclasses from the System.Array class. When an array contains value types, the space for the types is allocated as part of the array. When an array contains reference elements, the array contains only references—the objects are allocated elsewhere on the managed heap, as shown in Figure 3-4.
The individual elements of an array are accessed through an index, with 0 always referring to the first element in the array, as follows:
int currentAge = ages[0];
You can determine the number of elements in an array by using the Length property:
int elements = nameArray.Length;
An array can be cloned with the Clone method, which returns a new copy of the array. Because Clone is declared as returning an array of object, you must explicitly state the type of the new array, as follows:
string [] secondArray = (string[])nameArray.Clone();
Cloning an array creates a shallow copy. All the array elements are copied into a new array; the objects referenced by array elements aren’t copied.
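A short sketch of what a shallow copy means in practice, assuming an array of StringBuilder objects chosen purely for illustration: the clone is a separate array, but its elements refer to the same objects as the original.

```csharp
using System;
using System.Text;

class ShallowCopyDemo
{
    static void Main()
    {
        StringBuilder[] original = { new StringBuilder("red"), new StringBuilder("green") };
        StringBuilder[] copy = (StringBuilder[])original.Clone();

        // The two arrays are distinct objects...
        Console.WriteLine(Object.ReferenceEquals(original, copy));        // False
        // ...but corresponding elements refer to the same StringBuilder.
        Console.WriteLine(Object.ReferenceEquals(original[0], copy[0]));  // True

        // Changing a shared object is visible through both arrays.
        copy[0].Append("dish");
        Console.WriteLine(original[0]);   // reddish

        // Replacing a slot affects only the array that owns the slot.
        copy[1] = new StringBuilder("blue");
        Console.WriteLine(original[1]);   // green
    }
}
```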
Clear is a static method in the Array class that resets one or more array elements to their default value: 0 for numeric value types, false for bool, or null for reference types. The length of the array doesn’t change. The array to be cleared is passed as the first parameter, along with the index of the first element to clear and the number of elements to be cleared. To clear all the elements of the array, pass 0 as the start index and the array length as the third parameter, as shown here:
Array.Clear(nameArray, 0, nameArray.Length);
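As a small illustration of Clear, the sketch below resets part of an int array and all of a string array. The array contents are made up for the example. The elements are reset in place, and the array keeps its original length.

```csharp
using System;

class ClearDemo
{
    static void Main()
    {
        int[] ages = { 5, 8, 39 };
        Array.Clear(ages, 1, 2);   // Reset two elements, starting at index 1.
        Console.WriteLine("{0} {1} {2}", ages[0], ages[1], ages[2]);   // 5 0 0
        Console.WriteLine(ages.Length);                                // 3

        string[] names = { "Ann", "Bob" };
        Array.Clear(names, 0, names.Length);   // Reset every element.
        Console.WriteLine(names[0] == null);   // True
    }
}
```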
Reverse is a static method in the Array class that reverses the order of array elements, operating on either the complete array or just a subset of elements. To reverse an entire array, simply pass the array to the static method, as shown here:
Array.Reverse(nameArray);
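To reverse a range within the array, pass the array along with the index of the first element and the number of elements to be reversed. The following call passes 0 and the full length, so it covers the entire array; a smaller count would reverse only part of it: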
Array.Reverse(nameArray, 0, nameArray.Length);
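To see the range overload act on only part of an array, consider this sketch with a made-up array of letters. Only the three elements starting at index 1 change position.

```csharp
using System;

class ReverseRangeDemo
{
    static void Main()
    {
        string[] letters = { "a", "b", "c", "d", "e" };

        // Reverse three elements, starting at index 1 ("b", "c", "d").
        Array.Reverse(letters, 1, 3);

        Console.WriteLine(String.Join(" ", letters));   // a d c b e
    }
}
```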
Sort is a static method that sorts an array. There are several versions of Sort; the simplest version accepts an array as its only parameter and sorts the elements in ascending order.
Array.Sort(nameArray);
Other overloads of the Sort method allow you to exercise more control over the sorting process. The interfaces and methods used when sorting are discussed in more detail in Chapter 8.
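As one hedged example of such an overload, Array.Sort can accept an object that implements System.Collections.IComparer. The DescendingComparer class below is a hypothetical helper written for this sketch; it orders strings in descending ordinal order.

```csharp
using System;
using System.Collections;

// Hypothetical comparer for this sketch: reverses the ordinal string order.
class DescendingComparer : IComparer
{
    public int Compare(object x, object y)
    {
        // Swapping the arguments inverts the result of the comparison.
        return String.CompareOrdinal((string)y, (string)x);
    }
}

class SortDemo
{
    static void Main()
    {
        string[] names = { "pear", "apple", "fig" };
        Array.Sort(names, new DescendingComparer());
        Console.WriteLine(String.Join(", ", names));   // pear, fig, apple
    }
}
```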
The following example manipulates an array containing the names of the months. The array is examined, reversed, sorted, cloned, and finally cleared.
using System;

namespace MSPress.CSharpCoreRef.ArrayExample
{
    class ArrayExampleApp
    {
        static void Main(string[] args)
        {
            string [] months = { "January", "February", "March",
                                 "April", "May", "June",
                                 "July", "August", "September",
                                 "October", "November", "December"};

            Console.WriteLine("The array has a rank of {0}.", months.Rank);
            int elements = months.Length;
            Console.WriteLine("There are {0} elements in the array.", elements);

            Console.WriteLine("Reversing...");
            Array.Reverse(months);
            PrintArray(months);

            Console.WriteLine("Sorting...");
            Array.Sort(months);
            PrintArray(months);

            string [] secondArray = (string[])months.Clone();
            Console.WriteLine("Cloned Array...");
            PrintArray(secondArray);

            Console.WriteLine("Clearing...");
            Array.Clear(months, 0, months.Length);
            PrintArray(months);
        }

        /// <summary>
        /// Print each element in the names array.
        /// </summary>
        static void PrintArray(string[] names)
        {
            foreach(string name in names)
            {
                Console.WriteLine(name);
            }
        }
    }
}
This example uses the foreach statement to iterate over each of the array elements. You can use this type of statement in Visual C# to simplify loop programming when working with arrays. The foreach statement allows you to declare a variable that represents the current element in the array. This variable is updated on each loop iteration, with bounds checking performed automatically. Use of the foreach statement with arrays and other types is discussed in more detail in Chapter 7.
Arrays can be multidimensional, meaning that instead of extending in a simple sequence, they extend in multiple dimensions. A multidimensional array can be either rectangular, meaning that array dimensions are consistent, or jagged, meaning that array dimensions can have varying lengths. Some examples of different array types are shown in Figure 3-5.
To create a rectangular multidimensional array, the array declaration simply separates each dimension with commas, as shown here:
string [,] location = new string[5,2];   // 2-dimensional array
string [,,] locationsWithZipCode;        // 3-dimensional array
If the Length property is used with a multidimensional array, it returns the number of elements in the entire array, not just one of the dimensions. To determine the number of elements in one dimension of a multidimensional array, use the GetLength method, passing the dimension that you want tested, as follows:
int locations = names.GetLength(0); // Length of first dimension
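The difference between Length and GetLength can be seen in a short sketch that uses a 5-by-2 rectangular array like the one declared earlier.

```csharp
using System;

class RectangularDemo
{
    static void Main()
    {
        string[,] location = new string[5, 2];

        Console.WriteLine(location.Rank);           // 2  (number of dimensions)
        Console.WriteLine(location.Length);         // 10 (elements in the whole array)
        Console.WriteLine(location.GetLength(0));   // 5  (first dimension)
        Console.WriteLine(location.GetLength(1));   // 2  (second dimension)
    }
}
```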
A jagged array is declared using multiple index brackets:
double[][] polygons;
Dimensions within a jagged array are allocated just like new arrays of simpler rank.
double[][] shapes = new double[4][];
shapes[0] = new double[1] {10};             // Circle
shapes[1] = new double[4] {3, 4, 3, 4};     // Quadrilateral
shapes[2] = new double[3] {3, 4, 5};        // Triangle
shapes[3] = new double[5] {5, 5, 5, 5, 5};  // Pentagon
Jagged arrays are more flexible because any dimension can be made up of arrays with differing lengths, but allocating and navigating jagged arrays is potentially more difficult. Each edge in a jagged array must be tested for its length and navigated separately. The foreach statement can be used to simplify iteration over a jagged array, as shown here:
static void DisplayShapeInfo(double[][] shapes)
{
    int rankNumber = 0;
    foreach(double[] shape in shapes)
    {
        // Use double so that fractional side lengths aren't truncated.
        double totalLength = 0;
        foreach(double side in shape)
        {
            totalLength += side;
        }
        Console.WriteLine("Shape {0} perimeter length is {1}",
                          rankNumber, totalLength);
        ++rankNumber;
    }
}
This example uses the foreach statement to iterate over each dimension of the array. Even though the second dimension of the array is jagged, the foreach statement allows you to write a simpler loop than is possible with other loop constructions.
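For comparison, here is a sketch of the same kind of traversal written with for loops. Each inner array must be asked for its own Length, which is the extra bookkeeping that foreach hides. The sample shapes are made up for the example.

```csharp
using System;

class JaggedLoopDemo
{
    static void Main()
    {
        double[][] shapes = new double[2][];
        shapes[0] = new double[] { 3, 4, 5 };             // Triangle
        shapes[1] = new double[] { 2.5, 2.5, 2.5, 2.5 };  // Square

        for (int i = 0; i < shapes.Length; i++)
        {
            double perimeter = 0;
            // The inner bound differs from row to row.
            for (int j = 0; j < shapes[i].Length; j++)
            {
                perimeter += shapes[i][j];
            }
            Console.WriteLine("Shape {0} has {1} sides and perimeter {2}",
                              i, shapes[i].Length, perimeter);
        }
        // Shape 0 has 3 sides and perimeter 12
        // Shape 1 has 4 sides and perimeter 10
    }
}
```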
Visual C# includes a built-in string reference type that simplifies string manipulation. Strings can be created with the new operator. However, because strings are built-in types, the compiler also allows a simpler syntax, in which a string literal provides an initial value to a string reference variable, as shown here:
string name = "Mickey";
String literals are always enclosed within double quotation marks. Backslashes are escaped with a preceding backslash, so the string literal with the value C:\Windows would be as follows:
string path = "C:\\Windows";
Alternatively, you can prefix a string literal with the @ symbol to create a verbatim string literal, which the compiler evaluates without processing escape sequences, allowing you to use a simpler syntax.
string path = @"C:\Windows";
Conceptually, a string is an array of characters, so the string class allows you to access individual characters as if you are accessing an array element. This example assigns the third character from name to the char variable c:
char c = name[2];
If name is empty or doesn’t have at least three characters, an IndexOutOfRangeException exception will be thrown.
When testing strings for equality, you have two options: value comparisons and reference comparisons. To test strings for value equality, simply use the Visual C# equality operator (==), as shown here:
if(name1 == name2)
{
    // Strings match.
}
This code performs a case-sensitive test for string equality; two string variables that are set to null will always test as equivalent. The relational operators (such as < and >) aren’t defined for strings, so to determine the relative order of two strings, use the String.Compare or CompareTo method instead. The relational operators themselves are discussed in detail in Chapter 4.
To determine whether two string references point to the same object, you must explicitly test for equality using the object base class, as shown here:
if((object)path == (object)name)
{
    // The path and name variables refer to the same string object.
}
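The two kinds of comparison can give different answers for the same pair of variables. In this sketch, name2 is built at run time from a character array (an arbitrary choice made for illustration) so that it is a separate object with the same contents as name1.

```csharp
using System;

class StringEqualityDemo
{
    static void Main()
    {
        string name1 = "Mickey";
        // Constructed at run time, so this is a distinct string object.
        string name2 = new string(new char[] { 'M', 'i', 'c', 'k', 'e', 'y' });
        string name3 = name1;   // Same object as name1.

        Console.WriteLine(name1 == name2);                    // True  (same value)
        Console.WriteLine((object)name1 == (object)name2);    // False (different objects)
        Console.WriteLine((object)name1 == (object)name3);    // True  (same object)
    }
}
```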
Strings are concatenated using the addition operator (+), just like other built-in types. However, Visual C# strings are immutable—once a string is created, its value never changes. The simple act of concatenating a string results in the creation of a new string object, as shown here:
path += fileName;                    // Concatenation
string fullPath = path + fileName;   // Addition
In both examples, adding fileName to path causes a new string to be created. In the first line, the new string object is assigned to the path string reference. In the second example, the path string reference isn’t modified and the new string is assigned to the fullPath string reference.
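A brief sketch of immutability in action, using made-up path values: a second reference to the original string still sees the old value after the concatenation, because += attaches path to a new object instead of changing the existing one.

```csharp
using System;

class ImmutableStringDemo
{
    static void Main()
    {
        string path = @"C:\Windows\";
        string original = path;   // Second reference to the same string object.

        path += "notepad.exe";    // Creates a new string and reassigns path.

        Console.WriteLine(original);   // C:\Windows\
        Console.WriteLine(path);       // C:\Windows\notepad.exe
        Console.WriteLine((object)original == (object)path);   // False
    }
}
```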
As you’ve seen in Chapter 2, Visual C# uses the new keyword to create new instances of reference types. You usually don’t take any action to explicitly release an object because the .NET Framework uses a process known as garbage collection to automatically free objects that are no longer in use. For the most part, this process occurs automatically, and you’ll rarely be aware of it. However, garbage collection is such a fundamental part of the .NET Framework that understanding the mechanics of garbage collection will help you to write much more efficient code.
The basic ideas behind garbage collection are simple enough, and many types of systems today use some form of garbage collection. All garbage collection mechanisms have one thing in common: they absolve the programmer from the responsibility of tracking memory usage. Although most garbage collectors require that applications occasionally pause to reclaim memory that’s no longer used, the garbage collector used to manage memory in the .NET Framework is highly efficient.
The garbage collector in .NET is known as a generational garbage collector, meaning that allocated objects are sorted into three groups, or generations. Objects that have been most recently allocated are placed in generation zero. Generation zero provides fast access to its objects because the size of generation zero is small enough to fit into the processor’s L2 cache. Objects in generation zero that survive a garbage collection pass are moved into generation one. Objects in generation one that survive a collection pass are moved into generation two. Generation two contains long-lived objects that have survived at least two collection passes.
The details of the garbage collection pass will be discussed in the next section, but the basic idea is simple: look for unused objects, remove them from memory, and compact the managed heap to recover the space that the unused objects were occupying. After the heap is compacted, all object references are adjusted to point to the new object locations.
When an object is allocated by a Visual C# program, the managed heap will almost instantly return the memory required for the new object. The reason for the extremely fast allocation performance is that the managed heap is not a complex data structure. The managed heap resembles a simple byte array, with a pointer to the first available memory location, as shown in Figure 3-6.
When a block of memory is requested for an object, the value of the pointer is returned to the caller, and the pointer is adjusted to point to the next available memory location. Allocation of a managed block of memory is only slightly more complex than simply incrementing a pointer. This is one of the performance wins that the managed heap offers you. In an application that doesn’t require much garbage collection, the managed heap will outperform a traditional heap.
Due to this linear allocation method, objects that are allocated together in a Visual C# application tend to be allocated near each other on the managed heap. This arrangement differs significantly from traditional heap allocation, in which memory blocks are allocated based on their size. For example, two objects allocated at the same time might be allocated far apart from each other on the heap, reducing cache performance.
So allocation is very fast, but in a nontrivial program, the memory available in generation zero will eventually be exhausted. Remember, generation zero fits inside the L2 cache and no unused memory is being spontaneously returned. Until now, you’ve seen only how .NET simply increments a pointer when a program needs more memory. Although this approach is very efficient, it obviously can’t continue forever.
When no more memory can be allocated in generation zero, a garbage collection pass will be initiated on generation zero, which removes any objects that are no longer referenced and moves currently used objects into generation one. Promoting referenced objects into generation one frees up generation zero for new allocations. A collection pass for generation zero is the most common type of collection, and it’s very fast. A generation one collection pass is performed if a generation zero collection pass isn’t sufficient to reclaim memory. As a last resort, a generation two collection pass is performed only when collections on generations zero and one haven’t freed enough memory. If no memory is available after a complete collection pass of all generations, an OutOfMemoryException is thrown.
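Promotion between generations can be observed with a sketch like the following, which forces collections with GC.Collect and reports the generation of a surviving object with GC.GetGeneration. The exact numbers printed depend on the runtime and its configuration, so no output is shown; on a typical desktop runtime the object is expected to move toward GC.MaxGeneration.

```csharp
using System;

class GenerationDemo
{
    static void Main()
    {
        object survivor = new object();
        Console.WriteLine("Highest generation: {0}", GC.MaxGeneration);
        Console.WriteLine("After allocation: generation {0}",
                          GC.GetGeneration(survivor));

        // Each forced collection gives the object a chance to be promoted.
        GC.Collect();
        Console.WriteLine("After one collection: generation {0}",
                          GC.GetGeneration(survivor));

        GC.Collect();
        Console.WriteLine("After two collections: generation {0}",
                          GC.GetGeneration(survivor));

        // Keep the object referenced until this point so it isn't collected early.
        GC.KeepAlive(survivor);
    }
}
```

Forcing collections this way is useful for experiments; production code rarely needs it.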
A class can expose a finalizer that executes when the object is destroyed, subject to conditions that we’ll look at later in this section. In the .NET Framework, the finalizer is a protected method named Finalize that overrides Object.Finalize. Visual C# doesn’t allow you to override Finalize or to call base.Finalize directly. Instead, you declare a destructor, as shown here:
~ResourceConnector()
{
    // Clean up external resources.
}
From the destructor, the Visual C# .NET compiler generates a well-formed finalizer: a protected override of Finalize that runs your cleanup code inside a try block and then calls the finalizer for the base class in the finally clause. The base class is therefore finalized after your own code, even if your code throws an exception. A destructor takes no parameters and no access modifiers, and you never call it yourself because it’s called only by the .NET Framework.
Attempting to declare a destructor and a Finalize method in the same class will result in an error.
Keep in mind that finalizers, and therefore Visual C# destructors, aren’t guaranteed to execute at any specific time, and they might not even execute at all in some circumstances. The .NET Framework can’t guarantee that it will call an object’s destructor or finalizer in a timely fashion because of the way it executes the finalization process. When an object with a finalizer is collected, it’s not immediately removed from memory. Instead, a reference to the object is placed in a special queue that contains objects waiting for finalization.
A dedicated thread is responsible for executing the finalizer for each object in the finalization queue. This thread then marks the object as no longer requiring finalization and removes the object from the finalization queue. Until finalization is complete, the queue’s reference to the object is sufficient to keep the object alive. After finalization has been completed, the object will be reclaimed during the next garbage collection pass.
There’s no guaranteed order for finalization. When an object is finalized, other objects that it refers to might have already been finalized. During finalization, you can safely free external resources such as operating system handles or database connections, but objects on the managed heap shouldn’t be referenced.
Avoid creating finalizers whenever possible. Objects with finalizers are more costly to the .NET Framework than objects without finalizers. They also maintain their existence through at least two garbage collection passes, increasing the memory pressure on the runtime.
Instead of including a finalizer, consider exposing a Dispose method that can be called to properly free your object’s resources, as shown in the following code. Classes that handle files and connections often name this method Close. You can use this method to free any resources that you’re holding, including any managed object references.
public void Dispose()
{
    // Clean up owned resources.
}
If you clean up your object using a Dispose or Close method, you should indicate to the runtime that your object no longer requires finalization by calling GC.SuppressFinalize, as shown here:
public void Dispose()
{
    tools.Dispose();
    statusBar.Dispose();
    dbConnection.Dispose();
    GC.SuppressFinalize(this);
}
If you’re creating and using objects that have Dispose or Close methods, you should call these methods when you’ve finished using the objects. A good place to make these calls is in a finally clause, which guarantees that the objects are properly handled even if an exception is thrown.
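A minimal sketch of the finally approach, using System.IO.StringReader as a stand-in for any object that exposes a Close method. The choice of StringReader and the sample text are assumptions made for illustration.

```csharp
using System;
using System.IO;

class FinallyDemo
{
    static void Main()
    {
        StringReader reader = null;
        try
        {
            reader = new StringReader("first line\nsecond line");
            Console.WriteLine(reader.ReadLine());   // first line
        }
        finally
        {
            // Runs whether or not the try block threw an exception.
            if (reader != null)
            {
                reader.Close();
            }
        }
    }
}
```

The null test matters because the constructor itself might throw, in which case there is nothing to close.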
If you implement a Dispose or Close method, you still need to implement a finalizer if your class has external resources that aren’t allocated from the managed heap. In the ideal case, your public Dispose method will properly clean up resources and suppress finalization, resulting in efficient cleanup of your object. If a user of your class forgets to call Dispose, the finalizer might be called and will act as a safety net to ensure that your external resources will be freed.
Creating and properly disposing of an object requires several lines of correctly written code. A mistake in implementing this code can cause errors that are difficult to trace, so the Visual C# language offers a more automated solution based on using the IDisposable interface.
IDisposable defines one method: Dispose. Implementing this interface is the preferred way for a class to advertise that it’s exposing a method for proper object cleanup. A typical implementation of IDisposable is shown here:
public class ResourceConnector: IDisposable
{
    ~ResourceConnector()
    {
        Dispose(false);
    }

    public void Dispose()
    {
        Dispose(true);
    }

    protected void Dispose(bool disposing)
    {
        if(disposing)
        {
            GC.SuppressFinalize(this);
            // Dispose of managed objects if disposing.
        }
        // Release our external resources here.
    }
}
When the public Dispose method is called, the object is being properly freed by a client. External resources are freed, and GC.SuppressFinalize is called as an optimization step to prevent finalization. Because the object is being disposed of explicitly, it’s also appropriate for it to dispose of the objects it owns.
If the finalizer is called by the .NET Framework, the call to GC.SuppressFinalize isn’t needed because the object is already being finalized. In addition, it’s not appropriate to reference any managed objects because these objects may have been finalized or even collected already.
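The sketch below is one possible variation on this pattern, not the only correct form. It adds a disposed flag (an assumption of this sketch, not part of the listing above) so that calling Dispose more than once is harmless, and it prints messages in place of real cleanup work.

```csharp
using System;

public class ResourceConnector : IDisposable
{
    // Added for this sketch so that repeated Dispose calls do nothing.
    private bool disposed = false;

    ~ResourceConnector()
    {
        Dispose(false);   // Finalizer path: touch only external resources.
    }

    public void Dispose()
    {
        Dispose(true);    // Client path: managed objects can be cleaned up too.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed)
        {
            return;
        }
        if (disposing)
        {
            Console.WriteLine("Disposing managed objects.");
        }
        Console.WriteLine("Releasing external resources.");
        disposed = true;
    }
}

class DisposeDemo
{
    static void Main()
    {
        ResourceConnector rc = new ResourceConnector();
        rc.Dispose();   // Prints both messages.
        rc.Dispose();   // Prints nothing; the object is already disposed.
    }
}
```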
Classes that implement IDisposable can take advantage of a Visual C# language feature that assists in proper disposal. The using statement works with the IDisposable interface to simplify the process of writing client code that correctly cleans up objects that require finalization. The using statement guarantees that the Dispose method is called even if exceptions occur. Consider the following code:
using(ResourceConnector rc = new ResourceConnector())
{
    rc.UseResource();
    // rc.Dispose called automatically.
}
The using statement has two sections: the allocation expression is located between the parentheses, and the code block that follows provides scoping. After the code block has finished executing, the Dispose method will be called for the allocated object.
The Visual C# .NET compiler will generate code equivalent to the following code written without the using statement:
ResourceConnector rc = null;
try
{
    rc = new ResourceConnector();
    // Use rc here.
}
finally
{
    if(rc != null)
    {
        IDisposable disp = rc as IDisposable;
        disp.Dispose();
    }
}
As you can see, the code written with the using statement is clearer and less error-prone.
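To check that Dispose really runs when the block exits through an exception, the following sketch uses a hypothetical Noisy class that announces its own disposal. The exception type and messages are arbitrary choices for the example.

```csharp
using System;

// Hypothetical class for this sketch: reports when it is disposed.
class Noisy : IDisposable
{
    public void Dispose()
    {
        Console.WriteLine("Dispose called.");
    }
}

class UsingDemo
{
    static void Main()
    {
        try
        {
            using (Noisy n = new Noisy())
            {
                throw new InvalidOperationException("failure inside the block");
            }
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine("Caught: {0}", e.Message);
        }
        // Output:
        // Dispose called.
        // Caught: failure inside the block
    }
}
```

Dispose runs before the catch handler's body because the using statement's hidden finally clause executes as the exception leaves the block.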
Multiple objects of a single type can be allocated in a single using expression by simply separating the allocation expressions with commas, as shown here:
using(SolidBrush greenBrush = new SolidBrush(Color.Green),
                 redBrush = new SolidBrush(Color.Red))
{
}
The System.GC class contains static methods that are used to interact with the garbage collection mechanism, including methods to initiate a garbage collection pass, to determine an object’s current generation, and to determine the amount of allocated memory.
The most frequently used System.GC method is SuppressFinalize, shown in the following code.
GC.SuppressFinalize(this);
The SuppressFinalize method prevents an object from being finalized and optimizes the performance of the garbage collector. You should call this method when your object is disposed of via a Dispose or Close method.
Collect is used to programmatically initiate a garbage collection pass. There are two versions of Collect. The version with no parameters, shown here, performs a full collection:
GC.Collect();
The more useful version of Collect allows you to specify the generation to be collected. This flexibility enables you to quickly reclaim generation zero if you’ve recently used and freed a number of temporary objects:
GC.Collect(0);
A generation one or two collection pass always includes any lower generations, so calling Collect(2) will cause a full garbage collection pass.
The GetGeneration method will return the current generation of an object passed as a parameter:
int myGeneration = GC.GetGeneration(this);
GetGeneration is useful for tracking objects as they interact with the garbage collector and for auditing memory usage. Like the GetTotalMemory method, however, GetGeneration has limited value in code not dedicated to tracing or debugging.
GetTotalMemory returns the amount of memory allocated on the managed heap. Depending on the parameter you pass to the function, the function’s return value might not be a precise number, due to the way the managed heap works. If unreferenced objects that the garbage collector hasn’t yet reclaimed exist on the heap, GetTotalMemory might return a number that’s larger than the number of currently allocated bytes. To get a more exact number, this method allows you to pass a bool parameter that specifies whether a collection is to be initiated before the measurement, as follows:
long totalMemory = GC.GetTotalMemory(true);
Passing true as a parameter causes a full garbage collection pass before the managed heap’s size is calculated. Passing false simply returns the size of the heap without attempting to collect or compact any unused space.
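A rough sketch of how GetTotalMemory might be used to watch the heap grow and shrink around a large allocation. The buffer size is arbitrary, and the figures reported vary by runtime and machine, so no output is shown.

```csharp
using System;

class TotalMemoryDemo
{
    static void Main()
    {
        // Collect first to get a stable baseline.
        long baseline = GC.GetTotalMemory(true);

        byte[] buffer = new byte[1000000];
        // No collection requested: a fast, approximate reading.
        long whileAllocated = GC.GetTotalMemory(false);
        Console.WriteLine("Buffer holds {0} bytes.", buffer.Length);

        buffer = null;   // Drop the only reference to the buffer.
        long afterRelease = GC.GetTotalMemory(true);

        Console.WriteLine("Baseline:        {0}", baseline);
        Console.WriteLine("While allocated: {0}", whileAllocated);
        Console.WriteLine("After release:   {0}", afterRelease);
    }
}
```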