Chapter 14: Memory Management

Overview

Memory management is an essential aspect of all but the most trivial applications. There are different classifications of memory: registers, static data area (SDA), stack, thread local storage, heap, virtual memory, and file storage. Registers hold data that requires the fastest possible access. Critical system information, such as the instruction pointer and the stack pointer, is stored in registers. Static and global values are automatically stored in the SDA. Stacks are thread-specific and hold the context information (stack frames) of the functions currently executing. Local variables, parameters, return values, the return address into the calling function, and other function-related information are placed on the stack. Thread local storage (TLS) is also thread-specific. The TLS table provides at least 64 slots of pointer-sized values (32 bits in a 32-bit process) that contain thread-specific data. TLS slots frequently hold pointers to blocks of data that belong to the thread.

A heap is memory allocated at run time from the virtual memory of an application and controlled by the heap manager. An application can have more than one heap. Large objects are commonly placed on the heap, whereas small objects are typically located on the stack. Virtual memory is raw memory that developers directly manipulate at run time with minimal assistance from the environment. Virtual memory is well suited to collections of differently sized data stored at noncontiguous memory addresses, such as a linked list.

Developers of managed applications can directly affect the stack and the managed heap. The other forms of memory, such as registers and virtual memory, are largely unavailable except through interoperability. Instances of value types that are declared as local values are created on the stack. Instances of reference types (objects) are created on the managed heap. The lifetime of a local value is bounded by its scope. When a local value goes out of scope, it is removed from the stack. For example, the local variables and parameters of a function are removed from memory when the function exits. The lifetime of an object, which resides on the managed heap, is controlled by the garbage collector (GC), which is a component of the Common Language Runtime (CLR). The GC periodically performs garbage collection to clear unused objects from memory.
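The practical difference between a local value and an object is easiest to see in how assignment behaves. The following is a minimal sketch, not a definitive illustration of CLR internals; the type names PointValue and PointObject and the class ValueVersusReference are hypothetical and exist only for this example.

```csharp
using System;

// Hypothetical types, introduced only for this illustration.
struct PointValue { public int X; }   // value type: the variable holds the data itself
class PointObject { public int X; }   // reference type: the variable holds a reference

static class ValueVersusReference
{
    // Assigning a value type copies the data, so the original is unaffected.
    public static int MutateValueCopy()
    {
        PointValue original = new PointValue { X = 1 };
        PointValue copy = original;   // independent copy of the value
        copy.X = 2;
        return original.X;            // still 1
    }

    // Assigning a reference type copies only the reference, so both
    // variables refer to the same object on the managed heap.
    public static int MutateReferenceCopy()
    {
        PointObject original = new PointObject { X = 1 };
        PointObject alias = original; // second reference to the same object
        alias.X = 2;
        return original.X;            // now 2
    }

    static void Main()
    {
        Console.WriteLine(MutateValueCopy());     // prints 1
        Console.WriteLine(MutateReferenceCopy()); // prints 2
    }
}
```

When each method returns, its local variables leave scope immediately. The PointObject instance, however, remains on the managed heap until the GC decides to collect it.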

The policies and best practices of the GC and garbage collection are the primary focus of this chapter. Although garbage collection is not language-specific, the memory management tradition of C-based developers is somewhat different from that of other developers, particularly Microsoft Visual Basic and Java developers. The memory model employed in Visual Basic and Java is superficially similar to that of the managed environment. However, the memory model of earlier C-based languages is dissimilar to this environment. These differences make this chapter especially important.

Developers in C-based languages are accustomed to deterministic memory management (sometimes described as deterministic garbage collection), in which developers explicitly set the lifetime of an object. The malloc/free function pair and the new/delete operator pair create and destroy objects that reside on a heap. Managing the memory of a heap requires programmer discipline, which has proved insufficient for guaranteeing consistently robust code. Memory leaks and other problems were common. These leaks could eventually destabilize the application and cause complete application failure. Instead of each developer individually struggling with these issues, the managed environment has the GC, which is always present and controls the lifetime of objects located on the managed heap.

The GC offers nondeterministic garbage collection. Developers explicitly allocate memory for objects. However, the GC determines when garbage collection is performed and when unused objects are removed from memory.
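Nondeterministic collection can be observed, with some care, through a weak reference, which tracks an object without keeping it alive. The sketch below is an illustration under stated assumptions rather than a guarantee of GC behavior: the class and method names are hypothetical, and forcing a collection with GC.Collect is done here only to make the effect visible. Whether the dropped object is reclaimed at that point depends on the runtime and build settings.

```csharp
using System;
using System.Runtime.CompilerServices;

static class NondeterministicCollection
{
    // Allocates an object and returns only a weak reference to it.
    // NoInlining keeps the strong local confined to this method's frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference AllocateAndDrop()
    {
        object temp = new object();
        return new WeakReference(temp);
    }

    // An object that is still strongly referenced survives a collection.
    public static bool SurvivesWhileReferenced()
    {
        object strong = new object();
        WeakReference weak = new WeakReference(strong);
        GC.Collect();
        bool alive = weak.IsAlive;
        GC.KeepAlive(strong); // keeps 'strong' reachable up to this point
        return alive;
    }

    // An object with no remaining strong references is only a candidate
    // for collection; the GC decides when it is actually reclaimed.
    public static bool IsCollectedAfterForcedGC()
    {
        WeakReference weak = AllocateAndDrop();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return !weak.IsAlive;
    }

    static void Main()
    {
        Console.WriteLine(SurvivesWhileReferenced());
        Console.WriteLine(IsCollectedAfterForcedGC());
    }
}
```

Production code rarely needs to call GC.Collect. The point of the sketch is that the developer controls allocation, while reclamation happens at a time of the GC's choosing.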

When memory for an object is allocated at run time, a reference to that object is returned. The new operator requests that an instance of a type (an object) be placed on the managed heap. A reference is an indirect pointer to that object. This indirection helps the GC transparently manage the managed heap, including relocating objects and updating the references to them when necessary.

In .NET, unused objects are eventually removed from memory. When is an object unused? Reference counting is not performed in the managed environment. Reference counting was the norm for Component Object Model (COM) components. When the reference count reached zero, the related COM component was considered no longer relevant and was removed from memory. There were many problems with this model. First, it required careful pairing of calls to the AddRef and Release methods. A mismatch between the two could cause memory leaks and exceptions. Second, reference counting was expensive because it was applied to every component, whether or not it was a candidate for collection. Finally, programs incurred the overhead of reference counting even when there was no memory stress on the application. For these reasons, reference counting was deservedly abandoned for a more efficient model that addresses the memory concerns of modern applications. When there is memory stress in the managed environment, garbage collection occurs, and a graph of the objects still reachable from the application's roots is built. Objects not on the graph become candidates for collection.
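The idea of building an object graph and treating everything outside it as collectable can be sketched with a toy model. This is not how the CLR represents or traverses objects; ToyObject, ToyReachability, and FindUnreachable are hypothetical names invented for this illustration. The example includes two objects that reference each other but are unreachable from any root, a case that simple reference counting cannot reclaim because neither count ever reaches zero.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model of a heap object; not the CLR's representation.
class ToyObject
{
    public readonly string Name;
    public readonly List<ToyObject> References = new List<ToyObject>();
    public ToyObject(string name) { Name = name; }
}

static class ToyReachability
{
    // Marks every object reachable from the roots, then returns the names
    // of the objects in 'heap' that were never marked.
    public static List<string> FindUnreachable(
        IEnumerable<ToyObject> roots, IEnumerable<ToyObject> heap)
    {
        HashSet<ToyObject> marked = new HashSet<ToyObject>();
        Stack<ToyObject> pending = new Stack<ToyObject>(roots);
        while (pending.Count > 0)
        {
            ToyObject current = pending.Pop();
            if (!marked.Add(current)) continue; // already on the graph
            foreach (ToyObject next in current.References) pending.Push(next);
        }

        List<string> unreachable = new List<string>();
        foreach (ToyObject candidate in heap)
            if (!marked.Contains(candidate)) unreachable.Add(candidate.Name);
        return unreachable;
    }

    public static List<string> Demo()
    {
        ToyObject a = new ToyObject("a");
        ToyObject b = new ToyObject("b");
        ToyObject c = new ToyObject("c");
        ToyObject d = new ToyObject("d");
        a.References.Add(b); // a is a root and refers to b
        c.References.Add(d); // c and d refer only to each other
        d.References.Add(c);
        return FindUnreachable(new[] { a }, new[] { a, b, c, d });
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo())); // prints c, d
    }
}
```

In this model, c and d each have a nonzero reference count, yet neither is on the graph built from the root, so both become candidates for collection.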