8.1 Garbage Collection

Efficient memory management is essential in a runtime system. Storage for objects is allocated in a designated part of memory called the heap. The size of the heap is finite. Garbage collection is a process of managing the heap efficiently; that is, reclaiming memory occupied by objects that are no longer needed and making it available for new objects. Java provides automatic garbage collection, meaning that the runtime environment takes care of memory management for objects without the program having to take any special action. Storage allocated on the heap through the new operator is administered by the automatic garbage collector. The automatic garbage collection scheme guarantees that a reference to an object is always valid while the object is needed in the program. In other words, an object will not be reclaimed while it is still needed, which would otherwise leave the reference dangling.

Having an automatic garbage collector frees the programmer from the responsibility of providing code for deleting objects. In relying on the automatic garbage collector, a Java program also forfeits any significant influence on the garbage collection of its objects (see p. 327). However, this price is insignificant when compared to the cost of putting the code for object management in place and plugging all the memory leaks. Developers of time-critical applications should bear in mind that the automatic garbage collector runs as a background task and may prove detrimental to performance.

Reachable References

An automatic garbage collector essentially performs two tasks:

  • decide if and when memory needs to be reclaimed

  • find objects that are no longer needed by the program and reclaim their storage

A program has no guarantees that the automatic garbage collector will be run during its execution. A program should not rely on the scheduling of the automatic garbage collector for its behavior (see p. 327).

In order to understand how the automatic garbage collector finds objects whose storage should be reclaimed, we need to look at the activity going on in the JVM. Java provides thread-based multitasking, meaning there can be several threads executing in the JVM, each doing its own task (see Chapter 9). A thread is an independent path of execution through the program code. A thread is alive if it has not completed its execution. Each live thread has its own runtime stack, as explained in Section 5.5 on page 181. The runtime stack contains activation records of methods that are currently active. Local references declared in a method can always be found in the method's activation record, on the runtime stack associated with the thread in which the method is called. Objects, on the other hand, are always created in the heap. If an object has a field reference, then the field is to be found inside the object in the heap, and the object denoted by the field reference is also to be found in the heap.
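The distinction between references on the runtime stack and objects in the heap can be sketched in code. The classes Cell and StackAndHeapSketch below are hypothetical and serve only as an illustration; the comments indicate where each variable would be stored according to the description above.

```java
// Hypothetical classes, for illustration only.
class Cell {
    int  value;   // field: stored inside the Cell object, in the heap
    Cell next;    // field reference: stored inside the object, denotes another heap object (or null)

    Cell(int value, Cell next) {
        this.value = value;
        this.next  = next;
    }
}

class StackAndHeapSketch {
    // The parameter first is a local reference in the activation record of sum().
    static int sum(Cell first) {
        int total = 0;                                 // local variable in the same activation record
        for (Cell c = first; c != null; c = c.next) {  // c is another local reference
            total += c.value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Both Cell objects are created in the heap. The local references tail and head
        // live in the activation record of main(), on the runtime stack of the main thread.
        Cell tail = new Cell(2, null);
        Cell head = new Cell(1, tail);   // head.next and tail are aliases of the same object
        System.out.println(sum(head));   // prints 3
    }
}
```

While sum() executes, the runtime stack holds two activation records (for main() and sum()), and the first Cell object is denoted by a local reference in each of them.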

An example of how memory is organized during execution is depicted in Figure 8.1. It shows two live threads (t1 and t2) and their respective runtime stacks with the activation records. The diagram shows which objects in the heap are referenced by local references in the method activation records. The diagram also shows field references in objects, which denote other objects in the heap. Some objects have several aliases.

Figure 8.1. Memory Organization at Runtime


An object in the heap is said to be reachable if it is denoted by any local reference in a runtime stack. Additionally, any object that is denoted by a reference in a reachable object is also said to be reachable. Reachability is a transitive relation. Thus, a reachable object has at least one chain of reachable references from the runtime stack. Any reference that makes an object reachable is called a reachable reference. An object that is not reachable is said to be unreachable.

A reachable object is alive. It is accessible by the live thread that owns the runtime stack. Note that an object can be accessible by more than one thread. Any object that is not accessible by a live thread is a candidate for garbage collection. When an object becomes unreachable and is waiting for its memory to be reclaimed, it is said to be eligible for garbage collection. An object is eligible for garbage collection if all references denoting it are in eligible objects. Eligible objects do not affect the future course of program execution. When the garbage collector runs, it finds and reclaims the storage of eligible objects. However, garbage collection does not necessarily occur as soon as an object becomes unreachable.

From Figure 8.1 we see that objects o4, o5, o11, o12, o14, and o15 all have reachable references. Objects o13 and o16 have no reachable references and are, therefore, eligible for garbage collection.

From the discussion above we can conclude that if a composite object becomes unreachable, then its constituent objects also become unreachable, barring any reachable references to the constituent objects. Although objects o1, o2, and o3 form a circular list, they do not have any reachable references. Thus, these objects are all eligible. On the other hand, objects o5, o6, and o7 form a linear list, but they are all reachable, as the first object in the list, o5, is reachable. Objects o8, o10, o11, and o9 also form a linear list (in that order), but not all objects in the list are reachable. Only objects o9 and o11 are reachable, as object o11 has a reachable reference. Objects o8 and o10 are eligible for garbage collection.
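The transitive nature of reachability can be simulated with a simple graph traversal. The following is a minimal sketch, not a description of how any particular JVM implements its collector. It models only the three lists discussed above, and it assumes that o5 and o11 are denoted directly by local references and that the circular list links o1 to o2, o2 to o3, and o3 back to o1. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilitySketch {
    // Returns the names of all objects reachable from the roots,
    // following field references transitively.
    static Set<String> reachableFrom(Collection<String> roots,
                                     Map<String, List<String>> refs) {
        Set<String> marked = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) {   // visit each object once, so cycles terminate
                pending.addAll(refs.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    // Returns the modelled objects that are not reachable from the roots.
    static Set<String> eligible(Collection<String> roots,
                                Map<String, List<String>> refs) {
        Set<String> result = new TreeSet<>(refs.keySet());
        result.removeAll(reachableFrom(roots, refs));
        return result;
    }

    // Field references among the objects named in the text (assumed layout).
    static Map<String, List<String>> fieldReferences() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("o1", Arrays.asList("o2"));    // circular list o1 -> o2 -> o3 -> o1
        refs.put("o2", Arrays.asList("o3"));
        refs.put("o3", Arrays.asList("o1"));
        refs.put("o5", Arrays.asList("o6"));    // linear list o5 -> o6 -> o7
        refs.put("o6", Arrays.asList("o7"));
        refs.put("o7", Collections.<String>emptyList());
        refs.put("o8", Arrays.asList("o10"));   // linear list o8 -> o10 -> o11 -> o9
        refs.put("o10", Arrays.asList("o11"));
        refs.put("o11", Arrays.asList("o9"));
        refs.put("o9", Collections.<String>emptyList());
        return refs;
    }

    public static void main(String[] args) {
        List<String> roots = Arrays.asList("o5", "o11");   // assumed local references
        Map<String, List<String>> refs = fieldReferences();
        System.out.println("Reachable: " + reachableFrom(roots, refs));
        System.out.println("Eligible:  " + eligible(roots, refs));
    }
}
```

Under these assumptions the traversal marks o5, o6, o7, o9, and o11 as reachable, and leaves o1, o2, o3, o8, and o10 as eligible, in agreement with the discussion above. Note that the circular list is found to be eligible even though each of its objects is still denoted by a reference.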

The lifetime of an object is the time from when it is created to the time it is garbage collected. Under normal circumstances, an object is accessible from the time when it is created to the time when it is unreachable. The lifetime of an object can also include a period when it is eligible for garbage collection, waiting for its storage to be reclaimed. The finalization mechanism (see p. 324) in Java does provide a means for resurrecting an object after it is eligible for garbage collection, but the finalization mechanism is rarely used for this purpose.

In the garbage collection scheme discussed above, an object remains reachable as long as there is a reference to it from running code. Using strong references (the technical name for the normal kind of references) can prove to be a handicap in certain situations. An application that uses a clipboard would most likely want its clipboard accessible at all times, but it would not mind if the contents of the clipboard were garbage collected when memory became low. This would not be possible if strong references were used to refer to the clipboard's contents.

The abstract class java.lang.ref.Reference and its concrete subclasses (SoftReference, WeakReference, PhantomReference) provide reference objects that can be used to maintain more sophisticated kinds of references to another object (called the referent). A reference object introduces an extra level of indirection, so that the program does not access the referent directly. The automatic garbage collector knows about reference objects and can reclaim the referent if it is only reachable through reference objects. The concrete subclasses implement references of various strength and reachability, which the garbage collector takes into consideration.
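The clipboard scenario mentioned above might be sketched with a SoftReference as follows. The class SoftClipboard and its methods copy() and paste() are hypothetical names chosen for illustration. Whether and when the garbage collector clears the soft reference depends on the JVM and on memory pressure, so the sketch makes no assumption about that outcome.

```java
import java.lang.ref.SoftReference;

// Hypothetical clipboard that does not prevent its contents from being reclaimed.
class SoftClipboard {
    // The clipboard itself is strongly reachable; its contents are only softly referenced.
    private SoftReference<Object> contents = new SoftReference<>(null);

    void copy(Object data) {
        contents = new SoftReference<>(data);   // extra level of indirection to the referent
    }

    // Returns the contents, or null if nothing was copied
    // or the garbage collector has reclaimed the referent.
    Object paste() {
        return contents.get();
    }

    public static void main(String[] args) {
        SoftClipboard clipboard = new SoftClipboard();
        byte[] image = new byte[1_000_000];
        clipboard.copy(image);
        image = null;   // the array is now reachable only through the reference object

        Object pasted = clipboard.paste();
        System.out.println(pasted != null
                ? "Clipboard still holds the data."
                : "Clipboard contents were reclaimed.");
    }
}
```

A caller of paste() must always be prepared for a null result, since the referent can be reclaimed at any time once no strong references to it remain.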

Facilitating Garbage Collection

The automatic garbage collector figures out which objects are not reachable and, therefore, eligible for garbage collection. It will certainly go to work if there is a danger of running out of memory. Although the automatic garbage collector tries to run unobtrusively, certain programming practices can nevertheless help in minimizing the overhead associated with garbage collection during program execution. Automatic garbage collection should not be perceived as a license for uninhibited creation of objects and forgetting about them.

Certain objects, such as files and net connections, can tie up other resources and should be disposed of properly when they are no longer needed. In most cases, the finally block in the try-catch-finally construct (see Section 5.7, p. 188) provides a convenient facility for such purposes, as it will always be executed, thereby ensuring proper disposal of any unwanted resources.
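The following sketch shows the disposal pattern in isolation. The class TrackedResource is a hypothetical stand-in for a file or network connection; it merely records whether it has been closed, so that the effect of the finally block can be observed.

```java
// Hypothetical resource that records whether it has been disposed of.
class TrackedResource {
    private boolean open = true;

    void write(String data) {
        if (!open) throw new IllegalStateException("resource is closed");
        if (data == null) throw new IllegalArgumentException("nothing to write");
        // ... a real resource would transfer the data here ...
    }

    void close()     { open = false; }
    boolean isOpen() { return open; }
}

class DisposalSketch {
    static void writeAndDispose(TrackedResource resource, String data) {
        try {
            resource.write(data);   // may throw an exception
        } finally {
            resource.close();       // executed whether or not write() threw
        }
    }

    public static void main(String[] args) {
        TrackedResource ok = new TrackedResource();
        writeAndDispose(ok, "payload");
        System.out.println("Open after normal completion: " + ok.isOpen());   // false

        TrackedResource failing = new TrackedResource();
        try {
            writeAndDispose(failing, null);   // write() throws IllegalArgumentException
        } catch (IllegalArgumentException e) {
            System.out.println("Open after exception: " + failing.isOpen());  // false
        }
    }
}
```

In both cases the resource is closed before control leaves writeAndDispose(), independently of when, or whether, the object itself is later garbage collected.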

To optimize its memory footprint, a live thread should only retain access to an object as long as the object is needed for its execution. The program can make an object eligible for garbage collection as early as possible by removing all references to it when it is no longer needed.

Objects that are created and accessed by local references in a method are eligible for garbage collection when the method terminates, unless reference values to these objects are exported out of the method. This can occur if a reference value is returned from the method, passed as argument to another method that records the reference, or thrown as an exception. However, a method need not always leave objects to be garbage collected after its termination. It can facilitate garbage collection by taking suitable action, for example, by nulling references.

import java.io.*;

class WellbehavedClass {
    // ...
    void wellbehavedMethod() {

        File aFile;
        long[] bigArray = new long[20000];

        // ... uses local variables ...

        // Does cleanup (before starting something extensive)
        aFile = null;                    // (1)
        bigArray = null;                 // (2)

        // Start some other extensive activity
        // ...
    }
    // ...
}

In the previous code, the local variables are set to null after use at (1) and (2), before starting some other extensive activity. This makes the objects denoted by the local variables eligible for garbage collection from this point onward, rather than after the method terminates. This optimization technique of nulling references need only be used as a last resort when resources are scarce.

When a method returns a reference value and the object denoted by the value is not needed, not assigning this value to a reference also facilitates garbage collection.

If a reference is assigned a new reference value, the object denoted by the reference prior to the assignment can become eligible for garbage collection.

Removing reachable references to a composite object can make the constituent objects become eligible for garbage collection, as explained earlier.

Example 8.1 illustrates how the program can influence garbage collection eligibility. Class HeavyItem represents objects with a large memory footprint, on which we want to monitor garbage collection. Each composite HeavyItem object has a reference to a large array. The class overrides the finalize() method from the Object class to print out an ID when the object is finalized. This method is always called on an eligible object before it is destroyed (see finalizers, p. 324). We use it to indicate in the output if and when a HeavyItem is reclaimed. To illustrate the effect of garbage collection on object hierarchies, each object may also have a reference to another HeavyItem.

In Example 8.1, the class RecyclingBin defines a method createHeavyItem() at (4). In this method, the HeavyItem created at (5) is eligible for garbage collection after the reassignment of reference itemA at (6), as this object will have no references. The HeavyItem created at (6) is accessible on return from the method. Its fate depends on the code that calls this method.

In Example 8.1, the class RecyclingBin also defines a method createList() at (8). It returns the reference value in the reference item1, which denotes the first item in a list of three HeavyItems. Because of the list structure, none of the HeavyItems in the list are eligible for garbage collection on return from the method. Again, the fate of the objects in the list is decided by the code that calls this method. It is enough for the first item in the list to become unreachable, in order for all objects in the list to become eligible for garbage collection (barring any reachable references).

Example 8.1 Garbage Collection Eligibility
class HeavyItem {                                   // (1)
    int[]     itemBody;
    String    itemID;
    HeavyItem nextItem;

    HeavyItem(String ID, HeavyItem itemRef) {       // (2)
        itemBody = new int[100000];
        itemID   = ID;
        nextItem = itemRef;
    }
    protected void finalize() throws Throwable {    // (3)
        System.out.println(itemID + ": recycled.");
        super.finalize();
    }
}

public class RecyclingBin {

    public static HeavyItem createHeavyItem(String itemID) {           // (4)
        HeavyItem itemA = new HeavyItem(itemID + " local item", null); // (5)
        itemA = new HeavyItem(itemID, null);                           // (6)
        System.out.println("Return from creating HeavyItem " + itemID);
        return itemA;                                                  // (7)
    }

    public static HeavyItem createList(String listID) {                // (8)
        HeavyItem item3 = new HeavyItem(listID + ": item3", null);     // (9)
        HeavyItem item2 = new HeavyItem(listID + ": item2", item3);    // (10)
        HeavyItem item1 = new HeavyItem(listID + ": item1", item2);    // (11)
        System.out.println("Return from creating list " + listID);
        return item1;                                                  // (12)
    }

    public static void main(String[] args) {                           // (13)
        HeavyItem list = createList("X");                              // (14)
        list = createList("Y");                                        // (15)

        HeavyItem itemOne = createHeavyItem("One");                    // (16)
        HeavyItem itemTwo = createHeavyItem("Two");                    // (17)
        itemOne = null;                                                // (18)
        createHeavyItem("Three");                                      // (19)
        createHeavyItem("Four");                                       // (20)
        System.out.println("Return from main().");
    }
}

Possible output from the program:

Return from creating list X
Return from creating list Y
X: item3: recycled.
X: item2: recycled.
X: item1: recycled.
Return from creating HeavyItem One
Return from creating HeavyItem Two
Return from creating HeavyItem Three
Three local item: recycled.
Three: recycled.
Two local item: recycled.
Return from creating HeavyItem Four
One local item: recycled.
One: recycled.
Return from main().

In Example 8.1, the main() method at (13) in the class RecyclingBin uses the methods createHeavyItem() and createList(). It creates a list X at (14), but the reference to its first item is reassigned at (15), making objects in list X eligible for garbage collection after (15). The first item of list Y is stored in the reference list, making this list non-eligible for garbage collection during the execution of the main() method.

The main() method creates two items at (16) and (17), storing their reference values in references itemOne and itemTwo, respectively. The reference itemOne is nulled at (18), making HeavyItem with identity One eligible for garbage collection. The two calls to the createHeavyItem() method at (19) and (20) return reference values to HeavyItems, which are not stored, making each object eligible for garbage collection right after the respective method call returns.

The output from the program bears out the observations made above. Objects in list Y (accessible through reference list) and HeavyItem with identity Two (accessible through reference itemTwo) remain non-eligible while the main() method executes. Although the output shows that the two HeavyItems created by the call at (20), with identities Four local item and Four, were never garbage collected, they are not accessible once they become eligible for garbage collection. Any objects remaining in the heap when the program terminates are reclaimed by the operating system.

Object Finalization

Object finalization provides an object with a last chance to undertake any action before its storage is reclaimed. The automatic garbage collector calls the finalize() method in an object that is eligible for garbage collection before actually destroying the object. The finalize() method is defined in the Object class.

protected void finalize() throws Throwable


An implementation of the finalize() method is called a finalizer. A subclass can override the finalizer from the Object class in order to take more specific and appropriate action before an object of the subclass is destroyed.

A finalizer can, like any other method, catch and throw exceptions (see Section 5.7, p. 188). However, any exception thrown but not caught by a finalizer invoked by the garbage collector is ignored. The finalizer is only called once on an object, regardless of whether any exception is thrown during its execution. In case of finalization failure, the object still remains eligible for disposal at the discretion of the garbage collector (unless it has been resurrected, as explained in the next subsection). Since there is no guarantee that the garbage collector will ever run, there is also no guarantee that the finalizer will ever be called.

In the following code, the finalizer at (1) will take appropriate action if and when called on objects of the class before they are garbage collected, ensuring that the resource is freed. Since it is not guaranteed that the finalizer will ever be called at all, a program should not rely on the finalization to do any critical operations.

public class AnotherWellbehavedClass {
    SomeResource objRef;
    // ...
    protected void finalize() throws Throwable {         // (1)
        try {                                            // (2)
            if (objRef != null) objRef.close();
        } finally {                                      // (3)
            super.finalize();                            // (4)
        }
    }
}

Finalizer Chaining

Unlike subclass constructors, overridden finalizers are not implicitly chained (see Section 6.3, p. 243). Therefore, a finalizer in a subclass should explicitly call the finalizer in its superclass as its last action, as shown at (4) in the previous code. The call to the finalizer of the superclass is in a finally block at (3), guaranteed to be executed regardless of any exceptions thrown by the code in the try block at (2).

A finalizer may make the object accessible again (i.e., resurrect it), thus avoiding it being garbage collected. One simple technique is to assign its this reference to a static field, from which it can later be retrieved. Since a finalizer is called only once on an object before being garbage collected, an object can only be resurrected once. In other words, if the object again becomes eligible for garbage collection and the garbage collector runs, the finalizer will not be called. Such object resurrections are not recommended, as they only undermine the purpose of the finalization mechanism.

Example 8.2 illustrates chaining of finalizers. It creates a user-specified number of large objects of a user-specified size. The number and size are provided through command-line program arguments. The loop at (7) in the main() method creates Blob objects, but does not store any references to them. Objects created are instances of the class Blob defined at (3). The Blob constructor at (4) initializes the field fat by constructing a large array of integers. The Blob class extends the BasicBlob class that assigns each blob a unique number (blobId) and keeps track of the number of blobs (population) not yet garbage collected. Creation of each Blob object by the constructor at (4) prints the ID number of the object and the message "Hello". The finalize() method at (5) is called before a Blob object is garbage collected. It prints the message "Bye" and calls the finalize() method in the class BasicBlob at (2), which decrements the population count. The program output shows that two blobs were not garbage collected at the time the print statement at (8) was executed. It is evident from the number of "Bye" messages that three blobs were garbage collected before all five blobs had been created in the loop at (7).

Example 8.2 Using Finalizers
class BasicBlob {                                     // (1)
    static    int idCounter;
    static    int population;
    protected int blobId;

    BasicBlob() {
        blobId = idCounter++;
        ++population;
    }
    protected void finalize() throws Throwable {      // (2)
        --population;
        super.finalize();
    }
}

class Blob extends BasicBlob {                        // (3)
    int[] fat;

    Blob(int bloatedness) {                           // (4)
        fat = new int[bloatedness];
        System.out.println(blobId + ": Hello");
    }

    protected void finalize() throws Throwable {      // (5)
        System.out.println(blobId + ": Bye");
        super.finalize();
    }
}

public class Finalizers {
    public static void main(String[] args) {          // (6)
        int blobsRequired, blobSize;
        try {
            blobsRequired = Integer.parseInt(args[0]);
            blobSize      = Integer.parseInt(args[1]);
        } catch(IndexOutOfBoundsException e) {
            System.err.println(
                "Usage: Finalizers <number of blobs> <blob size>");
            return;
        }
        for (int i=0; i<blobsRequired; ++i) {         // (7)
            new Blob(blobSize);
        }
        System.out.println(BasicBlob.population + " blobs alive"); // (8)
    }
}

Running the program with the command

>java Finalizers 5 500000

might result in the following output:

0: Hello
1: Hello
2: Hello
0: Bye
1: Bye
2: Bye
3: Hello
4: Hello
2 blobs alive

Invoking Garbage Collection

Although Java provides facilities to invoke the garbage collection explicitly, there are no guarantees that it will be run. The program can only request that garbage collection be performed, but there is no way that garbage collection can be forced.

The System.gc() method can be used to request garbage collection, and the System.runFinalization() method can be called to suggest that any pending finalizers be run for objects eligible for garbage collection. Alternatively, corresponding methods in the Runtime class can be used. A Java application has a unique Runtime object that can be used by the application to interact with the JVM. An application can obtain this object by calling the method Runtime.getRuntime(). The Runtime class provides various methods related to memory issues.

static Runtime getRuntime()

Returns the Runtime object associated with the current application.

void gc()

Requests that garbage collection be run. However, it is recommended to use the more convenient static method System.gc().

void runFinalization()

Requests that any pending finalizers be run for objects eligible for garbage collection. Again, it is more convenient to use the static method System.runFinalization().

long freeMemory()

Returns the amount of free memory (bytes) in the JVM that is available for new objects.

long totalMemory()

Returns the total amount of memory (bytes) available in the JVM. This includes both memory occupied by current objects and that which is available for new objects.


Example 8.3 illustrates invoking garbage collection. The class MemoryCheck is an adaptation of the class Finalizers from Example 8.2. The Runtime object for the application is obtained at (7). This object is used to get information regarding total memory and free memory in the JVM at (8) and (9), respectively. Blobs are created in the loop at (10). The amount of free memory after blob creation is printed at (11). We see from the program output that some blobs were already garbage collected before the execution got to (11). A request for garbage collection is made at (12). Checking free memory after the request shows that more memory has become available, indicating that the request was honoured. It is instructive to run the program without the method call System.gc() at (12), in order to compare the results.

Example 8.3 Invoking Garbage Collection
class BasicBlob {  /* See Example 8.2. */ }
class Blob extends BasicBlob { /* See Example 8.2.*/ }

public class MemoryCheck {
    public static void main(String[] args) {          // (6)
        int blobsRequired, blobSize;
        try {
            blobsRequired = Integer.parseInt(args[0]);
            blobSize      = Integer.parseInt(args[1]);
        } catch(IndexOutOfBoundsException e) {
            System.err.println(
                "Usage: MemoryCheck <number of blobs> <blob size>");
            return;
        }
        Runtime environment = Runtime.getRuntime();                      // (7)
        System.out.println("Total memory: " + environment.totalMemory());// (8)
        System.out.println("Free memory before blob creation: "
                           + environment.freeMemory());                  // (9)
        for (int i=0; i<blobsRequired; ++i) {                            // (10)
            new Blob(blobSize);
        }
        System.out.println("Free memory after blob creation: "
                            + environment.freeMemory());                 // (11)
        System.gc();                                                     // (12)
        System.out.println("Free memory after requesting GC: "
                            + environment.freeMemory());                 // (13)
        System.out.println(BasicBlob.population + " blobs alive");       // (14)
    }
}

Running the program with the command

>java MemoryCheck 5 100000

gave the following output:

Total memory: 2031616
Free memory before blob creation: 1773192
0: Hello
1: Hello
2: Hello
1: Bye
2: Bye
3: Hello
0: Bye
3: Bye
4: Hello
Free memory after blob creation: 818760
4: Bye
Free memory after requesting GC: 1619656
0 blobs alive

Certain aspects regarding automatic garbage collection should be noted:

  • There are no guarantees that objects that are eligible for garbage collection will have their finalizers executed. Garbage collection might not even be run if the program execution does not warrant it. Thus, any memory allocated during program execution might remain allocated after program termination, but will be reclaimed by the operating system.

  • There are also no guarantees about the order in which the objects will be garbage collected, or the order in which their finalizers will be executed. Therefore, the program should not make any assumptions based on these aspects.

  • Garbage collection does not guarantee that there is enough memory for the program to run. A program can rely on the garbage collector to run when memory gets very low and it can expect an OutOfMemoryError to be thrown if its memory demands cannot be met.
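The lack of guarantees can be observed without finalizers by using a WeakReference, whose get() method returns null once the referent has been reclaimed. The following is a minimal sketch; the class GcRequestSketch, the method wasCollected(), and the choice of five attempts with a 20 ms pause are assumptions made for illustration, not a recommended technique.

```java
import java.lang.ref.WeakReference;

class GcRequestSketch {
    // Requests garbage collection up to maxAttempts times and reports
    // whether the referent of ref has been reclaimed.
    static boolean wasCollected(WeakReference<?> ref, int maxAttempts) {
        for (int i = 0; i < maxAttempts && ref.get() != null; i++) {
            System.gc();            // only a request; the JVM may ignore it
            try {
                Thread.sleep(20);   // give the collector some time (arbitrary pause)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object kept = new Object();
        WeakReference<Object> keptRef    = new WeakReference<>(kept);
        WeakReference<Object> droppedRef = new WeakReference<>(new Object());

        // The outcome for the unreferenced object depends on the JVM: it may be true or false.
        System.out.println("Unreferenced object collected: " + wasCollected(droppedRef, 5));

        // An object that is still strongly reachable is never reclaimed.
        System.out.println("Referenced object collected:   " + wasCollected(keptRef, 5));
        System.out.println("Still accessible: " + (kept == keptRef.get()));
    }
}
```

On many JVMs the first line reports true, but a program whose behavior depends on that result would be relying on the scheduling of the garbage collector, which the points above warn against.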