2.2 Garbage Collection

The Java runtime system normally includes a garbage collector.[2] Some of the commercial profilers provide statistics showing what the garbage collector is doing. You can also use the -verbosegc option with the VM. This option prints out time and space values for objects reclaimed and space recycled as the reclamations occur. The 1.4 VM introduced an additional option to log the output to a file instead of standard error: the -Xloggc:<file> option. Printing directly to a file is slightly more efficient than redirecting the VM output to a file because the direct file write buffering is slightly more efficient than the piped redirect buffering. The printout includes explicit synchronous calls to the garbage collector (using System.gc( ) ) as well as asynchronous executions of the garbage collector, as occurs in normal operation when free memory available to the VM gets low.

[2] Some embedded runtimes do not include a garbage collector. All objects may have to fit into memory without any garbage collection for these runtimes.

System.gc( ) does not necessarily force a synchronous garbage collection. Instead, the gc( ) call is really a hint to the runtime that now is a good time to run the garbage collector. The runtime decides whether to execute the garbage collection at that time and what type of garbage collection to run. In more recent VMs, the effects of calling System.gc( ) can be completely disabled using the runtime flag XX:+DisableExplicitGC.

2.2.1 Using -verbosegc to Analyze Garbage Collection

It is worth looking at some output from running with -verbosegc. The following code fragment creates lots of objects to force the garbage collector to work, and also includes some synchronous calls to the garbage collector:

package tuning.gc;
public class Test {
  public static void main(String[  ] args)
  {
    int SIZE = 4000;
    StringBuffer s;
    java.util.Vector v;
  
    //Create some objects so that the garbage collector 
    //has something to do
    for (int i = 0; i < SIZE; i++)
    {
      s = new StringBuffer(50);
      v = new java.util.Vector(30);
      s.append(i).append(i+1).append(i+2).append(i+3);
    }
    s = null;
    v = null;
    System.out.println("Starting explicit garbage collection");
    long time = System.currentTimeMillis(  );
    System.gc(  );
    System.out.println("Garbage collection took " + 
      (System.currentTimeMillis(  )-time) + " millis");
  
    int[  ] arr = new int[SIZE*10];
    //null the variable so that the array can be garbage collected
    time = System.currentTimeMillis(  );
    arr = null;
    System.out.println("Starting explicit garbage collection");
    System.gc(  );
    System.out.println("Garbage collection took " + 
      (System.currentTimeMillis(  )-time) + " millis");
  }
}

When this code is run in Sun JDK 1.2 with the -verbosegc option,[3] you get:

[3] Note that -verbosegc can also work with applets by using this command line:

<GC: need to expand mark bits to cover 16384 bytes>
<GC: managing allocation failure: need 1032 bytes, type=1, action=1>
<GC: 0 milliseconds since last GC>
<GC: freed 18578 objects, 658392 bytes in 26 ms, 78% free (658872/838856)>
  <GC: init&scan: 1 ms, scan handles: 12 ms, sweep: 13 ms, compact: 0 ms>
  <GC: 0 register-marked objects, 1 stack-marked objects>
  <GC: 1 register-marked handles, 31 stack-marked handles>
  <GC: refs: soft 0 (age >= 32), weak 0, final 2, phantom 0>
<GC: managing allocation failure: need 1032 bytes, type=1, action=1>
<GC: 180 milliseconds since last GC>
<GC: compactHeap took 15 ms, swap time = 4 ms, blocks_moved=18838>
<GC: 0 explicitly pinned objects, 2 conservatively pinned objects>
<GC: last free block at 0x01A0889C of length 1888>
<GC: last free block is at end>
<GC: freed 18822 objects, 627504 bytes in 50 ms, 78% free (658920/838856)>
  <GC: init&scan: 2 ms, scan handles: 11 ms, sweep: 16 ms, compact: 21 ms>
  <GC: 0 register-marked objects, 2 stack-marked objects>
  <GC: 0 register-marked handles, 33 stack-marked handles>
  <GC: refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
Starting explicit garbage collection
<GC: compactHeap took 9 ms, swap time = 5 ms, blocks_moved=13453>
<GC: 0 explicitly pinned objects, 5 conservatively pinned objects>
<GC: last free block at 0x019D5534 of length 211656>
<GC: last free block is at end>
<GC: freed 13443 objects, 447752 bytes in 40 ms, 78% free (657752/838856)>
  <GC: init&scan: 1 ms, scan handles: 12 ms, sweep: 12 ms, compact: 15 ms>
  <GC: 0 register-marked objects, 6 stack-marked objects>
  <GC: 0 register-marked handles, 111 stack-marked handles>
  <GC: refs: soft 0 (age >= 32), weak 0, final 0, phantom 0>
Garbage collection took 151 millis
...

The actual details of the output are not standardized, and are likely to change between different VM versions as well as between VMs from different vendors. As a comparison, this is the output from the later garbage collector version using Sun JDK 1.3:

[GC 511K->96K(1984K), 0.0281726 secs]
[GC 608K->97K(1984K), 0.0149952 secs]
[GC 609K->97K(1984K), 0.0071464 secs]
[GC 609K->97K(1984K), 0.0093515 secs]
[GC 609K->97K(1984K), 0.0060427 secs]
Starting explicit garbage collection
[Full GC 228K->96K(1984K), 0.0899268 secs]
Garbage collection took 170 millis
Starting explicit garbage collection
[Full GC 253K->96K(1984K), 0.0884710 secs]
Garbage collection took 180 millis

Note the dramatic difference in output from 1.2, stemming from HotSpot technology in 1.3 and later VMs.

As you can see, each time the garbage collector kicks in, it produces a report of its activities. Any one garbage collection reports on the times taken by the various parts of the garbage collector and specifies what the garbage collector is doing. Note that the internal times reported by the garbage collector are not the full time taken for the whole activity. In the examples, you can see the full time for one of the synchronous garbage collections, which is wrapped by print statements from the code fragment (i.e., those lines not starting with a < or [ sign). However, these times include the times taken to output the printed statements from the garbage collector and are therefore higher times than those for the garbage collection alone. To see the pure synchronous garbage-collection times for this code fragment, you need to run the program without the -verbosegc option.

In the previous examples, the garbage collector kicks in either because it has been called by the code fragment or because creating an object from the code fragment (or the runtime initialization) encounters a lack of free memory from which to allocate space for that object. This is normally reported as "managing allocation failure."

Some garbage-collector versions appear to execute their garbage collections faster than others. But be aware that this time difference may be an artifact: it can be caused by the different number of printed statements when using the -verbosegc option. When run without the -verbosegc option, the times may be similar. The garbage collector from JDK 1.2 executes a more complex scavenging algorithm than earlier JDK versions to smooth out the effects of garbage collection running in the background. (The garbage-collection algorithm is discussed in Chapter 3 along with tuning the heap. The garbage collection algorithm can be altered a little, and garbage-collection statistics can give you important information about objects being reclaimed, which helps you tune your application.) From JDK 1.2, the VM also handles many types of references that never existed in VM versions before 1.2. Overall, Java 2 applications do seem to have faster object recycling in application contexts than previous JDK versions did.

It is worthwhile running your application using the -verbosegc option to see how often the garbage collector kicks in. At the same time, you should use all logging and tracing options available with your application so that the output from the garbage collector is set in the context of your application activities. Garbage-collection statistics can be collected and summarized in a useful way. The 1.2 output is relatively easy to understand; the important lines are those summarizing the statistics, e.g.:

<GC: freed 18822 objects, 627504 bytes in 50 ms, 78% free (658920/838856)>

This line shows that 18822 objects were reclaimed during this garbage collection. The reclamation freed up 627504 bytes, and the time taken to run the garbage collection was 50 milliseconds. After the garbage collection the heap was 78% free, with 658920 bytes available out of a total heap size of 838856 bytes.

The 1.3 (HotSpot) output is a little more concentrated:

[GC 609K->97K(1984K), 0.0071464 secs]
[Full GC 253K->96K(1984K), 0.0884710 secs]

It consists of two types of output, a "full" or major GC (shown in the second line of the example) that runs through the whole heap, and a minor GC (shown in the first line) that executed in young space (see Section 3.2). The numbers before and after the arrow show the amount of space taken by objects before and after the garbage collection. The following number in parentheses is the total available heap space, and the remaining number shows the time taken to execute the garbage collection.

The important items that all -verbosegc output has in common are:

  • The size of the heap after the garbage collection

  • The time taken to run the garbage collection

  • The number of bytes reclaimed by the garbage collection (either listed directly, or deduced by subtracting the before and after used values)

Having the number of objects reclaimed would also be nice, but not all output lists that. Still, these three statistics are extremely useful. First, the heap size gives you a good idea of how much memory the application needs and helps you to tune the heap. Even more useful are the other two options. By running the output through a pattern matcher to extract the GC times and amount freed and totalling those values, you can identify the cost of GC to your application. I like to send some output from the application to indicate when the application finished the initialization stage and started running; then I can filter lines from that point until when I terminate my test.

Let's look at an example now of how to calculate the GC impact on an application. After the application initializes, I start the test running. This particular test is a server application, which I run for 40 minutes, then stop. Taking the logged -verbosegc output, I eliminate any log statements before and after the test run (identified by lines emitted by the application). Then I run a pattern matcher against the -verbosegc logs. The pattern matcher can be Awk, Perl, Java regular expressions, or any pattern matcher you prefer. The matching is very simple; for example, here is a simple matcher in Perl for the 1.2 output:

if(/freed\s+(\d+)\s+objects\,\s+(\d+)\s+bytes\s+in\s+(\d+)\s+ms\,/)
{
  $objects += $1;
  $bytes += $2;
  $time_ms += $3;
}
sub END {print "freed $objects objects, $bytes bytes in $time_ms ms\n"}

The result from this pattern match shows that over the course of the 40 minutes (or 2400 seconds) of elapsed time, 5654137008 bytes (5392 megabytes) were freed up by garbage collections, taking a total of 717612 milliseconds (or 718 seconds) of GC thread time. With a few simple calculations, we can see that:

  • 30% of the time was taken by garbage collection (718 seconds divided by 2400 seconds), which looks excessive. As a guideline, your ultimate target should be under 5% taken by GC, and certainly anything over 15% is an urgent problem.

  • 135 megabytes per minute of heap were recycled (5392 megabytes divided by 40 minutes), which for this application looks excessive.

If you don't know the exact number of objects being churned, you can estimate it using a standard average object size of 50 bytes. In this case we have an object churn rate of 2.8 million objects per minute. (5654137008 divided by 50 to get 113 million objects, divided by 40 minutes. For comparison, the actual recorded number of objects churned was 107 million.)

Of course, these values need to be taken in the context of the application. The primary value is the percentage of time taken by the GC, which, again, should be below 15% and ideally below 5%. The other churn values can be considered only in the context of the application, taking into account what you are doing.

The calculation I just made of GC percentage time is actually only partially complete. To be completely accurate, you also need to factor in how much load the application and GC put on the CPU, which would require you to monitor the underlying system for the duration of the test (see Chapter 14 for tools to do that). In this case, the server ran on a single-processor system. The GC utilized the CPU at 100% when running (established in a separate test by correlating GC output to per-second CPU utilization monitoring), and over the course of the 40 minutes of this test, the CPU utilization was 67.5%. Again, a few simple calculations show exactly what is happening:

  • The CPU time for the application was 67.5% of 2400 seconds (40 minutes), which makes 1620 seconds.

  • The CPU time for the GC was 100% of 718 seconds, which makes 718 seconds.

  • Therefore, garbage collection took 718 out of 1620 seconds, or 44% of the application's processing time.

These results reveal that GC is taking way too much time, but at least the metric is accurate. Note that if this were a dual-processor system, GC would probably have utilized the CPU at 50% (i.e., one CPU at 100% and the other unused), which, if everything else was as reported, would have yielded a GC percentage of 22% (as it would have been 50% of 718 seconds of GC CPU time).

The calculation outlined here is fairly simple and can be made with a minimum of tools. It is also easily altered to handle different output formats of -verbosegc.

2.2.2 Tools for Analyzing -verbosegc Output

In addition to performing these calculations yourself, several tools are available to analyze -verbosegc output:

awk script from Ken Gottry

Ken Gottry's JavaWorld article[4] gives a nice description of the HotSpot generational garbage collection and includes an awk script for generating an analysis of -verbosegc logs.

[4] Ken Gottry, "Pick up performance with generational garbage collection," JavaWorld, January 2002, http://www.javaworld.com/javaworld/jw-01-2002/jw-0111-hotspotgc.html.

GC analyzer from Nagendra Nagarajayya and J. Steven Mayer

A Sun Wireless Developer article[5] from Nagarajayya and Mayer provides a very detailed discussion of fine-tuning a heap. It includes a GC analyzer tool[6] for generating reports from -verbosegc logs.

[5] Nagendra Nagarajayya and J. Steven Mayer, "Improving Java Application Performance and Scalability by Reducing Garbage Collection Times and Sizing Memory," Sun Wireless Developer, May 2002, http://wireless.java.sun.com/midp/articles/garbage/.

[6] You can download it directly from http://wireless.java.sun.com/midp/articles/garbage/gc_analyze.pl.

GCViewer from Hendrik Schreiber

GCViewer is a graphical tool that allows you to visualize GC logging output. GCViewer can also export data in the proper format for further manipulation in a spreadsheet. GCViewer is available from http://www.tagtraum.com/.