4.4 Avoiding Garbage Collection

The canonicalization techniques I've discussed are one way to avoid garbage collection: fewer objects means less to garbage-collect. Similarly, the pooling technique in that section also tends to reduce garbage-collection requirements, partly because you are creating fewer objects by reusing them, and partly because you deallocate memory less often by holding onto the objects you have allocated. Of course, this also means that your memory requirements are higher, but you can't have it both ways.

Another technique for reducing garbage-collection impact is to avoid using objects where they are not needed. For example, there is no need to create an unnecessary Integer to parse a String containing an int value, as in:

String string = "55";
int theInt = new Integer(string).intValue();

Instead, there is a static method available for parsing:

int theInt = Integer.parseInt(string);

Unfortunately, some classes do not provide static methods that avoid the spurious intermediate creation of objects. Until JDK 1.2, there were no static methods that allowed you to parse strings containing floating-point numbers to get doubles or floats. Instead, you needed to create an intermediate Double object and extract the value. (Even after JDK 1.2, an intermediate FloatingDecimal is created, but this is arguably due to good abstraction in the programming design.) When a class does not provide a static method, you can sometimes use a dummy instance to execute instance methods repeatedly, thus avoiding the need to create extra objects.
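As a rough illustration of the difference for floating-point values, the following sketch contrasts the two parsing styles. The class ParseDoubleDemo and its method names are invented for this example; only the java.lang.Double calls are real API, and Double.valueOf stands in here for the older new Double(string) idiom.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
final class ParseDoubleDemo {
    // Wrapper style: an intermediate Double is created just to extract its value.
    static double viaWrapper(String s) {
        return Double.valueOf(s).doubleValue();
    }

    // Static parse (available from JDK 1.2): the caller creates no Double wrapper.
    static double viaStaticParse(String s) {
        return Double.parseDouble(s);
    }
}
```

Both methods return the same value for the same input; the second simply avoids handing the garbage collector a short-lived wrapper object.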

The primitive data types in Java use memory space that also needs reclaiming, but the overhead in reclaiming data-type storage is smaller: it is reclaimed at the same time as its holding object and so has a smaller impact. (Temporary primitive data types exist only on the stack and do not need to be garbage-collected at all; see Section 6.4.) For example, an object with just one instance variable holding an int is reclaimed in one object reclaim. If it holds an Integer object, the garbage collector needs to reclaim two objects.

Reducing garbage collection by using primitive data types also applies when you can hold an object in a primitive data-type format rather than another format. For example, if you have a large number of objects, each with a String instance variable holding a number (e.g., "1492", "1997"), it is better to make that instance variable an int data type and store the numbers as ints, provided that conversion overhead does not swamp the benefits of holding the values in this alternative format.
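A minimal sketch of this idea, assuming the numbers always fit in an int and arrive as text, might look like the following. The class YearRecord and its methods are hypothetical names chosen for the example.

```java
// Hypothetical class: the numeric text is parsed once and held as an int,
// so each instance carries no separate String object.
final class YearRecord {
    private final int year;

    YearRecord(String yearText) {
        // One-time conversion cost, paid at construction.
        this.year = Integer.parseInt(yearText);
    }

    int getYear() {
        return year;
    }

    // Converts back to text only when a caller really needs it;
    // this creates a new String on each call.
    String getYearText() {
        return Integer.toString(year);
    }
}
```

If callers ask for the textual form frequently, the repeated conversions in getYearText may outweigh the saving, which is the trade-off noted above.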

Similarly, you can use an int (or long) to represent a Date object, providing appropriate calculations to access and update the values, thus saving an object and the associated garbage overhead. Of course, you have a different runtime overhead instead, as those conversion calculations may take up more time.
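One way this could look is sketched below, assuming millisecond precision is sufficient. The class Appointment and its methods are hypothetical; the long value uses the same convention as Date.getTime() (milliseconds since the epoch).

```java
import java.util.Date;

// Hypothetical class: holds a point in time as a long rather than a Date field.
final class Appointment {
    private long startMillis; // same convention as Date.getTime()

    Appointment(Date start) {
        this.startMillis = start.getTime();
    }

    long getStartMillis() {
        return startMillis;
    }

    // Updates are plain arithmetic on the long; no objects are created.
    void postponeByMinutes(int minutes) {
        startMillis += minutes * 60000L;
    }

    boolean startsBefore(Appointment other) {
        return startMillis < other.startMillis;
    }

    // A Date is created only when a caller explicitly asks for one.
    Date getStartAsDate() {
        return new Date(startMillis);
    }
}
```

Comparisons and simple offsets stay cheap in this form, whereas calendar-style operations (such as finding the day of the week) would need extra conversion code.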

A more extreme version of this technique is to use arrays to map objects: for example, see Section 11.10. Toward the end of that example, one version of the class gets rid of node objects completely, using a large array to map and maintain all instances and instance variables. This leads to a large improvement in performance at all stages of the object life cycle. Of course, this technique is a specialized one that should not be used generically throughout your application, or you will end up with unmaintainable code. It should be used only when called for (and when it can be completely encapsulated). A simple example is for the class defined as:

class MyClass
{
  int x;
  boolean y;
}

This class has an associated collection class that seems to hold an array of MyClass objects, but actually holds arrays of instance variables of the MyClass class:

class MyClassCollection
{
  int[] xs;
  boolean[] ys;
  public int getXForElement(int i) {return xs[i];}
  public boolean getYForElement(int i) {return ys[i];}
  //If possible avoid having to declare element access like the
  //following method:
  //public MyClass getElement(int i) {return new MyClass(xs[i], ys[i]);}
}
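
A slightly fuller sketch of such a collection, with storage that grows on demand, is shown below. It is not the implementation from the example referred to above; the class name MyClassArrayCollection, the initial capacity of 8, and the add and size methods are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical array-backed collection: element i is the pair (xs[i], ys[i]).
// No per-element MyClass object is ever created.
final class MyClassArrayCollection {
    private int[] xs = new int[8];         // initial capacity is an arbitrary choice
    private boolean[] ys = new boolean[8];
    private int size;

    // Appends one logical element and returns its index.
    int add(int x, boolean y) {
        if (size == xs.length) {
            // Grow both parallel arrays together so the indices stay aligned.
            int newCapacity = size * 2;
            xs = Arrays.copyOf(xs, newCapacity);
            ys = Arrays.copyOf(ys, newCapacity);
        }
        xs[size] = x;
        ys[size] = y;
        return size++;
    }

    int getXForElement(int i) {
        checkIndex(i);
        return xs[i];
    }

    boolean getYForElement(int i) {
        checkIndex(i);
        return ys[i];
    }

    int size() {
        return size;
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i + ", size " + size);
        }
    }
}
```

Callers refer to elements by index only, which keeps the parallel arrays fully encapsulated behind the collection's interface.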

An extension of this technique flattens objects that have a one-to-one relationship. The classic example is a Person object that holds a Name object, consisting of first name and last name (and a collection of middle names), and an Address object, with street, number, etc. This can be collapsed down to just the Person object, with all the fields moved up to the Person class. For example, the original definition consists of three classes:

public class Person {
  private Name name;
  private Address address;
}
class Name {
  private String firstName;
  private String lastName;
  private String[] otherNames;
}
class Address {
  private int houseNumber;
  private String houseName;
  private String streetName;
  private String town;
  private String area;
  private String greaterArea;
  private String country;
  private String postCode;
}

These three classes collapse into one class:

public class Person {
  private String firstName;
  private String lastName;
  private String[] otherNames;
  private int houseNumber;
  private String houseName;
  private String streetName;
  private String town;
  private String area;
  private String greaterArea;
  private String country;
  private String postCode;
}

This results in the same data and the same functionality (assuming that Addresses and Names are not referenced by more than one Person). But now you have one object instead of three for each Person. Of course, this compromises good object-oriented design and should be used only when absolutely necessary, not as standard practice.

Finally, here are some general recommendations that help to reduce the number of unnecessary objects being generated. These recommendations should be part of your standard coding practice, not just performance-related:

  • Reduce the number of temporary objects being used, especially in loops. It is easy to use a method in a loop that has side effects such as making copies, or an accessor that returns a copy of some object you need only once.

  • Use StringBuffer in preference to the String concatenation operator (+). This is really a special case of the previous point, but needs to be emphasized.

  • Be aware of which methods alter objects directly without making copies and which ones return a copy of an object. For example, any String method that changes the string (such as String.trim()) returns a new String object, whereas a method like Vector.setSize() does not return a copy. If you do not need a copy, use (or create) methods that do not return a copy of the object being operated on.

  • Avoid using generic classes that handle Object types when you are dealing with basic data types. For example, there is no need to use Vector to store ints by wrapping them in Integers. Instead, implement an IntVector class that holds the ints directly.
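
The last recommendation can be sketched as follows. This is only one possible shape for an IntVector; the default capacity of 10, the doubling growth policy, and the method names add, get, and size are assumptions rather than a prescribed design.

```java
import java.util.Arrays;

// A minimal growable list of ints that stores values directly in an int[],
// so no Integer wrapper objects are created.
final class IntVector {
    private int[] data;
    private int size;

    IntVector() {
        this(10); // default capacity is an arbitrary choice
    }

    IntVector(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        }
        data = new int[initialCapacity];
    }

    void add(int value) {
        if (size == data.length) {
            // Double the capacity; Math.max covers the zero-capacity case.
            data = Arrays.copyOf(data, Math.max(1, data.length * 2));
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }
}
```

The only garbage this class generates is the old backing array each time it grows, compared with one Integer per element when a Vector is used.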