In many ways, you can think of Reference objects as normal objects that have a private Object instance variable. You can access the private object (termed the referent) using the Reference.get( ) method. However, Reference objects differ from normal objects in one hugely important way. The garbage collector may be allowed to clear Reference objects when it decides space is low enough. Clearing the Reference object sets the referent to null. For example, say you assign an object to a Reference. Later you test to see if the referent is null. It could be null if, between the assignment and the test, the garbage collector kicked in and decided to reclaim space:
Reference ref = new WeakReference(someObject);
//ref.get( ) is someObject at the moment
//Now do something that creates lots of objects, making
//the garbage collector try to find more memory space
doSomething( );
//now test if ref is null
if (ref.get( ) == null)
  System.out.println("The garbage collector deleted my ref");
else
  System.out.println("ref object is still here");
Note that the referent can be garbage-collected at any time, as long as there are no other strong references referring to it. (In the example, ref.get( ) can become null only if there are no other non-Reference objects referring to someObject.)
The advantage of References is that you can use them to hang onto objects that you want to reuse but are not needed immediately. If memory space gets too low, those objects not currently being used are automatically reclaimed by the garbage collector. This means that you subsequently need to create objects instead of reusing them, but that is preferable to having the program crash from lack of memory. (To delete the reference object itself when the referent is nulled, you need to create the reference with a ReferenceQueue instance. When the reference object is cleared, it is added to the ReferenceQueue instance and can then be processed by the application, e.g., explicitly deleted from a hash table in which it may be a key.)
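The reuse-or-recreate pattern described above can be sketched roughly as follows. This is a minimal illustration written for a current JDK (generics and lambdas), not code from any library: the class name SoftlyHeld, the factory parameter, and the simulateClear( ) helper are all hypothetical, and simulateClear( ) merely stands in for what the garbage collector may do when memory runs low.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

/** Holds a reusable object softly and recreates it on demand if it has been cleared. */
final class SoftlyHeld<T> {
    private final Supplier<T> factory;   // hypothetical: how to rebuild the object
    private SoftReference<T> ref;        // the only reference this holder keeps

    SoftlyHeld(Supplier<T> factory) {
        this.factory = factory;
        this.ref = new SoftReference<>(factory.get());
    }

    /** Returns the held object, recreating it if the referent is gone. */
    synchronized T get() {
        T value = ref.get();             // take a strong reference before testing
        if (value == null) {             // referent was cleared
            value = factory.get();       // pay the creation cost again
            ref = new SoftReference<>(value);
        }
        return value;
    }

    /** Clears the referent by hand, standing in for the garbage collector. */
    synchronized void simulateClear() {
        ref.clear();
    }
}
```

Note that get( ) copies the referent into a local variable before testing it. Testing ref.get( ) for null and then calling ref.get( ) a second time would leave a window in which the collector could clear the reference between the two calls.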
There are three Reference types in Java 2. WeakReferences and SoftReferences differ essentially in the order in which the garbage collector clears them. Simplistically, the garbage collector does not clear SoftReference objects until all WeakReferences have been cleared. PhantomReferences (not addressed here) are not cleared automatically by the garbage collector and are intended for use in a different way.
Sun's documentation suggests that WeakReferences could be used for canonical tables, whereas SoftReferences would be more useful for caches. In the previous edition, I suggested the converse, giving the rationale that caches take up more space and so should be the first to be reclaimed. But after a number of discussions, I have come to realize that both suggestions are simply misleading. What we have are two reference types, one of which is likely to be reclaimed before the other. So you should use both types of Reference objects in a priority system, using the SoftReference objects to hold higher-priority elements so that they are cleared later than low-priority elements. For both caches and canonical tables, priority would probably be best assigned according to how expensive it is to recreate the object. In principle, PhantomReferences could be added as a third, even higher-priority tier, since they would be cleared last of all. Bear in mind, though, that PhantomReference.get( ) always returns null, so a PhantomReference cannot hand back an element you intend to reuse.
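One way to picture such a priority system is the sketch below, which picks the reference type from an estimated recreation cost. Everything here is an assumption made for illustration: the class name PriorityCache, the costThresholdMillis knob, and the idea of passing the cost to put( ) are not part of any standard API. Keys are held strongly in this sketch; only the values are soft or weak.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Holds expensive values softly and cheap values weakly, so cheap ones tend to go first. */
final class PriorityCache<K, V> {
    private final Map<K, Reference<V>> entries = new HashMap<>();
    private final long costThresholdMillis;   // hypothetical tuning knob

    PriorityCache(long costThresholdMillis) {
        this.costThresholdMillis = costThresholdMillis;
    }

    /** Stores the value, choosing the reference type from its recreation cost. */
    void put(K key, V value, long recreationCostMillis) {
        Reference<V> ref;
        if (recreationCostMillis >= costThresholdMillis) {
            ref = new SoftReference<>(value);   // higher priority: cleared later
        } else {
            ref = new WeakReference<>(value);   // lower priority: cleared sooner
        }
        entries.put(key, ref);
    }

    /** Returns the value, or null if it was never stored or has been cleared. */
    V get(K key) {
        Reference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key);                // drop the stale entry
        }
        return value;
    }

    /** Reports whether the entry for this key is currently held by a SoftReference. */
    boolean isHeldSoftly(K key) {
        return entries.get(key) instanceof SoftReference;
    }
}
```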
Prior to Version 1.3.1, SoftReferences and WeakReferences were treated fairly similarly by the VM, simply being cleared whenever they were no longer strongly (and weakly) reachable, with only a slight ordering difference. However, from 1.3.1 on, the Sun VM started treating SoftReferences differently. Now, SoftReferences remain alive for some time after the last time they were referenced. The default length of time value is one second of lifetime per free megabyte in the heap. This provides more of a differentiation between SoftReference and WeakReference behavior.
The initial time-to-live values for SoftReferences can be altered using the -XX:SoftRefLRUPolicyMSPerMB flag, which specifies the lifetime per free megabyte in the heap, in milliseconds. For example, to change the value to 3 seconds per free heap megabyte, you would use:
% java -XX:SoftRefLRUPolicyMSPerMB=3000 ...
The server mode VM and client mode VM use slightly different methods to calculate the free megabytes in the heap. The server mode VM assumes that the heap can expand to the -Xmx value and uses that as the full heap size to calculate the available free space. The client mode VM simply uses the current heap size, deriving the actual free space in the current heap. This means that the server VM has an increased likelihood of actually growing the heap space rather than clearing SoftReferences, even where there are SoftReferences that could otherwise be reclaimed. This behavior is not part of any specification, so it could change in a future version. But it is likely that some difference in behavior between WeakReferences and SoftReferences will remain, with SoftReferences being longer lived.
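The arithmetic behind these lifetimes can be approximated with a few lines of code. The sketch below is only an estimate built from the description above, not the VM's actual calculation; the class and method names are hypothetical, and the two "free space" methods simply mirror the client-style and server-style measurements using the standard Runtime methods.

```java
/** Rough estimate of SoftReference lifetime from free heap space. */
final class SoftRefTtl {
    static final long BYTES_PER_MB = 1024L * 1024L;

    /** Lifetime in milliseconds: whole free megabytes times milliseconds per megabyte. */
    static long ttlMillis(long freeBytes, long msPerMb) {
        return (freeBytes / BYTES_PER_MB) * msPerMb;
    }

    /** Free space measured against the current heap size (client-mode style). */
    static long freeInCurrentHeap(Runtime rt) {
        return rt.freeMemory();
    }

    /** Free space measured against the maximum heap size (server-mode style). */
    static long freeAgainstMaxHeap(Runtime rt) {
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long msPerMb = 1000;   // the default of one second per free megabyte
        System.out.println("client-style estimate (ms): "
                + ttlMillis(freeInCurrentHeap(rt), msPerMb));
        System.out.println("server-style estimate (ms): "
                + ttlMillis(freeAgainstMaxHeap(rt), msPerMb));
    }
}
```

With 10 MB free, the default setting gives an estimate of 10,000 ms, and the 3000 setting shown earlier gives 30,000 ms.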
To complete our picture on references and how they work, we'll look in detail at the implementation and performance effects of the WeakHashMap class. WeakHashMap is a type of Map that differs from other Maps in more than just having a different implementation. WeakHashMap uses weak references to hold its keys, making it one of the few classes able to respond to the fluctuating memory requirements of the JVM. This can make WeakHashMap unpredictable at times, unless you know exactly what you are doing with it.
The keys in a WeakHashMap are WeakReference objects. The object passed as the key to a WeakHashMap is stored as the referent of the WeakReference object, and the value is the standard Map value. (The object returned by calling Reference.get( ) is termed the referent of the Reference object.) A comparison with HashMap can help:
HashMap | WeakHashMap |
---|---|
Map h = new HashMap( ); Object key = new Object( ); h.put(key, "xyz"); key = null; | Map h = new WeakHashMap( ); Object key = new Object( ); h.put(key, "xyz"); key = null; |
The key is referenced directly by the HashMap. | The key is not referenced directly by the WeakHashMap. Instead, a WeakReference object is referenced directly by the WeakHashMap, and the key is referenced weakly from the WeakReference object. Conceptually, this is similar to inserting a line before the put( ) call, like this: key = new WeakReference(key); |
The value is referenced directly by the HashMap. | The value is referenced directly by the WeakHashMap. |
The key is not garbage-collectable since the map contains a strong reference to the key. The key could be obtained by iterating over the keys of the HashMap. | The key is garbage-collectable as nothing else in the application refers to it, and the WeakReference only holds the key weakly. Iterating over the keys of the WeakHashMap might obtain the key, but might not if the key has been garbage-collected. |
The value is not garbage-collectable. | The value is not directly garbage-collectable. However, when the key is collected by the garbage collector, the WeakReference object is subsequently removed from the WeakHashMap as a key, thus making the value garbage-collectable too. |
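The comparison can be tried out with a small program along the following lines. This is a sketch only: the class name and the hintGc( ) helper are hypothetical, and System.gc( ) is merely a request, so nothing guarantees that the WeakHashMap entry has gone by the time the sizes are printed. The HashMap entry, on the other hand, must still be there.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Compares how a HashMap and a WeakHashMap retain a key that is no longer referenced. */
final class MapRetentionDemo {
    /** Requests collection a few times; System.gc() is only a hint to the VM. */
    static void hintGc() {
        for (int i = 0; i < 5; i++) {
            System.gc();
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt and stop
                return;
            }
        }
    }

    public static void main(String[] args) {
        Map<Object, String> strong = new HashMap<>();
        Map<Object, String> weak = new WeakHashMap<>();

        Object key1 = new Object();
        Object key2 = new Object();
        strong.put(key1, "xyz");
        weak.put(key2, "xyz");

        // Drop the application's only strong references to the keys.
        key1 = null;
        key2 = null;
        hintGc();

        // The HashMap still holds its key strongly; the WeakHashMap may have lost its entry.
        System.out.println("HashMap size:     " + strong.size());
        System.out.println("WeakHashMap size: " + weak.size());
    }
}
```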
The 1.2 and 1.3 versions of the WeakHashMap implementation wrap a HashMap for its underlying Map implementation and wrap keys with WeakReferences (actually a WeakReference subclass) before putting the keys into the underlying HashMap. The 1.4 version implements a hash table directly in the class, for improved performance. The WeakHashMap uses its own ReferenceQueue object so that it is notified of keys that have been garbage-collected, thus allowing the timely removal of the WeakReference objects and the corresponding values. The queue is checked whenever the Map is altered. In the 1.4 version, the queue is also checked whenever any key is accessed from the WeakHashMap. If you have not worked with Reference objects and ReferenceQueues before, this can be a little confusing, so I'll work through an example. The following example adds a key-value pair to the WeakHashMap, assumes that the key is garbage-collected, and records the subsequent procedure followed by the WeakHashMap:
A key-value pair is added to the Map:
aWeakHashMap.put(key, value);
This results in a WeakReference key being added to the WeakHashMap, with the original key held as the referent of the WeakReference object. You could do the equivalent using a HashMap like this:
ReferenceQueue Queue = new ReferenceQueue( );
MyWeakReference RefKey = new MyWeakReference(key, Queue);
aHashMap.put(RefKey, value);
(For the equivalence code, I've used a subclass of WeakReference, as I'll need to override the WeakReference.equals( ) for equal key access in the subsequent points to work correctly.)
Note that at this stage the referent of the WeakReference just created is the original key. That is, the following statement would output true:
System.out.println(RefKey.get( ) == key);
At this point, you could access the value from the WeakHashMap using the original key, or another key that is equal to the original key. The following statements would now output true:
System.out.println(aWeakHashMap.get(equalKey) == value);
System.out.println(aWeakHashMap.get(key) == value);
In our equivalent code using the HashMap, the following statements would now output true:
MyWeakReference RefKey2 = new MyWeakReference(equalKey, Queue);
System.out.println(aHashMap.get(RefKey2) == value);
System.out.println(aHashMap.get(RefKey) == value);
Note that in order to get this equivalence, we need to implement equals( ) and hashCode( ) in the MyWeakReference class so that equal referents make equal MyWeakReference objects. This is necessary so that the MyWeakReference wrapped keys evaluate as equal keys in Maps. The equals( ) method returns true if the MyWeakReference objects are identical or if their referents are equal.
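A MyWeakReference class meeting these requirements might look something like the sketch below. It is my own minimal version written with generics for a current JDK, not the subclass used inside WeakHashMap. One assumption worth stating is that the hash code is captured when the reference is constructed: once the referent has been cleared there is nothing left to hash, yet the reference must stay in the same hash bucket so that it can still be found and removed.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

/** A WeakReference whose equality follows its referent, so it can act as a Map key. */
final class MyWeakReference extends WeakReference<Object> {
    // Captured at construction: the hash must stay stable after the referent is cleared.
    private final int hash;

    MyWeakReference(Object referent, ReferenceQueue<Object> queue) {
        super(referent, queue);
        this.hash = referent.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                          // identical reference objects
        }
        if (!(other instanceof MyWeakReference)) {
            return false;
        }
        Object mine = get();
        Object theirs = ((MyWeakReference) other).get();
        // A cleared referent equals nothing, except through the identity check above.
        return mine != null && mine.equals(theirs);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}
```

The identity check in equals( ) is what allows a cleared reference to be removed from a Map later on: it no longer equals any other key, but it still equals itself.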
We now null out the reference to the original key:
key = null;
After some time, the garbage collector detects that the key is no longer referenced anywhere else in the application and clears the WeakReference key. In the equivalent code using the HashMap from this point on, the WeakReference we created has a null referent. The following statement would now output true:
System.out.println(RefKey.get( ) == null);
Maintaining a reference to the WeakReference object (in the RefKey variable) does not affect clearing the referent. In the WeakHashMap, the WeakReference object key is also strongly referenced from the map, but its referent, the original key, is cleared.
The garbage collector adds the WeakReference that it recently cleared into its ReferenceQueue: that queue is the ReferenceQueue object that was passed in to the constructor of the WeakReference.
Trying to retrieve the value using a key equal to the original key would now return null. (To try this, it is necessary to use a key equal to the original key since we no longer have access to the original key; otherwise, it could not have been garbage-collected.) The following statement would now output true:
System.out.println(aWeakHashMap.get(equalKey) == null);
In our equivalent code using the HashMap, the following statements would now output true:
MyWeakReference RefKey3 = new MyWeakReference(equalKey, Queue);
System.out.println(aHashMap.get(RefKey3) == null);
However, at the moment the WeakReference and the value objects are still strongly referenced by the Map. That is where the ReferenceQueue comes in. Recall that when the garbage collector clears the WeakReference, it adds the WeakReference into the ReferenceQueue. Now that it is in the ReferenceQueue, we need to have it processed. In the case of the 1.2 and 1.3 versions of WeakHashMap, the WeakReference stays in the ReferenceQueue until the WeakHashMap is altered in some way (e.g., by calling put( ), remove( ), or clear( )). Once one of the mutator methods has been called, the WeakHashMap runs through its ReferenceQueue, removing all WeakReference objects from the queue and also removing each WeakReference object as a key in its internal map, thus simultaneously dereferencing the value. From the 1.4 version, accessing any key also causes the WeakHashMap to run through its ReferenceQueue. In the following example, I use a dummy object to force queue processing without making any real changes to the WeakHashMap:
aWeakHashMap.put(DUMMY_OBJ, DUMMY_OBJ);
The equivalent code using the HashMap does not need a dummy object, but we need to carry out the equivalent queue processing:
MyWeakReference aRef;
while ((aRef = (MyWeakReference) Queue.poll( )) != null)
{
  aHashMap.remove(aRef);
}
As you can see, we take each WeakReference out of the queue and remove it from the Map. This also releases the corresponding value object, and both the WeakReference object and the value object can now be garbage-collected if there are no other strong references to them.
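The queue-processing step can be packaged as a small helper and exercised without waiting for the garbage collector. The sketch below is an illustration under stated assumptions: the class QueueDrain and the method expungeStaleEntries( ) are hypothetical names, and the usage note relies on calling clear( ) and enqueue( ) by hand to stand in for the collector.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Map;

/** Removes map entries whose Reference keys have been placed on a ReferenceQueue. */
final class QueueDrain {
    /** Polls the queue until it is empty; returns how many entries were dropped. */
    static int expungeStaleEntries(ReferenceQueue<Object> queue,
                                   Map<? extends Reference<?>, ?> map) {
        int removed = 0;
        Reference<?> stale;
        while ((stale = queue.poll()) != null) {   // poll() returns null when empty
            // Removing the key also releases the map's reference to the value.
            if (map.keySet().remove(stale)) {
                removed++;
            }
        }
        return removed;
    }
}
```

To try it, create a WeakReference with the queue, put it in a HashMap as a key, then call clear( ) followed by enqueue( ) on the reference before invoking expungeStaleEntries( ). The helper works with any Reference key that still equals itself after being cleared, which is true of a plain WeakReference and of a subclass whose equals( ) begins with an identity check.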
Reference Objects with String Literal Referents

Note that if you use a string literal as a key to a WeakHashMap or the referent to a Reference object, it will not necessarily be garbage-collected when the application no longer references it. For example, consider the code:

String s = "hello";
WeakHashMap h = new WeakHashMap( );
h.put(s,"xyz");
s = null;

You might expect that the string "hello" can now be garbage-collected, since we have nulled the reference to it. However, a string created as a literal is created at compile time and held in a string pool in the class file. The JVM normally holds these strings internally in its own string pool after the class has been loaded. Consequently, the JVM retains a strong reference to the String object, and it will not be garbage-collected until the JVM releases it from the string pool: that may be never, or when the class is garbage-collected, or some other time. If you want to use a String object as a key to a WeakHashMap, ensure that it is created at runtime, e.g.:

String s1 = new String("hello");
String s2 = (new StringBuffer( )).append("hello").toString( );

This is one of the few times when creating extra copies of an object is better for performance! This string does not get put into the JVM string pool, and so can be garbage-collected when the application no longer holds strong references to it. Note that calling String.intern( ) on a string will also put it into the internal JVM string pool, giving rise to the same issues as literal strings. Similarly, other objects that the JVM could retain a strong reference to, such as Class objects, may also not be garbage-collected when there are no longer any strong references to them from the application, and so also should not be used as Reference object keys.
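The difference between a pooled literal and a string created at runtime shows up directly in identity comparisons, as the following sketch illustrates. The class and method names are hypothetical, and the example deliberately uses == on strings because object identity is the point here.

```java
/** Shows which String instances come from the JVM string pool. */
final class StringPoolDemo {
    /** Returns the pooled instance created from the literal in this class. */
    static String literalKey() {
        return "hello";
    }

    /** Returns a fresh instance created at runtime; it is not the pooled one. */
    static String runtimeKey() {
        return new String("hello");
    }

    public static void main(String[] args) {
        String literal = literalKey();
        String runtime = runtimeKey();

        // Equal contents, but two distinct objects.
        System.out.println("equal contents: " + literal.equals(runtime));
        System.out.println("same object:    " + (literal == runtime));

        // intern() hands back the pooled instance, which the JVM keeps referenced.
        System.out.println("interned is the literal: " + (runtime.intern() == literal));
    }
}
```

Only the instance returned by runtimeKey( ) is suitable as a WeakHashMap key if you want the entry to become collectable once the application drops its own references.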
Reference clearing is atomic. Consequently, there is no need to worry about achieving some sort of corrupt state if you try to access an object and the garbage collector is clearing keys at the same time. You will either get the object or you won't.
For 1.2 and 1.3, the values are not released until the WeakHashMap is altered. Specifically, one of the mutator methods, put( ), remove( ), or clear( ), needs to be called directly or indirectly (e.g., from putAll( )) for the values to be released by the WeakHashMap. If you do not call any mutator methods after populating the WeakHashMap, the values and WeakReference objects will never be dereferenced. This does not apply to 1.4 or, presumably, to later versions. However, even with 1.4, the WeakReference keys and values are not released in the background. With 1.4, the WeakReference keys and values are only released when some WeakHashMap method is executed, giving the WeakHashMap a chance to run through the reference queue.
The 1.2 and 1.3 WeakHashMap implementation wraps an internal HashMap. This means that practically every call to the WeakHashMap has one extra level of indirection it must go through (e.g., WeakHashMap.get( ) calls HashMap.get( )), which can be a significant performance overhead. This is a specific choice of the implementation. The 1.4 implementation has no such problem.
In the 1.2 and 1.3 implementations, every call to get( ) creates a new WeakReference object to enable equality testing of keys in the internal HashMap. Although these are small, short-lived objects, if get( ) is used intensively this could generate a heavy performance overhead. Once again, the 1.4 implementation has no such problem.
Unlike many other collections, WeakHashMap cannot maintain a count of elements, as keys can be cleared at any time by the garbage collector without immediately notifying the WeakHashMap. This means that seemingly simple methods such as isEmpty( ) and size( ) have more complicated implementations than for most collections. Specifically, size( ) in the 1.2 and 1.3 implementations actually iterates through the keys, counting those that have not been cleared. Consequently, size( ) is an operation that takes time proportional to the size of the WeakHashMap. In the 1.4 implementation, size( ) processes the reference queue, then returns the current size. Similarly, in the 1.2 and 1.3 implementations, isEmpty( ) iterates through the collection looking for a non-null key. This produces the perverse result that a WeakHashMap that had all its keys cleared and is therefore empty requires more time for isEmpty( ) to return than a similar WeakHashMap that is not empty. In the 1.4 implementation, isEmpty( ) processes the reference queue and returns whether the current size is 0, thus providing a more consistent execution time, although on average the earlier isEmpty( ) implementation would be quicker.