3.6 Replacing JDK Classes

You can replace JDK classes directly. Unfortunately, you can't distribute these altered classes with any application or applet unless you have complete control of the target environment. Although you often do have this control with in-house and enterprise-developed applications, most enterprises prefer not to deploy alterations to externally built classes. The alterations would not be supported by the vendor (Sun in this case) and may violate the license, so contact the vendor if you need to do this. In addition, altering classes in this way can be a significant maintenance problem.[3]

[3] If your application has its classes localized in one place on one machine, for example with servlets, you might consider deploying changes to the core classes.

The upshot is that you can easily alter JDK-supplied classes for development purposes, which can be useful for various reasons including debugging and tuning. But if you need the functionality in your deployed application, you need to provide your own classes and redirect method calls to them instead of to the JDK classes.

Replacing JDK classes indirectly in this way is a valid tuning technique. Some JDK classes, such as StreamTokenizer (see Section 5.4), are inefficient and can be replaced quite easily since you normally use them in small, well-defined parts of a program. Other JDK classes, like Date, BigDecimal, and String, are used all over the place, and it can take a large effort to replace references with your own versions of these classes. The best way to replace these classes is to start from the design stage so that you can consistently use your own versions throughout the application.

In SDK 1.3, many of the java.lang.Math methods were changed from native to call the corresponding methods in java.lang.StrictMath. StrictMath provides bitwise consistency across platforms; earlier versions of Math used platform-specific native functions that were not identical across all platforms. Unfortunately, StrictMath calculations are somewhat slower than the corresponding native functions. My colleague Kirk Pepperdine, who first pointed out the performance problem to me, puts it this way: "I've now got a bitwise-correct but excruciatingly slow program." The potential workarounds to this performance issue are all ugly: using an earlier JDK version, replacing the JDK class with an earlier version, or writing your own class to manage faster alternative floating-point calculations.
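
As an illustration of the third workaround, the following is a minimal sketch of a class that manages an alternative floating-point calculation. The class name FastTrig, the table size, and the choice of a lookup table with linear interpolation are all assumptions made for this example; they are not taken from the JDK or from any particular library. The sketch trades accuracy for a simpler calculation, and whether it is actually faster on a given VM has to be measured rather than assumed:

```java
// Hypothetical helper: a table-driven sine, shown only as a sketch.
final class FastTrig {
    private static final int SIZE = 4096;              // assumed table resolution
    private static final double TWO_PI = 2.0 * Math.PI;
    private static final double[] SIN = new double[SIZE + 1];

    static {
        // Fill the table once, using the bitwise-consistent StrictMath version.
        for (int i = 0; i <= SIZE; i++) {
            SIN[i] = StrictMath.sin(i * TWO_PI / SIZE);
        }
    }

    private FastTrig() { }

    // Approximates sin(radians) by linear interpolation between table entries.
    static double sin(double radians) {
        double turns = radians / TWO_PI;
        turns -= Math.floor(turns);          // fractional part, normally in [0, 1)
        double pos = turns * SIZE;
        int idx = (int) pos;
        if (idx >= SIZE) {
            idx = SIZE - 1;                  // guard against rounding up to 1.0
        }
        double frac = pos - idx;
        return SIN[idx] + frac * (SIN[idx + 1] - SIN[idx]);
    }
}
```

Because callers go through FastTrig.sin( ) rather than Math.sin( ), the implementation can later be swapped (for example, back to a plain delegation to Math) without touching the rest of the application.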

For optimal performance, I recommend developing with your own versions of classes rather than the JDK versions whenever possible. This gives maximum tuning flexibility. However, this recommendation is clearly impractical in most cases. Given that, perhaps the single most significant class to replace with your own version is the String class. Most other classes can be replaced inside identified bottlenecks when required during tuning without affecting other parts of the application. But String is used so extensively that replacing String references in one location tends to have widespread consequences, requiring extensive rewriting in many parts of the application. In fact, this observation also applies to other data type classes you use extensively (Integer, Date, etc.). But the String class tends to be used most often. See Chapter 5 for details on why the String class can be a performance problem and why you might need to replace it.

It is often impractical to replace the String classes where their internationalization capabilities are required. Because of this, you should logically partition the application's use of Strings to identify those aspects that require internationalization and those aspects that are really character processing, independent of language dependencies. The latter usage of Strings can be replaced more easily than the former. Internationalization-dependent String manipulation is difficult to tune because you are dependent on internationalization libraries that are difficult to replace.

Many JDK classes provide generic capabilities (as you would expect from library classes), so they are frequently more generic than what is required for your particular application. These generic capabilities often come at the expense of performance. For example, Vector is fine for generic Objects, but if you are using a Vector for only one type of object, then a custom version with an array and accessors of that type is faster, as you can avoid all the casts required to convert the generic Object back into your own type. Using Vector for basic data types (e.g., longs) is even worse, requiring the data type to be wrapped by an object to get it into the Vector. For example, building and using a LongVector class improves performance and readability by avoiding casts, Long wrappers, unwrapping, etc.:

public class LongVector
{
  long[] internalArray;
  int arraySize;
  ...
  public void addElement(long l) {
  ...
  public long elementAt(int i) {
  ...
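
The outline above leaves the method bodies elided. One way they might be filled in is sketched here; the growth policy (doubling), the default capacity, the bounds check, and the size( ) method are assumptions added for the example, not part of the original outline:

```java
import java.util.Arrays;

// A minimal sketch of a Vector-like collection specialized for longs.
class LongVector {
    private long[] internalArray;
    private int arraySize;

    LongVector() {
        this(10);                                   // assumed default capacity
    }

    LongVector(int initialCapacity) {
        internalArray = new long[Math.max(1, initialCapacity)];
    }

    // Appends a primitive long; no Long wrapper object is created.
    public void addElement(long l) {
        if (arraySize == internalArray.length) {
            // Assumed growth policy: double the backing array when full.
            internalArray = Arrays.copyOf(internalArray, arraySize * 2);
        }
        internalArray[arraySize++] = l;
    }

    // Returns the primitive directly; no cast or unwrapping is needed.
    public long elementAt(int i) {
        if (i < 0 || i >= arraySize) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return internalArray[i];
    }

    public int size() {
        return arraySize;
    }
}
```

A caller then writes `long x = v.elementAt(0);` instead of casting the result of Vector.elementAt( ) to Long and unwrapping it.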

Note that Generics are due to be introduced in Version 1.5. Generics allow instances of generic classes like Vector to be specified as aggregate objects that hold only specified types of objects. However, the implementation of Generics is to insert casts at all the access points and to analyze the updates to ensure that the update type matches the cast type. There is no specialized class generation, so there is no performance benefit, and there may even be a slight performance degradation from the additional casts.
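
For readers on a Generics-capable release, the point can be seen in a small sketch. The class and method names are hypothetical. The example also relies on autoboxing, which arrived in the same release, so each long added to the Vector is still wrapped in a Long object and unwrapped on the way out:

```java
import java.util.Vector;

// Sketch: a parameterized Vector is still the same runtime class,
// and its elements are still stored as wrapper objects.
class GenericsErasureDemo {

    // Sums the elements; each get() returns a Long that is then unboxed.
    static long sum(Vector<Long> values) {
        long total = 0;
        for (int i = 0; i < values.size(); i++) {
            total += values.get(i);
        }
        return total;
    }

    // Vector<Long> and Vector<String> share one class at runtime:
    // no specialized class is generated per element type.
    static boolean sameRuntimeClass() {
        return new Vector<Long>().getClass() == new Vector<String>().getClass();
    }
}
```

The parameterized type improves compile-time checking and readability, but a specialized class such as LongVector is still what removes the wrapper objects.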

If you are using your own classes, you can extend them with specific functionality you require, with direct access to the internals of the class. Using Vector as an example, if you want to iterate over the collection (e.g., to select a particular subset based on some criteria), you need to access the elements through the get( ) method for each element, with the significant overhead that implies. If you are using your own (possibly derived) class, you can implement the specific action you want in the class, allowing your loop to access the internal array directly with the consequent speedup:

public class QueryVector extends MyVector
{
  public Object[] getTheBitsIWant() {
    //Access the internal array directly rather than going through
    //the method accessors. This makes the search much faster
    Object[] results = new Object[10];
    for(int i = arraySize-1; i >= 0; i--)
      if (internalArray[i] ....
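
The fragment above elides both the selection test and the MyVector superclass. The following self-contained sketch shows one way the pieces could fit together. The MyVector implementation, the selectStringsStartingWith( ) method, and its prefix criterion are all hypothetical, chosen only to give the loop something concrete to test:

```java
// Assumed minimal superclass exposing its internals to subclasses.
class MyVector {
    protected Object[] internalArray = new Object[10];
    protected int arraySize;

    public void addElement(Object o) {
        if (arraySize == internalArray.length) {
            Object[] bigger = new Object[arraySize * 2];
            System.arraycopy(internalArray, 0, bigger, 0, arraySize);
            internalArray = bigger;
        }
        internalArray[arraySize++] = o;
    }

    public int size() {
        return arraySize;
    }
}

class QueryVector extends MyVector {

    // Hypothetical query: returns the String elements that start with the
    // given prefix, scanning the internal array directly from the end.
    public Object[] selectStringsStartingWith(String prefix) {
        Object[] results = new Object[arraySize];   // worst case: all match
        int count = 0;
        for (int i = arraySize - 1; i >= 0; i--) {
            Object o = internalArray[i];
            if (o instanceof String && ((String) o).startsWith(prefix)) {
                results[count++] = o;
            }
        }
        // Trim to the number of matches actually found.
        Object[] trimmed = new Object[count];
        System.arraycopy(results, 0, trimmed, 0, count);
        return trimmed;
    }
}
```

Because the loop runs from the last element to the first, matches come back in reverse insertion order. The query makes no accessor call per element, which is the saving the text describes.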

Finally, there are often many places where objects (especially collection objects) are used initially for convenience (e.g., using a Vector because you did not know the size of the array you would need). In a final version of the application, they can be replaced with presized arrays. A known-sized array (not a collection object) is the fastest way in Java to store and access elements of a collection.
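
As a small sketch of that last substitution, the two methods below build the same data, first with a Vector used for convenience and then with a presized array once the size is known. The class name, method names, and the squares example are hypothetical:

```java
import java.util.Vector;

class PresizedArrayDemo {

    // Convenience version: the Vector grows as needed and stores
    // each value as an Integer wrapper object.
    static Vector<Integer> squaresInVector(int n) {
        Vector<Integer> squares = new Vector<Integer>();
        for (int i = 0; i < n; i++) {
            squares.addElement(Integer.valueOf(i * i));
        }
        return squares;
    }

    // Final version: the size is known up front, so a presized
    // primitive array is allocated once and indexed directly.
    static int[] squaresInArray(int n) {
        int[] squares = new int[n];
        for (int i = 0; i < n; i++) {
            squares[i] = i * i;
        }
        return squares;
    }
}
```

The array version needs no resizing, no wrapper objects, and no method call per access, at the cost of having to know the size in advance.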