9.2 An Efficient Sorting Framework

The sorting methods provided by the JDK are perfectly adequate for most situations. When they fall short, the techniques illustrated in the previous section often speed things up as much as is required. However, if you work on a project where varied and flexible sorting capabilities are needed, sorting is one area of performance tuning where it is sensible to create a framework early in the development cycle. A good sorting framework should allow you to change sorting-algorithm and comparison-ordering methods in a generic way, without having to change too much in the application.

Providing support for arbitrary sorting algorithms is straightforward: just use sorting interfaces. There needs to be a sorting interface for each type of object that can be sorted. Arrays and collection objects should be supported by any sorting framework, along with any other objects that are specific to your application. Here are two interfaces that define sorting objects for arrays and collections:

import java.util.Comparator;
import java.util.Collection;
   
public interface ArraySorter
{
  public void sort(Comparator comparator, Object[  ] arr);
  public void sort(Comparator comparator, Object[  ] arr, 
       int startIndex, int length);
  public void sortInto(Comparator comparator, Object[  ] source,
       int sourceStartIndex, int length,
       Object[  ] target, int targetStartIndex);
}
   
public interface CollectionSorter
{
  public Object[  ] sort(Comparator comparator, Collection c);
  public void sortInto(Comparator comparator, Collection c,
       Object[  ] target, int targetStartIndex);
}

Individual classes that implement the interfaces are normally stateless and hence implicitly thread-safe. This allows you to specify singleton sorting objects for use by other objects. For example:

public class ArrayQuickSorter
  implements ArraySorter
{
  public static final ArrayQuickSorter SINGLETON = new ArrayQuickSorter(  );
   
  //protect the constructor so that external classes are
  //forced to use the singleton
  protected ArrayQuickSorter(  ){  }
   
  public void sortInto(Comparator comparator, Object[  ] source,
    int sourceStartIndex, int length, Object[  ] target, int targetStartIndex)
  {
    //Only need the target - quicksort sorts in place.
    if ( !(source =  = target && sourceStartIndex =  = targetStartIndex) )
      System.arraycopy(source, sourceStartIndex, target,
          targetStartIndex, length);
    this.sort(comparator, target, targetStartIndex, length);
  }
   
  public void sort(Comparator comparator, Object[  ] arr)
  {
    this.sort(comparator, arr, 0, arr.length);
  }
   
  public void sort(Comparator comparator, Object[  ] arr,
       int startIndex, int length)
  {
    //quicksort algorithm implementation using Comparator.compare(Object, Object)
    ...
  }

This framework allows you to change the sort algorithm simply by changing the sort object you use. For example, if you use a quicksort but realize that your array is already partially sorted, simply change the sorter instance from ArrayQuickSorter.SINGLETON to ArrayInsertionSorter.SINGLETON.

However, we are only halfway to an efficient framework. Although the overall sorting structure is here, you have not supported generic optimizations such as optimized comparison wrappers (e.g., as with java.text.CollationKey). For generic support, you need the Comparator interface to have an additional method that checks whether it supports optimized comparison wrappers (which I will now call ComparisonKeys). Unfortunately, you cannot add a method to the Comparator interface, so you have to use the following subinterface:

public interface KeyedComparator
  extends Comparator
{
  public boolean hasComparisonKeys(  );
  public ComparisonKey getComparisonKey(Object o);
}
   
public interface ComparisonKey
{
  public int compareTo(ComparisonKey target);
  public Object getSource(  );
}

Now you need to support this addition to the framework in each sorter object. Since you don't want to change all your sorter-object implementations again and again, it's better to find any further optimizations now. One optimization is a sort that avoids a call to any method comparison. You can support that with a specific ComparisonKey class:

public class IntegerComparisonKey
  implements ComparisonKey
{
  public Object source;
  public int order;
  public IntegerComparisonKey(Object source, int order) {
     this.source = source;
     this.order = order;
  }
  public int compareTo(ComparisonKey target){
    return order - ((IntegerComparisonKey) target).order;
  }
  public Object getSource(  ) {return source;}
}

Now you can reimplement your sorter class to handle these special optimized cases. Only the method that actually implemented the sort needs to change:

public class ArrayQuickSorter
  implements ArraySorter
{
  //everything else as previously
  ...
   
  public void sort(Comparator comparator, Object[  ] arr, 
                   int startIndex, int length)
  {
    //If the comparator is part of the extended framework, handle
    //the special case where it recommends using comparison keys
    if (comparator instanceof KeyedComparator && 
          ((KeyedComparator) comparator).hasComparisonKeys(  ))
    {
      //wrap the objects in the ComparisonKeys
      //but if the ComparisonKey is the special case of
      //IntegerComparisonKey, handle that specially
      KeyedComparator comparer = (KeyedComparator) comparator;
      ComparisonKey first = comparer.getComparisonKey(arr[startIndex]);
      if (first instanceof IntegerComparisonKey)
      {
        //wrap in IntegerComparisonKeys
        IntegerComparisonKey[  ] iarr = new IntegerComparisonKey[length];
        iarr[startIndex] = (IntegerComparisonKey) first;
        for(int j = length-1, i = startIndex+length-1; j > 0; i--, j--)
          iarr[j] = comparer.getComparisonKey(arr[i]);
   
        //sort using the optimized sort for IntegerComparisonKeys
        sort_intkeys(iarr, 0, length);
   
        //and unwrap
        for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
          arr[i] = iarr[j].source;
      }
      else
      {
        //wrap in IntegerComparisonKeys
        ComparisonKey[  ] karr = new ComparisonKey[length];
        karr[startIndex] = first; 
        for(int j = length-1, i = startIndex+length-1; j > 0; i--, j--)
          karr[i] = comparer.getComparisonKey(arr[i]);
   
        //sort using the optimized sort for ComparisonKeys
        sort_keys(karr, 0, length);
   
        //and unwrap
        for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
          arr[i] = karr[i].getSource(  );
      }
    }
    else
      //just use the original algorithm
      sort_comparator(comparator, arr, startIndex, length);
  }
  public void sort_comparator(Comparator comparator, Object[  ] arr,
            int startIndex, int length)
  {
    //quicksort algorithm implementation using Comparator.compare(Object, Object)
    ...
  }
  public void sort_keys(ComparisonKey[  ] arr, int startIndex, int length)
  {
    //quicksort algorithm implementation using 
    //ComparisonKey.compare(ComparisonKey)
    ...
  }
  public void sort_intkeys(IntegerComparisonKey[  ] arr,
            int startIndex, int length)
  {
    //quicksort algorithm implementation comparing key order directly
    //using access to the IntegerComparisonKey.order field
    //i.e if (arr[i].order > arr[j].order)
    ...
  }
}

Although the special cases mean that you have to implement the same algorithm three times (with slight changes to data type and comparison method), this is the kind of tradeoff you often have to make for performance optimizations. The maintenance impact is limited by having all implementations in one class, and once you've debugged the sort algorithm, you are unlikely to need to change it.

This framework now supports:

  • An easy way to change the sorting algorithm being used at any specific point of the application.

  • An easy way to change the pair-wise comparison method, by changing the Comparator object.

  • Automatic support for comparison key objects. Comparison keys are optimal to use in sorts where the comparison method requires a calculation for each object being compared, and that calculation could be cached.

  • An optimized integer key comparison class, which doesn't require method calls when used for sorting.

This outline should provide a good start to building an efficient sorting framework. Many further generic optimizations are possible, such as supporting a LongComparisonKey class and other special classes appropriate to your application. The point is that the framework should handle optimizations automatically. The most the application builder should do is decide on the appropriate Comparator or ComparisonKey class to build for the object to be sorted.

The last version of our framework supports the fastest sorting implementation from the previous section (the last implementation with no casts and direct access to the ordering field). Unfortunately, the cost of creating an IntegerComparisonKey object for each object being sorted is significant enough to eliminate the speedup from getting rid of the casts. It's worth looking at ways to reduce the cost of object creations for comparison keys. This cost can be reduced using the object-to-array mapping technique from Chapter 4: the array of IntegerComparisonKeys is changed to a pair of Object and int arrays. By adding another interface, you can support the needed mapping:

interface RawIntComparator
  //extends not actually necessary, but logically applies
  extends KeyedComparator
{
  public void getComparisonKey(Object o, int[  ] orders, int idx);
}

For the example Sortable class that was defined earlier, you can implement a Comparator class:

public class SortableComparator
 implements RawIntComparator
{
  //Required for Comparator interface
  public int compare(Object o1, Object o2){
    return ((Sortable) o1).order -((Sortable) o2).order;} 
  //Required for Comparator interface
  public boolean hasComparisonKeys(  ){return true;}
  public ComparisonKey getComparisonKey(Object o){
    return new IntegerComparisonKey(o, ((Sortable) o).order);}
  //Required for RawIntComparator interface
  public void getComparisonKey(Object s, int[  ] orders, int index){
    orders[index] = ((Sortable) s).order;}
}

Then the logic to support the RawIntComparator in the sorting class is:

public class ArrayQuickSorter
  implements ArraySorter
{
  //everything else as previously except rename the
  //previously defined sort(Comparator, Object[  ], int, int)
  //method as previous_sort
  ...
   
  public void sort(Comparator comparator, Object[  ] arr, 
                   int startIndex, int length)
  {
    //support RawIntComparator types
    if (comparator instanceof RawIntComparator)
    {
      RawIntComparator comparer = (RawIntComparator) comparator;
      Object[  ] sources = new Object[length];
      int[  ] orders = new int[length];
   
      for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
      {
          comparer.getComparisonKey(arr[i], orders, j);
          sources[j] = arr[i];
      }
   
      //sort using the optimized sort with no casts
      sort_intkeys(sources, orders, 0, length);
   
      //and unwrap
      for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
        arr[i] = sources[j];
    }
    else
      previous_ sort(comparator, arr, startIndex, length);
  }
   
  public void sort_intkeys(Object[  ] sources, int[  ] orders,
             int startIndex, int length)
  {
    quicksort(sources, orders, startIndex, startIndex+length-1);
  }
   
  public static void quicksort(Object[  ] sources, int[  ] orders, int lo, int hi)
  {
    //quicksort algorithm implementation with a pair of
    //synchronized arrays. 'orders' is the array used to
    //compare ordering. 'sources' is the array holding the
    //source objects whicn needs to be altered in synchrony
    //with 'orders'
    if( lo >= hi ) 
      return;
   
    int mid = ( lo + hi ) / 2;
    Object tmp_o;
    int tmp_i;
    int middle = orders[ mid ];
   
    if( orders[ lo ] > middle )
    {
      orders[ mid ] = orders[ lo ];
      orders[ lo ] = middle;
      middle = orders[ mid ];
      tmp_o = sources[mid];
      sources[ mid ] = sources[ lo ];
      sources[ lo ] = tmp_o;
    }
    
    if( middle > orders[ hi ])
    {
      orders[ mid ] = orders[ hi ];
      orders[ hi ] = middle;
      middle = orders[ mid ];
      tmp_o = sources[mid];
      sources[ mid ] = sources[ hi ];
      sources[ hi ] = tmp_o;
   
      if( orders[ lo ] > middle)
      {
        orders[ mid ] = orders[ lo ];
        orders[ lo ] = middle;
        middle = orders[ mid ];
        tmp_o = sources[mid];
        sources[ mid ] = sources[ lo ];
        sources[ lo ] = tmp_o;
      }
    }
   
    int left = lo + 1;
    int right = hi - 1;
   
    if( left >= right ) 
      return;
   
    for( ;; ) 
    {
      while( orders[ right ] > middle)
      {
        right--;
      }
   
      while( left < right && orders[ left ] <= middle )
      {
        left++;
      }
   
      if( left < right )
      {
        tmp_i = orders[ left ];
        orders[ left ] = orders[ right ];
        orders[ right ] = tmp_i;
        tmp_o = sources[ left ];
        sources[ left ] = sources[ right ];
        sources[ right ] = tmp_o;
        right--;
      }
      else
      {
        break;
      }
    }
   
    quicksort(sources, orders, lo, left);
    quicksort(sources, orders, left + 1, hi);
  }
}

With this optimization, the framework quicksort is now as fast as the fastest handcrafted quicksort from the previous section for some VMs (see Table 9-2).

Table 9-2. Timings of the various sorting tests normalized to the initial JDK 1.2 test of Table 9-1
 

1.2.2

1.3.1_02

1.3.1_02-server

1.4.0

1.4.0-server

1.4.0-Xint

Quicksort(Object[ ]) from Table 9-1

100%

52%

53%

49%

50%

356%

Quicksort(Sortable[ ]) using field access from Table 9-1

43%

32%

30%

31%

30%

111%

ArrayQuickSorter using Sortable.field[3]

39%

53%

47%

34%

60%

111%

Arrays.sort( ) from Table 9-1

152%

63%

62%

59%

56%

267%

[3] HotSpot is variable in how well it manages to optimize the framework sort. The 1.4.0 client is almost as fast as the direct field access sort. This indicates that HotSpot technology is theoretically capable of similarly optimizing the framework sort. That it hasn't managed to in some modes of JDK 1.3 and 1.4 indicates that the VM can be improved further.