Chapter 11. Appropriate Data Structures and Algorithms

And this is a table ma'am. What in essence it consists of is a horizontal rectilinear plane surface maintained by four vertical columnar supports, which we call legs. The tables in this laboratory, ma'am, are as advanced in design as one will find anywhere in the world.

Michael Frayn, The Tin Men

In this chapter, we look at the performance problems that can stem from using an inappropriate or nonoptimal data structure. Of course, I cannot cover every possible structure. Instead, my focus is on how to performance-tune structures and associated algorithms. Those structures I do cover are provided as examples to give you an idea of how the tuning procedure looks.

For performance-tuning purposes, be aware of alternative structures and algorithms, and always consider the possibility of switching to one of these alternatives rather than tuning the structure and algorithm that is already being used. Being aware of alternative data structures requires extensive reading of computer literature.[1] One place to start is with the JDK code. Look at the structures that are provided and make sure that you know all about the available classes. There are already several good books on data structures and algorithms in Java, as well as many packages available from the Web with extensive documentation and often source code too. Some popular computer magazines include articles about structures and algorithms (see Chapter 19).[2]

[1] An interesting analysis of performance-tuning a "traveling salesman" problem is made by Jon Bentley in his article "Analysis of Algorithms," Dr. Dobb's Journal, April 1999.

[2] The classic reference is The Art of Computer Programming by Donald Knuth (Addison Wesley). A more Java-specific book is Data Structures and Algorithm Analysis in Java by Mark Weiss (Peachpit Press).

When tuning, you often need to replace one implementation of a class with a more efficient one. Switching data structures is easier in an object-oriented environment, because you can usually replace one or a few classes with different implementations while keeping all the interfaces and signatures the same.
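As a minimal sketch of this kind of swap, the following example writes its working code against the java.util.List interface only, so the caller can change the concrete class without touching the method. The class name SwapImplementation and the methods fill and sumFromFront are hypothetical names chosen for illustration, and the sketch assumes Java 8 or later. Removing from the front of a list is used here because it is a case where ArrayList and LinkedList scale differently; it is not a general recommendation of one class over the other.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical example class: the names here are illustrative only.
final class SwapImplementation {

    // Appends the values 0..n-1 to the given list and returns it.
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Depends only on the List interface, so any implementation can be passed in.
    // remove(0) shifts every remaining element in an ArrayList (O(n) per call),
    // but only unlinks the head node in a LinkedList (O(1) per call).
    static long sumFromFront(List<Integer> list) {
        long sum = 0;
        while (!list.isEmpty()) {
            sum += list.remove(0);
        }
        return sum;
    }

    // Times one run; a single run is only a rough indication, not a benchmark.
    static void time(String label, List<Integer> list) {
        long start = System.nanoTime();
        long sum = sumFromFront(list);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(label + ": sum=" + sum + ", " + micros + " us");
    }

    public static void main(String[] args) {
        int n = 20_000;
        // Only the constructor call changes; the signatures stay the same.
        time("ArrayList ", fill(new ArrayList<>(), n));
        time("LinkedList", fill(new LinkedList<>(), n));
    }
}
```

Both calls compute the same sum. Any difference in elapsed time comes purely from the implementation behind the interface, which is the property that makes this kind of tuning change cheap to try.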

When tuning algorithms, one factor that should come to mind first is the scaling characteristics of the algorithms you use. For example, bubblesort is an O(n^2) algorithm, while quicksort is O(n log n) on average. (The concept of "order of magnitude" statistics is described in Section 9.3 in Chapter 9.) This tells you nothing about absolute times for using either of these algorithms for sorting elements, but it does clearly tell you that quicksort has better scaling characteristics, and so is likely to be a better candidate as your collections increase in size. Similarly, hash tables offer O(1) average-case searching, whereas searching an unsorted array requires O(n).
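The difference in scaling can be made visible by counting comparisons rather than measuring time. The sketch below is illustrative only: the class name ScalingSketch and its method names are hypothetical, and it assumes Java 8 or later. Note that it does not implement quicksort; as a stand-in for an O(n log n) algorithm it uses Arrays.sort on an object array, which the JDK documents as a merge sort variant.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical example class: the names here are illustrative only.
final class ScalingSketch {

    // Plain bubblesort with no early exit: always n(n-1)/2 comparisons.
    // Sorts the array in place and returns the number of comparisons made.
    static long bubbleSortComparisons(int[] a) {
        long comparisons = 0;
        for (int end = a.length - 1; end > 0; end--) {
            for (int i = 0; i < end; i++) {
                comparisons++;
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                }
            }
        }
        return comparisons;
    }

    // Stand-in for an O(n log n) sort: Arrays.sort on objects (a merge sort
    // variant, not quicksort), with a comparator that counts its invocations.
    static long librarySortComparisons(Integer[] a) {
        long[] comparisons = {0};
        Arrays.sort(a, (x, y) -> {
            comparisons[0]++;
            return Integer.compare(x, y);
        });
        return comparisons[0];
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        // Doubling n roughly quadruples the bubblesort count,
        // but only slightly more than doubles the library sort count.
        for (int n = 1_000; n <= 4_000; n *= 2) {
            int[] primitives = random.ints(n, 0, 1_000_000).toArray();
            Integer[] boxed = Arrays.stream(primitives).boxed().toArray(Integer[]::new);
            System.out.println(n + " elements: bubblesort="
                    + bubbleSortComparisons(primitives)
                    + ", Arrays.sort=" + librarySortComparisons(boxed));
        }
    }
}
```

Comparison counts say nothing about absolute running time, but they show the growth rate directly, which is the property to check when a collection is expected to grow.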