13.7 Performance Planning

This chapter has described how to factor in performance at various stages of development. Integrating this advice allows you to create a performance plan, as outlined in this section.

  1. Specify performance requirements.

    During the specification stage, the performance requirements of the application need to be defined. This is not primarily a developer task. Your customers or business experts need to establish what response time is acceptable for most functions the user will execute. It may be more useful to start by specifying what response times will be unacceptable.

    This task can be undertaken at a later stage of development. In fact, if a prototype has already been created, it can be simpler to use the prototype, together with other business information, to specify acceptable response times. But do not neglect to specify these response-time requirements before starting any implementation-level performance tuning (code tuning). If code tuning starts without performance requirements, the goals are inadequately defined, and tuning effort will be wasted on parts of the application that do not need it.

    If your development environment is layered (e.g., application layer, component layer, technical architecture layer), try to define performance specifications that map to each layer, so that each team has its own set of performance targets to work on. If this is not possible, the performance experts will need to be able to tune across all layers and interact with all teams.

  2. Include a performance focus in the analysis phase.

    During the analysis stage, the main performance focus is to analyze the requirements for shared and limited resources in the application (e.g., a network connection is both a shared and a limited resource; a database table is a shared resource; threads are a limited resource). These are the resources that will cost the most to fix later in development if they are not identified and designed correctly at the outset. Analysis of data volume and load-carrying capacities of the components of the system should also be carried out to determine the limitations of the system.

    This task should fit in comfortably as part of the normal analysis stage. To be on the safe side, or to highlight the requirement for performance analysis, you may wish to allocate 10% of planned analysis time for performance analysis in this phase. The analysis team must be aware of the performance impact of different design choices so that they do not miss aspects of the system that need analysis (see the earlier Section 13.3). The analysis should be made in association with the technical architecture analysis so that you end up with an architectural blueprint that clearly identifies performance aspects.

  3. Require performance predictions from the design team.

    Progressing from the analysis stage, the performance focus in the design phase should be on how shared resources will be used by the application and on the performance consequences of the expected physical architecture of the deployed application.

    Ensure that the designers are aware of the performance consequences of different decisions by asking for performance-impact predictions to be included with the normal design deliverables. The external design review should include design experts who are familiar with the performance aspects of design choices; alternatively, a separate performance expert familiar with design should review the application design. If any significant third-party product will be used (e.g., a middleware or database product), the vendor should have performance experts who can validate the design and identify potential performance problems. A 10% budget allocation for performance planning and testing highlights the emphasis on performance. See the earlier Section 13.4.

    The design should address scalability both for users and for data/object volumes; the degree to which the application can be distributed, which depends on the required level of messaging between distributed components; and the transaction mechanisms and modes (pessimistic or optimistic, the locks required, and the durations of transactions and of the locks held). The theoretical limit to the performance of many multiuser applications is the amount and duration of locks held on shared resources. The designers should also include a section on handling queries against large datasets, if that will be significant for your application.
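
    As a simple illustration of why lock scope and duration matter, the following sketch (the class, method names, and timing values are hypothetical, not taken from this chapter) contrasts holding a lock across an entire operation with holding it only around the shared-state update:

        // Hypothetical inventory service illustrating lock scope and duration.
        public class InventoryService {

            private final Object stockLock = new Object();
            private int stockLevel = 1000;

            // Wide lock scope: the lock is held across the slow audit call,
            // so at most one reservation proceeds at a time under load.
            public void reserveHoldingLockThroughout(int quantity) {
                synchronized (stockLock) {
                    auditReservation(quantity);   // slow I/O while the lock is held
                    stockLevel -= quantity;
                }
            }

            // Narrow lock scope: the slow work happens outside the lock; the
            // shared state is locked only for the brief update itself.
            public void reserveWithNarrowLock(int quantity) {
                auditReservation(quantity);       // slow I/O, no lock held
                synchronized (stockLock) {
                    stockLevel -= quantity;
                }
            }

            // Stands in for a slow external call (audit log, remote service, etc.).
            private void auditReservation(int quantity) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }

    With 50 milliseconds of work held inside the lock, throughput through that shared resource is capped at roughly 20 reservations per second no matter how many threads or servers are added; with the narrow scope, the cap is set only by the brief update itself.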

  4. Create a performance-testing environment.

    The performance task for the beginning of the development phase is setting up the performance-testing environment. You need to:

    • Specify benchmark functions and required response times based on the specification.

    • Ensure that a reasonably accurate test environment for the system is available.

    • Buy or build various performance tools for your performance experts to evaluate, including profiling tools, monitoring tools, benchmark harnesses, and web-loading, GUI capture/playback, or other client-emulation tools.

    • Ensure that the benchmark/performance-testing harness can drive the application with simulated user and external driver activity (a minimal sketch of such a harness appears after this list).

    • Schedule regular, exclusive performance-testing time for the test environment: if the test environment is shared, performance testing should not take place at the same time as other activities.

    • Create reusable performance tests with reproducible application activity. Note that this is not QA: the tests should not be testing failure modes of the system, only normal, expected activity.

    • Prepare the testing and monitoring environment. This is normally system-specific and usually evolves as the testing proceeds. You will ultimately need to have performance-monitoring tools or scripts that monitor the underlying system performance as well as providing statistics on network and application performance (discussed further in Step 8).

    • Plan for code versioning and release from your development environment to your performance environment, according to your performance test plan. (Note that this often requires a round of bug-fixing to properly run the tests, and time restrictions usually mean that it is not possible to wait for the full QA release, so plan for some developer support.)
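
    As a minimal sketch of what such a harness might look like (all class names, counts, and the placeholder benchmark function are hypothetical), the following driver spawns a fixed number of simulated users, has each repeatedly execute one benchmark function, and reports response-time statistics that can be checked against the specified requirements:

        import java.util.ArrayList;
        import java.util.Collections;
        import java.util.List;
        import java.util.concurrent.Callable;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;

        // Hypothetical load driver: N simulated users each execute a benchmark
        // function repeatedly, and the harness reports response-time statistics.
        public class BenchmarkHarness {

            static final int SIMULATED_USERS = 20;
            static final int REQUESTS_PER_USER = 50;

            public static void main(String[] args) throws Exception {
                ExecutorService users = Executors.newFixedThreadPool(SIMULATED_USERS);
                Callable<List<Long>> oneUser = BenchmarkHarness::runOneUser;
                List<Future<List<Long>>> results = new ArrayList<>();
                for (int i = 0; i < SIMULATED_USERS; i++) {
                    results.add(users.submit(oneUser));
                }

                List<Long> timingsMillis = new ArrayList<>();
                for (Future<List<Long>> result : results) {
                    timingsMillis.addAll(result.get());
                }
                users.shutdown();

                Collections.sort(timingsMillis);
                long median = timingsMillis.get(timingsMillis.size() / 2);
                long ninetieth = timingsMillis.get((int) (timingsMillis.size() * 0.9));
                System.out.println("median=" + median + "ms 90th-percentile=" + ninetieth + "ms");
            }

            // One simulated user: repeatedly invoke the benchmark function and
            // record each response time.
            static List<Long> runOneUser() {
                List<Long> timings = new ArrayList<>();
                for (int i = 0; i < REQUESTS_PER_USER; i++) {
                    long start = System.currentTimeMillis();
                    executeBenchmarkFunction();      // stands in for a real request
                    timings.add(System.currentTimeMillis() - start);
                }
                return timings;
            }

            // Placeholder for the real benchmark function, e.g., an HTTP request
            // or a call into the application's service layer.
            static void executeBenchmarkFunction() {
                try {
                    Thread.sleep((long) (Math.random() * 20));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }

    The same structure extends naturally to reusable, reproducible tests: fix the set of benchmark functions and the simulated load, and compare the reported statistics against the response-time requirements from Step 1.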

  5. Test a simulation or skeleton system for validation.

    Create a simulation of the system that faithfully represents its main components. Implement the simulation so that you can test the scalability of the system, determining how shared resources respond to increased loads and at what point limited resources start to become exhausted or bottlenecked. The simulation should allow finished components to be integrated as they become available. If budget or resources are unavailable for an initial simulation, skip it, but start testing as soon as enough components are available to assemble a skeleton version of the system. The aim is to determine the response times and scalability of the system as early as possible, providing feedback that validates (or corrects) the design.
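
    One way to structure such a simulation (a sketch only; the interface, class, and latency values below are hypothetical) is to stand in for unfinished components with stubs that reproduce their expected latency, swapping in the real implementations as they are completed:

        // In practice these two types would live in separate source files.
        // The interface is shared by the real implementation and the stub,
        // so finished components can be swapped in without changing callers.
        interface OrderComponent {
            void placeOrder(String customerId, String productId);
        }

        // Simulation stub: reproduces the expected latency of the unfinished
        // component so that system-level scaling tests can run early.
        class SimulatedOrderComponent implements OrderComponent {

            private final long expectedLatencyMillis;

            SimulatedOrderComponent(long expectedLatencyMillis) {
                this.expectedLatencyMillis = expectedLatencyMillis;
            }

            public void placeOrder(String customerId, String productId) {
                try {
                    // Stands in for the real work: database access, messaging, etc.
                    Thread.sleep(expectedLatencyMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }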

    If you have a "Proof of Concept" stage planned, it could provide the simulation or a good basis for the simulation. Ideally, the validation would take place as part of the "Proof of Concept."

  6. Integrate performance logging.

    Integrate performance logging into the application. This logging will be deployed with the released application (see Step 8), so it must be designed to be low-impact. Performance logging should be added at all the layer boundaries: servlet I/O and marshalling, JVM server I/O and marshalling, database access/update, transaction boundaries, and so on. It should produce no more than one line of output to a log file per 20 seconds, and it should add less than 1% to the time of any application activity. Logging should be configurable to aggregate variable amounts of statistics so that it can be deployed to produce one summary log line per configurable time unit (e.g., one summary line every minute). Ideally, the output should be designed so that it can be analyzed in a spreadsheet, allowing results to be aggregated effectively and read easily. J2EE monitoring products are available that automatically integrate logging into J2EE servers (see http://www.JavaPerformanceTuning.com/resources.shtml).
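
    A minimal sketch of such an aggregating logger follows (the class name, measurement names, and one-minute interval are hypothetical); the per-call cost is limited to two counter updates, and a background task writes one comma-separated summary line per interval:

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.Executors;
        import java.util.concurrent.ScheduledExecutorService;
        import java.util.concurrent.TimeUnit;
        import java.util.concurrent.atomic.LongAdder;

        // Hypothetical low-impact performance logger: each call records into
        // in-memory counters; one summary line is written per interval.
        public class PerfLog {

            private static final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
            private static final Map<String, LongAdder> totalMillis = new ConcurrentHashMap<>();

            static {
                ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
                flusher.scheduleAtFixedRate(PerfLog::flushSummary, 60, 60, TimeUnit.SECONDS);
            }

            // Called at layer boundaries, e.g., PerfLog.record("db.update", elapsed).
            public static void record(String operation, long elapsedMillis) {
                counts.computeIfAbsent(operation, k -> new LongAdder()).increment();
                totalMillis.computeIfAbsent(operation, k -> new LongAdder()).add(elapsedMillis);
            }

            // One comma-separated summary line per interval, suitable for loading
            // into a spreadsheet: timestamp,operation,count,averageMillis,...
            private static void flushSummary() {
                StringBuilder line = new StringBuilder(String.valueOf(System.currentTimeMillis()));
                for (Map.Entry<String, LongAdder> entry : counts.entrySet()) {
                    long count = entry.getValue().sumThenReset();
                    LongAdder total = totalMillis.get(entry.getKey());
                    long totalTime = (total == null) ? 0 : total.sumThenReset();
                    long average = (count == 0) ? 0 : totalTime / count;
                    line.append(',').append(entry.getKey())
                        .append(',').append(count)
                        .append(',').append(average);
                }
                System.out.println(line);  // in practice, write to the performance log file
            }
        }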

  7. Performance-test and tune using results.

    During code implementation, unit performance testing should be scheduled along with QA. No unit performance tuning is required until the unit is ready for QA. Unit performance tuning proceeds by integrating the unit into the system simulation and running scaling tests with profiling.

    It is important to test the full system or a simulation of it as soon as is feasible, even if many of the units are incomplete. Simulated units are perfectly okay at an early stage of system performance testing. Initially, the purpose of this system performance test is to validate the design and architecture and identify any parts of the design or implementation that will not scale. Later, the tests should provide detailed logs and profiles that will allow developers to target bottlenecks in the system and produce faster versions of the application.

    To support the later-stage performance testing, the test bed should be configured to provide performance profiles of any JVM processes, including system and network statistics, in addition to performance logging. Your performance experts should be able to produce JVM profiles and obtain and analyze statistics from your target system.

    The performance tests should scale to higher loads of users and data. Scale tests to twice the expected peak load. Test separately:

    • Twice the peak expected throughput, together with the peak expected data volume and the peak expected users.

    • Twice the peak expected data volume, together with the peak expected throughput and the peak expected users.

    • Twice the peak expected users, together with the peak expected data volume and the peak expected throughput.

    User activity should be simulated as accurately as possible, but it is most important that the data is simulated to reproduce the variety of real data; otherwise, cache activity can produce completely misleading results. The number of objects should be scaled to realistic volumes: this is especially important for query testing and batch updates. Do not underestimate the complexity of creating large amounts of realistic data for scalability testing.
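
    As a small illustration of the data-variety point (the entity names, volumes, and access skew below are hypothetical), a test-data generator along the following lines spreads key values across realistic ranges, rather than reusing a handful of fixed values that would make caches look unrealistically effective:

        import java.util.Random;

        // Hypothetical test-data generator: varies customer and product keys
        // across realistic ranges so caches do not see an artificially small
        // working set during scalability tests.
        public class TestDataGenerator {

            private static final int CUSTOMER_COUNT = 500000;
            private static final int PRODUCT_COUNT = 50000;

            private final Random random = new Random(42);  // fixed seed: reproducible test runs

            public String nextCustomerId() {
                return "CUST-" + random.nextInt(CUSTOMER_COUNT);
            }

            public String nextProductId() {
                // Skewed access: a minority of "hot" products account for most
                // activity, which is more realistic than a uniform spread.
                if (random.nextInt(100) < 80) {
                    return "PROD-" + random.nextInt(PRODUCT_COUNT / 10);
                }
                return "PROD-" + random.nextInt(PRODUCT_COUNT);
            }
        }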

  8. Deploy with performance-logging features.

    Performance-logging features should be deployed with the released application. Such logging provides remote analysis and constant monitoring capabilities for the deployed application. Ideally, you should develop tools that automatically analyze the performance logs. At minimum, the performance-log analysis tools should generate summaries of the logs, compare performance against a set of reference logs, and highlight anomalies.

    Two other useful tools are one that identifies long-term trends in the performance logs and one that generates alerts when particular performance measurements move outside their defined ranges. A graphical interface for these tools is also helpful.
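
    A minimal sketch of the kind of check such a tool performs (the summary-line format, reference values, and alert margin are hypothetical): it reads the one-line-per-interval summaries described in Step 6, compares each operation's average response time against a reference value from a known-good run, and flags anomalies that exceed a configured margin:

        import java.util.Map;

        // Hypothetical anomaly check for summary lines of the form
        // "timestamp,operation,count,averageMillis,operation,count,averageMillis,..."
        public class PerfLogAnalyzer {

            private static final double ALERT_MARGIN = 1.5;   // alert at 50% over the reference

            // referenceAverages holds average response times from a known-good
            // run, keyed by operation name.
            public static void checkLine(String summaryLine, Map<String, Long> referenceAverages) {
                String[] fields = summaryLine.split(",");
                for (int i = 1; i + 2 < fields.length; i += 3) {
                    String operation = fields[i];
                    long average = Long.parseLong(fields[i + 2]);
                    Long reference = referenceAverages.get(operation);
                    if (reference != null && average > reference * ALERT_MARGIN) {
                        System.out.println("ALERT: " + operation + " averaged " + average
                                + "ms against a reference of " + reference + "ms");
                    }
                }
            }
        }

    The same parsing loop can feed a long-term trend store or a graphical front end, covering the other tools mentioned above.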