The user's perception of performance is crucial. The user perception discussion from Chapter 1 applies to J2EE architectures, but there are some additional considerations.
Some connections may fail due to a congested network or an overloaded server. Users perceive the need to reenter data or return to a previous screen as bad performance. Ideally, when a connection is reestablished, the user should find himself back at the same state as before the connection failure. If the session ID is still valid, the server should hold all the session state so that the display can be re-created at any point. If the session ID was invalidated by the connection failure, then state maintained in the client should enable the display to be re-created.
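One way to support display re-creation is to key display state by session ID on the server, falling back to client-supplied state when the session has been invalidated. A minimal sketch of the idea (the class, method names, and state representation here are illustrative, not part of the servlet API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: server-side display state with a client-side
// fallback, so the screen can be re-created after a dropped connection.
public class DisplayStateStore {
    private final Map<String, Map<String, String>> bySession = new ConcurrentHashMap<>();

    // Record every field needed to redraw the current screen.
    public void save(String sessionId, Map<String, String> screenState) {
        bySession.put(sessionId, screenState);
    }

    // On reconnection: prefer the server copy; if the session ID was
    // invalidated by the failure, fall back to state the client sent
    // back (e.g., hidden form fields or cookies).
    public Map<String, String> recover(String sessionId, Map<String, String> clientState) {
        Map<String, String> saved = bySession.get(sessionId);
        return (saved != null) ? saved : clientState;
    }
}
```

In a servlet, the server-side map would normally be the `HttpSession` itself; the point of the fallback is that the display survives even when that session is gone.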
As discussed in Chapter 1, popular browsers try to display screens in a way that seems faster to the user. Nevertheless, certain page layouts make the display take longer. For example, HTML tables are often not displayed until their contents have arrived, because the browser needs the contents to calculate table-cell sizes. Use explicit size tags so the browser can calculate the layout without waiting for the contents.
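For example, supplying widths and heights up front lets the browser reserve space and render the rest of the page immediately (the sizes and filenames here are illustrative):

```html
<!-- Explicit sizes let the browser lay out cells and images
     before their contents arrive -->
<table width="600">
  <tr>
    <td width="400">Main content...</td>
    <td width="200"><img src="logo.gif" width="180" height="60" alt="Logo"></td>
  </tr>
</table>
```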
Pages constructed from multiple disparate sources (e.g., embedded images) require multiple connections, all of which add to the overall perceived page display time. A poorly designed page could be seen as slow even if the components of the page individually download quickly. You should be able to find multiple sites displaying structures similar to those you wish to display. Compare their performance and choose the best combination for your application.
On the server side, don't rely on the default server buffers to flush the pages. Different buffer sizes and forced flushing of the output at certain points can improve perceived performance by sending displayable parts of a page more quickly.
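The effect is easy to see with a buffered stream: nothing reaches the client until the buffer fills or is flushed explicitly. The sketch below illustrates this with the standard I/O classes (the `ByteArrayOutputStream` stands in for the client socket); in a servlet, the corresponding calls are `response.setBufferSize(int)` and flushing the response writer after the parts of the page that are displayable on their own, such as the header and banner.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Stdlib illustration of response buffering: until the buffer fills or
// is flushed, the client has received nothing it can display.
public class FlushDemo {
    public static int visibleAfterHeader(byte[] header, byte[] body, boolean flushEarly)
            throws IOException {
        ByteArrayOutputStream client = new ByteArrayOutputStream(); // stands in for the socket
        BufferedOutputStream out = new BufferedOutputStream(client, 8192);
        out.write(header);
        if (flushEarly) {
            out.flush(); // forces the header out so the browser can start rendering
        }
        int visible = client.size(); // bytes the client has received so far
        out.write(body);
        out.close();
        return visible;
    }
}
```

With the early flush, the header bytes are visible to the client immediately; without it, the user sees nothing until the whole page is complete or the buffer overflows.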
Different users have different requirements and, more importantly, different value to your business. You should balance the performance provided to your users according to their value to your business. However, doing so is not always a clear-cut matter of giving higher-priority users a faster response at the expense of lower-value users. For example, many web sites provide premium service to paying users, with restricted access to nonpaying users. But if nonpaying users find that the web site is too slow, they may be unlikely to convert to paying users, and converting nonpaying users to paying users may be a business priority.

However you decide to assign priorities, sort incoming requests into different priority queues and service the higher-priority requests first. Priority queuing can be achieved by initially allocating the incoming request a priority level based on your application requirements, or according to the priority associated with the session. You can then route the request by priority. To support priorities throughout the J2EE application, requests probably need to be transferred between components at each stage through multiple queues so that queued requests can be accepted in order of priority level.
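The queuing step can be sketched with the standard `java.util.concurrent.PriorityBlockingQueue`: requests are tagged with a priority when they arrive, and worker threads always take the highest-priority request first. The `Request` class and priority scheme here are illustrative assumptions, not a standard API:

```java
import java.util.concurrent.PriorityBlockingQueue;

// Sketch of priority queuing for incoming requests: lower number means
// more important, and workers always service the most important request
// currently waiting.
public class RequestQueue {
    public static class Request implements Comparable<Request> {
        public final int priority;    // e.g., derived from the session's user class
        public final String payload;
        public Request(int priority, String payload) {
            this.priority = priority;
            this.payload = payload;
        }
        public int compareTo(Request other) {
            return Integer.compare(priority, other.priority);
        }
    }

    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();

    public void submit(Request r) { queue.put(r); }

    // Blocks until a request is available; returns the highest-priority
    // request currently queued.
    public Request next() throws InterruptedException { return queue.take(); }
}
```

To carry priorities through the whole application, a queue like this would sit at each component boundary, with the priority level passed along as part of the request.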
On the Internet, there are inevitably some very long response times and communication failures. This results from the nature of Internet communications, which are subject to variable congestion, spikes, blocked routes, and outages. Even if your server were up 100% of the time and serviced every request with a subsecond response, there would still be some problems due to Internet communication channels. You need to construct your application to handle communication failures gracefully, bearing in mind the issue of user perception. This is discussed in the next section.
A few long response times from communication failures may not necessarily make a bad impression, especially if handled correctly. Experienced Internet users expect communication failures and don't necessarily blame the server. In any case, if a connection or transaction needs to be reestablished, explain to the user why the outage occurred. Identifying the cause of the connection failure can help. For example, the Internet regularly becomes more congested at certain times. By monitoring your server, you should be able to establish whether these congested times result in an increased number of connection retries. If so, you can present a warning to the user explaining that current Internet congestion may result in some connection failures (and perhaps suggest that the user try again later after a certain time if performance is unsatisfactory). Setting the expectations of your users in this way can help reduce the inevitable dissatisfaction that communication failures cause. Including an automated mechanism for congestion reporting could be difficult. The Java API doesn't provide a mechanism to measure connection retries. You could measure download times regularly from some set of reference locations, and use those measurements to identify when congestion causes long download times.
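The measurement side of this could look like the following sketch. The reference URLs, baseline, and congestion factor are assumptions you would calibrate yourself, for example against download times measured at quiet periods:

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of congestion detection by timing downloads from reference
// locations, since the Java API gives no direct view of connection
// retries. Baseline and factor are illustrative calibration values.
public class CongestionMonitor {
    // Time a full download of one reference page, in milliseconds.
    public static long timeDownload(String url) throws Exception {
        long start = System.currentTimeMillis();
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) { /* drain the page */ }
        }
        return System.currentTimeMillis() - start;
    }

    // Flag congestion when the measured time exceeds the quiet-time
    // baseline by a chosen factor.
    public static boolean isCongested(long measuredMillis, long baselineMillis, double factor) {
        return measuredMillis > baselineMillis * factor;
    }
}
```

When `isCongested` reports true for several reference locations at once, the slowdown is likely in the network rather than your server, and the congestion warning can be shown to users.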
Evaluate performance targets as early as possible (preferably at project specification), and then keep your targets in mind. One million requests per day, 24/7, is equivalent to about 12 requests per second. Most servers receive requests unevenly, following daily and weekly periodic patterns. Peak traffic can be an order of magnitude higher than the average request rate. For a highly scaled popular server, ideal peak performance targets would probably consist of subsecond response times and hundreds of (e-commerce) transactions per second. You can use these basic guidelines to calculate target request rates and response times. Naturally, your application will have its own requirements.
The quickest way to lose user interest is to keep the user waiting for screens to display. Some experts suggest that perceived delays accumulate across multiple screens. It is not sufficient for individual screens to display within the limit of the user's patience (the subject of the earlier "Page Display" section). If the user finds himself waiting for several screens to display slowly, one after the other, the cumulative wait time can exceed a limit (perhaps as low as eight seconds) that induces the user to abandon the transaction (and the site). One of the better ways to keep the cumulative delay low is to avoid making the user go through too many screens to get to his goal.
Your users have a characteristic range of bandwidths, from slow modem dialup speeds to broadband. Determine the range of user bandwidths and test throughout that range. Different page designs display at different speeds for different bandwidths, and users have different expectations. Users with broadband connections expect pages to appear instantly, and slow pages stand a good chance of never being looked at.
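One simple way to test across the bandwidth range without a modem bank is to throttle the test client's reads. This illustrative stream (not a standard class) caps throughput at a given bytes-per-second rate, so the same download test can be run at dialup and broadband speeds:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative throttled stream for testing page downloads at
// modem-like rates: reads are paced so that cumulative throughput
// stays near bytesPerSecond.
public class ThrottledInputStream extends FilterInputStream {
    private final long bytesPerSecond;
    private final long start = System.currentTimeMillis();
    private long consumed = 0;

    public ThrottledInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            consumed += n;
            // Sleep until elapsed time matches the target rate.
            long due = start + (consumed * 1000) / bytesPerSecond;
            long wait = due - System.currentTimeMillis();
            if (wait > 0) {
                try { Thread.sleep(wait); } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
        return n;
    }
}
```

Wrapping the connection's input stream at, say, 7,000 bytes per second approximates a 56K modem and quickly shows which pages are intolerable at the low end of your users' range.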