In Section 18.2 later in this chapter, I describe several design patterns that help EJB systems attain adequate performance. But first, I will discuss some primary design guidelines to consider before you can apply patterns.
EJBs should be designed to have large granularity: one remote invocation to an EJB should perform a large amount of work instead of requiring many remote invocations. This criterion is extremely important for a successful EJB design. Coarse-grained EJBs tend to provide a more efficient application because they minimize the number of remote communications needed to complete the work.
A more refined guideline is that any remotely accessed EJBs should be coarse-grained. Any EJBs that are always accessed locally can be fine-grained, provided the local access is not treated as a remote access. Bear in mind that prior to the EJB 2.0 specification, all EJB access was (theoretically) treated as remote, even between EJBs in the same container. This means that the parameters could always be marshaled and passed through a socket, incurring a significant portion of remote-calling overhead. (Some application servers detect local EJB communication automatically and optimize that communication to avoid remote-calling overhead.) Since EJB 2.0, beans can define local interfaces, allowing optimized communications for local EJBs. But whether a bean is local is not a runtime decision, so it needs to be factored into the design. Local EJBs were added to the EJB specification specifically to improve performance among collocated EJBs.
The following are some detailed guidelines for achieving this combination design target of coarse-grained remote EJBs and fine-grained local EJBs. In the following list, I consider EJBs either local or remote, but an EJB can implement both interfaces, if appropriate to your application.
Design the application to access entity beans from session beans. This optimizes the likelihood that an EJB call is local and supports several other design optimizations (listed in the subsequent section covering design patterns).
Determine which EJBs will be collocated within the same VM. These EJBs can communicate with one another by using optimized local communications.
Those EJBs that will (always) be collocated should be:
Defined as local EJBs (from EJB 2.0); or
Defined normally as remote EJBs and collocated within an application server that is capable of optimizing local EJB communications; or
Built as normal JavaBeans, and then wrapped in an EJB to provide one coarse-grained EJB (see the CompositeEntity pattern).
EJBs that communicate remotely should combine methods to reduce possible remote invocations. Multiple calls frequently specify various parameters, and these parameters can be combined as a parameter object to be passed for one remote call. Section 12.2 gives a concrete example of how to combine methods to reduce the number of remote calls required to perform an action.
Don't design EJBs with one access method per data attribute unless they are definitely local EJBs. (More accurately, don't define data attribute accessors and updators as remote, as they have relatively high overheads.)
Bear in mind that any EJB service could be called remotely if you define a remote interface for it, and try to anticipate the resulting costs to the application.
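The parameter-object guideline above can be illustrated with a minimal sketch in plain Java. The names CustomerUpdate and CustomerService are hypothetical, and CustomerService is only a stand-in for a remote bean: each method call is counted as one remote invocation so that the fine-grained and coarse-grained styles can be compared.

```java
import java.io.Serializable;

// Hypothetical parameter object: bundles the values that would otherwise
// travel in three separate remote setter calls.
final class CustomerUpdate implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String email;
    final String phone;

    CustomerUpdate(String name, String email, String phone) {
        this.name = name;
        this.email = email;
        this.phone = phone;
    }
}

// Stand-in for a remote bean (an assumption for illustration, not a real EJB).
// Every method call is counted as one simulated remote invocation.
final class CustomerService {
    private int remoteCalls = 0;
    private String name;
    private String email;
    private String phone;

    // Fine-grained style: one remote invocation per attribute.
    void setName(String v)  { remoteCalls++; name = v; }
    void setEmail(String v) { remoteCalls++; email = v; }
    void setPhone(String v) { remoteCalls++; phone = v; }

    // Coarse-grained style: one remote invocation carries all attributes.
    void update(CustomerUpdate u) {
        remoteCalls++;
        name = u.name;
        email = u.email;
        phone = u.phone;
    }

    int remoteCalls() { return remoteCalls; }

    String summary() { return name + "," + email + "," + phone; }
}
```

In this sketch, three setter calls and a single update( ) call leave the service in the same state, but the second style pays the per-call overhead once rather than three times.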
EJBs should not be simple wrappers on database table rows. An EJB should be a fully fledged business object that represents and can manipulate underlying database data, applying business logic to provide appropriate refined information to callers of the EJB. If you need to access database data, but not for business-object purposes, use JDBC directly (probably from session beans) without intermediate EJB objects. EJBs can cause multiple per-row database access and updates. While this inefficiency can be justified when the EJB adds information value to the data, it is pure overhead in the absence of such business logic, and plain JDBC could be optimized much better.
Read-only data should be identified and separated from read-write data. When treating read-only data and read-only attributes of objects, a whole host of optimizations are possible. Some optimizations use design patterns, and others are available from the application server. Transactions that consist purely of read-only data are much more efficient than read-write transactions. Trying to decouple read-only data from read-write data after the application has been designed is difficult.
By definition, a stateless session bean has no state. That means that all the services it provides do not depend on what it just did. So a single stateless session bean can serve one client, then another, and then come back to the first, while each client can be in a different or the same state. The stateless session bean doesn't need to worry about which client does what. The result is that one stateless bean instance can serve multiple clients, thereby decreasing the average number of resources required per client. The stateless bean pool doesn't need to grow and shrink according to the number of clients; instead, it can be optimized for the overall rate of requests.
Most application servers support pools of stateless beans. As each bean services multiple clients, the bean pool can be kept smaller, which is more efficient. To optimize the session-bean pool for your application, choose a (maximum) size that minimizes activations and passivations of beans. The container dynamically adjusts the size to handle the current request rate, which may conflict with trying to choose a single size for the pool.
Stateful beans, in contrast, require one instance for each client accessing the bean. The stateful-bean pool grows and shrinks depending on the current number of clients, increasing pool overhead. If you have stateful beans, try to remove any that are finished so that fewer beans are serialized if the container needs to passivate them (see Section 17.5.2, which details how explicitly removing beans can improve performance).
If you have stateful beans in your design, the best technique to reduce their overhead is to convert them to stateless session beans. Primarily, this involves adding parameters that hold the extra state to the bean methods to pass the bean the current client state whenever it needs to execute. An extended example of converting a stateful bean to a stateless bean is available in Brett McLaughlin's Building Java Enterprise Applications, Volume I: Architecture (O'Reilly), and online at http://www.onjava.com/pub/a/onjava/excerpt/bldgjavaent_8/index3.html. The example even shows that you can retain the stateful-bean interface while using stateless beans by using the Proxy design pattern.
If state needs to be accessible on the server, you can hold it outside session beans, for example, in an HttpSession object, or in a global cache that provides access to the state through a unique session identifier. Converting stateful session beans to stateless session beans adds extra data to the client-server transfers, but the extra data can be minimized by using identifiers and a server data store. For high-performance J2EE systems, the advantages tend to outweigh the disadvantages.
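As a rough illustration of holding client state outside the bean, the following plain-Java sketch keeps per-client data in a server-side store keyed by a session identifier and passes that identifier on every call. CartStore and StatelessCartService are hypothetical names, and the in-memory map merely stands in for an HttpSession or global cache; this is a sketch of the idea, not the conversion described in the referenced book.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical server-side store: client state lives here, keyed by a
// unique session identifier, instead of inside a stateful bean.
final class CartStore {
    private final Map<String, List<String>> carts = new ConcurrentHashMap<>();

    List<String> cartFor(String sessionId) {
        return carts.computeIfAbsent(sessionId, id -> new ArrayList<>());
    }

    void remove(String sessionId) {
        carts.remove(sessionId);
    }
}

// Stand-in for a stateless session bean: it has no per-client fields, so a
// single instance can serve any client in any order.
final class StatelessCartService {
    private final CartStore store;

    StatelessCartService(CartStore store) {
        this.store = store;
    }

    // The caller passes only the identifier; the state is looked up on the server.
    int addItem(String sessionId, String item) {
        List<String> cart = store.cartFor(sessionId);
        synchronized (cart) {
            cart.add(item);
            return cart.size();
        }
    }

    List<String> items(String sessionId) {
        List<String> cart = store.cartFor(sessionId);
        synchronized (cart) {
            return new ArrayList<>(cart); // defensive copy for the caller
        }
    }

    // Explicitly discard state for clients that have finished.
    void checkout(String sessionId) {
        store.remove(sessionId);
    }
}
```

Because only the identifier crosses the client-server boundary, the extra data per call stays small, which is the trade-off described above.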
JNDI lookups, like other remote calls, are expensive. The results of JNDI lookups are also easily cached. There is even a dedicated pattern for caching EJBHome objects (the EJBHomeFactory pattern) because it is such a frequently suggested optimization.
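The caching idea can be sketched as follows. This is a simplified illustration rather than the EJBHomeFactory pattern itself: HomeCache is a hypothetical name, and the expensive lookup is supplied as a function so that the sketch runs without a naming service. In a real application that function would presumably wrap an InitialContext lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache for looked-up objects such as EJB home references.
final class HomeCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Function<String, Object> lookup;

    // 'lookup' stands in for the expensive JNDI call (an assumption of this sketch).
    HomeCache(Function<String, Object> lookup) {
        this.lookup = lookup;
    }

    // Performs the expensive lookup at most once per name; later calls
    // for the same name are served from the map.
    Object get(String jndiName) {
        return cache.computeIfAbsent(jndiName, lookup);
    }
}
```

A caller would create one shared HomeCache and ask it for a home object by name each time, paying the lookup cost only on the first request for that name.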
Should you use container-managed persistence (CMP) or bean-managed persistence (BMP)? This is one of the most frequently discussed questions about EJBs. BMP requires the developer to add code for persisting the beans. CMP leaves the job of persisting the beans up to the application server. BMP can ultimately be made faster than CMP in almost any situation, but to do so, you would probably need to build a complete generic persistency layer (in effect, your own CMP). So let's get back to reality. (You could build a very fast, simple persistence layer, mainly raw JDBC calls, but it would not be flexible enough for the kinds of development changes constantly made in most J2EE systems. However, if speed is the top priority, this option is viable.)
BMP can be faster for any one bean. You can build in the persistency that is required by the bean, avoiding any generic overhead. That's fine if you have five EJB types in your application. But more realistically, with tens or hundreds of EJB types, writing optimal BMP code for each EJB and keeping that code optimal across versions of the application is unachievable (though again, if you can impose the required discipline in your development changes, then it is achievable).
With multiple beans and bean types, CMP can apply many optimizations:
Efficient lazy loading
Efficient combinations of multiple queries to the same table (i.e., multiple beans of the same type that can be handled together)
Optimized multi-row deletion to handle deletion of beans and their dependents
I would recommend using CMP by default. However, CMP is not yet mature, which makes the judgment more complex. It may come down to which technique your development team is more comfortable with. If you do use CMP, profile the application to determine which beans cause bottlenecks from their persistency. Implement BMP for those beans. Use the Data Access Object design pattern (described later) to abstract your BMP implementations so you can take advantage of optimizations for multiple beans or database-specific features. (You may also need to use BMP where CMP cannot support the required logic, e.g., if fields use stored procedures, or one bean maps to multiple tables.)
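To make the abstraction concrete, here is a minimal sketch of a bean delegating its persistence to a Data Access Object. All of the names (AccountData, AccountDao, InMemoryAccountDao, AccountBean) are hypothetical, and the in-memory implementation is only a placeholder for a JDBC-backed one. The point is that the bean depends on the interface alone, so an optimized or database-specific implementation can be swapped in later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical persistent state for an account bean.
final class AccountData {
    final String id;
    final long balance;

    AccountData(String id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

// The bean codes against this interface only.
interface AccountDao {
    Optional<AccountData> load(String id);
    void store(AccountData data);
}

// In-memory placeholder; a JDBC implementation would sit behind the same interface.
final class InMemoryAccountDao implements AccountDao {
    private final Map<String, AccountData> rows = new HashMap<>();

    public Optional<AccountData> load(String id) {
        return Optional.ofNullable(rows.get(id));
    }

    public void store(AccountData data) {
        rows.put(data.id, data);
    }
}

// Stand-in for a BMP entity bean (not a real EJB): load() and store()
// play the roles of ejbLoad( ) and ejbStore( ).
final class AccountBean {
    private final AccountDao dao;
    private AccountData data;

    AccountBean(AccountDao dao) {
        this.dao = dao;
    }

    void load(String id) {
        data = dao.load(id)
                  .orElseThrow(() -> new IllegalStateException("no such account: " + id));
    }

    void deposit(long amount) {
        data = new AccountData(data.id, data.balance + amount);
    }

    void store() {
        dao.store(data);
    }

    long balance() {
        return data.balance;
    }
}
```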
Tuning EJB transactions is much like tuning JDBC transactions; you will find Section 16.2.15 very relevant for EJB transactions. There are a few additional considerations. The following list summarizes optimal transaction handling for EJBs:
Keep transactions short.
Commit the data after the transaction completes rather than after each method call. That is, if multiple methods are executed close together, each needing to execute a transaction, then combine their transactions into one transaction. The target is to minimize the overall transaction time rather than simplistically targeting each currently defined transaction.
Try to perform bulk updates to reduce database calls.
For very large transactions, use the transaction attribute Required (TX_REQUIRED in EJB 1.0) to get all EJB method calls in a call chain to use the same transaction. Use a session façade that provides a high-level entry point so that all the methods called from that point are included in one transaction.
Optimize read-only EJBs to use read-only transactions. Use read-only in the deployment descriptor to avoid unnecessary calls to ejbStore( ) by the application server (not all application servers support this feature).
Choose the lowest-cost transaction isolation level that avoids corrupting the data. The isolation levels (defined as constants in java.sql.Connection), in order of increasing cost, are TRANSACTION_READ_UNCOMMITTED, TRANSACTION_READ_COMMITTED, TRANSACTION_REPEATABLE_READ, and TRANSACTION_SERIALIZABLE.
Don't use client-initiated transactions in the EJB environment because long-running transactions increase the likelihood of conflict, making rows inaccessible to other sessions. If the client controls the duration of the transaction, you may have no way to force the transaction to close from the server, thus allowing long or indefinite transactions. The longer a transaction lasts, the more likely it is to conflict with another transaction.
If you need client-initiated transactions, set an appropriate transaction timeout in the deployment descriptor. Setting a timeout ensures that the application doesn't start leaking resources from transactions that are opened at the client but not completed. The descriptor element is typically specific to the application server and should be named something like "trans-timeout-seconds" (the element WebLogic uses in weblogic-ejb-jar.xml). Specify a timeout that is long enough for users to reasonably complete their task.
Declare nontransactional methods of session beans with NotSupported or Never transaction attributes (in the ejb-jar.xml deployment descriptor file).
Use a dirty flag where supported by the EJB server or in a BMP or DAO implementation to avoid writing unchanged EJBs to the database. Dirty flags are a standard way to avoid writing unchanged data. The write is guarded with the dirty flag and performed only if the flag is dirty. Initially the flag is clean, and any change to the EJB sets the flag to dirty.
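The dirty-flag technique from the last item can be sketched in a few lines of plain Java. AccountRecord is a hypothetical class, and the write counter stands in for a real database update; the sketch also assumes that setting a field to its current value does not count as a change, which is a small refinement of the rule stated above.

```java
// Hypothetical record with a dirty flag guarding its database write.
final class AccountRecord {
    private long balance;
    private boolean dirty = false; // initially clean
    private int writes = 0;        // counts simulated database writes

    void setBalance(long newBalance) {
        // Assumption: only a real change marks the record dirty.
        if (newBalance != balance) {
            balance = newBalance;
            dirty = true;
        }
    }

    // Plays the role of ejbStore( ): writes only if something changed.
    void store() {
        if (!dirty) {
            return; // unchanged data, so skip the write
        }
        writes++;      // stand-in for the UPDATE statement
        dirty = false; // clean again until the next change
    }

    int writes() { return writes; }

    long balance() { return balance; }
}
```

With this guard in place, repeated calls to store( ) on an unchanged record cost only a flag check.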