In this section we will discuss the online utilities like table load, index rebuild, index create and the incremental maintenance of materialized query table during load append.
When loading data into a table in Version 8, the table space in which the table resides will no longer be locked. Users have full read and write access to all the tables in the table space, except for the table being loaded. For the table being loaded, the existing data in the table will be available for read access if the load is appending data to the table.
These new load features significantly improve the availability of the data and help customers deal with the maintenance of large data volumes and shrinking maintenance windows.
Load will lock the target table in a share mode. The table state will be set to both LOAD IN PROGRESS and READ ACCESS. Readers may access the non-delta portion of the data while the table is being loaded.
In other words, data that existed before the start of the load will be accessible by readers to the table; data that is being loaded is not available until the load is complete. LOAD TERMINATE or LOAD RESTART of an ALLOW READ ACCESS load may use this option; LOAD TERMINATE or LOAD RESTART of an ALLOW NO ACCESS load may not use this option. Furthermore, this option is not valid if the indexes on the target table are marked as requiring a rebuild.
When there are constraints on the table, the table state will be set to CHECK PENDING, as well as LOAD IN PROGRESS and READ ACCESS. At the end of the load, the table state LOAD IN PROGRESS will be removed but the table states CHECK PENDING and READ ACCESS will remain. The SET INTEGRITY command must be used to take the table out of CHECK PENDING. While the table is in CHECK PENDING and READ ACCESS, the non-delta portion of the data is still accessible to readers; the new (delta) portion of the data will remain inaccessible until the SET INTEGRITY command has completed. A user may perform multiple loads on the same table without issuing a SET INTEGRITY command. Only the original (checked) data will remain visible, however, until the SET INTEGRITY command is issued.
ALLOW READ ACCESS also supports modifiers. The load utility provides two options that control the amount of access that other applications have to a table being loaded.
The ALLOW NO ACCESS option locks the table exclusively and allows no access to the table data while the table is being loaded. This is the default.
The ALLOW READ ACCESS option prevents all write access to the table by other applications but allows read access to preloaded data. This section deals with the ALLOW READ ACCESS option.
Table data and index data that exist prior to the start of a load operation are visible to queries while the load operation is in progress.
The ALLOW READ ACCESS option is very useful when loading large amounts of data because it gives users access to table data at all times, even when the load operation is in progress. The behavior of a load operation in ALLOW READ ACCESS mode is independent of the isolation level of the application. That is, readers with any isolation level can always read the pre-existing data while the load operation is in progress. They will not be able to read the newly loaded data until the load operation has finished.
Read access is provided throughout the load operation, except at the very end.
Before data is committed, the load utility acquires an exclusive lock on the table. The load utility will wait until all applications that have locks on the table release them. This may cause a delay before the data can be committed. The LOCK WITH FORCE option may be used to force off conflicting applications and allow the load operation to proceed without having to wait.
Usually, a load operation in ALLOW READ ACCESS mode acquires an exclusive lock for a short amount of time; however, if the USE <tablespaceName> option is specified, the exclusive lock will last for the entire period of the index copy phase.
If a load operation is aborted, it remains at the same access level that was specified when the load operation was issued. So if a load operation in ALLOW NO ACCESS mode aborts, the table data is inaccessible until a load terminate or a load restart is issued. If a load operation in ALLOW READ ACCESS mode aborts, the preloaded table data is still accessible for read access.
If the ALLOW READ ACCESS option was specified for a load operation, it can also be specified for a load restart or load terminate operation. However, if the original load operation specified the ALLOW READ ACCESS option, the ALLOW NO ACCESS option cannot be specified for a load restart or load terminate operation.
The ALLOW READ ACCESS option is not supported if:
The REPLACE option is specified. Because a load replace operation truncates the existing table data before loading the new data, there is no preexisting data to query until after the load operation is complete.
The indexes have been marked invalid and are waiting to be rebuilt. Indexes can be marked invalid in a rollforward scenario or through the use of the db2dart command.
The INDEXING MODE DEFERRED option is specified. This mode marks the indexes as requiring a rebuild.
An ALLOW NO ACCESS load operation is being restarted or terminated. Until it is brought fully online, a load operation in ALLOW READ ACCESS mode cannot take place on the table.
A load operation is taking place to a table that is in check-pending state and is not in read access state. This is also the case for multiple load operations on tables with constraints. A table is not brought online until the SET INTEGRITY statement is issued.
Generally, if table data is taken offline, read access is not available during a load operation until the table data is back online.
Indexes are built during the build phase of a load operation. There are four indexing modes that can be specified in the LOAD command:
REBUILD. All indexes will be rebuilt.
INCREMENTAL. Indexes will be extended with new data.
AUTOSELECT. The load utility will automatically decide between REBUILD or INCREMENTAL mode. This is the default.
DEFERRED. The load utility will not attempt index creation if this mode is specified. Indexes will be marked as needing a refresh. The first access to such indexes that is unrelated to a load operation may force a rebuild. This option is not compatible with the ALLOW READ ACCESS option because it does not maintain the indexes, and index scanners require a valid index.
You may decide to choose an indexing mode explicitly because the behavior of the REBUILD and INCREMENTAL modes are quite different.
Load operations that specify the ALLOW READ ACCESS option require special considerations in terms of space usage and logging, depending on the type of indexing mode chosen. The load utility keeps indexes available for queries, even while building new indexes.
When a load operation in ALLOW READ ACCESS mode specifies the INDEXING MODE INCREMENTAL option, the load utility will write some log records that protect the integrity of the index tree. The number of log records written is a fraction of the number of inserted keys and is a number considerably less than would be needed by a similar SQL insert operation. A load operation in ALLOW NO ACCESS mode with the INDEXING MODE INCREMENTAL option specified writes only a small log record beyond the normal space allocation logs.
When a load operation in ALLOW READ ACCESS mode specifies the INDEXING MODE REBUILD option, new indexes are built as a shadow, either in the same table space as the original index or in a system temporary table space. The original indexes remain intact and are available during the load operation, and they are replaced by the new indexes only at the end of the load operation, while the table is exclusively locked.
A materialized query table is a dependent of an underlying table. Before Version 8, if data was appended to the underlying table during a load operation, the materialized query table was unavailable until the load completed and the materialized query table was maintained. Furthermore, the materialized query table was completely rebuilt, which was often a lengthy operation.
With Version 8, the materialized query table can remain available during the load append operation on the underlying table. When the load of the appended rows in the underlying table is complete, the materialized query table may be refreshed incrementally, using only the appended data, which significantly reduces the time to update it.
For example, if the materialized query table is an aggregate (automatic summary table), for rows appended to the underlying table that correspond to new groups in the aggregate, new summary rows will be inserted.
For appended rows that correspond to existing groups in the aggregate, existing rows will be updated. Although the aggregate table remains unavailable during this maintenance phase, when a small number of rows are appended to the underlying table (when compared with the size of the table), the time the aggregate is unavailable is reduced.
The ability to incrementally maintain a materialized query table is not restricted to aggregates. Many materialized query tables can be incrementally maintained.
These changes significantly improve the availability of materialized query tables to your users.
If the underlying table of an immediate refresh materialized query table is loaded using the INSERT option, executing the SET INTEGRITY statement on the dependent materialized query tables defined with REFRESH IMMEDIATE will result in an incremental refresh of the materialized query table. During an incremental refresh, the rows corresponding to the appended rows in the underlying tables are updated and inserted into the materialized query tables.
Incremental refresh is faster in the case of large underlying tables with small amounts of appended data. There are cases in which incremental refresh is not allowed, and full refresh (that is, recomputation of the materialized query table definition query) will be used.
When the INCREMENTAL option is specified but incremental processing of the materialized query table is not possible, an error is returned if:
A load replace operation has taken place into an underlying table of the materialized query table or the NOT LOGGED INITIALLY WITH EMPTY TABLE option has been activated since the last integrity check on the underlying table. The materialized query table has been loaded (in either REPLACE or INSERT mode).
An underlying table has been taken out of check-pending state before the materialized query table is refreshed by using the FULL ACCESS option during integrity checking.
An underlying table of the materialized query table has been checked for integrity non-incrementally.
The materialized query table was in check-pending state before migration.
The table space containing the materialized query table or its underlying table has been rolled forward to a point in time, and the materialized query table and its underlying table reside in different table spaces.
If the materialized query table has one or more W values in the CONST_CHECKED column of the SYSCAT.TABLES catalog, and if the NOT INCREMENTAL option is not specified in the SET INTEGRITY statement, the table will be incrementally refreshed and the CONST_CHECKED column of SYSCAT.TABLES will be marked U to indicate that not all data has been verified by the system.
DB2 now provides two methods of reorganizing tables: online and offline.
The online table reorganization allows applications to access the table during the reorganization. In addition, online table reorganization can be paused and resumed later by anyone with the appropriate authority by using the schema and table name.
Online table reorganization is allowed only on tables with type-2 indexes and without extended indexes.
The offline method provides faster table reorganization, especially if you do not need to reorganize LOB or LONG data. LOBS and LONG data are no longer reorganized unless specifically requested. In addition, indexes are rebuilt in order after the table is reorganized.
Read-only applications can access the original copy of the table, except during the last phases of the reorganization, when the "shadow copy" replaces the original copy and the indexes are rebuilt.
Both online and offline reorganizations have been enhanced to improve support for multi-partition databases. You can reorganize a single partition, a set of partitions, or all partitions.
New for DB2 Version 8 is the ability to read and update a table and its existing indexes during an index reorganization, using the new REORG INDEXES command.
During online index reorganization, the entire index object (that is, all indexes on the table) is rebuilt. A "shadow copy" of the index object is made, leaving the original indexes and the table available for read and write access. Any concurrent transactions that update the table are logged. Once the logged table changes have been forward-fitted and the new index (the shadow copy) is ready, the new index is made available. While the new index is being made available, all access to the table is prohibited.
The default behavior of the REORG INDEXES command is ALLOW NO ACCESS, which places an exclusive lock on the table during the reorganization process, but you can also specify ALLOW READ ACCESS or ALLOW WRITE ACCESS to permit other transactions to read from or update the table.
Indexes can now be created in large table spaces (formerly long table spaces). In situations where the existing indexes consume more than 32 GB, this will allow you to allocate sufficient space to accommodate the two sets of indexes that will exist during the online index reorganization process.
This option specifies the table whose indexes are to be reorganized. The table can be in a local or a remote database.
This option specifies that no other users can access the table while the indexes are being reorganized. This is the default.
This option specifies that other users can have read-only access to the table while the indexes are being reorganized.
This option specifies that other users can read from and write to the table while the indexes are being reorganized.
When CLEANUP ONLY is requested, a cleanup rather than a full reorganization will be done. The indexes will not be rebuilt, and any pages freed up will be available for reuse by indexes defined on this table only.
The CLEANUP ONLY PAGES option will search for and free committed pseudo empty pages. A committed pseudo empty page is one where all the keys on the page are marked as deleted, and all these deletions are known to be committed.
The number of pseudo empty pages in an index can be determined by running runstats and looking at the NUM EMPTY LEAFS column in SYSCAT.INDEXES. The PAGES option will clean the NUM EMPTY LEAFS if they are determined to be committed.
The CLEANUP ONLY ALL option will free committed pseudo empty pages, as well as remove committed pseudo deleted keys from pages that are not pseudo empty. This option will also try to merge adjacent leaf pages if doing so will result in a merged leaf page that has at least PCTFREE free space on the merged leaf page, where PCTFREE is the percentage of free space defined for the index at index creation time. The default PCTFREE is 10%. If two pages can be merged, one of the pages will be freed. The number of pseudo deleted keys in an index, excluding those on pseudo empty pages, can be determined by running runstats, then selecting the NUMRIDS DELETED from SYSCAT.INDEXES. The ALL option will clean the NUMRIDS DELETED and the NUM EMPTY LEAFS if they are determined to be committed.
Use the ALLOW READ ACCESS or ALLOW WRITE ACCESS option to allow other transactions either read-only or read-write access to the table while the indexes are being reorganized. Note that, while ALLOW READ ACCESS and ALLOW WRITE ACCESS allow access to the table, during the period in which the reorganized copies of the indexes are made available, no access to the table is allowed.
This option specifies the table to reorganize. The table can be in a local or a remote database. The name or alias in the form: schema.table-name may be used. The schema is the user name under which the table was created. If you omit the schema name, the default schema is assumed.
For typed tables, the specified table name must be the name of the hierarchy's root table.
You cannot specify an index for the reorganization of a multidimensional clustering (MDC) table. Also note that in-place reorganization of tables cannot be used for MDC tables.
This option specifies the index to use when reorganizing the table. If you do not specify the fully qualified name in the form schema.index-name, the default schema is assumed. The schema is the user name under which the index was created. The database manager uses the index to physically reorder the records in the table it is reorganizing.
For an inplace table reorg, if a clustering index is defined on the table and an index is specified, it must be a clustering index.
If the inplace option is not specified, any index specified will be used. If you do not specify the name of an index, the records are reorganized without regard to order. If the table has a clustering index defined, however, and no index is specified, the clustering index is used to cluster the table.
You cannot specify an index if you are reorganizing an MDC table.
Reorganize the table while permitting user access. Inplace table reorganization is allowed only on tables with type-2 indexes and without extended indexes.
Allow only read access to the table during reorganization.
Allow write access to the table during reorganization. This is the default behavior.
Do not truncate the table after inplace reorganization. During truncation, the table is S-locked.
Start the inplace REORG processing. Because this is the default, this keyword is optional.
Stop the inplace REORG processing at its current point.
Suspend or pause inplace REORG for the time being.
Continue or resume a previously paused inplace table reorganization.
To clean up the pseudo empty pages in all the indexes on the EMPLOYEE table while allowing other transactions to read and update the table, enter:
db2 reorg indexes all for table homer.employee allow write access cleanup only pages