Updating the Data

One of the reasons people use the ClientDataSet component (or turn to cached updates in the BDE) is to make a SQL join updatable. Consider the following SQL equi-join:

SELECT * FROM Orders, Customer
WHERE Customer.CustNo=Orders.CustNo

This statement provides a list of orders and the customers that placed those orders. The BDE considers any SQL join to be read-only because inserting, updating, and deleting rows in a join is ambiguous. For example, should the insert of a row into the previous join result in a new product and also a new supplier, or just a new product? The ClientDataSet/Provider architecture allows you to specify a primary update table (and advanced features actually not covered in the book) and also customize the updates' SQL, as we partially saw in Chapter 14 and we'll further explore in Chapter 16, "Multitier DataSnap Applications."

ADO supports an equivalent to cached updates called batch updates, which are similar to the BDE approach. In the next section we will take a closer look at ADO's batch updates, what they can offer you, and why they are so important. However, in this section you won't need them to solve the problem of updating a join, because in ADO, joins are naturally updatable.

For example, the JoinData example is built around an ADODataset component that uses the previous SQL join. If you run it, you can edit one of the fields and save the changes (by moving off the record). No error occurs, because the update has been applied successfully. ADO, compared to the BDE, has taken a more practical approach to the problem. In an ADO join, each field object knows which underlying table it belongs to. If you update a field in the Products table and post the change, then a SQL UPDATE statement is generated to update the field in the Products table. If you change a field in the Products table and a field in the Suppliers table, then two SQL UPDATE statements are generated, one for each table.

The insertion of a row into a join follows a similar behavior. If you insert a row and enter values for the Products table only, then a SQL INSERT statement is generated for the Products table. If you enter values for both tables, two SQL INSERT statements are generated, one for each table. The order in which the statements are executed is important, because the new product might relate to the new supplier, so the new supplier is inserted first.

The biggest problem with the ADO solution can be seen when a row in a join is deleted. The deletion attempt will appear to fail. The exact message you see depends on the version of ADO you are using and the database, but it will be along the lines that you cannot delete the row because other records relate to it. The error message can be confusing. In the current scenario, the error message implies that a product cannot be deleted because there are records that relate to the product, but the error occurs whether the product has any related records or not. The explanation can be found by following the same logic for deletions as for insertions. Two SQL DELETE statements are generated: one for the Suppliers table and then another for the Products table. Contrary to appearances, the DELETE statement for the Products table succeeds. It is the DELETE statement for the Suppliers table that fails, because the supplier cannot be deleted while it still has dependent records.

Tip

If you are curious about the SQL statements that are generated, and you use SQL Server, you can see these statements using SQL Server Profiler.

Even if you understand how this process works, it's helpful to look at this problem through the users' eyes. From their point of view, when users delete a row in the grid, I would wager that 99 percent of them intend to delete just the product—not both the product and the supplier. Fortunately, you can achieve this result using another dynamic property—in this case, the Unique Table dynamic property. You can specify that deletes refer to just the Products table and not to Suppliers using the following line of code:

ADOQuery1.Properties['Unique Table'].Value := 'Products';

This value cannot be assigned at design time, so the next best alternative is to place this line in the form's OnCreate event.

Batch Updates

When you use batch updates, any changes you make to your records can be made in memory; later, the entire "batch" of changes can be submitted as one operation. This approach offers some performance benefits, but there are more practical reasons why this technology is a necessity: The user might not be connected to the database at the time they make their updates. This would be the case in a briefcase application (which we will return to in the section "The Briefcase Model"), but it can also be the case in web applications that use another ADO technology, Remote Data Services (RDS).

You can enable batch updates in any ADO dataset by setting LockType to ltBatchOptimistic before the dataset is opened. In addition, you will need to set the CursorLocation to clUseClient, because batch updates are managed by ADO's cursor engine. Hereafter, changes are all made to a delta (a list of changes). The dataset looks to all intents and purposes as if the data has changed, but the changes have only been made in memory; they have not been applied to the database. To make the changes permanent, use UpdateBatch (equivalent to cached updates' ApplyUpdates):

ADODataSet1.UpdateBatch;

To reject the entire batch of updates, use either CancelBatch or CancelUpdates. There are many similarities in method and property names between ADO batch updates, BDE cached updates, and ClientDataSet. UpdateStatus, for example, can be used exactly the same way as for cached updates to identify records according to whether they have been inserted, updated, deleted, or unmodified. This approach is particularly useful for highlighting records in different colors in a grid or showing their status on a status bar. Some differences between the syntaxes are slight, such as changing RevertRecord to CancelBatch(arCurrent); others require more effort.

One useful cached update feature that is not present in ADO batch updates is the dataset's UpdatesPending property. This property is true if changes have been made but not yet applied.It's particularly useful in a form's OnCloseQuery event:

procedure TForm1.FormCloseQuery(
  Sender: TObject; var CanClose: Boolean);
begin
  CanClose := True;
  if ADODataSet1.UpdatesPending then
    CanClose := (MessageDlg('Updates are still pending' #13 +
      'Close anyway?', mtConfirmation, [mbYes, mbNo], 0) = mrYes);
end;

However, with a little knowledge and ingenuity you can implement a suitable ADOUpdatesPending function. The necessary knowledge is that ADO datasets have a property called FilterGroup, which is a kind of filter. Unlike a dataset's Filter property, which filters the data based on a comparison of the data against a condition, FilterGroup filters based on the status of the record. One such status is fgPendingRecords, which includes all records that have been modified but not yet applied. So, to allow the user to look through all the changes they have made so far, you need only execute two lines:

ADODataSet1.FilterGroup := fgPendingRecords;
ADODataSet1.Filtered := True;

Naturally, the result set will now include the records that have been deleted. The effect you will see is that the fields are left blank, which is not very helpful because you don't know which record has been deleted. (This was not the behavior in the first version of ADOExpress, which displayed the field values of deleted records.)

The ingenuity you need in order to solve the UpdatesPending problem involves clones, discussed earlier. The ADOUpdatesPending function will set the FilterGroup to restrict the dataset to only those changes that have not yet been applied. All you need to do is see whether there are any records in the dataset once the FilterGroup has been applied. If there are, then some updates are pending. However, if you do this with the actual dataset, then the setting of the FilterGroup will move the record pointer, and the user interface will be updated. The best solution is to use a clone:

function ADOUpdatesPending(ADODataSet: TCustomADODataSet): boolean;
var
  Clone: TADODataSet;
begin
  Clone := TADODataSet.Create(nil);
  try
    Clone.Clone(ADODataSet);
    Clone.FilterGroup := fgPendingRecords;
    Clone.Filtered := True;
    Result := not (Clone.BOF and Clone.EOF);
    Clone.Close;
  finally
    Clone.Free;
  end;
end;

In this function, you clone the original dataset, set the FilterGroup, and check to see whether the dataset is at both beginning of the file and also the end of the file. If it is, then no records are pending.

Optimistic Locking

Earlier we looked at the LockType property and saw how pessimistic locking works. In this section, we'll look at optimistic locking, not only because it is the preferred locking type for medium- to high-throughput transactions, but also because it is the locking scheme employed by batch updates.

Optimistic locking assumes there is a low probability that users will attempt to update the same records at the same time and that a conflict is unlikely. As such, the attitude is that all users can edit any record at any time, and you deal with the consequences of conflicts between different users' updates to the same records when the changes are saved. Thus, conflicts are considered an exception to the rule. This means there are no controls to prevent two users from editing the same record at the same time. The first user to save their changes will succeed; the second user's attempt to update the same record might fail. This behavior is essential for briefcase applications (discussed later in the chapter) and web applications, where there is no permanent connection to the database and, therefore, no way to implement pessimistic locking. In contrast with pessimistic locking, optimistic locking has the additional considerable benefit that resources are consumed only momentarily; therefore, the average resource usage is much lower, making the database more scalable.

Let's consider an example. Assume you have an ADODataSet connected to the Customer table of the dbdemos.mdb database, with LockType set to ltBatchOptimistic, and the contents are displayed in a grid. Assume that you also have a button to call UpdateBatch. Run the program twice (it is the BatchUpdates example if you don't want to rebuild it) and begin editing a record in the first copy of the program. Although for the sake of simplicity I'll demonstrate a conflict using just a single machine, the scenario and subsequent events are unchanged when using multiple machines:

Choose the Bottom-Dollar Markets company in Canada and change the name to Bottom-Franc Markets.
Save the change, move off the record to post it, and click the button to update the batch.
In the second copy of the program, locate the same record and change the company name to Bottom-Pound Markets.
Move off the record and click the button to update the batch. It will fail.

As with many ADO error messages, the exact message you receive will depend not only on the version of ADO you are using but also on how closely you followed the example. In ADO 2.6, the error message is "Row cannot be located for updating. Some values may have been changed since it was last read." This is the nature of optimistic locking. The update to the record is performed by executing the following SQL statement:

UPDATE CUSTOMER SET CompanyName="Bottom-Pound Markets"
WHERE CustomerID="BOTTM" AND CompanyName="Bottom-Dollar Markets"

The number of records affected by this UPDATE statement is expected to be one, because it locates the original record using the primary key and the contents of the CompanyName field as it was when the record was first read. In this example, however, the number of records affected by the UPDATE statement is zero. This result can occur only if the record has been deleted, the record's primary key has changed, or the field that you are changing was changed by someone else. Hence, the update fails.

If the "second user" had changed the ContactName field and not the CompanyName field, then the UPDATE statement would have looked like this:

UPDATE CUSTOMER SET ContactName="Liz Lincoln"
WHERE CustomerID="BOTTM" AND ContactName="Elizabeth Lincoln"

In the example scenario, this statement would have succeeded because the other user didn't change the primary key or the contact name. This behavior is similar to the BDE with the update where changed update mode. The UpdateMode property of the BDE in ADO is replaced by the Update Criteria dynamic property of a dataset. The following list shows the possible values that can be assigned to this dynamic property:

Constant	Locate Records By
adCriteriaKey	Primary key columns only
adCriteriaAllCols	All columns
adCriteriaUpdCols	Primary key columns and changed columns only
adCriteriaTimeStamp	Primary key columns and a timestamp column only

Don't fall into the trap of thinking that one of these settings is better than another for your whole application. In practice, your choice of setting will be influenced by the contents of each table. Say that the Customer table has just CustomerID, Name, and City fields. In this case, the update of any one of these fields is logically not mutually exclusive with the update of any of the other fields, so a good choice for this table would be adCriteriaUpdCols (the default). If, however, the Customer table included a PostalCode field, then the update of a PostalCode field would be mutually exclusive with the update of the City field by another user (because if the city changes, then so should the postal code, and possibly vice versa). In this case, you could argue that adCriteriaAllCols would be a safer solution.

Another issue to be aware of is how ADO deals with errors during the update of multiple records. Using the BDE's cached updates and ClientDataSet, you can use the OnUpdateError event to handle each update error as the error occurs and resolve the problem before moving on to the next record. In ADO, you cannot establish such a dialog. You can monitor the progress and success or failure of the updating of the batch using the dataset's OnWillChangeRecord and OnRecordChangeComplete, but you cannot revise the record and resubmit it during this process as you can with the BDE and ClientDataSet. There's more: If an error occurs during the update process, the updating does not stop. It continues to the end, until all updates have been applied or have failed. This process can produce an unhelpful and incorrect error message. If more than one record cannot be updated, or the single record that failed is not the last record to be applied, then the error message in ADO 2.6 is "Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done." The last sentence is the problem; it states that "No work was done," but this is incorrect. It is true that no work was done on the record that failed, but other records were successfully applied, and their updates stand.

Resolving Update Conflicts

As a consequence of the nature of applying updates, the approach that you need to take to update the batch is to update the batch, let the individual records fail, and then deal with the failed records once the process is over. You can determine which records have failed by setting the dataset's FilterGroup to fgConflictingRecords:

ADODataSet1.FilterGroup := fgConflictingRecords;
ADODataSet1.Filtered := True;

For each failed record, you can inform the user of three critical pieces of information about each field using the following TField properties:

Property	Description
NewValue	The value this user changed it to
CurValue	The new value from the database
OldValue	The value when first read from the database

Users of the ClientDataSet component will be aware of the handy ReconcileErrorForm dialog, which wraps up the process of showing the user the old and new records and allows them to specify what action to take. Unfortunately, there is no ADO equivalent to this form, and TReconcileErrorForm has been written with ClientDataSet so much in mind that it is difficult to convert it for use with ADO datasets.

I'll point out one last gotcha about using these TField properties: They are taken straight from the underlying ADO Field objects to which they refer. This means, as is common in ADO, that you are at the mercy of your chosen OLE DB provider to support the features you hope to use. All is well for most providers, but the Jet OLE DB provider returns the same value for CurValue as it does for OldValue. In other words, if you use Jet, you cannot determine the value to which the other user changed the field unless you resort to your own measures. Using the OLEDB provider for SQL Server, however, you can access the CurValue only after calling the Resync method of the dataset with the AffectRecords parameter set to adAffectGroup and ResyncValues set to adResyncUnderlyingValues, as in the following code:

adoCustomers.FilterGroup := fgConflictingRecords;
adoCustomers.Filtered := true;
adoCustomers.Recordset.Resync(adAffectGroup, adResyncUnderlyingValues);