Submitting Changes

The DataSet object can submit data to the database in batch mode by using the data adapter’s Update method. Data can be submitted only on a per-table basis. When you call Update without specifying a table name, the code assumes a default name of Table. If no table exists with that name, an exception is raised. The Update method takes as arguments an instance of a DataSet object and a DataTable object, or a DataTable name. The Update method first examines the RowState property of each table row. Then it prepares and calls a tailor-made INSERT, UPDATE, or DELETE statement for each inserted, updated, or deleted row in the specified DataTable object. Rows are scanned and processed according to their natural order (their position in the table’s Rows collection.)

Detecting Changes

The DataSet object’s GetChanges method returns a smaller DataSet object that includes only the rows changed in all the child tables. When you need to know about all rows changed in the DataSet object, just call GetChanges. When you need to know only about a given table, call the DataTable object’s GetChanges method. In this latter case, you get a DataTable object. One overload of the GetChanges method makes it particularly interesting in situations in which you need to control the sequence of statements run during the batch update process.

The GetChanges method, both for the DataSet and DataTable classes, can accept an optional argument indicating the state of the rows to be selected. For example, you might want to select only those rows that have been inserted to ensure that all the INSERT statements are issued before the UPDATE statements. In this case, you can split the batch update process into steps by getting and submitting the changes in small groups of rows as needed. The following code shows how to select only the new rows added to a table and ensure that they are submitted prior to the modified rows:

SqlDataAdapter da = new SqlDataAdapter();
DataTable dt = ds.Tables["MyWorkingTable"];
DataTable dtInserted = dt.GetChanges(DataViewRowState.Added); 
DataTable dtModified = dt.GetChanges(DataViewRowState.Modified); 
da.Update(ds, dtInserted);
da.Update(ds, dtModified);

GetChanges returns a potentially empty DataTable or DataSet object with all the changes that were entered since the last time changes to the table were committed using AcceptChanges. When you need to know only whether a DataSet object has pending changes, calling the HasChanges method will be more effective.

The HasChanges method returns a Boolean value that indicates whether the DataSet object has changes, including new, deleted, or modified rows. (The DataTable object has no equivalent method.) Like GetChanges, HasChanges can take an optional argument of type DataViewRowState and filter changes by action. The HasChanges method is incredibly useful for keeping your user interface up-to-date.

You can create a panel, such as the one shown in Figure 7-4, to contain the controls you need when the table has pending changes. You can make it automatically appear when needed by linking its visibility state to the value returned by HasChanges.

panelChanges.Visible = ds.HasChanges();

Figure 7-4

A control panel that automatically appears when a table has pending changes.

When you have pending changes, you should take care of a few tasks, such as providing a way for the user to view changes, reject all changes, reject individual changes, and submit all the changes to the server, reconciling the in-memory data representation with the results that you get from the user.

Rejecting Changes

In-memory changes made to the DataSet object can be accepted or rejected at three different levels: the DataSet object, the DataTable object, and the individual DataRow object. Calling the RejectChanges method rolls back all the changes made to the specified object since it was created, or since the last time AcceptChanges was called. Calling RejectChanges on a DataSet object invokes the RejectChanges method on all DataTable objects contained in the DataSet.

An application based on a DataGrid control can easily reject the changes in the DataSet object and refresh the user interface by using a pretty simple, short piece of code. This code is the event handler associated with the Reject button in Figure 7-4.

public void OnRejectChanges(Object sender, EventArgs e)
{
    // Retrieve the DataSet from the ASP.NET cache
    DataSet ds = GetDataFromCache();

    // Reject all the pending changes on the DataSet
    ds.RejectChanges();

    // Refresh the grid to reflect rejected changes
    BindData(ds);
}

Viewing Changes

An application that works in a disconnected, off-line manner must offer its users a way to review changes before committing or rejecting them. In ASP.NET, this complex function can be coded in a pretty straightforward way thanks to the power of the DataGrid control and the DataSet DiffGram format. The next code fragment demonstrates how to set up a filter for the DataGrid control so that it displays only the modified, inserted, or deleted rows of a table.

// Retrieve the changes from the ASP.NET cache
DataSet ds = GetDataFromCache();
DataSet dsChanges = ds.GetChanges();

// Set the filter on the grid to select only modified rows
DataView dv = dsChanges.Tables["Employees"].DefaultView;
dv.RowStateFilter = DataViewRowState.Added;
dv.RowStateFilter ¦= DataViewRowState.ModifiedOriginal;
dv.RowStateFilter ¦= DataViewRowState.Deleted;

// Refresh the grid with changes
gridChanges.DataSource = dv;
gridChanges.DataBind();

To filter a table based on the row state, you use the DataView object’s RowStateFilter property. Figure 7-5 shows the sample application when you try to review the current changes.

Figure 7-5

The current changes in the DataSet object displayed in a separate grid.

The grid that shows changes has an extra column for displaying the state of the row in text. The number-to-string conversion is accomplished by the ToString method called on the RowState property of each single row. The following ASP.NET code presents the template used to populate the rightmost column in Figure 7-5.

<asp:TemplateColumn runat="server">
    <itemtemplate>
        <b><%# ((DataRowView)Container.DataItem).Row.RowState.ToString() %>
        </b>
    </itemtemplate>
</asp:TemplateColumn>

The Container.DataItem object represents the view of data in the row that is being processed.

Restoring Modified Rows

As mentioned earlier, the .NET Framework allows you to reject changes for an individual row. So why not take the sample application one step further by adding a button to reject changes and restore the original state? You add a new button column with the following ASP.NET code. It creates a link button with a command name of reject. Beware of the command name—it is case-sensitive.

<asp:ButtonColumn runat="server" Text="Restore" CommandName="reject" />

Clicking a button column can be detected by the DataGrid control’s ItemCommand handler. It is a generic handler for all button clicks. You recognize the actual source of the event by using the command name.

public void ItemCommand(Object sender, DataGridCommandEventArgs e)
{
    if(e.CommandName == "reject")
    {...}
}

The first (and pretty obvious) issue to consider is that you cannot use the Select method to locate the row to process. You can’t use Select with deleted rows because they are no longer part of the table’s list of current rows and thus can’t be retrieved. In the grid, how can you get the DataRow object hiding behind each displayed row with pending changes? You have to collect the row by using the filtered view that takes care of displaying it. You obtain the current DataSet object from the cache and then build a DataView object like the one that we used earlier to show changes:

DataView dv = ds.Tables["Employees"].DefaultView;
dv.RowStateFilter = DataViewRowState.Added;
dv.RowStateFilter ¦= DataViewRowState.ModifiedOriginal;
dv.RowStateFilter ¦= DataViewRowState.Deleted;

From the DataView object, you easily access the needed DataRow object. The item index will point to the correct address in the DataView object’s collection element. DataView elements are objects of the DataRowView type. You obtain the underlying DataRow object by using the Row property and reject its changes:

DataRow row = dv[e.Item.ItemIndex].Row;
row.RejectChanges();
dv.RowStateFilter = DataViewRowState.CurrentRows;

At this point, changes must be reflected in the current DataSet object. You start this phase by resetting the row state filter for the view. This step is necessary to avoid unpleasant side effects with the controls that are bound to the DataTable object that the view is based on. Once the freshly updated DataSet object is successfully re-cached, all you have left to do is update the two grids: the grid with the current rows and the grid with the pending changes. Figure 7-6 shows the sample application with a pending change. As soon as the individual change is restored, the two grids are refreshed, as shown in Figure 7-7.

Figure 7-6

The sample application has some pending changes, and the user is about to restore one of them.

Figure 7-7

One of the pending changes has just been rejected and the original values restored.

One final word about this code: the DataSet object’s GetChanges method returns null when the DataSet object (or the DataTable object) has no pending changes. Handle this case carefully or an exception will be thrown.


	Although both DataRowState and DataViewRowState describe row states, each results in different enumerations with different values and goals. The DataViewRowState values are used to retrieve a particular version of data from a DataRow object. The DataRowState values, instead, retrieve the current state of a row in regard to its relationship to the list of rows maintained by a table.

Generating Commands

The data adapter’s Update method loops through the rows of the specified DataTable object to detect which rows are changed. To slightly optimize the whole operation, you might want the adapter to work on a smaller table that includes only modified rows. You get this ad-hoc DataTable object by using the GetChanges method. As mentioned earlier, be cautious, because ADO.NET returns a null reference if no changes are detected.


	When executing the Update method, the data adapter loops through the table rows. Based on the row state, the adapter determines which command has to execute. It copies row values into the command’s parameters, fires the RowUpdating event, and issues the command. Next the adapter fires the RowUpdated event to the application and calls AcceptChanges for the row. In light of this, using a temporary DataSet containing only changes rather than the whole DataSet results in a very minimal optimization. The adapter has fewer rows to loop through, but this does not change the number of computationally relevant operations—the database statements. Moreover, as we’ll see shortly, using a temporary DataSet can significantly complicate conflicts resolution.

Three operations can modify the state of a table: inserting a new row, deleting an existing row, or updating an existing row. For each of these key operations, the data adapter defines a tailor-made command object that is exposed as a property: InsertCommand, DeleteCommand, and UpdateCommand, respectively. You are responsible for assigning to these properties meaningful command objects—for example, the SqlCommand object.


	The availability of the InsertCommand, DeleteCommand, and UpdateCommand properties is a quantum leap in functionality from ADO. They give you a chance to customize the way in which in-memory updates are submitted to the database. With ADO, you had no control over the SQL commands silently generated by the library. In ADO.NET, for example, you can control the update by using a stored procedure to perform batch updates and subsequently work with cross-referenced tables and non-SQL data providers.

The commands must run for each changed row in the table and be general enough to accommodate different values. Parameters are good for this kind of task, but you have to bind them to all the values of a database column. The data adapter object supports parameter objects that expose the SourceColumn and SourceVersion properties. SourceColumn gets or sets the name of the source column that is mapped to the DataSet object and is used for loading or returning the parameter value. SourceVersion indicates whether the original or current value has to be used as the parameter value. The following code shows how to bind the customerid field of the database to a given data adapter. The column metadata used to configure the parameter must match one of the columns in the table that will be part of the batch update.

SqlParameter p1 = new SqlParameter("@nCustomerID", SqlDbType.Int);
p1.SourceVersion = DataRowVersion.Current;
p1.SourceColumn = "customerid";
da.InsertCommand.Parameters.Add(p1);

The Insert Command

In the sample application, the INSERT command is defined as follows:

StringBuilder sb = new StringBuilder("");
sb.Append("INSERT Employees (firstname, lastname, titleofcourtesy, ");
sb.Append("sbtitle, country, address, postalcode) VALUES(");
sb.Append("@sFirstName, @sLastName, @sTitle, @sPosition, @sCountry, "); 
sb.Append("@sAddress, @sPostal)");
da.InsertCommand = new SqlCommand();
da.InsertCommand.CommandText = sb.ToString();
da.InsertCommand.Connection = conn;

All the parameters are added to the data adapter’s Parameters collection and are bound to a DataTable column. Auto-increment columns must not be listed in the syntax of the INSERT command.

The Update Command

The UPDATE command needs to identify one particular row to apply its changes. You do this using a WHERE clause in which a parameter value is compared against a key field. Note in the following code that the parameter used in the WHERE clause must be bound by using its original value rather than the current value.

StringBuilder sb = new StringBuilder("");
sb.Append("UPDATE Employees SET ");
sb.Append("title=@sPosition, country=@sCountry, ");
sb.Append("titleofcourtesy=@sTitle, firstname=@sFirstName, ");
sb.Append("lastname=@sLastName, address=@sAddress, ");
sb.Append("postalcode=@sPostal ");
sb.Append("WHERE employeeid=@nEmpID");da.UpdateCommand = new SqlCommand();
da.UpdateCommand.CommandText = sb.ToString();
da.UpdateCommand.Connection = conn;

p1 = new SqlParameter("@nEmpID", SqlDbType.Int);
p1.SourceVersion = DataRowVersion.Original;p1.SourceColumn = "employeeid";
da.UpdateCommand.Parameters.Add(p1);

The Delete Command

Like UPDATE, the DELETE command requires a WHERE clause to identify the row to remove. You also need to use the original version of the row to bind the parameter value.

StringBuilder sb = new StringBuilder("");
sb.Append("DELETE FROM Employees ");
sb.Append("WHERE employeeid=@nEmpID");
da.DeleteCommand = new SqlCommand();
da.DeleteCommand.CommandText = sb.ToString();
da.DeleteCommand.Connection = conn;

p1 = new SqlParameter("@nEmpID", SqlDbType.Int);
p1.SourceVersion = DataRowVersion.Original;p1.SourceColumn = "employeeid";
da.DeleteCommand.Parameters.Add(p1);

The SelectCommand Property

In the programming interface of the data adapter object, along with InsertCommand, UpdateCommand, and DeleteCommand, you’ll find a rather unusual property named SelectCommand. This property serves two main purposes. First, it gives you a chance to use a stored procedure with or without parameters to populate a DataSet object. When you set the SelectCommand property with a valid command object and then call the Fill method, the DataSet object is filled with data. The second use for SelectCommand is to automatically create the commands used in the batch update.

Command Builders

When invoked, the data adapter’s Update method assumes that the necessary INSERT, UPDATE, and DELETE statements are specified. If the pending changes require the use of a command that isn’t specified, the Update method generates an exception.

Automatically Generated Commands

The update commands can also be generated automatically and made available to the data adapter update engine. When you create a command builder object (for example, SqlCommandBuilder), the command builder automatically generates the SQL statements needed for the batch update.


	Command builders are an optional feature of .NET data providers. Command builders work in conjunction with data adapters to generate custom SQL commands that are used during the batch update.

Command builders cannot be used in all cases. The automatic generation of commands can take place only under certain circumstances. In particular, command builders do not generate anything when the table is obtained by joining columns from more than one table; they also don’t produce statements when calculated, or aggregate, columns are detected. Command builders are extremely helpful and save code only when they deal with single table updates. How can command builder objects know about the structure of the table so that they can generate update statements? They rely on SelectCommand.


	Command builders also fail when column names or table names contain special characters, such as spaces, periods, and quotation marks, even when these characters are delimited by brackets.

A fundamental prerequisite for a command builder object is that the SelectCommand property of the data adapter points to a valid query statement. If this query contains INNER JOIN statements or calculated columns, and if the query does not return key column information, an exception is thrown and no further command is generated. A command builder is a smart kind of object because it recognizes whether the object it is called to create already exists. If it does, only those command properties you did not set are generated.

How the Command Builder Works

The command builder object works closely with the data adapter. When you use a command builder, you don’t need to set any of the adapter’s command properties with the objects generated by the builder. The Update method will ask the associated command builder to provide commands when needed, so beware of accessing the InsertCommand, DeleteCommand, and UpdateCommand properties after creating the command builder. These properties contain non-null objects only when you explicitly instantiate them.

A command builder object creates commands on demand and then caches them internally to serve further requests. It purposely exposes three methods: GetInsertCommand, GetUpdateCommand, and GetDeleteCommand. The association between the data adapter and the command builder is established through the builder’s constructor, as seen in the following code example:

SqlCommand cmd = new SqlCommand();
cmd.CommandText = "SELECT employeeid, postalcode FROM Employees";
cmd.Connection = conn;
dataAdapter.SelectCommand = cmd;
SqlCommandBuilder cb = new SqlCommandBuilder(dataAdapter);

To generate INSERT, UPDATE, and DELETE statements, the command builder uses the SelectCommand property, which retrieves metadata. If you change SelectCommand after the metadata has been retrieved (for example, after the first of a series of updates), call the builder’s RefreshSchema method to update the metadata. SqlCommandBuilder uses some of the properties defined in the SelectCommand object: Connection, CommandTimeout, and Transaction. When any of these properties are modified, call RefreshSchema to update the structure of the generated commands. You also need to call RefreshSchema when the SelectCommand itself is replaced. In all other cases, the InsertCommand, UpdateCommand, and DeleteCommand properties retain their previous values no matter how many updates you perform on the same adapter.

Only the fields listed in the SelectCommand object are actually part of the update. For example, in the code we just examined, only employeeid and postalcode are inserted or updated using the code generated by the command builder. For the command builder to work, one of these columns must be the primary key.


	The command builder is able to detect whether a column is also an auto-increment column. If the column is an auto-increment column, the command builder correctly avoids adding that column in the list of fields populated by an automatically generated INSERT statement.

A Quick Look at Auto-Generated Commands

If you want to find out which command was automatically generated, you can obtain a reference to it by using the GetInsertCommand, GetUpdateCommand, and GetDeleteCommand methods. Once you hold the objects returned by these methods, you use the CommandText property to learn the details of the command being run.

Trace.Warn("InsertCommand", cb.GetInsertCommand().CommandText);

To see what kind of commands the SQL Server command builder generates, use the following command text for the SelectCommand object:

SELECT employeeid, firstname, lastname, postalcode FROM Employees

The employeeid field is the primary key and also an identity column. This dual purpose is a special condition and is correctly detected as the text for the INSERT statements, which look like the following:

INSERT INTO Employees( firstname , lastname , postalcode ) 
    VALUES ( @p1 , @p2 , @p3 )

A bit more interesting is the command text generated for the DELETE and UPDATE statements, which follows:

DELETE FROM Employees 
    WHERE employeeid=@p1 AND firstname=@p2 AND lastname=@p3 
        AND postalcode=@p4

UPDATE Employees 
    SET firstname=@p1, lastname=@p2, postalcode=@p3 
    WHERE employeeid=@p4 AND firstname=@p5 AND lastname=@p6 
        AND postalcode=@p7

The part of the statements you should notice appears with the WHERE clause. When you want to ensure the correct row is identified, you usually don’t have to use such complex conditions involving all the columns. So why is the command builder implemented like this? The answer is pretty simple: to comply with the requirements of optimistic lock. A WHERE condition that involves a check on all the modified fields ensures that if the update command runs successfully, no conflict is detected and the underlying row is exactly as it was before it was read.

Regardless of the SQL command contained in the SelectCommand property, no rows are ever returned. How is this possible? The answer to the mystery is easily revealed by taking a look at what happens under the hood of the SQL Profiler. The command builder executes the SelectCommand query in a SQL batch in which the actual text of the command is preceded by the following statements:

SET FMTONLY OFF
SET NO_BROWSETABLE ON
SET FMTONLY ON

As a result, the query does not return rows but only column metadata information.

Once the batch update successfully terminates, you have to refresh the in-memory data and reflect any change made at the database level. For example, triggers or default values might have changed or just completed the inserted or modified rows. To make fresh data available on the client so that you can reconcile in-memory and source tables, you repeat the query command that populated the DataSet object. Unfortunately, the DataSet object does not track any information about the sequence of commands that originated it, but you can take responsibility for storing the SQL query command previously used to populate the DataSet object. The DataSet object’s ExtendedProperties cargo collection appears to be the ideal place to store it.

Data Conflicts

The logic behind automatic command generation is based on an optimistic vision of concurrency, in which each record is not locked after being read so that it remains exposed for other users to access for reading and writing at any time. With an optimistic vision of concurrency, a number of potentially inconsistent situations can occur. For example, a row can be modified or even deleted after it is returned from the SELECT statement but before the UPDATE statement is actually issued.

To avoid overwriting new data, the automatically generated UPDATE statement contains a WHERE clause that allows the command to be successful only if the rows contain original values and are not deleted. When such an UPDATE command fails to update the row, we have a conflict of data. In this case, the command does not affect the involved row but throws an exception of type DBConcurrencyException. The following code snippet demonstrates a more accurate way to execute a batch update operation with ADO.NET:

try 
{
    dataAdapter.Update(ds, "Employees");
}
catch (DBConcurrencyException dbdcex)								
{
    Response.Write("Conflict detected.");
}

The command builder always generates commands according to the principle of optimistic concurrency. If you want the UPDATE to complete regardless of the conflict between original and current values, you will need to use your own UPDATE command and not rely on automatic command generation. A command that writes the row without comparing original and current values has a much simpler WHERE clause, as shown here:

DELETE FROM Employees WHERE employeeid=@p1 

UPDATE Employees SET firstname=@p1, lastname=@p2, postalcode=@p3 
    WHERE employeeid=@p4

Basically, the WHERE clause now needs only to unequivocally identify the row to work on. A condition based on the primary key column can easily pay the bill.

Resolving the Conflict

The data adapter throws an exception of type DBConcurrencyException for the first row where the update fails. All modifications occurring prior to throwing the exception are considered committed. When the first conflict is detected, ADO.NET passes the control back to the client application, stopping the execution of the batch update. You access the DataRow involved in the conflictual update by using the Row property of the DBConcurrencyException class.

The way in which the conflict is resolved is application-specific. You could decide that the last value always overwrites the current value, whatever it is. Alternatively, you could apply an algorithm to determine resolution on a case by case basis, or even leave the decision to the user. When you leave the decision to the user, however, the prompt message must include the value proposed by the application along with the value originally read from the database and the value read while trying the update. The DataRow property returned by the Row property of the DBConcurrencyException object contains both the proposed value and the original value of the row. It does not contain the value currently stored in the database for the column. This value can be retrieved only with another query command. The code that runs when a concurrency exception is raised now looks like the following.

try 
{
    dataAdapter.Update(ds, "Employees");
}
catch (DBConcurrencyException dbcex)
{
    // One conflict has been detected...

    // Load the current row from the database
    SqlCommand c = new SqlCommand();
    c.CommandText = "SELECT * FROM Employees WHERE employeeid=" + 
        dbcex.Row["employeeid"].ToString();
    c.Connection = conn;
    conn.Open();
    SqlDataReader dr = c.ExecuteReader();

    // Store all the values in the row in an array
    dr.Read();
    Object [] rg = new Object[NUM_OF_ROWS];
    dr.GetValues(rg);

    // Close connection and data reader
    dr.Close();
    conn.Close();

    // Use an ad-hoc user control to display data
    conflict.Display(dbcex.Row, rg);
}

The GetValues method of the DataReader object packages into an array of objects all the values read in the currently selected row in the data reader. The sample code passes this array, together with the DataRow object that caused the conflict, to a tailor-made user control. The user control is shown in Figure 7-8.

Figure 7-8

A user control displays all the information about the conflicting row.

The next listing shows how the user control retrieves and displays the needed information.

<script runat="server">
public void Display(DataRow row, Object [] rg)
{
    StringBuilder sb = new StringBuilder("");

    // Loops through the columns that should be updated
    sb.Append("<table>");
    for(int nIndex=0; nIndex<row.ItemArray.Length; nIndex++)
    {
        if (!row[nIndex].Equals(row[nIndex, DataRowVersion.Original]))
        {
        try {
            sb.Append("<tr>");
            sb.Append("<td>");
            sb.Append("<b>Original: </b>");
            sb.Append(row[nIndex, DataRowVersion.Original].ToString());
            sb.Append("</td>");
            sb.Append("<td>");
            sb.Append("<b>Proposed: </b>");
            sb.Append(row[nIndex].ToString());
            sb.Append("</td>");
            sb.Append("<td>");
            sb.Append("<b>Underlying: </b>");
            sb.Append(rg[nIndex].ToString());
            sb.Append("</td>");
            sb.Append("</tr>");
            } catch {}
        }
    }
    sb.Append("</table>");

    // Display the resulting text
    lblInfo.Text = sb.ToString();
}
</script>

The proposed value is the value stored in the DataRow object that is returned via the exception. To get the original value stored in the row when it was originally read from the data source, you use an overloaded version of the DataRow object’s Item property. When you access a particular column in a row, you can add a second, optional argument, which is the version of the row you want, as shown in this example:

row[nIndex, DataRowVersion.Original]

Resolving a conflict means that you decide what value should be written to the database. After you do this, you run a new command with a lazier WHERE condition and then resume the batch update for all the rows not yet committed.

Accepting In-Memory Changes

A critical aspect of conflict resolution in ADO.NET is the way you manage the acceptance of changes while in the middle of a concurrency exception. In ADO the process is more simplistic because the first conflict rolls back all the successful changes accomplished in the meantime.

In ADO.NET, rows successfully updated when the first conflict is detected are not rolled back. Changes to these rows have also been committed. The Update method, in fact, implicitly calls AcceptChanges on the row after the update is completed. If you use the original DataSet for the batch update, then you can safely resume the process once the conflict has been resolved. Previously updated rows are now marked as unchanged and won’t participate in the batch update again. But what happens if you used a temporary copy of the original DataSet—for example, the copy you get through GetChanges?

In this case, the changes to the rows updated prior to the first conflict are committed but only on the temporary DataSet. This makes writing the code to resume the batch update more challenging. Basically, you need to figure out a way to make the original DataSet accept the changes to the rows processed prior to the one that originated the conflict.

Part III: Interoperability