Query Compilation and Optimization

Query compilation is the complete process from the submission of a query to the actual execution. There are many steps to query compilation?one of which is optimization. All T-SQL statements are compiled, but not all are optimized. Only the standard SQL Data Manipulation Language (DML) statements, SELECT, INSERT, UPDATE, and DELETE, require optimization. The other procedural constructs in T-SQL (IF, WHILE, local variables, and so on) are compiled as procedural logic but do not require optimization. DML statements are set-oriented requests that the optimizer must translate into procedural code that can be executed efficiently to return the desired results.

Compiling DML Statements

When SQL Server compiles an execution plan for a DML statement, it performs the following basic steps:

  1. The query is checked for proper syntax and the T-SQL statements parsed into keywords, expressions, operators, and identifiers to generate a sequence tree. The sequence tree is an internal format of the query that SQL Server can operate on.

  2. The sequence tree is then normalized. During normalization, the tables and columns are verified and the metadata (datatypes, null properties, index statistics, and so on) about them is retrieved. Additionally, any views are resolved to their underlying tables and implicit conversions are performed (for example, an integer compared with a float value).

  3. If the statement is a DML statement, SQL Server takes the normalized query and generates a query graph.

  4. The query graph is then optimized and a query execution plan is generated.

  5. SQL Server executes the query execution plan.

Optimization Steps

When the query graph is passed to the optimizer, the optimizer performs a series of steps to break the query down into its component pieces for analysis in order to generate an optimal execution plan.

  1. Query Analysis?The query is analyzed to determine search arguments and join clauses. A search argument is defined as a WHERE clause comparing a column to a constant. A join clause is a WHERE clause comparing a column from one table to a column in another table.

  2. Index Selection?Indexes are selected based on search arguments and join clauses (if any exist). Indexes are evaluated based on their distribution statistics and are assigned a cost.

  3. Join Selection?The join order is evaluated to determine the most appropriate order in which to access tables. Additionally, the optimizer evaluates the most appropriate join algorithm to match the data.

  4. Execution Plan Selection?Execution costs are evaluated and a query execution plan is created that represents the most efficient solution.

The next four sections of this chapter will examine each of these steps in more detail.

NOTE

Unless stated otherwise, the examples presented in this chapter operate on the tables in the bigpubs2000 database. The pubs and Northwind databases provided with SQL Server generally do not contain enough data to demonstrate many of the query strategies presented in this chapter. A copy of the bigpubs2000 database is available on the CD included with this book. Instructions on how to install the database are presented in the Introduction.



    Part III: SQL Server Administration
    Part IV: Transact-SQL
    Part V: SQL Server Internals and Performance Tuning
    Part VI: Additional SQL Server Features