Hardware Configuration

The minimums for hardware configurations from Microsoft shown earlier are just that. Any system configured at those values will not run the software quickly and certainly will not support any serious workload. To determine the proper configuration for a given application, you must understand how each component within a server affects SQL Server's performance.

CPU

The central processing unit is one of the single most important components for SQL Server performance. The factors that affect CPU choice are speed, its cache, the number of processors, and the architecture of the processors, such as 32 or 64 bit.

Obviously, the faster the processor, the faster instructions pass through the system. The number of processors, however, can outweigh the raw speed. For instance, a fast single (non-dual-core) processor might actually run a given query more slowly than two slower processors. This is true because of the ability of SQL Server to parallelize, or spread out a query across multiple processors.

Hyper-threaded processors also help speed queries by doubling certain parts of the CPU architecture within a single chip, allowing multiple threads to operate on one CPU. This gives the appearance to the software of having multiple processors. Dual-core processors take this architecture a step further by doubling an entire processor on a single chip, and in some configurations, it is also used to provide 32- and 64-bit concurrent operations.

Windows 2000 handles hyper-threading differently than Windows 2003. In Windows 2000, a thread is placed on a processor based on its logical processor count, not by physical processor. That means that a thread might be scheduled to run on physical processor one and then logical processor two, which may be on the same physical chip. The second physical processor, with its own cache and pipeline, would go underutilized in this situation. Windows 2003 schedules threads by physical processor first and then logical processors second.

Another important factor is the second level (L2) cache memory on the CPU. More is better. A slower CPU with a higher L2 cache can outperform a faster one with a lower L2 cache amount.

The difference between 32- and 64-bit processors is startling. A 64-bit processor is not just double the size of a 32-bit CPU; 64-bit architectures allow an amazing increase in address space for instructions and memory. For serious enterprise configurations, consider 64-bit processors.

There are two types of 64-bit processors. The first is called x64, which means that it is completely backward compatible with 32-bit software. The processor will run 64-bit code, but will also run 32-bit software if that is what is installed. Some tests show that 32-bit software actually runs faster on an x64 processor.

The second type of 64-bit processor is an entirely new architecture. The Intel version of this chip is called the Itanium or IA64, and it runs only 64-bit software compiled to work on it. SQL Server is currently designed to work on IA64 as well as x64 architectures.

Memory

Random access memory (RAM) is the second most important factor in how quickly a SQL Server responds to queries. Several buffers inside SQL Server use memory to cache query plans and procedures. If memory is constrained, the server has to cycle recent plans and procedures down to the I/O subsystem, which is several orders of magnitude slower than the circuitry in memory.

The recommendations for operating SQL Server for any production application should start at 2 gigabytes and progress from there. SQL Server automatically allocates memory among its various components, so no tuning for locks and caching is needed. By default, the system takes a range of memory from a lower boundary to a maximum limit and gives this memory back to the operating system as needed. It is much more efficient to keep all other applications away from the SQL Server and set the memory to a fixed amount, leaving at least 512MB for the operating system. This setup keeps SQL Server from having to manage the memory, which saves a few CPU cycles. I explain more about memory configurations in Chapter 5, "Monitoring and Optimization."

I/O

With the I/O subsystem, speed is the primary factor, followed closely by the interface type. For local systems, IDE or on-board SATA are adequate; in production configurations, however, the system should use an SCSI interface. SCSI has a wider bus, so it can transfer more data at a single pass.

The best local technology is a redundant array of inexpensive disks (RAID). This is a scheme of a special controller hooked up to one or more drives. There are several levels of RAID, all having to do with how the data is stored on more than one drive at a time.

DBA 101

RAID level 0 stripes all the data across several drives (at least two) and is extremely fast. But because the entire data set is dependent on all the drives working properly, lose one and you lose all the data. For this reason, RAID 0 is not appropriate for SQL Server files.

The first real choice for SQL Server is RAID 1, where all data is duplicated onto a "mirror" set of drives. SQL Server 2005 log files (more on those in the next chapter) are best stored on this level of RAID, because RAID 1 works well with sequential reads and writes. For this setup, you need an even number of drives.

The next level normally seen in a SQL Server is RAID 5, where the data is "striped" across multiple drives (at least three) and a parity bit is used to rebuild a single drive in case of failure. You can think of an equation, such as 3 + 2 = 5. If you were to lose any part of that equation, say the 3, you could rebuild the equation from the other two parts, like this: X + 2 = 5. The data, combined with the parity bit, works the same way. RAID level 5 is best used with SQL Server data files, because they are accessed randomly.

Although there are many other RAID levels (check out www.raid.com), there is most often only one other that you will see in SQL Server installations. It is actually a combination of other RAID levels and is called RAID 0+1. In this setup, the data is striped with no safety across several drives for speed, and then the entire set is duplicated for safety. You need at least four drives for this setup.

All RAID levels should be handled by hardware devices only, and it is best if you use multiple controllers for each set of drives. Software RAID setups are available, but they should never be implemented in a SQL Server environment, not even in testing or development. The reason is that the CPU and memory requirements to run the software RAID are so prohibitive that SQL Server is starved for resources.

In clustering or in larger configurations where greater speed and safety is a necessity, the SQL Server is normally connected to a storage area network (SAN). In a SAN configuration, a special I/O adapter called a host bus adaptor (HBA) is installed in the server, and the server is connected to an external storage array via a switched network, often using Fibre Channel technology.

DBA 101

Fibre Channel is often confused with fiber-optic interfaces, which it sometimes employs. The Fibre in this case refers to the interface as well as the network that it uses, called the fabric of the SAN. Fibre Channels can use fiber-optic setups and copper wiring, too.

In Fibre Channel, an HBA connects to a Fibre switch, which in turn connects to storage arrays, which have enclosures, which contain physical drives. These drives are served up to the computer as logical unit numbers (LUNs). These LUNs form the lowest level of storage that the operating system can see, and the LUNs are then carved into partitions available for either mount points or drive letters on the operating system.

To optimize the SAN configuration, ensure that there are multiple paths to your storage arrays and balance the I/O where possible across LUNs, physical storage, and processors.

There is also network attached storage (NAS), which is a drive enclosure that is mounted externally on the network. If you are using gigabit speeds or fiber-optic technology, these are also fine for SQL Server installations. The danger here is that the drives in these systems are often used by other applications, and if they mix with the SQL Server data files, the system will take a performance impact.

Another popular technology for large-scale storage is serial attached storage (SAS). In this setup, the drives are stored remotely from the server. Having a networked version allows for robust disaster recovery scenarios.

One rule of thumb with I/O whether local storage or SANS are used is that more spindles are better. More spindles means that more heads are able to satisfy requests, so data is delivered quicker. With more spindles, a read operation is not blocked by a write operation, and queries do not block log updates. This depends on how you arrange the files that SQL Server uses, something I cover in more depth in Chapter 5. You can check with your hardware vendor to find out how many spindles their solution has.

To properly plan for the final drive configuration, create separate mount points or drive letters for the tempdb database files, data files, logs, and indexes. This allows sequential writes (the log files) to be separated from random reads (the data files) and indexing operations (the index files) to be separate from internal sorting and tracking (tempdb).

Even if you are using a SAN that hides the physical separation and controls the I/O, having separate drive letters allows you greater flexibility when creating filegroups. Filegroups allow you to perform backups, maintenance, and other tasks on individual chunks of the database. Having filegroups arranged this way also simplifies transitions to newer systems, which happens often in large companies.

The most often overlooked component of storage during the planning stage is the backup strategy. Most often, you will use a magnetic tape solution. Smaller setups can use DAT or DLT tapes, and larger organizations will require LTO-1, LTO-2, or LTO-3 tapes for larger storage. At the largest levels, you can begin to consider virtual tape drives, which SQL Server 2005 supports. I explain the backup and recovery strategies more completely in Chapter 3 "Maintenance and Automation," but you should plan for at least the capacity, recovery interval, and automation level that you want right at the outset.

Network

Although many administrators do not realize it, the network interface in a computer is often the slowest component in the system.

The network cards inside the server should be as fast as possible and, where applicable, set to full duplex (not to auto-negotiate). You should include a fast network card in the server even if the network topology is slower than the card in the server can perform. This lets you increase the speed of the network later without having to alter the server and allows the server to satisfy more requests at a time.

The server should have at least two network cards if it will participate in replication or you plan to locate it in a network demilitarized zone (DMZ). In addition, each network card should be connected directly to a different network switch. This provides redundancy and greater speed to handle multiple requests. In addition, a separate "backbone" or dedicated connection for the servers isolates traffic from the rest of the network and provides the best level of performance for both.