The minimum hardware configurations from Microsoft shown earlier are just that: minimums. A system configured at those values will not run the software quickly and certainly will not support any serious workload. To determine the proper configuration for a given application, you must understand how each component within a server affects SQL Server's performance.
The central processing unit (CPU) is one of the most important components for SQL Server performance. The factors that affect CPU choice are clock speed, cache size, the number of processors, and the processor architecture, such as 32-bit or 64-bit.
Obviously, the faster the processor, the faster instructions pass through the system. The number of processors, however, can outweigh raw speed. For instance, a fast single (non-dual-core) processor might actually run a given query more slowly than two slower processors. This is because SQL Server can parallelize a query, spreading its work across multiple processors.
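The trade-off between processor speed and processor count can be sketched with Amdahl's law. This is a simplified model, not a description of how SQL Server's optimizer actually costs a parallel plan; the function name, the work units, and the 90 percent parallel fraction are all assumptions chosen for illustration.

```python
def query_time(work_units: float, cpu_speed: float, cpus: int,
               parallel_fraction: float) -> float:
    """Estimate elapsed time for a query under Amdahl's law.

    work_units        -- total work in arbitrary units (hypothetical)
    cpu_speed         -- work units one processor completes per second
    cpus              -- number of processors available to the query
    parallel_fraction -- share of the work that can be spread across CPUs
    """
    serial_work = work_units * (1.0 - parallel_fraction)
    parallel_work = work_units * parallel_fraction
    # Serial work runs on one CPU; parallel work is divided evenly.
    return (serial_work + parallel_work / cpus) / cpu_speed


# One fast processor versus two slower ones, for a query that is
# assumed to be 90 percent parallelizable.
fast_single = query_time(100.0, cpu_speed=3.0, cpus=1, parallel_fraction=0.9)
slow_pair = query_time(100.0, cpu_speed=2.0, cpus=2, parallel_fraction=0.9)

print(f"one fast CPU:  {fast_single:.1f} s")
print(f"two slow CPUs: {slow_pair:.1f} s")
```

In this model the advantage disappears when the parallel fraction is small, which is why the slower pair wins only for queries that parallelize well.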
Hyper-threaded processors also help speed queries by doubling certain parts of the CPU architecture within a single chip, allowing multiple threads to operate on one CPU. To the software, this looks like multiple processors. Dual-core processors take this architecture a step further by placing two entire processors on a single chip; some configurations also use this design to provide concurrent 32- and 64-bit operations.
Windows 2000 handles hyper-threading differently from Windows 2003. Windows 2000 places a thread according to the logical processor count and does not distinguish logical processors from physical ones. That means two threads might be scheduled on the two logical processors of the same physical chip, while the second physical processor, with its own cache and pipeline, goes underutilized. Windows 2003 schedules threads on physical processors first and on logical processors second.
Another important factor is the second-level (L2) cache memory on the CPU. More is better. A slower CPU with a larger L2 cache can outperform a faster one with a smaller L2 cache.
The difference between 32- and 64-bit processors is startling. A 64-bit processor is not just double the size of a 32-bit CPU; 64-bit architectures allow an amazing increase in address space for instructions and memory. For serious enterprise configurations, consider 64-bit processors.
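The size of that increase is easy to work out. The sketch below computes the theoretical address space for a given pointer width; the function name is my own, and the figures are upper bounds, because shipping 64-bit processors and operating systems implement fewer than 64 address bits.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Theoretical number of bytes reachable with the given pointer width."""
    return 2 ** address_bits


space_32 = addressable_bytes(32)
space_64 = addressable_bytes(64)

print(space_32 // GIB)       # 4 (GiB addressable with 32 bits)
print(space_64 // space_32)  # 4294967296 (times larger with 64 bits)
```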
There are two types of 64-bit processors. The first is called x64, which means that it is completely backward compatible with 32-bit software. The processor will run 64-bit code, but will also run 32-bit software if that is what is installed. Some tests show that 32-bit software actually runs faster on an x64 processor.
The second type of 64-bit processor is an entirely new architecture. The Intel version of this chip is called the Itanium or IA64, and it runs only 64-bit software compiled to work on it. SQL Server is currently designed to work on IA64 as well as x64 architectures.
Random access memory (RAM) is the second most important factor in how quickly a SQL Server responds to queries. Several buffers inside SQL Server use memory to cache query plans and procedures. If memory is constrained, the server has to cycle recent plans and procedures down to the I/O subsystem, which is several orders of magnitude slower than the circuitry in memory.
For any production application, plan on at least 2GB of RAM and scale up from there. SQL Server automatically allocates memory among its various components, so no tuning for locks and caching is needed. By default, the system manages memory dynamically between a lower boundary and a maximum limit, and gives memory back to the operating system as needed. It is much more efficient to keep all other applications off the SQL Server system and set the memory to a fixed amount, leaving at least 512MB for the operating system. This setup keeps SQL Server from having to manage the memory, which saves a few CPU cycles. I explain more about memory configurations in Chapter 5, "Monitoring and Optimization."
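As a rough sketch of the fixed-memory approach, the following computes the amount to give SQL Server and builds the matching configuration statements. The helper names are hypothetical. The `min server memory (MB)` and `max server memory (MB)` options of `sp_configure` are real, and setting both to the same value is one common way to fix the allocation; review the generated script before running it against a server.

```python
OS_RESERVE_MB = 512  # minimum left for the operating system, per the text


def fixed_memory_mb(total_ram_mb: int, os_reserve_mb: int = OS_RESERVE_MB) -> int:
    """Return the memory, in MB, to assign to SQL Server on a dedicated box."""
    if total_ram_mb <= os_reserve_mb:
        raise ValueError("server has no memory left after the OS reserve")
    return total_ram_mb - os_reserve_mb


def sp_configure_script(memory_mb: int) -> str:
    """Build T-SQL that pins minimum and maximum server memory to one value."""
    lines = [
        # Both memory settings are advanced options, so expose them first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'min server memory (MB)', {memory_mb};",
        f"EXEC sp_configure 'max server memory (MB)', {memory_mb};",
        "RECONFIGURE;",
    ]
    return "\n".join(lines)


# Example: a dedicated server with 4GB of RAM.
script = sp_configure_script(fixed_memory_mb(4096))
print(script)
```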
With the I/O subsystem, speed is the primary factor, followed closely by the interface type. For local systems, IDE or on-board SATA is adequate; in production configurations, however, the system should use a SCSI interface. SCSI has a wider bus, so it can transfer more data in a single pass.
The best local technology is a redundant array of inexpensive disks (RAID). In this scheme, a special controller is connected to two or more drives. There are several levels of RAID, which differ in how the data is stored across more than one drive at a time.
All RAID levels should be handled by hardware devices only, and it is best if you use multiple controllers for each set of drives. Software RAID setups are available, but they should never be implemented in a SQL Server environment, not even in testing or development. The reason is that the CPU and memory requirements to run the software RAID are so prohibitive that SQL Server is starved for resources.
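To make the idea of RAID levels concrete, here is a small sketch of the usable capacity each common level yields. The text above does not name specific levels; the four shown (0, 1, 5, and 10) and their standard capacity formulas are my addition, and the function name and the 146GB drive size are illustrative only.

```python
def usable_capacity_gb(level: str, drives: int, drive_gb: float) -> float:
    """Usable space for an array of identical drives at a given RAID level."""
    if level == "0":
        # Striping only: all space is usable, with no redundancy.
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * drive_gb
    if level == "1":
        # Mirroring: the second drive is a full copy of the first.
        if drives != 2:
            raise ValueError("this sketch models RAID 1 as a 2-drive mirror")
        return drive_gb
    if level == "5":
        # Striping with parity: one drive's worth of space holds parity.
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_gb
    if level == "10":
        # Striped mirrors: half of the drives hold copies.
        if drives < 4 or drives % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return (drives // 2) * drive_gb
    raise ValueError(f"unsupported RAID level: {level}")


for level, count in (("0", 4), ("1", 2), ("5", 4), ("10", 4)):
    print(f"RAID {level}, {count} drives: {usable_capacity_gb(level, count, 146)} GB")
```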
In clustering or in larger configurations where greater speed and safety are necessities, the SQL Server is normally connected to a storage area network (SAN). In a SAN configuration, a special I/O adapter called a host bus adapter (HBA) is installed in the server, and the server is connected to an external storage array via a switched network, often using Fibre Channel technology.
There is also network attached storage (NAS), which is a drive enclosure that is mounted externally on the network. If you are using gigabit speeds or fiber-optic technology, NAS is also fine for SQL Server installations. The danger here is that the drives in these systems are often used by other applications, and if that traffic mixes with the SQL Server data files, performance suffers.
Another popular technology for large-scale storage is Serial Attached SCSI (SAS). In this setup, the drives are stored remotely from the server. Having a networked version allows for robust disaster recovery scenarios.
One rule of thumb with I/O, whether local storage or SANs are used, is that more spindles are better. More spindles mean that more heads are able to satisfy requests, so data is delivered more quickly. With more spindles, a read operation is not blocked by a write operation, and queries do not block log updates. This depends on how you arrange the files that SQL Server uses, something I cover in more depth in Chapter 5. You can check with your hardware vendor to find out how many spindles their solution has.
To properly plan for the final drive configuration, create separate mount points or drive letters for the tempdb database files, data files, logs, and indexes. This allows sequential writes (the log files) to be separated from random reads (the data files) and indexing operations (the index files) to be separate from internal sorting and tracking (tempdb).
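A planned layout can be checked before installation with a few lines of code. This is a minimal sketch under my own assumptions: the role names, the drive letters, and the `check_layout` helper are hypothetical, and the check only confirms that each file type has its own mount point.

```python
from typing import Dict, List

# The four file types the text says should be kept apart.
REQUIRED_ROLES = ("tempdb", "data", "log", "index")


def check_layout(layout: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a role-to-mount-point mapping."""
    problems = []
    for role in REQUIRED_ROLES:
        if role not in layout:
            problems.append(f"missing role: {role}")
    seen: Dict[str, str] = {}
    for role, mount in layout.items():
        key = mount.upper()  # drive letters are not case-sensitive
        if key in seen:
            problems.append(f"{role} shares {mount} with {seen[key]}")
        else:
            seen[key] = role
    return problems


# Hypothetical drive letters, one per file type.
planned = {"tempdb": "T:", "data": "D:", "log": "L:", "index": "I:"}
print(check_layout(planned))
```

An empty list means every file type has a separate mount point; a layout that puts the logs on the data drive would be reported as a conflict.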
Even if you are using a SAN that hides the physical separation and controls the I/O, having separate drive letters allows you greater flexibility when creating filegroups. Filegroups allow you to perform backups, maintenance, and other tasks on individual chunks of the database. Having filegroups arranged this way also simplifies transitions to newer systems, which happen often in large companies.
The most often overlooked component of storage during the planning stage is the backup strategy. Most often, you will use a magnetic tape solution. Smaller setups can use DAT or DLT tapes, and larger organizations will require LTO-1, LTO-2, or LTO-3 tapes for larger storage. At the largest levels, you can begin to consider virtual tape drives, which SQL Server 2005 supports. I explain the backup and recovery strategies more completely in Chapter 3, "Maintenance and Automation," but you should plan for at least the capacity, recovery interval, and automation level that you want right at the outset.
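Capacity planning for tape can start with simple arithmetic. The sketch below estimates how many cartridges one full backup needs, using the native (uncompressed) capacities of the three LTO generations mentioned above. The function name and the 1,000GB backup size are assumptions for illustration, and real requirements also depend on compression and retention policy.

```python
import math

# Native (uncompressed) cartridge capacity in GB for each LTO generation.
NATIVE_CAPACITY_GB = {"LTO-1": 100, "LTO-2": 200, "LTO-3": 400}


def tapes_needed(backup_gb: float, tape_type: str) -> int:
    """Number of cartridges required to hold one full backup."""
    capacity = NATIVE_CAPACITY_GB[tape_type]
    # Round up: a partly filled tape still counts as a whole cartridge.
    return math.ceil(backup_gb / capacity)


# Hypothetical 1,000GB full backup.
for tape in NATIVE_CAPACITY_GB:
    print(tape, tapes_needed(1000, tape))
```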
Although many administrators do not realize it, the network interface in a computer is often the slowest component in the system.
The network cards inside the server should be as fast as possible and, where applicable, set to full duplex (not to auto-negotiate). You should include a fast network card in the server even if the network topology is slower than the card can handle. This lets you increase the speed of the network later without having to alter the server and allows the server to satisfy more requests at a time.
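The effect of link speed is easy to estimate. This sketch computes the best-case time to move a result set across links of different speeds; it ignores protocol overhead, latency, and contention, so real transfers take longer. The function name and the 100MB payload are assumptions for illustration.

```python
def transfer_seconds(size_mb: float, link_mbps: float) -> float:
    """Best-case seconds to send size_mb megabytes over a link_mbps link."""
    size_megabits = size_mb * 8  # 8 bits per byte
    return size_megabits / link_mbps


# Hypothetical 100MB result set over Fast Ethernet and Gigabit Ethernet.
for label, mbps in (("100 Mbps", 100), ("1 Gbps", 1000)):
    print(f"{label}: {transfer_seconds(100, mbps):.1f} s")
```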
The server should have at least two network cards if it will participate in replication or you plan to locate it in a network demilitarized zone (DMZ). In addition, each network card should be connected directly to a different network switch. This provides redundancy and greater speed to handle multiple requests. In addition, a separate "backbone" or dedicated connection for the servers isolates traffic from the rest of the network and provides the best level of performance for both.