5.3 CAS Latency

CAS latency is the delay, in clock cycles, between the time the processor requests data from memory and the time the memory makes the first piece of data available to be read. SDR-SDRAM modules may have a CAS latency of 1, 2, or 3. DDR-SDRAM modules have a CAS latency of 2 or 2.5. CAS latency is often abbreviated as CAS or CL. For example, a PC133 module may be labeled CAS2, CAS-2, CAS=2, CL2, CL-2, or CL=2, all of which mean that the module has a CAS latency of 2.

Current systems read memory in 32-bit chunks, comprising four 8-bit bytes. CAS latency specifies the number of clock cycles required before the first byte can be read. After that first byte is read, the remaining bytes are read without latency, in one clock cycle each. For example, CL3 memory delivers the first byte after three clock cycles and the other three bytes in one clock cycle each. This memory timing is designated 3-1-1-1 and indicates that six clock cycles (3+1+1+1) are needed to read all four bytes. CL2 memory uses a 2-1-1-1 memory timing, and therefore reads all four bytes in five clock cycles (2+1+1+1). Similarly, CL1 memory uses a 1-1-1-1 memory timing and requires only four clock cycles to complete the read.
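The timing notation above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this section; the function names `burst_timing` and `burst_cycles` are illustrative and do not come from any memory-vendor tool.

```python
def burst_timing(cas_latency, transfers=4):
    """Per-transfer cycle counts for one burst, e.g. [3, 1, 1, 1] for CL3.

    The first transfer costs the full CAS latency; each remaining
    transfer costs one clock cycle, as described in the text.
    """
    return [cas_latency] + [1] * (transfers - 1)


def burst_cycles(cas_latency, transfers=4):
    """Total clock cycles needed to read one complete burst."""
    return sum(burst_timing(cas_latency, transfers))


for cl in (3, 2, 1):
    timing = "-".join(str(c) for c in burst_timing(cl))
    print(f"CL{cl}: {timing} = {burst_cycles(cl)} clock cycles")
# CL3: 3-1-1-1 = 6 clock cycles
# CL2: 2-1-1-1 = 5 clock cycles
# CL1: 1-1-1-1 = 4 clock cycles
```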

On that basis, one might conclude that CL2 memory is 16.7% faster than CL3 memory and CL1 memory is 33.3% faster than CL3, which is a substantial difference. In fact, that differential holds only for single 32-bit reads, whereas most reads are streaming. During streaming reads, each 32-bit read after the first is performed without latency. As the number of streamed 32-bit reads per access increases, the relative significance of the CAS latency overhead incurred for the first byte diminishes.

For example, compare a streaming 32-byte read (eight sequential 32-bit reads) on CL3, CL2, and CL1 memory. With CL3 memory, the first 32-bit read requires six clock cycles. Each of the following seven 32-bit reads incurs no CAS latency penalty, and so requires only four clock cycles. The full 32-byte read therefore requires a total of 6 + (7 * 4) = 34 clock cycles. With CL2 memory, the first 32-bit read requires five clock cycles, and each of the following seven 32-bit reads again requires only four clock cycles, for a total of 5 + (7 * 4) = 33 clock cycles. With CL1 memory, all eight 32-bit reads require four clock cycles each, for a total of 8 * 4 = 32 clock cycles. In this (very realistic) example, CL2 memory is only 2.9% faster (1/34) than CL3 memory, and CL1 memory is only 5.9% faster (2/34) than CL3.
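The same comparison can be run for streams of any length. The sketch below follows the simplified model used in this section (only the first 32-bit read in a stream pays the CAS latency). The function names and the choice of stream lengths are illustrative assumptions.

```python
def streaming_cycles(cas_latency, reads=8, transfers=4):
    """Clock cycles for a stream of `reads` sequential 32-bit reads.

    Only the first read pays the CAS latency; every later read
    costs one cycle per transfer.
    """
    first_read = cas_latency + (transfers - 1)
    later_reads = (reads - 1) * transfers
    return first_read + later_reads


def saving_vs_cl3(cas_latency, reads=8):
    """Fraction of clock cycles saved relative to CL3 memory."""
    baseline = streaming_cycles(3, reads)
    return (baseline - streaming_cycles(cas_latency, reads)) / baseline


for reads in (1, 8, 64):
    cl2 = saving_vs_cl3(2, reads) * 100
    cl1 = saving_vs_cl3(1, reads) * 100
    print(f"{reads:2d} reads: CL2 saves {cl2:.1f}%, CL1 saves {cl1:.1f}%")
```

With a single read, the savings match the 16.7% and 33.3% figures given earlier. With eight reads they fall to 2.9% and 5.9%, and they keep shrinking as the stream gets longer.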

In practice, lower CAS latencies benefit highly random read operations but do little to help streaming (sequential) read operations. Typical PC workloads rely heavily on sequential reads, which means that you can expect only a minor improvement in memory performance if you use memory with a lower CAS latency rating. It's worth paying a bit more for memory with a lower CAS latency, but not for the reason you might expect. (See the last point in the following bulleted list.)

Keep these CL-related issues in mind:

  • Most motherboards can use memory of any CL timing, although some motherboards may not take advantage of the reduced latency. A few motherboards require memory with a specific CL timing. For example, a motherboard that requires CL2 PC133 may not work properly with CL3 PC133 memory, and a motherboard that requires CL3 PC133 memory may not work properly with CL2 PC133. This is a good reason to use the memory configurator utilities provided by Crucial and other memory makers, which take CAS latency issues into account when listing compatible memory modules.

  • Some motherboards allow mixing memory with different CL timings, although the faster memory almost always operates at the CAS latency of the slowest module installed. Other motherboards work properly with memory of any supported CL timing as long as every module installed has the same CL timing, but misbehave if you mix modules with different CL timings. We suspect these problems are caused by minor electrical differences such as capacitance, but have never gotten a good explanation of why they occur. Although problems with mixed CL timings are unusual in our experience, we recommend not mixing CL timings for this reason.

  • Most motherboards that support different CL timings automatically choose the optimal settings based on the information reported by the memory module itself, but some require setting memory timings manually in the Chipset Configuration section of BIOS Setup. If you install "fast" modules in a system, it's worth checking BIOS Setup to make sure the system is configured to use the faster CL timings.

  • Using conservative memory timings can increase the stability and reliability of a system at a minimal cost in performance. For example, if a system has PC133 CL2 memory installed and crashes too frequently, you can increase the stability of that system by configuring BIOS Setup to use CL3 memory timings. CL2 memory running as CL3 is much more stable than CL2 memory running as CL2, and probably more stable than CL3 memory running as CL3. The performance hit will be so small that you won't even notice it unless you run a memory benchmark program.
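To put rough numbers on the points about mixed modules and conservative timings, the sketch below assumes that a system runs all memory at the CAS latency of its slowest module, and it reuses the simplified streaming model from earlier in this section. The function names and the sample module list are hypothetical. Real motherboards may behave differently.

```python
def effective_cl(module_cls):
    """CAS latency a mixed set of modules is assumed to run at.

    Assumption: all modules operate at the slowest (highest) CL
    among those installed.
    """
    if not module_cls:
        raise ValueError("no modules installed")
    return max(module_cls)


def streaming_cycles(cas_latency, reads=8, transfers=4):
    """Cycles for `reads` sequential 32-bit reads (first read pays CL)."""
    return cas_latency + (transfers - 1) + (reads - 1) * transfers


installed = [2, 3, 2]  # hypothetical mix: two CL2 modules, one CL3
cl = effective_cl(installed)
extra = streaming_cycles(cl) - streaming_cycles(min(installed))
print(f"Effective timing: CL{cl}")
print(f"Extra cycles per 32-byte streaming read: {extra}")
```

Under these assumptions, running CL2 memory at CL3 costs one extra clock cycle per 32-byte streaming read (34 instead of 33), which is consistent with the small performance hit described above.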