2.3 Memory

2.3 Memory

A system's random access memory (RAM, or memory) is a temporary storage location used to store instructions and data. Instructions are the actual operations a processor executes. The data comes from a variety of sources. It may be data supplied by some peripheral, such as a hard disk or network controller. It may be intermediary results generated during program execution. Instructions and data are both required for the processor to compute a meaningful result. Hence, the processor constantly is issuing commands to load or store data from memory across the memory bus. Memory buses operate at rates between 100 MHz and 800 MHz. This bus is also referred to as the front side bus, or FSB.

Because of the constant usage of system RAM and the large gap between processor clock rate and memory bus rate, the memory bus is one of the largest impediments to achieving theoretical peak. Memory bus performance is measured in terms of two characteristics. The first is peak memory bandwidth, the burst rate that data can be copied between the DRAM chips in main memory and the CPU. The FSB must be fast enough to support this high burst rate. In the case of some proprietary systems, memory accesses are pipelined to improve aggregate memory bandwidth. In this case, data is bursted from multiple groups of DRAM chips. However, this technique is not used in PC systems. The second characteristic is memory latency, the amount of time it takes to move data between RAM and the CPU. RAM bandwidth ranges from one to four gigabytes per second. RAM latency has fallen to under 6 nanoseconds.

Except for very carefully designed applications, a program's entire dataset must reside in RAM. The alternative is to use disk storage either explicitly (out-of-core calculations) or implicitly (virtual memory swapping), but this usually entails a severe performance penalty. Thus, the size of a node's memory is important in parameter in system design. It determines the size of problem that can practically be run on the node. Engineering and scientific applications often obey a rule of thumb that says that for every floating-point operation per second, one byte of RAM is necessary. This is a gross approximation at best, and actual requirements can vary by many orders of magnitude, but it provides some guidance; for example, a 1 GHz processor capable of sustaining 200 Mflops should be equipped with approximately 200 MBytes of RAM.

Two main types of RAM are used in current commodity systems. SDRAM has been in use for several years. RDRAM is a newer standard used only in Pentium 4-based systems. RDRAM tends to be faster and more expensive.

Part III: Managing Clusters