5.8 Troubleshooting Memory Installation and Operation

Once installed and configured, memory seldom causes problems. When problems do occur, they may be as obvious as a failed RAM check at boot or as subtle as a few corrupted bits in a data file. The usual symptom of memory problems is a kernel panic under Linux or a blue-screen crash under Windows. Unfortunately, blue-screen crashes occur so often with Windows for other reasons that one is of little use as a diagnostic aid. When troubleshooting memory problems, always do the following:

  • Use standard antistatic precautions. Ground yourself before you touch a memory module.

  • Remove and reinstall all memory modules to ensure they are seated properly. While you're doing that, it's a good idea to clean the contacts on the memory module. Some people gently rub the contacts with a pencil eraser. We've done that ourselves, but memory manufacturers recommend against it because of possible damage to the contacts. Also, there is always the risk of a fragment from the eraser finding its way into the memory slot, where it can block one or more contacts. Better practice is to use a fresh dollar bill, which has just the right amount of abrasiveness to clean the contacts without damaging them.

    Although we have never used it, many people whom we respect recommend using Stabilant-22, a liquid contact enhancer. You'll probably keel over from sticker shock when you see the price of this stuff, but a drop or two is all that's needed, and a tiny tube lasts most people for years (http://www.stabilant.com/).

  • Before assuming memory is the problem, check all internal cables to ensure none is faulty or has come loose.

The next steps you should take depend on whether you have made any changes to memory recently.

5.8.1 ... When You Have Not Added Memory

If you suspect memory problems but have not added or reconfigured memory (or been inside the case), it's unlikely that the memory itself is causing the problem. Memory does simply die sometimes, and may be killed by electrical surges, but this is uncommon because the PC power supply itself does a good job of isolating memory and other system components from electrical damage. The most likely problem is a failing power supply. Try one or both of the following:

  • If you have another system, install the suspect memory in it. If it runs there without errors, the problem is almost certainly not the memory but the power supply of the original system.

  • If you have other memory, install it in the problem system. If it works, you can safely assume that the original memory is defective. More likely, it will also fail, which strongly indicates power supply problems.

If you have neither another system nor additional memory, and if your system has more than one bank of memory installed, use binary elimination to determine which modules are bad. For example, if you have two modules installed (one per bank), simply remove one module to see if that cures the problem. If you have four identical modules installed (one per bank), designate them A, B, C, and D. Install only A and B and restart the system. If no problems occur, A and B are known good and the problem must lie with C and/or D. Remove B and substitute C. If no problems occur with A and C, you know that C is good and D is bad. If the system fails with A and C, you know that C is bad, but you don't know whether D is bad. Substitute D for C and restart the system to determine if D is good. If instead the system fails with only A and B installed, at least one of them is bad; install only C and D, and if that pair runs cleanly, use C as the known-good module to test A and B in turn.
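The elimination procedure can be sketched in a few lines of Python. This is only an illustration, not a tool: boots_cleanly is a hypothetical stand-in for physically installing the listed modules, restarting, and noting whether the problem recurs, and find_bad_modules is a name invented here. The sketch assumes the system can run with any subset of modules (one module per bank) and that a bad module fails reliably. It also uses a plain halve-and-retest recursion rather than the exact A/B/C/D sequence described above, although the idea is the same.

```python
def find_bad_modules(modules, boots_cleanly):
    """Return the modules judged bad by repeatedly halving the suspect group.

    modules       -- list of module labels, e.g. ["A", "B", "C", "D"]
    boots_cleanly -- hypothetical callback: takes a list of installed modules
                     and returns True if the system runs without memory errors
    """
    if boots_cleanly(modules):
        # The whole group runs without errors, so every module in it is good.
        return []
    if len(modules) == 1:
        # A single module that fails on its own is the culprit.
        return list(modules)
    # The group fails: split it in half and test each half separately.
    mid = len(modules) // 2
    return (find_bad_modules(modules[:mid], boots_cleanly)
            + find_bad_modules(modules[mid:], boots_cleanly))


if __name__ == "__main__":
    # Simulated system for demonstration only: module D is defective.
    defective = {"D"}

    def simulated_boot(installed):
        # The simulated system fails whenever any defective module is installed.
        return not (set(installed) & defective)

    print(find_bad_modules(["A", "B", "C", "D"], simulated_boot))  # ['D']
```

In practice each call to boots_cleanly is a manual swap and restart, so intermittent faults may require running the system for a while before you count a test as passed.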

If you don't have enough banks populated to allow binary elimination, the best solution is to remove the modules, place them in a static-safe bag if possible (the pink plastic that most components arrive in), and take them to a local computer store that has a memory tester.

MS-DOS, Windows 3.X, and Windows 9X do not stress memory. If you install Windows NT/2000/XP or Linux, memory errors may appear on a PC that seemed stable. People therefore often assume that they did something while installing the new OS to cause the errors, but that is almost never the case. Such errors almost always indicate a real problem with physical memory. The memory was defective all along, but the more forgiving OS simply ignored the problem.

5.8.2 ... When You Are Adding Memory

If you experience problems when adding memory, note the following:

  • If a DIMM appears not to fit, there's good reason. SDR-SDRAM DIMMs have two notches whose placement specifies 3.3V versus 5V and buffered versus unbuffered. DDR-SDRAM DIMMs have a keying notch in a different location. If the DIMM notches don't match the socket protrusions, the DIMM is of the wrong type.

  • If the system displays a memory mismatch error the first time you restart, that usually indicates no real problem. Follow the prompts to enter Setup, select Save and Exit, and restart the system. The system should then recognize the new memory. Some systems require these extra steps to update CMOS.

  • Verify the modules are installed in the proper order. Unless the motherboard documentation says otherwise, fill banks sequentially from lowest number to highest. Generally, install the largest module in Bank 0, the next largest in Bank 1, and so on. A few systems require the smallest module be in Bank 0 and larger modules sequentially in higher banks.

  • If the system recognizes a newly installed module as half its actual size and that module has chips on both sides, the system may recognize only single-banked or single-sided modules. Some systems limit the total number of "sides" that are recognized, so if you have some existing smaller modules installed, try removing them. The system may then recognize the double-sided modules. If it doesn't, return those modules and replace them with single-sided modules.

  • A memory module may not be defective, but still be incompatible with your system. For example, many 486s treat three-chip and nine-chip SIMMs differently, although they should theoretically be interchangeable. Some 486s use only three-chip SIMMs or only nine-chip SIMMs. Others use either, but generate memory errors if you have both types installed.

  • A memory module may not be defective, but still be incompatible with your current configuration. For example, if you install a CAS3 PC133 DIMM in a 133 MHz FSB Pentium III motherboard that is configured to use CAS2 timing, the system will almost certainly generate memory errors.
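The arithmetic behind that last example is simple. A 133 MHz memory clock has a cycle time of about 7.5 ns, so a module rated for CAS3 needs roughly 22.5 ns to deliver data, while a CAS2 setting gives it only about 15 ns. The following Python sketch illustrates the comparison. The function names are made up for this example, and the model deliberately ignores every timing parameter other than CAS latency.

```python
def cas_latency_ns(cas_cycles, clock_mhz):
    """Convert a CAS latency in clock cycles to nanoseconds."""
    return cas_cycles * 1000.0 / clock_mhz


def cas_setting_is_safe(module_cas, configured_cas):
    """Return True if the configured CAS setting allows the module enough cycles.

    Simplified model (an assumption of this sketch): both values refer to the
    same memory clock, and CAS latency is the only timing considered.
    """
    return configured_cas >= module_cas


if __name__ == "__main__":
    clock_mhz = 133  # PC133 memory clock, as in the example above
    print(f"CAS3 module needs about {cas_latency_ns(3, clock_mhz):.1f} ns")
    print(f"CAS2 setting allows about {cas_latency_ns(2, clock_mhz):.1f} ns")
    print(cas_setting_is_safe(module_cas=3, configured_cas=2))  # False
```

Under this simplified model, either setting the motherboard to CAS3 timing or using a module rated for CAS2 at 133 MHz removes the mismatch.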