22.1 Prelude

There are three major rules for handling security breaches:

  1. Don't panic. No matter what has happened, you will only make things worse if you act without thinking.

  2. Document. Whether your goal is to get your system running again as soon as possible, or you want to collect evidence for a prosecution, you will be better off if you document what you do.

  3. Plan ahead. The key to effective response is advance planning. If you plan and practice your response to a security incident, you'll be better equipped to handle the incident when and if it ever happens.

22.1.1 Rule #1: Don't Panic

After a security breach, you are faced with many different choices. Should you shut down the computer, disconnect the network, or call the cops? No matter what has happened, you will only make things worse if you act without thinking.

Before acting, you need to answer certain questions and keep the answers firmly in mind:

  • Did you really have a breach of security? Something that appears to be the action of an intruder might actually be the result of human error or software failure.

  • Was any damage really done? With many security breaches, the perpetrator gains unauthorized access but doesn't actually access privileged information or maliciously change the contents of files.

  • Is it important to obtain and protect evidence that might be used in an investigation?

  • Is it important to get the system back into normal operation as soon as possible?

  • Are you willing to take the chance that files have been altered or removed? If not, how can you tell for sure if changes have been made?

  • Does it matter if anyone within the organization hears about this incident? What if somebody outside hears about it?

  • Is an insider suspected?

  • Do you know how the intruder got in?

  • Do you know how many systems are involved?

  • Can it happen again?

The answers to many of these questions may be contradictory; for example, protecting evidence and comparing files may not be possible if the goal is to get the system back into normal operation as soon as possible. You'll have to decide what's best for your own site.
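One way to answer the question of whether files have been altered or removed is to compare the current files against checksums recorded before the incident. The following is a minimal sketch of that idea, not a replacement for a dedicated integrity checker. The function names (file_digest, snapshot, compare) are hypothetical, and the sketch assumes that a baseline snapshot was taken while the system was known to be clean, that the baseline was stored where the intruder could not modify it, and that the Python interpreter itself is run from trusted media.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each regular file under root (by relative path) to its digest."""
    root = Path(root)
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            # Skip symbolic links and special files; hash regular files only.
            if full.is_file() and not full.is_symlink():
                result[str(full.relative_to(root))] = file_digest(full)
    return result


def compare(baseline, current):
    """Return (added, removed, changed) relative paths, each sorted."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changed = sorted(p for p in baseline.keys() & current.keys()
                     if baseline[p] != current[p])
    return added, removed, changed
```

In use, you would call snapshot() on a directory such as /etc while the system is trusted, keep the result offline, and after a suspected incident call snapshot() again and pass both results to compare(). A clean comparison shows only that the hashed contents match; it does not cover permissions, ownership, or a compromised kernel.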

22.1.2 Rule #2: Document

Start a paper log immediately. Take a notebook and write down everything you find, always noting the date and time. If you examine text files, print copies and then sign and date the hard copy. If you have the necessary disk space, record your entire session with the script command. Having this information on hand to study later may save you considerable time and aggravation, especially if you need to restore or change files quickly to bring the system back to normal.
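If you also want an electronic record alongside the paper log, the habit of noting the date and time with every entry can be sketched as a small append-only helper. This is an illustration only: the names format_entry and log_note are hypothetical, and the sketch assumes you write the log to removable or otherwise trusted media rather than to the disk of the compromised system, where writing could overwrite evidence.

```python
from datetime import datetime, timezone


def format_entry(note, when=None):
    """Return one log line: UTC ISO 8601 timestamp, a tab, then the note."""
    when = when or datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Collapse runs of whitespace and newlines so each entry stays on one line.
    return f"{stamp}\t{' '.join(note.split())}\n"


def log_note(path, note, when=None):
    """Append one entry to the log file; earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(format_entry(note, when))
```

Recording times in UTC avoids ambiguity if you later need to correlate your notes with logs from machines in other time zones.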

This chapter and the chapters that follow present a set of guidelines for handling security breaches. In the following sections, we describe the mechanisms you can use to help you detect a break-in, and address the question of what to do if you discover an intruder on your system. In Chapter 24 we'll describe denial of service attacks: ways in which attackers can make your system unusable without actually destroying any information. In Chapter 25 we'll discuss legal approaches and other issues you may need to consider after a security incident.

22.1.3 Rule #3: Plan Ahead

A key to effective response in an emergency is advance planning. When a security problem occurs, there are some standard steps to be taken. You should have these steps planned out in advance so there is little confusion or hesitation when an incident occurs.

In larger installations, you may want to practice your plans. For example, along with standard fire drills, you may want to have "virus drills" to practice coping with the threat of a virus, or "break-in drills" to practice techniques for preserving evidence and re-establishing normal operations.

The following basic steps should be at the heart of your plan:

Step 1: Identify and understand the problem

If you don't know what the problem is, you cannot take action against it. This rule does not mean that you need to have perfect understanding, but you should understand at least what form of problem you are dealing with. Cutting your organization's Internet connection won't help you if the problem is being caused by a vengeful employee with a laptop who is hiding out in a co-worker's office.

Step 2: Contain or stop the damage

If you've identified the problem, take immediate steps to halt or limit it. For instance, if you've identified the employee who is deleting system files, you should turn off his account, and probably take disciplinary action as well. Both are steps to limit the damage to your data and system.

Step 3: Confirm your diagnosis and determine the damage

After you've taken steps to contain the damage, confirm your diagnosis of the problem and determine the damage it caused. Are files still disappearing after the employee has been discharged? You may never be 100% sure if two or more incidents are actually related. Furthermore, you may not be able to identify all of the damage immediately, if ever.

Step 4: Preserve the evidence, if necessary

If you intend to prosecute or seek legal redress for your incident, you must make an effort to preserve necessary evidence before going further. Failure to preserve evidence does not prohibit you from calling the cops or filing a suit against the suspected perpetrator, but the lack of evidence may significantly decrease your chances for success. Be advised: preserving evidence can take time and is hard to do properly. For this reason, many organizations dealing with incidents forgo this step.

Step 5: Restore your system

After you know the extent of the damage, you need to restore the system and data to a consistent state. This may involve reloading portions of the system from backups, or it may mean a simple restart of the system. Before you proceed, be certain that all of the programs you are going to use are "safe." The attacker may have replaced your restore program with a Trojan horse that deletes both the files on your hard disk and on your backup tape!

Step 6: Deal with the cause

If the problem occurred because of some weakness in your security or operational measures, you should make changes and repairs after your system has been restored to a normal state. If the cause was a person making a mistake, you should probably educate her to avoid a second occurrence of the situation. If someone purposefully interfered with your operations, you may wish to involve law enforcement authorities.

Step 7: Perform related recovery

If what occurred was covered by insurance, you may need to file claims. Rumor control, and perhaps even community relations, will be required at the end of the incident to explain what happened, what breaches occurred, and what measures were taken to resolve the situation. This step is especially important with a large user community because unchecked rumors and fears can often damage your operations more than the problem itself.

Step 8: Postmortem

Once the heat has died down, review the incident and your handling of it. How could you and your team have handled the situation better? What effort was wasted? What wrong decisions were made? How could you have prevented it from happening in the first place?

In addition to having a plan of action, you can be prepared by creating a toolkit on read-only media (floppy, CD-ROM, etc.). This toolkit will give you a set of programs for incident response that you know are not compromised. Include programs that you will need to examine a compromised system, such as: awk, bash, cat, compress, cut, dd, des, df, du, file, find, grep, gzip, icat, ifconfig, last, ls, lsmod, lsof, md5sum, modinfo, more, netcat, netstat, nmap, paste, pcat, Perl, PGP, pkginfo, ps, rpm, rm, script, sed, strings, strace, tar, top, Tripwire, truss, uncompress, vi, and w. Don't forget shared libraries (or ensure that the programs are statically linked). Having a bootable live filesystem on your CD or DVD is useful as well.
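Before you burn the toolkit to read-only media, it is worth confirming that nothing on your list is missing. The following is a minimal sketch of such a check. The name missing_tools and the REQUIRED list are hypothetical (REQUIRED is only a subset of the programs named above), and the sketch assumes that all of the tools have been staged in a single directory on a POSIX system. It checks only that each file exists and is executable; it does not verify static linking or the presence of shared libraries.

```python
import os
from pathlib import Path

# A subset of the tools named above; extend it to match your own list.
REQUIRED = ["awk", "cat", "dd", "df", "du", "find", "grep",
            "ls", "lsof", "netstat", "ps", "strings", "tar"]


def missing_tools(toolkit_dir, required=REQUIRED):
    """Return the required tools that are absent or not executable."""
    base = Path(toolkit_dir)
    missing = []
    for name in required:
        candidate = base / name
        if not (candidate.is_file() and os.access(candidate, os.X_OK)):
            missing.append(name)
    return missing
```

An empty list from missing_tools() means every named program is present in the staging directory and carries execute permission.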
