Introduction to a VPN

A VPN is a technology that allows two or more locations to communicate securely over a public network while maintaining the security and privacy of a private network. Encryption, authentication, and packet integrity checks are key enablers of VPNs; they ensure that the data is private and the integrity of the data is maintained. The main reason for using VPNs is that private networks are expensive to acquire and maintain. Companies are finding VPNs a cost-effective way to connect sites to one another.

Given that firewalls are designed to be the gatekeepers between public and private networks, it makes sense to integrate VPN capabilities into a firewall. You can not only filter traffic but also subject VPN usage to a security policy that is integrated with your existing one. In FireWall-1, there is no separate place to create a VPN rulebase; it is simply part of your existing security policy. This makes it really easy to implement a VPN in FireWall-1.

Concepts

Many generic concepts are used throughout this chapter and are briefly defined in the following subsections. Because this chapter is not meant to comprehensively cover encryption, these descriptions may seem inadequate. For a more detailed description of these concepts, refer to Bruce Schneier's Applied Cryptography [1996].

Cryptography and Encryption

Cryptography is the art and science of keeping messages secure. Encryption is the actual process of transforming data into a nearly random form that can be reversed only in specific circumstances. Encryption ensures privacy by keeping information hidden from anyone for whom it is not intended, even those who have access to the encrypted data. Without encryption, a VPN would not be possible.

Encryption Keys

An encryption key is used to either encrypt data, decrypt data, or both. The kind of key used depends on the type of encryption algorithm used. The number of bits in this key plays a role in determining how strong the encryption is, though the encryption algorithm and the implementation thereof arguably play a bigger role. The fewer the bits, the easier it is to guess the encryption key by brute force (i.e., by simply trying each possible key).
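The effect of key length on a brute-force search can be worked out with simple arithmetic. The following is a minimal sketch; the function names and the guess rate of one billion keys per second are assumptions chosen for illustration, not measurements of any real attacker.

```python
def keyspace_size(bits: int) -> int:
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** bits


def years_to_exhaust(bits: int, guesses_per_second: float) -> float:
    """Worst-case years needed to try every key at an assumed guess rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return keyspace_size(bits) / guesses_per_second / seconds_per_year


if __name__ == "__main__":
    rate = 1e9  # hypothetical attacker speed: one billion guesses per second
    for bits in (40, 56, 128):
        print(f"{bits}-bit key: {keyspace_size(bits):.3e} keys, "
              f"{years_to_exhaust(bits, rate):.3e} years worst case")
```

Each additional bit doubles the number of keys to try, which is why a modest increase in key length makes brute force dramatically harder.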

Entropy: the Random Pool

Computers are deterministic: They do exactly what you tell them to do. This makes it difficult to have computers do something randomly, though personal experience might suggest otherwise. Unfortunately, there is no really good way to have a computer do something like, for instance, come up with a random encryption key. If the encryption keys aren't random or are relatively easy to work out, it doesn't matter what your encryption scheme is: someone's going to be able to read your encrypted message.

Pseudo-random number generators used in cryptography need to have two properties to be cryptographically secure.

The output they generate looks random. This means the output can pass all statistical tests of randomness that we can find.
The output must be unpredictable. This means it must be computationally infeasible to predict what the next random number will be, given the knowledge of the algorithm in use and all previously generated numbers.

One way that pseudo-random number generators are able to meet these criteria is to employ methods of obtaining entropy, or randomness from various sources. These random sources must be relatively difficult to reproduce and/or influence in a particular manner. Recall that when you first installed FireWall-1, it asked you to enter a bunch of random text. FireWall-1 uses the timing between keystrokes and the keys entered as a way to gather some initial entropy to use for the encryption processes. Other programs or operating systems may use certain kinds of network traffic, messages in system logs, or various counters from the system to gather entropy.
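The idea of gathering entropy from keystrokes can be sketched as follows. This is not how FireWall-1 implements it; mix_keystrokes and the sample events are hypothetical, and the sketch only shows that both the keys pressed and the delays between them can be folded into a seed. For real key material, the operating system's own entropy pool (reached here through Python's secrets module) is the safer choice.

```python
import hashlib
import secrets


def mix_keystrokes(events):
    """Fold (character, inter-key delay in seconds) pairs into a 32-byte seed.

    Hypothetical helper for illustration only.
    """
    pool = hashlib.sha256()
    for char, delay in events:
        pool.update(char.encode("utf-8"))         # which key was pressed
        pool.update(repr(delay).encode("ascii"))  # how long the user paused
    return pool.digest()


if __name__ == "__main__":
    sample = [("q", 0.131), ("7", 0.402), ("z", 0.087)]  # made-up input
    print(mix_keystrokes(sample).hex())
    # In practice, ask the operating system's entropy pool directly.
    print(secrets.token_bytes(16).hex())
```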

Symmetric Encryption

Symmetric encryption uses the same key for encrypting and decrypting data. The encryption key needs to be kept secret and should be exchanged via some sort of secure mechanism. Symmetric encryption is generally used for bulk data encryption because it is faster than other methods. Examples of symmetric encryption include the Data Encryption Standard (DES), Blowfish, the recently approved Advanced Encryption Standard (AES), and FWZ.
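The defining property of symmetric encryption, one key for both directions, can be shown with a toy cipher. The sketch below is not DES, Blowfish, AES, or FWZ, and it is not secure; it simply XORs the data with a keystream derived from the key using SHA-256. The names keystream and xor_cipher are made up for this example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


if __name__ == "__main__":
    key = b"shared secret"  # both sides must already hold this key
    ciphertext = xor_cipher(key, b"bulk data to protect")
    print(ciphertext.hex())
    print(xor_cipher(key, ciphertext))  # the same key reverses the operation
```

Because the same key both encrypts and decrypts, anyone who learns it can read the traffic, which is why the key exchange itself needs a secure mechanism.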

Asymmetric Encryption

Asymmetric encryption uses different keys for encrypting and decrypting data. Asymmetric encryption schemes are approximately 1,000 times slower than symmetric encryption schemes on similar hardware and are used only to exchange small amounts of data (e.g., encryption keys for a symmetric algorithm).

All public-key cryptography systems, such as RSA,^[1] are asymmetric algorithms. In these systems, each node has a public key (which is widely distributed) and a private key (which is kept secret). A node's public and private keys have a peculiar property in that they effectively cancel out each other's effects: what one key encrypts, only the other can decrypt. This allows you to encrypt and decrypt messages using your own keys in combination with another person's keys.

^[1] RSA is an encryption and authentication system that uses an algorithm developed in 1977. It is named for its authors: Ron Rivest (the R), Adi Shamir (the S), and Leonard Adleman (the A).

Consider the following example: The source node can encrypt a message using the destination node's public key, and only the destination node can decrypt it. More importantly, a source node can encrypt the message with its private key and the destination node's public key. The destination node can decrypt it using its own private key and the source node's public key. Aside from encrypting the message, the process provides verification that the source node was the correct node, another important function.
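The exchange just described can be traced with textbook RSA and deliberately tiny primes. This is a sketch for following the arithmetic only: the primes, the exponent 17, and the helper names are assumptions, real keys are hundreds of digits long, and real systems add padding. It needs Python 3.8 or later for the modular inverse form of pow().

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)      # private exponent: inverse of e modulo phi
    return (e, n), (d, n)    # (public key, private key)


def apply_key(key, value: int) -> int:
    """Raise value to the key's exponent modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


src_public, src_private = make_keypair(61, 53)   # source node, n = 3233
dst_public, dst_private = make_keypair(89, 97)   # destination node, n = 8633

message = 65                                      # must be smaller than 3233
signed = apply_key(src_private, message)          # proves who sent it
ciphertext = apply_key(dst_public, signed)        # hides it from everyone else

# The destination reverses both steps with the opposite keys.
recovered_signed = apply_key(dst_private, ciphertext)
recovered = apply_key(src_public, recovered_signed)
print(recovered == message)
```

The source's modulus is chosen smaller than the destination's so that the signed value always fits inside the second operation.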

Hash Functions

A hash function is a one-way function that takes a variable-length input and converts it to a fixed-length string. One-way functions have the unique property of being difficult to reverse, meaning that, given the function and its output, it is computationally infeasible to determine what value(s) was originally plugged into the function to give that output. One-way functions suitable for cryptographic use need to be collision-free; that is, it must be hard to create any two inputs that generate the same output.

Although they do not encrypt data per se, the hash functions themselves are based in cryptography and serve a very important purpose in the encryption process: validation. When data is passed through a hash function, the result is a sort of "checksum." Because it is highly unlikely that any two inputs to a hash function will give the same result, a recipient who recomputes the hash and gets the same value can be reasonably confident that the data has not been tampered with.
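Both properties, fixed-length output and sensitivity to any change in the input, are easy to observe with Python's hashlib module. SHA-256 is used here as an assumption of convenience; the helper name digest is made up for this sketch.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    # Inputs of very different lengths produce outputs of the same length.
    print(len(digest(b"")), len(digest(b"x" * 10000)))
    # A one-character change in the input produces an unrelated digest.
    print(digest(b"pay 100 dollars"))
    print(digest(b"pay 900 dollars"))
```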

Within the context of a VPN, the sender encrypts a packet, then passes it through a hash function, the result of which is encrypted with an asymmetric algorithm such as RSA. The encrypted packet and the encrypted hash are forwarded to the recipient. Prior to decrypting, the recipient computes a hash on the encrypted packet, then decrypts the received hash and verifies that the computed and received hashes are identical. If the two hashes match, the recipient can be reasonably confident that the packet was not altered in transit and that it came from the claimed sender.
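The send-and-verify sequence can be sketched with a hash and a toy RSA key. Everything specific here is an assumption made for illustration: SHA-256 as the hash, the two Mersenne primes used to build the key (far too small and too structured for real use), the lack of padding, and the function names. The packet is treated as bytes that have already been encrypted.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes (illustration only).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def packet_hash(packet: bytes) -> int:
    """Hash the packet and reduce it to a number below the modulus."""
    return int.from_bytes(hashlib.sha256(packet).digest(), "big") % N


def sign(packet: bytes) -> int:
    """Sender: encrypt the packet's hash with the private key."""
    return pow(packet_hash(packet), D, N)


def verify(packet: bytes, signature: int) -> bool:
    """Recipient: decrypt the received hash and compare with a fresh one."""
    return pow(signature, E, N) == packet_hash(packet)


if __name__ == "__main__":
    encrypted_packet = b"\x8f\x02 opaque ciphertext bytes"  # made-up payload
    signature = sign(encrypted_packet)
    print(verify(encrypted_packet, signature))         # untouched packet
    print(verify(encrypted_packet + b"!", signature))  # altered in transit
```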

Examples of hash functions include MD5 and SHA-1.

Fingerprints

Public keys are run through a hash function to generate a result called the fingerprint. This is used to verify that you are working with the correct key. For example, I placed the fingerprint to my Pretty Good Privacy (PGP) key in the preface of this book. If you find what appears to be my PGP key on the Internet, you can download it and ask your PGP application to display this key's fingerprint. If PGP shows the same fingerprint as printed in the preface, you most likely have my correct PGP key because it is unlikely that two different PGP public keys will yield the same PGP fingerprint.

To use an example more relevant to FireWall-1, during installation of the management station, you are shown a fingerprint. The first time you connect to the management station with SmartDashboard/Policy Editor, the management station sends the public key. SmartDashboard/Policy Editor runs this key through the same hash function and displays a fingerprint for you to verify. The fingerprint shown during the initial installation should match what is shown in SmartDashboard/Policy Editor.
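A fingerprint check amounts to hashing the key bytes and comparing short, readable strings. The sketch below does not reproduce the PGP or FireWall-1 fingerprint formats; the hash choice, the grouping into blocks of four characters, and the sample key bytes are all assumptions.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Hash the key and format the digest in groups of four hex characters."""
    hexdigest = hashlib.sha256(public_key).hexdigest().upper()
    return " ".join(hexdigest[i:i + 4] for i in range(0, len(hexdigest), 4))


if __name__ == "__main__":
    published_key = b"-----hypothetical public key bytes-----"
    downloaded_key = b"-----hypothetical public key bytes-----"
    print(fingerprint(published_key))
    # Matching fingerprints suggest that the downloaded key is the right one.
    print(fingerprint(downloaded_key) == fingerprint(published_key))
```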

Certificate Authorities

A certificate authority (CA) is a trusted third party that certifies public keys. The CA has its own public and private keys. After taking prudent steps to verify the authenticity of a node's public key, the CA signs that key by encrypting it with the CA's private key. The signed key is then widely distributed. A node can verify that it has the correct public key by decrypting the signed key with the CA's public key.

A firewall management console has its own CA, the Internal Certificate Authority (ICA). Certificates are used for authentication between managed modules. They can also be generated for firewalls to use with Internet Key Exchange (IKE) encryption as well as to identify end users. FireWall-1 also supports third-party CAs.

Diffie-Hellman Keys

Diffie-Hellman (DH) keys are essentially public- and private-key pairs used in an asymmetric encryption algorithm. To verify that the keys have not been tampered with, they are typically signed by a CA key.
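What a pair of DH keys makes possible is that two gateways can each combine their own private value with the other's public value and arrive at the same shared secret. The sketch below uses toy numbers (a prime of 23 and a generator of 5) so the arithmetic can be followed by hand; real exchanges use primes that are hundreds of digits long, and the gateway names are hypothetical.

```python
P = 23  # public prime modulus (toy value)
G = 5   # public generator (toy value)


def public_value(private: int) -> int:
    """Compute the public half of a DH key pair from the private half."""
    return pow(G, private, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Combine the peer's public value with our own private value."""
    return pow(their_public, my_private, P)


gateway_a_private, gateway_b_private = 4, 3         # kept secret by each side
gateway_a_public = public_value(gateway_a_private)  # sent in the clear
gateway_b_public = public_value(gateway_b_private)  # sent in the clear

secret_at_a = shared_secret(gateway_b_public, gateway_a_private)
secret_at_b = shared_secret(gateway_a_public, gateway_b_private)
print(secret_at_a == secret_at_b)
```

Nothing in the exchange proves who sent each public value, which is why the keys are typically signed.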

The Encryption Domain

The encryption domain is a concept that is not entirely unique to FireWall-1, but the term is. Generally speaking, it contains everything on the private side of the network (i.e., all hosts behind the gateway in question). Note that this does not mean that every host in the encryption domain is allowed to communicate through a VPN. This is controlled by the defined security policy. The encryption domain just defines the potential for encryption. You must include all translated IP addresses for internal hosts. In general, the firewall is not part of the encryption domain but can be. It should be if the firewall is being used as the hide address for internal hosts.
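Conceptually, an encryption domain is a set of networks against which an address is tested. The following sketch is not FireWall-1 configuration; the networks listed are hypothetical, and the second entry stands in for translated addresses that must be included alongside the real internal ones.

```python
import ipaddress

# Hypothetical encryption domain for one gateway.
ENCRYPTION_DOMAIN = [
    ipaddress.ip_network("10.1.0.0/16"),    # internal hosts behind the gateway
    ipaddress.ip_network("192.0.2.16/28"),  # translated addresses for those hosts
]


def in_encryption_domain(host: str) -> bool:
    """Return True if the host address falls inside any listed network."""
    address = ipaddress.ip_address(host)
    return any(address in network for network in ENCRYPTION_DOMAIN)


if __name__ == "__main__":
    for host in ("10.1.2.3", "192.0.2.20", "198.51.100.7"):
        print(host, in_encryption_domain(host))
```

Membership only establishes that traffic for a host could be encrypted; whether it actually is remains a decision of the security policy.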