7.17 Representing Keys and Certificates in Plaintext (PEM Encoding)

7.17.1 Problem

You want to represent cryptographic data such as public keys or certificates in a plaintext format, so that you can use it in protocols that don't accept arbitrary binary data. This may include storing an encrypted version of a private key.

7.17.2 Solution

The PEM format represents DER-encoded data in a printable format. Traditionally, PEM encoding simply base64-encodes DER-encoded data and adds a simple header and footer. OpenSSL provides an API for such functionality that handles the DER encoding and header writing for you.

OpenSSL has introduced extensions for using encrypted DER representations, allowing you to use PEM to store encrypted private keys and other cryptographic data in ASCII format.

7.17.3 Discussion

Privacy Enhanced Mail (PEM) is the original encrypted email standard. Although the standard is long dead, a small subset of its encoding mechanism has managed to survive.

In today's day and age, PEM-encoded data is usually just DER-encoded data with a header and footer. The header is a single line consisting of five dashes followed by the word "BEGIN", followed by anything. The data following the word "BEGIN" is not really standardized. In some cases, there might not be anything following this word. However, if you are using the OpenSSL PEM outputting routines, there is a textual description of the type of data object encoded. For example, OpenSSL produces the following header line for an RSA private key:

-----BEGIN RSA PRIVATE KEY-----

This is a good convention, and one that is widely used.

The footer has the same format, except that "BEGIN" is replaced with "END". You should expect that anything could follow. Again, OpenSSL uses a textual description of the content.

In between the two lines is a base64-encoded DER representation, which may contain line breaks (\r\n, often called CRLFs for "carriage return and line feed"), which get ignored. We cover base64 in Recipe 4.5 and Recipe 4.6, and DER encoding in Recipe 7.16.

If you want to encrypt a DER object, the original PEM format supported that as well, but no one uses these extensions today. OpenSSL does implement something similar. First, we'll describe what OpenSSL does, because this will offer compatibility with applications built with OpenSSL that use this format?most notably Apache with mod_ssl. Next, we'll demonstrate how to use OpenSSL's PEM API directly.

We'll explain this format by walking through an example. Here's a PEM-encoded, encrypted RSA private key:

-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,F2D4E6438DBD4EA8
   
LjKQ2r1Yt9foxbHdLKZeClqZuzN7PoEmy+b+dKq9qibaH4pRcwATuWt4/Jzl6y85
NHM6CM4bOV1MHkyD01tFsT4kJ0GwRPg4tKAiTNjE4Yrz9V3rESiQKridtXMOToEp
Mj2nSvVKRSNEeG33GNIYUeMfSSc3oTmZVOlHNp9f8LEYWNmIjfzlHExvgJaPrixX
QiPGJ6K05kV5FJWRPET9vI+kyouAm6DBcyAhmR80NYRvaBbXGM/MxBgQ7koFVaI5
zoJ/NBdEIMdHNUh0h11GQCXAQXOSL6Fx2hRdcicm6j1CPd3AFrTt9EATmd4Hj+D4
91jDYXElALfdSbiO0A9Mz6USUepTXwlfVV/cbBpLRz5Rqnyg2EwI2tZRU+E+Cusb
/b6hcuWyzva895YMUCSyDaLgSsIqRWmXxQV1W2bAgRbs8jD8VF+G9w=  =
-----END RSA PRIVATE KEY-----

The first line is as discussed at the beginning of this section. Table 7-4 lists the most useful values for the data type specified in the first and last line. Other values can be found in openssl/pem.h.

Table 7-4. PEM header types

Name

Comments

RSA PUBLIC KEY

?

RSA PRIVATE KEY

?

DSA PUBLIC KEY

?

DSA PRIVATE KEY

?

DH PARAMETERS

Parameters for Diffie-Hellman key exchange

CERTIFICATE

An X.509 digital certificate

TRUSTED CERTIFICATE

A fully trusted X.509 digital certificate

CERTIFICATE REQUEST

A PKCS #10 certificate signing request

X509 CRL

An X.509 certificate revocation list

SSL SESSION PARAMETERS

?

The header line is followed by three lines that look like MIME headers. Do not treat them as MIME headers, though. Yes, the base64-encrypted text is separated from the header information by a line with nothing on it (two CRLFs). However, you should assume that there is no real flexibility in the headers. You should have either the two headers that are there, or nothing (and if you're not including headers, be sure to remove the blank line). In addition, the headers should be in the order shown above, and they should have the same comma-separated fields.

As far as we can determine, the second line must appear exactly as shown above for OpenSSL compatibility. There's some logic in OpenSSL to handle two other options that would add an integrity-checking value to the data being encoded, but it appears that the OpenSSL team never actually finished a full implementation, so these other options aren't used (it's left over from a time when the OpenSSL implementers were concerned about compliance with the original PEM RFCs). The first parameter on the "DEK-Info" line (where DEK stands for "data encrypting key") contains an ASCII representation of the algorithm used for encryption, which should always be a CBC-based mode. Table 7-5 lists the identifiers OpenSSL currently supports.

Table 7-5. PEM encryption algorithms supported by OpenSSL

Cipher

String

AES with 128-bit keys

AES-128-CBC

AES with 192-bit keys

AES-192-CBC

AES with 256-bit keys

AES-256-CBC

Blowfish

BF-CBC

CAST5

CAST-CBC

DES

DES-CBC

DESX

DESX

2-key Triple-DES

DES-EDE-CBC

3-key Triple-DES

DES-EDE3-CBC

IDEA

IDEA-CBC

RC2

RC2-CBC

RC5 with 128-bit keys and 12 rounds

RC5-CBC

The part of the DEK-Info field after the comma is a CBC initialization vector (which should be randomly generated), represented in uppercase hexadecimal.

The way encrypted PEM representations work in OpenSSL is as follows:

  1. The data is DER-encoded.

  2. The data is encrypted using a key that isn't specified anywhere (i.e., it's not placed in the headers, for obvious reasons). Usually, the user must type in a password to derive an encryption key. (See Recipe 4.10.[7]) The key-from-password functionality has the initialization vector double as a salt value, which is probably okay.

    [7] OpenSSL uses PKCS #5 Version 1.5 for key derivation. PKCS #5 is an earlier version of the algorithm described in Recipe 4.10. MD5 is used as the hash algorithm with an iteration count of 1. There are some differences between PKCS #5 Version 1.5 and Version 2.0. If you don't care about OpenSSL compatibility, you should definitely use Version 2.0 (the man pages even recommend it).

  3. The encrypted data is base64-encoded.

The OpenSSL API for PEM encoding and decoding (include openssl/pem.h) only allows you to operate on FILE or OpenSSL BIO objects, which are the generic OpenSSL IO abstraction. If you need to output to memory, you can either use a memory BIO or get the DER representation and encode it by hand.

The BIO API and the FILE API are similar. The BIO API changes the name of each function in a predictable way, and the first argument to each function is a pointer to a BIO object instead of a FILE object. The object type on which you're operating is always the second argument to a PEM function when outputting PEM. When reading in data, pass in a pointer to a pointer to the encoded object. As with the DER functions described in Recipe 7.16, OpenSSL increments this pointer.

All of the PEM functions are highly regular. All the input functions and all the output functions take the same arguments and have the same signature, except that the second argument changes type based on the type of data object with which you're working. For example, the second argument to PEM_write_RSAPrivateKey( ) will be an RSA object pointer, whereas the second argument to PEM_writeDSAPrivateKey( ) will be a DSA object pointer.

We'll show you the API by demonstrating how to operate on RSA private keys. Then we'll provide a table that gives you the relevant functions for other data types.

Here's the signature for PEM_write_RSAPrivateKey( ):

int PEM_write_RSAPrivateKey(FILE *fp, RSA *obj, EVP_CIPHER *enc,
                           unsigned char *kstr, int klen,
                           pem_password_cb callback, void *cb_arg);

This function has the following arguments:

fp

Pointer to the open file for output.

obj

RSA object that is to be PEM-encoded.

enc

Optional argument that, if not specified as NULL, is the EVP_CIPHER object for the symmetric encryption algorithm (see Recipe 5.17 for a list of possibilities) that will be used to encrypt the data before it is base64-encoded. It is a bad idea to use anything other than a CBC-based cipher.

kstr

Buffer containing the key to be used to encrypt the data. If the data is not encrypted, this argument should be specified as NULL. Even if the data is to be encrypted, this buffer may be specified as NULL, in which case the key to use will be derived from a password or passphrase.

klen

If the key buffer is not specified as NULL, this specifies the length of the buffer in bytes. If the key buffer is specified as NULL, this should be specified as 0.

callback

If the data is to be encrypted and the key buffer is specified as NULL, this specifies a pointer to a function that will be called to obtain the password or passphrase used to derive the encryption key. It may be specified as NULL, in which case OpenSSL will query the user for the password or passphrase to use.

cb_arg

If a callback function is specified to obtain the password or passphrase for key derivation, this application-specific value is passed directly to the callback function.

If encryption is desired, OpenSSL will use PKCS #5 Version 1.5 to derive an encryption key from a password. This is an earlier version of the algorithm described in Recipe 4.10.

This function will return 1 if the encoding is successful, 0 otherwise (for example, if the underlying file is not open for writing).

The type pem_password_cb is defined as follows:

typedef int (*pem_password_cb)(char *buf, int len, int rwflag, void *cb_arg);

It has the following arguments:

buf

Buffer into which the password or passphrase is to be written.

len

Length in bytes of the password or passphrase buffer.

rwflag

Indicates whether the password is to be used for encryption or decryption. For encryption (when writing out data in PEM format), the argument will be 1; otherwise, it will be 0.

cb_arg

This application-specific value is passed in from the final argument to the PEM encoding or decoding function that caused this callback to be made.

Make sure that you do not overflow buf when writing data into it!

Your callback function is expected to return 1 if it successfully reads a password; otherwise, it should return 0.

The function for writing an RSA private key to a BIO object has the following signature, which is essentially the same as the function for writing an RSA private key to a FILE object. The only difference is that the first argument is the BIO object to write to instead of a FILE object.

int PEM_write_bio_RSAPrivateKey(BIO *bio, RSA *obj, EVP_CIPHER *enc,
                               unsigned char *kstr, int klen,
                               pem_password_cb callback, void *cbarg);

Table 7-6 lists the FILE object-based functions for the most useful PEM-encoding variants.[8] The BIO object-based functions can be derived by adding _bio_ after read or write.

[8] The remainder can be found by looking for uses of the IMPLEMENT_PEM_rw macro in the OpenSSL crypto/pem source directory.

Table 7-6. FILE object-based functions for PEM encoding

Kind of object

Object type

FILE object-based encoding function

FILE object-based decoding function

RSA public key

RSA

PEM_write_RSAPublicKey()

PEM_read_RSAPublicKey()

RSA private key

RSA

PEM_write_RSAPrivateKey()

PEM_read_RSAPrivateKey()

Diffie-Hellman parameters

DH

PEM_write_DHparams()

PEM_read_DHparams()

DSA parameters

DSA

PEM_write_DSAparams()

PEM_read_DSAparams()

DSA public key

DSA

PEM_write_DSA_PUBKEY()

PEM_read_DSA_PUBKEY()

DSA private key

DSA

PEM_write_DSAPrivateKey()

PEM_read_DSAPrivateKey()

X.509 certificate

X509

PEM_write_X509()

PEM_read_X509()

X.509 CRL

X509_CRL

PEM_write_X509_CRL()

PEM_read_X509_CRL()

PKCS #10 certificate signing request

X509_REQ

PEM_write_X509_REQ()

PEM_read_X509_REQ()

PKCS #7 container

PKCS7

PEM_write_PKCS7()

PEM_read_PKCS7()

The last two rows enumerate calls that are intended for people implementing actual infrastructure for a PKI, and they will not generally be of interest to the average developer applying cryptography.[9]

[9] PKCS #7 can be used to store multiple certificates in one data object, however, which may be appealing to some, instead of DER-encoding multiple X.509 objects separately.

7.17.4 See Also

Recipe 4.5, Recipe 4.6, Recipe 4.10, Recipe 5.17,Recipe 7.16