11.5 Using an Application-Level Generator

11.5.1 Problem

You are in an environment where you do not have access to a built-in, cryptographically strong pseudo-random number generator. You have obtained enough entropy to seed a pseudo-random generator, but you lack a generator.

11.5.2 Solution

For general-purpose use, we recommend a pseudo-random number generator based on the AES encryption algorithm run in counter (CTR) mode (see Recipe 5.9). This generator has the best theoretical security assurance, assuming that the underlying cryptographic primitive is secure. If you would prefer a generator based on a hash function, you can run HMAC-SHA1 (see Recipe 6.10) in counter mode.

In addition, the keystream of a secure stream cipher can be used as a pseudo-random number generator.

11.5.3 Discussion

Stream ciphers are actually cryptographic pseudo-random number generators. One major practical differentiator between the two terms is whether you are using the output of the generator to perform encryption. If you are, it is a stream cipher; otherwise, it is a cryptographic pseudo-random number generator.

Another difference is that, when you are using a stream cipher to encrypt data, you need to be able to reproduce the same stream of output to decrypt the encrypted data. With a cryptographic PRNG, there is generally no need to be able to reproduce a data stream. Therefore, the generator can be reseeded at any time to help protect against internal state guessing attacks, which is analogous to rekeying a stream cipher.

The primary concern with a good cryptographic PRNG at the application level is internal state compromise, which would allow an attacker to predict its output. As long as the cryptographic algorithms used by a PRNG are not broken and the generator is not used to produce more output than it is designed to support, state compromise is generally not feasible by simply looking at the generator's output. The number of outputs supported by a generator varies based on the best attacks possible for whatever cryptographic algorithms are in use.

The risk of state compromise is generally not a big deal when dealing with something like /dev/random, where the generator is in the kernel. The only way to compromise the state is to be inside the kernel. If that's possible, there are much bigger problems than /dev/urandom or CryptGenRandom( ) producing data that an attacker can guess.

In the user space, state compromise may be more of an issue, though. You need to work through the threats about which you are worried. Threats are likely to come only from code on the local machine, but what code? Are you worried about malicious applications running with the same permissions being able to somehow peer inside the current process to get the internal state? If so, perhaps you should have a separate process that only provides entropy and runs with a set of permissions where only itself and the superuser would be a concern (this is the recommended approach for using the EGADS package discussed in Recipe 11.8).

If state compromise is a potential issue, you might have to worry about more than an attacker guessing future outputs. You might also have to worry about an attacker backtracking, which means compromising previous outputs the generator made. Reseeding the generator periodically, as discussed in Recipe 11.6, can solve this problem. At best, an attacker should only be able to backtrack to the last reseeding (you can reseed without new entropy to mix in).

In practice, few people should have to worry very much about state compromise of their cryptographic PRNG. As was the case at the operating system level, if such attacks are a realistic threat, you will usually have far bigger threats, and mitigating those threats will help mitigate this one as well.

There is a lot that can go wrong when using a pseudo-random number generator. Coming up with a good construct turns out to be the easy part. Here are some things you should closely consider:

  • Pseudo-random number generators need to be seeded with an adequate amount of entropy; otherwise, they are still potentially predictable. We recommend at least 80 bits. See the various recipes elsewhere in this chapter for information on collecting entropy.

  • Be careful to pay attention to the maximum number of outputs a generator can produce before it will need to be reseeded with new entropy. At some point, generators start to leak information and will generally fall into a cycle. Note, though, that for the configurations we present, you will probably never need to worry about the limit in practice. For example, the generator based on AES-128 leaks a bit of information after 264 16-byte blocks of output, and cycles after 2128 such blocks.

  • When adding entropy to a system, it is best to collect a lot of entropy and seed all at once, instead of seeding a little bit at a time. We will illustrate why by example. Suppose that you seed a generator with one bit of entropy. An attacker has only one bit to guess, which can be done accurately after two outputs. If the attacker completely compromises the state after two outputs, and we then add another bit of entropy, he can once again guess the state easily. If we add one bit 128 times, there is still very little security overall if the generator state is compromised. However, if you add 128 bits of entropy to the generator all at once, an attack should essentially be infeasible.

  • If an attacker can somehow compromise the internal state of a pseudo-random number generator, then it might be possible to launch a backtracking attack, where old generator outputs can be recovered. Such attacks are easy to thwart; see Recipe 11.6.

In the following three subsections, we will look at three different techniques for pseudo-random number generators: using a block cipher such as AES, using a stream cipher directly, and using a cryptographic hash function such as SHA1.

11.5.3.1 Using generators based on block ciphers

If you are in an environment where you have use of a good block cipher such as AES, you have the makings of a cryptographically strong pseudo-random number generator. Many of the encryption modes for turning a block cipher into a stream cipher are useful for this task, but CTR mode has the nicest properties. Essentially, you create random outputs one block at a time by encrypting a counter that is incremented after every encryption operation.

The seed should be at least as large as the key size of the cipher, because it will be used to key a block cipher. In addition, it is useful to have additional seed data that sets the first plaintext (counter) value.

Our implementation is based on the code in Recipe 5.5 and has two exported routines. The first initializes a random number generator:

void spc_bcprng_init(SPC_BCPRNG_CTX *prng, unsigned char *key, int kl, 
                     unsigned char *x, int xl);

This function has the following arguments:

prng

Pointer to a context object that holds the state for a block cipher-based PRNG. The caller may allocate the context object either dynamically or statically; this function will initialize it.

key

Buffer that should contain entropic data. This data is used to key the block cipher, and it is the required portion of the seed to the generator.

kl

Length of the key buffer in bytes; must be a valid value for the algorithm in use.

x

Buffer that may contain extra seed data, which we recommend you use if you have available entropy. If the specified size of this buffer is zero, this argument will be ignored. Note that if the buffer is larger than SPC_BLOCK_LEN (see Recipe 5.5) any additional data in the buffer will be ignored. Therefore, if you have sparse amounts of entropy, compress it to the right length before calling this function, as discussed in Recipe 11.16.

xl

Length of the extra seed buffer in bytes. It may be specified as zero to indicate that there is no extra seed data.

Once you have an instantiated generator, you can get cryptographically strong pseudo-random data from it with the following function:

unsigned char *spc_bcprng_rand(SPC_BCPRNG_CTX *prng, unsigned char *buf, size_t l);

This function has the following arguments:

prng

Pointer to the generator's context object.

buf

Buffer into which the random data will be written.

l

Number of bytes that should be placed into the output buffer.

This function never fails (save for a catastrophic error in encryption), and it returns the address of the output buffer.

Here is an implementation of this generator API, which makes use of the block cipher interface we developed in Recipe 5.5:

/* NOTE: This code should be augmented to reseed after each request
/* for pseudo-random data, as discussed in Recipe 11.6
/*
#ifndef WIN32
#include <string.h>
#include <pthread.h>
#else
#include <windows.h>
#endif
   
/* if encryption operations fail, you passed in a bad key size or are using a
 * hardware API that failed.  In that case, be sure to perform error checking.
*/
   
typedef struct {
  SPC_KEY_SCHED ks;
  unsigned char ctr[SPC_BLOCK_SZ];
  unsigned char lo[SPC_BLOCK_SZ]; /* Leftover block of output */
  int           ix;               /* index into lo */
  int           kl;               /* The length of key used to key the cipher */
} SPC_BCPRNG_CTX;
   
#ifndef WIN32
static pthread_mutex_t spc_bcprng_mutex = PTHREAD_MUTEX_INITIALIZER;
   
#define SPC_BCPRNG_LOCK(  )   pthread_mutex_lock(&spc_bcprng_mutex);
#define SPC_BCPRNG_UNLOCK(  ) pthread_mutex_unlock(&spc_bcprng_mutex);
#else
static HANDLE hSpcBCPRNGMutex;
   
#define SPC_BCPRNG_LOCK(  )   WaitForSingleObject(hSpcBCPRNGMutex, INFINITE)
#define SPC_BCPRNG_UNLOCK(  ) ReleaseMutex(hSpcBCPRNGMutex)
#endif
   
static void spc_increment_counter(SPC_BCPRNG_CTX *prng) {
  int i = SPC_BLOCK_SZ;
   
  while (i--) 
    if (++prng->ctr[i]) return;
}
   
void spc_bcprng_init(SPC_BCPRNG_CTX *prng, unsigned char *key, int kl,
                     unsigned char *x, int xl) {
  int i = 0;
   
  SPC_BCPRNG_LOCK(  );
  SPC_ENCRYPT_INIT(&(prng->ks), key, kl);
  memset(prng->ctr, 0, SPC_BLOCK_SZ);
  while (xl-- && i < SPC_BLOCK_SZ)
    prng->ctr[i++] = *x++;
  prng->ix = 0;
  prng->kl = kl;
  SPC_BCPRNG_UNLOCK(  );
}
   
unsigned char *spc_bcprng_rand(SPC_BCPRNG_CTX *prng, unsigned char *buf, size_t l) {
  unsigned char *p;
   
  SPC_BCPRNG_LOCK(  );
  for (p = buf;  prng->ix && l;  l--) {
    *p++ = prng->lo[prng->ix++];
    prng->ix %= SPC_BLOCK_SZ;
  }
  while (l >= SPC_BLOCK_SZ) {
    SPC_DO_ENCRYPT(&(prng->ks), prng->ctr, p);
    spc_increment_counter(prng);
    p += SPC_BLOCK_SZ;
    l -= SPC_BLOCK_SZ;
  }
  if (l) {
    SPC_DO_ENCRYPT(&(prng->ks), prng->ctr, prng->lo);
    spc_increment_counter(prng);
    prng->ix = l;
    while (l--) p[l] = prng->lo[l];
  }
  SPC_BCPRNG_UNLOCK(  );
  return buf;
}

If your block cipher has 64-bit blocks and has no practical weaknesses, do not use this generator for more than 235 bytes of output (232 block cipher calls). If the cipher has 128-bit blocks, do not exceed 268 bytes of output (264 block cipher calls). If using a 128-bit block cipher, it is generally acceptable not to check for this condition, as you generally would not reasonably expect to ever use that many bytes of output.

To bind this cryptographic PRNG to the API in Recipe 11.2, you can use a single global generator context that you seed in spc_rand_init( ), requiring you to get a secure seed. Once that's done (assuming the generator variable is a statically allocated global variable named spc_prng), you can simply implement spc_rand( ) as follows:

unsigned char *spc_rand(unsigned char *buf, size_t l) {
  return spc_bcprng_rand(&spc_prng, buf, l);
}

Note that you should probably be sure to check that the generator is seeded before calling spc_bcprng_rand( ).

11.5.3.2 Using a stream cipher as a generator

As we mentioned, stream ciphers are themselves pseudo-random number generators, where the key (and the initialization vector, if appropriate) constitutes the seed. If you are planning to use such a cipher, we strongly recommend the SNOW 2.0 cipher, discussed in Recipe 5.2.

Because of the popularity of the RC4 cipher, we expect that people will prefer to use RC4, even though it does not look as good as SNOW. The RC4 stream cipher does make an acceptable pseudo-random number generator, and it is incredibly fast if you do not rekey frequently (that is particularly useful if you expect to need a heck of a lot of numbers). If you do rekey frequently to avoid backtracking attacks, a block cipher-based approach may be faster; time it to make sure.

RC4 requires a little bit of work to use properly, given a standard API. First, most APIs want you to pass in data to encrypt. Because you want only the raw keystream, you must always pass in zeros. Second, be sure to use RC4 in a secure manner, as discussed in Recipe 5.23.

If your RC4 implementation has the API discussed in Recipe 5.23, seeding it as a pseudo-random number generator is the same as keying the algorithm. RC4 can accept keys up to 256 bytes in length.

Because of limitations in RC4, you should throw away the first 256 bytes of RC4 output, as discussed in Recipe 5.23.

After encrypting 256 bytes and throwing the results away, you can then, given an RC4 context, get random data by encrypting zeros. Assuming the RC4 API from Recipe 5.23 and assuming you have a context statically allocated in a global variable named spc_prng, here's a binding of RC4 to the spc_rand( ) function that we introduced in Recipe 11.2:

/* NOTE: This code should be augmented to reseed after each request
/* for pseudo-random data, as discussed in Recipe 11.6
/*
#ifndef WIN32
#include <pthread.h>
   
static pthread_mutex_t spc_rc4rng_mutex = PTHREAD_MUTEX_INITIALIZER;
   
#define SPC_RC4RNG_LOCK(  )   pthread_mutex_lock(&spc_rc4rng_mutex)
#define SPC_RC4RNG_UNLOCK(  ) pthread_mutex_unlock(&spc_rc4rng_mutex)
#else
#include <windows.h>
   
static HANDLE hSpcRC4RNGMutex;
   
#define SPC_RC4RNG_LOCK(  )   WaitForSingleObject(hSpcRC4RNGMutex, INFINITE)
#define SPC_RC4RNG_UNLOCK(  ) ReleaseMutex(hSpcRC4RNGMutex)
#endif
   
#define SPC_ARBITRARY_SIZE 16
   
unsigned char *spc_rand(unsigned char *buf, size_t l) {
  static unsigned char zeros[SPC_ARBITRARY_SIZE] = {0,};
  unsigned char       *p = buf;
   
#ifdef WIN32
  if (!hSpcRC4RNGMutex) hSpcRC4RNGMutex = CreateMutex(0, FALSE, 0);
#endif
   
  SPC_RC4RNG_LOCK(  );
  while (l >= SPC_ARBITRARY_SIZE) {
    RC4(&spc_prng, SPC_ARBITRARY_SIZE, zeros, p);
    l -= SPC_ARBITRARY_SIZE;
    p += SPC_ARBITRARY_SIZE;    
  }
  if (l) RC4(&spc_prng, l, zeros, p);
   
  SPC_RC4RNG_UNLOCK(  );
  return buf;
}

Note that, although we don't show it in this code, you should ensure that the generator is initialized before giving output.

Because using this RC4 API requires encrypting zero bytes to get the keystream output, in order to be able to generate data of arbitrary sizes, you must either dynamically allocate and zero out memory every time or iteratively call RC4 in chunks of up to a fixed size using a static buffer filled with zeros. We opt for the latter approach.

RC4 is only believed to be a strong source of random numbers for about 230 outputs. After that, we strongly recommend that you reseed it with new entropy. If your application would not conceivably use that many outputs, it should generally be okay not to check that condition.

11.5.3.3 Using a generator based on a cryptographic hash function

The most common mistake made when trying to use a hash function as a cryptographic pseudo-random number generator is to continually hash a piece of data. Such an approach gives away the generator's internal state with every output. For example, suppose that your internal state is some value X, and you generate and output Y by hashing X. The next time you need random data, rehashing X will give the same results, and any attacker who knows the last outputs from the generator can figure out the next outputs if you generate them by hashing Y.

One very safe way to use a cryptographic hash function in a cryptographic pseudo-random number generator is to use HMAC in counter mode, as discussed in Recipe 6.10. Here we implement a generator based on the HMAC-SHA1 implementation from Recipe 6.10. You should be able to adapt this code easily to any HMAC implementation you want to use.

/* NOTE: This code should be augmented to reseed after each request
/* for pseudo-random data, as discussed in Recipe 11.6
/*
#ifndef WIN32
#include <string.h>
#include <pthread.h>
#else
#include <windows.h>
#endif
   
/* If MAC operations fail, you passed in a bad key size or you are using a hardware
 * API that failed.  In that case, be sure to perform error checking.
 */
#define MAC_OUT_SZ 20
   
typedef struct {
  SPC_HMAC_CTX      ctx;
  unsigned char ctr[MAC_OUT_SZ];
  unsigned char lo[MAC_OUT_SZ];   /* Leftover block of output */
  int           ix;               /* index into lo. */
} SPC_MPRNG_CTX;
   
#ifndef WIN32
static pthread_mutex_t spc_mprng_mutex = PTHREAD_MUTEX_INITIALIZER;
   
#define SPC_MPRNG_LOCK(  )   pthread_mutex_lock(&spc_mprng_mutex)
#define SPC_MPRNG_UNLOCK(  ) pthread_mutex_unlock(&spc_mprng_mutex)
#else
static HANDLE hSpcMPRNGMutex;
   
#define SPC_MPRNG_LOCK(  )   WaitForSingleObject(hSpcMPRNGMutex, INFINITE)
#define SPC_MPRNG_UNLOCK(  ) ReleaseMutex(hSpcMPRNGMutex)
#endif
   
static void spc_increment_mcounter(SPC_MPRNG_CTX *prng) {
  int i = MAC_OUT_SZ;
   
  while (i--) 
    if (++prng->ctr[i]) 
      return;
}
   
void spc_mprng_init(SPC_MPRNG_CTX *prng, unsigned char *seed, int l) {
  SPC_MPRNG_LOCK(  );
  SPC_HMAC_Init(&(prng->ctx), seed, l);
  memset(prng->ctr, 0, MAC_OUT_SZ);
  prng->ix = 0;
  SPC_MPRNG_UNLOCK(  );
}
   
unsigned char *spc_mprng_rand(SPC_MPRNG_CTX *prng, unsigned char *buf, size_t l) {
  unsigned char *p;
   
  SPC_MPRNG_LOCK(  );
  for (p = buf;  prng->ix && l;  l--) {
    *p++ = prng->lo[prng->ix++];
    prng->ix %= MAC_OUT_SZ;
  }
  while (l >= MAC_OUT_SZ) {
    SPC_HMAC_Reset(&(prng->ctx));
    SPC_HMAC_Update(&(prng->ctx), prng->ctr, sizeof(prng->ctr));
    SPC_HMAC_Final(p, &(prng->ctx));
    spc_increment_mcounter(prng);
    p += MAC_OUT_SZ;
    l -= MAC_OUT_SZ;
  }
  if (l) {
    SPC_HMAC_Reset(&(prng->ctx));
    SPC_HMAC_Update(&(prng->ctx), prng->ctr, sizeof(prng->ctr));
    SPC_HMAC_Final(prng->lo, &(prng->ctx));
    spc_increment_mcounter(prng);
    prng->ix = l;
    while (l--) p[l] = prng->lo[l];
  }
  SPC_MPRNG_UNLOCK(  );
  return buf;
}

This implementation has two publicly exported functions. The first initializes the generator:

void spc_mprng_init(SPC_MPRNG_CTX *prng, unsigned char *seed, int l);

This function has the following arguments:

prng

Context object used to hold the state for a MAC-based PRNG.

seed

Buffer containing data that should be filled with entropy (the seed). This data is used to key the MAC.

l

Length of the seed buffer in bytes.

The second function actually produces random data:

unsigned char *spc_mprng_rand(SPC_MPRNG_CTX *prng, unsigned char *buf, size_t l);

This function has the following arguments:

prng

Context object used to hold the state for a MAC-based PRNG.

out

Buffer into which the random data will be placed.

l

Number of random bytes to be placed into the output buffer.

If your hash function produces n-bit outputs and has no practical weaknesses, do not use the generator after you run the MAC more than 2n/2 times. For example, with SHA1, this generator should be not be a problem for at least 280 x 20 bytes. In practice, you probably will not have to worry about this issue.

To bind this cryptographic pseudo-random number generator to the API in Recipe 11.2, you can use a single global generator context that you seed in spc_rand_init( ), requiring you to get a secure seed. Once that is done (assuming the generator variable is a statically allocated global variable named spc_prng), you can simply implement spc_rand( ) as follows:

unsigned char *spc_rand(unsigned char *buf, size_t l) {
  return spc_bcprng_rand(&spc_prng, buf, l);
}

Note that, although we don't show it in the previous code, you should ensure that the generator is initialized before giving output.

11.5.4 See Also

Recipe 5.2, Recipe 5.5, Recipe 5.9, Recipe 5.23, Recipe 6.10, Recipe 11.2, Recipe 11.6, Recipe 11.8, Recipe 11.16