5.9 Using a Generic CTR Mode Implementation

5.9.1 Problem

You want to use counter (CTR) mode and your library doesn't provide an interface, or you want to use a more high-level interface than your library provides. Alternatively, you would like a portable CTR interface, or you have only a block cipher implementation and you would like to use CTR mode.

5.9.2 Solution

CTR mode encrypts by generating keystream, then combining the keystream with the plaintext via XOR. This mode generates keystream one block at a time by encrypting plaintexts that are the same, except for an ever-changing counter, as shown in Figure 5-4. Generally, the counter value starts at zero and is incremented sequentially.

Figure 5-4. Counter (CTR) mode

Few libraries provide a CTR implementation, because it has only recently come into favor, despite the fact that it is a very old mode with great properties. We provide code implementing this mode in the following Section 5.9.3.

5.9.3 Discussion

You should probably use a higher-level abstraction, such as the one discussed in Recipe 5.16. Use a raw mode only when absolutely necessary, because there is a huge potential for introducing asecurity vulnerability by accident. If you still want to use CTR mode, be sure to use a message authentication code with it.

CTR mode is a stream-based mode. Encryption occurs by XOR'ing the keystream bytes with the plaintext bytes. The keystream is generated one block at a time by encrypting a plaintext block that includes a counter value. Given a single key, the counter value must be unique for every encryption.

This mode has many benefits over the "standard" modes (e.g., ECB, CBC, CFB, and OFB). However, we recommend a higher-level mode, one that provides stronger security guarantees (i.e., message integrity detection), such as CWC or CCM modes. Most high-level modes use CTR mode as a component.

In Recipe 5.4, we discuss the advantages and drawbacks of CTR mode and compare it to other popular modes.

Like most other modes, CTR mode requires a nonce (often called an IV in this context). Most modes use the nonce as an input to encryption, and thus require something the same size as the algorithm's block length. With CTR mode, the input to encryption is generally the concatenation of the nonce and a counter. The counter is usually at least 32 bits, depending on the maximum amount of data you might want to encrypt with a single {key, nonce} pair. We recommend using a good random value for the nonce.

In the following sections we present a reasonably optimized implementation of CTR mode that builds upon the raw block cipher interface presented in Recipe 5.5. It also requires the spc_memset( ) function from Recipe 13.2. By default, we use a 6-byte counter, which leaves room for a nonce of SPC_BLOCK_SZ - 6 bytes. With AES and other ciphers with 128-bit blocks, this is sufficient space.

CTR mode with 64-bit blocks is highly susceptible to birthday attacks unless you use a large random portion to the nonce, which limits the message you can send with a given key. In short, don't use CTR mode with 64-bit block ciphers. The high-level API

This implementation has two APIs. The first is a high-level API, which takes a message as input and returns a dynamically allocated result.

unsigned char *spc_ctr_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il);
unsigned char *spc_ctr_decrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il)

Both of the previous functions output the same number of bytes as were input, unless a memory allocation error occurs, in which case 0 is returned. The decryption routine is exactly the same as the encryption routine, and it is implemented by macro.

These two functions also erase the key from memory before exiting. You may want to have them erase the plaintext as well.

Here's the implementation of the interface:

#include <stdlib.h>
#include <string.h>
unsigned char *spc_ctr_encrypt(unsigned char *key, size_t kl, unsigned char *nonce,
                               unsigned char *in, size_t il) {
  SPC_CTR_CTX       ctx;
  unsigned char *out;
  if (!(out = (unsigned char *)malloc(il))) return 0;
  spc_ctr_init(&ctx, key, kl, nonce);
  spc_ctr_update(&ctx, in, il, out);
  return out;
#define spc_ctr_decrypt spc_ctr_encrypt

Note that this code depends on the SPC_CTR_CTX data type and the incremental CTR interface, both discussed in the following sections. In particular, the nonce size varies depending on the value of the SPC_CTR_BYTES macro (introduced in the next subsection). The incremental API

Let's look at the SPC_CTR_CTX data type. It's defined as:

typedef struct {
  int           ix;
  unsigned char ctr[SPC_BLOCK_SZ];
  unsigned char ksm[SPC_BLOCK_SZ];

The ks field is an expanded version of the cipher key (block ciphers generally use a single key to derive multiple keys for internal use). The ix field is used to determine how much of the last block of keystream we have buffered (i.e., that hasn't been used yet). The ctr block holds the plaintext used to generate keystream blocks. Buffered keystream is held in ksm.

To begin encrypting or decrypting, you need to initialize the mode. Initialization is the same operation for both encryption and decryption, and it depends on a statically defined value SPC_CTR_BYTES, which is used to compute the nonce size.

#define SPC_CTR_BYTES 6
void spc_ctr_init(SPC_CTR_CTX *ctx, unsigned char *key, size_t kl, unsigned char
                  *nonce) {
  SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
  spc_memset(key, 0, kl);
  memcpy(ctx->ctr, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);
  spc_memset(ctx->ctr + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);
  ctx->ix = 0;

Note again that we remove the key from memory during this operation.

Now you can add data as you get it using the spc_ctr_update( ) function. This function is particularly useful when a message arrives in pieces. You'll get the same results as if it all arrived at once. When you want to finish encrypting or decrypting, call spc_ctr_final( ).

You're responsible for making sure the initialization, updating, and finalization calls do not happen out of order.

The function spc_ctr_update( ) has the following signature:

int spc_ctr_update(CTR_CTX *ctx, unsigned char *in, size_t il, unsigned char *out);

This function has the following arguments:


Pointer to the SPC_CTR_CTX object associated with the current message.


Pointer to a buffer containing the data to be encrypted or decrypted.


Number of bytes contained by the input buffer.


Pointer to the output buffer, which needs to be exactly as long as the input buffer.

Our implementation of this function always returns 1, but a hardware-based implementation might have an unexpected failure, so it's important to check the return value!

This API is in the spirit of PKCS #11, which provides a standard cryptographic interface to hardware. We do this so that the above functions can have the bulk of their implementations replaced with calls to PKCS #11-compliant hardware. PKCS #11 APIs generally pass out data explicitly indicating the length of data outputted, while we ignore that because it will always be zero on failure or the size of the input buffer on success. Also note that PKCS #11-based calls tend to order their arguments differently from the way we do, and they will not generally wipe key material, as we do in our initialization and finalization routines.

Because this API is developed with PKCS #11 in mind, it's somewhat more low-level than it needs to be, and therefore is a bit difficult to use properly. First, you need to be sure the output buffer is big enough to hold the input; otherwise, you will have a buffer overflow. Second, you need to make sure the out argument always points to the first unused byte in the output buffer. Otherwise, you will keep overwriting the same data every time spc_ctr_update( ) outputs data.

Here's our implementation of spc_ctr_update( ), along with a helper function:

static inline void ctr_increment(unsigned char *ctr) {
  unsigned char *x = ctr + SPC_CTR_BYTES;
  while (x-- != ctr) if (++(*x)) return;
int spc_ctr_update(SPC_CTR_CTX *ctx, unsigned char *in, size_t il, unsigned char
                   *out) {
  int i;
  if (ctx->ix) {
    while (ctx->ix) {
      if (!il--) return 1;
      *out++ = *in++ ^ ctx->ksm[ctx->ix++];
      ctx->ix %= SPC_BLOCK_SZ;
  if (!il) return 1;
  while (il >= SPC_BLOCK_SZ) {
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, out);
    for (i = 0;  i < SPC_BLOCK_SZ / sizeof(int);  i++)
      ((int *)out)[i] ^= ((int *)in)[i];
    il  -= SPC_BLOCK_SZ;
    in  += SPC_BLOCK_SZ;
    out += SPC_BLOCK_SZ;
  SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, ctx->ksm);
  for (i = 0;  i < il;  i++)
    *out++ = *in++ ^ ctx->ksm[ctx->ix++];
  return 1;

To finalize either encryption or decryption, use the spc_ctr_final( ) call, which never needs to output anything, because CTR is a streaming mode:

int spc_ctr_final(SPC_CTR_CTX *ctx) {
  spc_memset(&ctx, 0, sizeof(SPC_CTR_CTX));
  return 1;

5.9.4 See Also

Recipe 4.9, Recipe 5.4, Recipe 5.5, Recipe 5.16, Recipe 13.2