12.17 Using Self-Modifying Code

12.17.1 Problem

You want to hide portions of your binary using self-modifying code without rewriting existing code in assembler.

12.17.2 Solution

The most effective use of self-modifying code is to overwrite a section of vital code with another section of vital code, such that the two sections never exist in memory at the same time. This can be time-consuming and costly to develop; a more expedient technique uses C macros to decrypt garbage bytes in the code section into proper executable code at runtime. The process involves encrypting the protected code after the binary has been compiled, then decrypting it at runtime, just before it is executed.

The code presented in this recipe applies to FreeBSD, Linux, NetBSD, OpenBSD, and Solaris. The concepts apply to Unix and Windows in general.

12.17.3 Discussion

For the code presented in this recipe, we'll be using RC4 to perform our encryption. We've chosen to use RC4 because it is fast and easy to implement. You will need to use the RC4 implementation from Recipe 5.23 or an alternative implementation from somewhere else to use the code we will be presenting.
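If the implementation from Recipe 5.23 is not at hand, something along the following lines could serve as a stand-in. This is only a minimal sketch of textbook RC4; the names sketch_rc4_ctx, sketch_rc4_set_key( ), and sketch_rc4( ) are hypothetical and merely mirror the argument order of the calls used later in this recipe. They are not the API of Recipe 5.23.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in context; not the RC4_CTX type from Recipe 5.23. */
typedef struct {
  unsigned char s[256];
  unsigned char i, j;
} sketch_rc4_ctx;

/* Key schedule: initialize the state table, then permute it with the key. */
void sketch_rc4_set_key(sketch_rc4_ctx *ctx, size_t key_len,
                        const unsigned char *key) {
  unsigned int  i, j = 0;
  unsigned char tmp;

  assert(ctx != NULL && key != NULL && key_len > 0);
  for (i = 0; i < 256; i++) ctx->s[i] = (unsigned char)i;
  ctx->i = ctx->j = 0;
  for (i = 0; i < 256; i++) {
    j = (j + ctx->s[i] + key[i % key_len]) & 0xFF;
    tmp = ctx->s[i];  ctx->s[i] = ctx->s[j];  ctx->s[j] = tmp;
  }
}

/* XOR the keystream into the data; in and out may be the same buffer. */
void sketch_rc4(sketch_rc4_ctx *ctx, size_t len, const unsigned char *in,
                unsigned char *out) {
  size_t        n;
  unsigned char tmp;

  for (n = 0; n < len; n++) {
    ctx->i = (unsigned char)(ctx->i + 1);
    ctx->j = (unsigned char)(ctx->j + ctx->s[ctx->i]);
    tmp = ctx->s[ctx->i];
    ctx->s[ctx->i] = ctx->s[ctx->j];
    ctx->s[ctx->j] = tmp;
    out[n] = in[n] ^ ctx->s[(unsigned char)(ctx->s[ctx->i] + ctx->s[ctx->j])];
  }
}
```

Because the cipher only XORs a keystream into the data, running it a second time with a freshly keyed context restores the original bytes, which is what allows one routine to serve for both encryption and decryption.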

The actual code to decrypt and replace the code in memory is minimal. The complexity arises from having to obtain the code to be encrypted, encrypting it, and making it accessible to the code that will be decrypting and executing it. A set of macros provides the means to mark replaceable code, and a single function, spc_smc_decrypt( ), performs the decryption of the code. Because we're using RC4, encryption and decryption are performed in exactly the same way, so spc_smc_decrypt( ) can also be used for encryption, which we'll do later on.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>

/* RC4_CTX, RC4_set_key( ), and RC4( ) come from your RC4 implementation
 * (see Recipe 5.23).
 */

#define SPC_SMC_START_BLOCK(label)  void label(void) {  }
#define SPC_SMC_END_BLOCK(label)    void _##label(void) {  }
#define SPC_SMC_BLOCK_LEN(label)    ((int)_##label - (int)label)
#define SPC_SMC_BLOCK_ADDR(label)   ((unsigned char *)label)
#define SPC_SMC_START_KEY(label)    void key_##label(void) {  }
#define SPC_SMC_END_KEY(label)      void _key_##label(void) {  }
#define SPC_SMC_KEY_LEN(label)      ((int)_key_##label - (int)key_##label)
#define SPC_SMC_KEY_ADDR(label)     ((unsigned char *)key_##label)
#define SPC_SMC_OFFSET(label)       ((long)(label) - (long)_start)

extern void _start(void);

/* returns number of bytes encoded, or 0 on failure */
int spc_smc_decrypt(unsigned char *buf, int buf_len, unsigned char *key, int key_len) {
  RC4_CTX ctx;

  RC4_set_key(&ctx, key_len, key);

  /* NOTE: most code segments have read-only permissions, and so must be modified
   * to allow writing to the buffer
   */
  if (mprotect(buf, buf_len, PROT_WRITE | PROT_READ | PROT_EXEC)) {
    fprintf(stderr, "mprotect: %s\n", strerror(errno));
    return 0;
  }

  /* decrypt the buffer */
  RC4(&ctx, buf_len, buf, buf);

  /* restore the original memory permissions */
  mprotect(buf, buf_len, PROT_READ | PROT_EXEC);

  return buf_len;
}

The use of mprotect( ), or an equivalent operating system routine for modifying the permissions of a page of memory, is required on most modern operating systems to write to the code segment. This is an inherent weakness of the self-modifying code technique: the call to mprotect( ) is suspicious, and it is trivial to write a utility that searches the disassembly of a program for calls to mprotect( ) that enable write access or take an address in the code segment as the first parameter. The use of mprotect( ) should be obfuscated (see Recipe 12.3 and Recipe 12.9).
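One practical detail is that some systems, Linux among them, reject an mprotect( ) call whose address is not a multiple of the page size. A possible way to handle that is to widen the range to whole pages before changing permissions. The following is a sketch only; sketch_page_span( ) and sketch_mprotect_range( ) are hypothetical helper names, not part of this recipe's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Hypothetical helper: widen [addr, addr + len) to whole pages.
 * page_size must be a nonzero power of two.
 */
void sketch_page_span(uintptr_t addr, size_t len, size_t page_size,
                      uintptr_t *start, size_t *span) {
  uintptr_t mask, end;

  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  mask   = (uintptr_t)page_size - 1;
  end    = (addr + len + mask) & ~mask;   /* round the end up */
  *start = addr & ~mask;                  /* round the start down */
  *span  = (size_t)(end - *start);
}

/* Hypothetical wrapper: apply prot to every page touched by the buffer.
 * Returns 0 on success, -1 on failure.
 */
int sketch_mprotect_range(void *buf, size_t len, int prot) {
  long      page_size = sysconf(_SC_PAGESIZE);
  uintptr_t start;
  size_t    span;

  if (page_size <= 0) return -1;
  sketch_page_span((uintptr_t)buf, len, (size_t)page_size, &start, &span);
  return mprotect((void *)start, span, prot);
}
```

Note that widening the range also changes the permissions of any unrelated code or data that shares those pages, so the original protections should be restored as soon as the write is complete.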

Once the binary has been compiled, the protected code will have to be encrypted before it can be executed. The following code demonstrates a utility for encrypting a portion of an ELF executable file based on the contents of another portion of the file. The usage is:

smc_encrypt filename code_offset code_len key_offset key_len

In the command, code_offset and code_len are the location in the file of the code to be encrypted and the code's length, and key_offset and key_len are the location in the file of the key with which to encode the code and the key's length.
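The offsets and lengths can be given in decimal, octal, or hexadecimal because the utility converts them with a base of 0. As an illustration of that conversion with error checking added, here is a small sketch; sketch_parse_ulong( ) is a hypothetical helper and is not part of the utility itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal, octal (leading 0), or hexadecimal
 * (leading 0x) value.  Returns 0 on success, -1 on malformed or
 * out-of-range input.
 */
int sketch_parse_ulong(const char *text, unsigned long *out) {
  char *end;

  assert(text != NULL && out != NULL);
  errno = 0;
  *out = strtoul(text, &end, 0);
  if (errno != 0 || end == text || *end != '\0') return -1;
  return 0;
}
```

With this helper, "0xB0" yields 176 and "010" yields 8, while input with trailing garbage such as "12abc" is rejected instead of being silently truncated.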

#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
/* spc_smc_decrypt( ) is defined in the first listing of this recipe */
int spc_smc_decrypt(unsigned char *buf, int buf_len, unsigned char *key, int key_len);

/* ELF-specific stuff (these offsets describe the 32-bit ELF layout) */
#define ELF_ENTRY_OFFSET  24 /* e_hdr e_entry field offset */
#define ELF_PHOFF_OFFSET  28 /* e_hdr e_phoff field offset */
#define ELF_PHESZ_OFFSET  42 /* e_hdr e_phentsize field offset */
#define ELF_PHNUM_OFFSET  44 /* e_hdr e_phnum field offset */
#define ELF_PH_OFFSET_OFF 4  /* p_hdr p_offset field offset */
#define ELF_PH_VADDR_OFF  8  /* p_hdr p_vaddr field offset */
#define ELF_PH_FILESZ_OFF 16 /* p_hdr p_filesz field offset */

/* returns the file offset of the entry point, or 0 if it cannot be found */
static unsigned long elf_get_entry(unsigned char *buf) {
  unsigned long  entry, p_vaddr, p_filesz, p_offset;
  unsigned int   i, phoff;
  unsigned short phnum, phsz;
  unsigned char  *phdr;

  entry = *(unsigned long *) &buf[ELF_ENTRY_OFFSET];
  phoff = *(unsigned int *) &buf[ELF_PHOFF_OFFSET];
  phnum = *(unsigned short *) &buf[ELF_PHNUM_OFFSET];
  phsz  = *(unsigned short *) &buf[ELF_PHESZ_OFFSET];
  phdr  = &buf[phoff];

  /* iterate through program headers */
  for ( i = 0; i < phnum; i++, phdr += phsz ) {
    p_vaddr = *(unsigned long *)&phdr[ELF_PH_VADDR_OFF];
    p_filesz = *(unsigned long *)&phdr[ELF_PH_FILESZ_OFF];
    /* if entry point is in this program segment */
    if ( entry >= p_vaddr && entry < (p_vaddr + p_filesz) ) {
      /* calculate offset of entry point */
      p_offset = *(unsigned long *)&phdr[ELF_PH_OFFSET_OFF];
      return( p_offset + (entry - p_vaddr) );
    }
  }
  return 0;
}

int main(int argc, char *argv[  ]) {
  unsigned long entry, offset, len, key_offset, key_len;
  unsigned char *buf;
  struct stat   sb;
  int           fd;

  if (argc < 6) {
    printf("Usage: %s filename offset len key_offset key_len\n"
           "       filename:   file to encrypt\n"
           "       offset:     offset in file to start encryption\n"
           "       len:        number of bytes to encrypt\n"
           "       key_offset: offset in file of key\n"
           "       key_len:    number of bytes in key\n"
           "       Values are converted with strtoul with base 0\n",
           argv[0]);
    return 1;
  }

  /* prepare the parameters */
  offset = strtoul(argv[2], NULL, 0);
  len = strtoul(argv[3], NULL, 0);
  key_offset = strtoul(argv[4], NULL, 0);
  key_len = strtoul(argv[5], NULL, 0);

  /* memory map the file so we can access it via pointers */
  if (stat(argv[1], &sb)) {
    fprintf(stderr, "Stat failed: %s\n", strerror(errno));
    return 2;
  }
  if ((fd = open(argv[1], O_RDWR | O_EXCL)) < 0) {
    fprintf(stderr, "Open failed: %s\n", strerror(errno));
    return 3;
  }
  buf = mmap(0, sb.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
  if (buf == MAP_FAILED) {
    fprintf(stderr, "Mmap failed: %s\n", strerror(errno));
    close(fd);
    return 4;
  }

  /* get entry point : here we assume ELF example */
  entry = elf_get_entry(buf);
  if (!entry) {
    fprintf(stderr, "Invalid ELF header\n");
    munmap(buf, sb.st_size);
    close(fd);
    return 5;
  }

  /* these are offsets from the entry point */
  offset += entry;
  key_offset += entry;
  printf("Encrypting %lu bytes at 0x%lX with %lu bytes at 0x%lX\n",
         len, offset, key_len, key_offset);

  /* Because we're using RC4, encryption and decryption are the same operation */
  spc_smc_decrypt(buf + offset, (int)len, buf + key_offset, (int)key_len);

  /* mem-unmap the file */
  msync(buf, sb.st_size, MS_SYNC);
  munmap(buf, sb.st_size);
  close(fd);
  return 0;
}

This program incorporates an ELF file-header parser in the elf_get_entry( ) routine. The program header table entries of the ELF header are searched for the loadable segment containing the entry point. This is done to translate the entry point virtual address into an offset from the start of the file. This is necessary because the offsets generated by the SPC_SMC_OFFSET macro are relative to the program entry point (_start).
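The translation from entry-point virtual address to file offset can also be written with fixed-width reads, so that it behaves the same when the utility itself is built on a 64-bit host. The following is a sketch under stated assumptions: it handles only 32-bit ELF images whose byte order matches the host, and sketch_elf32_entry_offset( ) and the rd16( ) and rd32( ) helpers are hypothetical names, not part of the utility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read host-order fields without assuming the buffer is aligned. */
static uint32_t rd32(const unsigned char *p) {
  uint32_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}

static uint16_t rd16(const unsigned char *p) {
  uint16_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}

/* Hypothetical variant of elf_get_entry( ) for 32-bit ELF images.  Returns
 * the file offset of the entry point, or 0 if no segment contains it or the
 * headers do not fit in the buffer.
 */
uint32_t sketch_elf32_entry_offset(const unsigned char *buf, size_t buf_len) {
  uint32_t entry, phoff, p_offset, p_vaddr, p_filesz;
  uint16_t phentsize, phnum, i;

  assert(buf != NULL);
  if (buf_len < 52) return 0;              /* size of a 32-bit ELF header */
  entry     = rd32(buf + 24);              /* e_entry */
  phoff     = rd32(buf + 28);              /* e_phoff */
  phentsize = rd16(buf + 42);              /* e_phentsize */
  phnum     = rd16(buf + 44);              /* e_phnum */

  for (i = 0; i < phnum; i++) {
    uint64_t off = (uint64_t)phoff + (uint64_t)i * phentsize;

    /* a 32-bit program header is 32 bytes long */
    if (phentsize < 32 || off + 32 > buf_len) return 0;
    p_offset = rd32(buf + off + 4);
    p_vaddr  = rd32(buf + off + 8);
    p_filesz = rd32(buf + off + 16);
    if (entry >= p_vaddr && entry - p_vaddr < p_filesz)
      return p_offset + (entry - p_vaddr);
  }
  return 0;
}
```

For example, an entry point of 0x08048100 inside a segment loaded at 0x08048000 from file offset 0 maps to file offset 0x100.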

The following code provides an example of using the code we've presented in this recipe. The program decrypts itself at runtime, using bogus_routine( ) as a key for decrypting test_routine( ).

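#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* The SPC_SMC_* macros and spc_smc_decrypt( ) come from the first listing */
SPC_SMC_START_BLOCK(test)
int test_routine(void) {
  int x;
  for (x = 0;  x < 10;  x++) printf("decrypted!\n");
  return x;
}
SPC_SMC_END_BLOCK(test)

/* bogus_routine( ) is never called; its machine code serves as the key */
SPC_SMC_START_KEY(test)
int bogus_routine(void) {
  int x, y;
  for (x = 0;  x < y;  x++) {
    y = x + 256;
    y /= 32;
    x = y * 2 / 24;
  }
  return 1;
}
SPC_SMC_END_KEY(test)

int main(int argc, char *argv[  ]) {
  spc_smc_decrypt(SPC_SMC_BLOCK_ADDR(test), SPC_SMC_BLOCK_LEN(test),
                  SPC_SMC_KEY_ADDR(test), SPC_SMC_KEY_LEN(test));

#ifdef UNENCRYPTED_BUILD
  /* This printf(  ) displays the parameters to pass to the smc_encrypt utility on
   * stdout.  The printf(  ) must be removed, and the program recompiled before
   * running smc_encrypt.  Having the printf(  ) at the end of the file prevents
   * the offsets from changing after recompilation.
   */
  printf("(offsets from _start)offset: 0x%lX len 0x%X key 0x%lX len 0x%X\n",
         SPC_SMC_OFFSET(SPC_SMC_BLOCK_ADDR(test)), SPC_SMC_BLOCK_LEN(test),
         SPC_SMC_OFFSET(SPC_SMC_KEY_ADDR(test)), SPC_SMC_KEY_LEN(test));
  exit(0);
#endif
  test_routine(  );
  return 0;
}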

As mentioned in the comment just prior to the printf( ) call in main( ), this program should be compiled with UNENCRYPTED_BUILD defined, then executed to obtain the parameters to the smc_encrypt utility:

/bin/sh>cc -I. smc.c smc_test.c -D UNENCRYPTED_BUILD
/bin/sh>./a.out
(offsets from _start)offset: 0xB0 len 0x36 key 0xEB len 0x66

The program is then recompiled, with UNENCRYPTED_BUILD not defined in order to remove the printf( ) and exit( ) statements. The smc_encrypt utility is then run on the resulting binary to produce a working program:

/bin/sh>cc -I. smc.c smc_test.c
/bin/sh>smc_encrypt a.out 0xB0 0x36 0xEB 0x66

Self-modifying code is one of the most potent techniques available for protecting binary code; however, it makes the build process more complex, as you can see in the above example. In addition, some processor architectures (such as the x86 line before the Pentium II) cache instructions and do not invalidate this cache when the code segment is written to. To be compatible with these older architectures, you will need to use one of the three ring3 serializing instructions (cpuid, iret, and rsm) to invalidate the cache. This can be performed with a macro:

#define INVALIDATE_CACHE asm volatile( \
        "pushad \n"                    \
        "cpuid  \n"                    \
        "popad  \n")

The pushad and popad instructions are needed because the cpuid instruction overwrites four general-purpose registers (eax, ebx, ecx, and edx). Once again, as with the call to mprotect( ), note that the use of the cpuid instruction is suspicious and will draw attention to the protection code. It is better to place the call to the decrypted code far enough away from the actual decryption routine (16 bytes should be sufficient, because only 486 and Pentium CPUs are affected) that the decrypted code will not already be in the instruction cache.
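Where the compiler is GCC or Clang, a less conspicuous alternative to a hand-written cpuid sequence may be the __builtin___clear_cache builtin, which asks the compiler to emit whatever the target needs to keep the instruction stream coherent with freshly written code. This swaps in a compiler facility for the cpuid macro and is a sketch only; sketch_sync_icache( ) is a hypothetical wrapper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: call after patching len bytes at code and before
 * jumping to them.  __builtin___clear_cache is a GCC and Clang builtin; on
 * targets that need no explicit flush it has no effect.
 */
void sketch_sync_icache(unsigned char *code, size_t len) {
  assert(code != NULL);
  __builtin___clear_cache((char *)code, (char *)code + len);
}
```

A call such as sketch_sync_icache(SPC_SMC_BLOCK_ADDR(test), SPC_SMC_BLOCK_LEN(test)) would be placed between the decryption and the first call into the decrypted block.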

This implementation of self-decrypting code is a simple one; it could be defeated by pulling the decryption code from the binary, decrypting the protected code, then replacing the call to the decryption routine with nop instructions. This is possible because the size of the encrypted code is the same as the decrypted code; a more robust solution would be to use a stronger encryption method or a compression method, and extract the protected code to a dynamically allocated region of memory. However, such a method requires extensive manipulation of the object files before and after linking. You might consider using a commercially available binary packer to reduce development and testing time.

12.17.4 See Also

Recipe 5.23, Recipe 12.3, Recipe 12.9