3.11 Preventing SQL Injection Attacks

3.11.1 Problem

You are developing an application that interacts with a SQL database, and you need to defend against SQL injection attacks.

3.11.2 Solution

SQL injection attacks are most common in web applications that use a database to store data, but they can occur anywhere that a SQL command string is constructed from any type of input from a user. Specifically, a SQL injection attack is mounted by inserting characters into the command string that creates a compound command in a single string. For example, suppose a query string is created with a WHERE clause that is constructed from user input. A proper command might be:

SELECT * FROM people WHERE first_name="frank";

If the value "frank" comes directly from user input and is not properly validated, an attacker could include a closing double quote and a semicolon that would complete the SELECT command and allow the attacker to append additional commands. For example:

SELECT * FROM people WHERE first_name="frank";  DROP TABLE people;

Obviously, the best way to avoid SQL injection attacks is to not create SQL command strings that include any user input. In some small number of applications, this may be feasible, but more frequently it is not. Avoid including user input in SQL commands as much as you can, but where it cannot be avoided, you should escape dangerous characters.

3.11.3 Discussion

SQL injection attacks are really just general input validation problems. Unfortunately, there is no perfect solution to preventing these types of attacks. Your best defense is to apply strict checking of input?even going so far as to refuse questionable input rather than attempt to escape it?and hope that that is a strong enough defense.

There are two main approaches that can be taken to avoid SQL injection attacks:

Restrict user input to the smallest character set possible, and refuse any input that contains character outside of that set.

In many cases, user input needs to be used in queries such as looking up a username or a message number, or some other relatively simple piece of information. It is rare to need any character in a user name other than the set of alphanumeric characters. Similarly, message numbers or other similar identifiers can safely be restricted to digits.

With SQL, problems start to occur when symbol characters that have special meaning are allowed. Examples of such characters are quotes (both double and single), semicolons, percent symbols, hyphens, and underscores. Avoid these characters wherever possible; they are often unnecessary, and allowing them at all just makes things more difficult for everyone except an attacker.

Escape characters that have special significant to SQL command processors.

In SQL parlance, anything that is not a keyword or an identifier is a literal. Keywords are portions of a SQL command such as SELECT or WHERE, and an identifier would typically be the name of a table or the name of a field. In some cases, SQL syntax allows literals to appear without enclosing quotes, but as a general rule you should always enclose literals with quotes.

Literals should always be enclosed in single quotes ('), but some SQL implementations allow you to use either single or double quotes ("). Whichever you choose to use, always close the literal with the same character with which you opened it.

Within literals, most characters are safe to leave unescaped, and in many cases, it is not possible to escape them. Certainly, with whichever quoting character you choose to use with your literals, you may need to allow that character inside the literal. Escaping quotes is done by doubling up on the quote character. Other characters that should always be escaped are control characters and the escape character itself (a backslash).

Finally, if you are using the LIKE keyword in a WHERE clause, you may wish to prevent input from containing wildcard characters. In fact, it is a good idea to prevent wildcard characters in most circumstances. Wildcard characters include the percent symbol, underscore, and square brackets.

You can use the function spc_escape_sql( ), shown at the end of this section, to escape all of the characters that we've mentioned. As a convenience (and partly due to necessity), the function will also surround the escaped string with the quote character of your choice. The return from the function will be the quoted and escaped version of the input string. If an error occurs (e.g., out of memory, or an invalid quoting character chosen), the return will be NULL.

spc_escape_sql( ) requires three arguments:


The string that is to be escaped.


The quote character to use. It must be either a single or double quote. Any other character will cause spc_escape_sql( ) to return failure.


If this argument is specified as 0, wildcard characters recognized by the LIKE operator in a WHERE clause will not be escaped; otherwise, they will be. You should only escape wildcards when you are going to be using the escaped string as the right-hand side for the LIKE operator.

#include <stdlib.h>
#include <string.h>
char *spc_escape_sql(const char *input, char quote, int wildcards) {
  char       *out, *ptr;
  const char *c;
  /* If every character in the input needs to be escaped, the resulting string
   * would at most double in size.  Also, include room for the surrounding
   * quotes.
  if (quote != '\'' && quote != '\"') return 0;
  if (!(out = ptr = (char *)malloc(strlen(input) * 2 + 2 + 1))) return 0;
  *ptr++ = quote;
  for (c = input;  *c;  c++) {
    switch (*c) {
      case '\'': case '\"':
        if (quote == *c) *ptr++ = *c;
        *ptr++ = *c;
      case '%': case '_': case '[': case ']':
        if (wildcards) *ptr++ = '\\';
        *ptr++ = *c;
      case '\\': *ptr++ = '\\'; *ptr++ = '\\'; break;
      case '\b': *ptr++ = '\\'; *ptr++ = 'b';  break;
      case '\n': *ptr++ = '\\'; *ptr++ = 'n';  break;
      case '\r': *ptr++ = '\\'; *ptr++ = 'r';  break;
      case '\t': *ptr++ = '\\'; *ptr++ = 't';  break;
        *ptr++ = *c;
  *ptr++ = quote;
  *ptr = 0;
  return out;