9.3 Ensuring Input Safety

One of the most important rules for accepting user input is that all user input should be considered dangerous until proven otherwise. Why? For one thing, in a web-based application, it is easy for a malicious user to enter script commands into a text box. If the application later displays that input without encoding it, the script executes in the browser of whoever views the page. This is commonly known as a cross-site scripting attack.

9.3.1 Request Validation

To address the majority of input-safety problems, the ASP.NET team added a feature called request validation to Version 1.1 of the .NET Framework. Request validation, which is enabled by default in ASP.NET 1.1, automatically checks input in the Request object (form fields, query strings, and cookies) for HTML markup and other potentially dangerous content, and raises an exception (HttpRequestValidationException) if such content is found.

You should never turn off request validation unless you need to allow users to provide HTML input and you have provided your own filtering or input-checking logic. When you write that logic, accept only the input you expect and reject everything else. If you attempt to filter out only known dangerous content, you will almost certainly miss something.

Request validation is enabled at the machine level through the validateRequest attribute of the <pages> element in machine.config. You can disable request validation at the application level by adding a <pages> element to the application's web.config file with the validateRequest attribute set to false. You can disable request validation at the page level by adding the validateRequest attribute to the @ Page directive, with the value set to false.
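As a minimal sketch of the application-level setting described above, a web.config file might contain the following fragment. Only the <pages> element and its validateRequest attribute come from the text; the surrounding elements are the standard configuration skeleton, and a real file would normally hold other settings as well.

```xml
<!-- web.config: turns request validation off for the whole application. -->
<!-- Do this only if you supply your own input filtering. -->
<configuration>
  <system.web>
    <pages validateRequest="false" />
  </system.web>
</configuration>
```

The page-level equivalent is a single attribute on the directive at the top of the .aspx file, for example <%@ Page Language="VB" validateRequest="false" %>. Turning validation off for the one page that needs HTML input leaves the rest of the application protected.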

9.3.2 Other Filtering/Prevention Techniques

If you want to allow HTML input in some parts of your application (or parts of a page), but still want to protect against script attacks, here are a couple of techniques you can use.

9.3.2.1 Regular expressions

The RegularExpressionValidator control accepts only input that matches a given regular expression and rejects everything else. In the following code snippet, the only input allowed is <i> and <b> tags, whitespace, word characters (letters, digits, and the underscore), and the following punctuation: ?!,.'"

<asp:TextBox id="TextBox1" runat="server"/>
<asp:RegularExpressionValidator runat="server" 
   ErrorMessage="Invalid Input Found!"
   ValidationExpression="^([\s\w\?\!\,\.\'\&quot;]|(</?(i|I|b|B)>))*$"
   ControlToValidate="TextBox1"/>

All other input will cause validation to fail.

A good source of useful regular expressions for validating various types of input is http://www.regexlib.com/.

9.3.2.2 HTML encoding

Another technique for filtering input is to HTML-encode all input (and/or all output), and then use the String.Replace method to allow specific HTML content by replacing each encoded value with its unencoded version. This snippet shows how:

Dim InputString As String = Server.HtmlEncode(TextBox1.Text)
InputString = InputString.Replace("&lt;b&gt;", "<b>")
InputString = InputString.Replace("&lt;B&gt;", "<B>")
InputString = InputString.Replace("&lt;/b&gt;", "</b>")
InputString = InputString.Replace("&lt;/B&gt;", "</B>")
InputString = InputString.Replace("&lt;i&gt;", "<i>")
InputString = InputString.Replace("&lt;I&gt;", "<I>")
InputString = InputString.Replace("&lt;/i&gt;", "</i>")
InputString = InputString.Replace("&lt;/I&gt;", "</I>")

Like the RegularExpressionValidator code snippet, this code allows the use of the <b> and <i> tags. In this case, all other tags remain encoded, so the browser displays them as text rather than interpreting them. Note that extensive string manipulation can be expensive from a performance standpoint, so where possible, using regular expressions may be more efficient.
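The eight Replace calls above can also be collapsed into a single regular-expression replacement. The following is a sketch only: the module and function names are hypothetical, and it uses HttpUtility.HtmlEncode rather than Server.HtmlEncode so that it does not depend on a page context.

```vb
Imports System.Text.RegularExpressions
Imports System.Web

' Hypothetical helper module; the names are illustrative, not from the text.
Public Module InputFilter

    ' Encodes all input, then restores only <b>, </b>, <i>, and </i>
    ' (in either case). Every other tag stays encoded.
    Public Function EncodeAllowingBoldItalic(ByVal input As String) As String
        If input Is Nothing Then
            Return String.Empty
        End If

        Dim encoded As String = HttpUtility.HtmlEncode(input)

        ' Group 1 captures the optional slash, group 2 the tag name.
        Return Regex.Replace(encoded, "&lt;(/?)(b|i)&gt;", "<$1$2>", _
            RegexOptions.IgnoreCase)
    End Function

End Module
```

With this helper, input such as <b>Hi</b><script> comes back with the bold tags intact and the script tag still encoded as &lt;script&gt;. Whether one regular expression is actually faster than the chained Replace calls depends on the input, so measure before relying on it.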

Certain HTML tags, such as the <img> tag, allow script in their attributes (for example, in an event-handler attribute such as onerror). If you allow these tags, you will need to perform additional filtering to ensure that script is not passed in with them.

9.3.3 SQL Injection

Another potential input problem occurs when developers use input from users to create SQL queries dynamically. If the developer does not check the input before concatenating it into the SQL string, attackers may append a second full query to their input. Depending on the account under which the SQL query runs, this can allow them to access other databases, grant themselves privileges, and so on.

Fortunately, it is very easy to prevent SQL injection attacks. All you need to do is avoid creating SQL queries using string concatenation. Instead, you should use stored procedures and/or parameterized queries, which allow you to limit both the type and the length of the data provided for a given parameter.
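A parameterized query might look something like the following sketch. The module and function names, the connection string argument, and the table and column names (a Northwind-style Customers table with a five-character CustomerID) are all assumptions made for illustration; substitute your own schema.

```vb
Imports System.Data
Imports System.Data.SqlClient

' Hypothetical data-access helper; names and schema are assumptions.
Public Module CustomerLookup

    Public Function GetCompanyName(ByVal connectionString As String, _
                                   ByVal customerId As String) As String
        ' The user's input never becomes part of the SQL text.
        ' It travels separately as the value of @CustomerID.
        Dim sql As String = _
            "SELECT CompanyName FROM Customers WHERE CustomerID = @CustomerID"

        Dim conn As New SqlConnection(connectionString)
        Dim cmd As New SqlCommand(sql, conn)

        ' Declaring the parameter fixes both its type and its maximum length.
        cmd.Parameters.Add("@CustomerID", SqlDbType.NChar, 5).Value = customerId

        Try
            conn.Open()
            Dim result As Object = cmd.ExecuteScalar()
            If result Is Nothing OrElse result Is DBNull.Value Then
                Return Nothing
            End If
            Return CStr(result)
        Finally
            conn.Close()
        End Try
    End Function

End Module
```

Because the value is bound to a typed parameter, input such as ALFKI'; DROP TABLE Customers-- is treated as data to compare against CustomerID, not as SQL to execute. A stored procedure would be called the same way, with CommandType set to CommandType.StoredProcedure and the procedure name in place of the SELECT statement.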