C# Core Language Features

Natural and programming languages consist of both simple and complex structures. The complex structures can be decomposed into simple elements. When learning a new natural language, you probably wouldn't start with a review of sentence structure. Instead, you probably would begin with an exploration of nouns, verbs, and the simpler components of the language. For C#, the simpler components are symbols, tokens, white space, punctuators, comments, preprocessor statements, and other elements. This is an excellent place to start when learning C#.

Symbols and Tokens

Symbols and tokens are the basic constituents of the C# language. Sentences are composed of spaces, tabs, and characters. Similarly, C# statements consist of symbols and tokens. Indeed, statements cannot be articulated without an understanding of these basic elements. Table 1-1 provides a list of the C# tokens.

Table 1-1: C# Symbols and Tokens
Description	Symbols or Tokens
White space	Space
Tab	Horizontal_tab, Vertical_tab
Punctuator	. , : ;
Line terminator	carriage_returns
Comment	// /* */ /// /**
Preprocessor directive	#
Block	{}
Generics	< >
Nullable Type	?
Character	Unicode_character
Escape character	\code
Numeric Suffix	f d m u l ul lu
Operator	+, -, >, <, *, ??, () and so on

White Space

White space is defined as a space, horizontal tab, vertical tab, or form feed character. White space characters can be combined: wherever one white space character is permitted, two or more contiguous white space characters can be substituted.

Tabs

Tabs—horizontal and vertical—are white space characters. Refer to the preceding explanation of white space.

Punctuators

Punctuators separate and delimit elements of the C# language. Punctuators include the semicolon (;), dot (.), colon (:), and comma (,), which are discussed in this section.

Semicolon punctuator In a natural language, sentences consist of phrases and clauses and are units of cohesive expression. Sentences are terminated with a period (.). Statements consist of one or more expressions and are the commands of the C# programming language. Statements are terminated by a semicolon (;). C# is a free-form language in which a statement can span multiple lines of source code and start in any position. Conversely, multiple statements can be combined on a single source code line, assuming that each statement is delimited by a semicolon. This statement is not particularly good style, but it is syntactically correct:

int variablea=
          variableb +
                variablec;

Dot punctuator Dot syntax connotes membership. The dot character (.) binds a target to a member, in which the target can be a namespace, type, structure, enumeration, interface, or object. This assumes the member is accessible. Membership is sometimes nested and therefore described with multiple dots.

Dot punctuator:

Target.Member

This is an example of the dot punctuator:

System.Windows.Forms.MessageBox.Show("A nice day!");

Colon punctuator The colon punctuator is used primarily to delimit a label, to describe inheritance, to indicate the implementation of an interface, to set a generic constraint, and as part of the conditional operator. (The conditional operator, the only ternary operator in C#, is reviewed later in this chapter. Inheritance and generic constraints are discussed in later chapters.) Labels are tags where program execution can be transferred. A label is terminated with a colon punctuator. The scope of a label is limited to the containing block and any nested block. Jump to a label with the goto statement.

Label punctuator:

label_identifier: statement

A statement must follow a label, even if it's an empty statement.
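As a minimal sketch of a label and the goto statement, the following code counts upward by jumping back to a label. The LabelDemo class, the CountTo method, and the label names again and done are hypothetical names chosen for illustration. The done label is followed by an empty statement, which satisfies the rule that a statement must follow a label.

```csharp
using System;

class LabelDemo {
    // Counts from 1 up to limit by jumping back to a label instead of looping.
    public static int CountTo(int limit) {
        int count = 0;
    again:                      // label terminated by the colon punctuator
        ++count;
        if (count < limit)
            goto again;         // transfer execution back to the label
        goto done;
    done:
        ;                       // empty statement following the label
        return count;
    }

    static void Main() {
        Console.WriteLine(CountTo(3));   // prints 3
    }
}
```

In everyday code a loop statement is usually clearer than goto; the sketch only shows the syntax.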

Comma punctuator The comma punctuator delimits array indexes, function parameters, types of an inheritance list, statement clauses, and most other lists of C# language elements. The comma punctuator separates statement clauses in the following code:

for(int iBottom=1, iTop=10; iBottom < iTop; ++iBottom, --iTop) {
        Console.WriteLine("{0}x{1} {2}", iBottom, iTop, iBottom*iTop);
}

A statement clause is similar to a sentence phrase or clause, which are also sometimes delimited by commas. A statement clause is a substatement in which multiple statement clauses can be substituted for a single statement. Not all statements are replaceable with statement clauses—check the documentation of the statement to be sure.

Line Terminators

Line terminators separate lines of source code. A carriage-return, line-feed, line-separator, and paragraph-separator are the line terminators of C#. Where one line terminator is inserted, two or more are acceptable. Line terminators can be inserted anywhere white space is allowed. The following code is syntactically incorrect:

int variablea=var
            iableb+variablec;

The line terminator splits the identifier variableb in two. An identifier cannot contain white space, so it cannot contain a line terminator either.

Comments

C# supports four styles of comments: single-line, delimited, single-line documentation, and delimited documentation comments. Comments cannot be nested. Although comments are not mandated, liberal comments are considered good programming style. Self-documenting code and comments in your source code aid in later maintenance. Be kind to the maintenance programmer: comment! In Code Complete, Second Edition (Microsoft Press, 2004), Steve McConnell gives valuable best practices on programming, including how to properly document your source code.

Single-line comments: // Single-line comments start at the symbol and end at the line terminator:

Console.WriteLine(objGreeting.French); // Display Hello (French)

Delimited comments: /* and */ Delimited comments, also called multiline or block comments, are bracketed by the /* and */ symbols. Delimited comments can span multiple lines of source code:

/*
        Class Program: Programmer Donis Marshall
*/
class Program {
    static int Main(string[] args) {
        Greeting objGreeting = new Greeting();
        Console.WriteLine(objGreeting.French); // Display Hello (French)
        return 0;
    }
}

Single-line documentation comments: /// Use documentation comments to apply a consistent format to source code comments. Documentation comments precede types, members, parameters, delegates, enums, and structs; they do not precede namespaces. Documentation comments use XML tags to classify comments. These comments are exportable to an XML file using the documentation generator. The resulting file is called the documentation file, which can be bound to a Visual Studio project to augment the information presented in IntelliSense and the Object Browser.

Single-line documentation comments are automated in the Visual Studio IDE, which makes them more popular than delimited documentation comments. The Visual Studio IDE has Smart Comment Editing, which inserts the comment framework immediately after you type the /// symbol.

The following code snippet shows the previous code after preceding the Main method with a single-line documentation comment ///. From there, Smart Comment Editing completed the remainder of the comment framework, including adding comments and XML tags for the method parameter and return. You only need to add specific comments.

/// <summary>
///
/// </summary>
class Program {
   /// <summary>
   ///
   /// </summary>
   /// <param name="args"></param>
   /// <returns></returns>
   static int Main(string[] args) {
         Greeting objGreeting = new Greeting();
         Console.WriteLine(objGreeting.French);
         return 0;
   }
}

Here are the documentation comments with added details:

/// <summary>
/// Starter class for Simple HelloWorld
/// </summary>
class Program {
   /// <summary>
   /// Program Entry Point
   /// </summary>
   /// <param name="args">Command Line Parameters</param>
   /// <returns>zero</returns>
   static int Main(string[] args) {
         Greeting objGreeting = new Greeting();
         Console.WriteLine(objGreeting.French);
         return 0;
   }
}

The C# compiler is a documentation generator. The /doc compiler option instructs the compiler to generate the documentation file. Alternatively, you can request that the documentation file be generated in the Visual Studio IDE. Select the Properties menu item from the Project menu. From the Properties window, switch to the Build options. In the Build pane (see Figure 1-2), you can activate and enter the name of the XML documentation file.

Figure 1-2: The Build pane of the Project Settings window

Delimited documentation tags Delimited documentation tags can be used instead of the single-line version. Smart Comment Editing is not available with delimited documentation tags. Documentation symbols, XML tags, and comments must be entered manually, which is the primary impediment to using delimited documentation tags. Here is an example of delimited documentation tags:

/**<summary>
Starter class for Simple HelloWorld</summary>
*/

This is the documentation file generated by the C# compiler from the preceding source code:

<?xml version="1.0" ?>
<doc>
<assembly>
    <name>HelloWorld</name>
</assembly>
<members>
    <member name="T:HelloWorld.Program">
        <summary>Starter class for Simple HelloWorld</summary>
    </member>
    <member name="M:HelloWorld.Program.Main(System.String[])">
        <summary />
        <param name="args" />
        <returns />
      </member>
</members>
</doc>

The documentation generator assigns IDs to element names. T is the prefix for a type, whereas M is a prefix for a method. Here's a listing of IDs:

E	Event
F	Field
M	Method
N	Namespace
P	Property
T	Type
!	Error

Preprocessor Directives

Use preprocessor directives to define symbols, include source code, exclude source code, name sections of source code, and set warning and error conditions. The variety of preprocessor directives is limited when compared with C++, and many of the C++ preprocessor directives are not available in C#. There is not a separate preprocessor for preprocessor statements. Preprocessor statements are processed by the normal C# compiler. The term "preprocessor" is used for historical reasons only.

Preprocessor directive:

#command expression

This is a list of preprocessor commands available in C#:

#define	#undef	#if
#else	#elif	#endif
#line	#error	#warning
#region	#endregion	#pragma

The preprocessor symbol and subsequent command are optionally separated with a space, but must be on the same line. For this reason, preprocessor commands can be followed only with a single line comment.

Declarative preprocessor directives The declarative preprocessor directives are #define and #undef, which define and undefine a preprocessor symbol, respectively. Defined symbols are implicitly true, whereas undefined symbols are false. Declarative symbols must be defined in each compilation unit where the symbol is referenced. Undeclared symbols default to undefined and false. The #define and #undef directives must precede any source code. Redundant #define and #undef directives are trivial and have no effect.

Declarative preprocessor directives:

#define identifier
#undef identifier

Conditional preprocessor directives Conditional preprocessor directives are the #if, #else, #elif, and #endif directives, which exclude or include subsequent source code. A sequence of conditional preprocessor directives begins with #if and ends with #endif. The intervening conditional preprocessor directives, #else and #elif, are optional.

Conditional preprocessor directives:

#if boolean_expression
#elif boolean_expression
#else
#endif

The boolean_expression of the #if and #elif directive is a combination of preprocessor symbols and normal Boolean operators. If the boolean_expression is true, the source code immediately after the #if or #elif directive is included in the compilation. If the boolean_expression is false, the source code is hidden. The #else directive can be combined with an #if or #elif directive. If the boolean_expression of #if or #elif is false, the code following the #else is included in the compilation. When true, the source code after the #else is hidden. Here's sample code with preprocessor symbols and related directives:

#define DEBUGGING

using System;

namespace Donis.CSharpBook {
    class Starter{
#if DEBUGGING
        static void OutputLocals() {
              Console.WriteLine("debugging...");
        }
#endif
        static void Main() {
#if DEBUGGING
            OutputLocals();
#endif
        }
    }
}

Finally, the #elif directive essentially nests #if conditional preprocessor directives:

#if expression
    source_code
#elif expression
    source_code
#else
    source_code
#endif
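The following sketch fills in the preceding template with concrete symbols. The symbols TRACE_LEVEL1 and TRACE_LEVEL2 and the ElifDemo class are hypothetical names used only for illustration. Because only TRACE_LEVEL2 is defined, the compiler includes the #elif branch and hides the other two.

```csharp
#define TRACE_LEVEL2

using System;

class ElifDemo {
    // Returns the message selected at compile time by the defined symbols.
    public static string Level() {
#if TRACE_LEVEL1
        return "level 1";
#elif TRACE_LEVEL2
        return "level 2";
#else
        return "tracing off";
#endif
    }

    static void Main() {
        Console.WriteLine(Level());   // prints "level 2"
    }
}
```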

Diagnostic directives Diagnostic directives include the #error, #warning, and #pragma directives. The #error and #warning directives display error and warning messages, respectively. The diagnostic messages are displayed in the Error List window of the Visual Studio IDE. Similar to standard compilation errors, an #error directive prevents the program from compiling; a #warning directive does not prevent the program from successfully compiling. Use conditional directives to conditionally apply diagnostic directives.

Diagnostic directives:

#error error_message
#warning warning_message

The message is optional and consists of the text that follows the directive on the same line.
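Here is a minimal sketch of diagnostic directives guarded by conditional directives. The BETA and RELEASE symbols and the DiagnosticDemo class are hypothetical. With only BETA defined, the compiler reports the warning and still builds the program; defining both symbols would trigger the #error directive and stop the build.

```csharp
#define BETA

using System;

class DiagnosticDemo {
#if BETA && RELEASE
#error BETA and RELEASE cannot both be defined
#elif BETA
#warning This is a beta build
#endif

    // Reports whether the hypothetical BETA symbol is defined in this file.
    public static bool IsBeta() {
#if BETA
        return true;
#else
        return false;
#endif
    }

    static void Main() {
        Console.WriteLine(IsBeta());   // prints True
    }
}
```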

The #pragma directive enables or disables standard compilation warnings.

Pragma directives:

#pragma warning disable warning_list
#pragma warning restore warning_list

The warning_list contains one or more warnings delimited with commas. The status of a warning included in the warning_list remains unchanged until the end of the compilation unit unless altered in a later #pragma directive.

This #pragma directive disables the 219 warning, which is the "variable is assigned but its value is never used" warning:

#pragma warning disable 219
    class Starter{
        static void Main() {
         int variablea=10;
        }
    }
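A narrower variation, sketched below, pairs the disable directive with a restore directive so that the warning is suppressed only for the lines in between. The PragmaDemo class and Compute method are hypothetical names for illustration.

```csharp
using System;

class PragmaDemo {
    public static int Compute() {
#pragma warning disable 219
        int unused = 10;        // warning 219 is suppressed for this line
#pragma warning restore 219
        int result = 5;         // warning 219 is active again from here on
        return result * 2;
    }

    static void Main() {
        Console.WriteLine(Compute());   // prints 10
    }
}
```

Limiting the suppressed span keeps the warning useful in the rest of the compilation unit.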

Region directives Region directives mark sections of source code. The #region directive starts a region, whereas the #endregion directive ends the region. Region directives can be nested. The Visual Studio IDE outlines the source code using region tags. In Visual Studio, you can collapse or expand regions of source code.

Region directives:

#region identity
source_code
#endregion

Line directives Line directives modify the line number reported in subsequent compiler errors and warnings. There are three versions of the line directive.

Line directives:

#line line_number source_filename
#line default
#line hidden

The first #line directive shown renumbers the source code from the location of the directive until the end of the compilation unit is reached or overridden by another #line directive. In the following code, the #line directive resets the current line to 25:

#line 25
static void Main() {
    Console.WriteLine("#line application");
    int variablea=10; // 219 warning
}

The second type of #line directive resets or undoes any previous #line directive. The line number is reset to the natural line number.

The third #line directive is only tangentially related to the line number. This directive does not affect the line number; it hides source code from the debugger. Excluding another #line hidden directive, the source code is hidden until the next #line directive is encountered.
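The first two forms can be sketched together as follows. The file name Generated.cs, the LineDemo class, and the Sum method are hypothetical. Any error or warning on the line after the first directive would be reported as line 200 of Generated.cs, and the #line default directive returns reporting to the real file name and line numbers.

```csharp
using System;

class LineDemo {
    public static int Sum() {
#line 200 "Generated.cs"
        int first = 1;          // diagnostics here would cite line 200 of Generated.cs
#line default
        int second = 2;         // diagnostics here would cite the actual file and line
        return first + second;
    }

    static void Main() {
        Console.WriteLine(Sum());   // prints 3
    }
}
```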

Blocks

Blocks define the scope of a type, where type is a class, struct, or enum. Additionally, members of the type are listed inside the block.

Block:

type typename{ // block
}

Blocks are used as compound statements. Paragraphs join related sentences that convey an extended thought or concept. A statement block combines related statements as a single entity. In this context, a statement block is a compound statement and can contain one or more statements. Each statement of the statement block is delimited by a semicolon. In most circumstances in which a single statement is allowed, a statement block can be substituted. Statement blocks are prevalent as method bodies but are also used with conditional and iterative statements.

The if statement in the following code, which is a conditional statement, controls a single statement. The Console.WriteLine is controlled by the if statement that precedes it, so a statement block is not required.

static void Main() {
    int variablea=5, variableb=10;
    if(((variablea*variableb)%2)==0)
        Console.WriteLine("the product is even");
}

In the modified code, the if statement controls multiple statements, and a statement block is needed. Some would suggest, and I agree, that always using statement blocks with conditional statements is a good practice. This prevents a possible future omission when additional statements are added to the realm of the conditional statement:

static void Main() {
    int variablea=5, variableb=10;
    if(((variablea*variableb)%2)==0) {
        Console.WriteLine("{0} {1}", variablea,
            variableb);
        Console.WriteLine("the product is even");
    }
}
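The omission mentioned above can be sketched as follows. The BlockDemo class and its two methods are hypothetical. In WithoutBlock, a second statement was added under the if statement without braces, so it runs whether or not the condition is true, despite the indentation.

```csharp
using System;

class BlockDemo {
    // Without a block, the if statement controls only the first statement.
    public static int WithoutBlock(bool condition) {
        int count = 0;
        if (condition)
            ++count;
            ++count;            // runs regardless of the condition
        return count;
    }

    // With a block, both statements are controlled by the if statement.
    public static int WithBlock(bool condition) {
        int count = 0;
        if (condition) {
            ++count;
            ++count;
        }
        return count;
    }

    static void Main() {
        Console.WriteLine(WithoutBlock(false));   // prints 1
        Console.WriteLine(WithBlock(false));      // prints 0
    }
}
```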

Generic Types

Generic types are templated types. A type is an abstraction of identity: a car class is an abstraction of a type of car, an employee class is an abstraction of an employee, and a generic type is an abstraction of the specifics of a type.

The NodeInt class partially implements and is an abstraction of a node within a linked list of integers:

class NodeInt {
    public NodeInt(int f_Value, NodeInt f_Previous) {
        m_Value=f_Value;
        m_Previous=f_Previous;
    }

    // Remaining methods

    private int m_Value;
    private NodeInt m_Previous;
}

The Node class is further abstracted when compared with the NodeInt class. The integer specifics of the NodeInt class have been removed. This resulting type could be a linked list of any type:

class Node<T> {
    public Node(T f_Value, Node<T> f_Previous) {

        m_Value=f_Value;
        m_Previous=f_Previous;
    }

    // Remaining methods

    private T m_Value;
    private Node<T> m_Previous;
}

The generics symbols (< >) enclose the type parameter, which is T in the preceding code.
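As a rough sketch of how such a type might be used, the following code supplies different type arguments for T. The Value and Previous properties stand in for the "remaining methods" and are assumptions made for this illustration, as is the GenericDemo class.

```csharp
using System;

class Node<T> {
    public Node(T f_Value, Node<T> f_Previous) {
        m_Value = f_Value;
        m_Previous = f_Previous;
    }

    // Hypothetical accessors added for this sketch.
    public T Value { get { return m_Value; } }
    public Node<T> Previous { get { return m_Previous; } }

    private T m_Value;
    private Node<T> m_Previous;
}

class GenericDemo {
    public static string Describe() {
        Node<int> first = new Node<int>(1, null);       // T is int
        Node<int> second = new Node<int>(2, first);
        Node<string> name = new Node<string>("C#", null); // T is string
        return second.Previous.Value + " " + second.Value + " " + name.Value;
    }

    static void Main() {
        Console.WriteLine(Describe());   // prints "1 2 C#"
    }
}
```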

There is much more about generics later in the book.

Nullable Types

Nullable types are value types that can be assigned a null value. Nullable types provide a consistent mechanism for determining whether a value type is empty (null).

Nullable type:

valuetype? identifier;

Nullable types are discussed in detail later in this chapter.
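Until then, here is a minimal sketch of the syntax. The NullableDemo class and Describe method are hypothetical; HasValue and Value are members of the nullable type.

```csharp
using System;

class NullableDemo {
    // Describes a nullable integer without throwing when it is null.
    public static string Describe(int? value) {
        return value.HasValue ? "value is " + value.Value : "no value";
    }

    static void Main() {
        int? age = null;                  // a value type holding null
        Console.WriteLine(Describe(age)); // prints "no value"
        age = 42;
        Console.WriteLine(Describe(age)); // prints "value is 42"
    }
}
```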

Characters

C# source files contain Unicode characters, which are the most innate of symbols. (Every element, keyword, operator, or identifier in the source file is a composite of Unicode characters.)

Numeric Suffixes

Numeric suffixes cast a literal value to a related type. Literal integer values can have the L, U, UL, and LU suffixes appended to them; literal real values can have the F, D, and M suffixes added. The suffixes are case insensitive. Table 1-2 describes each suffix.

Table 1-2: Description of Suffixes
Description	Type	Suffix
Unsigned integer or unsigned long	uint or ulong	u
Long or unsigned long	long or ulong	l
Unsigned long	ulong	ul or lu
Float	float	f
Double	double	d
Money	decimal	m

When casting a real type using the M suffix, rounding might be required. If so, banker's rounding is used, which rounds to the nearest possible value. If midway between two values, the even number is returned. Gaussian rounding, albeit harder to pronounce, is another name for banker's rounding.

Here is an example:

uint variable=10u;

The next statement causes a compile error. The F suffix makes 456f a float literal, and a float value cannot be implicitly converted to an integral type such as long.

long variable = 456f;
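The effect of each suffix can be checked at run time, as in the following sketch. The SuffixDemo class and TypeOf method are hypothetical. The last two lines use Math.Round, whose default midpoint rule is also banker's rounding; this illustrates the rounding rule itself, not the conversion of a literal.

```csharp
using System;

class SuffixDemo {
    // Returns the .NET type name of a boxed literal.
    public static string TypeOf(object literal) {
        return literal.GetType().Name;
    }

    static void Main() {
        Console.WriteLine(TypeOf(10));     // Int32
        Console.WriteLine(TypeOf(10u));    // UInt32
        Console.WriteLine(TypeOf(10L));    // Int64
        Console.WriteLine(TypeOf(10ul));   // UInt64
        Console.WriteLine(TypeOf(1.5f));   // Single
        Console.WriteLine(TypeOf(1.5d));   // Double
        Console.WriteLine(TypeOf(1.5m));   // Decimal

        // Midway values round to the even neighbor.
        Console.WriteLine(Math.Round(2.5m));   // 2
        Console.WriteLine(Math.Round(3.5m));   // 4
    }
}
```

An uppercase L is easier to distinguish from the digit 1 than a lowercase l, which is why the sketch uses 10L.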

Escape Characters

The escape character provides an alternate means of encoding Unicode characters, especially special characters that are not available on a standard keyboard. Escape sequences can be used as characters within identifiers and elsewhere.

Unicode escape sequences must have four hexadecimal digits and are therefore limited to a single character.

Escape sequence:

\uhexadecimal digit1 digit2 digit3 digit4

Hexadecimal escape sequences define a single Unicode character and contain one to four hexadecimal digits.

Hexadecimal sequence:

\xhexadecimal digit1 digit2 digitn

Table 1-3 shows a list of the predefined escape sequences in C#.

Table 1-3: Predefined Escape Sequences
Simple Escape	Sequence
Single quote	\'
Double quote	\"
Backslash	\\
Null	\0
Alert	\a
Backspace	\b
Form feed	\f
New line	\n
Carriage return	\r
Horizontal tab	\t
Unicode character	\u
Vertical tab	\v
Hexadecimal character(s)	\x

This is an unconventional version of the traditional Hello World program:

class HelloWorld {
    static void Main() {
        Console.Write("\u0048\u0065\u006C\u006C\u006f\n");
        Console.Write("\x77\x6F\x72\x6c\x64\x21\b");
    }
}
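The following sketch compares escaped text with its plain equivalent. The EscapeDemo class and its methods are hypothetical. It also shows a pitfall of the variable-length \x form: the sequence consumes every hexadecimal digit that follows it, up to four, so "\x41B" is one character rather than the letter A followed by B.

```csharp
using System;

class EscapeDemo {
    // Unicode escape sequences produce the same string as plain characters.
    public static bool SameText() {
        return "\u0048\u0065\u006C\u006C\u006F" == "Hello";
    }

    // \x41B is read as the single character U+041B, not "A" followed by "B".
    public static int GreedyLength() {
        return "\x41B".Length;
    }

    // Splitting the literal ends the escape sequence after two digits.
    public static string Separated() {
        return "\x41" + "B";
    }

    static void Main() {
        Console.WriteLine(SameText());       // prints True
        Console.WriteLine(GreedyLength());   // prints 1
        Console.WriteLine(Separated());      // prints AB
    }
}
```

The fixed-length \u form avoids this ambiguity.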

Verbatim Characters

The verbatim character prevents the translation of a string or identifier, where it is treated "as-is." To create a verbatim string or identifier, prefix it with the verbatim character.

A verbatim string is a string literal prefixed with the verbatim character. The characters of the verbatim string, including escape sequences, are not translated. The exception is the quote escape character, which is translated even in a verbatim string. Unlike a normal string, verbatim strings can contain physical line feeds.

Here is a sample verbatim string:

class Verbatim{
    static void Main() {
        string fileLocation=@"c:\datafile.txt";
        Console.WriteLine("File is located at {0}",
            fileLocation);
    }
}
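The remaining rules for verbatim strings can be sketched as follows; the VerbatimDemo class, its methods, and the sample path are hypothetical. A doubled quote inside a verbatim string yields one quote character, and a verbatim string can continue across a physical line break.

```csharp
using System;

class VerbatimDemo {
    // Backslashes are literal in a verbatim string, so no doubling is needed.
    public static bool SamePath() {
        return @"c:\temp\new.txt" == "c:\\temp\\new.txt";
    }

    // Two consecutive quotes inside a verbatim string produce one quote.
    public static string Quoted() {
        return @"She said ""hello""";
    }

    // The line break in the source becomes part of the string.
    public static string TwoLines() {
        return @"first
second";
    }

    static void Main() {
        Console.WriteLine(SamePath());   // prints True
        Console.WriteLine(Quoted());     // prints She said "hello"
        Console.WriteLine(TwoLines());   // prints first and second on separate lines
    }
}
```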

A verbatim identifier is an identifier prefixed with the verbatim character that prevents the identifier from being parsed as a keyword. Although this is of limited usefulness, porting source code from another language—in which the keywords are different—is a circumstance in which verbatim identifiers might be helpful. Otherwise, it is a best practice not to use this language feature because verbatim identifiers make your code less readable and harder to maintain.

This is a partial translation of French to English:

L'espoir is a waking dream.

Can you decipher this sentence? Unless you are fluent in French, the partial translation is ineffectual at best. The original sentence was "L'espoir est un rêve de réveil." The following is an equally unskillful translation, although technically acceptable:

public class ExampleClass {
    public static void Function() {
        int @for = 12;
        MessageBox.Show(@for.ToString());
    }
}

In the preceding code, the for statement is being used as a variable name. Although technically acceptable, it is confusing at best. The for statement is common in C# and many other programming languages. For this reason, most developers would view the for as a statement regardless of the usage, which will inevitably lead to confusion.

Operators

Operators are used in expressions and always return a value. There are three categories of operators: unary, binary, and ternary. The following sections describe most of the operators in C#.

Unary operators Table 1-4 is a list of unary operators.

Unary operators have a single parameter.
Prefix operators are evaluated before the encompassing expression.
Postfix operators are evaluated after the encompassing expression.

Table 1-4: Unary Operators
Operator	Symbol	Sample	Result
Unary Plus	+	variable=+5;	5
Unary Minus	-	variable=-(-10);	10
Boolean Negation	!	variable=!true;	false
Bitwise 1's Complement	~	variable=~((uint)1);	4294967294
Prefix Increment	++	++variable;	11
Prefix Decrement	--	--variable;	10
Postfix Increment	++	variable++;	11
Postfix Decrement	--	variable--;	10
Cast Operator	( )	variable=(int) 123.45;	123
Function Operator	( )	FunctionCall(parameter);	return value
Array Index Operator	[ ]	arrayname[iIndex];	nth element
Global Namespace Qualifier	::

Binary operators This section lists and discusses the use of binary operators.

Binary operators have two operands: a left and right operand.
Integer division truncates the fractional portion of the result.
Bitwise Shift Left (value << bitcount).
Bitwise Shift Right (value >> bitcount).

Table 1-5 details the binary operators.

Table 1-5: Binary Operators
Operator	Symbol	Sample	Result
Assignment	=	variable=10;	10
Binary Plus	+	variable=variable + 5;	15
Binary Minus	-	variable=variable - 10;	5
Multiplication	*	variable=variable * 5;	25
Division	/	variable=variable / 5;	5
Modulus	%	variable=variable % 3;	2
Bitwise And	&	variable=5 & 3;	1
Bitwise Or	|	variable=5 | 3;	7
Bitwise XOR	^	variable=5 ^ 3;	6
Bitwise Shift Left	<<	variable=5 << 3;	40
Bitwise Shift Right	>>	variable=5 >> 1;	2
Null Coalescing	??	variableb=variablea??5	2
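A few of these operators are sketched below with hypothetical names (BinaryDemo, Divide, Coalesce). The sketch shows that integer division discards the fractional portion and that the null coalescing operator returns its right operand only when the left operand is null.

```csharp
using System;

class BinaryDemo {
    // Integer division discards the fractional portion of the quotient.
    public static int Divide(int left, int right) {
        return left / right;
    }

    // Returns variablea when it has a value; otherwise returns 5.
    public static int Coalesce(int? variablea) {
        return variablea ?? 5;
    }

    static void Main() {
        Console.WriteLine(Divide(7, 2));     // prints 3
        Console.WriteLine(Divide(-7, 2));    // prints -3
        Console.WriteLine(7 % 2);            // prints 1
        Console.WriteLine(5 << 3);           // prints 40
        Console.WriteLine(5 >> 1);           // prints 2
        Console.WriteLine(Coalesce(null));   // prints 5
        Console.WriteLine(Coalesce(2));      // prints 2
    }
}
```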

Compound operators Compound operators combine an assignment and another operator. If the expanded expression is 'variable=variable operator value', the compound operator is 'variable operator= value'.

variable=variable+5;

The preceding expanded operation is equivalent to this compound operation:

variable+=5;

Compound operations are a shortcut and are never required in lieu of the expanded operation. Table 1-6 lists the compound operators.

Table 1-6: List of Compound Operators
Operator	Symbol	Sample
Addition Assignment	+=	variable+=5;
Subtraction Assignment	-=	variable-=10;
Multiplication Assignment	*=	variable*=5;
Division Assignment	/=	variable/=5;
Modulus Assignment	%=	variable%=3;
And Assignment	&=	variable&=3;
Or Assignment	|=	variable|=3;
XOR Assignment	^=	variable^=3;
Left-Shift Assignment	<<=	variable<<=3;
Right-Shift Assignment	>>=	variable>>=1;

Boolean operators Boolean expressions evaluate to true or false. The integer values of nonzero and zero cannot be substituted for a Boolean true or false.

There are two versions of the logical And and Or operators. The && and || operators support short-circuiting, whereas & and | do not. What is short-circuiting? If the result of the expression can be determined with the left side, the right side is not evaluated. Without disciplined coding practices, short-circuiting might cause unexpected side effects.

This is an example of possible short-circuiting:

if(FunctionA() && FunctionB()) {

}

In the preceding code, assuming that FunctionA returns false, the entire expression evaluates to false. Therefore, the expression short-circuits and FunctionB is not invoked.
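The difference between the two forms can be observed by counting calls, as in this sketch. The ShortCircuitDemo class, the calls counter, and the method bodies are hypothetical; FunctionA and FunctionB stand in for the functions in the preceding code.

```csharp
using System;

class ShortCircuitDemo {
    static int calls;

    static bool FunctionA() { ++calls; return false; }
    static bool FunctionB() { ++calls; return true; }

    // && stops after FunctionA returns false, so only one call is made.
    public static int CallsWithShortCircuit() {
        calls = 0;
        if (FunctionA() && FunctionB()) { }
        return calls;
    }

    // & always evaluates both operands, so two calls are made.
    public static int CallsWithoutShortCircuit() {
        calls = 0;
        if (FunctionA() & FunctionB()) { }
        return calls;
    }

    static void Main() {
        Console.WriteLine(CallsWithShortCircuit());      // prints 1
        Console.WriteLine(CallsWithoutShortCircuit());   // prints 2
    }
}
```

If FunctionB has a side effect that the program depends on, the short-circuiting form silently skips it, which is the kind of surprise described above.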

Table 1-7 shows the Boolean operators.

Table 1-7: List of Boolean Operators
Operator	Symbol
Equals	==
Not Equal	!=
Less Than	<
Greater Than	>
And (Short Circuiting)	&&
Or (Short Circuiting)	||
And	&
Or	|
Less Than or Equal	<=
Greater Than or Equal	>=
Logical XOR	^

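Ternary operators The conditional operator is the sole ternary operator in C# and is an abbreviated if else statement that yields a value.

Conditional operator:

boolean_expression?true_expression:false_expression

Both result operands must be expressions that produce a value, so a call to a method that returns void, such as Console.WriteLine, cannot be used as an operand. This is the conditional operator in source code:

Console.WriteLine(variable>5?">5":"<= 5");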

Pointer operators Pointer operators are available in unsafe mode and support conventional pointers. The /unsafe compiler option builds a program in unsafe mode. Alternatively, in Visual Studio IDE, set the Allow Unsafe Mode option on the Build Page of Project Settings. Table 1-8 includes the pointer operators.

Table 1-8: List of Pointer Operators
Operator	Symbol	Description
Asterisk Operator¹	*	Declare a pointer
Asterisk Operator²	*	Dereference a pointer
Ampersand Operator	&	Obtain an address
Arrow Operator	->	Dereference a pointer and member access

Here is some sample code using pointers:

static void Main(string[] args)
{
   unsafe {
         int variable = 10;
         int* pVariable = &variable;
         Console.WriteLine("Value at address is {0}.",
               *pVariable);
   }
}

The above code must be compiled with the unsafe compiler option on. A more extensive review of pointers is presented later in the book.

Identifiers

An identifier is the name of a C# entity, which includes type, method, property, field, and other names. Identifiers can contain Unicode characters, escape character sequences, and underscores. A verbatim identifier is prefixed with the verbatim character (as discussed earlier in this chapter).

Keywords

One of the strengths of C# is that the language offers relatively few keywords. C# keywords represent the verbs, nouns, and adjectives of the language. The nouns of C# are instances of classes, structs, interfaces, delegates, and namespaces. Verbs imply an action. The goto, for, while, and similar keywords have that role in C#. The adjectives, including the public, private, protected, and static keywords, are modifiers of the C# nouns.

Table 1-9 is an overview of the C# keywords. Extended explanations of each keyword are provided in context at the appropriate location in the book.

Table 1-9: Overview of C# Keywords with Explanations

Keyword	Syntax	Explanation
abstract¹	abstract class identifier	The class identifier cannot be instantiated.
abstract²	abstract return identifier	The method identifier is not implemented in the current class.

identifier as type