Testing Terminal Server Environments

Comprehensive testing before launching a complex terminal server environment often helps avoid unpleasant surprises. So how are these tests performed, and what are the key parameters for a test to succeed? These questions must be seen against the fact that testing is usually one of the more unpopular tasks involved in setting up terminal server environments. Technically minded people tend to show little enthusiasm for it. Reasons for the lack of enthusiasm include the following:

  • Really meaningful tests, including thorough evaluation, usually require considerable time and effort and are thus often reduced to a minimum due to budget restrictions.

  • Tests are normally performed at the end of a project. If the previous project phases overrun, tests are cut short or eliminated to meet the overall project deadline.

  • There are only a few test tools that are suitable for complex terminal server environments.

  • Performing a meaningful test is only half of the work. Evaluation and documentation are also important, and they require special skills on the part of the test engineers in charge.

  • Conscientious test engineers may reveal system bottlenecks or even mistakes in the terminal server infrastructure to consultants, developers, or system integrators, which then results in additional (troubleshooting) work. Although this sounds rather harmless, testing can be social dynamite in complex projects.

  • According to general opinion, test engineers do not generate creative results but instead examine the results of technical creativity using formal methods.

To cut a long story short, testing tends to be viewed with reluctance because test engineers are not very well liked, they often do not have the right tools, they do not generate “creative” results, and their work is very expensive. These are common prejudices, although they are definitely unjustified!

So why are tests needed at all? Couldn’t we just do without them? As you were able to see in Chapter 5, without meaningful tests it is very difficult to predict a terminal server environment’s resource requirements. Furthermore, you can easily be accused of negligence in a project where no tests have been done to secure the results. For this reason, tests are indispensable in larger environments—especially where time and money are in limited supply.


Testing is a discipline that is reminiscent of classic natural sciences. A test often cannot incorporate an entire target environment, so strong abstraction is necessary. Some tests, by their very nature, change the observed object so fundamentally that general statements are no longer possible. Sometimes, marginal parameters (for example, exact user behavior) are not known or can be mapped only roughly in statistical terms. Because no test environment is exactly like the target environment, tests are often performed with individual components. This, however, requires the components to be prioritized. The key rule here is: do it properly or don’t do it at all. To obtain meaningful test results, all of the following tasks must be performed: measuring, creating statistics, documenting, evaluating, interpreting, and communicating.

Test Criteria

To perform a successful test, you first need to define the objectives. The first step is to find out whether terminal servers are the right solution for the company:

  • Can all important applications be provided on terminal servers?

  • How many users will be affected by a possible migration to terminal servers?

  • What are the requirements relating to the terminal server environment’s stability, availability, and scalability?

  • How will the company and employees be affected if a terminal server fails?

  • How easy will it be to increase system resources, if required?

These questions need to be answered first with respect to the business processes within the company. If the answer turns out in favor of terminal servers, the same questions need to be answered with a view to the technical aspects. Tests are, of course, required to back up the entire line of argument. After all, the strategic introduction of terminal servers is quite expensive and is no easy task from a technical point of view.

This is exactly where the problems start: how to test? It is almost impossible to install a reference environment that corresponds to the planned target environment. Furthermore, it is usually counterproductive if the future users are asked to participate in internal beta testing. For these reasons, it is recommended that an independent test environment be defined and a simulation be performed that includes all suitable methods and tools. Many of the potential problems can be discovered at this early stage. Only in relatively small and undemanding environments is it possible to perform the relevant tests directly in a production environment, and only for a limited period of time.

Evaluating the terminal servers in a test environment helps make concrete statements on the expected operating conditions and limiting values. When evaluating the environment, several criteria must be taken into account:

  • Performance: Performance tests are used to evaluate an environment’s individual areas in terms of speed. A test tool must therefore be able to simulate many simultaneous accesses to determine the respective response times of each area. The results can be integrated in the corresponding service level agreements. With this type of test, it is, for example, possible to determine the dialog change time of a terminal server application under different conditions.

  • Load: A load test subjects the environment to the kind of access and usage rate expected in routine operation. This requires the test to define the maximum response time for the individual system components and to align the response time with the existing state of technical implementation. The corresponding measurements need realistic load samples simulated by the test tool. The results show whether the response times are within the expected limits.

  • Stress: Stress tests are simulated, mostly benign attacks that generate excessive loads in an environment. In reality, this type of test corresponds to a peak load that could be the result of certain conditions, such as the start of work, end of a quarter, or successful marketing activities. The main goal of a stress test is to find out when an environment starts generating errors and whether it reverts to normal after extreme stress.

  • Endurance: An endurance test subjects an environment to a predefined load for some considerable time. During this period, the test examines whether certain parameters change significantly, for instance memory requirement, processor load, and response times. The results allow statements to be made on the quality of the programs and components used. This is especially important for an environment’s long-term operation.

  • Scalability: The term scalability does not say much about an environment’s response behavior. It does, however, describe its behavior in relation to access times when the number of users is increased. A scalability test therefore delivers the values required for extending an environment and guaranteeing constant service quality with increasing numbers of users. The individual components are examined and evaluated as to their ability to be scaled up (that is, the relevant platform is updated) and scaled out (that is, the number of relevant platforms is increased).
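The endurance criterion in the list above asks whether a parameter such as memory requirement changes significantly over time. One simple way to quantify this, offered here only as a sketch, is the least-squares slope of the measured values. The function name drift_per_hour and the sample readings are assumptions made for illustration; they do not come from any of the tools discussed in this chapter.

```python
from statistics import mean


def drift_per_hour(samples):
    """Return the least-squares slope of (hours, value) samples.

    A slope near zero suggests a stable parameter; a clearly positive
    slope for memory usage may hint at a leak.
    """
    xs = [t for t, _ in samples]
    ys = [v for _, v in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom == 0:
        raise ValueError("need samples taken at two or more distinct times")
    numer = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return numer / denom


# Hypothetical readings: (hours since test start, memory in use in MB).
memory_mb = [(0, 512.0), (1, 530.0), (2, 548.0), (3, 566.0)]
slope = drift_per_hour(memory_mb)
print(f"memory drift: {slope:.1f} MB per hour")
```

Whether a given slope counts as significant depends on the environment and the length of the test, so any threshold should be set per project.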

Security is another criterion, but we will not cover this in detail here. Special tests help observe terminal server behavior in case of attacks on the security system. The basics are described in detail in Chapter 8.


If there is not an expert in your company with sufficient knowledge of Windows Server 2003 and networks, testing and productive operation of Terminal Services will be problematic. A terminal server requires, at least in the beginning, a high degree of competent maintenance, which takes a lot of time and effort.

Measurement Methods and Test Tools

In most cases, there are not enough “real” users for tests to take place under controlled conditions. Moreover, scenarios with test users are usually not really reproducible because the individual users behave differently from test to test. Therefore, typical user actions need to be simulated, which requires precise knowledge of user behavior and the type of application. However, it quickly becomes evident that only simple user activity can be simulated because complex behavior modeling takes too much time and effort.

In terms of automated and thus reproducible terminal server environment tests, it is necessary to distinguish between different concepts that relate to the location of the test logic. The test logic can either run on the server or be client based. Modeling the temporal processes of graphical user interaction is problematic due to the communication protocols, RDP and ICA, being stateful. This is why a terminal server test differs significantly from a Web server test with the stateless HTTP protocol.

If the logic for a terminal server test is set up on clients, the following constellation results: a script is able to simulate all user actions on an RDP or an ICA client. A suitable runtime environment that can perform mouse clicks and keyboard strokes like a human user needs to be installed on each client platform. The graphical display of applications is, however, only an image of the graphical application elements and does not represent the application elements themselves. The latter are created and managed on the terminal server only.

It is, of course, desirable that the test clients be controlled centrally and that multiple simultaneous instances of RDP or ICA clients be executed per client platform. In the best case scenario, the terminal server does not even “notice” that it is not a real user but a script logic requesting a session and interacting with the applications.

How to handle time and space requirements for the simulated user interaction is a problematic issue. Considerations include whether the script is able to “click” on the correct screen element at the right time or to enter a text into a dialog box only when the box actually appears on the screen. Graphical application elements that are not always displayed at fixed screen coordinates and increasing time delays for the graphical output when the server load increases often take scripts on the clients to their limits. This, in turn, results in a high number of failed simulated user sessions that freeze and thus distort the entire result.
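The timing problem described above is commonly handled by polling for a condition with a timeout instead of sending input after a fixed delay. The following is a minimal sketch of that idea. The callbacks dialog_visible and type_text are hypothetical placeholders for whatever the test runtime provides; they are not part of any RDP or ICA client API.

```python
import time


def wait_until(condition, timeout_s=30.0, poll_s=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


def run_step(dialog_visible, type_text, timeout_s=60.0, poll_s=0.5):
    """Enter text only once the dialog has appeared; report a failure otherwise."""
    if not wait_until(dialog_visible, timeout_s=timeout_s, poll_s=poll_s):
        return "failed: dialog did not appear"
    type_text("hello")
    return "ok"
```

Reporting a timed-out step as a failure, rather than letting the session hang, keeps one slow session from distorting the overall result unnoticed.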

The most successful client-based test approaches involve special RDP or ICA clients. They work at the presentation layer of an application that is executed on a terminal server. This requires either programming a modified client, with massive support from Microsoft or Citrix, or building a test runtime environment that can control most of a normal client. Both options involve considerable effort in developing the respective test environments.

An alternative to the client-side test is to use tools that automate user sessions on the server. This shifts the test activities from the presentation layer to the application layer. Macro tools for automating user actions are the best-known members of this product category. The problem with macro tools, however, is that they change the server constellation: the macro tool itself is an auxiliary program that needs to run during all tests and thus distorts the result. Furthermore, events on the client play no role at all, which does not represent “normal” terminal server environment behavior.

Standardized benchmark tests that physically run on the terminal server are another option. However, for them to be displayed through several clients, they generally need to be launched manually in different terminal server sessions. These tests often contain popular application programs or representative screen output to model reality as closely as possible (such as Ziff-Davis benchmarks). Still, they are not really suitable for objective and individualized testing of a terminal server environment.

The continuous generation of processor load and the targeted occupation of memory resources is another way of testing terminal server performance. This procedure often helps obtain a first impression of the terminal server’s behavior when using standard applications under different conditions.
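As a rough illustration of this approach, the sketch below occupies a block of memory and keeps a few busy loops running for a fixed time. All names (burn_cpu, occupy_memory, generate_load) are assumptions made for this example. Note that threads in the standard CPython interpreter share one interpreter lock, so this sketch saturates roughly one processor core regardless of the worker count.

```python
import threading
import time


def burn_cpu(seconds):
    """Busy-loop until the deadline and return the number of iterations."""
    deadline = time.perf_counter() + seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterations += 1
    return iterations


def occupy_memory(megabytes):
    """Allocate a buffer and touch every 4 KB so the memory is really used."""
    buf = bytearray(megabytes * 1024 * 1024)
    for offset in range(0, len(buf), 4096):
        buf[offset] = 1
    return buf


def generate_load(seconds, workers=2, megabytes=16):
    """Hold the memory while the worker threads generate processor load."""
    held = occupy_memory(megabytes)  # reference kept until the function returns
    threads = [threading.Thread(target=burn_cpu, args=(seconds,))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(held)
```

A call such as generate_load(60, workers=4, megabytes=256) could run inside a test session while standard applications are observed in other sessions.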

There are a number of products and tools available for the options described here, but using them professionally usually entails relatively high expense. They all have one thing in common, though: they all extensively utilize the Windows Server 2003 System Monitor performance indicators for evaluations. The relevant performance indicators for terminal servers are described in detail in Chapter 4.


Integrated tests that include access to terminal server environments through a Web interface, as described in Chapter 12 and Chapter 13, pose real problems. If you additionally want to include logon via certificates and virtual private networks in your tests, it will be even more difficult to obtain the relevant commercial test tools.

Objective assessment of the test results can be supported by the following methods:

  • Measuring time using a stop watch or the system timers

  • Comparing the results of the standardized load tests with reference environments

  • Observing system activities with the System Monitor and the Network Monitor

  • Evaluating the test results using statistical and analysis tools

Subsequently, the raw measurement data must be brought into a structured form to determine the utilization of all system resources. In this way, the decisive factors can be identified and analyzed to work out the key indicators of satisfactory system performance. Expensive test environments usually include powerful automated interpretation functions. These need to be set up manually when using the more cost-effective tools.
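When the interpretation has to be set up manually, a small summary routine is often a reasonable starting point. The sketch below condenses a list of response times into a few key figures. The function name summarize, the choice of the nearest-rank 95th percentile, and the sample values are illustrative assumptions, not measurements from a real environment.

```python
import math
from statistics import mean, median


def summarize(samples_ms):
    """Condense raw response times (in milliseconds) into key figures."""
    ordered = sorted(samples_ms)
    if not ordered:
        raise ValueError("no samples")
    # Nearest-rank method: smallest value with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "count": len(ordered),
        "mean": mean(ordered),
        "median": median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


# Hypothetical response times collected during one test run.
response_ms = [120, 135, 150, 110, 400, 125, 130, 140, 145, 115]
print(summarize(response_ms))
```

Reporting the median and a high percentile alongside the mean helps keep a single outlier, such as the 400 ms value here, from hiding the typical behavior.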

The goal after interpreting all test results is to be able to make an objective statement on the behavior of terminal server environments under set load conditions, clearly defining the limits pertaining to memory, processors, hard drive systems, and network. Furthermore, it should be possible to give an estimate of the maximum number of simultaneous terminal server users and the scalability of a load-balanced server farm.

Mercury Interactive

A rather expensive but very popular test tool is LoadRunner by Mercury Interactive (http://www.mercuryinteractive.com). Its functions were extended especially for use on MetaFrame servers, and this version is named LoadRunner for Citrix.

The test logic of LoadRunner for Citrix is on the client side. A specially adapted ICA client allows the use of ICA functions for controlling tests via scripts. An external administration console centrally controls the clients. Powerful script processing tools facilitate highly realistic simulation of user sessions. The scripts support the dynamic import of server names, user names, passwords, domains, applications, and window parameters. The interaction of the simulated users comprises text entries and mouse events. The user session’s synchronization with the control script is based on the window name and the content of graphical elements.

LoadRunner’s particular strength lies in its options for evaluating MetaFrame server tests. The relevant performance indicators, which can be displayed in many correlations and statistical views, allow detailed statements on the test results to be made.


Tevron

CitraTest by Tevron (http://www.tevron.com) is a purely client-side test environment, like LoadRunner for Citrix by Mercury Interactive. It is based on image comparison algorithms that are able to identify predefined bitmap images within an ICA client. The relevant script logic is coded in Microsoft Visual Basic 6 and determines the behavior when recognizing expected image patterns. In line with the predefined logic for finding the expected graphical application elements, the script generates mouse and keyboard events. A tool supplied with CitraTest records, saves, and, if necessary, modifies the reference images for pattern recognition.

Because an application’s graphical output elements might differ depending on the user, CitraTest is able to handle variations in colors and character strings. Furthermore, integrated algorithms for optical character recognition (OCR) allow text output to be read and interpreted. The tests are evaluated based on the measurement of the response times caused by mouse and keyboard events on the client.
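The general idea of locating a reference bitmap on the screen while tolerating small color variations can be sketched as follows. This is a naive exhaustive search over grayscale values written purely for illustration. It is not CitraTest’s algorithm, and the function name find_pattern and the pixel values are assumptions.

```python
def find_pattern(screen, pattern, tolerance=0):
    """Return (row, col) of the first match of pattern in screen, or None.

    Both arguments are lists of rows of grayscale values (0 to 255).
    A pixel matches if it differs from the reference by at most tolerance.
    """
    pattern_h, pattern_w = len(pattern), len(pattern[0])
    screen_h, screen_w = len(screen), len(screen[0])
    for row in range(screen_h - pattern_h + 1):
        for col in range(screen_w - pattern_w + 1):
            if all(abs(screen[row + i][col + j] - pattern[i][j]) <= tolerance
                   for i in range(pattern_h)
                   for j in range(pattern_w)):
                return (row, col)
    return None


# Hypothetical 4x5 screen capture and a 2x2 reference image.
screen = [
    [0, 0, 0, 0, 0],
    [0, 0, 200, 90, 0],
    [0, 0, 95, 205, 0],
    [0, 0, 0, 0, 0],
]
reference = [[200, 100], [100, 200]]
print(find_pattern(screen, reference, tolerance=10))
```

With a tolerance of 10 the reference is found at row 1, column 2; with a tolerance of 0 the same search finds nothing, which shows why some tolerance is needed when output varies slightly between users.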

Scapa Technologies

StressTest for Thin Client by Scapa Technologies (http://www.scapatech.com) is based on an architecture that needs components on both the terminal server and the clients. The relatively thin server component starts and controls the execution of test scripts and coordinates this with the client interfaces. Communication with the relevant client component is integrated in the RDP or ICA communication protocols.

The Scapa StressTest client component allows the import of different standard scripts for coding the test logic. This includes process control, support of different user names, passwords, and dialog box entries. A separate component handles the controlling of test clients. It executes, controls, and evaluates the relevant scripts. Scapa StressTest is compatible with script tools such as Wilson WindowWare WinBatch or TaskWare WinTask.

A Scapa StressTest specialty is the simultaneous support of the ICA and RDP communication protocols. Scapa StressTest also comprises functions that allow it to cooperate with Canaveral iQ by New Moon Systems, SoftGrid for Terminal Servers by Softricity, and Secure Access Manager by Citrix.

Script and Macro Tools

Script and macro tools help create automated process controls for user interfaces and Windows-based applications. The results can either be used as basic material for test environments such as Scapa StressTest or can be used directly for server-side application tests. The following list contains an overview of the most popular script and macro tools used on terminal servers:

  • MacroExpress by Insight Software Solutions (http://www.macroexpress.com) is a macro tool developed to support users in automating repeated tasks. MacroExpress includes hundreds of commands and the option to program responses to system events.

  • Macro ToolsWorks by Pitrinec Software (http://www.pitrinec.com) is another common macro tool. It includes its own macro language with more than 150 commands, a graphical development environment, and several elaborate options to control the timing of the developed macros.

  • WinBatch by Wilson WindowWare (http://www.winbatch.com) is a script and macro tool with a recording component and its own development environment.

  • WinTask by TaskWare (http://www.wintask.com) is another script and macro tool whose programming language is very similar to Visual Basic. It includes a recording component for keyboard strokes, mouse movement, and Windows functions, which facilitates the creation of scripts. In addition, WinTask contains functions to control program timings, which make it highly suitable for maintenance tasks.

  • AutoIt by Hiddensoft (http://www.hiddensoft.com/autoit) is a simple but free script tool.


Citrix Server Test Kit

The Citrix Server Test Kit (CSTK) is an automated tool that a Citrix server administrator can use to configure and execute different load tests. It creates consistent and repeatable loads on different system configurations by using scripts that simulate application access without requiring user interaction. It also allows several sessions to be started from a single client. The test is controlled through a server console.


The Citrix Server Test Kit is free and can be ordered from the Citrix Developer Network Web site (http://www.citrix.com/cdn).

Once the CSTK is installed on a MetaFrame server, the CSTK client, the CSTK console, the System Monitor, and the documentation are all located in one program group. The CSTK client is also included in each user’s auto-start group. For this reason, the CSTK is not suited for testing in production environments. When the CSTK console is started on the MetaFrame server, the simulation scripts can be imported on the server. These scripts have previously been generated with one of the script and macro tools mentioned in the previous section. Each script can be assigned to a normal user or a power user. The former user category can start individual scripts successively, while the latter can execute several scripts simultaneously.

When all test users have been set up with the CSTK console and all relevant test scripts have been set, the test can begin. Initially, it is started through the CSTK console, but at least one user must log on from an ICA client to start the test process. A special CSTK client program (Cstklaun.exe), located in the CSTK installation directory, enables the simultaneous start of several test user sessions on one ICA client platform.

Analyzing the results of a CSTK test is not easy. First of all, you need the results from the System Monitor. System Monitor collects all relevant data during the entire test on the MetaFrame server. This activity itself consumes quite a significant amount of system resources. Furthermore, each CSTK client instance that is executed in each test user session requires more than 2 MB of memory. This is why the CSTK can supply only a rough estimate of the performance situation on a MetaFrame server.

Microsoft Test Tools

The CD that comes with the Windows Server 2003 Technical Reference contains various tools that can be used for different tests on terminal servers.


Consume

Combined with various test scenarios, the Consume.exe command-line tool simulates a resource bottleneck. With this tool, it is possible to occupy physical memory, swap file, hard drive, processor, or kernel pool in a targeted manner. This helps developers and administrators estimate how applications might behave in many extreme situations.

The Consume.exe command-line syntax, shown here, is explained in Table 11.1.

Consume {-physical-memory | -page-file | -disk-space | -cpu-time | -kernel-pool} [-time seconds]

Table 11.1: Consume.exe Command-Line Parameters

Parameter           Description

-physical-memory    Consumes as much physical memory as possible.

-page-file          Consumes the swap file.

-disk-space         Consumes hard drive space.

-cpu-time           Consumes processor resources by starting up 128 threads with normal priority.

-kernel-pool        Consumes as many kernel pool resources as possible (non-paged pool).

-time seconds       Specifies how long, in seconds, the resources remain occupied. If this parameter is not set, resource occupation continues until it is ended by pressing Ctrl+C.
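When Consume.exe is driven from a test script, it can help to build the command line from the documented parameters and reject anything else. The sketch below only constructs the argument list according to the syntax shown above and does not start the tool. The function name consume_command is an assumption for this example.

```python
# Resource switches taken from the Consume.exe syntax shown above.
RESOURCES = ("physical-memory", "page-file", "disk-space", "cpu-time", "kernel-pool")


def consume_command(resource, seconds=None):
    """Build the argument list for one Consume.exe run."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    argv = ["consume", f"-{resource}"]
    if seconds is not None:
        if seconds <= 0:
            raise ValueError("seconds must be positive")
        argv += ["-time", str(seconds)]
    return argv


print(consume_command("cpu-time", 60))
```

On a machine where the tool is installed and on the search path, the resulting list could be passed to subprocess.run from the Python standard library.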


This program replaces the tools Cpustres.exe and Leakyapp.exe of the Windows 2000 Server Technical Reference. Cpustres.exe generates load through four separate threads, each with its own priority and activity level. Leakyapp.exe behaves like an application with a memory leak. Both tools still work under Windows Server 2003.

Create File

The Create File command-line tool (Creatfil.exe) generates files of a predefined size filled with blanks. It can therefore be used to occupy hard drive space during a test or to provide defined amounts of data for transmission via the network.
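The same effect can be approximated with a few lines of script when the tool is not at hand. The following sketch writes a file of a given size filled with blanks in fixed-size chunks. The function name create_blank_file and the default chunk size are assumptions, and the sketch makes no claim to match the behavior of the original tool in detail.

```python
import os


def create_blank_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write size_bytes space characters to path and return the resulting file size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    chunk = b" " * chunk_size
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            count = min(remaining, chunk_size)
            handle.write(chunk[:count])
            remaining -= count
    return os.path.getsize(path)
```

Writing in chunks keeps memory use constant even for files of several gigabytes.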

Windows Program Timer

The Windows Program Timer (Ntimer.exe) command-line tool measures how long a program runs. The program to be measured is passed to Ntimer.exe as a parameter, and Ntimer.exe launches it. When the program ends, Ntimer.exe reports the total elapsed time, the time the program spent in user mode, and the time it spent in privileged system mode.
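A comparable measurement can be sketched with the Python standard library. The function time_program below is an assumption for illustration and is not a reimplementation of Ntimer.exe. It uses os.times(), whose child-process fields are filled in on Unix-like systems but are reported as zero on Windows, so the user and kernel figures are only meaningful where the platform supplies them.

```python
import os
import subprocess
import sys
import time


def time_program(argv):
    """Run a program and report elapsed, user-mode, and kernel-mode time."""
    before = os.times()
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    elapsed = time.perf_counter() - start
    after = os.times()
    return {
        "exit_code": completed.returncode,
        "elapsed_s": elapsed,
        # Child-process times; zero on platforms that do not report them.
        "user_s": after.children_user - before.children_user,
        "kernel_s": after.children_system - before.children_system,
    }


if __name__ == "__main__":
    print(time_program([sys.executable, "-c", "sum(range(100000))"]))
```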

Terminal Server Scalability Planning Tools

Significantly more sophisticated terminal server test environments can be created using Terminal Server Scalability Planning Tools. They are included in a self-extracting installation file named Tsscaling.exe on the CD that comes with the Windows Server 2003 Technical Reference. When you run this program, the installation process creates a folder with a name of your choice and saves a number of tools, test scripts, and documents in it.

The test environment includes the following automation and test tools:

  • Robosrv.exe: RoboServer is the central control tool with a graphical interface for testing the load on a terminal server. It is usually installed on a special control computer that controls a number of clients. RoboServer determines these settings: the number of user sessions per client platform, the number of clients in one group, and the time between certain test events.

  • Robocli.exe: RoboClient is installed on each client platform and controls the test scripts that the clients execute on the terminal server. RoboClient receives the commands for how and when to start the test scripts from RoboServer. Together with RoboServer, RoboClient is responsible for automatically executing complete test scenarios. Before RoboClient can be launched, RoboServer must already be installed on the network.

  • Tbscript.exe: Terminal Services Bench Scripting is a test tool and represents a script interpreter for Microsoft Visual Basic Scripting Edition scripts. It supports a number of specific extensions for controlling terminal server clients.

  • Qidle.exe: Query Idle is a test tool that is able to automatically identify user sessions on terminal servers that are idle for a long time. In test environments, this is the same as an interrupted script. As soon as Qidle.exe finds such a user session, it informs the administrator with a system sound.

Besides these tools, the test suite also includes pre-prepared test scripts and some quite comprehensive documentation on installing test environments, executing tests, and using the Tbscript.exe tool.

In addition to the scripts on the remote clients, RoboServer is also able to execute a local canary script. This script runs before the actual test and before adding a new group of client sessions. The duration of the execution of this script is logged and allows a statement to be made on the respective terminal server load before and during the test.
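The canary principle can be illustrated independently of RoboServer: a fixed piece of work is timed before each group of sessions is added, and the logged durations show how the load affects it. In the sketch below, run_canary, canary_log, and the callbacks canary and add_group are hypothetical names chosen for this example. They do not reflect RoboServer’s internal interfaces.

```python
import time


def run_canary(canary, clock=time.perf_counter):
    """Execute the canary once and return its duration in seconds."""
    start = clock()
    canary()
    return clock() - start


def canary_log(canary, add_group, groups):
    """Time the canary before each group of client sessions is added.

    The first entry is therefore measured before the actual test begins.
    """
    log = []
    for label in groups:
        log.append((label, run_canary(canary)))
        add_group(label)
    return log


if __name__ == "__main__":
    entries = canary_log(lambda: sum(range(200000)),
                         lambda label: None,
                         ["group 1", "group 2", "group 3"])
    for label, seconds in entries:
        print(f"before {label}: canary took {seconds:.4f} s")
```

A canary duration that rises sharply from one group to the next suggests that the server is approaching a resource limit.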

To start a test, the relevant clients must be selected using the user interface. The context menu then allows you to select the Run Script option.


You can also run RoboServer from the command line. The syntax is as follows: Robosrv -s:ServerName -n:ClientNumber, where ServerName specifies the target server for RoboClient and ClientNumber specifies the number of initial connections to RoboServer. RoboClient can also be run from the command line; the syntax is: Robocli -s:RoboServerName.

Figure 11-18: Test script execution options with RoboServer: two RoboClients with 10 user sessions each are connected with RoboServer.

In combination with Tbscript.exe, VBScript scripts do the actual load test work. The scripts rely on an already installed standard RDP client for Windows 2000, Windows XP, or Windows Server 2003. The configuration is defined in the Smclient.ini file, which sets all required parameters. Starting Tbscript.exe and the corresponding scripts then allows the automated execution of the following functions:

  • Logging on, logging off, and ending a connection

  • Launching applications

  • Transmitting mouse and keyboard input

  • Transmitting data and character strings

  • Using the clipboard

  • Executing loops and conditional jumps

  • Using API calls in DLLs

All in all, using Terminal Server Scalability Planning Tools successfully requires considerable effort that should not be underestimated. Once you are acquainted with the tools, however, you will find they are a powerful means of helping administrators improve their knowledge of terminal servers and the way terminal servers behave under heavy load.