As you have seen throughout this chapter, you can collect a lot of metadata information from SQL Server. You can use any of the system views, functions, or tables I have explained in this chapter to create the metrics you are looking for. Often, the type of monitoring you are interested in will dictate the collection method you need to use. For instance, if you need immediate feedback on a query, you can use SQL Server Profiler. If you need to collect information on an ad-hoc basis in code, you can use SQL Trace. For long-term monitoring for performance or growth, you can use the sys.dm_os_performance_counters table or dynamic views, or even the Windows System Monitor. Do not limit yourself to one tool, because each has advantages depending on what you need to collect. It is important to combine the monitoring with an understanding of the application and the performance problems before you try to interpret the results. Do not get caught making an absolute statement about an interpretation you have made only to be shown another variable that contradicts the statement. After you collect the data, you need to be able to analyze the results. There are a lot of ways to show the data such as reports and Web presentations, but if you are preparing the data for business users, you cannot go wrong by putting the results in a spreadsheet. That is the approach I use in the exercise that follows. A Monitoring ExerciseIn this section, I show you a pared-down example of a monitoring exercise I perform on servers that are showing performance issues. In practical uses, I include far more information. In this example, I use the System Monitor counters I explained earlier.
I normally use the logging feature in System Monitor and send the output to a tab-delimited file. I set the collection interval to five seconds if I am monitoring for a single day, and five minutes if I am monitoring longer. Next, I create an Excel workbook with three tabs: Observations, Interpretations, and Recommendations. I read the System Monitor results into the tab called Observations. I change the format on the columns to remove the server name. I also color the blocks differently for Processor, Memory, Disk, Network, and SQL Server counters. On the Interpretations tab, I create formulas that reference the rows in the Observations tab for maximum, minimum, average, and standard deviation values. I evaluate each of the numbers in the columns and color them red for values I consider outside the norm, green for values that are good, and blue for those that are indicators, such as active transactions. I include the minimums, maximums, and standard deviations so that the averages are useful. Below the numbers, I explain why I feel the numbers are good or bad and the other factors that affect the numbers. I also explain a little about the testing environment, and sometimes I even include server information. It is important to understand the interactions between the numbers. I have only included a few counters here, but in practice I use far more to diagnose an issue. On the Recommendations tab, I explain what to do to fix the issues. For instance, if the active connections move up at a certain rate and at the same time the memory numbers fall below what is physically installed on the server, the server is paging and the server needs more RAM. If I am monitoring more than one server, I include each of them in blocks below each other in the tabs I mentioned. This is especially important if the servers interact within the system. It is surprising to find that you can tell management over and over that a particular part is a problem and you will be ignored. Show the same results on a spreadsheet and your suggestions will be adopted. |