The primary purpose of these benchmarks is to provide a number of measurements that allow you to compare different Squid configurations and features. In order to produce comparable results, I've taken care to minimize any differences between systems being tested.
I used five identical computer systems, one for each of the following operating systems: FreeBSD, Linux, NetBSD, OpenBSD, and Solaris. The boxes are IBM Netfinity servers with one 500-MHz PIII CPU, 1 GB of RAM, an Intel fast-Ethernet NIC, and three 8-GB SCSI disk drives. I realize that these aren't particularly powerful machines by today's standards, but they are good enough for these tests. Anyway, it is more important that they be identical than powerful.
The requirement to use identical hardware means that I can't generate comparable results for other hardware platforms, such as Sun, Digital/Compaq/HP, and others.
Except for the coss tests, all results are from Squid Version 2.5.STABLE2. The coss results are from a patched version of 2.5.STABLE3. Those patches have been committed to the source tree for inclusion into 2.5.STABLE4.
Unless otherwise specified, I used only the --enable-storeio option when running ./configure before compiling Squid. For example:
% ./configure --enable-storeio=diskd,ufs,null,coss
In all cases, Squid is configured to use 7500 MB of each 8.2-GB disk. This is a total cache size of 21.5 GB. Additionally, access.log and store.log have been disabled in the configuration file. Here is a sample squid.conf file:
visible_hostname linux-squid.bench.tst
acl All src 0/0
http_access allow All
cache_dir aufs /cache0 7500 16 256
cache_dir aufs /cache1 7500 16 256
cache_dir aufs /cache2 7500 16 256
cache_effective_user nobody
cache_effective_group nobody
cache_access_log /dev/null
cache_store_log none
logfile_rotate 0
All the tests in this appendix use the same Polygraph workload file. Meeting this requirement was, perhaps, the hardest part of running these tests. Normally, the desired throughput is a configuration parameter in a Polygraph workload. However, because the sustainable throughput is different for each configuration, my colleague Alex Rousskov and I developed a workload that can be used for all tests. We call this the "peak finder" workload because it finds the peak throughput for a device under test.
 Except for the number-of-spindles tests, in which the cache size depends on the number of disks in use.
 You can download this workload at http://squidbook.org/extras/pf2-pm4.pg.txt.
The name "peak finder" is somewhat misleading because, at least in Squid's case, sustainable throughput decreases over time. The workload is designed to periodically adjust the offered load (throughput) subject to response time requirements. If the measured response time is below a given threshold, Polygraph increases the load. If response time is above the threshold, it decreases the load. Thus, at any point in time during the test, we know the maximum throughput that still satisfies the response time requirements.
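The feedback loop just described can be sketched in a few lines. This is a simplified illustration of the idea, not Polygraph's actual implementation; the function name, the response-time threshold, and the step size are all assumptions chosen for clarity.

```python
# Illustrative sketch of the "peak finder" feedback loop: increase the
# offered load while response time stays under a threshold, decrease it
# otherwise. The threshold and step values below are assumed, not taken
# from the real Polygraph workload.

def adjust_load(offered_rps, measured_rt_ms,
                rt_threshold_ms=2500, step=0.05):
    """Return the next offered request rate (requests/sec)."""
    if measured_rt_ms < rt_threshold_ms:
        # Response time is acceptable: there is room to push harder.
        return offered_rps * (1 + step)
    # Response time exceeded the threshold: back off.
    return offered_rps * (1 - step)
```

Because the load is re-adjusted periodically, the offered throughput hovers around the highest rate the device under test can sustain while still meeting the response-time requirement.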
In order to reach a steady-state condition, the test runs until the cache has been filled twice. Polygraph knows the total cache size (21.5 GB) and keeps track of the amount of fill traffic pulled into the cache. These are responses that are cachable but not cache hits. The test duration, then, depends on the sustainable throughput. When the throughput is low, the test takes longer to complete. Some of these tests took more than 10 days to run.
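The relationship between sustained fill rate and test duration is easy to estimate. The calculation below is a back-of-the-envelope sketch; the 21.5-GB cache size and the two-fill requirement come from the text, while the fill rate is a free parameter.

```python
# Estimate how long a test runs, given that it must pull two cache-fulls
# of fill traffic (cachable responses that are not hits) into the cache.
# The fill rate argument is an assumed sustained rate in MB/sec.

def fill_time_days(fill_rate_mbps, cache_size_gb=21.5, fills=2):
    """Days required to fill the cache `fills` times at a given rate."""
    total_mb = cache_size_gb * 1024 * fills   # total fill traffic, in MB
    seconds = total_mb / fill_rate_mbps
    return seconds / 86400                    # seconds per day
```

For example, a sustained fill rate of about 0.05 MB/sec works out to roughly 10 days for two fills of a 21.5-GB cache, which is consistent with the longest runs mentioned above.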