8.6 The coss Storage Scheme

The Cyclic Object Storage Scheme (coss) is an attempt to develop a custom filesystem for Squid. With the ufs-based schemes, the primary performance bottleneck comes from the need to execute so many open( ) and unlink( ) system calls. Because each cached response is stored in a separate disk file, Squid is always opening, closing, and removing files.

coss, on the other hand, uses one big file to store all responses. In this sense, it is a small, custom filesystem specifically for Squid. coss implements many of the functions normally handled by the underlying filesystem, such as allocating space for new data and remembering where there is free space.

Unfortunately, coss is still a little rough around the edges. Development of coss has been proceeding slowly over the last couple of years. Nonetheless, I'll describe it here in case you feel adventurous.

8.6.1 How coss Works

On the disk, each coss cache_dir is just one big file. The file grows in size until it reaches its maximum size. At this point, Squid starts over at the beginning of the file, overwriting any data already stored there. Thus, new objects are always stored at the "end" of this cyclic file.[3]

[3] The beginning is the location where data was first written; the end is the location where data was most recently written.

Squid actually doesn't write new object data to disk immediately. Instead, the data is copied into a 1-MB memory buffer, called a stripe. A stripe is written to disk when it becomes full. coss uses asynchronous writes so that the main Squid process doesn't become blocked on disk I/O.

As with other filesystems, coss also uses the blocksize concept. Back in Section 7.1.4, I talked about file numbers. Each cached object has a file number that Squid uses to locate the data on disk. For coss, the file number is the same as the block number. For example, a cached object with a swap file number equal to 112 starts at the 112th block in a coss filesystem. File numbers aren't allocated sequentially with coss. Some file numbers are unavailable because cached objects generally occupy more than one block in the coss file.

The coss block size is configurable with a cache_dir option. Because Squid's file numbers are only 24 bits, the block size determines the maximum size of a coss cache directory: size = block_size × 2^24. For example, with a 512-byte block size, you can store up to 8 GB in a coss cache_dir.

coss doesn't implement any of Squid's normal cache replacement algorithms (see Section 7.5). Instead, cache hits are "moved" to the end of the cyclic file. This is, essentially, the LRU algorithm. It does, unfortunately, mean that cache hits cause disk writes, albeit indirectly.

With coss, there is no need to unlink or remove cached objects. Squid simply forgets about the space allocated to objects that are removed. The space will be reused eventually when the end of the cyclic file reaches that place again.

8.6.2 Compiling and Configuring coss

To use coss, you must add it to the --enable-storeio list when running ./configure:

% ./configure --enable-storeio=ufs,coss ...

coss cache directories require a max-size option. Its value must be less than the stripe size (1 MB by default, but configurable with the --enable-coss-membuf-size option). Also note that you must omit the L1 and L2 values that are normally present for ufs-based schemes. Here is an example:

cache_dir coss /cache0/coss 7000 max-size=1000000
cache_dir coss /cache1/coss 7000 max-size=1000000
cache_dir coss /cache2/coss 7000 max-size=1000000
cache_dir coss /cache3/coss 7000 max-size=1000000
cache_dir coss /cache4/coss 7000 max-size=1000000

Furthermore, you can change the default coss block size with the block-size option:

cache_dir coss /cache0/coss 30000 max-size=1000000 block-size=2048

One tricky thing about coss is that the cache_dir directory argument (e.g., /cache0/coss) isn't actually a directory. Instead, it is a regular file that Squid opens, and creates if necessary. This is so you can use raw partitions as coss files. If you mistakenly create the coss file as a directory, you'll see an error like this when starting Squid:

2003/09/29 18:51:42|  /usr/local/squid/var/cache: (21) Is a directory
FATAL: storeCossDirInit: Failed to open a coss file.

Because the cache_dir argument isn't a directory, you must use the cache_swap_log directive (see Section 13.6). Otherwise Squid attempts to create a swap.state file in the cache_dir directory. In that case, you'll see an error like this:

2003/09/29 18:53:38| /usr/local/squid/var/cache/coss/swap.state:
        (2) No such file or directory
FATAL: storeCossDirOpenSwapLog: Failed to open swap log.
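
To avoid this, give Squid an explicit location for the log with the cache_swap_log directive. The path below is only an illustration; use whatever log directory actually exists on your system:

cache_swap_log /usr/local/squid/var/logs/swap.state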

coss uses asynchronous I/O for better performance. In particular, it uses the aio_read( ) and aio_write( ) system calls. These may not be available on all operating systems. At this time, they are available on FreeBSD, Solaris, and Linux. If the coss code seems to compile okay but you get a "Function not implemented" error message, you need to enable these system calls in your kernel. On FreeBSD, your kernel must have this option:

options         VFS_AIO

8.6.3 coss Issues

coss is still an experimental feature. The code has not yet proven stable enough for everyday use. If you want to play with and help improve it, be prepared to lose any data stored in a coss cache_dir. On the plus side, coss's preliminary performance tests are very good. For an example, see Appendix D.

coss doesn't support rebuilding cached data from disk very well. When you restart Squid, you might find that it fails to read the coss swap.state files, thus losing any cached data. Furthermore, Squid doesn't remember its place in the cyclic file after a restart. It always starts back at the beginning.

coss takes a nonstandard approach to object replacement. This may cause a lower hit ratio than you might get with one of the other storage schemes.

Some operating systems have problems with files larger than 2 GB. If this happens to you, you can always create more, smaller coss areas. For example:

cache_dir coss /cache0/coss0 1900 max-size=1000000 block-size=128
cache_dir coss /cache0/coss1 1900 max-size=1000000 block-size=128
cache_dir coss /cache0/coss2 1900 max-size=1000000 block-size=128
cache_dir coss /cache0/coss3 1900 max-size=1000000 block-size=128

Using a raw disk device (e.g., /dev/da0s1c) doesn't work very well yet. One reason is that disk devices usually require that I/Os take place on 512-byte block boundaries. Another concern is that direct disk access bypasses the system's buffer cache and may degrade performance. Many disk drives, however, have built-in caches these days.
