Fixing Problems with fsck

/usr/sbin/fsck is a file system checking and repair program commonly found on Solaris and other UNIX platforms. It is usually executed by the superuser while the system is in single-user mode (for example, after entering run level S), but it can also be run on individual volumes at multiuser run levels. However, there is one golden rule for using fsck: never, ever apply fsck to a mounted file system. Doing so could leave the file system in an inconsistent state and cause a kernel panic, at which point it’s best to head for the backup tape locker! Any fixes to potential problems on a mounted file system could end up creating more damage than the original problem. In this section, we will examine the output of fsck, look at some examples of common problems, and investigate how fsck repairs corrupt and inconsistent disk data. Note that logging, which is described next, must be enabled for each file system in /etc/vfstab before you can be confident that data can be recovered accurately using journaling.
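The golden rule can be checked mechanically before fsck is invoked. The following is a minimal sketch, assuming a Solaris-style mount table (/etc/mnttab) whose first whitespace-separated field is the mounted device, and a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk). The function names is_mounted and safe_to_check, and the optional second argument naming a sample mount table, are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# is_mounted BLOCK_DEVICE [MNTTAB]
# Succeeds (exit status 0) if BLOCK_DEVICE is the first field of any line in
# the mount table. MNTTAB defaults to /etc/mnttab; the optional argument is an
# assumption of this sketch so that it can be tried against a sample file.
is_mounted() {
    awk -v dev="$1" '
        $1 == dev { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/mnttab}"
}

# safe_to_check BLOCK_DEVICE [MNTTAB]
# Prints a verdict and fails if the device is still mounted. It never runs
# fsck itself.
safe_to_check() {
    if is_mounted "$1" "${2:-/etc/mnttab}"; then
        echo "$1 is mounted; unmount it before running fsck"
        return 1
    fi
    echo "$1 is not mounted; fsck may be run on it"
}
```

A cautious administrator might then run something like safe_to_check /dev/dsk/c0t0d0s7 && fsck /dev/rdsk/c0t0d0s7, so that fsck is only started when the first command succeeds.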

Although Solaris 7, 8, and 9 still retain fsck, it is really only necessary for Solaris 2.6 and prior releases. This is because logging is now provided for UNIX file systems (UFS): before any changes are made to a file system, details of the change are recorded in a log prior to their physical application. While this adds some CPU and disk overhead (approximately 1 percent of the disk space on each volume with logging enabled is required for the log), it does ensure that the file system is never left in an inconsistent state.
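Because logging is enabled per file system, it is easy to overlook a volume. The following is a minimal sketch that lists the UFS mount points in a vfstab-format file whose mount options do not include logging. It assumes the standard seven whitespace-separated fields and a POSIX awk; the function name ufs_without_logging and its optional file argument are hypothetical.

```shell
#!/bin/sh
# ufs_without_logging [VFSTAB]
# Prints the mount point of every ufs entry whose options field (field 7)
# does not contain the "logging" option. VFSTAB defaults to /etc/vfstab.
ufs_without_logging() {
    awk '
        /^#/ || NF < 7 { next }              # skip comments and short lines
        $4 == "ufs" {
            n = split($7, opts, ",")         # options are comma-separated
            has = 0
            for (i = 1; i <= n; i++)
                if (opts[i] == "logging") has = 1
            if (!has) print $3
        }
    ' "${1:-/etc/vfstab}"
}
```

Any mount point that is printed would still depend on fsck for recovery after an unclean shutdown.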

Tip

Boot time is reduced, because fsck does not need to be executed.

Why do inconsistencies occur in the first place? In theory, they shouldn’t, but there are three common reasons:

Switching off a Solaris server like an old MS-DOS machine, without shutting it down first
Halting a system without synchronizing disk data (it is advisable to explicitly use sync before shutting down using halt)
Defective hardware, including damage to disk blocks and heads, which can be caused by moving the system, and/or power surges.

These problems manifest themselves as corruption of the internal set of tables that every UNIX file system keeps to manage free disk blocks and inodes. As a result, blocks that are actually free may be reported as already allocated and, conversely, blocks that are in use by a program may be recorded as free.

Caution

Disk corruption is obviously problematic for mission-critical data, which is a good advertisement for RAID storage (or at least, reliable backups).

The Phases of fsck

The first step in running fsck is to enable file system checking during boot. To do this, specify an integer value in the fsck pass field of the virtual file system configuration file, /etc/vfstab. Entering 1 in this field ensures that file systems are checked sequentially, while entering 2 does not ensure sequential checking, as in the following example:

#device            device              mount  FS    fsck  mount    mount
#to mount          to fsck             point  type  pass  at boot  options
#
/dev/dsk/c1t2d1s3  /dev/rdsk/c1t2d1s3  /usr   ufs   2     yes      -
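To review which file systems will be checked at boot, the fsck pass field can be extracted from the file. This is a minimal sketch, assuming the seven-field layout shown in the example and a POSIX awk; the function name fsck_passes and its optional file argument are hypothetical.

```shell
#!/bin/sh
# fsck_passes [VFSTAB]
# Prints "mount-point pass" for every entry whose fsck pass field (field 5)
# is a number. Entries with "-" in that field are not listed.
fsck_passes() {
    awk '
        /^#/ || NF < 7 { next }              # skip comments and short lines
        $5 ~ /^[0-9]+$/ { print $3, $5 }
    ' "${1:-/etc/vfstab}"
}
```

For the entry shown above, the function would print the mount point /usr followed by its pass value, 2.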

Once checking has been enabled for a particular file system, fsck can be executed. fsck checks the integrity of several different features of the file system. Most significant is the superblock, which stores summary information for the volume. Since the superblock is the most frequently modified item on the file system, being written and rewritten whenever data is changed on a disk, it is also the most commonly corrupted feature. Checks on the superblock include:

A check of the file system size, which obviously must be greater than the size computed from the number of blocks identified in the superblock
The total number of inodes, which must be less than the maximum number of inodes
A tally of reported free blocks and inodes

If any of these values are identified as corrupt by fsck, the superuser can select one of the many superblock backups that were created during initial file system creation as a replacement for the current superblock. We will examine superblock corruption and how to fix it in the next section.

In addition to the superblock, the number and status of cylinder group blocks, inodes, indirect blocks, and data blocks are also checked. Since free blocks are located by maps stored in the cylinder group, fsck verifies that none of the blocks marked as free are actually being used by any files; if they are, files could be corrupted. If all blocks are correctly accounted for, fsck determines whether the number of free blocks plus the number of used blocks equals the total number of blocks in the file system. If fsck detects any incongruity, the maps of unallocated blocks are rebuilt, although there is obviously a risk of data loss whenever there is a disagreement over the actual state of the file system. If the superblock information is wrong, fsck always uses the actual count of inodes and/or blocks, and replaces the incorrect value once the superuser has confirmed the change. We will revisit this issue in the next section.

When inodes are examined by fsck, the process is sequential in nature and aims to identify inconsistencies in format and type, link count, duplicate blocks, bad block numbers, and inode size. An inode is always in one of three states: allocated (being used by a file), unallocated (not being used by a file), or partially allocated, meaning that during an allocation or deallocation procedure, data was left behind that should have been deleted or completed. Alternatively, partial allocation could result from a physical hardware failure. In both of these cases, fsck will attempt to clear the inode.

The link count is the number of directory entries that are linked to a particular inode. fsck always checks that the number of directory entries listed is correct by examining the entire directory structure, beginning with the root directory, and tallying the number of links for every inode. Clearly, the stored link count and the actual link count should agree, but the stored link count can occasionally differ from the actual link count. This could happen, for example, when a disk is not synchronized before a shutdown: changes to the file system have been saved, but the link count has not been correctly updated. If the stored count is not zero, but the actual count is zero, then disconnected files are placed in the lost+found directory found in the top level of the file system concerned. In other cases, the actual count replaces the stored count.
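The comparison of stored and actual link counts can be illustrated with a toy model. This is a sketch only and does not reflect how fsck is implemented: the input format (I lines giving an inode and its stored count, D lines giving a directory entry and the inode it refers to) and the function name link_audit are invented for this example, and a POSIX awk is assumed.

```shell
#!/bin/sh
# link_audit: toy model of the link count comparison, NOT fsck's real code.
# Input on stdin, in a format invented for this sketch:
#   I <inode> <stored-count>    one line per allocated inode
#   D <name> <inode>            one line per directory entry
link_audit() {
    awk '
        $1 == "I" { stored[$2] = $3 }        # count recorded in the inode
        $1 == "D" { actual[$3]++ }           # tally of directory entries
        END {
            for (i in stored) {
                a = actual[i] + 0
                if (a == stored[i]) continue # counts agree, nothing to do
                if (a == 0)
                    print i, "unreferenced: reconnect in lost+found"
                else
                    print i, "COUNT " stored[i] " SHOULD BE " a
            }
        }
    ' | sort -n
}
```

An inode with a stored count of 2 but four directory entries is reported as needing adjustment, while an inode with a nonzero stored count and no directory entries is reported as a candidate for lost+found.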

An indirect block is owned by an inode and contains a list of block numbers claimed by that inode. fsck checks every block number against a list of allocated blocks: if two inodes claim the same block number, that block number is added to a list of duplicate block numbers. The administrator may be asked to choose which inode is correct, which is obviously a difficult decision, and usually a signal that it is time to verify files against backups. fsck additionally checks the integrity of the actual block numbers, which can also become corrupt: each block number should always lie in the interval between the first data block and the last data block. If a bad block number is detected, the inode is cleared.
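The duplicate and range checks can also be sketched as a toy model. This is an illustration under stated assumptions, not fsck's implementation: the input format (one "inode block" pair per line), the function name check_blocks, and its two arguments giving the first and last valid data block are all hypothetical, and a POSIX awk is assumed.

```shell
#!/bin/sh
# check_blocks FIRST LAST: toy model of the block checks, NOT fsck's real code.
# Reads "inode block" pairs on stdin and reports, in input order:
#   <block> BAD I=<inode>   block number outside FIRST..LAST
#   <block> DUP I=<inode>   block already claimed by a different inode
check_blocks() {
    awk -v first="$1" -v last="$2" '
        $2 < first || $2 > last            { print $2, "BAD I=" $1; next }
        ($2 in owner) && owner[$2] != $1   { print $2, "DUP I=" $1; next }
        { owner[$2] = $1 }                 # remember the first claimant
    '
}
```

In this model, the first inode to claim a block is recorded as its owner, and any later inode that claims the same block is reported as the duplicate.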

Directories are also checked for integrity by fsck. Directories are equivalent to other files on the file system, except that they have a different mode entry in the inode. fsck checks the validity of directory data blocks, looking for the following problems: directory entries that refer to unallocated inodes, inode numbers exceeding the maximum number of inodes for a particular file system, incorrect inode numbers for the standard directory entries “.” and “..”, and directories that have been accidentally disconnected from the file system. We will examine some of these errors and how they are rectified in the next section.

fsck examines each disk volume in five distinct stages, performing all of the checks discussed earlier: phase 1, in which blocks and sizes are checked; phase 2, where pathnames are verified; phase 3, where connectivity is examined; phase 4, where an investigation of reference counts is undertaken; and phase 5, where the actual cylinder groups are checked.

EXAM TIP

You should be able to identify the different phases of fsck and their purposes.

fsck Examples

In this section, we will examine a full run of fsck, outlining the most common problems and how they are rectified, as well as presenting some examples of less commonly encountered problems. On a SPARC 20 system, fsck for the / file system looks like this:

** /dev/rdsk/c0d0s0
** Currently Mounted on /
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE?

Clearly, the actual block count and the block count recorded in the superblock are at odds with each other. At this point, fsck requires superuser permission to install the actual block count in the superblock, which the administrator indicates by pressing Y. The scan continues with the /usr partition:

1731 files, 22100 used, 51584 free (24 frags, 6445 blocks,  0.0% fragmentation)
** /dev/rdsk/c0d0s6
** Currently Mounted on /usr
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups

FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX?

In this case, the file system state recorded in the superblock is incorrect, and again the administrator is required to give consent for it to be repaired. The scan then continues with the /var and /export/home partitions:

26266 files, 401877 used, 217027 free (283 frags, 27093 blocks,  0.0% fragmentation)
** /dev/rdsk/c0d0s1
** Currently Mounted on /var
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1581 files, 4360 used, 25545 free (41 frags, 3188 blocks,  0.1% fragmentation)
** /dev/rdsk/c0d0s7
** Currently Mounted on /export/home
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
2 files, 9 used, 7111589 free (13 frags, 888947 blocks,  0.0% fragmentation)
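The summary line printed for each file system can be checked by hand: the free count is the number of free fragments plus the number of fragments contained in the free blocks. The following is a minimal sketch, assuming eight fragments per block (an assumption that is consistent with all four summary lines above, for example 24 + 8 × 6445 = 51584) and a POSIX awk. The function name check_summary and its optional fragments-per-block argument are hypothetical.

```shell
#!/bin/sh
# check_summary [FRAGS_PER_BLOCK]
# Reads fsck output on stdin and, for each summary line of the form
#   N files, N used, N free (N frags, N blocks, N% fragmentation)
# verifies that free = frags + FRAGS_PER_BLOCK * blocks (default 8).
check_summary() {
    awk -v fpb="${1:-8}" '
        $4 == "used," && $6 == "free" {
            frags = substr($7, 2) + 0        # strip the leading "("
            calc = frags + fpb * $9
            print $5, (calc == $5 ? "consistent" : "MISMATCH (expected " calc ")")
        }
    '
}
```

Feeding the four summary lines above to check_summary reports each free count as consistent under the eight-fragment assumption.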

Obviously, the /var partition and /export/home have passed examination by fsck, and are intact. However, the fact that the / and /usr file systems were in an inconsistent state suggests that the file systems were not cleanly unmounted, perhaps during the last reboot. Fortunately, the superblock itself was intact. However, this is not always the case. In this example, the superblock of /dev/dsk/c0t0d0s2 has a bad magic number, indicating that it is damaged beyond repair:

# fsck /dev/dsk/c0t0d0s2
 BAD SUPER BLOCK: MAGIC NUMBER WRONG
 USE ALTERNATE SUPER-BLOCK TO SUPPLY NEEDED INFORMATION
eg. fsck [-F ufs] -o b=# [special ...]
where # is the alternate super block. SEE fsck_ufs(1M).

In this case, you need to specify one of the alternative superblocks that were created by the newfs command. When a file system is created, there is a message printed about the creation of superblock backups:

super-block backups (for fsck -b #) at:
32, 5264, 10496, 15728, 20960, 26192, 31424, 36656, 41888,
47120, 52352, 57584, 62816, 68048, 73280, 78512, 82976, 88208,
93440, 98672, 103904, 109136, 114368, 119600, 124832, 130064,
135296, 140528, 145760, 150992, 156224, 161456.

In the previous example, you may need to specify one of these alternative superblocks, so that the disk contents are once again readable. If you didn’t record the superblock backups during the creation of the file system, you can easily retrieve them by using newfs (and using -N to prevent the creation of a new file system):

# newfs -Nv /dev/dsk/c0t0d0s2

Once you have determined an appropriate superblock replacement number (for example, 32), use fsck again to replace the older superblock with the new one:

# fsck -o b=32 /dev/dsk/c0t0d0s2
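If the first backup superblock is also damaged, the remaining backups can be tried in turn. The following is a minimal sketch that only prints candidate commands and never runs fsck itself. It assumes that the newfs output has the layout shown earlier (a "super-block backups" header followed by comma-separated block numbers) and uses the command form given in the fsck error message; the function names backup_superblocks and suggest_fsck are hypothetical.

```shell
#!/bin/sh
# backup_superblocks: read newfs output on stdin and print one backup
# superblock number per line, taken from the lines after the header.
backup_superblocks() {
    sed -n '/super-block backups/,$p' |      # keep header and what follows
        sed 1d |                             # drop the header line itself
        tr -cs '0-9' '\n' |                  # turn separators into newlines
        sed '/^$/d'                          # drop any empty lines
}

# suggest_fsck DEVICE: print (but do not run) one fsck command per backup.
suggest_fsck() {
    backup_superblocks | while read -r sb; do
        echo "fsck -F ufs -o b=$sb $1"
    done
}
```

For example, newfs -Nv /dev/dsk/c0t0d0s2 | suggest_fsck /dev/dsk/c0t0d0s2 would list one command for each backup, to be run manually one at a time until fsck accepts a superblock.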

Disks that have physical hardware errors often report being unable to read inodes beyond a particular point. For example, the error message

Error reading block 31821 (Attempt to read from filesystem
resulted in short read) while doing inode scan. Ignore error<y>?

stops the user from continuing with the fsck scan and correcting the problem. This is probably a good time to replace the disk, rather than attempting any corrective action. Never be tempted to ignore these errors and hope for the best: especially in commercial organizations, you will ultimately have to take responsibility for lost and damaged data.

Tip

Users will be particularly unforgiving if you had advance warning of a problem.

Here is an example of what can happen when there is a link count problem:

# fsck /
 ** /dev/rdsk/c0t1d0s0
 ** Currently Mounted on /
 ** Phase 1 - Check Blocks and Sizes
 ** Phase 2 - Check Pathnames
 ** Phase 3 - Check Connectivity
 ** Phase 4 - Check Reference Counts
 LINK COUNT DIR I=4  OWNER=root MODE=40700
 SIZE=4096 MTIME=Nov  1 11:56 1999  COUNT 2 SHOULD BE 4
 ADJUST? y

If the adjustment does not fix the error, use the find command to track down the problem file by its inode number (4 in this case), so that it can be deleted:

# find / -mount -inum 4 -ls

It should be in the lost+found directory for the partition in question (in this case, /lost+found).

As previously outlined, duplicate blocks (blocks claimed by more than one inode) can also be a problem:

** Phase 1 - Check Blocks and Sizes
 314415 DUP I=5009
 345504 DUP I=12011
 345505 DUP I=12011
 854711 DUP I=91040
 856134 DUP I=93474
 856135 DUP I=93474

This problem is often found in Solaris 2.5 and 2.6, but is not usually seen in Solaris 7, 8, or 9, so an upgrade may correct the problem.
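When many DUP lines are reported, it helps to know which inodes are involved before deciding what to restore from backup. The following is a minimal sketch that summarizes DUP lines by inode, assuming the "block DUP I=inode" layout shown above and a POSIX awk; the function name dup_inodes is hypothetical.

```shell
#!/bin/sh
# dup_inodes: read fsck phase 1 output on stdin and print, for each inode
# named in a DUP line, the inode number and how many duplicate blocks it
# claims, sorted by inode number.
dup_inodes() {
    awk '
        $2 == "DUP" && $3 ~ /^I=/ {
            sub(/^I=/, "", $3)               # keep only the inode number
            n[$3]++
        }
        END { for (i in n) print i, n[i] }
    ' | sort -n
}
```

For the output above, this lists inodes 5009, 12011, 91040, and 93474, with 12011 and 93474 each claiming two duplicate blocks. Each inode can then be located with find and the -inum option, as in the link count example.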