Although tar is the most common archiver on Linux systems, there are two other important systems called dump/restore and cpio that you should be aware of. The following sections provide an overview of the most important archiver operations, but they do not go into very much detail.
The dump/restore system is a throwback to the older days of Unix. Unlike most other archivers, the dump program looks at the internal structure of a filesystem, and therefore it only supports certain filesystem types. On Linux, this list currently includes only the ext2 and ext3 filesystems, and this is unlikely to change, because there is little interest in porting the system to the myriad other filesystems. This inflexibility makes dump/restore a somewhat unattractive backup solution. Still, some people prefer dump (probably because they're too old to learn new tricks, or perhaps because they don't like their backup program to change the access times on their files), so you may be stuck with reading one of these archives.
The dump command to back up an entire directory looks like this:
dump -0 -f archive directory
You can use a tape device as archive if you like, or if your tape device is in the TAPE environment variable, you can omit -f archive entirely.
The -0 option forces a full (level 0) backup. If you actually intend to use dump to do regular backups, add a -u option to make dump write date information to /etc/dumpdates (the exact location can vary by distribution). With this file and option in place, you can specify backup levels other than 0 (for example, a level 5 with -5).
When using -u, you must dump the entire filesystem — that is, the directory parameter must be the mount point or the disk device (for example, /dev/hda4).
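To make the interplay between -u and the backup levels concrete, here is a minimal sketch of a weekly schedule: a full dump on Sunday and a level 5 dump on every other day. The function only prints the command it would run. The function name, the device /dev/hda4, and the archive path are placeholders chosen for illustration, not anything that dump requires.

```shell
#!/bin/sh
# Print (not run) the dump command for a given day of the week.
# Day 0 (Sunday) gets a full level 0 dump; every other day gets level 5.
# The device and archive names used below are hypothetical placeholders.
dump_command() {
    day=$1
    device=$2
    archive=$3
    if [ "$day" -eq 0 ]; then
        level=0
    else
        level=5
    fi
    # -u records the dump date so that later dumps know what has changed
    printf 'dump -%su -f %s %s\n' "$level" "$archive" "$device"
}

# date +%w prints the day of the week as a number (0 is Sunday)
dump_command "$(date +%w)" /dev/hda4 /backup/hda4.dump
```

Once the printed command looks right for your system, you could run it by hand or from a scheduled job. Remember that the last argument must be a whole filesystem when -u is in use.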
The restore program extracts files and directories from an archive created with dump. If you're only looking for one or a few files, you need to run restore in interactive mode, so follow this procedure:
1. Run the following command, where archive is the archive (this can be a tape drive):
restore iavf archive
2. Wait until restore has read the filesystem index from the archive and printed a restore > prompt.
3. Use the cd and ls commands until you find a file or directory that you want to restore from the archive.
4. Add the file or directory to the extraction list with the add command (add as many files and directories as you like):
add file
5. Type extract to begin the extraction.
6. After extraction, restore asks this question:
set owner/mode for '.'? [yn]
If you want to set the current directory's permissions and owner to the same as the root directory in the archive, answer y. This is nearly always a bad choice for an interactive restore, so answer n if you have any doubt. It's not hard to fix this if you do make a mistake, though.
The root directory in the archive corresponds to the originating filesystem's mount point. Let's say that you dump a partition mounted at /home. None of the paths in the archive would contain /home — that is, /home/user would show up as /user in the archive.
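The mapping from a path on the live system to the name you should look for inside the archive is simple prefix removal. The following sketch shows the idea; archive_path is a made-up helper name, and it assumes that you already know the mount point of the filesystem that was dumped.

```shell
#!/bin/sh
# Translate a path on the original system into the name it has inside
# a dump archive, given the mount point of the dumped filesystem.
# archive_path is a hypothetical helper, not part of dump or restore.
archive_path() {
    mount_point=${1%/}    # drop a trailing slash; "/" becomes empty
    path=$2
    case "$path" in
        "$mount_point")
            printf '/\n' ;;
        "$mount_point"/*)
            # Strip the mount point from the front of the path
            printf '%s\n' "${path#"$mount_point"}" ;;
        *)
            echo "$path is not under $1" >&2
            return 1 ;;
    esac
}

archive_path /home /home/user    # prints /user
```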
To forgo interactive mode and restore everything (and set the permissions for the current directory), use the rvf options:
restore rvf archive
Linux administrators rarely use cpio (or its newer, more capable cousin named afio) for backups or other tasks, but you should still have an idea of how this program works so that you can create, test, and extract archives.
cpio has three modes:
Create, or "copy-out," mode (activated with -o) takes a file list from the standard input, creates an archive from a matching directory hierarchy, and sends the result to standard output or a file.
Extract, or "copy-in," mode (activated with -i) tests and extracts files from an archive.
Pass-through mode (activated with -p) copies files. tar and rsync (see Chapter 15) usually do this job better unless you have a very specific list of files. This book does not cover this somewhat obscure cpio mode.
Because you are unlikely to have a cpio file lying around on your system, you should experiment by creating one yourself. The cpio copy-out mode requires that you generate a file list first. You can use the output of find to create a file list in a pipeline to cpio.
Let's say that you want to archive /usr/local/bin as a file named local_bin.cpio:
find /usr/local/bin | cpio -o > local_bin.cpio
On any halfway modern system, the output of this command includes a line like this for every file archived:
cpio: /usr/local/bin/file: truncating inode number
This annoying message isn't anything to worry about; it just means that cpio can't store the full inode number in the limited space available in the archive. The very end of the command's output should contain a block count of the form N blocks (by default, one block is 512 bytes).
If you need to write an archive to a tape, you have two choices (these are equivalent because the block sizes in cpio and dd are the same):
cpio -o -O tape
cpio -o | dd of=tape
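Returning to the find pipeline used to create an archive, here is a slightly more defensive variant of copy-out mode. It is only a sketch: it assumes GNU find and GNU cpio, and the names make_cpio_archive, demo_tree, and demo.cpio are invented for the example. The newc format has wider header fields than the default format, so it should also cut down on the inode warnings.

```shell
#!/bin/sh
# Sketch of a copy-out helper. Assumes GNU find and GNU cpio.
make_cpio_archive() {
    # $1 = directory to archive, $2 = archive file to create
    # -depth lists the contents of a directory before the directory itself.
    # -print0 and --null separate names with NUL bytes, so filenames
    #   containing spaces or newlines are passed along intact.
    # -H newc selects the portable ASCII format instead of the default.
    find "$1" -depth -print0 | cpio --null -o -H newc > "$2"
}

# Build a small throwaway tree to archive
work=$(mktemp -d)
mkdir -p "$work/demo_tree/sub"
echo hello > "$work/demo_tree/sub/file.txt"

if command -v cpio >/dev/null 2>&1; then
    # Run from the parent directory so that the stored names are relative
    (cd "$work" && make_cpio_archive demo_tree demo.cpio)
else
    echo "cpio not found; nothing archived" >&2
fi
```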
Now that you have local_bin.cpio, you can list its contents with this command:
cpio -i -t -v -I local_bin.cpio
The options have the following meanings:
-i Specifies copy-in mode.
-t Lists (tests) the archive's table of contents instead of extracting anything.
-v Sets verbose mode; shows the files as ls -l would. You may omit this option if you don't care.
-I file Reads file as the archive (the default file is the standard input).
To extract an archive, replace the -t with -d, which tells cpio to create directories as needed:
cpio -i -d -v -I archive
However, this can be dangerous with the local_bin.cpio archive that you created earlier, because that archive contains absolute pathnames. This can be a threat to system stability and security, especially if you extract as root, because the extraction can easily overwrite a file such as /etc/passwd. Use the --no-absolute-filenames option to prevent this:
cpio --no-absolute-filenames -i -d -v -I local_bin.cpio
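You can also sidestep the problem when you create an archive: change to the parent directory first and give find a relative path, so that no absolute pathnames are stored at all. The following sketch creates such an archive, checks the listing for names that start with a slash, and unpacks into a scratch directory. It assumes GNU cpio, and every path in it is a temporary placeholder.

```shell
#!/bin/sh
# Sketch: build an archive with relative names and extract it safely.
# Assumes GNU cpio; all directories here are throwaway placeholders.
work=$(mktemp -d)
mkdir -p "$work/src/bin" "$work/restore"
echo '#!/bin/sh' > "$work/src/bin/tool"

if command -v cpio >/dev/null 2>&1; then
    # Create: run find from the parent directory so names look like bin/tool
    (cd "$work/src" && find bin -depth | cpio -o > "$work/bin.cpio")

    # Check: warn if any name in the listing starts with a slash
    if cpio -i -t -I "$work/bin.cpio" 2>/dev/null | grep -q '^/'; then
        echo "archive contains absolute pathnames" >&2
    fi

    # Extract: unpack inside the scratch directory rather than in /
    (cd "$work/restore" && cpio --no-absolute-filenames -i -d -I "$work/bin.cpio")
else
    echo "cpio not found; skipping the demonstration" >&2
fi
```

Extracting into a scratch directory first also gives you a chance to inspect the files before you move them into place.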
Keep in mind that there are a few more cpio options that cpio may need in copy-in mode if it can't figure out some of the format options of the archive for itself:
-b Switches the byte order. The default cpio format depends on the byte order of the architecture of the CPU that was used to create the archive, so if you have an archive from a completely different kind of machine, you may need this option.
-C n Uses n as the archive block size.
-H name Uses name as the archive format (valid formats are bin, crc, hpbin, hpodc, newc, odc, tar, and ustar). Note that only GNU cpio supports all of these options.
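If you aren't sure which format an archive uses, the first few bytes usually give it away. The following sketch looks at the magic number at the start of the file. The function name cpio_format is made up, and the magic values come from the published cpio format descriptions rather than from anything in this section. It does not recognize the tar and ustar formats, which keep their identifying string further into the file.

```shell
#!/bin/sh
# Guess the format of a cpio archive from its magic number.
# cpio_format is a hypothetical helper, not a standard command.
cpio_format() {
    # Hex dump of the first six bytes, with spaces and newlines removed
    magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case "$magic" in
        303730373031) echo newc ;;               # ASCII "070701"
        303730373032) echo crc ;;                # ASCII "070702"
        303730373037) echo odc ;;                # ASCII "070707"
        c771*) echo "bin (little-endian)" ;;     # octal 070707, low byte first
        71c7*) echo "bin (big-endian)" ;;        # octal 070707, high byte first
        *) echo unknown; return 1 ;;
    esac
}
```

For example, cpio_format local_bin.cpio prints the name that you would pass to -H. If a bin archive reports a byte order that differs from your machine's, that is a hint that -b may be needed.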
Amanda is an automatic backup system that writes a "hybrid" archive on a tape. That is, the first 32KB block in an Amanda tape contains information about the archive, and the rest of the blocks make up a regular archive in tar or dump format.
If you come across an Amanda file, do the following to extract the files:
1. Put the first block into a file named header:
dd if=file of=header bs=32k count=1
2. Run strings header. The output should look something like this:
AMANDA: FILE 20031117 duplex /etc lev 0 comp .gz program /bin/tar
To restore, position tape at start of file and run:
dd if=<tape> bs=32k skip=1 | /bin/gzip -dc | /bin/tar -f... -
3. Follow the instructions. Extract the contents from the rest of the file as follows (if you're using a tape drive as file, the tape head should already be correctly positioned, so don't use skip=1):
dd if=file bs=32k skip=1 | unpack_command
unpack_command should follow the output you got in step 2. For example, if you're dealing with a tar archive, unpack_command might look like this:
gzip -dc | tar xpf -
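The steps above can be rolled into a small helper for the common case. This is only a sketch: amanda_extract is an invented name, and it assumes that the Amanda file is a regular file on disk (not a tape) holding a gzip-compressed tar archive, as in the sample header. Check the header line that it prints before you trust the result.

```shell
#!/bin/sh
# Sketch: unpack an Amanda file that holds a gzip-compressed tar archive.
# amanda_extract is a hypothetical helper; $1 is the Amanda file and
# $2 is the directory to unpack into.
amanda_extract() {
    dump_file=$1
    dest_dir=$2

    # Print the first line of the 32KB header so you can confirm the
    # compression (comp) and archiver (program) before relying on the output
    dd if="$dump_file" bs=32k count=1 2>/dev/null | tr -d '\000' | head -n 1

    # Skip the header block and unpack everything after it
    mkdir -p "$dest_dir"
    dd if="$dump_file" bs=32k skip=1 2>/dev/null |
        gzip -dc |
        (cd "$dest_dir" && tar xpf -)
}
```

You would call it as amanda_extract file scratch_dir, where both names are whatever suits your situation. If the header names a different compressor or a dump archive, replace the gzip and tar stages to match.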