Despite the amount of ink I've devoted here to FTP, I've also said repeatedly that FTP is one of the least secure and least securable file-transfer techniques. The remainder of this chapter therefore concerns file-transfer mechanisms more appropriate for the exchange of nonpublic data between authenticated hosts and users.
The first FTP alternative I'll cover here is the most FTP-like: Secure FTP (SFTP), part of the Secure Shell (SSH) suite of tools. SSH was designed as a secure replacement for the "r" commands (rlogin, rsh, and rcp), which, like FTP, transmit all session data in clear text, including authentication credentials. In contrast, SSH transparently encrypts all its transactions from start to finish, including authentication credentials: local logon credentials are never exposed to network eavesdroppers. SSH offers a remarkable combination of security and flexibility and is the primary topic of Chapter 4.
What About NFS and Samba?
NFS and Samba provide two ways to mount volumes on remote systems as though they were local. This is extremely useful, particularly if you use "thin clients" with limited local storage space, or if you want to relieve users of backing up their personal data. NFS, developed and touted mainly by Sun Microsystems, is widely used in both Sun and Linux environments; in fact, the Linux version interoperates very well with the Sun version. Similarly, Samba is a Linux port of the Microsoft (actually IBM) SMB protocol and its related file- and printer-sharing functions, allowing Linux systems to act as clients and even servers to Windows hosts.
As nifty as both NFS and Samba are, however, I'm not covering them in any depth here, for the simple fact that neither is very secure, especially for Internet use. Both rely heavily on UDP, a connectionless and therefore easily spoofed protocol, and both have authentication mechanisms that have been successfully attacked in various ways over the years, in some cases trivially.
In short, I recommend that if you need either NFS or Samba, use them only in trusted LAN environments (and even then, only with careful attention to security), and never over the Internet.
SSH has always supported scp, its encryption-enabled replacement for the rcp command, so it may seem redundant for SSH to also support sftp. But usability and familiarity notwithstanding, sftp provides a key feature lacking in scp: interactivity. By being interactive, sftp allows the client to browse files both on the remote host and locally (via the FTP commands dir and ldir, respectively) prior to downloading or uploading anything.
To use scp, however, you need prior knowledge of the remote system's filesystem layout and contents. While in many situations this isn't a big deal, particularly when using scp in scripts, it's an annoying limitation in many others. Thus, sftp deserves a place in the toolkits of SSH beginners and experts alike.
Note, however, that SSH doesn't explicitly support anonymous/public file sharing via either sftp or scp. It's certainly possible, given hefty amounts of caution and testing, to set up a nonprivileged account with an empty password and a closely watched home directory for this purpose. (sshd has a configuration option called PermitEmptyPasswords that is disabled by default but which may be set to yes.) I consider this to be playing with fire, however: SSH was designed for and excels at providing secure, restricted access. Anonymous file services are not only the best use of conventional FTP daemons such as ProFTPD; such access is best provided by them.
Configuration and use of the OpenSSH version of the Secure Shell, including scp and sftp, is covered in depth in Chapter 4.
Andrew Tridgell's rsync is another useful file-transfer tool, one that has no encryption support of its own but is easily "wrapped" (tunneled) by encryption tools such as SSH and Stunnel. What differentiates rsync (which, like scp, is based on rcp) is that it has the ability to perform differential downloads and uploads of files.
For example, if you wish to update your local copy of a 10 MB file, and the newer version on the remote server differs in only 3 places totaling 150 KB, rsync will automatically download only the differing 150 KB (give or take a few KB) rather than the entire file. This functionality is provided by the "rsync algorithm," invented by Andrew Tridgell and Paul Mackerras, which very rapidly creates and compares "rolling checksums" of both files, and thus determines which parts of the new file to download and add/replace on the old one.
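To make the idea of a "rolling checksum" concrete, here is a minimal Python sketch of a weak rolling checksum in the style of the rsync algorithm. It is a simplification under stated assumptions: the function names, the 16-bit modulus, and the sample data are illustrative rather than taken from rsync's source, and the byte-for-byte comparison stands in for the stronger hash that the real tool uses to confirm a match.

```python
from typing import Tuple

M = 1 << 16  # modulus for each running sum (an assumption of this sketch)


def weak_checksum(block: bytes) -> Tuple[int, int]:
    """Compute the two running sums (a, b) for a block from scratch."""
    size = len(block)
    a = sum(block) % M
    # The first byte is weighted by the block size, the last byte by 1.
    b = sum((size - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, old: int, new: int, size: int) -> Tuple[int, int]:
    """Slide the window one byte: drop `old` from the front, append `new`."""
    a = (a - old + new) % M
    b = (b - size * old + a) % M
    return a, b


def find_block(data: bytes, block: bytes) -> int:
    """Return the first offset in `data` where `block` occurs, or -1.

    The cheap rolling checksum filters candidates; a byte comparison
    (standing in for a strong hash) confirms each candidate.
    """
    size = len(block)
    if size == 0 or len(data) < size:
        return -1
    target = weak_checksum(block)
    a, b = weak_checksum(data[:size])
    for offset in range(len(data) - size + 1):
        if (a, b) == target and data[offset:offset + size] == block:
            return offset
        if offset + size < len(data):
            a, b = roll(a, b, data[offset], data[offset + size], size)
    return -1
```

The point of `roll` is that moving the window by one byte costs a few additions rather than a full recomputation, which is what makes it practical to test every offset of a large file.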
Since this is a much more efficient use of the network, rsync is especially useful over slow network connections. It does not, however, have any performance advantage over rcp in copying files that are completely new to one side or the other of the transaction. By definition, "differential copying" requires that there be two files to compare.
In summary, rsync is by far the most intelligent file-transfer utility in common use, one that is both amenable to encrypted sessions and worth the trouble of learning to run that way. Using rsync securely will be the focus of the remainder of the chapter.
Note that rsync supports a long list of flags and options, most of them relevant to specific aspects of maintaining software archives, mirrors, backups, etc. Only those options directly relevant to security will be covered in depth here, but the rsync(1) manpage will tell you anything you need to know about these other features.
Since Andrew Tridgell, rsync's original lead developer, is also one of the prime figures in the Samba project, rsync's home page is part of the Samba web site, http://rsync.samba.org. That, of course, is the definitive source of all things rsync. Of special note is the resources page (http://rsync.samba.org/resources.html), which has links to some excellent off-site rsync documentation.
The latest rsync source code is available at http://rsync.samba.org/ftp/rsync/, with binary packages for Debian, LinuxPPC, and Red Hat Linux at http://rsync.samba.org/ftp/rsync/binaries/ (binaries for a variety of other Unix variants are available here as well). rsync is already considered a standard Linux tool and is therefore included in all popular Linux distributions; you probably needn't look further than the Linux installation CD-ROMs to find an rsync package for your system.
However, there are security bugs in the zlib implementation included in rsync prior to v2.5.4 (i.e., these bugs apply regardless of the version of your system's shared zlib libraries). There is also an annoying bug in v2.5.4 itself, which causes rsync to sometimes copy whole files needlessly. I therefore recommend you run no version earlier than rsync v2.5.5, which, as of this writing, is the most current version; you may therefore very likely have to build rsync from source.
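If you want to check an installed copy against that minimum, the comparison is easy to script. The following Python sketch parses a version banner of the kind printed by rsync --version and compares it with 2.5.5. The function names and the sample banner strings are assumptions made for illustration; the exact banner wording varies between releases, so treat the regular expression as a starting point.

```python
import re
from typing import Tuple

MINIMUM = (2, 5, 5)  # the minimum version recommended in the text


def parse_rsync_version(banner: str) -> Tuple[int, ...]:
    """Extract the dotted version number from a version banner."""
    match = re.search(r"version\s+v?(\d+(?:\.\d+)*)", banner)
    if match is None:
        raise ValueError("no version number found in banner")
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(banner: str, minimum: Tuple[int, ...] = MINIMUM) -> bool:
    """True if the banner reports a version at or above `minimum`."""
    return parse_rsync_version(banner) >= minimum


# Hypothetical banners, shaped like the first line of `rsync --version`.
old_banner = "rsync  version 2.5.4  protocol version 26"
new_banner = "rsync  version 2.5.5  protocol version 26"
print(is_new_enough(old_banner), is_new_enough(new_banner))
```

Comparing tuples of integers rather than strings avoids the classic mistake of ranking "2.10" below "2.5".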
Happily, compiling rsync from source is fast and easy. Simply unzip and untar the archive, change your working directory to the top-level directory of the source code, enter ./configure, and if this script finishes without errors, enter make && make install.
Once rsync is installed, you can use it in several ways. The first and most basic is to use rsh as the transport, which requires any host to which you connect to have the shell service (i.e., in.rshd) enabled in inetd.conf. Don't do this! The Secure Shell was invented precisely because the "r" services (rcp, rsh, and rlogin) completely lack support for strong authentication, which led to their being used as entry points by many successful intruders over the years.
Therefore, I won't describe how to use rsync with rsh as its transport. However, you may wish to use this method between hosts on a trusted network; if so, ample information is available in both rsync's and in.rshd's respective manpages.
A much better way to use rsync than the rsh method is to specify the Secure Shell as the transport. This requires that the remote host be running sshd and that the rsync command be present (and in the default paths) on both hosts. If you haven't set up sshd yet, refer to Chapter 4 before you attempt the following.
Suppose you have two hosts, near and far, and you wish to copy the local file thegoods.tgz to far's /home/near.backup directory, which you think may already contain an older version of thegoods.tgz. Assuming your username, yodeldiva, exists on both systems, the transaction might look like this (Example 9-8).
    yodeldiva@near:~ > rsync -vv -e ssh ./thegoods.tgz far:~
    opening connection using ssh -l yodeldiva far rsync --server -vv . "~"
    yodeldiva@far's password: **********
    expand file_list to 4000 bytes, did move
    thegoods.tgz
    total: matches=678 tag_hits=801 false_alarms=0 data=11879
    wrote 14680 bytes read 4206 bytes 7554.40 bytes/sec
    total size is 486479 speedup is 25.76
First, let's dissect the command line in Example 9-8. rsync has only one binary executable, rsync, which is used both as the client command and, optionally, as a daemon. In Example 9-8, it's present on both near and far, but it runs as a daemon on neither: sshd is acting as the listening daemon on far.
The first rsync flag in Example 9-8 is -vv, which is the nearly universal Unix shorthand for "very verbose." It's optional, but instructive. The second flag is -e, with which you can specify an alternative to rsync's default remote shell program, rsh. Since rsh is the default, and since rsh and ssh are the only transports you're likely to use, -e is used in practice to specify ssh.
(Perhaps surprisingly, -e scp will not work, since prior to copying any data, rsync needs to pass a remote rsync command via ssh to generate and return rolling checksums on the remote file. In other words, rsync needs the full functionality of the ssh command to do its thing, so specify this rather than scp if you use the -e flag.)
After the flags come rsync's actionable arguments, the local and remote files. The syntax for these is very similar to rcp's and scp's: if you immediately precede either filename with a colon, rsync will interpret the string preceding the colon as a remote host's name. If the username you wish to use on the remote system is different from your local username, you can specify it by immediately preceding the hostname with an @ sign and preceding that with your remote username. In other words, the full rsync syntax for filenames is the following:
    [[username@]hostname:]/path/to/filename
There must be at least two filenames: the rightmost must be the "destination" file or path, and the others must be "source" files. Only one of these two may be remote, but both may be local (i.e., colonless), which lets you perform local differential file copying; this is useful if, for example, you need to back up files from one local disk or partition to another.
Getting back to Example 9-8, the source file specified is ./thegoods.tgz, an ordinary local file path, and the destination is far:~, which translates to "my home directory on the server far." If your username on far is different from your local username, say yodelerwannabe rather than yodeldiva, use the destination yodelerwannabe@far:~.
The last thing to point out in Example 9-8 is its output (that is to say, its very verbose output). We see that although the local copy of thegoods.tgz is 486,479 bytes long, only 14,680 bytes were actually sent. Success! thegoods.tgz has been updated with a minimum of unchanged data sent.
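The "speedup" figure at the end of that output can be checked by hand. The short Python calculation below assumes that rsync reports speedup as the file's total size divided by the total number of bytes exchanged in both directions; that assumption reproduces the figure in Example 9-8, but I haven't verified it against rsync's source.

```python
# Figures copied from the output in Example 9-8.
wrote = 14680        # bytes sent by the client
read = 4206          # bytes received by the client
total_size = 486479  # size of thegoods.tgz in bytes

# Assumed formula: file size over all bytes that crossed the wire.
transferred = wrote + read
speedup = total_size / transferred

print(f"{transferred} bytes on the wire, speedup {speedup:.2f}")
```

In other words, the whole update cost under 19 KB of traffic for a file of roughly 475 KB.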
Using rsync with SSH is the easiest way to use rsync securely with authenticated users, in a way that both requires and protects the use of real users' accounts. But as I mentioned earlier in Section 9.2.1, SSH doesn't lend itself easily to anonymous access. What if you want to set up a public file server that supports rsync-optimized file transfers?
This is quite easy to do: create a simple /etc/rsyncd.conf file and run rsync with the flag --daemon (i.e., rsync --daemon). The devil, however, is in the details: you should configure /etc/rsyncd.conf very carefully if your server will be connected to the Internet or any other untrusted network. Let's discuss how.
rsyncd.conf has a simple syntax: global options are listed at the beginning without indentation. "Modules," which are groups of options specific to a particular filesystem path, are indicated by a square-bracketed module name followed by indented options.
Option lines each consist of the name of the option, an equal sign, and one or more values. If the option is boolean, allowable values are yes or no (don't be misled by the rsyncd.conf(5) manpage, which, in some cases, refers to true and false). If the option accepts multiple values, these should be comma-space delimited, e.g., option1, option2, ....
Example 9-9 lists part of a sample rsyncd.conf file that illustrates some options particularly useful for tightening security. Although I created it for this purpose, it's a real configuration file: Example 9-9 is syntactically complete. Let's dissect it.
    # "global-only" options
    syslog facility = local5

    # global options which may also be defined in modules
    use chroot = yes
    uid = nobody
    gid = nobody
    max connections = 20
    timeout = 600
    read only = yes

    # a module:
    [public]
        path = /home/public_rsync
        comment = Nobody home but us tarballs
        hosts allow = near.echo-echo-echo.org, 10.18.3.12
        ignore nonreadable = yes
        refuse options = checksum
        dont compress = *
As advertised, Example 9-9's global options are listed at the top.
The first option set in Example 9-9, syslog facility, is also the only "global-only" option the example uses. Five options (syslog facility, motd file, log file, pid file, and socket options) may be used only as global settings, not in module settings. Of these, only syslog facility has direct security ramifications: like the ProFTPD directive SyslogFacility, rsync's syslog facility can be used to specify which syslog facility rsync should log to if you don't want it to use daemon, its default. If you don't know what this means, see Chapter 10.
For detailed descriptions of the other "global-only" options, see the rsyncd.conf(5) manpage; I won't cover them here, as they don't directly affect system security. (Their default settings are fine for most situations.)
All other allowable rsyncd.conf options may be used as global options, in modules, or both. If an option appears in both the global section and in a module, the module setting overrides the global setting for transactions involving that module. In general, global options replace default values and module-specific options override both default and global options.
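The precedence rule (module setting, then global setting, then built-in default) can be expressed in a few lines. The Python sketch below is a toy parser for the syntax described above, not a reimplementation of rsync's own parser, which handles details this one ignores. The function names are mine, the defaults table contains only values stated in this chapter, and the sample configuration is a trimmed-down hypothetical.

```python
from typing import Dict, Optional

# Defaults as stated in this chapter (not a complete list).
DEFAULTS = {
    "use chroot": "yes",
    "max connections": "0",
    "timeout": "0",
    "read only": "yes",
    "list": "yes",
    "transfer logging": "no",
}

Sections = Dict[Optional[str], Dict[str, str]]


def parse_rsyncd_conf(text: str) -> Sections:
    """Map None to the global options and each module name to its options."""
    sections: Sections = {None: {}}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()  # a new module begins here
            sections.setdefault(current, {})
            continue
        name, sep, value = line.partition("=")
        if sep:
            sections[current][name.strip()] = value.strip()
    return sections


def effective(sections: Sections, module: str, option: str) -> Optional[str]:
    """Resolve an option: module setting, then global, then default."""
    for scope in (sections.get(module, {}), sections[None], DEFAULTS):
        if option in scope:
            return scope[option]
    return None


SAMPLE = """
timeout = 600
read only = yes

[public]
    path = /home/public_rsync

[incoming]
    path = /home/incoming
    read only = no
"""

conf = parse_rsyncd_conf(SAMPLE)
print(effective(conf, "public", "read only"))    # inherited from the global section
print(effective(conf, "incoming", "read only"))  # overridden in the module
```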
The second group of options in Example 9-9 falls into the category of module-specific options:
If use chroot is set to yes, rsync will chroot itself to the module's path prior to any file transfer, preventing or at least hindering certain types of abuses and attacks. This has the tradeoff of requiring that rsync --daemon be started by root, but by also setting the uid and gid options, you can minimize the amount of time rsync uses its root privileges. The default setting is yes.
The uid option lets you specify with which user's privileges rsync should operate during file transfers, and it therefore affects which permissions will be applicable when rsync attempts to read or write a file on a client's behalf. You may specify either a username or a numeric user ID; the default is -2 (nobody on many systems, but not on mine, which is why uid is defined explicitly in Example 9-9).
The gid option lets you specify with which group's privileges rsync should operate during file transfers, and it therefore affects (along with uid) which permissions apply when rsync attempts to read or write a file on a client's behalf. You may specify either a group name or a numeric group ID; the default is -2 (nobody on many systems).
The max connections option limits the number of concurrent connections to a given module (not the total for all modules, even if set globally). If specified globally, this value will be applied to each module that doesn't contain its own max connections setting. The default value is zero, which places no limit on concurrent connections. I do not recommend leaving it at zero, as this makes Denial of Service attacks easier.
The timeout also defaults to zero, which, in this case, also means "no limit." Since timeout controls how long (in seconds) rsync will wait for idle transactions to become active again, this also represents a Denial of Service exposure and should likewise be set globally (and per-module, when a given module needs a different value for some reason).
The last option defined globally in Example 9-9 is read only, which specifies that the module in question is read-only, i.e., that no files or directories may be uploaded to the specified directory, only downloaded. The default value is yes.
The third group of options in Example 9-9 defines the module [public]. These, as you can see, are indented. When rsync parses rsyncd.conf downward, it considers each option below a module name to belong to that module until it reaches either another square-bracketed module name or the end of the file. Let's examine each of the module [public]'s options, one at a time.
The bracketed line [public] gives the name of the module. No arguments or other modifiers belong here: just the name you wish to call this module, in this case public.
The path option is mandatory for each module, as it defines which directory the module will allow files to be read from or written to. If you set the global option use chroot to yes, this is the directory rsync will chroot to prior to any file transfer.
The comment option supplies a string that will be displayed whenever a client requests a list of available modules. By default there is no comment.
You may, if you wish, use the hosts allow and hosts deny options to define Access Control Lists (ACLs). Each accepts a comma-delimited list of FQDNs or IP addresses from which you wish to explicitly allow or deny connections. By default, neither option is set, which is equivalent to "allow all." If you specify a FQDN (which may contain the wildcard *), rsync will attempt to reverse-resolve all connecting clients' IP addresses to names prior to matching them against the ACL.
rsync's precise interpretation of each of these options depends on whether the other is present. If only hosts allow is specified, then any client whose IP or name matches will be allowed to connect and all others will be denied. If only hosts deny is specified, then any client whose IP or name matches will be denied, and all others will be allowed to connect.
If, however, both hosts allow and hosts deny are present:
hosts allow will be parsed first, and if the client's IP or name matches, the transaction will be passed.
If the IP or name in question didn't match hosts allow, then hosts deny will be parsed, and if the client matches there, the transaction will be dropped.
If the client's IP or name matches neither, it will be allowed.
In Example 9-9, both options are set. They would be interpreted as follows:
Requests from 10.18.3.12 will be allowed, but requests from any other IP in the range 10.16.3.1 through 10.16.3.254 will be denied.
Requests from the host near.echo-echo-echo.org will be allowed, but everything else from the echo-echo-echo.org domain will be rejected. Everything else will be allowed.
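The evaluation order just described is compact enough to model directly. The Python sketch below mirrors the rules as stated in this section; it is not rsync's own matching code. The function names are mine, the hosts allow values come from Example 9-9, and the hosts deny values are hypothetical, inferred from the interpretation above.

```python
import fnmatch
import ipaddress
from typing import Iterable


def _matches(pattern: str, ip: str, hostname: str) -> bool:
    """True if `pattern` (an IP, a network, or a wildcard name) matches."""
    try:
        network = ipaddress.ip_network(pattern, strict=False)
        return ipaddress.ip_address(ip) in network
    except ValueError:
        # Not an address or network, so treat it as a hostname pattern.
        return fnmatch.fnmatch(hostname.lower(), pattern.lower())


def is_allowed(ip: str, hostname: str,
               allow: Iterable[str] = (), deny: Iterable[str] = ()) -> bool:
    """Apply hosts allow first, then hosts deny, then the fallback rule."""
    allow, deny = list(allow), list(deny)
    if any(_matches(p, ip, hostname) for p in allow):
        return True
    if any(_matches(p, ip, hostname) for p in deny):
        return False
    # No match at all: denied only when hosts allow is the sole option set.
    return not allow or bool(deny)


ALLOW = ["near.echo-echo-echo.org", "10.18.3.12"]  # from Example 9-9
DENY = ["*.echo-echo-echo.org", "10.16.3.0/24"]    # hypothetical values

print(is_allowed("192.0.2.10", "near.echo-echo-echo.org", ALLOW, DENY))
print(is_allowed("192.0.2.11", "far.echo-echo-echo.org", ALLOW, DENY))
```

Note the asymmetry in the last line of is_allowed: a client that matches nothing is let in when both lists are present, but refused when only hosts allow is set.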
If ignore nonreadable is set to yes, any remote file for which the client's rsync process does not have read permissions (see the uid and gid options) will not be compared against the client's local copy thereof. This probably enhances performance more significantly than security; as a means of access control, the underlying file permissions are more important.
The refuse options option tells the server-side rsync process to ignore the specified options if specified by the client. Of rsync's command-line options, only checksum has an obvious security ramification: it tells rsync to calculate CPU-intensive MD5 checksums in addition to its normal "rolling" checksums, so blocking this option reduces certain DoS opportunities. Although the compress option has a similar exposure, you can use the dont compress option to refuse it rather than the refuse options option.
You can specify certain files and directories that should not be compressed via the dont compress option. If you wish to reduce the chances of compression being used in a DoS attempt, you can also specify that nothing be compressed by using an asterisk (*), as in Example 9-9.
Before we leave Example 9-9, here's a word about setting up rsync modules (directories) at the filesystem level. The guidelines for doing this are the same as for anonymous FTP chroot environments, except that no system binaries or configuration files need to be copied inside them for chroot purposes, as is the case with some FTP servers. If you skipped it, refer back to Section 18.104.22.168 for more information.
The rsync configuration file listed in Example 9-9 is self-contained: with only a little customization (paths, etc.), it's all you need to serve files to anonymous users. But that's a pretty narrow offering. How about accepting anonymous uploads and adding a module for authenticated users? Example 9-10 illustrates how to do both.
    [incoming]
        path = /home/incoming
        comment = You can put, but you can't take
        read only = no
        ignore nonreadable = yes
        transfer logging = yes

    [audiofreakz]
        path = /home/cvs
        comment = Audiofreakz CVS repository (requires authentication)
        list = no
        auth users = watt, bell
        secrets file = /etc/rsyncd.secrets
First, we have a module called incoming, whose path is /home/incoming. Again, the guidelines for publicly writable directories (described earlier in Section 22.214.171.124) apply, but with one important difference: for anonymous rsync, this directory must be world-executable as well as world-writable (i.e., mode 0733). If it isn't, file uploads will fail without any error being returned to the client or logged on the server.
Some tips that apply from the FTP section are to watch this directory closely for abuse, never make it or its contents world-readable, and move uploaded files out of it and into a non-world-accessible part of the filesystem as soon as possible (e.g., via a cron job).
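As a concrete illustration of those permissions, the following Python sketch creates an upload directory with mode 0733 and confirms that it is not world-readable. It assumes a POSIX filesystem; the function names are mine, and the demonstration uses a temporary scratch directory rather than a real /home/incoming.

```python
import os
import stat
import tempfile


def make_incoming(path: str) -> int:
    """Create an upload directory with mode 0733 and return its mode."""
    os.makedirs(path, exist_ok=True)
    # chmod explicitly: a mode passed at creation time is filtered by the umask.
    os.chmod(path, 0o733)
    return stat.S_IMODE(os.stat(path).st_mode)


def is_world_readable(path: str) -> bool:
    """True if 'other' users have read permission on `path`."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def demo() -> str:
    """Build a scratch upload directory and report its octal mode."""
    with tempfile.TemporaryDirectory() as scratch:
        incoming = os.path.join(scratch, "incoming")
        mode = make_incoming(incoming)
        assert not is_world_readable(incoming)
        return oct(mode)


print(demo())
```

Mode 0733 lets anyone create files in the directory (write plus execute) without being able to list what others have uploaded.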
The only new option in the [incoming] block is transfer logging. This causes rsync to log more verbosely when actual file transfers are attempted. By default, this option has a value of no. Note also that the familiar option read only has been set to no, overriding its global setting of yes. There is no similar option for telling rsync that this directory is writable: this is determined by the directory's actual permissions.
The second part of Example 9-10 defines a restricted-access module named audiofreakz. There are three new options to discuss here.
The first, list, determines whether this module should be listed when remote users request a list of the server's available modules. Its default value is yes.
The other two new options, auth users and secrets file, define how prospective clients should be authenticated. rsync's authentication mechanism, available only when run in daemon mode, is based on a reasonably strong 128-bit MD5 challenge-response scheme. This is superior to standard FTP authentication for two reasons.
First, passwords are not transmitted over the network and are therefore not subject to eavesdropping attacks. (Brute-force hash-generation attacks against the server are theoretically feasible, however.)
Second, rsync doesn't use the system's user credentials: it has its own file of username-password combinations. This file is used only by rsync and is not linked or related in any way to /etc/passwd or /etc/shadow. Thus, even if an rsync login session is somehow compromised, no user's system account will be directly threatened or compromised (unless you've made some very poor choices regarding which directories to make available via rsync or in setting those directories' permissions).
Like FTP, however, data transfers themselves are unencrypted. At best, rsync authentication validates the identities of users, but it does not ensure data integrity or privacy against eavesdroppers. For those qualities, you must run it either over SSH as described earlier or over Stunnel (described later in this chapter and in Chapter 5).
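To see why a challenge-response scheme keeps the password off the wire, consider this generic Python sketch. It is emphatically not rsync's wire protocol: the function names, the way the challenge is generated, and the hex encoding are all assumptions chosen for illustration. MD5 is used only to mirror the scheme described above.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> str:
    """Server side: produce a fresh random challenge for each login."""
    return secrets.token_hex(16)


def response(password: str, challenge: str) -> str:
    """Client side: hash the password together with the challenge."""
    return hashlib.md5((password + challenge).encode("utf-8")).hexdigest()


def verify(stored_password: str, challenge: str, client_response: str) -> bool:
    """Server side: recompute the expected response and compare safely."""
    expected = response(stored_password, challenge)
    return hmac.compare_digest(expected, client_response)


# Only `challenge` and `answer` would cross the network; the password stays put.
challenge = new_challenge()
answer = response("correct horse", challenge)  # hypothetical password
print(verify("correct horse", challenge, answer))
print(verify("correct horse", challenge, response("wrong guess", challenge)))
```

Because the challenge changes on every attempt, a captured response can't simply be replayed later. An eavesdropper who records both values can still try guessing passwords offline, which is one more reason to choose strong ones.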
The secrets file option specifies the path and name of the file containing rsync username-password combinations. By convention, /etc/rsyncd.secrets is commonly used, but the file may have practically any name or location; it needn't end, for example, with the suffix .secrets. This option has no default value: if you wish to use auth users, you must also define secrets file. Example 9-11 shows the contents of a sample secrets file.
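A secrets file is simple to generate and to inspect programmatically. The Python sketch below assumes the one-pair-per-line username:password format documented in rsyncd.conf(5) and writes the file so that only its owner can read it. The usernames come from Example 9-10; the passwords, the function names, and the scratch location are made up for illustration.

```python
import os
import stat
import tempfile
from typing import Dict


def write_secrets(path: str, users: Dict[str, str]) -> None:
    """Write username:password lines to `path`, readable by its owner only."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        for name, password in users.items():
            handle.write(f"{name}:{password}\n")
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed


def read_secrets(path: str) -> Dict[str, str]:
    """Parse a secrets file back into a username-to-password mapping."""
    users: Dict[str, str] = {}
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            name, sep, password = line.partition(":")
            if sep:
                users[name] = password
    return users


def demo() -> Dict[str, str]:
    """Round-trip a hypothetical secrets file through a scratch directory."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "rsyncd.secrets")
        write_secrets(path, {"watt": "made-up-pass-1", "bell": "made-up-pass-2"})
        return read_secrets(path)


print(sorted(demo()))
```

Since the passwords are stored in clear text, restrictive permissions on this file matter as much as the passwords themselves.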
The auth users option in Example 9-10 defines which users (among those listed in the secrets file) may have access to the module. All clients who attempt to connect to this module (assuming they pass any applicable hosts allow and hosts deny ACLs) will be prompted for a username and password. Remember to set the permissions of the applicable files and directories carefully because these ultimately determine what authorized users may do once they've connected. If auth users is not set, users will not be required to authenticate, and the module will be available via anonymous rsync. This is rsync's default behavior in daemon mode.
And that is most of what you need to know to set up both anonymous and authenticated rsync services. See the rsync(1) and rsyncd.conf(5) manpages for full lists of command-line and configuration-file options, including a couple I haven't covered here that can be used to customize log messages.
Lest I forget, I haven't yet shown how to connect to an rsync server as a client. This is a simple matter of syntax: when specifying the remote host, use a double colon rather than a single colon, and use a path relative to the desired module, not an absolute path.
For example, to revisit the scenario in Example 9-8 in which your client system is called near and the remote system is called far, suppose you wish to retrieve the file newstuff.tgz, and that far is running rsync in daemon mode. Suppose further that you can't remember the name of the module on far in which new files are stored. First, you can query far for a list of its available modules, as shown in Example 9-12.
    [root@near darthelm]# rsync far::
    public          Nobody home but us tarballs
    incoming        You can put, but you can't take
(Not coincidentally, these are the same modules we set up in Examples 9-9 and 9-10, and as I predicted in the previous section, the module audiofreakz is omitted.) Aha, the directory you need is named public. Assuming you're right, the command to copy newstuff.tgz to your current working directory would look like this:
[yodeldiva@near ~]# rsync far::public/newstuff.tgz .
Both the double colon and the path format differ from SSH mode. Whereas SSH expects an absolute path after the colon, the rsync daemon expects a module name, which acts as the "root" of the file's path. To illustrate, let's look at the same command using SSH mode:
[yodeldiva@near ~]# rsync -e ssh far:/home/public_rsync/newstuff.tgz .
These two aren't exactly equivalent, of course, because whereas the rsync daemon process on far is configured to serve files in this directory to anonymous users (i.e., without authentication), SSH always requires authentication (although this can be automated using null-passphrase RSA or DSA keys, described in Chapter 4). But it does show the difference between how paths are handled.
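If you drive rsync from scripts, it helps to build the two kinds of remote specification with small helpers so that the single and double colons never get mixed up. The Python sketch below only assembles strings and argument lists; it doesn't run rsync. The helper names are hypothetical, while the hostnames and paths are the ones used in this section's examples.

```python
from typing import List, Optional


def ssh_spec(host: str, path: str, user: Optional[str] = None) -> str:
    """Single-colon form for SSH mode: [user@]host:path."""
    prefix = f"{user}@" if user else ""
    return f"{prefix}{host}:{path}"


def daemon_spec(host: str, module: str = "", path: str = "",
                user: Optional[str] = None) -> str:
    """Double-colon form for daemon mode: [user@]host::module/path."""
    prefix = f"{user}@" if user else ""
    relative = path.lstrip("/")  # paths are relative to the module
    tail = f"{module}/{relative}" if relative else module
    return f"{prefix}{host}::{tail}"


def rsync_argv(source: str, dest: str, use_ssh: bool = False) -> List[str]:
    """Assemble an rsync argument list, adding -e ssh for SSH mode."""
    argv = ["rsync"]
    if use_ssh:
        argv += ["-e", "ssh"]
    return argv + [source, dest]


print(rsync_argv(daemon_spec("far"), "."))  # list far's modules
print(rsync_argv(daemon_spec("far", "public", "newstuff.tgz"), "."))
print(rsync_argv(ssh_spec("far", "/home/public_rsync/newstuff.tgz"), ".",
                 use_ssh=True))
```

Passing the resulting list to something like subprocess.run, rather than pasting the pieces into a shell string, also sidesteps quoting problems with unusual filenames.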
What About Recursion?
I've alluded to rsync's usefulness for copying large bodies of data, such as software archives and CVS trees, but all my examples in this chapter show single files being copied. This is because my main priority is showing how to configure and use rsync securely.
I leave it to you to explore the many client-side (command-line) options rsync supports, as fully documented in the rsync(1) manpage. Particularly noteworthy are -a (or --archive), which is actually shorthand for -rlptgoD and which specifies recursion of most file types (including devices and symbolic links); and also -C (or --cvs-exclude), which tells rsync to use CVS-style file-exclusion criteria in deciding which files not to copy.
The last rsync usage I'll mention is the combination of rsync, running in daemon mode, with Stunnel. Stunnel is a general-purpose TLS or SSL wrapper that can be used to encapsulate any simple TCP transaction in an encrypted and optionally X.509-certificate-authenticated session. Although rsync gains encryption when you run it in SSH mode, it loses its daemon features, most notably anonymous rsync. Using Stunnel gives you encryption as good as SSH's, while still supporting anonymous transactions.
Stunnel is covered in depth in Chapter 5, using rsync in most examples. Suffice it to say that this method involves the following steps on the server side:
1. Configure rsyncd.conf as you normally would.
2. Invoke rsync with the --port flag, specifying some port other than 873 (e.g., rsync --daemon --port=8730).
3. Set up an Stunnel listener on TCP port 873 to forward all incoming connections on TCP 873 to the local TCP port specified in the previous step.
4. If you don't want anybody to connect "in the clear," configure hosts.allow to block nonlocal connections to the port specified in Step 2. In addition or instead, you can configure iptables to do the same thing.
On the client side, the procedure is as follows:
1. As root, set up an Stunnel listener on TCP port 873 (assuming you don't have an rsync server on the local system already using it), which forwards all incoming connections on TCP 873 to TCP port 873 on the remote server.
2. When you wish to connect to the remote server, specify localhost as the remote server's name. The local stunnel process will now open a connection to the server and forward your rsync packets to the remote stunnel process, and the remote stunnel process will decrypt your rsync packets and deliver them to the remote rsync daemon. Reply packets, naturally, will be sent back through the same encrypted connection.
As you can see, rsync itself isn't configured much differently in this scenario than anonymous rsync: most of the work is in setting up Stunnel forwarders.