Securing the PostgreSQL Data Files

The first step in securing a PostgreSQL installation is to secure the actual data files that comprise each database. PostgreSQL is typically installed in the /usr/local/pgsql directory. Executables (such as psql, initdb, and the postmaster) are often installed in the /usr/local/pgsql/bin directory. If you have a typical installation, you can expect to find data files: databases, configuration, and security information in /usr/local/pgsql/data. I'll refer to this last directory as $PGDATA. PostgreSQL uses the $PGDATA environment variable to find its data files.

Let's start by looking at the directory structure of a PostgreSQL installation. Figure 21.1 gives you a look at the structure.

Figure 21.1. The directory structure of a PostgreSQL installation.

graphics/21fig01.gif

The data directory contains three subdirectories: base, global, and pg_xlog[1].

[1] You may see more files and subdirectories if you are running a different version of PostgreSQL. This snapshot shows a typical installation of PostgreSQL release 7.1.3.

The data/base directory is where your databases live. Notice that I have three subdirectories underneath the base directory?that's because I have three databases. If you are curious about the directory naming scheme, the numbers correspond to the OIDs (object ids) of the corresponding rows in the pg_database table. You can see the correspondence by executing the following query:


psql> select oid, datname from pg_database;



  oid | datname

------+-----------

18721 | movies

    1 | template1

18719 | template0

The data/global directory contains information that spans all databases; in other words, the information in the global directory is independent of any particular database. The global directory contains the following files: 1260, 1261, 1262, 1264, 1269, 17127, 17130, pg_control, and pg_pwd.

Like the data/base directory, the data/global directory contains a few files whose names are actually OID values. Table 21.1 shows how the OID values translate into table names.

Table 21.1. OID to Table Mapping in the global Directory

Filename/OID

Corresponding Table

1260

pg_shadow

1261

pg_group

1262

pg_database

1264

pg_variable

1269

pg_log

17127

index (on name) for pg_group

17130

index (on sysid) for pg_group

Each of these files is explained in Chapter 19, "General PostgreSQL Administration," so I won't cover that information here.

The data/pg_xlog directory contains the write-ahead transaction log (also described in Chapter 19).

Unix File Permissions and Ownership

In a Unix environment, there are three aspects to file system security. Each file (or directory) has an owner, a group, and a set of permissions. You can see all three of these attributes using the ls ?l command. Here is an example:

total 40 drwx------ 5 postgres postgresgrp 4096 Oct 22 17:40 base drwx------ 2 postgres postgresgrp 4096 Jan 15 18:58 global -rw------- 1 postgres postgresgrp 7482 Jan 15 19:26 pg_hba. graphics/ccc.gifconf -rw------- 1 postgres postgresgrp 1118 Oct 22 17:35 pg_ident. graphics/ccc.gifconf -rw------- 1 postgres postgresgrp 4 Oct 22 17:35 graphics/ccc.gifPG_VERSION drwx------ 2 postgres postgresgrp 4096 Oct 22 17:35 pg_xlog -rw------- 1 postgres postgresgrp 3137 Oct 22 17:35 graphics/ccc.gifpostgresql.conf -rw------- 1 postgres postgresgrp 49 Jan 10 14:18 graphics/ccc.gifpostmaster.opts -rw------- 1 postgres postgresgrp 47 Jan 10 14:18 graphics/ccc.gifpostmaster.pid

Each line of output can be divided into seven columns. Starting at the right-most column, you see the file (or directory) name. Working to the left, you'll see the modification date, file size (in bytes), group name, username, link count, and file permissions.

The file permissions column can be interpreted as follows:


drwxrw-r--

The first character is a file type indicator and contains a "d" for directories and a "-" for normal files (other values are possible?refer to your OS documentation for more information).

Following the type indicator are three groups of access permissions, and each group contains three characters. The first group (rwx in this example) specifies access permissions for the owner of the file. rwx means that the owner can read, write, and execute the file. The next three characters (rw-) specify access permissions for members of the group. rw- means that members of the group can read and write this file, but cannot execute it. The last three characters in the permissions column control access by other users (you are considered an "other" user if you are not the owner and you are not in the file's group). r-- means that other users can read the file, but cannot write or execute it.

Permissions mean something a little different for directories. If you have read permissions for a directory, you can list the contents of that directory (using ls, for example). If you have write permissions for a directory, you can create files in, and remove files from, that directory. If you have execute permission, you can access the files in a directory (read permission allows you to list the contents of a directory; execute permission allows you to work with the contents of the files in that directory).

When you install PostgreSQL from a standard distribution, such as an RPM package, the installation procedure will automatically apply the correct ownership and permissions to all PostgreSQL components. In rare circumstances, you may find that you need to reset ownerships and permissions back to their correct states. Why? You may find that your system has been "hacked." You may need to recover from an error in a backup/restore procedure. You may have executed a recursive chown, chmod, or chgrp starting in the wrong directory?you're not an experienced system administrator until you have made (and recovered from) this mistake. It's a good idea to understand what the correct ownerships and permissions are, just in case you ever need to put things back the way they are supposed to be.

The entire directory tree (starting at and including the $PGDATA directory) should be owned by the PostgreSQL administrative user (this user is typically named :postgres"). It's easy to correct the file ownerships using the chown command:


$ chown -R postgres $PGDATA

You can use the following commands to find any files that are not owned by user postgres:


$ cd $PGDATA

$ find . -not -user postgres -ls

The $PGDATA directory tree should be readable and writable by the PostgreSQL administrative user, and should provide no access to the group and other categories. Again, setting the file permissions is easy:


$ cd $PGDATA

$ find . ?type d ?exec chmod 700 '{}' ';'

$ find . ?type f ?exec chmod 600 '{}' ';'

The first find command modifies the directories, and the second modifies the normal files. The numbers (700 and 600) are a portable way to specify access permissions. 700 is equivalent to u=rwx,g=,o=, meaning that the owner of the directory should have read, write, and execute permissions; other users have no rights. 600 is equivalent to u=rw,g=,o= meaning that the owner of the file should have read and write permissions and other users should have no access rights. You can use whichever form you prefer. The numeric form is more succinct and more portable. I prefer the symbolic form, probably because I can't do octal arithmetic in my head.

It's a good idea to verify file and directory permissions occasionally for the reasons I mentioned earlier: You may have an intruder on your system, or you might need to recover from a user mistake. You can also use the find command to find any files or directories with incorrect permissions:


$ cd $PGDATA

$ find . ?type d -not ?perm 700 -print

$ find . ?type f ?not ?perm 600 -print

There is one more file that you should consider securing besides the files in the $PGDATA directory tree. When local users (meaning users who are logged in to the system that hosts your PostgreSQL database) connect to the postmaster, they generally use a Unix-domain socket. (A Unix-domain socket is a network interface that doesn't actually use a network. Instead, a Unix-domain socket is implemented entirely within a single Unix operating system.) When you start the postmaster process, it creates a Unix-domain socket, usually in the /tmp directory. If you have a postmaster running on your system, look in the /tmp directory and you will see the socket that your postmaster uses to listen for connection requests:


$ ls ?la /tmp

total 8095

drwxrwxrwt   12 root     root         1024 Jan 25 18:04 .

drwxr-xr-x   21 root     root         4096 Jan 25 16:23 ..

drwxr-xr-x    2 root     root         1024 Jan 10 10:37 lost+found

srwxrwxrwx    1 postgres postgresgrp     0 Jan 25 18:01 .s.PGSQL.5432

-r--r--r--    1 root     root           11 Jan 24 19:18 .X0-lock

(You will likely find other files in the /tmp directory.) The postmaster's socket is named s.PGSQL.5432. You can tell that this is a socket because of the s in the left-most column. Because the name of the socket starts with a., I had to use the -a flag on the ls command. Files whose names begin with a period (.) are normally hidden from the ls command.

Notice that the permissions on this socket are rwxrwxrwx. This means that any user (the owner, members of the group, or others) can connect to this socket. You might consider restricting access to this socket. For example, if you change the permissions to rwxrwx---, only user postgres and members of the postgresgrp group could connect.

Unlike normal files, you don't set the socket permissions using the chmod command (the postmaster's socket is created each time the postmaster starts). Instead, you use the UNIX_SOCKET_PERMISSION runtime-configuration option (Chapter 19 discusses runtime-configuration options in more detail).

Note that just because you can connect to the socket does not mean that the postmaster will allow you to access a database?the next section describes how to secure the postmaster.



    Part II: Programming with PostgreSQL