The next step in securing a PostgreSQL installation is determining which computers are allowed to access your data.
PostgreSQL uses the $PGDATA/pg_hba.conf file to control client access (hba is an acronym for host-based authentication). Let's start by looking at a simple example:
# Allow all local users to connect without providing passwords local all trust # Allow users on our local network to connect to # database 'movies' if they have a valid password host movies 192.168.0.0 255.255.255.0 password
First, you should know that lines that begin with a # character are comments, and blank lines are ignored.
The remainder of the records in pg_hba.conf control access to one or more databases for one or most hosts.
Each record is composed of three or more fields.
The first field in each record corresponds to a type of connection. PostgreSQL currently supports three types of connections:
local? A local connection is one that comes in over a Unix-domain socket. By definition, a client connecting via a Unix-domain socket is executing on the same machine as the postmaster.
hostssl? A hostssl connection is a TCP/IP connection that uses the SSL (secure sockets layer) protocol.
host? A host connection is a TCP/IP connection that does not use SSL.
TCP/IP Connections with postmaster
When you start the postmaster process, the default is to prohibit access from other systems. Unless you enable TCP/IP connections, the postmaster will listen for connection requests coming from only local clients (in other words, the postmaster will listen only on a Unix-domain socket). You can enable TCP/IP connections using the postmaster's -i flag or by setting the tcp_ip configuration variable to TRUE.
The second field in each pg_hba.conf record specifies which database (or set of databases) the record controls. You can include the name of a database in this field, or you can specify one of two special values. The string all controls access to all databases, and sameuser controls access to a database whose name is identical to the name of the user making the connection.
The remainder of the pg_hba.conf record depends on the connection type. I'll look at each one in turn.
The format of a local record is
local database authentication-method [authentication-option]
You know that the database field contains the name of a database (or all or sameuser). The authentication-method field determines what method you must use to prove your identity. I'll explain authentication methods and authentications options in a moment.
The format of a host or hostssl record is
host database ip-address mask authentication-method [ option ] hostssl database ip-address mask authentication-method [ option ]
The ip-address field specifies either a TCP/IP host or a TCP/IP network (by numeric address). The mask field specifies how many bits in the ip-address are significant.
If you want to provide access to a specific host, say 192.168.0.1, you might specify
host all 192.168.0.1 255.255.255.255 krb5
The mask value of 255.255.255.255 tells PostgreSQL that all the bits in the ip-address are significant. If you want to specify that all the hosts on a network are granted some form of access, you would use a restricted mask. For example:
host all 192.168.0.0 255.255.255.0 krb5
This mask value specifies that all hosts on the 192.168.0.xxx network are granted access.
If you try to connect to a postmaster and your host address does not match any of the pg_hba.conf records, your connection attempt is rejected.
Now let's look at the authentication methods. Remember that you can specify a different authentication method for each host (or for each network). Some authentication methods are more secure than others, whereas some methods are more convenient than others.
When you use the trust authentication method, you allow any user on the client system to access your data. The client application is not required to provide any passwords (beyond what may be required to log in to the client system).
trust is the least secure of the authentication methods?it relies on the security of the client system.
You should never use trust to authenticate a connection attempt in an insecure network.
In most cases, you won't want to use the trust method to authenticate local connections. At first glance, it seems reasonable to trust the security on your own host; after all, I have to prove my identity to the operating system before I can start a client application. But the problem is not that I can fool the operating system; the problem is that I can impersonate another user. Consider the following scenario:
Welcome to arturo, please login... login: korry Password: cows Last login: Fri Jan 18 10:48:00 from marsalis [korry]$ psql ?U sheila ?d movies Welcome to psql, the PostgreSQL interactive terminal. movies=>
To log in to my host (arturo) as user korry, I am required to provide an operating system-authenticated password. But, if the movies database allows local connection attempts to be trusted, nothing stops me from impersonating another user (possibly gaining elevated privileges).
Given the security problems with trust, why would you ever want to use it? The trust authentication method is useful on single-user machines (that is, systems with only one user authorized to log in). You may also use trust to authenticate local connections on development or testing systems.
You never want to use trust on a multiuser system that contains important data.
The ident authentication method (like trust) relies on the client system to authenticate the user.
In the previous section, I showed you how easy it is to impersonate another user using the trust authentication method. All I have to do to impersonate another user is use the -U flag when I fire up the psql client application.
ident tries to be a bit more secure. Let's pretend that I am currently logged in to host vivaldi as user korry, and I want to connect to a PostgreSQL server running on host arturo:
$ whoami korry $ psql ?h arturo ?d movies ?U korry Welcome to psql, the PostgreSQL interactive terminal. movies=> select user; current_user korry
I'll walk through the authentication process for this connection.
First, my local copy of psql makes a TCP/IP connection to the postmaster process on host arturo and sends my username (korry). The postmaster (on arturo) connects back to the identd daemon on host vivaldi (remember, I am running psql on host vivaldi). The postmaster sends the psql-to-postmaster connection information to identd and identd replies with my username (also korry).
Now, the postmaster examines the pg_hba.conf record that matches my host. Assume that it finds the following:
host all 192.168.0.85 255.255.255.255 ident sameuser
The sameuser field tells the postmaster that if I am trying to connect using a name that matches the identd response, I am allowed to connect. (That might sound a little confusing at first. When you use the ident authentication method, the postmaster works with two different usernames: the name that I provided to the client application and the name returned by the identd daemon.)
Now let's see what happens when I try to impersonate another user. Recall from the previous section that I can fool the trust authentication method simply by lying about my username. It's a little harder to cheat with ident.
Let's say that I am logged in to host vivaldi as user sheila and I try to impersonate user korry. You can assume that because I am logged in to vivaldi, I have proven my identity to vivaldi by providing sheila's password.
$ whoami sheila $ psql ?h arturo ?d movies ?U korry psql: IDENT authentication failed for user 'sheila'
As before, my local copy of psql makes a TCP/IP connection to the postmaster process on host arturo and sends the username that I provided on the command line (korry). The postmaster (on arturo) connects back to the identd daemon on host vivaldi. This time, the identd daemon returns my real username (sheila).
At this point, the postmaster (on arturo) is working with two usernames. I have logged in to the client (vivaldi) as user sheila but when I started psql, I specified a username of korry. Because my pg_hba.conf record specified sameuser, I can't connect with two different usernames?my connection attempt is rejected.
Now that you've seen how the ident method provides a bit more security than trust, I'll show you a few more options that you can use with ident.
In the preceding examples, I used the sameuser option in my pg_hba.conf record. Instead of sameuser, I can specify the name of a map. A map corresponds to a set of entries in the $PGDATA/pg_ident.conf file. pg_ident.conf is a text file containing one record per line (as usual, blank lines and lines starting with a '#' character are ignored). Each record in the pg_ident.conf file contains three fields:
mapname? Corresponds to the map field in a pg_hba.conf record
ident-name? This is a name returned by the identd daemon on a client system
pguser-name? PostgreSQL username
Here is an example:
# pg_ident.conf # #mapname ident-name pguser-name #----------- ------------- ----------- host-wynton Administrator bruce host-vivaldi Administrator sheila host-vivaldi sheila sheila host-vivaldi korry korry # pg_hba.conf # host all 192.168.0.85 255.255.255.255 ident host-vivaldi host all 192.168.0.22 255.255.255.255 ident host-wynton
You can see in this example that I have defined two ident maps: host-vivaldi and host-wynton. The pg_hba.conf file specifies that any connection attempts from host 192.168.0.85 should use the ident method with the host-vivaldi ident map; any connection attempts from host 192.168.0.22 should use the host-wynton map.
Now look at the pg_ident.conf file?there are three entries in the host-vivaldi map and one entry in the host-wynton map.
The host-wynton map says that if I am logged in to my client machine (192.168.0.22) as user Administrator, I can connect to a database as PostgreSQL user bruce.
The host-vivaldi map says that I can connect as PostgreSQL user sheila if I am logged in to my host as Administrator or if I am logged in as user sheila. Also, if I am logged in as korry, I can connect as PostgreSQL user korry.
So, why is the ident method insecure? Think back to the trust method?it is insecure because you trust the user to tell the truth about his or her identity. ident is insecure because you are trusting the client system. The network protocol used by the identd daemon is very simple and easy to impersonate. It's easy to set up a homegrown program to respond to identd queries with inaccurate usernames. In fact, I recently downloaded and installed an ident server on my Windows laptop, and one of the command-line options allowed me to specify a fake username!
I would recommend against using the ident authentication method except on closed networks (that is, networks where you control all the connected hosts).
The password authentication method provides a reasonably high level of security compared to trust and ident. When you use password authentication, the client is required to prove its identity by providing a valid password.
PostgreSQL authentication passwords are not related to the password that you use to log in to your operating system.
On a Unix (or Linux) host, OS passwords are usually stored in /etc/passwd or /etc/shadow. When you log in to a Unix machine, you are prompted for your OS password, and the login program compares the password that you enter with the appropriate entry in the /etc/passwd file.
How does PostgreSQL decide whether to look in pg_shadow or in a flat password file? It examines the pg_hba.conf record that matches your client's host IP address. Here are two sample pg_hba.conf entries:
# pg_hba.conf # host all 192.168.0.85 255.255.255.255 password host all 192.168.0.22 255.255.255.255 password accounting
When you log in to a PostgreSQL database using password authentication, you must provide a password, but that password is stored in a separate location. By default, PostgreSQL passwords are stored (in unencrypted form) in the pg_shadow table. You can also store encrypted passwords in files that are external to the database (these external files are called flat password files).
The first record specifies that host 192.168.0.85 should use password authentication. Because there is nothing following the word password, PostgreSQL looks for passwords in the pg_shadow table.
The second record in this pg_hba.conf file specifies that host 192.168.0.22 should use password authentication as well. In this case, I included an authentication-option. Recall that the format of a pg_hba.conf record is
connect-type database authentication-method [authentication-option]
The authentication-option for password authentication specifies the name of a flat password file. The name that you provide is assumed to be the name of a file in the $PGDATA directory. In this example, the flat password file is named $PGDATA/accounting. Note that you can define as many flat password files as you like.
When you store passwords in the pg_shadow table, you use the CREATE USER or ALTER USER commands to maintain passwords. For example, to create a new (password-authenticated) user, you would use the following command:
CREATE USER bruce WITH PASSWORD 'cricketers';
If you want to change bruce's password, you would use the ALTER USER command:
ALTER USER bruce WITH PASSWORD 'Wooloomooloo';
Are pg_shadow Passwords Encrypted?
When you store passwords in the pg_shadow table, you may be surprised to find that they are not stored in an encrypted format. If you are a PostgreSQL superuser (see Chapter 19), you can view anyone's password by selecting from the pg_shadow table. If you are a Unix superuser, you can see anyone's password by examining the $PGDATA/global/pg_pwd file (all passwords are copied from the pg_shadow table into the $PGDATA/global/pg_pwd each time you change any password using CREATE USER or ALTER USER).
PostgreSQL release 7.2 gives you another option. You can choose to store md5 encrypted passwords in the pg_shadow table. md5 encrypted passwords cannot be used with either the password or crypt authentication methods. I'll describe authentication using md5 in a moment.
Remember that flat password files are stored in the $PGDATA directory (or in a subdirectory). You can't use the CREATE USER or ALTER USER commands to maintain flat password files; instead, you use an external utility program named pg_passwd.
Provide the Pathname When Running pg_passwd
When you run the pg_passwd command, you must provide the pathname of the flat password file. (A common mistake is to omit the path and supply only the filename?if you don't happen to be in the $PGDATA directory, you won't be editing the correct flat file.)
Here is a sample pg_passwd session:
$ pwd /usr/local/pgsql/data $ pg_passwd accounting Username: oswald New password: Re-enter new password: $
You'll notice that the passwords that I typed in are not echoed to the screen.
In the usual case, you must be logged in as the PostgreSQL administrative user (postgres) to edit flat password files; the files are located in the $PGDATA directory and that directory is secured.
In the preceding example, the pg_passwd program stored oswald's password in the $PGDATA/accounting file. Flat password files look very much like the /etc/passwd file:
$ cat $PGDATA/accounting oswald:C63KRm.yVkrH2
You can see that there are two fields in this file, (separated by a colon). The first field is the name of a PostgreSQL user (oswald). The second field contains an encrypted form of oswald's password. (The /etc/passwd file contains more information than just a username and a password.) You can edit a flat password file by hand; of course, you would have trouble coming up with an encrypted password. If you remove the password (or set the password to +) for a user, the postmaster will look to the pg_shadow table to authenticate that user.
If you want each user to be able to change his own password, you can link a flat password file to the /etc/passwd file (ln ?s $PGDATA/accounting /etc/passwd). When you use the /etc/passwd file to authenticate, each user can use the standard Unix passwd program to change his own password?the OS password and PostgreSQL password are then the same.
The crypt authentication method is nearly identical to password. There are two features that differentiate password and crypt:
Using the crypt method, the password is sent from the client in encrypted (rather than cleartext) form.
The crypt method will not use an external flat password file?it will always use the pg_shadow table.
How Are Crypt Passwords Encrypted?
I mentioned in the last section that pg_shadow passwords are not stored in an encrypted form. So how is it that the crypt authentication method works with encrypted passwords?
When a client application wants to connect to a crypt-authenticating server, the server sends a random number (called a salt value) back to the client. After the client knows what salt value to use, it encrypts the password (entered by the user) with the salt and sends the result to the server. The server reads the cleartext password (stored in the pg_shadow table) and encrypts it with the same salt value. If the two encrypted passwords match, the client is successfully authenticated. The result is that passwords are stored in cleartext form, but encrypted passwords are sent across the network.
Now here's a tricky question: If you can store encrypted passwords in a flat password file (using the password method) and cleartext passwords are stored in $PGDATA/global/ pg_pwd when you use crypt, which method is more secure? The answer depends on whom you trust. If you allow clients to connect over an untrusted network, use crypt; otherwise, network eavesdroppers might see the cleartext passwords sent by the password method. If all your clients connect over a trusted network, you might favor password authentication?that way, you are minimizing the damage that might be done if someone happens to obtain superuser access. The md5 authentication method is designed to resolve both of these problems.
The third password-based authentication method is md5. With md5 authentication, passwords are stored in the pg_shadow table in encrypted form. md5 authentication was not available prior to PostgreSQL release 7.2.
You create encrypted passwords using the CREATE USER and ALTER USER commands.
ALTER USER bruce WITH ENCRYPTED PASSWORD 'Wooloomooloo';
Note the keyword ENCRYPTED.
md5 is a cryptographically secure message digest algorithm developed by Ron L. Rivest of RSA Security. A message digest algorithm takes a cleartext message (in our case, a password) and produces a long number, called a hash or digest, based on the contents of the message. The md5 algorithm is carefully designed so that no two messages are likely to produce the same digest. It is nearly impossible to recover the original password given an md5 digest.
How can a message digest be used as a password? If you feed two passwords into the md5 algorithm, you will get the same digest value if the passwords are identical. When you create an encrypted password, the password itself is not actually stored in pg_shadow. Instead, PostgreSQL computes an md5 digest over the password and stores the digest. When a client attempts to connect using md5 authentication, the client computes an md5 digest over the password provided by the user and sends the digest to the server. The server compares the digest stored in pg_shadow with the digest provided by the client. If the two digests are identical, it is extremely likely that the passwords match.
There are a couple of security holes in the procedure that I just described. Let's say that bruce and sheila each happened to choose the same password. Two identical passwords will produce the same message digest. If bruce happened to notice that his pass word had the same message digest as sheila's, he would know that he and sheila had chosen the same password. To avoid this problem, PostgreSQL combines each password with the user's name before computing the md5 digest. That way, if two users happen to choose the same password, they won't have the same md5 digests. The second problem has to do with network security. If a client sent the same message digest to the server every time a given user logged in, the message digest would essentially function as a cleartext password. A nefarious user could watch the network traffic, capture the cleartext message digest, and impersonate the real user (by providing the same cleartext message digest). Instead, PostgreSQL uses the salt strategy that I described earlier (see the sidebar "How Are Crypt Passwords Encrypted?"). When a client connects to an md5 authenticating server, the server sends a random salt to the client. The client computes an md5 digest based on the user ID and password; this digest matches the digest stored in pg_shadow. The client then combines the salt (from the server) with the first md5 digest and computes a second digest. The second digest is sent to the server. The server combines the salt with the digest stored in pg_shadow and computes a new md5 digest. The server then compares the client's digest with its (salted) own?if the digests match, the passwords match.
The final password-based authentication method is pam (Pluggable Authentication Module). You've probably noticed that PostgreSQL offers many methods for authenticating a user. This problem is not unique to PostgreSQL?many applications have the need to authenticate a user. The goal of pam is to separate the act of authenticating a user from each and every application by placing authentication services in a framework that can be called by any application.
A system administrator can define different authentication methods for each application, depending on how secure the application needs to be. Using pam, an administrator can create a completely open system, requiring no passwords at all, or can choose to authenticate users using passwords, challenge-response protocols, or even more esoteric biometric authentication methods. PostgreSQL can use the pam framework.
Although pam can be ported to many Unix systems, it is most commonly found in Linux and Solaris. Configuring a pam system is not for the faint-of-heart, and the topic deserves an entire book. Because of space considerations, I won't try to describe how to configure a pam installation. Instead, I recommend that you visit the Linux-PAM web site (http://www.kernel.org/pub/linux/libs/pam/) for more information.
The krb4 and krb5 authentication methods correspond to Kerberos version 4 and Kerberos version 5, respectively. Kerberos is a network-secure authentication service developed at MIT.
Kerberos is a complex package (particularly from the administrator's point of view), but it offers a high level of security. After Kerberos is properly installed and configured, it is easy to use.
The easiest way to understand Kerberos is to compare it with a more traditional authentication method.
Let's say that you want to use telnet to connect to another host (bach) on your network. You start by issuing the telnet command:
$ telnet bach Trying bach... Connected to bach (192.168.0.56) Escape character is '^]'. login: korry Password: cows Last login: Thu Jan 24 19:18:44 $
After providing your username, the login program (on bach) asks for your password. Your password is compared with the password stored on bach (in the /etc/passwd or /etc/shadow file). If the password that you provide matches, you have proven your identity and bach permits access.
If you log out of bach and log back in, you must again provide your identity and prove that you are who you say you are.
Now let's see how you perform the same operation when using Kerberos.
With Kerberos, you don't have to prove your identity to each server; instead, you authenticate yourself to a trusted server. In this case, trusted means that both the client (that's you) and the server will trust the Kerberos authentication agent to verify that you are who you say you are.
Before you telnet using Kerberos, you must first obtain a ticket.
$ kinit Password for email@example.com: cows
After you enter your password, the kinit program contacts the Kerberos authentication server (AS) and asks for a ticket. If your password was correct, the AS returns a chunk of data known as a TGT (ticket-granting ticket). The kinit program stores your TGT in a cache file inside of a temporary directory on your system.
At this point, you have proven your identity to the AS, and the AS has given back a certificate that you can use with servers that trust the AS. You can view your TGT using the klist command:
$ klist Ticket cache: /tmp/krb5cc_tty1 Default principal: firstname.lastname@example.org Valid starting Expires Service principal 25 Jan 02 01:25:47 25 Jan 02 09:25:42 krbtgt/movies.bi@ movies.biz $
(Notice that the ticket expires in about eight hours?I have to occasionally reauthenticate myself to the AS.)
Now, you can use that TGT by using a Kerberos-enabled telnet client to connect to a Kerberos-enabled telnet server:
$ telnet -a bach Trying bach... Connected to bach (192.168.0.56) Escape character is '^]'. Last login: Thu Jan 24 19:18:44 $
There are two things that you should notice about this login example. First, I used the -a flag when I started telnet?that flag asks telnet to use Kerberos authentication. Second, I was not prompted for a user or for a password. Why not? The telnet client (on my local machine) used my TGT to ask the AS for another ticket, specifically a ticket that allows me to connect to the telnet server on bach. The AS sent the second ticket back to my local machine, and the new ticket was stored in my ticket cache. This new ticket is specific to telnet. My local telnet client sends the new ticket to the telnet server. The ticket contains enough secure (encrypted) information to satisfy the telnet server that I have proven my identity (specifically, I have proven my identity to the AS, and the telnet server trusts the AS).
I can view the new ticket with the klist command:
$ klist Ticket cache: /tmp/krb5cc_tty1 Default principal: email@example.com Valid starting Expires Service principal 25 Jan 02 01:25:47 25 Jan 02 09:25:42 firstname.lastname@example.org 25 Jan 02 03:01:25 25 Jan 02 13:01:20 email@example.com $
So, how does all this fit into PostgreSQL? PostgreSQL client applications (psql, for example) and the postmaster can be compiled to support Kerberos authentication.
When you specify the krb4- or krb5-authentication method, you are telling the postmaster that client applications must provide a valid Kerberos ticket.
When you connect to a krb4 or krb5 authenticated postmaster with a Kerberos-enabled client application, you are not required to supply a username or password?instead, the client application sends a Kerberos ticket to the postmaster.
The nice thing about Kerberos authentication is that it is secure and convenient at the same time. It is secure because you never send cleartext passwords over an insecure network. It is convenient because you authenticate yourself only once (using the kinit program).
As I mentioned earlier, setting up a Kerberos system is not a trivial project. After you have gone through the pain and mystery of installing and configuring Kerberos, you can configure PostgreSQL to use Kerberos to authenticate connection requests. Explaining how to install and configure would require a second book. If you are interested in using Kerberos authentication with PostgreSQL, I recommend you start by reading through the Kerberos web site: http://web.mit.edu/kerberos/www/index.html. The PostgreSQL Administrator's Guide provides the details you will need to connect a PostgreSQL database to an installed Kerberos system.
Kerberos is the second most secure authentication method.
The reject authentication method is the easiest to understand and is also the most secure. When a client tries to connect from a system authenticated by the reject method, the connection attempt is rejected.
If you try to connect from a system that does not match any of the pg_hba.conf records, you are also rejected.
Why might you want to use the reject method? Let's say that you have a reasonable amount of trust in most of the machines on your network, but you reserve one host as a demonstration machine (192.168.0.15). The demonstration machine should be allowed to access the demo database, but no other databases. Every other host should be allowed to access all databases (using Kerberos 5).
# File: pg_hba.conf # # Type Database Client IP address Netmask Method ###### ######## ################# ############### ###### host demo 192.168.0.15 255.255.255.255 trust host all 192.168.0.15 255.255.255.255 reject host all 192.168.0.0 255.255.255.0 krb5
Notice that there are two entries for the demo machine (192.168.0.15). The first entry allows trusted access to the demo database. The second entry rejects access to all other databases. This demonstrates an important point: The postmaster starts reading at the beginning of the pg_hba.conf file and stops as soon as it finds a record that matches on connection type, database name, and IP address/mask. When a user tries to connect to the demo database from the demo machine, the postmaster searches for a record of type host with a database of either demo, all, or sameuser (and of course, a match on the IP address/Netmask combination). The first record matches, so the postmaster allows access without requiring any form of authentication other than the IP address of the demo machine. Now suppose that a user (again on the demo machine) tries to connect to a different database (say, accounting). This time, the postmaster searches for a record of type host and a database of accounting, all, or sameuser. The first record no longer matches (wrong database name), so the postmaster moves on. The second record matches and the postmaster rejects the connection attempt. If a user logged in to a different host tries to connect, the postmaster will find the third record (the first two records won't match the IP address) and allow access using Kerberos 5 authentication.
If the postmaster can't find a record that matches a connection attempt, the connection is rejected, so you may be wondering why the reject method is needed.
Consider what would happen if you removed the second record from this file. If a user on the demo machine tries to connect to the accounting database, the postmaster will ignore the first record (wrong database) and move on to the last record. The last record says that anyone in our local network should be allowed to connect to all databases using Kerberos 5 authentication. That is clearly the wrong answer.