5.4 The MX Algorithm

That's the basic idea behind MX records and mail exchangers, but there are a few more wrinkles you should know about. To avoid routing loops, mailers need to use a slightly more complicated algorithm than what we've described when they determine where to send mail.[1]

[1] This algorithm is based on RFC 974, which describes how Internet mail routing works.

Imagine what would happen if mailers didn't check for routing loops. Let's say you send mail from your workstation to nuts@oreilly.com, raving (or raging) about the quality of this book. Unfortunately, ora.oreilly.com is down at the moment. No problem! Recall oreilly.com's MX records:

oreilly.com.    IN    MX    0  ora.oreilly.com.
oreilly.com.    IN    MX    10 ruby.oreilly.com.
oreilly.com.    IN    MX    10 opal.oreilly.com.

Your mailer falls back and sends your message to ruby.oreilly.com, which is up. ruby.oreilly.com's mailer then tries to forward the mail on to ora.oreilly.com but can't because ora.oreilly.com is down. Now what? Unless ruby.oreilly.com checks the sanity of what she is doing, she'll try to forward the message to opal.oreilly.com or maybe even to herself. That's certainly not going to help get the mail delivered. If ruby.oreilly.com sends the message to herself, we have a mail routing loop. If ruby.oreilly.com sends the message to opal.oreilly.com, opal.oreilly.com will either send it back to ruby.oreilly.com or send it to herself, and we again have a mail routing loop.

To prevent this from happening, mailers discard certain MX records before they decide where to send a message. A mailer sorts the list of MX records by preference value and looks in the list for the canonical domain name of the host on which it's running. If the local host appears as a mail exchanger, the mailer discards that MX record and all MX records in which the preference value is equal or higher (that is, equally or less-preferred mail exchangers). That prevents the mailer from sending messages to itself or to mailers "farther" from the eventual destination.
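The discard rule is easy to express in a few lines of code. What follows is only a minimal sketch in Python, not the code of any real mailer: the function name usable_exchangers, the (preference, host) tuple layout, and the normalize helper are our own assumptions for illustration. It assumes the MX records have already been looked up and that the mailer knows its own canonical domain name.

```python
from typing import List, Tuple

# Hypothetical representation: (preference value, mail exchanger name)
MX = Tuple[int, str]


def normalize(name: str) -> str:
    """Compare domain names case-insensitively, ignoring a trailing dot."""
    return name.rstrip(".").lower()


def usable_exchangers(mx_records: List[MX], local_host: str) -> List[MX]:
    """Return the MX records a mailer may still try, most preferred first.

    If the local host appears in the list, its record and every record
    with an equal or higher preference value are discarded.
    """
    ordered = sorted(mx_records, key=lambda record: record[0])
    local = normalize(local_host)
    local_prefs = [pref for pref, host in ordered if normalize(host) == local]
    if not local_prefs:
        # The local host is not a mail exchanger; keep the whole list.
        return ordered
    cutoff = min(local_prefs)
    return [(pref, host) for pref, host in ordered if pref < cutoff]


records = [
    (0, "ora.oreilly.com."),
    (10, "ruby.oreilly.com."),
    (10, "opal.oreilly.com."),
]
print(usable_exchangers(records, "ruby.oreilly.com"))
# [(0, 'ora.oreilly.com.')]
```

Run on ruby.oreilly.com, the sketch leaves only the record for ora.oreilly.com. Run on ora.oreilly.com, it returns an empty list, a case a real mailer has to treat specially.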

Let's think about this in the context of our airport analogy. This time, imagine you're an airline passenger (a message) trying to get to Greeley, Colorado. You can't get a direct flight to Greeley, but you can fly to either Fort Collins or Denver (the two next-highest mail exchangers). Since Fort Collins is closer to Greeley, you opt to fly to Fort Collins.

Now, once you've arrived in Fort Collins, there's no sense in flying to Denver, away from your destination (a lower-preference mail exchanger). (And flying from Fort Collins to Fort Collins would be silly, too.) So the only acceptable flight to get you to your destination is now a Fort Collins-Greeley flight. You eliminate flights to less-preferred destinations to prevent frequent-flyer looping and wasteful travel time.

One caveat: most mailers will look only for their local host's canonical domain name in the list of MX records. They don't check for aliases (domain names on the left side of CNAME records). Unless you always use canonical names in your MX records, there's no guarantee that a mailer will be able to find itself in the MX list, and you'll run the risk of having your mail loop.

If you do list a mail exchanger by an alias and it unwittingly tries to deliver mail to itself, most mailers will detect the loop and bounce the mail with an error. Here's the error message from recent versions of sendmail:

554 MX list for movie.edu points back to relay.isp.com
554 <root@movie.edu> . . .  Local configuration error

The moral: in an MX record, always use the mail exchanger's canonical name.

One more caveat: the hosts you list as mail exchangers must have address records. A mailer needs to find an address for each mail exchanger you name or else it can't attempt delivery there.
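Both caveats can be checked mechanically before you load a zone. Here's a rough sketch in Python, under some loud assumptions: the zone data is already parsed into (owner, type, data) tuples, every mail exchanger lives in the same data set (a mail exchanger in someone else's zone would be flagged incorrectly), and the function name check_mx_targets and the host names in the sample data are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (owner name, record type, record data)
Record = Tuple[str, str, str]


def norm(name: str) -> str:
    """Compare domain names case-insensitively, ignoring a trailing dot."""
    return name.rstrip(".").lower()


def check_mx_targets(records: List[Record]) -> List[str]:
    """Report MX records that name an alias or a host without an address."""
    aliases = {norm(owner) for owner, rtype, _ in records if rtype == "CNAME"}
    with_address = {
        norm(owner) for owner, rtype, _ in records if rtype in ("A", "AAAA")
    }
    problems = []
    for owner, rtype, rdata in records:
        if rtype != "MX":
            continue
        # MX record data is "<preference> <mail exchanger>"
        target = norm(rdata.split()[-1])
        if target in aliases:
            problems.append(f"{owner} MX target {target} is an alias")
        elif target not in with_address:
            problems.append(f"{owner} MX target {target} has no address record")
    return problems


# Sample data with made-up host names and documentation addresses
zone = [
    ("movie.edu.", "MX", "10 mail.movie.edu."),
    ("movie.edu.", "MX", "20 backup.movie.edu."),
    ("movie.edu.", "MX", "30 spare.movie.edu."),
    ("mail.movie.edu.", "CNAME", "host1.movie.edu."),
    ("host1.movie.edu.", "A", "192.0.2.1"),
    ("backup.movie.edu.", "A", "192.0.2.2"),
]
for problem in check_mx_targets(zone):
    print(problem)
```

With this sample data the sketch reports two problems: mail.movie.edu is an alias, and spare.movie.edu has no address record.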

To go back to our oreilly.com example, when ruby.oreilly.com received the message from your workstation, her mailer would have checked the list of MX records:

oreilly.com.    IN    MX    0  ora.oreilly.com.
oreilly.com.    IN    MX    10 ruby.oreilly.com.
oreilly.com.    IN    MX    10 opal.oreilly.com.

Finding the local host's domain name in the list at preference value 10, ruby.oreilly.com's mailer would discard all the records at preference value 10 or higher (the last two records in the list):

oreilly.com.    IN    MX    0  ora.oreilly.com.
oreilly.com.    IN    MX    10 ruby.oreilly.com.
oreilly.com.    IN    MX    10 opal.oreilly.com.

leaving only:

oreilly.com.    IN    MX    0  ora.oreilly.com.

Since ora.oreilly.com is down, ruby.oreilly.com would defer delivery until later and queue the message.

What happens if a mailer finds itself at the highest preference (lowest preference value) and has to discard the whole MX list? Some mailers attempt delivery directly to the destination host's IP address as a last-ditch effort. In most mailers, however, it's an error. It may indicate that DNS thinks the mailer should be processing (not just forwarding) mail for the destination, but the mailer hasn't been configured to know that. Or it may indicate that the administrator has ordered the MX records incorrectly by using the wrong preference values.

Say, for example, the folks who run acme.com add an MX record to direct mail addressed to acme.com to a mailer at their Internet service provider:

acme.com.    IN    MX    10 mail.isp.net.

Most mailers need to be configured to identify their aliases and the names of other hosts for which they process mail. Unless the mailer on mail.isp.net is configured to recognize email addressed to acme.com as local mail, it will assume it's being asked to relay the mail and attempt to forward the mail to a mail exchanger closer to the final destination.[2] When it looks up the MX records for acme.com, it will find itself as the most-preferred mail exchanger and will bounce the mail back to the sender.

[2] Unless, of course, mail.isp.net's mailer is configured not to relay mail for unknown domains. In this case, it would simply reject the mail.
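To make the acme.com case concrete, here's a small Python sketch of the decision a relaying mailer faces. It is an illustration only: the function next_step, its return strings, and the local_domains set (standing in for whatever mechanism a real mailer uses to list the domains it treats as local) are all assumptions of ours. It also leaves out the relay restrictions mentioned in the footnote and the last-ditch direct delivery that some mailers attempt.

```python
from typing import List, Set, Tuple

# Hypothetical representation: (preference value, mail exchanger name)
MX = Tuple[int, str]


def norm(name: str) -> str:
    """Compare domain names case-insensitively, ignoring a trailing dot."""
    return name.rstrip(".").lower()


def next_step(
    domain: str, mx_records: List[MX], local_host: str, local_domains: Set[str]
) -> str:
    """Decide what a mailer on local_host does with mail for domain."""
    if norm(domain) in {norm(d) for d in local_domains}:
        return "deliver locally"
    local = norm(local_host)
    candidates = sorted(mx_records, key=lambda record: record[0])
    local_prefs = [pref for pref, host in candidates if norm(host) == local]
    if local_prefs:
        # Discard the local host and all equally or less-preferred exchangers.
        cutoff = min(local_prefs)
        candidates = [r for r in candidates if r[0] < cutoff]
        if not candidates:
            return f"error: MX list for {norm(domain)} points back to {local}"
    if not candidates:
        return "no MX records to try"
    return f"relay to {norm(candidates[0][1])}"


acme_mx = [(10, "mail.isp.net.")]

# mail.isp.net has not been told that acme.com is local
print(next_step("acme.com", acme_mx, "mail.isp.net", set()))
# error: MX list for acme.com points back to mail.isp.net

# mail.isp.net is configured to treat acme.com as local
print(next_step("acme.com", acme_mx, "mail.isp.net", {"acme.com"}))
# deliver locally
```

The only difference between the two calls is the mailer's own configuration. The MX record is the same in both.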

You may have noticed that we tend to use multiples of 10 for our preference values. Ten is convenient because it allows you to insert other MX records temporarily at intermediate values without changing the other preference values, but otherwise there's nothing magical about it. We could just as easily have used increments of 1 or 100; the effect would have been the same.
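A quick Python sketch shows why only the relative order of the preference values matters. The helper delivery_order and the host temp.oreilly.com are made up for this illustration.

```python
from typing import List, Tuple


def delivery_order(mx_records: List[Tuple[int, str]]) -> List[str]:
    """Return mail exchanger names, most preferred (lowest value) first."""
    return [host for _, host in sorted(mx_records, key=lambda record: record[0])]


hosts = ["ora.oreilly.com.", "ruby.oreilly.com.", "opal.oreilly.com."]
by_tens = list(zip([0, 10, 10], hosts))
by_ones = list(zip([0, 1, 1], hosts))
by_hundreds = list(zip([0, 100, 100], hosts))

# All three numbering schemes give the same order.
print(delivery_order(by_tens) == delivery_order(by_ones) == delivery_order(by_hundreds))
# True

# With gaps of 10, a temporary record fits in without renumbering the others.
with_temp = by_tens + [(5, "temp.oreilly.com.")]
print(delivery_order(with_temp))
# ['ora.oreilly.com.', 'temp.oreilly.com.', 'ruby.oreilly.com.', 'opal.oreilly.com.']
```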