sendmail — history and design

Speaker: Eric Allman

1. History

1.1. How did it all start?

sendmail(8) started as an unofficial project, without any financial support. It was born in the 1980s, out of necessity. Many staff were working on Ernie CoVAX. Eric was working on INGRES; relational databases were the hot new thing at the time. At one point, the INGRES machine (a valiant 16-bit PDP) on which Eric was working got an ARPAnet connection [1]. Many staff wanted access to the INGRES machine in order to use its ARPAnet connection. However, the machine had only two TTY lines, and adding more would have been too expensive. Eric soon saw that what most people wanted out of the ARPAnet connection was to be able to send mail to other machines. His solution was to develop delivermail, a system for forwarding messages between Ernie and the other systems accessible over the ARPAnet.

At this point, Eric showed us a view of the architecture of delivermail: it mostly connected together a lot of systems, from FTP-delivered mail to UUCP [2]. Each system had its own pool and its own way to address mailboxes. Eric showed a table with a selection of addresses. Each network (BerkNet, ARPAnet, UUCP-accessible machines) used a different way to designate machines and mailboxes, and a moderately complex address could be interpreted in several different ways. foo!bar!baz could mean mailbox baz on machine bar after a hop through foo (over UUCP), or mailbox bar!baz on machine foo if foo accepted exclamation marks in user names. This accounted for significant complexity in the initial delivermail.
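
The ambiguity is easy to reproduce. The following is a minimal Python sketch, not delivermail's actual parsing code; the function names are hypothetical and only illustrate the two readings of foo!bar!baz mentioned above.

```python
def parse_as_uucp_route(address):
    """Read every '!' as a UUCP hop; the last element is the mailbox."""
    *route, mailbox = address.split("!")
    return {"route": route, "mailbox": mailbox}


def parse_as_first_hop_only(address):
    """Split on the first '!' only; the rest is an opaque mailbox name."""
    host, _, mailbox = address.partition("!")
    return {"route": [host], "mailbox": mailbox}


if __name__ == "__main__":
    # Same string, two equally plausible interpretations.
    print(parse_as_uucp_route("foo!bar!baz"))
    print(parse_as_first_hop_only("foo!bar!baz"))
```

The first reading delivers to baz after going through foo and bar; the second hands bar!baz to foo and lets foo sort it out. Nothing in the string itself says which one the sender meant.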

1.2. First advice

In order to build successful software, the first thing (according to Eric) one needs to accept is that the programmer is finite. So Eric did not redesign the UA (even though the UA was /usr/bin/mail), not (only) because users were already used to it, but also because it would have been too much work. In the same vein, he decided not to redesign the system mail store.

The second thing was to make delivermail adapt to the world, not the other way around.

These two things (implementing as little stuff as possible, and adapting to the rest of the world) guided the initial design of delivermail, as a rather small message passing device between other mail systems.

1.3. Problems with the solution

The configuration was compiled in; that is, in order to install delivermail on a new site, one needed to hack the source, recompile and install.

There was no address translation between networks. Address parsing was both simplistic and opaque. Users needed to read a man page (man pages?) before being able to build the mail address of someone living across one or two networks.

But hey, it was supposed to be a quick hack for freeing the INGRES machine, and it kinda worked!

1.4. The transition to sendmail

DARPA gave a grant for completing 4.2BSD (the first with TCP/IP!). Bill Joy asked Eric to add SMTP to delivermail. Supporting SMTP required adding a mail queue, which had quite an impact on the internal architecture. Eric ended up rewriting his dæmon, creating sendmail.

1.5. The chaos years: 1981–90

Eric left Berkeley to pursue a lucrative (yeah… :-p) career in industry in 1981. Around the same time, Bill Joy left Berkeley to join a new and promising company, called Sun. The rest is part of a larger history (told in part by Kirk McKusick), in particular the Unix wars, during which every vendor would extend their system (including sendmail) in different, sometimes incompatible directions.

1.6. Return to sendmail

Eric got back to Berkeley in the early 1990s. He started by adding subdomain handling and, one thing leading to another, scope creep resulted in a complete rewrite, called sendmail8 [3].

1.7. Sendmail, Inc.

Sendmail, Inc. was the first commercial / OSS hybrid company (and still employs Eric). At the time, it was not obvious how to build a business plan around such a model; but evidently Eric did not manage too badly.

New features were introduced:

  • encryption;
  • milters, a feature Eric seems quite proud of;
  • virtual hosting;
  • LDAP support;
  • lots more checking of data that comes from the outside world…

Those features came from commercial needs. Before that, what drove Eric to build new features was more of a "nice to have" attitude.

1.8. Changes in requirements

Reliability has always been important. An important point of focus was to always get the mail through or send back an error to the user.

With a more commercial incentive, Eric started adding functionality and performance; then protection (Fred: I just wrote that down in my notes, and don't remember more precisely what Eric meant), then legal compliance (think audit and traceability, log retention), then cost control.

2. Design decisions

Eric started with a few remarks:

  • it is easier to build a tool than a solution;
  • the world at the time was ugly;
  • the world today is still ugly.

He would do things mostly the same way, modulo some updates.

2.1. Rewriting rules

In hindsight, it wasn't overkill. The concept was sound (regexp-like pattern matching, but on tokens instead of characters), but the syntax and the control flow could have been better.

One stupid thing was making tabs into active characters. To Eric's defence, if make(1) made tabs active, it must have been a good idea! Not.
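
To make the idea of token-based rewriting concrete, here is a small Python sketch. It is not sendmail's implementation: the tokenizer, the set of operator characters and the function names are assumptions made for illustration. It only borrows three pattern operators from sendmail's rule syntax ($* for zero or more tokens, $+ for one or more, $- for exactly one), the $n substitutions on the right-hand side, and the tab that separates the two sides of a rule.

```python
import re

# Simplified tokenizer (assumption): operator characters are tokens of their
# own, pattern operators ($*, $+, $-) and substitutions ($1..$9) likewise.
TOKEN_RE = re.compile(r"\$[*+-]|\$\d|[@.!%:<>]|[^@.!%:<>$\s]+")


def tokenize(text):
    return TOKEN_RE.findall(text)


def match(pattern, tokens):
    """Return the token groups bound to the wildcards, or None if no match."""
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+", "$-"):
        shortest = 0 if head == "$*" else 1
        longest = min(1, len(tokens)) if head == "$-" else len(tokens)
        # Try the shortest binding first, then backtrack to longer ones.
        for n in range(shortest, longest + 1):
            tail = match(rest, tokens[n:])
            if tail is not None:
                return [tokens[:n]] + tail
        return None
    if tokens and tokens[0] == head:
        return match(rest, tokens[1:])
    return None


def rewrite(rule_line, address):
    """Apply one 'LHS<TAB>RHS' rule; return the address unchanged on no match."""
    lhs, rhs = rule_line.split("\t")[:2]  # the tab is the field separator
    groups = match(tokenize(lhs), tokenize(address))
    if groups is None:
        return address
    out = []
    for tok in tokenize(rhs):
        if re.fullmatch(r"\$\d", tok):
            out.extend(groups[int(tok[1]) - 1])  # $1 is the first wildcard
        else:
            out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    print(rewrite("$+@$+\t$2!$1", "eric@ernie.berkeley"))
```

With the rule $+@$+<TAB>$2!$1, the address eric@ernie.berkeley is rewritten as ernie.berkeley!eric. Replacing the tab with spaces makes the split in rewrite() fail, which gives a small taste of why active tabs were a pain.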

2.2. Message munging

This was essential for interoperability at the time, but is not necessary today. In retrospect, sendmail should have had a passthrough mode, where trash would be either accepted or rejected, without trying to fix others' mistakes.

2.3. Syntax of configuration files

It is ugly, flat (no nesting), with too many signal characters.

Today, Eric would have used something like the Apache configuration.

2.4. SMTP and queuing into sendmail

Eric was reluctant to include it, but it was The Right Thing. He would have added more privilege separation.

The queue had two files per message (for the header and for text). Having data and protocols as ASCII helped with debugging.

This was the right approach for the time; today Eric would put envelopes in a DB (less thrashing around).
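
As an illustration of the two-files-per-message layout, here is a hedged Python sketch. The qf and df file name prefixes follow sendmail's queue naming, but the record letters (S, R, H), the function names and the message ID are simplifications assumed for this example, not sendmail's actual queue file format.

```python
import os
import tempfile


def enqueue(queue_dir, msg_id, sender, recipients, headers, body):
    """Store one message as two plain-text files: control data and body."""
    control_path = os.path.join(queue_dir, "qf" + msg_id)
    body_path = os.path.join(queue_dir, "df" + msg_id)
    # Body first (a choice of this sketch): a control file then always
    # refers to a body that is already on disk.
    with open(body_path, "w", encoding="ascii") as f:
        f.write(body)
    lines = ["S" + sender]
    lines += ["R" + rcpt for rcpt in recipients]
    lines += ["H" + name + ": " + value for name, value in headers]
    with open(control_path, "w", encoding="ascii") as f:
        f.write("\n".join(lines) + "\n")
    return control_path, body_path


def read_control(control_path):
    """Parse the control file back; the first character selects the record."""
    sender, recipients, headers = None, [], []
    with open(control_path, encoding="ascii") as f:
        for line in f:
            code, value = line[0], line[1:].rstrip("\n")
            if code == "S":
                sender = value
            elif code == "R":
                recipients.append(value)
            elif code == "H":
                headers.append(value)
    return sender, recipients, headers


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as queue_dir:
        qf, df = enqueue(queue_dir, "AA00001", "eric@ernie",
                         ["kirk@ucbvax"], [("Subject", "hello")], "Hi Kirk\n")
        with open(qf, encoding="ascii") as f:
            print(f.read(), end="")  # readable with any pager or editor
```

Because both files are plain ASCII, a stuck message can be inspected with ordinary text tools, which is the debugging benefit mentioned above.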

2.5. m4(1) for configuration

The dnl macro was bad. It was added to produce a neat output, at the cost of a significant uglification of the input files.

Some tool was needed for the configuration, but m4 was probably not the right one.

2.6. Extending vs changing

This is a big, important concern. In hindsight, Eric thinks he paid too much attention to backward compatibility [4]. He really wanted to be able to install a new release over an old one, not touch the configuration, and have sendmail keep on working. This was a noble goal, but it prevented a number of changes that would have broken configurations on upgrades.

3. Things Eric would do differently

Fix problems earlier, as already mentioned above.

Use modern tools, in particular for the build system, which was hand-rolled.

Privilege separation.

A string abstraction.

Separate mailbox names from unix IDs.

A cleaner configuration file.

4. Things Eric would do the same

Use C. Eric described C++ as the ugliest thing in the Universe, having both the limitations of C and the problems of OO languages, without any of their advantages.

Bite things off in small chunks (see "the programmer is finite" above).

Use syslog(8), which was very new at the time and still being written. Eric quickly grew tired of having random processes write their logs to random files in random places.

The rewriting rules, except for the active TABs.

Don't rely too heavily on outside tools.

5. Lessons learnt

KISS.

Know what you're doing; this is way more important than having an advanced design.

Flexibility trumps performance.

Fix things early. Stuff is easier to fix early, and is less painful when you have 5 clients than when you have 50. Of course, this sometimes means breaking backward compatibility.

ASCII is great for internal files and protocols.

Documentation is key. According to Eric, the bat book was very important in the success of sendmail with sysadmins.

Footnotes:

[1] The ARPAnet backbone was built on 56 kbps links!

[2] With a half-smile, Eric asked who even remembered UUCP. A guy sitting just in front of me said he was still using it. Way to show that Unix maintains backward compatibility; or does it show that we are mostly gray beards who refuse obsolescence? :-)

[3] I seem to remember that the numbering did not start at 1 but at something like sendmail6, though I don't have notes backing that up.

[4] I feel I have to point to The UNIX-HATERS Handbook, page 185. Incidentally, the all-too-important backward compatibility issue in this case was also about having TABs as active spaces… I find the Haters Handbook a very interesting read, even today. The game is to find out which concerns have since been fixed, and which ones are still true today.

Author: Frédéric Perrin

Date: Sunday 16 October 2011, modified Saturday 29 October 2011

Unless otherwise noted, the texts on this site are licensed under Creative Commons BY-SA.
