Towards Usable E-Mail

7th January 1998
John Dallman and Bob Dowling
jgd@cix.co.uk and rjd4@cam.ac.uk
Postal address: Bob Dowling
University Computing Service
New Museum Site
Pembroke Street
Cambridge CB2 3QG

Introduction

John Dallman is a Software Tools Engineer with Unigraphics Solutions, of Cambridge, England. He contacted the Direct Marketing Association to ask if they could do anything about unsolicited commercial e-mail from a UK organisation, and was asked to submit ideas in writing. He enlisted the help of Bob Dowling, a UNIX Support Engineer at Cambridge University Computing Service. We hope that this document will be helpful in formulating the DMA's strategy; we are writing as individuals, rather than as representatives of our employers.

Unsolicited e-mail, and what's wrong with it

Most Internet users receive unsolicited commercial e-mail ("spam", or UCE), advertising, for the most part:

Assorted get-rich-quick and pyramid schemes, often illegal in this country
Pornographic web sites, often offering video porn over the Internet
Very dubious "cut-price" deals on American products and services
Advertising for software for sending UCE

By and large, the more someone uses the Internet, the more UCE they get. John Dallman received, during his Christmas holiday, three get-rich-quick schemes, four porn advertisements, nine dodgy "business opportunities" and one advertisement for UCE software. This is a lower proportion of sex advertising than normal, for reasons which escape us. Bob Dowling, working at a large site (Cambridge University) with one of the strictest automatic UCE-discarding systems in existence, gets an average of three UCEs per day.

UCE is annoying for a variety of reasons:

The sexual UCEs are extremely upsetting to many users.
The get-rich-quick schemes and "business opportunities" are annoying because of their sheer stupidity; they are normally inappropriate to this country, and often illegal. The idea of insurance salesmen who won't tell you who they are is particularly silly.
The extra telephone time taken to download UCE pushes up the direct cost of using the Internet for individuals.
The time wasted sifting through UCE is enormous. Consider 1,000,000 people taking one second each to dispose of an obvious UCE. Roughly 280 otherwise-productive man-hours have been wasted.
The sheer volume of UCE being transmitted congests the Internet and pushes up the prices that businesses and individuals pay for Internet services. Imagine what London traffic would be like if there were thousands of extra vehicles on the streets whose only purpose was to display insulting advertisements.
Computer systems sometimes become so overloaded by UCE that they crash, or run so slowly that they can't be used effectively.

Political considerations

It is currently foolish for legitimate businesses to use e-mail for any form of advertising, unsolicited or otherwise. It will simply look like more UCE and will be discarded as a knee-jerk reaction. Worse, the receiver's attitude towards any company that they can identify will be damaged. The Internet cannot be a medium for business while using it carries such risks.

We feel that the Internet already works better as a "pull" medium for commerce than as a "push" medium. The freely-available search engines on the Internet make it easy to locate Web pages on any subject, giving the customer choice over what he views, rather than being forced to pay to receive advertising he doesn't want.

The British Government is shortly due for a nasty shock as a result of its policy of connecting all schools to the Internet. Imagine the reaction when the tabloid press discovers that schoolchildren are being sent advertisements for pornography via the e-mail accounts that the government has provided. Note that there is a considerable difference between pupils searching out pornography of their own volition, and their being sent it.

There seems to be considerable potential for the Direct Marketing Association to gain from proposing a scheme to reduce the amount of UCE in circulation. An E-mail Preference Service, along the lines of the schemes in operation for post, telephone calls, and faxes, will not be effective in reducing volumes of UCE. Almost all UCE originators lack the integrity and business sense possessed by the organisations currently covered by DMA schemes, and will simply ignore any preference scheme. The subjects of existing UCE make this very plain.

Why does UCE happen?

The basic cause of UCE is, unfortunately, the low cost to the originator (the "spammer") of sending large amounts of e-mail. The real cost is borne collectively by all of the organisations that use the Internet. Since it's impossible to calculate the total cost of an e-mail message, no attempt is made to charge the originator proportionally. Trying to change this is impractical.

We therefore have to find appropriate countermeasures for UCE, so that it becomes ineffective. Human greed and gullibility mean that end-user education won't solve the problem, and a technical or semi-technical solution must be found.

The war to date

UCE is a fairly new phenomenon. It became noticeable in early 1995, and has increased hugely during 1997. This, of course, mirrors public consciousness of the Internet and the growth of the World-Wide Web. The public's e-mail addresses were and are obtained through a variety of means:

From ISP user lists, which are often made available so that users can look up their friends.
By trawling Usenet discussions, where every message should contain its author's e-mail address.
From Web pages, by programs that automatically search through the World-Wide Web looking for e-mail addresses.
By searching the archives of mailing lists, to obtain lists of people interested in a particular topic.

The first countermeasures were simple complaints to the spammers, which sometimes grew into torrents of hate-mail. These were largely ineffective, providing the first confirmation that persistent spammers cared nothing for their unpopularity. They also provided confirmation that the complainants' e-mail addresses were genuine, thus ensuring further UCE.

Complaints to the e-mail system administrators ("postmasters") of the Internet Service Providers ("ISPs") usually worked when the spammers were using reputable ISPs, but failed with poorly run systems, and with spammers who had their own Internet site and who just ignored the complaints. Many reputable ISPs now provide an "abuse" e-mail address (e.g., 'abuse@aol.com'), so that UCE complaints can be handled by specialists.

Spammers began to develop strategies to avoid their operations being disrupted. The simplest, and hardest to stop at present, is "throwaway" accounts. Most large ISPs offer a free-trial period for their system. It's simple to open a trial account, use it to send a single bulk e-mail to millions of addresses, and then give up the account. The account will be closed by the ISP, but it's too late. There are simple countermeasures for this, but they need to be carried out by the ISPs, and they haven't been so far, in spite of numerous suggestions.

Almost all UCE contains false information in the "message headers". Headers carry the information used by Internet computers to deliver mail. To avoid complaints and disconnections, UCE messages lie about where they have come from and who sent them. Technically capable users can spot this, and can often deduce the true source, with help from various Internet management tools. However, the naïve user (who is always in the majority) is misled, and complaints are sent to an ISP who had nothing whatever to do with the UCE.

In an attempt to portray themselves as reasonable businessmen, many spammers offer list-removal services: they claim that if you send a request to an address specified in the UCE, they will remove your address from their lists. Very occasionally, this may even happen. Normally, the specified address is undeliverable, or requests sent to it are ignored, or they are used as confirmation that your address is active, so that you can be sent even more UCE.

The most invidious weapon of the UCE "businesses" is Public Relations. They try hard to present themselves as "The leading edge of business on the Internet". Speaking of "promoting electronic commerce" to US politicians and presenting the opposition to UCE as "anti-commerce nerds" has been quite effective. For example, there are currently four bills before the US Congress relating to UCE. Three of these essentially legalise it, with protective conditions for users that might be adequate if UCE originators had reasonable commercial integrity. As the current users of UCE have no such integrity, the bills simply invite legitimate businesses into the trap of having their reputations tarred by association.

Technical buzzwords

It isn't easy to discuss methods of dealing with UCE without a few technical terms. Here they are.

TCP/IP (an acronym) is the fundamental communications protocol of the Internet. It allows computers to send data to each other. E-mail, the World-Wide Web, Usenet news and so on are all varieties of TCP/IP data.

An IP address is needed to communicate over the Internet using TCP/IP. Every computer connected to the Internet has a unique address, consisting of a string of numbers such as "131.111.10.27" or "127.0.0.1". Since these addresses are hard to remember, almost all machines have Internet names, of the familiar form "cix.co.uk", "griffin.csi.cam.ac.uk", or "aol.com".

The Domain Name System, or DNS, is the world-wide co-operative naming system, used to turn Internet names into IP addresses. Essentially, a computer with DNS software can be asked to provide the address corresponding to a name (DNS lookup), or vice-versa (DNS reverse lookup). If the computer doesn't have any information about that name, there is a system for asking other computers with more global information, and for providing information about local computers to the global ones.
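
As a rough illustration, the two kinds of lookup can be sketched in Python using the standard socket module. This is a minimal sketch rather than a description of any particular mail system: the function names lookup and reverse_lookup are invented here, and the resolver parameter exists only so that the functions can be tried without a network connection.

```python
import socket


def lookup(name, resolver=socket.gethostbyname):
    """DNS lookup: return the IP address for an Internet name, or None."""
    try:
        return resolver(name)
    except OSError:  # raised when no address is known for the name
        return None


def reverse_lookup(address, resolver=socket.gethostbyaddr):
    """DNS reverse lookup: return the primary name for an IP address, or None."""
    try:
        # gethostbyaddr returns (primary name, aliases, addresses).
        return resolver(address)[0]
    except OSError:  # raised when no name is registered for the address
        return None
```

Called with the default resolver, lookup("cix.co.uk") asks the local DNS software for an answer; what comes back depends on the state of the DNS at the time.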

SMTP (Simple Mail Transfer Protocol) is used for sending e-mail messages from one computer to another. Unlike the postal service, where a letter is out of the sender's hands once it has been posted, SMTP operates by means of a dialogue between two computers. Since many computers are only connected to the Internet for part of the time, SMTP allows computers to accept mail that isn't for them, in the expectation of forwarding it to the correct destination, or a computer that can more easily communicate with the final destination. This is called Relaying.
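
To make the idea of a dialogue concrete, the sketch below lists the commands that a sending computer issues, in order, for one message. It shows only the sender's half of the conversation; the receiving computer answers each command with a numeric reply and may refuse at any step. The function name and the example addresses are invented for illustration.

```python
def smtp_client_commands(helo_name, sender, recipients):
    """Return the commands a sending computer issues for one message, in order."""
    # HELO announces the name the sender claims; nothing forces it to be true.
    commands = [f"HELO {helo_name}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO command is issued per recipient.
    commands += [f"RCPT TO:<{recipient}>" for recipient in recipients]
    # The headers and body follow DATA, ending with a line holding only ".".
    commands += ["DATA", "QUIT"]
    return commands


for command in smtp_client_commands(
    "client.example.test",
    "sender@example.test",
    ["first@example.org", "second@example.org"],
):
    print(command)
```

Because the receiver replies after every command, it can turn a message away before any of the message text has been transmitted.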

Headers are the strange text at the start of e-mail messages (often hidden by mail reading programs). Some are created by the program used to create the e-mail, others are created during the SMTP dialogue between computers. Here is a simple example:

Received: from chardonnay (relay.mars.it [555.120.22.2])
    by mail-relay.compulink.co.uk (8.8.7/8.8.7)
    for <jgd@cix.compulink.co.uk>; Sun, 4 Jan 1998 13:56:25 GMT
To: jgd@cix.compulink.co.uk
From: Alessandro Bogianno <bogianno@mars.it>
Subject: Re: I must write a program for matching 2 strings

Note that it is possible for a computer sending e-mail via SMTP to lie about its name, so that there will be misleading information recorded in the e-mail's headers (e.g., the "Received" line in the example). It is not possible for a computer to successfully lie about its IP address, since the SMTP dialogue would break down and the mail would not be sent.
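
The difference between the claimed name and the recorded address can be seen by pulling the "Received" line from the example apart. The sketch below assumes the layout used in that one line: the name straight after "from" is what the sending computer called itself, while the name and bracketed address in parentheses were recorded by the receiving computer. Other mail software writes this line differently, so the pattern is an illustration, not a general parser, and the field names are invented here.

```python
import re

# Layout assumed from the example header: from CLAIMED (LOOKED-UP [ADDRESS]) by RECEIVER
RECEIVED_PATTERN = re.compile(
    r"from\s+(?P<claimed_name>\S+)\s+"
    r"\((?P<looked_up_name>\S+)\s+\[(?P<ip>[^\]]+)\]\)\s+"
    r"by\s+(?P<received_by>\S+)"
)


def parse_received(header_value):
    """Split a Received line into its parts, or return None if the layout differs."""
    match = RECEIVED_PATTERN.search(header_value)
    return match.groupdict() if match else None


EXAMPLE = (
    "from chardonnay (relay.mars.it [555.120.22.2]) "
    "by mail-relay.compulink.co.uk (8.8.7/8.8.7) "
    "for <jgd@cix.compulink.co.uk>; Sun, 4 Jan 1998 13:56:25 GMT"
)

fields = parse_received(EXAMPLE)
# The address is kept as plain text and is not validated here.
names_agree = fields["claimed_name"] == fields["looked_up_name"]
```

A mismatch between the two names is not proof of forgery: in the example the sender simply gave its short local name. It does show, though, that the claimed name is the only part the sender controls.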

Resources

Quite a number of organisations already exist to combat UCE. While none of them have "succeeded", they have studied the problem extensively.

On Usenet, news.admin.net-abuse.email is the main newsgroup dealing with UCE, while alt.stop.spamming is less generally used. Other groups in the news.admin.net-abuse.* hierarchy deal with other forms of network abuse.

The Coalition Against Unsolicited Commercial E-mail (CAUCE) can be found on the World-Wide Web at http://www.cauce.org. It is an American lobbying organisation, which is taking an interest in the four UCE bills currently before Congress.

The Mail Abuse Protection System (MAPS) operates a "Real-time Black-hole List": an automatically distributed list of sites which are emitting UCE. ISPs that subscribe to the service can check incoming e-mail against the list and discard or reject any e-mail from listed sites. This is an extremely blunt instrument: it works on the basis of the IP address (since the Internet name can be easily forged) and can't discriminate between UCE and legitimate e-mail coming from the same machine. The potential for unwarranted addition to the list, and the time taken for removal from the list make this scheme unpopular even with reputable ISPs. More information can be obtained at http://maps.vix.com.
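
The bluntness of this approach shows up clearly in a sketch of the receiving side. The code below assumes that the list has already been obtained and is held as a list of address ranges; how MAPS actually distributes and queries its list is not shown, and the ranges used are reserved documentation addresses, not real entries.

```python
import ipaddress

# Hypothetical list contents, using address ranges reserved for documentation.
BLOCKLIST = [ipaddress.ip_network("192.0.2.0/24")]


def is_listed(ip, blocklist):
    """Return True if the sending computer's IP address falls in any listed range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in blocklist)


# The decision uses nothing but the address, so a legitimate message and a UCE
# sent from the same machine receive exactly the same treatment.
print(is_listed("192.0.2.25", BLOCKLIST))
print(is_listed("198.51.100.7", BLOCKLIST))
```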

Types of spammers and their tools

Spammers can be loosely divided into three categories:

Amateurs usually haven't heard of UCE, or don't realise that it's a bad idea. They re-invent the idea of sending mass e-mail, or are duped into it by "friends". They use their ISP or college accounts, without any attempt at disguise, and are surprised at the response. Preventing them entirely isn't practical; reducing their numbers is a matter of education.

Persistent spammers have usually bought software for sending large volumes of e-mail rapidly. They tend to use throwaway or trial accounts, and/or attempt to disguise where their UCE is coming from. They tend to be the biggest nuisances, since they pop up repeatedly under different names. Their motives are somewhat mysterious to us, apart from the ones that have realised they can make money by selling software for sending UCE. This trade is based entirely on the gullibility of Internet users and false claims about the effectiveness of UCE, and of their software.

Professional spammers have their own large computer systems, or networks, and a full-time Internet connection. They send vast volumes of UCE, but can usually easily be blocked by MAPS-style systems. While they are thus less annoying than the previous category, they do more real harm, since they contribute greatly to Internet congestion.

Software for sending UCE usually works thus:

Using your legitimate ISP account, connect to the Internet.
Contact an innocent third party's SMTP service.
Use it to send e-mail which is addressed to users of a different computer. The mail will be relayed, as described above.
Disconnect. You haven't used your legitimate account for anything obviously wrong, because it isn't obvious to your ISP that you are using a third-party SMTP service, and your messages have disguised where you came from.

If this works as the spammer intends it to, any complaints go to the ISP that the forged message claimed to come from. If that doesn't fool people, the backup defence is false information about the path the message has taken through the SMTP relays on the Internet: the message simply claims to have already been relayed at the point it is first sent. A skilled Internet user can sometimes trace through the message headers and determine the real origin point, but software for doing this automatically is currently the subject of research projects.
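
The simplest case of such tracing can be sketched as follows. Each relay adds its own "Received" line above the ones already present, so the lines nearest the top were written last. Reading downwards, the lines written by machines the recipient trusts can be believed; anything below the first line from an unknown machine may have been invented by the sender. The sketch assumes the headers have already been split into hops, newest first, and that the recipient knows the names of its own relays. All names and addresses are hypothetical, and real tracing involves considerably more judgement than this.

```python
def last_trustworthy_origin(hops, trusted_relays):
    """Return the address recorded by the deepest trusted relay, or None.

    hops: list of {"by": relay name, "ip": address it recorded}, newest first.
    """
    origin = None
    for hop in hops:
        if hop["by"] not in trusted_relays:
            break  # from here downwards the lines may be forged
        origin = hop["ip"]
    return origin


hops = [
    {"by": "mail.example.test", "ip": "198.51.100.7"},       # written by our own relay
    {"by": "relay.innocent.example", "ip": "192.0.2.99"},    # possibly invented by the sender
]
print(last_trustworthy_origin(hops, {"mail.example.test"}))
```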

What ISPs can do

There are several courses of action that all ISPs can and should take. The problem tends to be with motivation. Responding to UCE reports and complaints requires skilled manpower, and some ISPs seem to feel that it is cheaper for them to ignore the problem. America On-line (AOL) are an example of an ISP who take the problem seriously and are even prepared to go to court about it. ISPs who do not take it seriously are too numerous to enumerate, but AT&T, UU.net, PSI.net and Earthlink are notable examples.

The problem lies in convincing boards of directors who do not use e-mail that their company should spend money to do something with intangible results in the short term. The only means that we have available at present is market pressure: if legitimate e-mail from these ISPs is rejected, along with UCE, their customers ask why and often consider changing suppliers when they are told that it is because of lax UCE policy. This, however, has not been sufficiently effective at changing companies' attitudes so far.

There is also a cultural problem, in that large companies tend to be reluctant to make decisions quickly. They prefer to convene working parties and otherwise hope that the problems will go away. This is not appropriate for deciding to cut off a spammer's account: the authority to do that needs to be delegated, completely, to operational staff and software.

A general rule of computer security is to forbid the machine from doing anything except the things it's meant to do. SMTP software set up on this basis can be quite effective at fighting spam relaying. Possibilities include:

Simply not accepting any relayed messages. This is only appropriate for simple organisations who don't expect to pass on e-mail, but these sites are in the majority, and it is easy to implement.
Restricting acceptance of relaying. This is appropriate for more complex organisations and the ISPs themselves. It involves checking received e-mail in a variety of ways, including:
- Does the IP address the mail comes from correspond to the Internet name that it says it is from? (this requires a DNS reverse-lookup, and is quite effective).
- Is the computer sending us the mail on the list of subsidiary computers that the system has been told to accept relays from?
- Is the destination address one that the system has been told it may relay mail to?
- Do the headers describing any previous relaying appear to be valid?
Checking the identity of a system that is sending a message for relaying. Even if it isn't going to be rejected, its identity should be checked via DNS reverse-lookup and passed on with the message. Some Internet systems do this already; it is very helpful in tracking down spammers.
Restricting the number of people that can receive any single message. Messages addressed to more people are assumed to be UCE, and are rejected. This has to be done carefully, since legitimate mailing lists might validly include a large number of people at a company or university.
Restricting the number of simultaneous SMTP dialogues that your computer can hold with any single other computer. Sophisticated spamming software often makes several connections in parallel, to evade other restrictions.
Rejecting all unauthorised accesses, if the system can know which computers are allowed to use it (e.g., the range of IP addresses corresponding to an ISP's customers).
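
Several of the possibilities above can be combined into one decision made during the SMTP dialogue. The sketch below covers three of them: a cap on recipients, a list of computers allowed to relay, and a list of destinations the system serves itself. It is a minimal sketch under stated assumptions, not the configuration of any real SMTP package: the function name, the default limit of 50 and the example domains and address ranges are all invented for illustration, and the checks on reverse lookups and simultaneous connections are left out.

```python
import ipaddress


def relay_decision(client_ip, recipients, local_domains, allowed_networks,
                   max_recipients=50):
    """Decide whether to accept a message; returns (accepted, reason)."""
    # Very large recipient lists are treated as UCE (the limit is an assumption).
    if len(recipients) > max_recipients:
        return False, "too many recipients"

    address = ipaddress.ip_address(client_ip)
    client_is_known = any(address in network for network in allowed_networks)

    for recipient in recipients:
        domain = recipient.rsplit("@", 1)[-1].lower()
        # Mail for our own users is always in order; anything else is a relay
        # request, which is honoured only for computers we have been told about.
        if domain not in local_domains and not client_is_known:
            return False, "relaying denied"
    return True, "accepted"


LOCAL_DOMAINS = {"example.test"}
CUSTOMER_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]
print(relay_decision("203.0.113.9", ["victim@elsewhere.example"],
                     LOCAL_DOMAINS, CUSTOMER_NETWORKS))
```

A site that never relays at all is simply the case where allowed_networks is empty.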

These precautions do not require nearly as much computer power as it might seem. If they are carried out during the SMTP dialogue, rejected messages don't get a chance to use up resources in their own right. Cambridge University's systems do this, quite successfully. If these precautions were widely adopted, the load would also be widely distributed, and should not overburden any one system.

ISPs fall into two main categories: those which provide "dial-up" service to individuals using modems and the telephone system, and those which provide permanent connections to companies, universities, etc., over leased lines. There are some tactics that are specific to each.

Dial-up ISPs could further limit spamming by:

Limiting the size and/or number of e-mails that can be sent from trial accounts.
Restricting access to some services: don't allow access to any SMTP service except the one provided by the ISP for customer use.
Better credit checking: ISP accounts are always paid for with credit cards, but the only check made when most accounts are opened is that the credit card number has the correct format. A real check should be made, checking the user's postal address against that held by the credit card company.
Explaining to their users that sending UCE is not permitted, explaining the concept, why it doesn't work, and that their accounts will be summarily terminated. It might be effective to levy a fine against their credit card accounts, but this is probably a legal minefield.
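
To show how little a format check proves, the sketch below implements the Luhn checksum, the arithmetic test that card numbers are built to pass. Whether any particular ISP uses exactly this test is an assumption; the point is only that a number can pass it without belonging to a real account, let alone to the person opening the ISP account. The number used is a widely published test number, not a real card.

```python
def passes_luhn(card_number):
    """Return True if the digits of card_number satisfy the Luhn checksum."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Working from the right, double every second digit; if doubling gives a
    # two-digit result, subtract 9 (the same as adding its two digits).
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(passes_luhn("4111 1111 1111 1111"))
print(passes_luhn("4111 1111 1111 1112"))
```

Anyone can produce numbers that pass this test, which is why a check against the address held by the card company is a far stronger safeguard.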

All of these points (except for credit card numbers) also apply to universities, schools, cybercafés and other public Internet points.

Permanent connection providers often supply and set up company e-mail systems. They need to ensure that their clients' SMTP service is configured with the checks we describe above.

In combination, these measures make relaying of UCE and the use of false header information impractical. This allows spammers to be located, and then blocked or disconnected.

Legislation, and ideas that won't work

We would be very cautious about recommending any legislation to address the problem of UCE. The problem is international, and changes so rapidly that any legislation that would still be useful in five years' time would almost certainly make legitimate Internet applications illegal. The difficulties of communication between the people who understand the issues and legislators are also quite significant.

However, a "Voluntary code of practice for the industry" would also be quite ineffective. The problem is international, and the senders of UCE lack even the most basic commercial ethics. We therefore feel that, short of setting up a new and intrinsically secure Internet, technical approaches of the kind we have described are the best option.

It would be very dangerous to attempt to coerce ISPs into taking these measures. The heavy-handed attitude of the Metropolitan Police when they suggested to the ISPs that a large number of Usenet groups should be barred on the grounds of possible sexual content marked a nadir in official relations with the ISPs, and they still do not trust the police to apply the law with any understanding of their business or the social conventions of the Internet.

For these reasons, all our technical suggestions are objective, based on usage of e-mail protocols, rather than subjective, based on the content of messages. ISPs greatly fear accusations of censorship, preferring to see themselves as common carriers. US ISPs also have the First Amendment, and the zealots who invoke it without due cause, to worry about.

Actions that should be taken

To summarise, we recommend that the three classes of spammer be dealt with as follows:

Amateur spammers can be reduced through basic user education and the measures outlined above. There will always be a few, but they will be easy to deal with.
Persistent spammers can be caught and stopped quite easily if the measures described above to prevent relaying and falsified e-mail headers are implemented. If they are easily stopped, their business will not be worthwhile for them, and they'll stop.
Professional spammers can be blocked by discarding all data from their IP addresses. Again, their business will not be worth carrying out.

To allow ISPs to take the measures we describe, authors of SMTP software should be encouraged to provide the facilities listed above for preventing relaying and falsification. There is a draft Internet standard ("Draft-Lindberg-Anti-Spam-MTA-01.txt", obtainable from http://www.internic.net/drafts) which makes very similar suggestions, in rather more technical language.

The SMTP software packages which are written and maintained by the Internet volunteer community already have, or are likely soon to acquire, these features. Commercial providers of SMTP software may benefit from some pressure to implement these facilities soon.

Conclusion

UCE is a decided nuisance to existing Internet users and a major barrier to legitimate commercial exploitation of the Internet. Technical steps can be taken that will reduce its effects dramatically. These will reduce the flexibility of Internet e-mail slightly, and increase its usability greatly.

The best way to reduce the nuisance of UCE is for the owners of the systems that transmit it to take action against it. Motivating them to do this requires planning by people who understand publicity better than the authors of this paper. Such motivation is required in both the UK and the USA, at the very least, and probably more widely. We suspect that carrots - possibly in the form of good publicity - and sticks - in the form of bad publicity - may both be required.