1. Field of Invention
This invention relates to the removal of unsolicited e-mail messages, commonly known as SPAM, from a client's e-mail.
2. Discussion of Prior Art
E-mail has become a very important means of communication. Unfortunately, unsolicited e-mail messages, commonly referred to as SPAM, are cluttering this communications channel. This unsolicited e-mail wastes time, storage space, and communications bandwidth, which translates into lost productivity and increased computing and communication costs. Some of these unsolicited messages are also offensive. They are clearly directed at an adult audience; unfortunately, there is nothing to protect minors from receiving this material.
One approach to this problem involves setting up keyword filters in a browser to detect messages with words such as “porn” or “sex.” This defense is easily defeated by using variations such as “***PORN****” or “_Sex_”.
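The weakness of keyword filtering can be illustrated with a small sketch. The function name `keyword_filter` and the blocked-word list are hypothetical, and the matching rule (exact, case-sensitive comparison of whitespace-separated tokens) is an assumption about how a simple filter might behave, not a description of any particular browser's filter.

```python
BLOCKED_WORDS = {"porn", "sex"}  # hypothetical keyword list


def keyword_filter(subject: str) -> bool:
    """Return True if any whitespace-separated token exactly matches a blocked word."""
    return any(token in BLOCKED_WORDS for token in subject.split())


plain = keyword_filter("free porn here")              # exact keyword is caught
decorated = keyword_filter("free ***PORN**** here")   # decorated variant is missed
underscored = keyword_filter("_Sex_ offer")           # underscored variant is missed
```

A stricter filter could normalize case and strip punctuation, but each such rule invites a new variation from the sender.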
Another approach to this problem involves setting up filters in the browser to block messages from the return address of the unsolicited e-mail. This defense is not effective since SPAMers usually use a false return address.
Another approach would be to compare each incoming message against a set of unsolicited e-mail messages. Unfortunately, this is quite computationally expensive.
A set of unsolicited e-mail messages is collected. Each unsolicited e-mail message is “finger printed” to produce an identifier. The “finger printing” can be accomplished by sampling the message and using portions of the samples to form an identifier. The “finger printing” can also be accomplished by hashing a portion of the message and using the result as an identifier. These “finger prints” or identifiers are used to construct an unsolicited message database.
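The two “finger printing” approaches described above might be sketched as follows. This is only an illustration under stated assumptions: the sample offsets in `SAMPLE_POSITIONS`, the use of each sampled character's low bit, the 256-character portion, and the choice of SHA-256 are all hypothetical, since the text does not fix these details.

```python
import hashlib

SAMPLE_POSITIONS = (3, 7, 11, 19, 23, 31, 43, 59)  # hypothetical sample offsets


def sample_fingerprint(body: str) -> int:
    """Build an 8-bit identifier from the low bit of eight sampled characters."""
    if not body:
        return 0
    ident = 0
    for pos in SAMPLE_POSITIONS:
        ch = body[pos % len(body)]            # wrap so short messages still work
        ident = (ident << 1) | (ord(ch) & 1)  # append one bit per sample
    return ident


def hash_fingerprint(body: str, portion: int = 256) -> str:
    """Hash the first `portion` characters of the body (SHA-256 is an assumed choice)."""
    return hashlib.sha256(body[:portion].encode("utf-8")).hexdigest()


# Construct a small unsolicited message database from collected messages.
spam_bodies = ["Buy cheap pills now!!!", "You have won a prize, click here"]
spam_db = {hash_fingerprint(b) for b in spam_bodies}
```

A short sampled identifier keeps the database small but makes collisions with legitimate mail more likely; a longer hash-based identifier trades space for fewer false matches.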
The client's e-mail messages are processed in an off-line manner by periodically fetching their messages; “finger printing” each message in a manner identical to the unsolicited messages; checking to see if the “finger print” is in the unsolicited message database; discarding any messages with a “finger print” in the unsolicited message database; and forwarding any message with a “finger print” not in the unsolicited message database to the “clean” POP.
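The off-line steps listed above can be sketched in a few lines. Plain Python lists stand in for the “dirty” and “clean” POP mailboxes, so no real mail server is contacted; the names `fingerprint` and `scrub_offline` and the SHA-256 hash are assumptions made for illustration.

```python
import hashlib


def fingerprint(body: str) -> str:
    """Hash-based identifier; SHA-256 over the first 256 characters is assumed."""
    return hashlib.sha256(body[:256].encode("utf-8")).hexdigest()


def scrub_offline(dirty_messages, spam_db):
    """Split fetched messages into those forwarded to the clean POP and those discarded."""
    clean, discarded = [], []
    for body in dirty_messages:
        ident = fingerprint(body)
        if ident in spam_db:
            discarded.append(ident)   # message is dropped; only its identifier is kept
        else:
            clean.append(body)        # message would be forwarded to the clean POP
    return clean, discarded


spam_db = {fingerprint("WIN A FREE CRUISE")}
clean, discarded = scrub_offline(["Meeting at 3pm", "WIN A FREE CRUISE"], spam_db)
```

In a deployment this function would be called periodically for each client, which lets the work be scheduled when the machine is otherwise idle.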
The client's e-mail messages are processed in an on-demand manner by intercepting their “clean” POP request; fetching their mail from their “dirty” POP; “finger printing” each message; checking to see if the “finger print” is in the unsolicited message database; forwarding any message with a “finger print” not in the unsolicited e-mail database to the “clean” POP; and then passing the intercepted POP request to the “clean” POP.
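The on-demand variant can be sketched in the same spirit. Here dictionaries keyed by login stand in for the two POP servers, and the class name `OnDemandScrubber` and its method are hypothetical; a real agent would speak the POP protocol on both sides.

```python
import hashlib


def fingerprint(body: str) -> str:
    """Hash-based identifier; SHA-256 over the first 256 characters is assumed."""
    return hashlib.sha256(body[:256].encode("utf-8")).hexdigest()


class OnDemandScrubber:
    """Intercepts a client's request for clean mail and scrubs dirty mail first."""

    def __init__(self, dirty_pop, clean_pop, spam_db):
        self.dirty_pop = dirty_pop    # login -> list of pending message bodies
        self.clean_pop = clean_pop    # login -> list of scrubbed message bodies
        self.spam_db = spam_db        # set of unsolicited message identifiers

    def handle_request(self, login):
        # Fetch everything waiting on the dirty POP for this client.
        pending = self.dirty_pop.pop(login, [])
        # Forward only messages whose identifier is not in the database.
        for body in pending:
            if fingerprint(body) not in self.spam_db:
                self.clean_pop.setdefault(login, []).append(body)
        # Finally pass the intercepted request on to the clean POP.
        return self.clean_pop.pop(login, [])


scrubber = OnDemandScrubber(
    dirty_pop={"alice": ["Lunch tomorrow?", "WIN A FREE CRUISE"]},
    clean_pop={},
    spam_db={fingerprint("WIN A FREE CRUISE")},
)
first = scrubber.handle_request("alice")
second = scrubber.handle_request("alice")
```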
For purposes of illustration, a computer network, 1; an unsolicited e-mail (SPAM) generator, 2; a client, 3; and a “dirty” e-mail server, 4, are shown in
Currently, the unsolicited e-mail generator, 2, generates e-mail and sends it to various blocks of IP addresses. The “dirty” POP e-mail server, 4, receives the client's valid e-mail along with the unsolicited e-mail. The client, 3, then fetches the valid and unsolicited e-mail from the “dirty” POP e-mail server.
In our new approach, we add a “clean” POP e-mail server, 5, and a SPAM removal agent, 6. The SPAM removal agent, 6, downloads the client mail from the “dirty” POP e-mail server, 7, removes all the unsolicited e-mail that it can find, and sends the scrubbed e-mail to the “clean” POP e-mail server, 5. The “clean” POP e-mail server holds the client's scrubbed e-mail. The client, 3, then fetches his e-mail from the “clean” POP e-mail server, 5, rather than the “dirty” POP e-mail server, 7.
Our approach detects unsolicited e-mail messages by comparing the “finger print” of a client e-mail message with the “finger prints” of a set of unsolicited e-mail messages.
This is accomplished by first gathering a set of unsolicited e-mail messages. One approach to accomplishing this is to set up “honey pots.” These “honey pots” are e-mail addresses that have no purpose other than to collect unsolicited e-mail. There are also some websites, such as spamhaus (http://www.spamhaus.org/sbl/latest.lasso), spews.org (http://www.spews.org/faq.html), and dsbl.org (http://dsbl.org/usage.html), which collect and report unsolicited e-mail.
The unsolicited e-mail messages are then “finger printed” to create an identifier. There are two basic approaches to creating this identifier or “finger print.” One approach based on sampling the message is shown in
An example of this approach is shown in
The location of the sampled characters in the sample based e-mail “finger printing” as shown in
A second approach to creating an e-mail message “finger print” or identifier based on hashing is shown in
An unsolicited message membership database is then constructed as shown in
As an example, the eight bit identifier generated in
Any type of database can be used; however, it is advantageous to keep the unsolicited e-mail membership database as compact as possible. This will allow the database to be kept within memory and thus speed up access.
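One compact in-memory representation, offered here only as an assumption since the text allows any type of database, is a bitmap with one bit per possible identifier. The class name `BitmapMembership` is hypothetical. For eight-bit identifiers the whole database occupies 32 bytes.

```python
class BitmapMembership:
    """Membership database holding one bit for each possible identifier value."""

    def __init__(self, id_bits: int = 8):
        self.size = 1 << id_bits                      # number of possible identifiers
        self.bits = bytearray((self.size + 7) // 8)   # one bit each, rounded up to bytes

    def add(self, ident: int) -> None:
        """Mark an identifier as belonging to an unsolicited message."""
        self.bits[ident >> 3] |= 1 << (ident & 7)

    def __contains__(self, ident: int) -> bool:
        """Constant-time check: is this identifier marked as unsolicited?"""
        return bool(self.bits[ident >> 3] & (1 << (ident & 7)))


db = BitmapMembership(id_bits=8)
db.add(0b10110010)
known = 0b10110010 in db
unknown = 0b10110011 in db
```

The bitmap grows as 2^n bits for n-bit identifiers (512 MiB at 32 bits), so for wider identifiers a hash set or similar structure may be the more practical choice.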
A flowchart describing an off-line process for removing unsolicited e-mail messages from a client's e-mail is shown in
All the fetched “dirty” client e-mail messages are then processed in the following manner.
A message is selected. The selected message is “finger printed” and an identifier is constructed. The identifier is checked for membership in the unsolicited e-mail membership database. If the identifier is found to be in the database then the selected message is deleted and the client's “clean” POP login, the deleted message identifier, and a time stamp are saved in the deleted unsolicited message database shown in
The previously examined message database shown in
A flowchart describing an on-demand process for removing the unsolicited e-mail messages from a client's e-mail is shown in
All the fetched “dirty” client messages are then processed in the following manner. A message is selected. The selected message is “finger printed” and an identifier is constructed. The identifier is checked for membership in the unsolicited message membership database. If the identifier is in the unsolicited message membership database then the message is deleted and the client's “clean” POP login, deleted message identifier, and time stamp are saved in the deleted message database shown in
There is no need to use the previously examined database as shown in
The deleted unsolicited message database as shown in
One advantage that both the off-line and the on-demand unsolicited e-mail processes have is that the client does not have to change their e-mail address or install any software. They only have to change the IP address of their POP server. Another advantage is that looking up the “finger print” identifier in the unsolicited e-mail membership database requires considerably less computation than comparing the client's message against the entire pool of unsolicited messages. Another advantage is that detailed records of the deleted messages are available for billing purposes.
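The deleted-message records mentioned above hold the client's “clean” POP login, the deleted message identifier, and a time stamp. A minimal sketch of such a record follows; the names `DeletedRecord` and `deletions_per_client` are hypothetical, and counting deletions per client is only one assumed way the records might feed into billing.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DeletedRecord:
    login: str        # client's "clean" POP login
    identifier: str   # "finger print" of the deleted message
    timestamp: float  # time of deletion, in seconds since the epoch


def deletions_per_client(records):
    """Count deleted messages per client login."""
    return Counter(record.login for record in records)


records = [
    DeletedRecord("alice", "a1f3", 1025000000.0),
    DeletedRecord("alice", "77c0", 1025000060.0),
    DeletedRecord("bob", "a1f3", 1025000120.0),
]
counts = deletions_per_client(records)
```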
The difference between the off-line and the on-demand approach to processing a client's unsolicited e-mail is that the off-line approach can process the “dirty” e-mail at its leisure and thus spread the computational load whereas the on-demand approach has to process the “dirty” e-mail while the client is waiting.
Although the present invention has been described above in terms of specific embodiments, it is anticipated that alterations and modifications thereof will no doubt become apparent to those skilled in the art. It is therefore intended that the following claims be interpreted as covering all such alterations and modifications as falling within the true spirit and scope of the invention.
This application is a continuation of application Ser. No. 10/179,446 entitled “A Method for Removing Unsolicited E-Mail Messages”, filed on Jun. 25, 2002 now abandoned, which is incorporated herein by reference in its entirety.
 | Number | Date | Country |
---|---|---|---|
Parent | 10179446 | Jun 2002 | US |
Child | 10700752 | | US |