1. Technical Field
The present invention relates generally to electronic computer communication and in particular to electronic mail communication. Still more particularly, the present invention relates to a method for reducing spam within electronic mail communication.
2. Description of the Related Art
Electronic mail (email) communication is utilized by a large and growing population of computer users. Each user has one or more email accounts having an inbox within which electronic mail that is addressed to the account is received. In recent years, businesses and persons desiring to spread their information to a large number of email users have resorted to a practice known as “spamming” within the computing environment. Spamming produces an unsolicited email from another user (often unknown to the recipient), and such email is referred to as spam or spam mail. Unfortunately, the practice of spamming has grown and it is not uncommon for a user to receive tens to hundreds of spam messages in his/her inbox over a 24-hour period.
Email spam is a burden both to the user receiving it and to the servers processing it. Users who receive email have to determine what is valid (i.e., wanted) email and what is not. Sifting through the inbox and deleting unwanted spam mail consumes time and reduces productivity. The same negative effects are experienced by the email servers that process and disseminate all email originating from and terminating at the various users assigned to the server.
One method of reducing the clutter in the user's inbox is the use of filtering. Several commercial filters have been created which attempt to sort the spam mail into a separate junk email box. However, the filtering process is usually an after-the-fact blocking technique. That is, a particular sender (email address) is first identified as a spammer; when any subsequent mail is received from that sender (address), the filter recognizes the spammer and blocks the email from entering the user's inbox.
This filtering method works in theory, if the spammers keep the same email address or if the spammers keep the same subject title. However, experienced spammers are aware of this methodology and employ a variety of techniques to attempt to make their solicitations unique enough to bypass modern filters and blacklists. These techniques include counter-heuristical and other tricky spamming techniques such as source and destination forgery, randomized text body, and fake titles. With such methods being utilized to find ways around/through the filters, filtering spam is increasingly difficult because the spam mail typically does not have a single, easily recognizable signature to consistently filter upon.
Disclosed is a method, system, and computer program product that substantially reduces or substantially eliminates the amount of spam mail received by a recipient email account. The technique provided is referred to as Match and Destroy (MaDe). MaDe acts as a filter assistant and ensures that the existing spam filters have a good and continually updated basis by which email can be filtered at the email server. The MaDe technique assigns at least two email addresses per user when the user first registers for an email account or when the MaDe software is installed on the server.
In one embodiment in which DHCP is utilized, two addresses are dynamically acquired. The first email address is designated as the “True address,” and the user is made aware of only this address. The second email address is designated as the “Trap address,” and the user is not made aware of this address. Because spammers typically collect as many email addresses as possible to have the broadest target for their message, both addresses are made available to the spammer, who assumes both to be valid addresses. The spammer is unable to differentiate the True addresses from the Trap addresses as that designation is guarded by the email server.
A MaDe filter is provided at the server, and once enabled, the MaDe filter detects when the same email is received in both the True and the Trap(s) inboxes. When this duplicate receipt occurs, the filter informs the resident spam filters that the email is a spam candidate. If the MaDe filter only sees an email in the Trap(s) inbox, the filter informs the spam filters that the email is spam, since there is no identifiable user of that address. If the MaDe filter only sees the email arriving for the True inbox, then the email is left alone to the existing filtering techniques.
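The three cases above can be sketched as a small decision function. This is only an illustrative sketch: the names `Verdict` and `made_verdict` are hypothetical and do not appear in the disclosure, and the way the resident spam filters consume the result is left open.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """Hypothetical labels for the hint MaDe hands to the resident spam filters."""
    SPAM_CANDIDATE = "spam candidate"  # same email seen in True and Trap inboxes
    SPAM = "spam"                      # email seen only in a Trap inbox
    DEFER = "defer"                    # email seen only in the True inbox


def made_verdict(seen_in_true: bool, seen_in_trap: bool) -> Optional[Verdict]:
    """Map where an email was seen to the MaDe hint described in the text."""
    if seen_in_true and seen_in_trap:
        return Verdict.SPAM_CANDIDATE
    if seen_in_trap:
        return Verdict.SPAM
    if seen_in_true:
        return Verdict.DEFER  # left alone to the existing filtering techniques
    return None  # not seen at either address, so there is nothing to report
```

Keeping the "spam candidate" and "spam" cases distinct mirrors the text, which treats a Trap-only email as spam outright but a duplicate receipt only as a candidate.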
In another embodiment, a server-side implementation is provided by creating several server Trap addresses. A fingerprint (hash) is generated for each incoming email. Then if any True address email fingerprint (hash) matches a Trap fingerprint, the email is tagged as spam. Further, any email that is sent to the Trap addresses is spam. Finally, email that is only sent to the True address is not stopped at the server and is passed on to the client email engine for further filtering.
In another embodiment, an enhancement to MaDe is provided to validate that email from a source is legitimate. This technique is referred to as Match and Certify (MaCe). MaCe works by expecting that all valid emails are to be sent to a particular email address pair (or triplet or quadruplet, quintuplet, sextuplet, septuplet, and so on). The user's primary email address is the True address. The user's validating email address(es) is the Confirm address(es). If the user receives an email at his True email address and it was not accompanied by a duplicate email (by carbon copy (cc:) or to: or bcc:) to his Confirm email address, then the email is not a legitimate one. Once both emails are received by a MaCe-enabled system, the MaCe filter tells the spam filters to let this email through now (and possibly in the future).
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.
The invention itself, as well as a preferred mode of use, further purposes, and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
The present invention provides a method, system and computer program product that substantially reduces or substantially eliminates the amount of spam mail received by a recipient email account. The technique provided is referred to as Match and Destroy (MaDe). MaDe acts as a filter assistant and ensures that the existing spam filters have a good and continually updated basis by which email can be filtered at the email server.
With reference now to the figures,
In the depicted example, network system 100 comprises email server 104 and multiple email clients 108, 110, and 112 connected to network 102, of which one client 108 is the recipient/receiving client (user), another client 110 is a sending client (sender) and a third client 112 is a spammer. For purposes of the invention, clients refer to the device and software by which an email communication may be created, transmitted and/or received. Also, the receiving client (108) may interchangeably be addressed according to its email inbox functionality.
Clients 108, 110, 112 may be, for example, personal computers or network computers. In the depicted example, email server 104 provides an email engine 105 that enables multiple clients 108, 110, 112 to register for an email account with a unique email address assigned thereto. In the described embodiment, email engine 105 also provides spam pre-filtering functions via either one or a combination of MaDe or MaCe techniques, which are described in detail below. Network system 100 may include additional servers, clients, and other devices not shown.
In the described embodiment, network system 100 is the Internet with network connectivity 102 representing a worldwide collection of networks and gateways that utilize the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. Of course, network system 100 also may be implemented as a number of different types of networks, such as an intranet, a local area network (LAN), or a wide area network (WAN), for example.
Referring now to
Computer system 200 also comprises a network interface device (NID) 230 utilized to connect computer system 200 to another computer system and/or computer network (as illustrated by
In one embodiment, the hardware components of computer system 200 are of conventional design. Computer system 200 may also include other components (not shown) such as fixed disk drives, removable disk drives, CD and/or DVD drives, audio components, modems, network interface components, and the like. It will therefore be appreciated that the system described herein is illustrative and that variations and modifications are possible. Further, the techniques for messaging middleware functionality may also be implemented in a variety of differently-configured computer systems. Thus, while the invention is described as being implemented in a computer system 200, those skilled in the art appreciate that various different configurations of computer systems exist and that the features of the invention are applicable regardless of the actual configuration of the computer system.
Located within memory 220 and executed on processor 210 are a number of software components, including operating system (OS) 225 (e.g., Microsoft Windows®, a trademark of Microsoft Corp, or GNU®/Linux®, registered trademarks of the Free Software Foundation and The Linux Mark Institute) and a plurality of software applications, including email engine 233 and MaDe utility 235. Among the software components are components for providing general email server functionality, components for enabling network connection and communication via NID 230 (e.g., a modem or network adapter), and more specific to the invention, code for enabling the “spam pre-filtering” functionality of the invention. For simplicity, the collective body of code that enables the duplicate assigning of email addresses and the subsequent spam pre-filtering features is referred to hereinafter as the MaDe utility 235 (or MaCe utility for the alternate embodiment described below). In actual implementation, the MaDe utility may be added to existing email engine code to provide the enhanced email assignment and filtering.
Processor 210 executes these (and other) application programs 234 (e.g., network connectivity programs) as well as OS 225, which supports the application programs. According to the illustrative embodiment, processor 210 executes OS 225, applications 234 (namely network access applications/utilities), and MaDe utility 235 to provide/enable the spam pre-filtering features and functionality described herein and illustrated by
Notably, in one embodiment, the assignment of the second or multiple email addresses occurs when the user first registers for an email account and/or when the MaDe utility is installed on the email server. In one embodiment utilizing DHCP (Dynamic Host Configuration Protocol), two addresses are dynamically acquired. According to the described embodiment, one email address is designated as the “True address,” and the user is made aware of only this address, as indicated at block 306. The next email address is designated as the “Trap address,” and this email address is not provided to the user (i.e., the user has no actual knowledge of this address). The email engine links the Trap address with the True address within the server's email database, as indicated at block 308.
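As a rough illustration of the assignment and linking steps, the following sketch keeps the True-to-Trap association in an in-memory table. The class `AddressBook`, its method names, and the random naming of Trap addresses are assumptions made for this example only; the disclosure does not specify how the server's email database stores the link.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AddressBook:
    """Hypothetical stand-in for the server's email database."""
    trap_to_true: Dict[str, str] = field(default_factory=dict)
    true_to_traps: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, true_address: str, domain: str, n_traps: int = 1) -> List[str]:
        """Create n_traps Trap addresses and link them to the True address."""
        traps = []
        for _ in range(n_traps):
            # Random local part (an assumption; any naming scheme could be used).
            trap = f"{secrets.token_hex(6)}@{domain}"
            self.trap_to_true[trap] = true_address
            traps.append(trap)
        self.true_to_traps.setdefault(true_address, []).extend(traps)
        return traps  # kept server-side; never shown to the user

    def is_trap(self, address: str) -> bool:
        """Only the server can answer this, since it guards the designation."""
        return address in self.trap_to_true
```

A call such as `AddressBook().register("alice@example.com", "example.com", n_traps=2)` would return two Trap addresses linked to the one True address.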
Then, as provided at block 310, both addresses are made available to the general email environment, which includes spammers. Because spammers typically collect as many email addresses as possible to have the broadest target for their message, these spammers assume both addresses to be valid addresses that represent two separate email accounts. The spammer is unable to differentiate the True addresses from the Trap addresses as that designation is guarded by the email server.
The MaDe technique may be implemented as either a client-side system or a server-side system. For simplicity, the invention is described as a server-side system with each email user assigned at least two (2) email addresses, one True address and one Trap address. These Trap email addresses are then published (e.g., posted to a newsgroup) so that potential spammers will attempt to contact the newly acquired targets as they attempt to collect as many email addresses as possible to increase the targets for their message.
Returning to
In the client-side implementation, the MaDe filter detects when the same email is received in both the True and the Trap(s) inbox, and the MaDe filter informs the resident spam filters that the email is a spam candidate. If MaDe only sees an email in the Trap(s), the MaDe filter informs the resident filters that the email is spam. If the MaDe filter only sees the email in the True inbox then the email is left alone to the existing filtering techniques. Also, with a client-side implementation, the Trap address(es) is not made public to legitimate sources by the user so the user who is not aware of the second email account does not check the account for email. In one embodiment, the MaDe utility creates multiple Trap addresses and peppers these Trap addresses throughout the Internet to entice the often over-zealous spammer.
In one embodiment, multiple Trap addresses are assigned per single user to further improve coverage as spammers become more sophisticated or attempt to exclude/include specific addresses, perhaps based on a lack of response or other methods. In yet another embodiment, a server-side implementation is provided by creating several server Trap addresses. A fingerprint (hash) is generated for each incoming email. Then, if any True address email fingerprint (hash) matches a Trap fingerprint, the email is tagged as spam. Further, any email that is sent to the Trap addresses is spam. Finally, email that is only sent to the True address is not stopped at the server and is passed on to the client email engine for further filtering.
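The server-side fingerprint matching can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 over whitespace- and case-normalized subject and body is chosen here only as an example of a fingerprint, and `fingerprint`, `FingerprintFilter`, and the `"spam"`/`"pass"` return values are hypothetical names not taken from the disclosure.

```python
import hashlib


def fingerprint(subject: str, body: str) -> str:
    """Hash of the email content; the normalization step is an assumption."""
    normalized = " ".join((subject + "\n" + body).lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class FingerprintFilter:
    """Tags email as spam when it hits a Trap address or matches a Trap fingerprint."""

    def __init__(self, trap_addresses):
        self.trap_addresses = set(trap_addresses)
        self.trap_fingerprints = set()

    def process(self, recipient: str, subject: str, body: str) -> str:
        fp = fingerprint(subject, body)
        if recipient in self.trap_addresses:
            # Any email sent to a Trap address is spam; remember its fingerprint.
            self.trap_fingerprints.add(fp)
            return "spam"
        if fp in self.trap_fingerprints:
            # A True-address email matching a Trap fingerprint is tagged as spam.
            return "spam"
        # Only seen at a True address: pass on to the client for further filtering.
        return "pass"
```

In this sketch a True-address copy that arrives before the matching Trap copy is passed through, so a real deployment would presumably need to hold or re-check recent mail. An exact hash also would not match spam with a randomized text body; a fuzzier fingerprint may be needed in that case.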
In one embodiment in which multiple Trap addresses (or Confirm addresses) are utilized, a revolving True-Trap or True-Confirm address pair may be implemented by which the Trap or Confirm address changes periodically to another one within the list of available Trap or Confirm addresses, respectively. Further, the True address may itself be interchangeable in one embodiment. With this embodiment, the client software, e.g., Lotus Notes®, lets the user type in the name of the recipient. The client/server software resolves the recipient name into an email address and transmits the email content. Coordination between trusted secure source and destination email servers ensures that the True and Confirm address are known. A shared temporal algorithm may be exchanged or the servers may communicate in real-time via their own SMTP messages.
In another alternate embodiment, an enhancement to MaDe is provided to validate that email from a source is legitimate. This technique is referred to as Match and Certify (MaCe) and involves executing a MaCe utility at the user device or at an email server. MaCe works by expecting that all valid emails are to be sent to a particular email address pair (or triplet or quadruplet, quintuplet, sextuplet, septuplet, and so on). The user's primary email address is the True address. The user's validating email address(es) are referred to as the Confirm address(es).
The client and/or server software is enhanced to make sure that the software of the recipient and of legitimate senders is made aware of the Confirm address as well as the True address. Once the Confirm address is known, the client or server software is able to send the duplicate validating email to the Confirm address. Once both emails are received by a MaCe-enabled system, the MaCe filter tells the spam filters to let this email through now (and possibly in the future).
According to the above-described process, if the user receives an email at his True email address and it was not accompanied by a duplicate email (by carbon copy (cc:) or to: or bcc:) to his Confirm email address, then the MaCe utility recognizes that the email is not a legitimate one (i.e., is most probably not from a legitimate sender). As with MaDe's multiple Trap addresses, one embodiment provides implementation of MaCe techniques with multiple confirming addresses.
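The MaCe check can be illustrated with a short sketch. The function name `mace_is_legitimate` and the rule that every Confirm address must be present when several are assigned are assumptions for this example; the text only states that a duplicate must reach the Confirm address.

```python
from typing import Iterable


def mace_is_legitimate(
    recipients: Iterable[str],
    true_address: str,
    confirm_addresses: Iterable[str],
) -> bool:
    """Return True if the email reached the True address and its Confirm address(es).

    `recipients` is the combined set of to:, cc:, and bcc: addresses the
    server saw for one email (how that set is gathered is not shown here).
    """
    addressed = {r.strip().lower() for r in recipients}
    confirms = [c.strip().lower() for c in confirm_addresses]
    if true_address.strip().lower() not in addressed:
        return False
    # Assumption: with several Confirm addresses, all of them must be present.
    return bool(confirms) and all(c in addressed for c in confirms)
```

For example, `mace_is_legitimate(["bob@example.com"], "bob@example.com", ["bob-confirm@example.com"])` evaluates to `False`, because no duplicate reached the Confirm address.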
According to the invention, the naming schemes for the Trap and the Confirm addresses are arbitrary. That is, these addresses may play off the True address or be randomly assigned/named. Additionally, the True, Trap, and Confirm addresses do not necessarily have the same email domain. In one embodiment, email domains are server coordinated to validate the emails.
In one embodiment, the True, Trap, and Confirm addresses are revolved through a predefined list. With this embodiment, a single user may be assigned five addresses and then for a MaDe system, one address is the True address, while the remaining addresses are the Trap addresses for one day. The following day, however, another one of the addresses is assigned as the True address, with the remaining addresses again utilized as the Trap addresses. This revolving assignment of the True address may also be extended to a MaCe system, with True and Confirm addresses. Based on this implementation, the user sending an email does not need to know what the True address is for a given day (or period). Rather, the client software, e.g., Lotus Notes®, lets the user type in the name of the recipient. The client/server software resolves this name into an email address and transmits the email to the recipient. Coordination between trusted secure source and destination email servers would then ensure that the True and Confirm addresses are known. This is achieved by an exchange of a shared temporal algorithm or by communication among the servers in real-time via their own SMTP messages.
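One possible form of a shared temporal algorithm is sketched below, assuming a daily rotation in which the day number selects the True address from the predefined list. The function name `roles_for_day` and the modulo rule are illustrative assumptions; the disclosure does not define the algorithm.

```python
from datetime import date
from typing import List, Sequence, Tuple


def roles_for_day(addresses: Sequence[str], day: date) -> Tuple[str, List[str]]:
    """Pick the True address for `day`; the remaining addresses act as Traps."""
    if not addresses:
        raise ValueError("need at least one address")
    # Both servers compute the same index from the date alone.
    index = day.toordinal() % len(addresses)
    true_address = addresses[index]
    traps = [a for i, a in enumerate(addresses) if i != index]
    return true_address, traps
```

Because sending and receiving servers derive the same index from the date, neither the sender nor the recipient needs to know which address is True on a given day. A plain modulo rotation is predictable, so a keyed or otherwise secret schedule shared between the trusted servers may be preferable in practice.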
In one embodiment, a company may sell a service to its customers that ensures that the customers' assigned Trap addresses are dispersed as widely as possible throughout the Internet. This would ensure that spammers have those Trap addresses on their spam lists, which in turn supports the functionality of the MaDe and MaCe systems.
As a final matter, it is important to note that while an illustrative embodiment of the present invention has been, and will continue to be, described in the context of a fully functional computer system with installed management software, those skilled in the art will appreciate that the software aspects of an illustrative embodiment of the present invention are capable of being distributed as a program product in a variety of forms, and that an illustrative embodiment of the present invention applies equally regardless of the particular type of signal bearing media used to actually carry out the distribution. Examples of signal bearing media include recordable type media such as floppy disks, hard disk drives, CD ROMs, and transmission type media such as digital and analogue communication links.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.