The present invention relates to the field of inhibiting spread of Spam mail.
Spam, also referred to as unsolicited bulk email or “junk” email, is undesired email that is sent to multiple recipients with the purpose of promoting a business, an idea or a service. Spam is also used by hackers to spread vandals and viruses in email, or to trick users into visiting hostile or hacked sites, which attack innocent surfers. Spam usually promotes “get rich quickly” schemes, porn sites, travel/vacation services, and a variety of other topics.
eSafe Gateway and eSafe Mail of Aladdin Knowledge Systems Ltd. are typical spam-blocking facilities that can block incoming or outgoing email based on the sender, recipient, body text, or subject text. Administrators can block or get a copy of mail messages containing specific keywords. For example, they can block email containing profanity or confidential project names. This feature blocks messages that violate corporate policies, thereby allowing full unattended enforcement of these policies. These facilities can also prevent attacks by hackers or vandal programs that use SMTP as a way of sending stolen information out of the network.
The term “False Positive” refers herein to classifying an email message as spam despite the fact that it is not spam.
The major problem with spam detection is that classifying an email as spam is carried out according to subjective examination rather than objective examination. For example, an email message that comprises the word “travel” may be classified as spam when received in the user's office email box; however, when received at the home email box of the same user, it can be considered non-spam, since the user may be interested in travel deals.
Therefore, it is an object of the present invention to provide a method and system for classifying email messages as spam.
It is another object of the present invention to provide a method and system for inhibiting spread of spam.
It is a further object of the present invention to provide a method and system for inhibiting spread of spam, in which the number of false positives is decreased in comparison to the prior art.
It is yet a further object of the present invention to provide a method and system for detecting spam originators.
Other objects and advantages of the invention will become apparent as the description proceeds.
In one aspect, the present invention is directed to a method for identifying and blocking spam email messages on an inspecting point, the method comprising the steps of:
The method may further comprise:
According to one embodiment of the invention, the flow rate is based on a number of email messages received at the gateway from the originator in a time period. According to another embodiment of the invention, the flow rate is based on a number of email messages received from two or more originators having a common denominator at the gateway in a time period.
The common denominator may be a domain, an email address, certain keyword(s) within the text of the email messages, certain keyword(s) within the title of the email messages, certain keyword(s) within the email address of the originator of the email messages, certain keyword(s) within the email address of the recipient(s) of the email messages, and so forth.
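As an illustration of how messages might be grouped by a common denominator, the following Python sketch derives grouping keys from the sender's domain and from keywords in the title. The function names, the key format and the keyword set are hypothetical and are not taken from the description; this is one possible arrangement, not the claimed implementation.

```python
from email.utils import parseaddr


def sender_domain(from_header: str) -> str:
    """Return the lower-cased domain of a From header, or '' if none is found."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower() if "@" in address else ""


def common_denominators(from_header: str, subject: str, keywords: set) -> set:
    """Return the grouping keys (hypothetical format) this message contributes to."""
    keys = set()
    domain = sender_domain(from_header)
    if domain:
        keys.add(f"domain:{domain}")
    # Normalize title words so that "TRAVEL" and "travel!" match the same keyword.
    words = {w.strip(".,!?:;").lower() for w in subject.split()}
    for keyword in keywords & words:
        keys.add(f"subject-keyword:{keyword}")
    return keys
```

Messages that share a key can then be counted together, so that two or more originators in the same domain contribute to a single flow rate.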
The inspecting point may be a gateway server, mail server, firewall server, proxy server, ISP server, VPN server, a server that filters incoming data to an organization network, etc.
In another aspect, the present invention is directed to a system for identifying and blocking spam email messages at an inspecting point, the system comprising:
According to one embodiment of the invention, the flow rate calculator comprises:
According to another embodiment of the invention, the flow rate calculator comprises:
The spam detector, flow rate calculator and spam indicator are computerized facilities.
The present invention may be better understood in conjunction with the following figures:
An email message sent from, e.g., user 21 to, e.g., user 42, passes through the mail server 20 and through the Internet 100, until it reaches mail server 10. At the mail server 10 the email message is scanned by the blocking facility 15, and if no malicious code is detected, it is then stored in email box 12, which belongs to user 42. The next time user 42 opens his mailbox 12 he finds the delivered email message.
At block 201 the email is “inspected”, i.e. one or more tests are carried out in order to determine whether the email message is suspected as spam. As known to a person of ordinary skill in the art, there are a variety of tests to classify an email as spam, such as searching for certain keyword(s) in the email text or title.
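A keyword test of the kind mentioned above might look like the following Python sketch. The keyword list and the function name are assumptions made for illustration only; the description does not prescribe a particular test.

```python
import re

# Hypothetical keyword list; a real deployment would maintain its own.
SUSPECT_KEYWORDS = {"lottery", "winner", "casino"}


def is_suspected_spam(subject: str, body: str, keywords=SUSPECT_KEYWORDS) -> bool:
    """Return True if any keyword appears as a whole word in the title or text."""
    tokens = set(re.findall(r"[a-z0-9']+", f"{subject} {body}".lower()))
    return not keywords.isdisjoint(tokens)
```

Matching on whole words rather than substrings avoids flagging a message merely because a keyword is embedded in a longer, unrelated word.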
From block 202, if the email is not suspected as spam, the flow continues with block 207, otherwise the flow continues with block 203.
At block 203, the originator of the email message is identified.
At block 204, a “flow rate” of the email messages from the particular originator is calculated.
From block 205, if the flow rate exceeds a certain threshold, the flow continues to block 206, otherwise to block 207.
The method decreases the number of false positives since it takes into consideration a plurality of email messages instead of analyzing each email message individually. Moreover, the method also allows detecting “spammers”, i.e. spamming originators.
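The flow through blocks 201 to 207 can be sketched in Python as follows. The class name, parameter names and defaults are hypothetical. The text does not state what blocks 206 and 207 do, so the sketch assumes that block 206 treats the message as spam and block 207 lets it through. The flow rate is simplified here to a count of suspected messages within a fixed time window.

```python
from collections import defaultdict, deque


class SpamFlowInspector:
    """Sketch of blocks 201-207; names and defaults are illustrative only."""

    def __init__(self, is_suspected, window_seconds=60.0, max_per_window=3):
        self.is_suspected = is_suspected      # test used at block 201, supplied by the caller
        self.window_seconds = window_seconds  # time period T
        self.max_per_window = max_per_window  # threshold on the count within T
        self._arrivals = defaultdict(deque)   # originator -> arrival times of suspected mail

    def inspect(self, sender, text, now):
        """Return 'spam' (assumed block 206) or 'deliver' (assumed block 207)."""
        if not self.is_suspected(text):       # blocks 201-202
            return "deliver"
        originator = sender.strip().lower()   # block 203: sender address as identity
        arrivals = self._arrivals[originator]
        arrivals.append(now)
        # Forget suspected messages that arrived more than T seconds ago.
        while arrivals and now - arrivals[0] > self.window_seconds:
            arrivals.popleft()
        count = len(arrivals)                 # block 204: messages within T
        return "spam" if count > self.max_per_window else "deliver"  # block 205
```

Because only suspected messages are counted, a single message containing a suspicious keyword is still delivered; the message is treated as spam only once the originator's rate of suspected messages exceeds the threshold.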
An originator can be identified in a variety of ways. According to one embodiment of the invention, an originator is identified by the email address of the sender of an email message. Even if the spam sender's email address is a fake email address, a plurality of email messages sent from the same “sender” can still indicate that the email messages are spam messages.
It is common that spammers send email messages which differ by their size, text, etc., although they promote the same subject, in order to overcome signature detection and virus detection methods. According to a preferred embodiment of the present invention the most common keywords in incoming email messages are detected, and in case the common keywords indicate spam, further email messages having these keywords are blocked.
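One way to detect the most common keywords across incoming messages is sketched below in Python. The stop-word list, the spam vocabulary and the function names are assumptions for illustration; the description does not specify how commonness is measured or how a keyword is judged to indicate spam.

```python
import re
from collections import Counter

# Hypothetical stop-word list; common function words carry no signal.
STOPWORDS = {"a", "an", "and", "at", "the", "to", "of", "for", "in", "on", "is"}


def most_common_keywords(messages, top_n=3, stopwords=STOPWORDS):
    """Return the top_n words ranked by the number of messages containing them."""
    counts = Counter()
    for text in messages:
        # Count each word once per message so one long message cannot dominate.
        words = set(re.findall(r"[a-z']+", text.lower())) - stopwords
        counts.update(words)
    return [word for word, _ in counts.most_common(top_n)]


def keywords_to_block(messages, spam_vocabulary, top_n=3):
    """Return the common keywords that also appear in a spam vocabulary."""
    return set(most_common_keywords(messages, top_n)) & spam_vocabulary
```

Further messages containing any keyword returned by `keywords_to_block` could then be blocked even when their size and wording differ from earlier messages.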
The term Flow Rate refers herein to an expression representing the quantity of email messages that are sent from an originator and pass through an inspection point in a time period. For example: F=E/T, where F is the flow rate and E is the number of email messages received at an inspection point from an originator (or a group of originators) during time T. Of course, a combination of these parameters can also represent a flow rate.
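The expression F=E/T can be computed directly from message arrival times, as in the following Python sketch. The function name and the choice of seconds as the time unit are assumptions.

```python
def flow_rate(arrival_times, window_start, window_end):
    """Return F = E / T in messages per second over [window_start, window_end)."""
    duration = window_end - window_start  # T
    if duration <= 0:
        raise ValueError("window must have positive length")
    # E: messages whose arrival time falls inside the window.
    received = sum(window_start <= t < window_end for t in arrival_times)
    return received / duration
```

For instance, 120 messages arriving evenly over a 600-second window give E = 120 and T = 600, so F = 0.2 messages per second.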
The threshold does not have to be an absolute number; it can also be an expression, such as, for example, 70% of the average flow rate of incoming email messages in 24 hours.
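A threshold expressed relative to the average flow rate can be sketched in Python as follows. The function names and the use of messages per hour as the unit are assumptions; the 70% figure is the example given above.

```python
def relative_threshold(total_messages_24h, fraction=0.7):
    """Return a threshold in messages per hour: a fraction of the 24-hour average."""
    average_per_hour = total_messages_24h / 24
    return fraction * average_per_hour


def exceeds_threshold(originator_rate_per_hour, total_messages_24h, fraction=0.7):
    """Compare one originator's flow rate with the relative threshold."""
    return originator_rate_per_hour > relative_threshold(total_messages_24h, fraction)
```

With 2400 incoming messages in 24 hours, the average is 100 messages per hour and the threshold is 70, so an originator sending 80 suspected messages per hour exceeds it while one sending 50 does not.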
The inspection facility 10 comprises a spam detector 60, a flow rate calculator 70 and a spam indicator 80. The spam detector 70 indicates whether an email message is suspected as spam. The flow rate calculator calculates the flow rate of spam-suspected email messages from a certain originator. The spam indicator 80 indicates whether the spam-suspected email messages are indeed spam. The flow rate calculator 60, the spam detector 70 and the spam indicator 80 are programmed facilities, i.e. they may employ software and/or hardware elements.
Of course these methods for calculating flow rate are only examples, and a variety of other methods can be employed.
Those skilled in the art will appreciate that the invention can be embodied by other forms and ways, without losing the scope of the invention. The embodiments described herein should be considered as illustrative and not restrictive.
This is a continuation-in-part of U.S. Provisional Patent Application No. 60/609,344, filed Sep. 14, 2004.
Number | Date | Country
---|---|---
60609344 | Sep 2004 | US