1. Field of the Invention
The present invention relates generally to computer security, and more particularly but not exclusively to methods and apparatus for detecting phishing.
2. Description of the Background Art
Phishing involves stealing information, such as usernames, passwords, and credit card information, by mimicking a legitimate organization in Internet communications. Phishing is typically perpetrated by sending emails that include a link to a webpage of a malicious website or other harmful content. Victims are fooled into clicking the link because the emails are designed to look like they are from a legitimate organization trusted by the victim.
Detecting phishing emails by analyzing the email content is difficult because a phishing email is designed to look like a legitimate email. Patterns or signatures for detecting phishing emails by pattern matching will also match legitimate emails, raising the number of false positives to unacceptable levels. Detecting phishing emails by uniform resource locator (URL) analysis is also problematic because phishing sites are constantly being relocated and their numbers are increasing.
In one embodiment, phishing is detected by creating a message transfer agent (MTA) map, with each point on the MTA map referencing an MTA. Points on the MTA map are connected based on the number of emails with the same signature sent by the MTAs represented on the MTA map. Reference MTA groups are identified from the map. Phishing is detected when an MTA sends an email with the same signature as emails sent by MTAs belonging to a reference MTA group, but the MTA is not a member of that reference MTA group.
These and other features of the present invention will be readily apparent to persons of ordinary skill in the art upon reading the entirety of this disclosure, which includes the accompanying drawings and claims.
The use of the same reference label in different drawings indicates the same or like components.
In the present disclosure, numerous specific details are provided, such as examples of apparatus, components, and methods, to provide a thorough understanding of embodiments of the invention. Persons of ordinary skill in the art will recognize, however, that the invention can be practiced without one or more of the specific details. In other instances, well-known details are not shown or described to avoid obscuring aspects of the invention.
Referring now to
The computer 100 is a particular machine as programmed with software modules 110. The software modules 110 comprise computer-readable program code, i.e., computer instructions, stored in a non-transitory manner in the main memory 108 for execution by the processor 101. Execution of the software modules 110 by the processor 101 causes the computer 100 to perform the functions of the software modules 110. As an example, the software modules 110 may comprise an analysis module and a phishing detector when the computer 100 is employed as a phishing analysis system.
The phishing analysis system 220 may comprise one or more computers that generate an MTA map from IP addresses of a plurality of MTAs and emails sent by the MTAs, connect points on the MTA map based on the number of emails with the same signature sent by the MTAs, identify reference MTA groups from the MTA map, subsequently collect signatures of emails sent by MTAs in the identified reference MTA groups, and detect phishing by identifying a particular MTA that sent an email having the same signature as emails sent by MTAs belonging to a reference MTA group, where the particular MTA is not a member of that reference MTA group.
In the example of
In one embodiment, the phishing detector 225 comprises computer-readable program code that receives emails sent by MTAs belonging to reference MTA groups and detects a particular MTA that sent an email having a signature that is the same as the signatures of emails sent by MTAs belonging to a particular reference MTA group, where the particular MTA is not a member of the particular reference MTA group. The phishing detector 225 may consult the listing of reference MTA groups 224 to identify members of various reference MTA groups. In one embodiment, each reference MTA group in the listing of reference MTA groups 224 comprises a plurality of MTAs that send emails having the same signature. That is, each MTA in a reference MTA group sends an email with a particular signature known to belong to that reference MTA group.
The phishing detector 225 and the listing of reference MTA groups 224 may be deployed in the phishing analysis system 220 or in some other system, such as in the security system 221. The phishing analysis module 223 may provide the listing of reference MTA groups 224 to subscribing computers, such as the security system 221 and other computer systems. The phishing analysis module 223 may also continually update the listing of reference MTA groups 224 with new data and provide the updates to the subscribing computers.
Table 1 shows examples of IP (Internet protocol) addresses of sender computers and signatures of emails sent by the sender computers in accordance with an embodiment of the present invention. In one embodiment, the signature of an email is calculated by hashing the content of the body of the email, for example. Variables, such as names of recipients and the like, may be eliminated from the signature to reveal the template of the email. That is, the signature represents a template of the email. Other ways of calculating an email signature may also be employed. In the example of Table 1, the sender IP address is the IP address of the sender computer, which in this case is a sending MTA that sent the email. The sender IP addresses and corresponding email signatures may be obtained from various logs. An email signature may be obtained directly from the log or calculated by the phishing analysis module 223 upon receipt of the email.
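As a non-limiting illustration of the signature calculation described above, the following Python sketch strips variable fields from an email body and hashes what remains. The choice of SHA-256, the function name `email_signature`, and the regular expressions used to detect variable fields are assumptions made for this example only; the disclosure does not specify a hash function or a normalization procedure.

```python
import hashlib
import re

# Hypothetical patterns for variable fields (assumptions for illustration).
_VARIABLE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),      # email addresses
    re.compile(r"(?im)^(?:dear|hello|hi)\s+[^,\n]*"),  # greeting with a recipient name
    re.compile(r"\d+"),                                # account numbers, amounts, dates
]


def email_signature(body: str) -> str:
    """Return a hex digest that represents the template of an email body."""
    template = body
    for pattern in _VARIABLE_PATTERNS:
        # Replace each variable field with a fixed placeholder.
        template = pattern.sub("<var>", template)
    # Collapse whitespace and case so trivial formatting changes do not matter.
    template = " ".join(template.split()).lower()
    return hashlib.sha256(template.encode("utf-8")).hexdigest()
```

Under these assumptions, two emails that differ only in the recipient name or an account number yield the same signature, while emails built from different templates yield different signatures.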
In one embodiment, each MTA, as represented by its IP address, is represented by a point in an MTA map that comprises a plurality of MTAs. The phishing analysis module 223 may evaluate similarity between two points, i.e., two MTAs, on the MTA map based on the number of emails with the same signatures sent by or from the two points. That is, in one embodiment, similarity of two points may be determined by comparing the signatures of emails sent by the two points. As an example, Table 2 shows the similarities of two points detected by the phishing analysis module 223 from the example of Table 1. In the example of Table 2, the MTAs having the IP addresses “169.254.1.6” and “141.113.102.114” have a similarity value of 2 because these IP addresses sent two emails with the same signatures; the MTAs having the IP addresses “75.127.151.162” and “169.254.1.139” have a similarity value of 1 because these IP addresses sent one email with the same signature; and so on.
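The similarity evaluation may be sketched as follows. This example assumes that the log has already been reduced to (sender IP address, signature) pairs and interprets the similarity of two MTAs as the number of distinct signatures that both MTAs have sent; counting individual emails instead would be another reasonable reading. The function name `pairwise_similarity` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_similarity(records):
    """Count shared signatures for every pair of sender IP addresses.

    records: iterable of (sender_ip, signature) pairs taken from logs.
    Returns a dict mapping a sorted (ip_a, ip_b) tuple to a similarity value.
    """
    senders_by_signature = defaultdict(set)
    for sender_ip, signature in records:
        senders_by_signature[signature].add(sender_ip)

    similarity = defaultdict(int)
    for senders in senders_by_signature.values():
        # Every pair of MTAs that sent this signature gains one point.
        for ip_a, ip_b in combinations(sorted(senders), 2):
            similarity[(ip_a, ip_b)] += 1
    return dict(similarity)
```

Grouping senders by signature first avoids comparing every pair of MTAs directly, which matters when the map contains many points but each signature is sent by only a few of them.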
When the number of emails with the same signatures sent by two points exceeds a similarity threshold, the phishing analysis module 223 may connect the two points in the MTA map. In graph theory, this means that the two points have an edge between them. For example, assuming the similarity threshold is set to 100, two points must have sent over 100 emails with the same signatures before the phishing analysis module 223 will connect the two points on the MTA map.
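The thresholding step may be sketched as follows, assuming the similarity values are held in a dictionary keyed by pairs of IP addresses. The adjacency-set representation and the function name `build_mta_map` are choices made for this example, not part of the disclosure.

```python
def build_mta_map(similarity, threshold):
    """Build an undirected MTA map as {ip: set of connected ips}.

    similarity: dict mapping (ip_a, ip_b) to a similarity value.
    Two points are connected only when their similarity exceeds the threshold.
    """
    adjacency = {}
    for (ip_a, ip_b), value in similarity.items():
        # Every MTA is a point on the map, even if it ends up with no edges.
        adjacency.setdefault(ip_a, set())
        adjacency.setdefault(ip_b, set())
        if value > threshold:
            adjacency[ip_a].add(ip_b)
            adjacency[ip_b].add(ip_a)
    return adjacency
```

With a threshold of 100, a pair with a similarity value of 101 receives an edge, whereas a pair with a similarity value of exactly 100 does not, consistent with the "over 100" wording above.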
In one embodiment, the phishing analysis module 223 processes an MTA map to identify reference MTA groups that may be employed to detect phishing. More particularly, the phishing analysis module 223 may process the MTA map 260 into reference MTA groups 224 that may be consulted by a phishing detector 225.
In one embodiment, the phishing analysis module 223 calculates the betweenness centrality of each point on an MTA map, removes the point with the highest betweenness centrality from the MTA map, removes the edges to the point with the highest betweenness centrality, finds MTA groups that have been isolated by removal of the edges to the point with the highest betweenness centrality, and calculates the modularity of the isolated MTA groups to determine whether or not to include the MTA groups in the listing of reference MTA groups 224.
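One pass of this procedure may be sketched as follows for an MTA map held as an adjacency dictionary. The sketch computes betweenness centrality with Brandes' algorithm for unweighted graphs and scores each isolated group by its contribution to Newman modularity measured on the original map. The function names, the `min_modularity` cutoff, and the decision to ignore single-point groups are assumptions made for this example; the disclosure does not state how the modularity value is compared.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality of each node in an undirected graph {node: set}."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes by non-decreasing distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        paths = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dep = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies, farthest first
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != s:
                score[w] += dep[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


def components(adj):
    """Connected components of an undirected graph, as a list of sets."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            group.add(v)
            for w in adj[v] - seen:
                seen.add(w)
                queue.append(w)
        groups.append(group)
    return groups


def group_modularity(adj, group):
    """Contribution of one group to Newman modularity on the graph adj."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2      # total edges
    if m == 0:
        return 0.0
    internal = sum(len(adj[v] & group) for v in group) / 2
    degree = sum(len(adj[v]) for v in group)
    return internal / m - (degree / (2 * m)) ** 2


def split_on_hub(adj, min_modularity=0.0):
    """Remove the highest-betweenness point and return (hub, accepted groups)."""
    centrality = betweenness(adj)
    hub = max(centrality, key=centrality.get)
    pruned = {v: nbrs - {hub} for v, nbrs in adj.items() if v != hub}
    isolated = [g for g in components(pruned) if len(g) > 1]
    accepted = [g for g in isolated if group_modularity(adj, g) > min_modularity]
    return hub, accepted
```

For a map in which two tightly connected clusters of MTAs are joined only through a single point, that point has the highest betweenness centrality, and removing it isolates the two clusters as candidate reference MTA groups. The pass may be repeated on each remaining group if finer groups are desired.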
Continuing with
In the example of
Generally speaking, in the example of
Each identified reference MTA group comprises a plurality of MTAs that are similar for having sent emails with the same signature. Each reference MTA group is thus associated with a particular email signature. That is, an MTA belonging to a reference MTA group sends emails with the same signature as other emails sent by other MTAs belonging to the same reference MTA group. Accordingly, when a particular MTA sends an email with a signature associated with a particular reference MTA group and that particular MTA is not a member of the particular reference MTA group, that particular MTA is deemed to be phishing.
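The membership check described above may be sketched as follows. The `ReferenceGroup` structure and the function `is_phishing` are hypothetical names introduced for this example, and the IP addresses shown in the usage note are reserved documentation addresses rather than data from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceGroup:
    """A reference MTA group and the email signatures associated with it."""
    members: set = field(default_factory=set)      # IP addresses of member MTAs
    signatures: set = field(default_factory=set)   # signatures sent by the group


def is_phishing(sender_ip, signature, groups):
    """Return True when the signature belongs to a reference group
    but the sending MTA is not a member of any such group."""
    matching = [g for g in groups if signature in g.signatures]
    if not matching:
        # Signature is not associated with any reference group: no verdict.
        return False
    return all(sender_ip not in g.members for g in matching)
```

For instance, given a group whose members are 192.0.2.10 and 192.0.2.11 and whose signature set contains a single signature, an email carrying that signature from 198.51.100.7 would be flagged, while the same email from 192.0.2.10 would not.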
In the example of
Continuing in
As an example, assume the MTAs 203 and 204 of
Methods and apparatus for detecting phishing have been disclosed. While specific embodiments of the present invention have been provided, it is to be understood that these embodiments are for illustration purposes and not limiting. Many additional embodiments will be apparent to persons of ordinary skill in the art reading this disclosure.