Claims
- 1. A method for detecting an occurrence of a violation of an email security policy of a computer system by transmission of selected email through said computer system, said computer system comprising a server and one or more clients having an email account, the method comprising:
(a) defining a model relating to prior transmission of email through said computer system derived from statistics relating to prior emails transmitted through said computer system;
(b) gathering statistics relating to said transmission of selected email through said computer system; and
(c) classifying said selected email as being a member of a classification by applying said model to said statistics relating to said transmission of selected email through said computer system.
- 2. The method as recited in claim 1, wherein said step of defining a model relating to prior transmission of email comprises defining a model relating to attachments to said prior emails transmitted through said computer system.
- 3. The method as recited in claim 2, wherein said method further comprises extracting said attachments from each of said selected emails transmitted through said computer system.
- 4. The method as recited in claim 3, which further comprises identifying said attachment with a unique identifier.
- 5. The method as recited in claim 2, wherein the step of gathering statistics relating to said transmission of selected email through said computer system comprises recording the number of occurrences of said attachment received by said client.
- 6. The method as recited in claim 2, wherein the step of gathering statistics relating to said transmission of selected email through said computer system comprises, for each attachment that is transmitted by an email account, recording a total number of addresses to which said attachment is transmitted.
- 7. The method as recited in claim 2, wherein the step of gathering statistics relating to said transmission of selected email through said computer system comprises, for each attachment that is transmitted by an email account, recording a total number of email accounts which transmit said attachment.
- 8. The method as recited in claim 2, wherein said step of defining a probabilistic model comprises, for each attachment that is transmitted by an email account, defining a model that classifies an attachment as violating an email security policy based on said total number of email addresses to which said attachment is transmitted and said total number of email accounts which transmit said attachment.
- 9. The method as recited in claim 1, wherein said step of classifying said selected email comprises classifying said email as a member of a group comprising violative of a security policy and non-violative of a security policy.
- 10. The method as recited in claim 1, wherein said step of classifying said selected email is performed at said client.
- 11. The method as recited in claim 10, further comprising transmitting said classification to said server.
- 12. The method as recited in claim 1, wherein said step of classifying said selected email is performed at said server.
- 13. The method as recited in claim 12, further comprising transmitting said classification to said one or more clients.
- 14. The method as recited in claim 1, wherein the step of identifying said selected email with a unique identifier comprises substituting an email account user name with an alphanumeric code.
- 15. The method as recited in claim 1, wherein said step of defining a model relating to prior transmission of email comprises defining a model derived from statistics relating to prior transmission of emails of one of said email accounts.
- 16. The method as recited in claim 15, wherein said step of defining a model comprises defining a histogram of prior transmission of emails of one of said email accounts.
- 17. The method as recited in claim 16, wherein said step of gathering statistics relating to said transmission of selected email through said computer system further comprises defining a histogram of selected transmission of emails of one of said email accounts.
- 18. The method as recited in claim 17, wherein the step of classifying said selected email as being a member of a classification comprises comparing said histogram of prior transmission of emails to said histogram of selected transmission of emails.
- 19. The method as recited in claim 18, wherein said step of comparing comprises performing a Mahalanobis distance analysis on said histogram of prior transmission of emails to said histogram of selected transmission of emails.
- 20. The method as recited in claim 18, wherein said step of comparing comprises performing a Kolmogorov-Smirnov test on said histogram of prior transmission of emails to said histogram of selected transmission of emails.
- 21. The method as recited in claim 18, wherein said step of comparing comprises performing a Chi-square test on said histogram of prior transmission of emails to said histogram of selected transmission of emails.
- 22. The method as recited in claim 15, wherein the step of defining a model relating to prior transmission of email comprises grouping email addresses into cliques corresponding to email addresses of recipients occurring in a respective email transmitted by one of said email accounts.
- 23. The method as recited in claim 22, wherein the step of gathering statistics relating to said transmission of selected email through said computer system comprises, for email transmitted by one of said email accounts, gathering information on the email addresses of the recipients in each said email.
- 24. The method as recited in claim 23, wherein the step of classifying said selected email as being a member of a classification based on said statistics comprises classifying said email as violating said email security policy based on whether said email addresses in said email are members of more than one clique.
- 25. The method as recited in claim 15, wherein the step of defining a model relating to transmission of emails from one of said email accounts comprises, for emails transmitted from said email account, defining said model based on the time in which said emails are transmitted by said email account.
- 26. The method as recited in claim 15, wherein the step of defining a model relating to transmission of emails from one of said email accounts comprises defining said model based on the size of said emails that are transmitted by said email account.
- 27. The method as recited in claim 15, wherein the step of defining a model relating to transmission of emails from one of said email accounts comprises defining said model based on the number of attachments that are transmitted by said email account.
- 28. The method as recited in claim 1, wherein said client comprises a plurality of email accounts and wherein said step of defining a model relating to prior transmission of email comprises defining a model relating to statistics concerning emails transmitted by said plurality of email accounts.
- 29. The method as recited in claim 28, wherein said step of defining a statistical model comprises defining a histogram of prior transmission of emails of a first one of said plurality of email accounts.
- 30. The method as recited in claim 29, wherein said step of gathering statistics relating to said transmission of selected email through said computer system comprises defining a histogram of selected transmission of emails of a second one of said plurality of email accounts.
- 31. The method as recited in claim 30, wherein the step of classifying said selected email as being a member of a classification comprises comparing said histogram of prior transmission of emails of said first one of said plurality of email accounts to said histogram of selected transmission of emails of said second one of said plurality of email accounts.
- 32. The method as recited in claim 31, wherein said step of comparing comprises performing a Mahalanobis distance analysis on said histogram of prior transmission of emails of said first one of said plurality of email accounts to said histogram of selected transmission of emails of said second one of said plurality of email accounts.
- 33. The method as recited in claim 31, wherein said step of comparing comprises performing a Kolmogorov-Smirnov test on said histogram of prior transmission of emails of said first one of said plurality of email accounts to said histogram of selected transmission of emails of said second one of said plurality of email accounts.
- 34. The method as recited in claim 31, wherein said step of comparing comprises performing a Chi-square test on said histogram of prior transmission of emails of said first one of said plurality of email accounts to said histogram of selected transmission of emails of said second one of said plurality of email accounts.
- 35. The method as recited in claim 28, wherein said step of defining a model comprises defining a model based on the number of emails transmitted by each of said email accounts.
- 36. The method as recited in claim 28, wherein said step of defining a model comprises defining a model based on the number of recipients in each email transmitted by each of said email accounts.
- 37. A method for detecting an occurrence of a violation of an email security policy of a computer system by transmission of selected email through said computer system, said computer system comprising a server and one or more clients having an email account, the method comprising:
(a) defining a model relating to prior email transmitted by said email account derived from statistics relating to prior emails transmitted by said email account;
(b) gathering statistics relating to said selected emails transmitted by said email account;
(c) defining a model of said new email transmission derived from said statistics; and
(d) comparing said model of said new email transmission and said model relating to prior email transmitted by said email account.
- 38. The method as recited in claim 37, wherein said step of defining a model relating to prior email comprises defining a model relating to statistics accumulated over a predetermined time period.
- 39. The method as recited in claim 37, wherein said step of defining a model relating to prior email comprises defining a model relating to the number of emails sent by said email account during a predetermined time period.
- 40. The method as recited in claim 37, wherein said step of defining a model relating to prior email comprises defining a model relating to statistics accumulated irrespective of a time period.
- 41. The method as recited in claim 37, wherein said step of defining a model relating to prior email comprises defining a model relating to the number of email recipients to which said email account transmits said emails.
- 42. The method as recited in claim 37, wherein said step of defining a model relating to prior email comprises defining a model relating to the number of attachments in each email transmitted by said email account.
- 43. The method as recited in claim 37, wherein the step of defining a model relating to prior email comprises defining said model based on said email addresses of recipients to which said emails are transmitted by said email account.
- 44. The method as recited in claim 43, wherein the step of defining a model relating to said prior email comprises grouping said email addresses into cliques corresponding to email addresses of recipients occurring in the same email.
- 45. The method as recited in claim 44, wherein the step of gathering statistics relating to said transmission of new email transmitted by said email account comprises, for email transmitted by said email account, gathering information on the email addresses of the recipients in each email.
- 46. The method as recited in claim 45, wherein the step of comparing said model of said new email transmission and said model relating to prior email transmitted by said email account comprises classifying said email as violating said email security policy based on whether said email addresses in said email are members of more than one clique.
- 47. A system for detecting an occurrence of a violation of an email security policy of a computer system by transmission of selected email through said computer system comprising:
(a) a client comprising:
(i) an email server configured to receive and transmit said selected email for one or more email accounts;
(ii) a client database configured to store information relating to said selected email and a model derived from statistics relating to prior emails transmitted through said computer system; and
(iii) an analysis component configured to define a model for said selected email based on statistics relating to said selected email and compare said selected email model and said model derived from statistics relating to said prior emails;
(iv) a communications component configured to transmit statistics relating to the selected email to a server; and
(b) a server comprising a server database configured to store statistics relating to said emails, and to transmit said statistics to said client.
- 48. The system as recited in claim 47, wherein the client database is configured to store statistics relating to a sender email address of a respective email.
- 49. The system as recited in claim 47, wherein the client database is configured to store statistics relating to a recipient email address of a respective email.
- 50. The system as recited in claim 47, wherein the client database is configured to store statistics relating to a classification of an email as violative of the email security policy of the computer system.
- 51. The system as recited in claim 47, wherein the client database is configured to store statistics relating to prior email transmitted by said one or more email accounts.
- 52. The system as recited in claim 51, wherein the client database is configured to store statistics relating to prior email transmitted by said one or more email accounts in a histogram.
- 53. The system as recited in claim 52, wherein the analysis component is configured to compare a histogram relating to said selected email to said histogram relating to said prior email.
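As a rough illustration of the histogram comparison recited in claims 16 through 21, the following Python sketch builds a histogram of prior email transmission and a histogram of selected email transmission for one email account, then compares the two. Several details are assumptions made only for this example and are not recited in the claims: the histograms are binned by hour of day, the chi-square comparison uses a symmetric chi-square distance, the Kolmogorov-Smirnov comparison is reduced to the maximum gap between cumulative histograms, and the function names and the `threshold` value are hypothetical. The Mahalanobis distance analysis is omitted because it would also need a covariance estimate.

```python
from collections import Counter
from typing import Iterable, List

HOURS = 24  # assumption: one histogram bin per hour of day


def hourly_histogram(send_hours: Iterable[int]) -> List[float]:
    """Normalized histogram of the hours (0-23) at which an account sent email."""
    counts = Counter(send_hours)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * HOURS
    return [counts.get(hour, 0) / total for hour in range(HOURS)]


def chi_square_distance(prior: List[float], selected: List[float]) -> float:
    """Symmetric chi-square distance between two normalized histograms (0 to 2)."""
    distance = 0.0
    for p, s in zip(prior, selected):
        if p + s > 0:  # skip bins that are empty in both histograms
            distance += (p - s) ** 2 / (p + s)
    return distance


def ks_statistic(prior: List[float], selected: List[float]) -> float:
    """Largest gap between the cumulative forms of two normalized histograms."""
    cum_prior = cum_selected = gap = 0.0
    for p, s in zip(prior, selected):
        cum_prior += p
        cum_selected += s
        gap = max(gap, abs(cum_prior - cum_selected))
    return gap


def classify(prior: List[float], selected: List[float], threshold: float = 0.5) -> str:
    """Label the selected behavior; the threshold is a hypothetical tuning value."""
    if chi_square_distance(prior, selected) > threshold:
        return "violative"
    return "non-violative"


# Hypothetical data: the account normally sends during office hours,
# but the selected emails were sent in the early morning.
prior = hourly_histogram([9, 10, 11, 14, 15, 16, 9, 10])
selected = hourly_histogram([2, 3, 3, 4])

print(classify(prior, prior))     # same behavior as the prior model
print(classify(prior, selected))  # behavior that departs from the prior model
```

In practice the bin definition and the threshold would have to be chosen from the statistics actually gathered for each email account; the values above are placeholders.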
CLAIM FOR PRIORITY TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Patent Application Serial No. 60/340,197, filed on Dec. 14, 2001, entitled “System for Monitoring and Tracking the Spread of Malicious E-mails,” and U.S. Provisional Patent Application Serial No. 60/312,703, filed Aug. 16, 2001, entitled “Data Mining-Based Intrusion Detection System,” which are hereby incorporated by reference in their entirety herein.
STATEMENT OF GOVERNMENT RIGHT
[0002] The present invention was made in part with support from United States Defense Advanced Research Projects Agency (DARPA), grant no. F30602-00-1-0603. Accordingly, the United States Government may have certain rights to this invention.
Provisional Applications (2)

| Number | Date | Country |
| --- | --- | --- |
| 60340198 | Dec 2001 | US |
| 60312703 | Aug 2001 | US |