This specification relates in general to the field of network security, and more particularly, to a system and method for protocol fingerprinting and reputation correlation.
The field of network security has become increasingly important in today's society. The Internet has enabled interconnection of different computer networks all over the world. The ability to effectively protect and maintain stable computers and systems, however, presents a significant obstacle for component manufacturers, system designers, and network operators. This obstacle is made even more complicated due to the continually evolving array of tactics exploited by malicious operators. Once a certain type of malicious software (e.g., a bot) has infected a host computer, a malicious operator may issue commands from a remote computer to control the malicious software. The software can be instructed to perform any number of malicious actions such as, for example, sending out spam or malicious emails from the host computer, stealing sensitive information from a business or individual associated with the host computer, propagating to other host computers, and/or assisting with distributed denial of service attacks. In addition, the malicious operator can sell or otherwise give access to other malicious operators, thereby escalating the exploitation of the host computers. Hence, significant challenges remain for developing innovative tools to combat tactics that allow malicious operators to exploit computers.
To provide a more complete understanding of the present disclosure and features and advantages thereof, reference is made to the following description, taken in conjunction with the accompanying figures, wherein like reference numerals represent like parts, in which:
Overview
A method is provided in one example embodiment that includes generating a fingerprint based on properties extracted from data packets received over a network connection and requesting a reputation value based on the fingerprint. A policy action may be taken on the network connection if the reputation value received indicates the fingerprint is associated with malicious activity. The method may additionally include displaying information about protocols based on protocol fingerprints, and more particularly, based on fingerprints of unrecognized protocols. In yet other embodiments, the reputation value may also be based on network addresses associated with the network connection.
Example Embodiments
Turning to
Each of the elements of
For purposes of illustrating the techniques of the system for network protection against malicious software, it is important to understand the activities occurring within a given network. The following foundational information may be viewed as a basis from which the present disclosure may be properly explained. Such information is offered earnestly for purposes of explanation only and, accordingly, should not be construed in any way to limit the broad scope of the present disclosure and its potential applications.
Typical network environments used in organizations and by individuals include the ability to communicate electronically with other networks using, for example, the Internet to access web pages hosted on servers connected to the Internet, to send or receive electronic mail (i.e., email) messages, or to exchange files with end users or servers connected to the Internet. Malicious users are continuously developing new tactics that use the Internet to spread malware and gain access to confidential information.
Tactics that represent an increasing threat to computer security often include botnets, which have become a serious Internet security problem. In many cases they employ sophisticated attack schemes that exploit a combination of well-known and new vulnerabilities. Botnets generally use a client-server architecture where a type of malicious software (i.e., a bot) is placed on a host computer and communicates with a command and control (C&C) server, which may be controlled by a malicious user (e.g., a botnet operator). Usually, a botnet is composed of a large number of bots that are controlled by the operator using a C&C protocol through various channels, including Internet Relay Chat (IRC) and peer-to-peer (P2P) communication. The bot may receive commands from the C&C server to perform particular malicious activities and, accordingly, may execute such commands. The bot may also send any results or pilfered information back to the C&C server.
Botnet attacks generally follow the same lifecycle. First, desktop computers are compromised by malware, often through drive-by downloads, Trojans, or un-patched vulnerabilities. Malware generally includes any software designed to access and/or control a computer without the informed consent of the computer owner, and is most commonly used as a label for any hostile, intrusive, or annoying software such as a computer virus, spyware, adware, etc. Once compromised, the computers may then be subverted into bots, giving a botmaster control over them. The botmaster may then use these computers for malicious activity, such as spamming. In addition to receiving commands to perform malicious activities, a bot also typically includes one or more propagation vectors that enable it to spread within an organization's network or across other networks to other organizations or individuals. Common propagation vectors include exploiting known vulnerabilities on hosts within the local network and sending malicious emails having a malicious program attached or providing malicious links within the emails.
Existing firewall and network intrusion prevention technologies are not always capable of recognizing and containing botnets. Current firewalls may have the ability to detect and act on traffic associated with known applications. However, a large number of threats on a network, such as advanced persistent threats (APTs), use unknown communication mechanisms, including custom protocols. Furthermore, it can be expected that existing firewalls may not be able to classify a sizeable amount of traffic on any given network with a standard set of application signatures. Thus, existing firewalls and other network intrusion prevention technologies may be unable to implement any meaningful policy decisions on unrecognized traffic.
Some reputation systems can offer a viable defense to particular botnets. In general, a reputation system monitors activity and assigns a reputation value or score to an entity based on its past behavior. The reputation value may denote different levels of trustworthiness on the spectrum from benign to malicious. For example, a connection reputation value (e.g., minimal risk, unverified, high risk, etc.) may be computed for a network address based on network connections made with the address or email originating from the address. Connection reputation systems may be used to reject email or network connections with IP addresses having an unacceptable connection reputation, such as one that indicates an IP address is known or likely to be associated with malicious activity. Other reputation systems can block activity of applications having hashes known or likely to be associated with malicious activity. However, connection reputation lookups may be driven purely by network traffic and other reputation lookups may not consider any network traffic.
In accordance with one embodiment, network environment 10 can overcome these shortcomings (and others) by fingerprinting protocols and correlating reputation data. For example, network environment 10 may provide a mechanism for fingerprinting unrecognized protocols based on particular properties indicative of the protocol, and global threat intelligence (GTI) data can be used to guide policy decisions on traffic that uses the unrecognized protocols. Such GTI data can include protocol reputation, reputation of external addresses contacted, or geographic breakdown for unknown protocol traffic, for example.
More particularly, a protocol fingerprint may be generated for an unrecognized protocol on the network. An unrecognized protocol broadly includes protocols not already having a fingerprint or not associated with an application having an existing signature, for instance. A protocol fingerprint can be derived from properties extracted from the observed traffic using the protocol, and can be sent along with connection data to a threat intelligence server. The threat intelligence server may return a reputation value that is based on the connection data and the protocol fingerprint. Thus, protocol reputation can make information on unrecognized traffic flows actionable, including previously fingerprinted traffic flows and flows for which an application signature is available.
Turning to
In one example implementation, endhosts 20a-b, remote host 25a, and/or threat intelligence server 30 are network elements, which are meant to encompass network appliances, servers, firewalls, routers, switches, gateways, bridges, load-balancers, processors, modules, or any other suitable device, component, element, or object operable to exchange information in a network environment. Firewall 22 may also be integrated or combined with another network element as appropriate. Network elements may include any suitable hardware, software, components, modules, interfaces, or objects that facilitate the operations thereof. This may be inclusive of appropriate algorithms and communication protocols that allow for the effective exchange of data or information. However, endhosts 20a-b generally may be distinguished from other network elements, as they tend to serve as a terminal point for a network connection, in contrast to a gateway or firewall. Endhosts are inclusive of wired and wireless network endpoints, such as desktop computers, laptop computers, tablet computers (e.g., iPads), e-book readers, mobile phones, smart phones (e.g., iPhones, Android phones, etc.) and other similar devices. Remote host 25a may similarly serve as a terminal point for a network connection and may be inclusive of such devices.
In regard to the internal structure associated with network environment 10, each of endhosts 20a-b, firewall 22, remote host 25a, and/or threat intelligence server 30 can include memory elements (as shown in
In one example implementation, endhosts 20a-b, firewall 22, remote host 25a, and/or threat intelligence server 30 include software (e.g., as part of fingerprinting engine 42, etc.) to achieve, or to foster, operations as outlined herein. In other embodiments, such operations may be carried out externally to these elements, or included in some other network element to achieve the intended functionality. Alternatively, these elements may include software (or reciprocating software) that can coordinate in order to achieve the operations, as outlined herein. In still other embodiments, one or all of these devices may include any suitable algorithms, hardware, software, components, modules, interfaces, or objects that facilitate the operations thereof.
Note that in certain example implementations, the functions outlined herein may be implemented by logic encoded in one or more tangible, non-transitory media (e.g., embedded logic provided in an application specific integrated circuit (ASIC), digital signal processor (DSP) instructions, software (potentially inclusive of object code and source code) to be executed by a processor, or other similar machine, etc.). In some of these instances, memory elements (as shown in
A fingerprint may be generated by extracting various behavior properties of traffic protocols observed on a network, such as inbound network traffic 302a and outbound network traffic 302b. For example, fingerprinting engine 42 may observe a number of data packets received over a network connection, and record the query/response ratio (e.g., by packet count and/or size) of the traffic as one fingerprint characteristic. Fingerprinting engine 42 may also characterize the protocol as stream-based or message-based, based on properties such as packet size distribution, and record that information as a fingerprint characteristic. For example, large downloads in a stream-based protocol are likely to be broken into a large number of packets having the maximum packet size. In contrast, message-based protocols are likely to be composed of smaller packets with variable sizes. Likewise, traffic may be characterized as ASCII or binary and incorporated into a fingerprint. Other examples of fingerprint properties include the transport protocol, the first token (e.g., “GET” in an ASCII protocol), the first X number of bytes, the last X number of bytes of the first line, and the last token of the first line (e.g., “HTTP/1.1\r\n” for hypertext transfer protocol). Entropy (i.e., amount of randomness) of packet content is yet another example of a protocol property that can be observed and fingerprinted. Packets consisting mostly of English text, for instance, may have substantial redundancy and thus low entropy, while a compressed or encrypted file may have very little redundancy and thus high entropy.
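By way of illustration only, several of the properties described above can be computed as in the following minimal Python sketch. The packet representation, the names `ProtocolFingerprint`, `fingerprint`, and `framing`, and the thresholds (95% printable bytes, a 1460-byte maximum payload, a 50% full-size share) are assumptions introduced for this example and are not taken from fingerprinting engine 42.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

# Assumed representation for this sketch: (direction, payload), where
# direction is "out" (endhost to remote host) or "in" (remote host to endhost).
Packet = Tuple[str, bytes]


def shannon_entropy(data: bytes) -> float:
    """Return entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def is_mostly_ascii(data: bytes, threshold: float = 0.95) -> bool:
    """Treat a payload as ASCII if most bytes are printable or whitespace."""
    if not data:
        return False
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data) >= threshold


def framing(sizes: List[int], max_size: int = 1460, threshold: float = 0.5) -> str:
    """Guess stream vs. message framing from the share of full-size packets."""
    if not sizes:
        return "unknown"
    full = sum(1 for s in sizes if s >= max_size)
    return "stream" if full / len(sizes) >= threshold else "message"


@dataclass(frozen=True)
class ProtocolFingerprint:
    transport: str
    query_response_ratio: float  # outbound bytes divided by inbound bytes
    framing: str                 # "stream", "message", or "unknown"
    encoding: str                # "ascii" or "binary"
    first_token: bytes           # first whitespace-delimited token observed
    entropy_bits: float          # bits per byte over all payloads


def fingerprint(transport: str, packets: List[Packet]) -> ProtocolFingerprint:
    out_bytes = sum(len(p) for d, p in packets if d == "out")
    in_bytes = sum(len(p) for d, p in packets if d == "in")
    payload = b"".join(p for _, p in packets)
    tokens = packets[0][1].split(None, 1) if packets else []
    return ProtocolFingerprint(
        transport=transport,
        query_response_ratio=out_bytes / in_bytes if in_bytes else float("inf"),
        framing=framing([len(p) for _, p in packets]),
        encoding="ascii" if is_mostly_ascii(payload) else "binary",
        first_token=tokens[0] if tokens else b"",
        entropy_bits=round(shannon_entropy(payload), 2),
    )
```

Calling `fingerprint("tcp", packets)` on a short HTTP exchange would yield a first token of `b"GET"` and an ASCII encoding, whereas an encrypted custom protocol would tend toward a binary encoding with entropy near 8 bits per byte.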
Other distinguishing properties can include the first two bytes of packets (e.g., if they are length pointers), key-colon-value-newline formatting, Abstract Syntax Notation One (ASN.1) encoded data, order of exchange (i.e., client or server sends first message), numerical values as first bytes in a packet (e.g., “200 OK” for hypertext transfer protocol), messages that begin with a magic number, negotiation-before-stream pattern (e.g., small packets exchanged before streaming data), transaction identifiers (e.g., first two bytes from a client are same as first two bytes from the server), and type-length-value (TLV) or length-value (LV) format.
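A few of these distinguishing properties lend themselves to simple heuristic checks. The Python sketch below is one possible illustration; the function names are hypothetical, and the choices of a big-endian two-byte length field and a two-line minimum for key-colon-value detection are assumptions made for the example rather than details of the disclosed embodiments.

```python
import re
from typing import List

# One "key: value" line; the key alphabet is an assumption for this sketch.
KEY_VALUE_LINE = re.compile(rb"^[A-Za-z0-9_-]+:[ \t]*[^\r\n]*$")


def looks_key_colon_value(payload: bytes, min_lines: int = 2) -> bool:
    """True if every non-empty line follows key-colon-value-newline formatting."""
    lines = [ln for ln in payload.splitlines() if ln]
    return len(lines) >= min_lines and all(KEY_VALUE_LINE.match(ln) for ln in lines)


def has_length_prefix(payload: bytes) -> bool:
    """True if the first two bytes (assumed big-endian) equal the remaining length."""
    if len(payload) < 2:
        return False
    return int.from_bytes(payload[:2], "big") == len(payload) - 2


def shares_transaction_id(client_msg: bytes, server_msg: bytes) -> bool:
    """True if the first two bytes from the client match those from the server."""
    return len(client_msg) >= 2 and client_msg[:2] == server_msg[:2]


def client_speaks_first(directions: List[str]) -> bool:
    """Order of exchange: True if the first observed packet is outbound ("out")."""
    return bool(directions) and directions[0] == "out"
```

Each check returns a boolean that could be recorded as one characteristic of a fingerprint; a single positive result is weak evidence on its own, so such checks would normally be combined with the other properties.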
A reputation query may be transmitted at 308 to threat intelligence server 30, for example, across a network, such as Internet 15. Reputation query 308 may include connection data and the protocol fingerprint, for example. Connection data can include various parameters that identify the network connection, such as network addresses. Network addresses generally include data that identifies both the endhost and the remote end of the connection, such as the local (endhost) IP address and port and the remote host IP address and port. Threat intelligence server 30 may correlate the protocol fingerprint with reputation data 310 at 312, and return a response at 314.
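As a non-limiting illustration, a reputation query carrying connection data and a protocol fingerprint might be assembled as in the Python sketch below. The JSON layout, the field names, and the use of a SHA-256 digest as a compact fingerprint identifier are assumptions made for this example; the disclosure does not specify a wire format for the query at 308.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class ConnectionData:
    """Network addresses identifying both ends of the connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


def fingerprint_id(properties: Dict[str, str]) -> str:
    """Derive a stable identifier from fingerprint properties (assumed scheme)."""
    canonical = json.dumps(properties, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def build_reputation_query(conn: ConnectionData, properties: Dict[str, str]) -> str:
    """Serialize connection data and the fingerprint into one query document."""
    return json.dumps(
        {
            "connection": asdict(conn),
            "fingerprint_id": fingerprint_id(properties),
            "fingerprint": properties,
        },
        sort_keys=True,
    )
```

Sorting keys before hashing keeps the identifier independent of the order in which properties were recorded, so that two sensors observing the same protocol would submit the same identifier for correlation.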
The response at 314 may include a reputation value, which can denote different levels of trustworthiness on the spectrum from benign to malicious based on the reputation of the protocol fingerprint (i.e., a protocol reputation) and/or a network address associated with the connection (i.e., a connection reputation), and may further indicate whether a connection should be allowed. If the query response indicates that the connection is probably benign, then the connection can be allowed, but if the response indicates that the connection may be malicious, then appropriate policy action may be taken based on policy. For example, appropriate action may include blocking the connection, alerting a user or administrator, or recording the fingerprint and other network information in a log for subsequent forensic analysis.
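The mapping from a reputation value to a policy action can be sketched as follows. This Python example is illustrative only: the three reputation levels mirror the example values mentioned earlier (minimal risk, unverified, high risk), while the rule of taking the worse of the protocol reputation and the connection reputation, and the default policy table itself, are assumptions that an administrator would be expected to tune.

```python
from enum import Enum
from typing import Dict


class Reputation(Enum):
    MINIMAL_RISK = 0
    UNVERIFIED = 1
    HIGH_RISK = 2


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"   # allow, but notify an administrator and log for forensics
    BLOCK = "block"


# Assumed default policy; a deployment could supply its own table.
DEFAULT_POLICY: Dict[Reputation, Action] = {
    Reputation.MINIMAL_RISK: Action.ALLOW,
    Reputation.UNVERIFIED: Action.ALERT,
    Reputation.HIGH_RISK: Action.BLOCK,
}


def combined_reputation(protocol_rep: Reputation, connection_rep: Reputation) -> Reputation:
    """Assumed combination rule: the less trustworthy value wins."""
    return max(protocol_rep, connection_rep, key=lambda r: r.value)


def policy_action(
    protocol_rep: Reputation,
    connection_rep: Reputation,
    policy: Dict[Reputation, Action] = DEFAULT_POLICY,
) -> Action:
    """Select the policy action for a connection from its two reputations."""
    return policy[combined_reputation(protocol_rep, connection_rep)]
```

Under the assumed default table, a connection to a well-regarded address that nevertheless uses a protocol fingerprint with a high-risk reputation would be blocked, which reflects the value of correlating the two reputations.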
Alternatively or additionally, at 316 a user interface may display information about all unknown protocols based on protocol fingerprints, as well as statistics on all known applications/protocols, which may be based on application signatures in certain embodiments. For example, if there are ten unique protocols with associated fingerprints, the user interface can display information on each of those unknown protocols. An administrator or other user may also transmit a query to threat intelligence server 30 through the user interface at 318 to retrieve additional data associated with the protocol fingerprint, thus further enriching the information with GTI data. For example, the query may retrieve global data for a protocol fingerprint, such as the geographic distribution of remote host addresses (i.e., the countries in which external IP addresses that speak the unknown protocol are located), the reputation of remote host addresses using the unknown protocol (e.g., bad or unknown), and the number of sites reporting the unknown protocol (which may indicate whether the protocol is a local phenomenon or a targeted threat). Thus, as a more particular example, if an American bank fingerprints an unknown protocol and threat intelligence server 30 determines that the fingerprint is most frequently used by network addresses in Russia, then an administrator can block the unknown protocol based on this information. Moreover, as additional intelligence about the unknown protocol is collected, descriptive information and metadata can be associated with the protocol. For example, if an unknown protocol is seen by a firewall and submitted to a threat intelligence server, the threat intelligence server may alert the firewall or an administrator if the protocol is subsequently associated with an APT.
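The global data described above can be pictured as a simple aggregation over reported sightings of a fingerprint. The following Python sketch is a hypothetical illustration; the `Sighting` record, its field names, and the `summarize` function are assumptions for this example and do not describe the internal data model of threat intelligence server 30.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class Sighting:
    """One report of a fingerprint being observed (assumed record layout)."""
    fingerprint_id: str
    reporting_site: str      # site that submitted the fingerprint
    remote_country: str      # country of the remote host address
    remote_reputation: str   # e.g., "bad" or "unknown"


def summarize(fingerprint_id: str, sightings: Iterable[Sighting]) -> Dict[str, Any]:
    """Aggregate geography, remote reputation, and reporting-site count."""
    matching = [s for s in sightings if s.fingerprint_id == fingerprint_id]
    return {
        "countries": Counter(s.remote_country for s in matching),
        "reputations": Counter(s.remote_reputation for s in matching),
        "reporting_sites": len({s.reporting_site for s in matching}),
    }
```

A summary showing a single reporting site may suggest a local phenomenon or a targeted threat, while a summary concentrated in one country with mostly bad remote reputations gives an administrator a concrete basis for a blocking decision.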
Note that with the examples provided above, as well as numerous other potential examples, interaction may be described in terms of two, three, or four network elements. However, this has been done for purposes of clarity and example only. In certain cases, it may be easier to describe one or more of the functionalities of a given set of operations by only referencing a limited number of network elements. It should be appreciated that network environment 10 is readily scalable and can accommodate a large number of components, as well as more complicated/sophisticated arrangements and configurations. Accordingly, the examples provided should not limit the scope or inhibit the broad teachings of network environment 10 as potentially applied to a myriad of other architectures. Additionally, although described with reference to particular scenarios, where a particular module, such as a fingerprinting engine, is provided within a network element, these modules can be provided externally, or consolidated and/or combined in any suitable fashion. In certain instances, such modules may be provided in a single proprietary unit.
It is also important to note that the steps in the appended diagrams illustrate only some of the possible scenarios and patterns that may be executed by, or within, network environment 10. Some of these steps may be deleted or removed where appropriate, or these steps may be modified or changed considerably without departing from the scope of teachings provided herein. In addition, a number of these operations have been described as being executed concurrently with, or in parallel to, one or more additional operations. However, the timing of these operations may be altered considerably. The preceding operational flows have been offered for purposes of example and discussion. Substantial flexibility is provided by network environment 10 in that any suitable arrangements, chronologies, configurations, and timing mechanisms may be provided without departing from the teachings provided herein.
Numerous other changes, substitutions, variations, alterations, and modifications may be ascertained to one skilled in the art and it is intended that the present disclosure encompass all such changes, substitutions, variations, alterations, and modifications as falling within the scope of the appended claims. In order to assist the United States Patent and Trademark Office (USPTO) and, additionally, any readers of any patent issued on this application in interpreting the claims appended hereto, Applicant wishes to note that the Applicant: (a) does not intend any of the appended claims to invoke paragraph six (6) of 35 U.S.C. section 112 as it exists on the date of the filing hereof unless the words “means for” or “step for” are specifically used in the particular claims; and (b) does not intend, by any statement in the specification, to limit this disclosure in any way that is not otherwise reflected in the appended claims.
Number | Date | Country |
---|---|---|
20120331556 A1 | Dec 2012 | US |