Systems and methods for analyzing PDF documents

Information

  • Patent Grant
  • 9954890
  • Patent Number
    9,954,890
  • Date Filed
    Friday, September 2, 2016
  • Date Issued
    Tuesday, April 24, 2018
Abstract
A system and method for detecting malicious activity within a Portable Document Format (PDF) document. The system includes a parser and one or more virtual machines. The parser, when executed by a hardware processor, examines one or more portions of the PDF document to determine if one or more suspicious characteristics indicative of malicious network content are included in the one or more examined portions of the PDF document. The examined portion(s) in total are less than an entirety of the PDF document. The virtual machine(s) are adapted to receive the PDF document in response to the one or more examined portions of the PDF document being determined to include one or more suspicious characteristics indicative of malicious network content. The virtual machine(s) process at least the one or more examined portions of the PDF document so as to determine whether the PDF document includes malicious network content.
Description
BACKGROUND
1. Field

The present invention relates generally to network security and more particularly to detecting malicious network content.


2. Discussion of Related Art

Presently, malicious network content (e.g., malicious software or malware) can attack various devices via a communication network. For example, malware may include any program or file that is harmful to a computer user, such as bots, computer viruses, worms, Trojan horses, adware, spyware, or any programming that gathers information about a computer user or otherwise operates without permission.


Adware is a program configured to direct advertisements to a computer or a particular user. In one example, adware identifies the computer and/or the user to various websites visited by a browser on the computer. The website may then use the adware to either generate pop-up advertisements or otherwise direct specific advertisements to the user's browser. Spyware is a program configured to collect information regarding the user, the computer, and/or a user's network habits. In an example, spyware may collect information regarding the names and types of websites that the user browses and then transmit the information to another computer. Adware and spyware are often added to the user's computer after the user browses to a website that hosts the adware and/or spyware. The user is often unaware that these programs have been added and is similarly unaware of the adware and/or spyware's function.


Various processes and devices have been employed to prevent the problems that malicious network content can cause. For example, computers often include antivirus scanning software that scans a particular client device for viruses. Computers may also include spyware and/or adware scanning software. The scanning may be performed manually or based on a schedule specified by a user associated with the particular computer, a system administrator, and so forth. Unfortunately, by the time a virus or spyware is detected by the scanning software, some damage on the particular computer or loss of privacy may have already occurred.


In some instances, malicious network content comprises a bot. A bot is a software robot configured to remotely control all or a portion of a digital device (e.g., a computer) without authorization by the digital device's legitimate owner. Bot related activities include bot propagation and attacking other computers on a network. Bots commonly propagate by scanning nodes (e.g., computers or other digital devices) available on a network to search for a vulnerable target. When a vulnerable computer is scanned, the bot may install a copy of itself. Once installed, the new bot may continue to seek other computers on a network to infect. A bot may also be propagated by a malicious web site configured to exploit vulnerable computers that visit its web pages.


A bot may also, without the authority of the infected computer user, establish a command and control communication channel to receive instructions. Bots may receive command and control communication from a centralized bot server or another infected computer (e.g., via a peer-to-peer (P2P) network established by a bot on the infected computer). When a plurality of bots (i.e., a botnet) act together, the infected computers (i.e., zombies) can perform organized attacks against one or more computers on a network, or engage in criminal enterprises. In one example, bot infected computers may be directed to flood another computer on a network with excessive traffic in a denial-of-service attack. In another example, upon receiving instructions, one or more bots may direct the infected computer to transmit spam across a network. In a third example, bots may host illegal businesses such as pharmaceutical websites that sell pharmaceuticals without a prescription.


Malicious network content may be distributed over a network via web sites, e.g., servers operating on a network according to an HTTP standard. Malicious network content distributed in this manner may be actively downloaded and installed on a user's computer, without the approval or knowledge of the user, simply by accessing the web site hosting the malicious network content. The web site hosting the malicious network content may be referred to as a malicious web site. The malicious network content may be embedded within data associated with web pages hosted by the malicious web site. For example, a web page may include JavaScript code, and malicious network content may be embedded within the JavaScript code. In this example, the malicious network content embedded within the JavaScript code may be obfuscated such that it is not apparent until the JavaScript code is executed that the JavaScript code contains malicious network content. Therefore, the malicious network content may attack or infect a user's computer before detection by antivirus software, firewalls, intrusion detection systems, or the like.


Beginning in or about 2009, it became a widespread practice for the authors of bots to use malicious documents in the Portable Document Format (PDF) of Adobe Systems Inc. to propagate web-borne attacks. Malicious PDF documents were hosted on web servers controlled by criminals, and links to them were then created from many other websites. Innocent users could therefore unknowingly browse a website that would cause a malicious PDF to be loaded into their browser, and from there into a PDF reader, which it would then exploit in order to gain control of the user's computer account, or the entire computer. From there, malicious bot software would be installed.


SUMMARY

According to some embodiments, the present invention may be directed to methods for detecting malicious portable document format (PDF) network content. The methods may include at least the steps of (a) examining at least a portion of received PDF network content to determine if one or more suspicious characteristics indicative of malicious network content are included in the at least a portion of PDF network content; and (b) if the at least a portion of PDF network content is determined to include one or more suspicious characteristics indicative of malicious network content, providing the at least a portion of PDF network content to one or more virtual machines to verify the inclusion of malicious network content in the at least a portion of PDF network content. Other embodiments are further described and set forth in the claims below.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a diagram of an exemplary malicious network content detection environment 100.



FIG. 2 illustrates an exemplary analysis environment.



FIG. 3 illustrates an exemplary method for detecting malicious network content.



FIG. 4 illustrates another exemplary method for detecting malicious network content.



FIG. 5 illustrates an exemplary controller.



FIG. 6 illustrates an exemplary malicious PDF network content detection environment.



FIG. 7 illustrates an exemplary method for detecting malicious PDF network content.





DETAILED DESCRIPTION

Network content may include any data transmitted over a network (i.e., network data). Network data may include text, software, images, audio, or other digital data. An example of network content includes web content, or any network data that may be transmitted using a Hypertext Transfer Protocol (HTTP), HyperText Markup Language (HTML) protocol, or be transmitted in a manner suitable for display on a web browser software application. Another example of network content includes email messages, which may be transmitted using an email protocol such as Simple Mail Transfer Protocol (SMTP), Post Office Protocol version 3 (POP3), or Internet Message Access Protocol (IMAP4). A further example of network content includes Instant Messages, which may be transmitted using an Instant Messaging protocol such as Session Initiation Protocol (SIP) or Extensible Messaging and Presence Protocol (XMPP). In addition, network content may include any network data that is transferred using other data transfer protocols, such as File Transfer Protocol (FTP). We distinguish network content from network protocol header information used for addressing, routing, and otherwise delivering the network content.


To detect malicious network content (e.g., malicious web content) being transmitted over a communication network to a computing device, a virtual machine may be used to simulate the receipt and processing of network content on the receiving system. A determination may be made as to whether the network content is malicious based on a response of the virtual machine to the network content. Sometimes, suspicious network content is determined to be non-malicious. Processing the suspicious network content in a virtual machine is an important step to determine whether the suspicious network content is in fact malicious and prevent a false assumption that the suspicious network content is malicious. False positives in detecting malicious network content may be avoided by processing suspicious network content in a virtual machine and detecting malicious network content by analyzing the virtual machine's response to the suspicious network content.


In the prior art, a proxy may be used in the network between the computing device and a web server hosting the malicious network content. The proxy may intercept a request for network content issued by a web browser executing on the computing device. The proxy may then issue the request to the web server as a proxy on behalf of the computing device. The proxy may receive a response to the request from the web server. The proxy may then process a data exchange including the request and response on a virtual machine and evaluate the virtual machine's response to the data exchange to detect malicious network content. If no malicious network content is detected, the proxy may forward the requested network content to the computing device from which the original request originated.


Because each data exchange is processed using a virtual machine, this approach is highly computation intensive, and is not scalable for large numbers of computing devices on a network. Also, because the requested network content is not delivered to the computing device until after it has been determined that the requested network content does not include malicious network content, a significant delay is introduced between the request for network content and the delivery of the requested network content.


Provos et al. (N. Provos, P. Mavrommatis, M. A. Rajab, and F. Monrose, “All your iFRAMEs Point to Us,” Google Technical Report Provos-2008a, Feb. 4, 2008) reported on an analysis of web malware using a large web repository and corpus of malicious URLs. Provos et al. collected data for the analysis by first using a machine-learning framework in a pre-processing phase to extract features from web pages in the web repository and translate the features into a likelihood score. Next, a virtual machine was used in a verification phase to verify candidates identified by the machine-learning framework. Approximately 0.1% of the web pages in the web repository were processed by the virtual machine in the verification phase. Provos et al. noted that exhaustive inspection of each URL in the repository is prohibitively expensive. The system used by Provos et al. relied on a crawler proceeding gradually through the web to gather data in the repository for inspection, and could not inspect and select web pages in transit in the network for examination in a virtual machine.



FIG. 1 is a diagram of an exemplary malicious network content detection environment 100. The malicious network content detection environment 100 comprises a server device 105, a client device 110, and a tap 115 (also known as “a data access component”), each coupled to a communication network 120. In various embodiments, there may be multiple server devices 105 and multiple client devices 110. The tap 115 is further coupled to a malicious network content detection system 125. The malicious network content detection system 125 may monitor exchanges of network content (e.g., web content) rather than intercepting and holding the network content until after determining whether the network content includes malicious network content. The malicious network content detection system 125 may be configured to inspect exchanges of network content over the communication network 120, identify suspicious network content, and analyze the suspicious network content using a virtual machine to detect malicious network content. In this way, the malicious network content detection system 125 may be computationally efficient and scalable as data traffic volume and a number of computing devices communicating over the communication network 120 increase. Therefore, the malicious network content detection system 125 may not become a bottleneck in the malicious network content detection environment 100.


The communication network 120 may include a public computer network such as the Internet, or a private computer network such as a wireless telecommunication network, wide area network, or local area network, or a combination of networks. Though the communication network 120 may include any type of network and be used to communicate different types of data, communications of web data may be discussed below for purposes of example.


The server device 105 and the client device 110 may include digital devices. Some examples of digital devices include computers, servers, laptops, personal digital assistants, and cellular telephones. The server device 105 may be configured to transmit network data over the communication network 120 to the client device 110. The client device 110 may be configured to receive the network data from the server device 105. The network data may include network content, such as web pages transmitted using a network communications protocol (e.g., Hypertext Transfer Protocol, or HTTP). In various embodiments, the server device 105 may include a web server configured to provide network content. The client device 110 may include a web browser configured to retrieve and/or display network content.


The tap 115 may include a digital data tap configured to monitor network data and provide a copy of the network data to the malicious network content detection system 125. Network data may comprise signals and data that are transmitted over the communication network 120 including data flows from the server device 105 to the client device 110. In one example, the tap 115 monitors and copies the network data without an appreciable decline in performance of the server device 105, the client device 110, or the communication network 120. The tap 115 may copy any portion of the network data. For example, the tap 115 may receive and copy any number of data packets from the network data.


In some embodiments, the network data may be organized into one or more data flows and provided to the malicious network content detection system 125. In various embodiments, the tap 115 may sample the network data based on a sampling scheme. Data flows may then be reconstructed based on the network data samples.


The tap 115 may also capture metadata from the network data. The metadata may be associated with the server device 105 and/or the client device 110. For example, the metadata may identify the server device 105 and/or the client device 110. In some embodiments, the server device 105 transmits metadata, which is captured by the tap 115. In other embodiments, a heuristic module 130 (described herein) may determine the server device 105 and the client device 110 by analyzing data packets within the network data in order to generate the metadata.


The malicious network content detection system 125 may include a digital device, software, or a combination thereof that receives network data from the tap 115. The malicious network content detection system 125 includes a heuristic module 130, a heuristics database 135, a scheduler 140, a virtual machine pool 145, and an analysis environment 150. In some embodiments, the tap 115 may be contained within the malicious network content detection system 125.


The heuristic module 130 receives the copy of the network data from the tap 115 and applies heuristics to the data to determine if the network data might contain suspicious network content. The heuristics applied by the heuristic module 130 may be based on data and/or rules stored in the heuristics database 135. In one example, the heuristic module 130 flags network data as suspicious after applying a heuristic analysis. The network data may then be buffered and organized into a data flow. The data flow may then be provided to the scheduler 140. In some embodiments, the suspicious network data is provided directly to the scheduler 140 without buffering or organizing the data flow. In other embodiments, a notification of a group of data flows (e.g., a set of related web page requests and responses) may be sent to the scheduler 140 for later retrieval by the virtual machine.


The heuristic module 130 may perform one or more heuristic analyses on the network data. The heuristic module 130 may retain data packets belonging to a particular data flow previously copied by the tap 115. In one example, the heuristic module 130 receives data packets from the tap 115 and stores the data packets within a buffer or other memory. Once the heuristic module 130 receives a predetermined number of data packets from a particular data flow, the heuristic module 130 performs the heuristics and/or probability analysis.


In some embodiments, the heuristic module 130 performs a heuristic analysis on a set of data packets belonging to a data flow and then stores the data packets within a buffer or other memory. The heuristic module 130 may then continue to receive new data packets belonging to the same data flow. Once a predetermined number of new data packets belonging to the same data flow are received, the heuristic analysis may be performed upon the combination of buffered and new data packets to determine a likelihood of suspicious network content.
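The buffering behavior described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `FlowBuffer`, the `batch_size` parameter, and the single substring heuristic are assumptions made for illustration only. The sketch re-runs the heuristic on the combination of buffered and new packets each time a predetermined number of new packets arrives for a data flow.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class FlowBuffer:
    """Hypothetical per-flow packet buffer that re-runs a heuristic per batch."""

    def __init__(self, heuristic: Callable[[bytes], bool], batch_size: int = 3):
        self.heuristic = heuristic
        self.batch_size = batch_size  # the "predetermined number" of new packets
        self.packets: Dict[str, List[bytes]] = defaultdict(list)
        self.pending: Dict[str, int] = defaultdict(int)

    def add_packet(self, flow_id: str, payload: bytes) -> Optional[bool]:
        """Store a packet; return a verdict when a batch completes, else None."""
        self.packets[flow_id].append(payload)
        self.pending[flow_id] += 1
        if self.pending[flow_id] < self.batch_size:
            return None
        self.pending[flow_id] = 0
        # Analyze previously buffered packets together with the new ones.
        return self.heuristic(b"".join(self.packets[flow_id]))


def looks_suspicious(data: bytes) -> bool:
    # Illustrative heuristic only: one character sequence.
    return b"eval(unescape(" in data


buf = FlowBuffer(looks_suspicious, batch_size=2)
chunks = (b"<script>ev", b"al(unescape(", b"'%41'))", b"</script>")
results = [buf.add_packet("flow-1", chunk) for chunk in chunks]
```

Because the analysis runs over the joined payloads, a character sequence split across two packets is still found, which a per-packet check would miss.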


In some embodiments, an optional buffer receives the flagged network data from the heuristic module 130. The buffer may be used to store and organize the flagged network data into one or more data flows before providing the one or more data flows to the scheduler 140. In various embodiments, the buffer is used to store network data until the network data is provided to the scheduler 140. In one example, the buffer stores the network data to allow other components of the malicious network content detection system 125 time to complete functions or otherwise clear data congestion.


In some embodiments, the heuristic module 130 may maintain copies of network content data of potential interest to virtual machines and provide the network content data on request (e.g., when a web browser later executes inside a virtual machine and requests entities that were transmitted on the network earlier). The length of time that the heuristic module 130 keeps this data in memory may be based on how suspicious the data is, how much workload the system is under, and/or other factors.


The scheduler 140 may identify the client device 110 and retrieve a virtual machine associated with the client device 110. A virtual machine is software that is configured to mimic the performance of a device (e.g., the client device 110). The virtual machine may be retrieved from the virtual machine pool 145. Furthermore, the scheduler 140 may identify a web browser running on the client device 110, and retrieve a virtual machine associated with the web browser.


In some embodiments, the heuristic module 130 transmits the metadata identifying the client device 110 to the scheduler 140. In other embodiments, the scheduler 140 receives one or more data packets of the network data from the heuristic module 130 and analyzes the one or more data packets to identify the client device 110. In yet other embodiments, the metadata may be received from the tap 115.


The scheduler 140 may retrieve and configure the virtual machine to mimic the pertinent performance characteristics of the client device 110. In one example, the scheduler 140 configures the characteristics of the virtual machine to mimic only those features of the client device 110 that are affected by the network data copied by the tap 115. The scheduler 140 may determine the features of the client device 110 that are affected by the network data by receiving and analyzing the network data from the tap 115. Such features of the client device 110 may include ports that are to receive the network data, select device drivers that are to respond to the network data, and any other devices coupled to or contained within the client device 110 that can respond to the network data. In other embodiments, the heuristic module 130 may determine the features of the client device 110 that are affected by the network data by receiving and analyzing the network data from the tap 115. The heuristic module 130 may then transmit the features of the client device to the scheduler 140.


The virtual machine pool 145 may be configured to store one or more virtual machines. The virtual machine pool 145 may include software and/or a storage medium capable of storing software. In one example, the virtual machine pool 145 stores a single virtual machine that can be configured by the scheduler 140 to mimic the performance of any client device 110 on the communication network 120. The virtual machine pool 145 may store any number of distinct virtual machines that can be configured to simulate the performance of a wide variety of client devices 110.
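As a rough illustration of how a scheduler might draw on such a pool, the following Python sketch retrieves a virtual machine profile matching the client's browser and configures only the ports affected by the network data. All names (`VirtualMachine`, `VirtualMachinePool`, `schedule`, the `"generic"` fallback profile, and the metadata keys) are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class VirtualMachine:
    profile: str
    open_ports: List[int] = field(default_factory=list)


class VirtualMachinePool:
    """Hypothetical pool holding one base image per client profile."""

    def __init__(self, profiles: Iterable[str]):
        self._images: Dict[str, VirtualMachine] = {
            name: VirtualMachine(name) for name in profiles
        }

    def retrieve(self, profile: str) -> VirtualMachine:
        # Fall back to a generic image when no matching profile is stored.
        base = self._images.get(profile, self._images["generic"])
        # Return a fresh copy so the stored image is never modified.
        return VirtualMachine(base.profile, list(base.open_ports))


def schedule(pool: VirtualMachinePool, client_metadata: dict) -> VirtualMachine:
    """Pick a VM for the client and mimic only the affected ports."""
    vm = pool.retrieve(client_metadata.get("browser", "generic"))
    vm.open_ports = sorted(set(client_metadata.get("ports", [])))
    return vm


pool = VirtualMachinePool(["generic", "ie8"])
vm = schedule(pool, {"browser": "ie8", "ports": [80, 443, 80]})
```

Copying the base image on retrieval is a design choice of this sketch. It keeps each analysis run isolated from changes made to a virtual machine by earlier runs.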


The analysis environment 150 simulates the receipt and/or display of the network content from the server device 105 after the network content is received by the client device 110 to analyze the effects of the network content upon the client device 110. The analysis environment 150 may identify the effects of malware or malicious network content by analyzing the simulation of the effects of the network content upon the client device 110 that is carried out on the virtual machine. There may be multiple analysis environments 150 to simulate multiple streams of network content. The analysis environment 150 is further discussed with respect to FIG. 2.


Although FIG. 1 depicts data transmitted from the server device 105 to the client device 110, either device can transmit and receive data from the other. Similarly, although only two devices are depicted, any number of devices can send and/or receive data across the communication network 120. Moreover, the tap 115 can monitor and copy data transmitted from multiple devices without appreciably affecting the performance of the communication network 120 or the devices coupled to the communication network 120.



FIG. 2 illustrates an exemplary analysis environment 150. The analysis environment 150 includes a replayer 205, a virtual switch 210, and a virtual machine 215. The replayer 205 receives network content that has been flagged by the heuristic module 130 and provides the network content to the virtual machine 215 via the virtual switch 210 (i.e., replays the network content) in the analysis environment 150. In some embodiments, the replayer 205 mimics the behavior of the server device 105 in transmitting the flagged network content. There may be any number of replayers 205 simulating the transmission of network content between the server device 105 and the client device 110. In a further embodiment, the replayer 205 dynamically modifies session variables, as is appropriate, to emulate a “live” client or server of the protocol sequence being replayed. In one example, dynamic variables that may be dynamically substituted include dynamically assigned ports, transaction IDs, and any other variable that is dynamic to each protocol session.


The virtual switch 210 may include software that is capable of forwarding packets of flagged network content to the virtual machine 215. In one example, the replayer 205 simulates the transmission of the data flow by the server device 105. The virtual switch 210 simulates the communication network 120, and the virtual machine 215 simulates the client device 110. The virtual switch 210 may route the data packets of the data flow to the correct ports of the virtual machine 215.


In some embodiments, requests for data from client software in the virtual machine 215 (e.g., a web browser) may be proxied by the replayer to the heuristic module 130 where the data has been cached, and a response from the heuristic module 130 may then be proxied back to the client software executing in the virtual machine 215.


The virtual machine 215 includes a representation of the client device 110 that may be provided to the analysis environment 150 by the scheduler 140. In one example, the scheduler 140 retrieves an instance of the virtual machine 215 from the virtual machine pool 145 and configures the virtual machine 215 to mimic a client device 110. The configured virtual machine 215 is then provided to the analysis environment 150 where it may receive flagged network content from the virtual switch 210.


As the analysis environment 150 simulates the transmission and reception of the network content, behavior of the virtual machine 215 can be closely monitored for unauthorized activity. If the virtual machine 215 crashes, performs illegal operations, performs abnormally, or allows access of data to an unauthorized entity (e.g., an unauthorized computer user, a bot, etc.), the analysis environment 150 may react. In one example, the analysis environment 150 may transmit a command to the client device 110 to stop accepting the network content or data flows from the server device 105.


In some embodiments, the analysis environment 150 monitors and analyzes the behavior of the virtual machine 215 in order to determine a specific type of malware or malicious network content. The analysis environment 150 may also generate computer code configured to eliminate new viruses, worms, bots, adware, spyware, or other malware or malicious network content. In various embodiments, the analysis environment 150 generates computer code configured to repair damage performed by malware or malicious network content. By simulating the transmission and reception of suspicious network content and analyzing the response of the virtual machine 215, the analysis environment 150 may identify known and previously unidentified malware and malicious network content before a computer system is damaged or compromised.



FIG. 3 illustrates an exemplary method 300 for detecting malicious network content. In step 305, a packet of network content is intercepted or copied. The packet may be intercepted and/or copied from a network data transmission between the server device 105 and an intended destination (e.g., the client device 110), such as by the tap 115. Alternatively, the packet may be intercepted and/or copied from a network data transmission between the client device 110 and an intended destination (e.g., the server device 105). The packet may include a request for data, such as network content, or data provided in response to a request.


In step 310, a packet of network content is inspected. The heuristic module 130 may utilize one or more heuristics to inspect the packet of network content for suspicious network content which indicates the potential presence of malicious network content or malware within the packet.


A packet of network content may be part of a data flow which includes additional packets of network content. For example, the packet of network content may represent a portion of a web page, while other related packets in the data flow represent additional portions of the web page. The packet of network content may be stored along with the other related packets of network content comprising the data flow, such that multiple packets of network content within the data flow may be inspected in a sequence or in parallel. The malicious network content detection system may store the packets of network content and all or a portion of a data flow. The data flow and data packets may be stored for any length of time, from a few seconds to minutes, tens of minutes, or more, for analysis at any time.


To facilitate longer storage times for data flows over a high data rate communication network, large data objects comprised of numerous data packets may be truncated to a small subset of representative data packets. Data object truncation is particularly useful where network communication bandwidth is mostly utilized by a small percentage of large data objects, such as video. For example, video data may be truncated to a few data packets, such as the first few data packets. An extent to which the large data objects are truncated may be adaptive based on available memory, data bandwidth, type of data objects, and other factors. An amount of memory allocated to storing a data flow may also be dependent upon a characteristic of the data flow, such as data type. In an example, octet streams, text streams, HTML streams, and miscellaneous binary streams may be allocated 1 megabyte (MB). Images and PDF files may be allocated 384 kilobytes (kB). Video, audio, and most other data types may be allocated 128 kB. The memory allocated to storing each data flow type may be adjusted, periodically or dynamically, to improve analysis throughput while maintaining accuracy in detection of malicious network content and working within memory limitations.
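The example allocations above can be expressed as a simple lookup table. The Python sketch below is illustrative only: the type labels, the function name `truncate_flow`, and the choice of 1 kB = 1024 bytes are assumptions, and a real system might adapt the budgets dynamically as described.

```python
from typing import List

KB = 1024          # assumption: binary kilobytes
MB = 1024 * KB

# Per-type storage budgets taken from the example in the text.
ALLOCATION_BYTES = {
    "octet-stream": 1 * MB,
    "text": 1 * MB,
    "html": 1 * MB,
    "binary": 1 * MB,
    "image": 384 * KB,
    "pdf": 384 * KB,
}
DEFAULT_ALLOCATION = 128 * KB  # video, audio, and most other data types


def truncate_flow(packets: List[bytes], data_type: str) -> List[bytes]:
    """Keep the leading packets of a flow that fit the budget for its type."""
    budget = ALLOCATION_BYTES.get(data_type, DEFAULT_ALLOCATION)
    kept: List[bytes] = []
    used = 0
    for packet in packets:
        if used + len(packet) > budget:
            break
        kept.append(packet)
        used += len(packet)
    return kept


one_kb_packet = b"x" * KB
video_kept = truncate_flow([one_kb_packet] * 200, "video")
```

Keeping only the first packets mirrors the video example in the text, where a large object is truncated to its first few data packets.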


In step 315, a suspicious characteristic of the network content is identified. The heuristic module 130 may identify the suspicious characteristic of the network content as a result of inspecting the network content in step 310. When a characteristic of the packet, such as a sequence of characters or keyword, is identified that meets the conditions of a heuristic used in step 310, a suspicious characteristic or “feature” of the network content is identified. The identified features may be stored for reference and analysis. In some embodiments, the entire packet may be inspected and multiple features may be identified before proceeding to the next step. In some embodiments, features may be determined as a result of an analysis across multiple packets comprising the network content.
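A minimal sketch of feature identification follows, assuming that each heuristic is a named pattern matched against the packet payload. The `eval(unescape(` sequence is taken from the example given later in this description. The second pattern, the tolerance for whitespace, and the names `HEURISTICS` and `identify_features` are hypothetical.

```python
import re
from typing import Dict, List, Pattern

# Named heuristics; each pattern is a condition a packet may meet.
HEURISTICS: Dict[str, Pattern[bytes]] = {
    # From the description: "unescape" nested within an "eval" argument.
    "eval_unescape": re.compile(rb"eval\s*\(\s*unescape\s*\("),
    # Hypothetical extra heuristic, included only to show multiple features.
    "document_write_unescape": re.compile(rb"document\.write\s*\(\s*unescape\s*\("),
}


def identify_features(packet: bytes) -> List[str]:
    """Return the names of all heuristics whose pattern occurs in the packet."""
    return sorted(name for name, rx in HEURISTICS.items() if rx.search(packet))


features = identify_features(b"<script>eval(unescape('%61%6c'))</script>")
```

The whole packet is scanned against every heuristic before returning, so several features can be recorded for one packet before moving on to scoring.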


Keywords used by heuristics may be chosen by performing an approximate Bayesian probability analysis of all the keywords in an HTML specification using a corpus of malicious network content and a corpus of non-malicious network content. The approximate Bayesian probability analysis may be based on the principles of Bayes' theorem and/or naïve Bayesian classification. For instance, a probability Pm that the keyword appears in malicious network content may be computed using the corpus of malicious network content, while a probability Pn that the keyword appears in non-malicious network content may be computed using the corpus of non-malicious network content. A given keyword may be determined to be a suspicious characteristic for being associated with malicious network content if a score based on a computed ratio Pm/Pn exceeds a threshold of suspicion. The threshold of suspicion may be a value greater than 1, 10, 30, 60, 100, or some other number indicating how much more likely the suspicious characteristic is to indicate malicious network content than to indicate non-malicious network content.
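The keyword selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method: probabilities are estimated as the fraction of corpus documents containing the keyword, a small floor on Pn (the `floor` parameter) is added to avoid division by zero, and the toy corpora are invented for the example.

```python
from typing import Dict, Iterable, List


def keyword_probability(keyword: str, corpus: List[str]) -> float:
    """Estimate the probability as the fraction of documents with the keyword."""
    if not corpus:
        return 0.0
    return sum(keyword in doc for doc in corpus) / len(corpus)


def suspicious_keywords(
    keywords: Iterable[str],
    malicious: List[str],
    benign: List[str],
    threshold: float = 10.0,
    floor: float = 1e-6,  # assumption: guards against Pn == 0
) -> Dict[str, float]:
    """Return keywords whose ratio Pm/Pn exceeds the threshold of suspicion."""
    selected: Dict[str, float] = {}
    for kw in keywords:
        pm = keyword_probability(kw, malicious)
        pn = keyword_probability(kw, benign)
        ratio = pm / max(pn, floor)
        if ratio > threshold:
            selected[kw] = ratio
    return selected


# Invented toy corpora, for illustration only.
malicious_corpus = ["<iframe src=x>", "<iframe>", "<p>", "<iframe><p>"]
benign_corpus = ["<p>text</p>"] * 19 + ["<p><iframe></iframe></p>"]
selected = suspicious_keywords(["iframe", "<p>"], malicious_corpus, benign_corpus)
```

In this toy data, "iframe" appears in 3 of 4 malicious documents and 1 of 20 benign ones, a ratio of about 15, so it passes a threshold of 10. "<p>" is more common in the benign corpus and is not selected.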


In step 320, a score related to a probability that the suspicious characteristic identified in step 315 indicates malicious network content is determined. An approximate Bayesian probability analysis may be used to determine the score. In various embodiments, the approximate Bayesian probability analysis may be performed in real-time or using a look-up table based on a previously performed approximate Bayesian probability analysis.


For example, the approximate Bayesian probability analysis may be performed to determine a relative probability score that a particular feature is associated with the presence of malicious network content in a packet by comparing a corpus of malicious network content and a corpus of regular, non-malicious network content. A feature may include a characteristic of the packet, such as a sequence of characters or keyword, that meets the conditions of a heuristic used in step 310. The feature may also include a characteristic involving more than one packet inspected in sequence or in parallel. An example of a feature may include the character sequence “eval(unescape(”, which indicates a JavaScript “unescape” command nested within a JavaScript “eval” command argument. Further examples of features are described below with respect to step 445 in method 400. A probability Pf|m that the feature is present in a packet of malicious network content is computed by analyzing the corpus of malicious network content. A probability Pf|n that the feature is present in a packet of non-malicious network content is computed by analyzing the corpus of non-malicious network content. A malicious probability score is computed as the base two logarithm of a relative probability factor Pm|f that the feature is associated with malicious network content. The malicious probability score is computed by computing the ratio of the base two logarithm (log2) of the probability that the feature is present in a packet of malicious network content and the base two logarithm of the probability that the feature is present in a packet of non-malicious network content. The relative probability factor Pm|f may be expressed as follows:

log2(Pm|f)=log2(Pf|m)/log2(Pf|n)  Equation 1


The size of the result log2(Pm|f) (i.e., malicious probability score) may indicate the probability that the suspicious network content includes malicious network content. For example, a result of eleven may indicate that the feature is approximately two thousand times more likely to appear in malicious network content than in non-malicious network content. Likewise, a value of twelve may indicate that the feature is approximately four thousand times more likely to appear in malicious network content.
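
The arithmetic behind these examples can be checked with a short sketch. It assumes the malicious probability score is read as the base two logarithm of the ratio Pf|m/Pf|n, which is the reading under which a score of eleven corresponds to a factor of 2^11 = 2048 (approximately two thousand) and a score of twelve to 2^12 = 4096 (approximately four thousand). The function names are hypothetical.

```python
import math


def malicious_probability_score(p_feature_malicious, p_feature_benign):
    """Assumed reading: log2 of how much more often the feature appears
    in malicious packets than in non-malicious packets."""
    return math.log2(p_feature_malicious / p_feature_benign)


def likelihood_factor(score):
    """Convert a base two score back into a 'times more likely' factor."""
    return 2 ** score
```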


In some embodiments, the malicious corpus and/or the non-malicious corpus may be continuously updated in response to monitored network data traffic, and the malicious probability scores associated with the features may be continuously updated in response to the updates to the corpuses. In other embodiments, the corpuses may be created and used in advance to store pre-computed malicious probability scores in a look-up table for reference when features are identified. The features associated with significant probabilities of malicious network content may change as the corpuses change.


In step 325, malicious network content is identified or flagged if the malicious probability score of a feature computed in step 320 satisfies an analysis threshold. The analysis threshold may be greater than 1, 10, 30, 60, 100, 1000, 2000, or higher. The analysis threshold may be preset, or may be variable based on operating conditions of the malicious network content detection system 125. If the malicious probability score does not satisfy the analysis threshold, no action may be taken with regard to the feature associated with the malicious probability score. Otherwise, the analysis may proceed to the next step, such as step 330 for analysis through processing by a virtual machine, such as the virtual machine 215. In some embodiments, the malicious probability scores of all features computed in step 320 may be compared against the analysis threshold to assign a priority level to each feature and/or the packet as a whole. The priority level may be computed based on a variety of factors, such as the number of features identified in the packet, the highest malicious probability score of a feature in the packet, an average malicious probability score of the features in the packet, a mean malicious probability score of the features in the packet, and the like.
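
One possible way to combine the threshold test of step 325 with the priority factors listed above is sketched below. The dictionary layout, the function name packet_priority, and the choice of "greater than or equal" as the meaning of "satisfies" are assumptions, not details taken from the described system.

```python
from statistics import mean


def packet_priority(scores, analysis_threshold):
    """scores maps each feature to its malicious probability score.

    Returns None when no feature satisfies the threshold, otherwise a summary
    of the factors that could feed a priority level.
    """
    flagged = {f: s for f, s in scores.items() if s >= analysis_threshold}
    if not flagged:
        return None  # no action is taken for this packet
    return {
        "flagged_features": sorted(flagged, key=flagged.get, reverse=True),
        "feature_count": len(scores),
        "highest_score": max(scores.values()),
        "average_score": mean(scores.values()),
    }
```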


The analysis threshold may be adaptive or be frequently updated based on operating conditions of the malicious network content detection system 125. For example, the threshold value may be dynamically revised according to a quantity of packets of network content to be inspected. As a quantity of data packets which are intercepted and/or copied from the network data transmission in step 310 increases, a quantity of data packets to be inspected may also increase. This may increase a computational load and leave less computational bandwidth available for more detailed analysis of the data packets. Consequently, the threshold may be increased to compensate for the decrease in available computational bandwidth for more detailed analysis. As another example, the threshold value may be dynamically revised according to an availability of one or more virtual machines to be used for the more detailed analysis. The threshold value may be set such that only features which have a significant probability of indicating malicious network content are processed using a virtual machine. For example, out of over one thousand features, less than fifty may be considered significant.
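
A minimal sketch of such an adaptive threshold follows. The linear penalty, the parameter names, and the default values are all hypothetical; the text states only that the threshold may rise with the inspection load and with reduced virtual machine availability.

```python
def adaptive_threshold(base_threshold, queued_packets, free_vms,
                       packets_per_step=1000, step=1.0):
    """Raise the analysis threshold as the backlog grows or free VMs run out.

    All parameters other than the base threshold are illustrative assumptions.
    """
    load_penalty = (queued_packets // packets_per_step) * step
    vm_penalty = step if free_vms == 0 else 0.0
    return base_threshold + load_penalty + vm_penalty
```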


There may be multiple dynamically adaptive thresholds, which may be synchronized with each other. For example, the scheduler 140 may use a threshold to determine whether a virtual machine should be dispatched to process a queued suspicious network content. The scheduler 140's threshold may increase due to lack of available computational resources for the analysis environment 150 to execute virtual machines. The heuristic module 130 may use another threshold to determine whether heuristics should be applied to an identified feature. The heuristic module 130's threshold may be based on the malicious probability score for the identified feature. As the scheduler 140's threshold increases, the heuristic module 130's threshold may also increase. This is because flagging suspicious network content based on running heuristics on identified features may be irrelevant and an inefficient use of computational resources if the scheduler 140 will not process the suspicious network content in a virtual machine due to an increased threshold in the scheduler 140.


After suspicious network content has been flagged at step 325 for further analysis, the entire stored data flow including the suspicious network content may be reanalyzed. Each feature may be given a higher malicious probability score by virtue that one feature in the data flow has been found to have a malicious probability score greater than the threshold. A priority level for each feature found in the data flow may also be increased. Furthermore, all data packets and data flows associated with any domains associated with suspicious network content may be cached and given higher priorities and malicious probability scores than they would otherwise. The scheduler 140 may execute the virtual machine to process each flagged suspicious network content in the data flow individually, in priority order, in their original sequence of presentation, or in some other order. The virtual machine may process the suspicious network content until pre-empted by a higher priority suspicious network content.


In step 330, a virtual machine is executed to process the suspicious network content. The virtual machine may effectively replay the suspicious network content in a web browser executing on the virtual machine. The heuristic module 130 may provide the packet containing the suspicious network content to the scheduler 140, along with a list of the features present in the packet and the malicious probability scores associated with each of those features. Alternatively, the heuristic module 130 may provide a pointer to the packet containing the suspicious network content to the scheduler 140 such that the scheduler 140 may access the packet via a memory shared with the heuristic module 130. In another embodiment, the heuristic module 130 may provide identification information regarding the packet to the scheduler 140 such that the scheduler 140, replayer 205, or virtual machine may query the heuristic module 130 for data regarding the packet as needed.


The heuristic module 130 may also provide a priority level for the packet and/or the features present in the packet. The scheduler 140 may then load and configure a virtual machine from the virtual machine pool 145, and dispatch the virtual machine to the analysis environment 150 to process the suspicious network content. The virtual machine may be configured to execute for a minimum amount of processing, or for a minimum period of time, such as approximately 45 seconds. After the minimum period of time passes, the virtual machine may be pre-empted by the scheduler 140 to dispatch another virtual machine. Multiple virtual machines may be run simultaneously.


The scheduler 140 may choose which feature to process first according to the priority levels provided by the heuristic module 130. The scheduler 140 may cause another virtual machine already processing or analyzing another feature or packet, or set of packets, in the analysis environment 150 to terminate prior to dispatching the loaded virtual machine. For example, this may occur if computational resources are occupied with other virtual machines processing other features and therefore are not available to execute the loaded virtual machine. The scheduler 140 may choose which virtual machine(s) to terminate based on the priority levels of the features being processed by the virtual machine, how much time the virtual machine has already spent executing, or other reasons.


The scheduler 140 may reprioritize suspicious network content already in queue to be processed by virtual machines based on newly identified suspicious network content. For example, already queued suspicious network content may be reprioritized if there is a domain identified in common with the newly identified suspicious network content. Numerous incidents of suspicious network content associated with a single domain may increase the priority of all suspicious network content associated with the domain.
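
The domain-based reprioritization can be illustrated as follows. The queue representation (a list of dictionaries with "domain" and "priority" keys) and the fixed boost per additional incident are assumptions chosen for the example.

```python
from collections import Counter


def reprioritize(queue, boost_per_incident=1):
    """Raise the priority of queued items that share a domain with other items.

    Returns a new list sorted from highest to lowest priority.
    """
    incidents = Counter(item["domain"] for item in queue)
    boosted = [
        {**item,
         "priority": item["priority"]
         + boost_per_incident * (incidents[item["domain"]] - 1)}
        for item in queue
    ]
    return sorted(boosted, key=lambda item: item["priority"], reverse=True)
```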


The replayer 205 in the analysis environment 150 may keep track of network content requested by the virtual machine. If suspicious network content already in the scheduler 140's queue is requested and processed by the virtual machine while processing other previously dispatched suspicious network content, and the queued suspicious network content is not found to be malicious, then the scheduler 140 may delete the queued suspicious network content from the queue. In this way, computational requirements can be reduced because an item of suspicious network content may only be processed in a virtual machine once, rather than each time a reference to the item of suspicious network content is made by another item of suspicious network content.


In step 335, malicious network content is detected by analyzing the virtual machine response to the suspicious network content. The analysis environment 150 may be configured to monitor the virtual machine for indications that the suspicious network content is in fact malicious network content. The analysis environment 150 may monitor the virtual machine for unusual memory accesses, unusual spawning of executable processes, unusual network transmissions, crashes, unusual changes in performance, and the like. The analysis environment may flag the suspicious network content as malicious network content according to the observed behavior of the virtual machine.


If a virtual machine processes suspicious network content for greater than a predetermined amount of time without any malicious network content being detected, the scheduler 140 may terminate the virtual machine to free up computational resources. The predetermined amount of time may be variable, according to a queue of suspicious network content that is awaiting processing by a virtual machine, the probability that the suspicious network content may be malicious network content, the feature being evaluated by the virtual machine, available computational resources, and the like. For example, the predetermined amount of time may be 45 seconds, two minutes, twenty minutes, or any other length of time.


If the suspicious network content is determined to be malicious network content, the malicious network content detection system 125 may report the malicious network content and/or log the malicious network content for future reference. For example, the malicious network content detection system 125 may generate an alert for a network content packet detected to include malicious network content. The malicious network content detection system 125 may report the malicious network content to an entity responsible for the client device 110. If the malicious network content was determined to originate from the server device 105, the client device 110 may be instructed not to continue network transmissions with the server device 105. If a party responsible for the server device 105 is known, the malicious network content detection system 125 may report the malicious network content to the party responsible for the server device 105. The server device 105 may be added to a list of malicious network content providers, and future network transmissions originating from the server device 105 may be blocked from reaching their intended destinations.



FIG. 4 illustrates another exemplary method 400 for detecting malicious network content. The method 400 may be performed by the heuristic module 130. In the method 400, a packet of network content is inspected to identify features which may indicate the presence of malicious network content. The method 400 may include the use of a single pass parser and/or an augmented finite state machine, which may maintain a stack of states. The method 400 may begin processing a data packet starting with a character after a character sequence “HTTP” has been identified.


In step 405, a data character is read from the data packet. The data character read may be subsequent to the character sequence “HTTP” or a data character previously read in a prior iteration of step 405. A pointer may be incremented to indicate the next data character to read in the method 400.


In step 410, the data character read in step 405 is evaluated to determine if the data character may indicate the start of a possible keyword or a possible feature as described with respect to method 300, or a different kind of data (e.g., JavaScript content embedded in HTML content). The data character may include a left angled bracket (i.e., “<”), for example. If the data character read may indicate the start of a keyword or a feature, the method may proceed to step 415. Otherwise, the method may proceed to step 420.


In step 415, a new state is pushed onto the stack of states to indicate that the method 400 has encountered the start of a keyword or feature. The new state may be an InKeyword state to indicate that the method is in the midst of processing a keyword. Depending on the character read, a different new state may be pushed onto the stack. A string of data characters may be stored, starting with the most recent character read or the next character to be read. The method 400 then proceeds to step 440.


In step 420, the data character read in step 405 is evaluated to determine if the data character may indicate the end of a keyword or a feature as described with respect to method 300. The data character may include a right angled bracket (i.e., “>”), for example. If the data character read may indicate the end of a keyword or a feature, the method may proceed to step 425. Otherwise, the method may proceed to step 440.


In step 425, heuristics to be applied to the data packet are identified and applied based on a character string read, which may start with the data character identified in step 410 and end with the data character identified in step 420. The heuristic module 130 may store the character string. The character string may be compared against a database of character strings stored in the heuristics database 135 to determine one or more heuristics that may be applied to the data packet based on the keyword. In some embodiments, a list of results of applying heuristics may be created. The list of results may be stored so that the list may be referenced in step 445.


Some examples of a heuristic that may be applied to the packet include keyword matches. Some keywords may be associated more with malicious network content than non-malicious network content, and their presence in a packet of network content may be an indication that the packet contains suspicious network content.


In one exemplary heuristic, an object filename's extension following a period may be examined. For example, a filename ending in the characters “.ini”, “.anr”, or “.htm” may be determined to be suspicious. Also, a filename generally associated with one filetype but associated with a different filetype in the reference may be determined to be suspicious. For example, a filename ending in “.jpg” which is not referring to an image file may be determined to be suspicious.
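
The filename heuristic might be sketched as below. The extension sets come from the examples in the text; the declared_type parameter (for instance a MIME type taken from the reference) and the function name are assumptions.

```python
import os

SUSPICIOUS_EXTENSIONS = {".ini", ".anr", ".htm"}  # examples from the text
IMAGE_EXTENSIONS = {".jpg"}


def filename_is_suspicious(filename, declared_type=None):
    """Flag suspicious extensions and image names that do not refer to images.

    declared_type is a hypothetical content type string such as "image/jpeg".
    """
    extension = os.path.splitext(filename.lower())[1]
    if extension in SUSPICIOUS_EXTENSIONS:
        return True
    if (extension in IMAGE_EXTENSIONS
            and declared_type is not None
            and not declared_type.startswith("image/")):
        return True
    return False
```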


In other exemplary heuristics, content of web pages may be analyzed to determine whether network content is suspicious. For example, presence of small iframes, such as an iframe in which the width and/or height is 0 or 1 pixel, in a web page may be determined to be suspicious.
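
A small-iframe check along these lines can be written with the standard library HTML parser. The class and function names are hypothetical, and the sketch only recognizes width or height attributes written exactly as 0 or 1; sizes set through style sheets or with units are not handled.

```python
from html.parser import HTMLParser


class SmallIframeFinder(HTMLParser):
    """Collect iframe tags whose width or height attribute is 0 or 1."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        attributes = dict(attrs)
        for name in ("width", "height"):
            if attributes.get(name) in ("0", "1"):
                self.hits.append(attributes)
                return


def has_small_iframe(html):
    finder = SmallIframeFinder()
    finder.feed(html)
    return bool(finder.hits)
```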


Further examples of heuristics may be associated with JavaScript code sequences. When an “eval(unescape( . . . ))” JavaScript command sequence, which includes an “unescape” command nested within the argument of an “eval” command, is detected in the data packet, the heuristic may evaluate the command sequence to identify suspicious network content. The “eval(unescape( . . . ))” command sequence may be used to obfuscate malicious network content so that the malicious network content is not easily detected in the network data transmission, and may therefore indicate suspicious network content.


Another example of a heuristic is a length of the argument of the “unescape” or other JavaScript function from a starting character to an ending character. The length may be determined by counting a number of characters, or measuring a length of time, between the opening parenthesis and the closing parenthesis after “unescape” or other function name. A greater number of characters between the parentheses may indicate that an obfuscated body to the command is being used.
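
Counting the characters between the parentheses can be sketched as follows. The function name is hypothetical, and the sketch matches parentheses by depth only, so parentheses inside string literals would be miscounted; it is meant to show the measurement, not a complete JavaScript tokenizer.

```python
def call_argument_length(source, function_name="unescape"):
    """Return the number of characters between the parentheses of the first
    call to function_name, or None if no balanced call is found."""
    start = source.find(function_name + "(")
    if start == -1:
        return None
    open_index = start + len(function_name)  # index of the opening parenthesis
    depth = 0
    for i in range(open_index, len(source)):
        if source[i] == "(":
            depth += 1
        elif source[i] == ")":
            depth -= 1
            if depth == 0:
                return i - open_index - 1
    return None  # unbalanced parentheses
```

A caller could then compare the returned length against a threshold chosen from the corpus analysis described earlier.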


Bi-gram detection is another exemplary heuristic that may be employed in JavaScript or other types of network content. In bi-gram detection, character transitions within the network content are analyzed. A table of conditional probabilities may be generated and updated continuously as data is evaluated. The table of conditional probabilities indicates the probability of each second character appearing after each first character. The conditional probability of a second character C2 given the first character C1 may be written as P(C2|C1). The heuristic may identify when a string of unusual character transitions occurs according to the table of conditional probabilities. Thresholds for the length of the string of unusual character transitions, combined with the values of the conditional probabilities that flag the character transitions as being unusual, may be set a priori based on an approximate Bayesian probability analysis using a corpus of malicious network content and a corpus of non-malicious network content. Alternatively, the thresholds may be adjusted in near real time as the table of conditional probabilities is updated. For example, a long string of unusual character transitions may indicate the presence of malicious network content in a JavaScript “eval(unescape( . . . ))” clause.
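
A minimal sketch of bi-gram detection is given below. It maintains counts from which P(C2|C1) is estimated and reports the longest run of transitions whose probability falls below a cutoff. The class name, the cutoff value, and the treatment of never-seen first characters as probability zero are assumptions.

```python
from collections import defaultdict


class BigramModel:
    """Table of conditional character-transition probabilities P(C2|C1)."""

    def __init__(self):
        self.pair_counts = defaultdict(lambda: defaultdict(int))
        self.first_counts = defaultdict(int)

    def update(self, text):
        # The table may be updated continuously as more data is evaluated.
        for c1, c2 in zip(text, text[1:]):
            self.pair_counts[c1][c2] += 1
            self.first_counts[c1] += 1

    def probability(self, c1, c2):
        total = self.first_counts.get(c1, 0)
        if total == 0:
            return 0.0  # assumption: unseen first character counts as unusual
        return self.pair_counts[c1].get(c2, 0) / total

    def longest_unusual_run(self, text, min_probability=0.05):
        """Length of the longest string of consecutive unusual transitions."""
        longest = current = 0
        for c1, c2 in zip(text, text[1:]):
            if self.probability(c1, c2) < min_probability:
                current += 1
                longest = max(longest, current)
            else:
                current = 0
        return longest
```

The run length returned here would be compared against a length threshold set from the malicious and non-malicious corpuses.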


The use of domain profiles is another exemplary heuristic that may be used to reduce a rate of false positives from other heuristics. The domain profiles heuristic may be used in conjunction with other heuristics in order to increase throughput and reduce computational requirements for detecting malicious network content. Each network domain with which monitored network content is exchanged may be cataloged and annotated with a list of the features present in network content associated with the network domain. A typical network domain may be approximately constant in the features present in associated network content. When a feature is identified by another heuristic, the feature may be looked up in the list of features associated with the network domain. If the feature is listed as being associated with the network domain, and malicious network content was not previously detected due to identification of the feature in network content associated with the domain, a virtual machine may not be executed to process the network content containing the feature associated with the network domain. If, on the other hand, the feature was not previously detected or associated with the network domain, the network content may be identified as being suspicious and processed by a virtual machine.
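
The domain profile lookup can be sketched with a per-domain set of features that were previously seen without a detection. The class and method names are hypothetical, and the sketch omits persistence and expiry of profile entries.

```python
class DomainProfiles:
    """Catalog of features previously seen, without detection, per domain."""

    def __init__(self):
        self.known_benign = {}  # domain -> set of features

    def needs_virtual_machine(self, domain, feature):
        # A feature already associated with the domain, with no prior
        # detection, does not trigger virtual machine processing.
        return feature not in self.known_benign.get(domain, set())

    def record_result(self, domain, feature, was_malicious):
        features = self.known_benign.setdefault(domain, set())
        if was_malicious:
            features.discard(feature)
        else:
            features.add(feature)
```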


A list of domains or web sites containing malicious network content may be maintained. The list of sources of malicious network content may be hosted on the computer network and accessible by clients on the computer network. The heuristic module 130 may access the list of domains and web sites containing malicious network content to supplement the information provided by the domain profiles heuristic. For example, the threshold for network content associated with a web site on a list of malicious network content sources may be set to be lower and/or the priority of a suspicious network content may be set higher than for other network content. When malicious network content is detected, the list of domains may be notified or updated with the information for reference by others.


In step 430, if a state is being exited, the state being exited is popped from the stack of states. The state being exited is the most recent state pushed onto the stack of states. For example, if the state being exited is the InKeyword state, the InKeyword state is popped from the stack of states to indicate that the method is no longer in the midst of reading a keyword. If a state is not being exited, a state may not be popped from the stack, and multiple states may be stored on the stack. In some embodiments, up to 32 states may be present on the stack of states at one time. For example, JavaScript may have embedded HTML, and therefore multiple states may be active at one time to account for nested features. In various embodiments, there may be more than 60 states associated with data packets being analyzed for malicious network content.


In step 435, a new state is pushed onto the stack of states to indicate that the method is now in the midst of a new state. The new state may be determined by the last keyword that was read, or a character indicating a new kind of content. For example, the new state may be an InBetweenKeyword state to indicate that the method is awaiting another keyword to process. In some embodiments, the new state may be an InJavaScript state to indicate that the method is in the midst of reading a JavaScript segment. The state may impact which heuristics are identified and applied to the packet of web data in step 445. For example, a first heuristic may be chosen if a first state is active, whereas a second heuristic may be chosen if a second state is active.


In step 440, the count of characters read in step 405 is evaluated to determine if the data character may lie at the end of a packet. If the data character lies at the end of the packet, the method may proceed to step 445. Otherwise, the method may proceed to step 405.


In step 445, the list of results produced by applying the heuristics in step 425 for the features in the data packet is referenced to determine which features in the data packet are to be processed using a virtual machine. Malicious probability scores for each feature may be compared against a threshold to determine whether the feature indicates suspicious network content. The features associated with the data packet may be ranked in priority order. The features may be used to prioritize whether to refer the data packet, and associated content, to a virtual machine in the order identified in step 425, in the priority order determined by their respective malicious probability scores, or in some other order.
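
The flow of method 400 can be approximated by a single pass over the characters of a packet with a stack of states. The sketch below is a simplified illustration under several assumptions: keywords are delimited only by angle brackets, heuristics are supplied as a hypothetical mapping from keyword to callable, and only the InKeyword, InBetweenKeyword, and InJavaScript states are modeled.

```python
def scan_packet(data, heuristics, max_depth=32):
    """Single-pass scan returning a list of (keyword, heuristic result) pairs."""
    stack = ["InBetweenKeyword"]
    buffer = []
    results = []
    for ch in data:                                   # step 405: read a character
        if ch == "<" and len(stack) < max_depth:      # step 410: keyword start?
            stack.append("InKeyword")                 # step 415: push new state
            buffer = []
        elif ch == ">" and stack[-1] == "InKeyword":  # step 420: keyword end?
            text = "".join(buffer)
            words = text.split()
            keyword = words[0].lower() if words else ""
            if keyword in heuristics:                 # step 425: apply heuristic
                results.append((keyword, heuristics[keyword](text)))
            stack.pop()                               # step 430: exit InKeyword
            if keyword == "script":                   # step 435: enter new state
                stack.append("InJavaScript")
            elif keyword == "/script" and stack[-1] == "InJavaScript":
                stack.pop()
        elif stack[-1] == "InKeyword":
            buffer.append(ch)                         # accumulate keyword text
    return results                                    # consumed by step 445
```

For example, passing a mapping with an "iframe" entry that checks for a zero width would yield one result per iframe tag encountered in the packet.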



FIG. 5 illustrates an exemplary controller 500. The controller 500 may comprise the malicious network content detection system 125 according to some embodiments. The controller 500 comprises at least a processor 505, a memory system 510, and a storage system 515, which are all coupled to a bus 520. The controller 500 may also comprise a communication network interface 525, an input/output (I/O) interface 530, and a display interface 535. The communication network interface 525 may couple with the communication network 120 via a communication medium 540. In some embodiments, the controller 500 may couple to a tap, such as the tap 115, which in turn couples with the communication network 120. The bus 520 provides communications between the communications network interface 525, the processor 505, the memory system 510, the storage system 515, the I/O interface 530, and the display interface 535.


The communications network interface 525 may communicate with other digital devices (not shown) via the communications medium 540. The processor 505 executes instructions. The memory system 510 permanently or temporarily stores data. Some examples of the memory system 510 are RAM and ROM. The storage system 515 also permanently or temporarily stores data. Some examples of the storage system 515 are hard disks and disk drives. The I/O interface 530 may include any device that can receive input and provide output to a user. The I/O interface 530 may include, but is not limited to, a keyboard, a mouse, a touchscreen, a keypad, a biosensor, a compact disc (CD) drive, a digital versatile disc (DVD) drive, or a floppy disk drive. The display interface 535 may include an interface configured to support a display, monitor, or screen. In some embodiments, the controller 500 comprises a graphical user interface to be displayed to a user over a monitor in order to allow the user to control the controller 500.


According to other embodiments, malicious network content may also include malicious portable document format (PDF) network content. It will be understood that the term “malicious PDF network content” may be understood to include portable document format (PDF) files located on one or more server devices 105 and made available for distribution via the communication network 120 to one or more client devices 110.


Generally speaking, the tap 115 may be adapted to intercept requests to obtain PDF network content received from a web browser, PDF reader application, or any other application, module, or engine requesting such PDF network content, associated with a client device 110. In some embodiments, the tap 115 may be disposed between one or more client devices 110 and one or more server devices 105 and direct requests to obtain PDF network content into the malicious network content detection system 600, which will be described in greater detail infra with reference to FIG. 6. It will be understood that the tap 115 may also be disposed between two or more server devices 105 and provide mechanisms for intercepting PDF network content exchanged between two or more server devices 105.


As background, PDF network content, such as a PDF document, may include a fixed-layout document that, when parsed by a PDF reader application (not shown), produces a visual representation of the data included in the PDF document. The data within a PDF document is arranged in a hierarchical manner beginning with a header, a body portion that includes information indicative of one or more objects, a cross-reference table, also known as an “XREF” table, and a trailer. The header includes information indicative of the PDF specification version number to which the document adheres. The version number may be utilized to determine the PDF document reader version that is best adapted to parse the PDF document.


The XREF table includes offset information indicative of the position of objects within the PDF document. As such, the XREF table allows the PDF reader application to parse or walk individual portions (e.g., pages) of the PDF document without a need to parse or walk the entire PDF document. Lastly, the trailer of the PDF document allows the PDF reader application to efficiently locate the XREF table along with any other pertinent objects that may be utilized by the PDF reader application to construct the visual representation.
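
To make the layout concrete, the sketch below reads the version from the header and the XREF table offset from the end of a file. It relies on the general PDF file format, in which the file begins with "%PDF-" followed by a version and ends with a "startxref" keyword, a byte offset, and "%%EOF"; those markers are not named in this description, and the function names are hypothetical.

```python
import re


def pdf_header_version(data):
    """Return the version string from the header, e.g. '1.7', or None."""
    match = re.match(rb"%PDF-(\d+\.\d+)", data)
    return match.group(1).decode("ascii") if match else None


def xref_offset(data, tail_size=1024):
    """Return the byte offset of the XREF table recorded near the end of file."""
    tail = data[-tail_size:]
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    # Assumption: with incremental updates, the last entry is the current one.
    return int(matches[-1]) if matches else None
```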


The body may contain one or more objects that comprise the contents of the PDF document. Generally speaking, the objects of a PDF document may include, but are not limited to, Boolean operators, numbers, names, strings, arrays, dictionaries, streams, and combinations thereof. The body may also include transparent objects such as metadata, security features, and the like.


Objects of a PDF document may be broadly categorized as either direct or indirect. It will be understood that direct objects may not reference other objects. Conversely, an indirect object may make reference to one or more objects, which may include both direct and/or indirect objects. The PDF document may also include interactive elements such as AcroForm elements and XML Forms Data Format (XFDF) elements. Both AcroForm and XFDF elements allow the inclusion of JavaScript code, also known as JavaScript API.


It will be understood that malware may be adapted to utilize JavaScript code to exploit vulnerabilities present within both PDF reader applications and web browser applications. It will further be understood that PDF reader applications and web browser applications may cooperate with one another via a plug-in. For example, when a web browser requests a PDF document, the PDF reader application is automatically launched to parse the PDF document. If the PDF document includes one or more objects that reference JavaScript code, the JavaScript codes called during parsing may construct virtual operating systems within the web browser for performing one or more functions associated with the JavaScript code.


While plug-in cooperation between the PDF reader application and web browser application provides significant benefits such as cross-platform compatibility (i.e., agnostic to the operating system of the client device 110), numerous vulnerabilities may be created within the web browser, which may expose the client device 110 to different types of malware, viruses, and the like.


In some embodiments, JavaScript code may exploit one or more vulnerabilities of the web browser application by loading memory allocated for operation of the PDF reader application with malicious code, commonly referred to as a “heap spray.” The heap spray may be adapted to prepare the memory allocated for the PDF reader application with shellcode. Once prepared, the vulnerable JavaScript code is called to trigger the vulnerability, execute the shellcode, and ultimately deliver a payload. It will be understood that the term “payload” may include any deleterious effect caused by malicious network content to the client device 110. It is noteworthy that the deleterious effects caused by malicious network content are far too numerous to include individually, but would be readily appreciated by one of ordinary skill in the art with the present disclosure before them. Non-limiting examples of deleterious effects include consumption of memory, overwriting or corruption of system or program files, and the like.


As with HTML documents, suspicious characteristics indicative of malicious network content included with a PDF document may be determined by way of the heuristics module 130 as described supra with reference to FIG. 1. As such, the heuristics module 130 may utilize any one (or combination) of heuristic methodologies located within the heuristics database 135.


Referring now to FIG. 6, in some embodiments, suspicious characteristics indicative of malicious network content included with a PDF document may be determined by way of a malicious network content detection system 600. The malicious network content detection system 600 may include each of the components of malicious network content detection system 125 (see FIG. 1) along with one or more additional modules such as a PDF parser 605. The PDF parser 605 may be adapted to examine received (e.g., intercepted) PDF network content to determine if one or more suspicious characteristics indicative of malicious network content are included in the PDF network content. It will be understood that the term “examine” may include walking, parsing, inspecting, viewing, compiling, reading, extracting, decoding, and/or combinations thereof.


The malicious network content detection system 600 may be adapted to determine suspicious characteristics indicative of malicious network content by way of methods disclosed with regard to HTML files, such as the exemplary method 400 for detecting malicious network content described above with regard to FIG. 4. Additionally, the PDF parser 605 may be adapted to determine one or more suspicious characteristics indicative of malicious network content that are unique to PDF network content. For example, the PDF parser 605 may be adapted to examine the body of PDF network content to locate objects that include specific types of JavaScript code. Some non-limiting examples of JavaScript code that are commonly utilized to exploit vulnerabilities in web browser applications include eval( ), util.printf( ), and media.newPlayer( ).
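By way of illustration only, the search for specific types of JavaScript code might be sketched as follows. The function name find_suspicious_calls and the regular expression are assumptions introduced for this sketch and are not part of the disclosure; the list of call names is limited to the three examples given above and is not exhaustive.

```python
import re

# Call names taken from the examples in the text; illustrative, not exhaustive.
SUSPICIOUS_CALLS = ("eval", "util.printf", "media.newPlayer")

# Match a listed name that is not the tail of a longer identifier,
# followed by optional whitespace and an opening parenthesis.
_CALL_RE = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(name) for name in SUSPICIOUS_CALLS) + r")\s*\("
)


def find_suspicious_calls(javascript_source: str) -> list[str]:
    """Return the distinct suspicious call names, in order of first appearance."""
    found: list[str] = []
    for match in _CALL_RE.finditer(javascript_source):
        name = match.group(1)
        if name not in found:
            found.append(name)
    return found
```

A non-empty result would be treated as one suspicious characteristic among others; it is not by itself a verification that the content is malicious.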


Moreover, creators of PDF network content malware may attempt to obfuscate JavaScript code references to malware by splitting the JavaScript code into a plurality of objects that when referenced by the PDF reader application may be combined to execute the malicious code. Therefore, the PDF parser 605 may be adapted to determine obfuscated malicious JavaScript code divided across multiple objects utilizing the getField( ) function.
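One possible way to examine code divided across multiple objects is sketched below, under the assumption that the fragments are stored as form field values and read back through getField( ) calls of the form this.getField("name").value. The helper names referenced_fields and reassemble, and the dictionary of field values, are hypothetical and introduced only for this sketch.

```python
import re

# Matches getField("name").value or this.getField('name').value.
_GETFIELD_RE = re.compile(r"""(?:this\.)?getField\(\s*(['"])(.+?)\1\s*\)\.value""")


def referenced_fields(script: str) -> list[str]:
    """Return the field names a script reads through getField(), in order."""
    return [match.group(2) for match in _GETFIELD_RE.finditer(script)]


def reassemble(script: str, field_values: dict[str, str]) -> str:
    """Concatenate the values of the referenced fields in reference order.

    The result is plain text that may then be examined for suspicious
    JavaScript code; nothing is executed here.
    """
    return "".join(field_values.get(name, "") for name in referenced_fields(script))
```

The sketch assumes the fragments are joined in the order in which they are referenced; scripts that reorder or transform the fragments would require further analysis.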


In additional embodiments, the PDF parser 605 may be adapted to determine malicious JavaScript code that is encrypted into PDF network content utilizing software ciphers such as the RC4 stream cipher or the AES block cipher.
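As a minimal sketch, and assuming the PDF network content uses the standard encryption dictionary, the presence and kind of encryption might be determined from the /Encrypt entry and the crypt filter method (/CFM) names. The function name describe_encryption is an assumption of this sketch, and the byte search is a simplification of proper dictionary parsing.

```python
import re

_ENCRYPT_RE = re.compile(rb"/Encrypt\b")
_CFM_RE = re.compile(rb"/CFM\s*/(\w+)")

# Crypt filter method names used by PDF encryption dictionaries
# (AESV3 appears only in later revisions of the format).
_CFM_NAMES = {b"V2": "RC4", b"AESV2": "AES-128", b"AESV3": "AES-256"}


def describe_encryption(pdf_bytes: bytes) -> dict:
    """Report whether an /Encrypt entry is present and which ciphers are named."""
    encrypted = _ENCRYPT_RE.search(pdf_bytes) is not None
    methods: list[str] = []
    if encrypted:
        names = set(_CFM_RE.findall(pdf_bytes))
        # Unrecognized method names are reported as-is rather than guessed.
        methods = sorted(_CFM_NAMES.get(n, n.decode("ascii", "replace")) for n in names)
    return {"encrypted": encrypted, "methods": methods}
```

An encrypted document with no /CFM entry typically uses one of the older RC4-based schemes, so an empty list of methods alongside encrypted being true does not mean that no cipher is in use.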


While the above examples contemplate the adaptability of the PDF parser 605 to determine suspicious content such as JavaScript code, the PDF parser 605 may be adapted to determine additional types of malware associated with other objects including, but not limited to, embedded Flash stream objects. For example, a Flash file may include ActionScript virtual machine instructions that are adapted to set up a heap spray with shellcode. In an additional example, malicious code may be incorporated into the PDF network content via one or more Tagged Image File Format (TIFF) vulnerabilities. It will be understood that some TIFF vulnerabilities may not utilize heap spray functionalities.


While not an exclusive list, the following features may be indicative of PDF network content having malicious network content: PDFBadVersion (determines if the PDF specification version number associated with the PDF network content is correct), PDFHeader1_0 (examines the header information of the PDF network document for errors, also known as malformed header information), PDFNameJS (examines the PDF document for PDF objects that include names incorporating JS that are indicative of JavaScript code), PDFNameJavaScript (examines the PDF document for PDF objects that include names incorporating JavaScript that are indicative of JavaScript code), PDFBadFileStart (examines the PDF network content for an improper file start signature), PDFNameOpenAction (examines the PDF network content for PDF objects that cause JavaScript functionality to be run on the initial loading of the PDF content), PDFCouldNotParse (determines if the PDF parser is unable to properly parse the PDF network content), and/or combinations thereof. The PDF parser 605 may utilize the aforementioned features to determine one or more suspicious characteristics included in the PDF network content.
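The following is a hedged sketch of how several of the above features might be computed from raw bytes. The function name extract_features, the set of accepted version numbers, and the specific byte-level tests are assumptions made for illustration; the disclosure does not specify how each feature is evaluated. PDFCouldNotParse is omitted because it depends on a full parser, and names written with hexadecimal escapes are not handled.

```python
import re

_HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")
# Assumed set of valid specification versions: 1.0 through 1.7, and 2.0.
_KNOWN_VERSIONS = {(1, minor) for minor in range(8)} | {(2, 0)}


def extract_features(pdf_bytes: bytes) -> dict[str, bool]:
    """Return a mapping of feature name to whether the feature fired."""
    head = pdf_bytes[:1024]
    first_line = re.split(rb"[\r\n]", head, maxsplit=1)[0]
    found = _HEADER_RE.search(head)
    version = (int(found.group(1)), int(found.group(2))) if found else None
    return {
        # File does not begin with the expected signature.
        "PDFBadFileStart": not pdf_bytes.startswith(b"%PDF-"),
        # Version number is missing or outside the assumed valid set.
        "PDFBadVersion": version not in _KNOWN_VERSIONS,
        # First line is not exactly a well-formed header (assumed meaning).
        "PDFHeader1_0": _HEADER_RE.fullmatch(first_line) is None,
        "PDFNameJS": re.search(rb"/JS\b", pdf_bytes) is not None,
        "PDFNameJavaScript": re.search(rb"/JavaScript\b", pdf_bytes) is not None,
        "PDFNameOpenAction": re.search(rb"/OpenAction\b", pdf_bytes) is not None,
    }
```

Any feature that fires may be treated as a suspicious characteristic, with the features considered individually or in combination.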


It will be understood that because one or more portions (e.g., pages) of PDF network content may be assembled without need to walk or parse the entire PDF network content, the PDF parser 605 may be adapted to evaluate only the portion(s) of the PDF network content that has been requested by the client device 110.
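To illustrate how a portion may be reached without walking the entire content, the sketch below reads a classic cross-reference table to obtain byte offsets and then extracts a single object. The function names read_xref_offsets and object_bytes are assumptions of this sketch, and cross-reference streams and incremental updates are deliberately not handled.

```python
import re


def read_xref_offsets(pdf_bytes: bytes) -> dict[int, int]:
    """Map object number to byte offset using a classic (non-stream) xref table."""
    markers = re.findall(rb"startxref\s+(\d+)", pdf_bytes)
    if not markers:
        return {}
    tokens = pdf_bytes[int(markers[-1]):].split()
    if not tokens or tokens[0] != b"xref":
        return {}  # cross-reference streams are not handled in this sketch
    offsets: dict[int, int] = {}
    i = 1
    # Each subsection starts with "<first object number> <entry count>".
    while i + 1 < len(tokens) and tokens[i].isdigit() and tokens[i + 1].isdigit():
        first, count = int(tokens[i]), int(tokens[i + 1])
        i += 2
        for n in range(count):
            entry = tokens[i:i + 3]
            if len(entry) < 3:
                return offsets  # truncated table
            offset, _generation, kind = entry
            if kind == b"n":  # "n" marks an in-use object, "f" a free one
                offsets[first + n] = int(offset)
            i += 3
    return offsets


def object_bytes(pdf_bytes: bytes, offset: int) -> bytes:
    """Return one indirect object, from its offset through 'endobj'."""
    end = pdf_bytes.find(b"endobj", offset)
    return pdf_bytes[offset:] if end == -1 else pdf_bytes[offset:end + len(b"endobj")]
```

With the offsets in hand, only the objects belonging to the requested portion need be passed on for examination.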


Regardless of the type or amount of suspicious characteristics located by the PDF parser 605, if one or more suspicious characteristics are determined to be included in at least a portion of the PDF network content, the at least a portion of PDF network content may be provided to one or more virtual machines for verification of malicious network content.


The one or more virtual machines may be selected from the virtual machine pool 145 based, in part, upon the particular PDF specification version number included in the header of the PDF network content being analyzed. In some embodiments, the one or more virtual machines may each include one or more augmented finite state machines. Each of these augmented finite state machines may include different configurations of computer readable instructions such as operating system instructions, web browser instructions, PDF reader application instructions, plug-in instructions for operatively coupling one or more web browsers to a PDF reader application, along with any additional types of computer readable instructions adapted to verify the inclusion of malicious network content in PDF network content.
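A minimal sketch of selecting virtual machines by specification version number follows. The VmProfile type, the pool contents, and the fallback policy are hypothetical placeholders introduced for illustration and do not describe the actual virtual machine pool 145.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VmProfile:
    name: str
    pdf_versions: frozenset  # specification versions the image's reader targets


# Hypothetical pool; names and version coverage are placeholders.
VM_POOL = (
    VmProfile("vm-reader-old", frozenset({"1.3", "1.4"})),
    VmProfile("vm-reader-mid", frozenset({"1.5", "1.6"})),
    VmProfile("vm-reader-new", frozenset({"1.7"})),
)


def pdf_version(pdf_bytes: bytes) -> Optional[str]:
    """Read the specification version number from the header, if present."""
    match = re.match(rb"%PDF-(\d\.\d)", pdf_bytes)
    return match.group(1).decode("ascii") if match else None


def select_vms(pdf_bytes: bytes, pool=VM_POOL) -> list:
    """Pick the profiles matching the header version, else the whole pool."""
    version = pdf_version(pdf_bytes)
    matching = [vm for vm in pool if version in vm.pdf_versions]
    # Fall back to every profile when the version is missing or unrecognized.
    return matching or list(pool)
```

Falling back to the whole pool is a design choice of this sketch: a missing or unusual version number may itself be suspicious, so broader coverage seems preferable to skipping verification.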


In some embodiments, the use of a plurality (i.e., two or more) of augmented finite state machines may allow for PDF network content to be processed across various system configurations (i.e., permutations of operating system instructions, web browser instructions, and PDF reader application instructions). As such, PDF network content that may not exploit vulnerabilities within one type of web browser application may be verified to exploit vulnerabilities within additional types of web browser applications. These types of multi-faceted analyses are more likely to verify the presence of malicious network content within PDF network content that exploits vulnerabilities of specific versions of programs and/or applications.
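The permutations of system configurations might be enumerated as in the following sketch. The function name build_configurations and the dictionary keys are assumptions made for illustration.

```python
from itertools import product


def build_configurations(operating_systems, browsers, readers):
    """Return every combination of operating system, web browser, and PDF reader."""
    return [
        {"os": os_name, "browser": browser, "reader": reader}
        for os_name, browser, reader in product(operating_systems, browsers, readers)
    ]
```

Each resulting configuration would correspond to one augmented finite state machine in which the PDF network content is processed.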


Each of the augmented finite state machines includes an analysis environment 150 to simulate the receipt, compiling, execution, and/or display of the PDF network content from the server device 105 to analyze the effects of the PDF network content upon the client device 110. For example, the analysis environment 150 may include a web browser requesting the PDF network content from a server device 105. The web browser causes the PDF reader application to parse the PDF network content. If the PDF network content does, in fact, include malicious code (e.g., JavaScript, Flash, etc.), one or more of the vulnerabilities of the web browser application or the PDF reader application may be exploited by the malicious code to deliver a payload. The effects of the payload may be directly or indirectly observed within the analysis environment 150.


It will be understood that an exemplary analysis environment 150 is described supra in greater detail with reference to FIG. 2 and may be adapted for observing the effects of parsing PDF network content determined to include one or more suspicious characteristics.


In some embodiments, the malicious network content detection system 600 may be further adapted to index PDF network content verified to include malicious network content by associating the PDF network content with an identifier indicative of one or more domains from which the PDF network content was obtained and storing the associated PDF network content as a record that resides with one or more databases operatively coupled to one or more server devices 105.


The database may be utilized by the malicious network content detection system 600 to compare PDF network content determined to include one or more suspicious characteristics to the index of PDF network content previously verified to include malicious network content. Additionally, all PDF network documents residing on domains determined to have hosted PDF network content verified to include malicious network content may be automatically reviewed without further examination of the PDF network content for suspicious characteristics.
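The indexing described above might be sketched as an in-memory structure keyed by a content digest. The class name MaliciousPdfIndex, the use of SHA-256 digests in place of stored content, and the method names are assumptions of this sketch; an actual embodiment would use one or more databases.

```python
import hashlib
from urllib.parse import urlsplit


class MaliciousPdfIndex:
    """In-memory stand-in for the database of verified malicious PDF content."""

    def __init__(self):
        self._records = {}     # SHA-256 digest -> set of source domains
        self._domains = set()  # domains known to have hosted malicious content

    def add(self, url: str, pdf_bytes: bytes) -> None:
        """Record verified malicious content together with its source domain."""
        domain = urlsplit(url).hostname or ""
        digest = hashlib.sha256(pdf_bytes).hexdigest()
        self._records.setdefault(digest, set()).add(domain)
        if domain:
            self._domains.add(domain)

    def is_known_malicious(self, pdf_bytes: bytes) -> bool:
        """Check whether identical content was previously verified as malicious."""
        return hashlib.sha256(pdf_bytes).hexdigest() in self._records

    def domain_has_hosted_malware(self, url: str) -> bool:
        """Check whether the URL's domain previously hosted malicious content."""
        return (urlsplit(url).hostname or "") in self._domains
```

A digest lookup only matches byte-identical content, so it complements rather than replaces the examination for suspicious characteristics.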


Referring now to FIG. 7, an exemplary method 700 for detecting malicious PDF network content is shown. The method 700 may include the step 705 of intercepting a request for at least a portion of PDF network content via a tap 115 operatively associated therewith. It will be understood that the tap 115 may intercept the at least a portion of PDF network content before the at least a portion of PDF network content is received by the web browser application.


Next, the method 700 may include the step 710 of examining the at least a portion of PDF network content to determine if one or more suspicious features and/or characteristics indicative of malicious network content are included in the at least a portion of PDF network content. As stated previously, examining may include utilizing heuristics or a PDF parser to determine the presence of specific features and/or specific vulnerable JavaScript code included with the at least a portion of PDF network content.


The method 700 may also include the step 715 of providing the at least a portion of PDF network content to one or more virtual machines (also known as augmented finite state machines).


Next, the inclusion of malicious network content in the at least a portion of PDF network content may be verified by executing or compiling the at least a portion of PDF network content in the one or more virtual machines in step 720. The compiling of the at least a portion of PDF network content causes vulnerable JavaScript code to execute any malicious network content associated therewith.


The method 700 then may include the step 725 of observing the performance of the one or more virtual machines to determine if the at least a portion of PDF network content, in fact, includes malicious network content. Observations of performance indicative of the execution of malicious network content include, but are not limited to, the generation of heap spray within the memory allocated to the PDF reader application, the execution of shellcode, consumption of memory, overwriting or corruption of system or program files, and the like.


Lastly, the method 700 may include the step 730 of preventing the delivery of the at least a portion of PDF network content verified to include malicious network content to the client device from which the request was received.
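The flow of steps 710 through 730 may be summarized by the sketch below, which assumes that the portion has already been intercepted at step 705. The function name method_700 and the callables passed to it are hypothetical stand-ins for the examination logic and for the virtual machines; each virtual machine callable is assumed to return True when malicious behavior is observed.

```python
from typing import Callable, Iterable


def method_700(
    pdf_portion: bytes,
    examine: Callable[[bytes], bool],
    virtual_machines: Iterable[Callable[[bytes], bool]],
) -> bool:
    """Return True if the portion may be delivered, False if delivery is prevented."""
    # Step 710: examine the portion for suspicious features or characteristics.
    if not examine(pdf_portion):
        return True
    # Steps 715 to 725: provide the portion to each virtual machine, which
    # processes it and reports whether malicious behavior was observed.
    for run_in_vm in virtual_machines:
        if run_in_vm(pdf_portion):
            # Step 730: prevent delivery to the requesting client device.
            return False
    return True
```

Content with no suspicious characteristics bypasses the virtual machines entirely, which keeps the more expensive verification for content that has already been flagged.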


The embodiments discussed herein are illustrative. As these embodiments are described with reference to illustrations, various modifications or adaptations of the methods and/or specific structures described may become apparent to those skilled in the art.


The above-described modules may be comprised of instructions that are stored on storage media (e.g., computer readable media). The instructions may be retrieved and executed by a processor (e.g., the processor 505). Some examples of instructions include software, program code, and firmware. Some examples of storage media comprise memory devices and integrated circuits. The instructions are operational when executed by the processor to direct the processor to operate in accordance with embodiments of the present invention. Those skilled in the art are familiar with instructions, processor(s), and storage media.


In the foregoing specification, the invention is described with reference to specific embodiments thereof, but those skilled in the art will recognize that the invention is not limited thereto. Various features and aspects of the above-described invention can be used individually or jointly. Further, the invention can be utilized in any number of environments and applications beyond those described herein without departing from the broader spirit and scope of the specification. The specification and drawings are, accordingly, to be regarded as illustrative rather than restrictive. It will be recognized that the terms “comprising,” “including,” and “having,” as used herein, are specifically intended to be read as open-ended terms of art.

Claims
  • 1. A system comprising: a hardware processor; and a memory device coupled to the processor, the memory device comprises a parser that, when executed by the hardware processor, examines one or more portions of a Portable Document Format (PDF) document to determine if one or more suspicious characteristics indicative of malicious network content are included in the one or more examined portions of the PDF document, wherein the one or more examined portions of the PDF document comprise less than an entirety of the PDF document, and one or more virtual machines to receive the PDF document in response to the one or more examined portions of the PDF document being determined to include one or more suspicious characteristics indicative of malicious network content, the one or more virtual machines to process at least the one or more examined portions of the PDF document so as to determine whether the PDF document includes malicious network content.
  • 2. The system of claim 1, wherein the PDF document is received by the system over a communication network.
  • 3. The system of claim 2, wherein a body portion of the PDF document is examined and the entirety of the PDF document is not examined prior to providing the PDF document to the one or more virtual machines.
  • 4. The system of claim 3, wherein at least one of a header, a cross-reference table, or a trailer of the PDF document is not examined prior to providing the PDF document to the one or more virtual machines.
  • 5. The system of claim 2 being communicatively coupled to a digital data tap being a device connected to the communication network to intercept the PDF document being transmitted from a web server and to provide at least the PDF document to the system.
  • 6. The system of claim 5, wherein the digital data tap being configured to monitor network data over the communication network and provide the PDF document within the network data as the PDF document to the parser being executed by the hardware processor.
  • 7. The system of claim 1, wherein the parser examines the one or more portions of the PDF document by at least: examining one or more of a header section, a body section, a trailer section, or a cross-reference table section of the PDF document; and providing the PDF document to the one or more virtual machines to verify the inclusion of malicious network content in the PDF document when one or more of the body section, the header section, the trailer section or the cross-reference table section of the PDF document is determined to include one or more suspicious characteristics indicative of malicious network content.
  • 8. The system of claim 7, wherein the cross-reference table allows a PDF reader application to parse pages of the PDF document without a need of parsing the PDF document in its entirety.
  • 9. The system of claim 7, wherein the cross-reference table of the PDF document includes offset information indicative of a portion of objects within content of the PDF document.
  • 10. The system of claim 9, wherein the body of the PDF document includes one or more objects included as part of the content of the PDF document.
  • 11. The system of claim 1, wherein the one or more virtual machines are configured based on one or more specification version numbers of the PDF document.
  • 12. The system of claim 11, wherein a specification version number of the one or more specification version numbers is used to identify a reader application version to be run in a virtual machine of the one or more virtual machines.
  • 13. The system of claim 1, wherein the one or more examined portions of the PDF document comprises a body section and a header section of the PDF document received over a communication network.
  • 14. The system of claim 1, wherein a determination by the parser if the one or more suspicious characteristics indicative of malicious network content are included in the one or more examined portions of the PDF document, comprises: determining a score associated with the one or more suspicious characteristics for the PDF document, the score indicative of a probability that the PDF document includes malicious network content; and identifying the PDF document as suspicious if the score satisfies a threshold value.
  • 15. The system of claim 1, wherein the parser, when executed by the processor, examines the PDF document by at least applying heuristics to determine if at least one suspicious characteristic indicative of malicious network content is included in the PDF document.
  • 16. The system of claim 1, wherein the memory further comprises a module that, when executed by the processor, prevents the delivery of the PDF document verified to include malicious network content to a web browser application from which the delivery was requested.
  • 17. The system of claim 1, wherein the parser examines the one or more portions of the PDF document by at least examining a header section of the PDF document and the entirety of the PDF document is not examined prior to providing the PDF document to the one or more virtual machines; and when the header section of the PDF document is determined to include one or more suspicious characteristics indicative of malicious network content, providing the PDF document to the one or more virtual machines to verify the inclusion of malicious network content in the PDF document.
  • 18. The system of claim 1, wherein the parser examines the one or more portions of the PDF document by at least examining a body section of the PDF document; and when the body section of the PDF document is determined to include one or more suspicious characteristics indicative of malicious network content, providing the PDF document to the one or more virtual machines to verify the inclusion of malicious network content in the PDF document.
  • 19. The system of claim 1, wherein the one or more virtual machines are configured based on at least data associated with the PDF document in response to the one or more examined portions of the PDF document being determined to include one or more suspicious characteristics indicative of malicious network content.
  • 20. The system of claim 1, wherein the PDF document is included as part of an HTML document.
  • 21. The system of claim 1, wherein the parser, upon execution by the processor, examines the PDF document by decoding content within the PDF document prior to parsing content within the PDF document.
  • 22. The system of claim 1, wherein the parser is configured to determine the one or more suspicious characteristics including examining the one or more examined portions of the PDF document for a specific type or types of JavaScript code.
  • 23. The system of claim 1, wherein the one or more virtual machines process at least the one or more examined portions of the PDF document using both a PDF reader application and a web browser application.
  • 24. The system of claim 23, wherein the PDF reader application and the web browser application cooperate with each other via a plug-in.
  • 25. The system of claim 1, wherein a first virtual machine of the one or more virtual machines processes at least the one or more examined portions of the PDF document using a PDF reader application and at least a second application operating in cooperation with the PDF reader application.
  • 26. The system of claim 25, wherein the second application operates in cooperation with the PDF reader application via a plug-in.
  • 27. The system of claim 26, wherein the second application includes a web browser application.
  • 28. A non-transitory computer readable storage medium storing software that, upon execution by a processor, detects malware within a Portable Document Format (PDF) document, the non-transitory computer readable storage medium comprising: a parser that, when executed by the processor, examines one or more portions of the PDF document to determine if one or more suspicious characteristics indicative of malicious network content are included in the one or more examined portions of the PDF document, wherein the one or more examined portions of the PDF document comprise less than an entirety of the PDF document, and one or more virtual machines to receive the PDF document in response to the one or more examined portions of the PDF document being determined to include one or more suspicious characteristics indicative of malicious network content, the one or more virtual machines to process at least the one or more examined portions of the PDF document so as to determine whether the PDF document includes malicious network content.
  • 29. The non-transitory computer readable storage medium of claim 28, wherein the parser determines if the one or more suspicious characteristics indicative of malicious network content are included in the one or more examined portions of the PDF document by at least: determining a score associated with the one or more suspicious characteristics for the PDF document, the score indicative of a probability that the PDF document includes malicious network content; and identifying the PDF document as suspicious if the score satisfies a threshold value.
  • 30. The non-transitory computer readable storage medium of claim 28, wherein the PDF document is received by the system over a communication network.
  • 31. The non-transitory computer readable storage medium of claim 28, wherein at least one of a header, a cross-reference table, or a trailer of the PDF document is not examined prior to providing the PDF document to the one or more virtual machines.
  • 32. The non-transitory computer readable storage medium of claim 28, wherein an examination of the one or more portions of the PDF document by the parser, when executed by the processor, further comprises: examining at least one of a header section or a body section of the PDF document; and when the body section or the header section of the PDF document is determined to include one or more suspicious characteristics indicative of malicious network content, providing the PDF document to the one or more virtual machines to verify the inclusion of malicious network content in the PDF document.
  • 33. The non-transitory computer readable storage medium of claim 28, wherein the parser examines a body portion of the PDF document and the entirety of the PDF document is not examined prior to providing the PDF document to the one or more virtual machines.
  • 34. The non-transitory computer readable storage medium of claim 28 wherein the parser, when executed by the processor, examines the PDF document by at least applying heuristics to determine if at least one suspicious characteristic indicative of malicious network content is included in the PDF document.
  • 35. The non-transitory computer readable storage medium of claim 28, wherein the medium further comprises a module that, when executed by the processor, prevents the delivery of the PDF document verified to include malicious network content to a web browser application from which the delivery was requested.
  • 36. The non-transitory computer readable storage medium of claim 28, wherein the one or more virtual machines are configured based on at least data associated with the PDF document in response to the one or more examined portions of the PDF document being determined to include one or more suspicious characteristics indicative of malicious network content.
  • 37. The non-transitory computer readable storage medium of claim 36, wherein the one or more virtual machines are configured based on one or more specification version numbers of the PDF document.
  • 38. The non-transitory computer readable storage medium of claim 37, wherein a specification version number of the one or more specification version numbers is used to identify a specific reader application version to be run in a virtual machine of the one or more virtual machines.
  • 39. The non-transitory computer readable storage medium of claim 28, wherein the one or more examined portions of the PDF document comprises a body section and a header section of the PDF document.
  • 40. The non-transitory computer readable storage medium of claim 28, wherein the parser examines a header of the PDF document and the entirety of the PDF document is not examined prior to providing the PDF document to the one or more virtual machines.
  • 41. The non-transitory computer readable storage medium of claim 28, wherein a first virtual machine of the one or more virtual machines processes at least the one or more examined portions of the PDF document using a PDF reader application and at least a second application operating in cooperation with the PDF reader application.
  • 42. The non-transitory computer readable storage medium of claim 41, wherein the second application operates in cooperation with the PDF reader application via a plug-in.
  • 43. The non-transitory computer readable storage medium of claim 42, wherein the second application includes a web browser application.
  • 44. The non-transitory computer readable storage medium of claim 28, wherein the PDF document is included as part of an HTML document.
  • 45. The non-transitory computer readable storage medium of claim 28, wherein the parser, upon execution by the processor, examines the PDF document by decoding content within the PDF document prior to parsing content within the PDF document.
  • 46. The non-transitory computer readable storage medium of claim 28, wherein the parser is configured to determine the one or more suspicious characteristics including examining the one or more examined portions of the PDF document for a specific type or types of JavaScript code.
  • 47. A computerized method, comprising: examining, by a parser executed by a hardware processor, one or more portions of a Portable Document Format (PDF) document to determine if one or more suspicious characteristics indicative of malicious network content are included in the one or more examined portions of the PDF document, wherein the one or more examined portions of the PDF document comprise less than an entirety of the PDF document, and upon the one or more examined portions of the PDF document being determined to include one or more suspicious characteristics indicative of malicious network content, receiving the PDF document by one or more virtual machines to process at least the one or more examined portions of the PDF document so as to determine whether the PDF document includes malicious network content.
  • 48. The method of claim 47, wherein the examining of the one or more portions of the PDF document comprises: determining a score associated with the one or more suspicious characteristics for the PDF document, the score indicative of a probability that the PDF document includes malicious network content; and identifying the PDF document as suspicious if the score satisfies a threshold value.
  • 49. The method of claim 47, wherein prior to the examining of the one or more portions of the PDF document, the method further comprises receiving the PDF document over a communication network.
  • 50. The method of claim 47, wherein the examining of the one or more portions of the PDF document excludes from an examination of at least one of a header, a cross-reference table, or a trailer of the PDF document prior to providing the PDF document to the one or more virtual machines.
  • 51. The method of claim 47, the examining of the one or more portions of the PDF document comprises: examining at least one of a header section or a body section of the PDF document; and when the body section or the header section of the PDF document is determined to include one or more suspicious characteristics indicative of malicious network content, providing the PDF document to the one or more virtual machines to verify the inclusion of malicious network content in the PDF document.
  • 52. The method of claim 47 wherein the examining of the one or more portions of the PDF document comprises examining the PDF document by at least applying heuristics to determine if at least one suspicious characteristic indicative of malicious network content is included in the PDF document.
  • 53. The method of claim 47 further comprising: preventing the delivery of the PDF document verified to include malicious network content to a web browser application from which the delivery was requested.
  • 54. The method of claim 47, wherein the one or more virtual machines are configured based on at least data associated with the PDF document in response to the one or more examined portions of the PDF documents being determined to include the one or more suspicious characteristics.
  • 55. The method of claim 54, wherein the one or more virtual machines are configured based on one or more specification version numbers of the PDF document.
  • 56. The method of claim 55, wherein a specification version number of the one or more specification version numbers is used to identify a specific reader application version to be run in a virtual machine of the one or more virtual machines.
  • 57. The method of claim 47, wherein the one or more examined portions of the PDF document comprises a body section and a header section of the PDF document.
  • 58. The method of claim 47, wherein the examining of the one or more portions of the PDF document comprises examining a header of the PDF document and the entirety of the PDF document is not examined prior to the PDF document being received by the one or more virtual machines.
  • 59. The method of claim 47, wherein prior to examining the one or more portions of the PDF document, the method further comprising receiving the PDF document from a web server via a network.
  • 60. The method of claim 59, wherein prior to receiving of the PDF document, monitoring network data being transmitted over the network between the web server and a client device and intercepting the PDF document being sent to the client device from the web server.
  • 61. The method of claim 60, wherein the intercepting of the PDF document includes providing a copy of the PDF document as the PDF document to the parser.
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 14/673,292 filed Mar. 30, 2015, now U.S. Pat. No. 9,438,622 issued Sep. 6, 2016, entitled “Systems and Methods for Analyzing Malicious PDF Network Content,” which is a continuation of U.S. patent application Ser. No. 13/011,344 filed on Jan. 21, 2011, now U.S. Pat. No. 8,997,219 issued Mar. 31, 2015, which is a continuation-in-part of U.S. patent application Ser. No. 12/263,971 entitled “Systems and Methods for Detecting Malicious Network Content” and filed on Nov. 3, 2008, now U.S. Pat. No. 8,850,571 issued Sep. 30, 2014. This application is also related to co-pending U.S. patent application Ser. No. 11/409,355 entitled “Heuristic Based Capture with Replay to Virtual Machine” and filed on Apr. 20, 2006, which is a continuation-in-part of U.S. patent application Ser. No. 11/152,286 entitled “Computer Worm Defense System and Method” and filed on Jun. 13, 2005, which claims the priority benefit of U.S. Provisional Patent Application Ser. No. 60/579,910 entitled “Computer Worm Defense System and Method” and filed on Jun. 14, 2004. U.S. patent application Ser. No. 11/409,355 is also a continuation-in-part of U.S. patent application Ser. No. 11/096,287 entitled “System and Method of Detecting Computer Worms” and filed on Mar. 31, 2005, which claims the priority benefit of U.S. Provisional Patent Application Ser. No. 60/559,198 entitled “System and Method of Detecting Computer Worms” and filed on Apr. 1, 2004. U.S. patent application Ser. No. 11/409,355 is also a continuation-in-part of U.S. patent application Ser. No. 11/151,812 entitled “System and Method of Containing Computer Worms” and filed on Jun. 13, 2005, which claims the priority benefit of U.S. Provisional Patent Application No. 60/579,953 entitled “System and Method of Containing Computer Worms” and filed on Jun. 14, 2004. Each of the aforementioned patent applications is incorporated by reference herein.

US Referenced Citations (515)
Number Name Date Kind
4292580 Ott et al. Sep 1981 A
5175732 Hendel et al. Dec 1992 A
5440723 Arnold et al. Aug 1995 A
5490249 Miller Feb 1996 A
5537540 Miller et al. Jul 1996 A
5603027 Ohkami Feb 1997 A
5657473 Killean et al. Aug 1997 A
5842002 Schnurer et al. Nov 1998 A
5978917 Chi Nov 1999 A
6088803 Tso et al. Jul 2000 A
6094677 Capek et al. Jul 2000 A
6108799 Boulay et al. Aug 2000 A
6118382 Hibbs et al. Sep 2000 A
6269330 Cidon et al. Jul 2001 B1
6272641 Ji Aug 2001 B1
6279113 Vaidya Aug 2001 B1
6298445 Shostack et al. Oct 2001 B1
6357008 Nachenberg Mar 2002 B1
6417774 Hibbs et al. Jul 2002 B1
6424627 Sørhaug et al. Jul 2002 B1
6442696 Wray et al. Aug 2002 B1
6484315 Ziese Nov 2002 B1
6487666 Shanklin et al. Nov 2002 B1
6493756 O'Brien et al. Dec 2002 B1
6550012 Villa et al. Apr 2003 B1
6700497 Hibbs et al. Mar 2004 B2
6775657 Baker Aug 2004 B1
6831893 Ben Nun et al. Dec 2004 B1
6832367 Choi et al. Dec 2004 B1
6895550 Kanchirayappa et al. May 2005 B2
6898632 Gordy et al. May 2005 B2
6907396 Muttik et al. Jun 2005 B1
6941348 Petry et al. Sep 2005 B2
6971097 Wallman Nov 2005 B1
6981279 Arnold et al. Dec 2005 B1
6995665 Appelt et al. Feb 2006 B2
7007107 Ivchenko et al. Feb 2006 B1
7028179 Anderson et al. Apr 2006 B2
7043757 Hoefelmeyer et al. May 2006 B2
7069316 Gryaznov Jun 2006 B1
7080407 Zhao et al. Jul 2006 B1
7080408 Pak et al. Jul 2006 B1
7093002 Wolff et al. Aug 2006 B2
7093239 van der Made Aug 2006 B1
7096498 Judge Aug 2006 B2
7100201 Izatt Aug 2006 B2
7107617 Hursey et al. Sep 2006 B2
7159149 Spiegel et al. Jan 2007 B2
7213260 Judge May 2007 B2
7231667 Jordan Jun 2007 B2
7240364 Branscomb et al. Jul 2007 B1
7240368 Roesch et al. Jul 2007 B1
7243371 Kasper et al. Jul 2007 B1
7249175 Donaldson Jul 2007 B1
7251215 Turner et al. Jul 2007 B1
7287278 Liang Oct 2007 B2
7308716 Danford et al. Dec 2007 B2
7328453 Merkle, Jr. et al. Feb 2008 B2
7346486 Ivancic et al. Mar 2008 B2
7356736 Natvig Apr 2008 B2
7386888 Liang et al. Jun 2008 B2
7392542 Bucher Jun 2008 B2
7418729 Szor Aug 2008 B2
7428300 Drew et al. Sep 2008 B1
7441272 Durham et al. Oct 2008 B2
7448084 Apap et al. Nov 2008 B1
7458098 Judge et al. Nov 2008 B2
7464404 Carpenter et al. Dec 2008 B2
7464407 Nakae et al. Dec 2008 B2
7467408 O'Toole, Jr. Dec 2008 B1
7478428 Thomlinson Jan 2009 B1
7480773 Reed Jan 2009 B1
7487543 Arnold et al. Feb 2009 B2
7496960 Chen et al. Feb 2009 B1
7496961 Zimmer et al. Feb 2009 B2
7516488 Kienzle et al. Apr 2009 B1
7519990 Xie Apr 2009 B1
7523493 Liang et al. Apr 2009 B2
7530104 Thrower et al. May 2009 B1
7540025 Tzadikario May 2009 B2
7546638 Anderson et al. Jun 2009 B2
7565550 Liang et al. Jul 2009 B2
7568233 Szor et al. Jul 2009 B1
7584455 Ball Sep 2009 B2
7603715 Costa et al. Oct 2009 B2
7607171 Marsden et al. Oct 2009 B1
7639714 Stolfo et al. Dec 2009 B2
7644441 Schmid et al. Jan 2010 B2
7657419 van der Made Feb 2010 B2
7676841 Sobchuk et al. Mar 2010 B2
7698548 Shelest et al. Apr 2010 B2
7707633 Danford et al. Apr 2010 B2
7712136 Sprosts et al. May 2010 B2
7730011 Deninger et al. Jun 2010 B1
7739740 Nachenberg et al. Jun 2010 B1
7779463 Stolfo et al. Aug 2010 B2
7784097 Stolfo et al. Aug 2010 B1
7832008 Kraemer Nov 2010 B1
7836502 Zhao et al. Nov 2010 B1
7849506 Dansey et al. Dec 2010 B1
7854007 Sprosts et al. Dec 2010 B2
7869073 Oshima Jan 2011 B2
7877803 Enstone et al. Jan 2011 B2
7904959 Sidiroglou et al. Mar 2011 B2
7908660 Bahl Mar 2011 B2
7930738 Petersen Apr 2011 B1
7937761 Bennett May 2011 B1
7949849 Lowe et al. May 2011 B2
7996556 Raghavan et al. Aug 2011 B2
7996836 McCorkendale et al. Aug 2011 B1
7996904 Chiueh et al. Aug 2011 B1
7996905 Arnold et al. Aug 2011 B2
8006305 Aziz Aug 2011 B2
8010667 Zhang et al. Aug 2011 B2
8020206 Hubbard et al. Sep 2011 B2
8028338 Schneider et al. Sep 2011 B1
8042184 Batenin Oct 2011 B1
8045094 Teragawa Oct 2011 B2
8045458 Alperovitch et al. Oct 2011 B2
8069484 McMillan et al. Nov 2011 B2
8087086 Lai et al. Dec 2011 B1
8171553 Aziz et al. May 2012 B2
8176049 Deninger et al. May 2012 B2
8176480 Spertus May 2012 B1
8201246 Wu Jun 2012 B1
8204984 Aziz et al. Jun 2012 B1
8214905 Doukhvalov et al. Jul 2012 B1
8220055 Kennedy Jul 2012 B1
8225288 Miller et al. Jul 2012 B2
8225373 Kraemer Jul 2012 B2
8233882 Rogel Jul 2012 B2
8234640 Fitzgerald et al. Jul 2012 B1
8234709 Viljoen et al. Jul 2012 B2
8239944 Nachenberg et al. Aug 2012 B1
8260914 Ranjan Sep 2012 B1
8266091 Gubin et al. Sep 2012 B1
8286251 Eker et al. Oct 2012 B2
8291499 Aziz et al. Oct 2012 B2
8307435 Mann et al. Nov 2012 B1
8307443 Wang et al. Nov 2012 B2
8312545 Tuvell et al. Nov 2012 B2
8321936 Green et al. Nov 2012 B1
8321941 Tuvell et al. Nov 2012 B2
8332571 Edwards, Sr. Dec 2012 B1
8365286 Poston Jan 2013 B2
8365297 Parshin et al. Jan 2013 B1
8370938 Daswani Feb 2013 B1
8370939 Zaitsev et al. Feb 2013 B2
8375444 Aziz et al. Feb 2013 B2
8381299 Stolfo et al. Feb 2013 B2
8402529 Green et al. Mar 2013 B1
8464340 Ahn et al. Jun 2013 B2
8479174 Chiriac Jul 2013 B2
8479276 Vaystikh et al. Jul 2013 B1
8479291 Bodke Jul 2013 B1
8510827 Leake et al. Aug 2013 B1
8510828 Guo et al. Aug 2013 B1
8510842 Amit et al. Aug 2013 B2
8516478 Edwards et al. Aug 2013 B1
8516590 Ranadive et al. Aug 2013 B1
8516593 Aziz Aug 2013 B2
8522348 Chen et al. Aug 2013 B2
8528086 Aziz Sep 2013 B1
8533824 Hutton et al. Sep 2013 B2
8539582 Aziz et al. Sep 2013 B1
8549638 Aziz Oct 2013 B2
8555391 Demir et al. Oct 2013 B1
8561177 Aziz et al. Oct 2013 B1
8566946 Aziz et al. Oct 2013 B1
8584094 Dadhia et al. Nov 2013 B2
8584234 Sobel et al. Nov 2013 B1
8584239 Aziz et al. Nov 2013 B2
8595834 Xie et al. Nov 2013 B2
8627476 Satish et al. Jan 2014 B1
8635696 Aziz Jan 2014 B1
8682054 Xue et al. Mar 2014 B2
8682812 Ranjan Mar 2014 B1
8689333 Aziz Apr 2014 B2
8695096 Zhang Apr 2014 B1
8713631 Pavlyushchik Apr 2014 B1
8713681 Silberman et al. Apr 2014 B2
8726392 McCorkendale et al. May 2014 B1
8739280 Chess et al. May 2014 B2
8776229 Aziz Jul 2014 B1
8782792 Bodke Jul 2014 B1
8789172 Stolfo et al. Jul 2014 B2
8789178 Kejriwal et al. Jul 2014 B2
8793787 Ismael et al. Jul 2014 B2
8805947 Kuzkin et al. Aug 2014 B1
8806647 Daswani et al. Aug 2014 B1
8832829 Manni et al. Sep 2014 B2
8850570 Ramzan Sep 2014 B1
8850571 Staniford Sep 2014 B2
8881234 Narasimhan et al. Nov 2014 B2
8881282 Aziz et al. Nov 2014 B1
8898788 Aziz et al. Nov 2014 B1
8935779 Manni et al. Jan 2015 B2
8984638 Aziz et al. Mar 2015 B1
8990939 Staniford et al. Mar 2015 B2
8990944 Singh et al. Mar 2015 B1
8997219 Staniford Mar 2015 B2
9009822 Ismael et al. Apr 2015 B1
9009823 Ismael et al. Apr 2015 B1
9027135 Aziz May 2015 B1
9071638 Aziz et al. Jun 2015 B1
9104867 Thioux et al. Aug 2015 B1
9106694 Aziz et al. Aug 2015 B2
9118715 Staniford Aug 2015 B2
9438622 Staniford Sep 2016 B1
20010005889 Albrecht Jun 2001 A1
20010047326 Broadbent et al. Nov 2001 A1
20020018903 Kokubo et al. Feb 2002 A1
20020038430 Edwards et al. Mar 2002 A1
20020091819 Melchione et al. Jul 2002 A1
20020095607 Lin-Hendel Jul 2002 A1
20020116627 Tarbotton et al. Aug 2002 A1
20020144156 Copeland Oct 2002 A1
20020162015 Tang Oct 2002 A1
20020166063 Lachman et al. Nov 2002 A1
20020169952 DiSanto et al. Nov 2002 A1
20020184528 Shevenell et al. Dec 2002 A1
20020188887 Largman et al. Dec 2002 A1
20020194490 Halperin et al. Dec 2002 A1
20030021728 Sharpe et al. Jan 2003 A1
20030074578 Ford et al. Apr 2003 A1
20030084318 Schertz May 2003 A1
20030101381 Mateev et al. May 2003 A1
20030115483 Liang Jun 2003 A1
20030188190 Aaron et al. Oct 2003 A1
20030191957 Hypponen et al. Oct 2003 A1
20030200460 Morota et al. Oct 2003 A1
20030212902 van der Made Nov 2003 A1
20030229801 Kouznetsov et al. Dec 2003 A1
20030237000 Denton et al. Dec 2003 A1
20040003323 Bennett et al. Jan 2004 A1
20040006473 Mills et al. Jan 2004 A1
20040015712 Szor Jan 2004 A1
20040019832 Arnold et al. Jan 2004 A1
20040047356 Bauer Mar 2004 A1
20040083408 Spiegel et al. Apr 2004 A1
20040088581 Brawn et al. May 2004 A1
20040093513 Cantrell et al. May 2004 A1
20040111531 Staniford et al. Jun 2004 A1
20040117478 Triulzi et al. Jun 2004 A1
20040117624 Brandt et al. Jun 2004 A1
20040128355 Chao et al. Jul 2004 A1
20040165588 Pandya Aug 2004 A1
20040236963 Danford et al. Nov 2004 A1
20040243349 Greifeneder et al. Dec 2004 A1
20040249911 Alkhatib et al. Dec 2004 A1
20040255161 Cavanaugh Dec 2004 A1
20040268147 Wiederin et al. Dec 2004 A1
20050005159 Oliphant Jan 2005 A1
20050021740 Bar et al. Jan 2005 A1
20050033960 Vialen et al. Feb 2005 A1
20050033989 Poletto et al. Feb 2005 A1
20050050148 Mohammadioun et al. Mar 2005 A1
20050086523 Zimmer et al. Apr 2005 A1
20050091513 Mitomo et al. Apr 2005 A1
20050091533 Omote et al. Apr 2005 A1
20050091652 Ross et al. Apr 2005 A1
20050108562 Khazan et al. May 2005 A1
20050114663 Cornell et al. May 2005 A1
20050125195 Brendel Jun 2005 A1
20050149726 Joshi et al. Jul 2005 A1
20050157662 Bingham et al. Jul 2005 A1
20050183143 Anderholm et al. Aug 2005 A1
20050201297 Peikari Sep 2005 A1
20050210533 Copeland et al. Sep 2005 A1
20050238005 Chen et al. Oct 2005 A1
20050240781 Gassoway Oct 2005 A1
20050262562 Gassoway Nov 2005 A1
20050265331 Stolfo Dec 2005 A1
20050283839 Cowburn Dec 2005 A1
20060010495 Cohen et al. Jan 2006 A1
20060015416 Hoffman et al. Jan 2006 A1
20060015715 Anderson Jan 2006 A1
20060015747 Van de Ven Jan 2006 A1
20060021029 Brickell et al. Jan 2006 A1
20060021054 Costa et al. Jan 2006 A1
20060031476 Mathes et al. Feb 2006 A1
20060047665 Neil Mar 2006 A1
20060070130 Costea et al. Mar 2006 A1
20060075496 Carpenter et al. Apr 2006 A1
20060095968 Portolani et al. May 2006 A1
20060101516 Sudaharan et al. May 2006 A1
20060101517 Banzhof et al. May 2006 A1
20060117385 Mester et al. Jun 2006 A1
20060123477 Raghavan et al. Jun 2006 A1
20060143709 Brooks et al. Jun 2006 A1
20060150249 Gassen et al. Jul 2006 A1
20060161983 Cothrell et al. Jul 2006 A1
20060161987 Levy-Yurista Jul 2006 A1
20060161989 Reshef et al. Jul 2006 A1
20060164199 Gilde et al. Jul 2006 A1
20060173992 Weber et al. Aug 2006 A1
20060179147 Tran et al. Aug 2006 A1
20060184632 Marino et al. Aug 2006 A1
20060191010 Benjamin Aug 2006 A1
20060221956 Narayan et al. Oct 2006 A1
20060236393 Kramer et al. Oct 2006 A1
20060242709 Seinfeld et al. Oct 2006 A1
20060248519 Jaeger et al. Nov 2006 A1
20060248582 Panjwani et al. Nov 2006 A1
20060251104 Koga Nov 2006 A1
20060288417 Bookbinder et al. Dec 2006 A1
20070006288 Mayfield et al. Jan 2007 A1
20070006313 Porras et al. Jan 2007 A1
20070011174 Takaragi et al. Jan 2007 A1
20070016951 Piccard et al. Jan 2007 A1
20070033645 Jones Feb 2007 A1
20070038943 FitzGerald et al. Feb 2007 A1
20070064689 Shin et al. Mar 2007 A1
20070074169 Chess et al. Mar 2007 A1
20070094730 Bhikkaji et al. Apr 2007 A1
20070101435 Konanka et al. May 2007 A1
20070128855 Cho et al. Jun 2007 A1
20070142030 Sinha et al. Jun 2007 A1
20070143827 Nicodemus et al. Jun 2007 A1
20070156895 Vuong Jul 2007 A1
20070157180 Tillmann et al. Jul 2007 A1
20070157306 Elrod et al. Jul 2007 A1
20070168988 Eisner et al. Jul 2007 A1
20070171824 Ruello et al. Jul 2007 A1
20070174915 Gribble et al. Jul 2007 A1
20070192500 Lum Aug 2007 A1
20070192858 Lum Aug 2007 A1
20070198275 Malden et al. Aug 2007 A1
20070208822 Wang et al. Sep 2007 A1
20070220607 Sprosts et al. Sep 2007 A1
20070240218 Tuvell et al. Oct 2007 A1
20070240219 Tuvell et al. Oct 2007 A1
20070240220 Tuvell et al. Oct 2007 A1
20070240222 Tuvell et al. Oct 2007 A1
20070250930 Aziz Oct 2007 A1
20070256132 Oliphant Nov 2007 A2
20070271446 Nakamura Nov 2007 A1
20080005782 Aziz Jan 2008 A1
20080018122 Zierler et al. Jan 2008 A1
20080028463 Dagon et al. Jan 2008 A1
20080032556 Schreier Feb 2008 A1
20080040710 Chiriac Feb 2008 A1
20080046781 Childs et al. Feb 2008 A1
20080066179 Liu Mar 2008 A1
20080072326 Danford et al. Mar 2008 A1
20080077793 Tan et al. Mar 2008 A1
20080080518 Hoeflin et al. Apr 2008 A1
20080086720 Lekel Apr 2008 A1
20080098476 Syversen Apr 2008 A1
20080120722 Sima et al. May 2008 A1
20080134178 Fitzgerald et al. Jun 2008 A1
20080134334 Kim et al. Jun 2008 A1
20080141376 Clausen Jun 2008 A1
20080184373 Traut et al. Jul 2008 A1
20080189787 Arnold et al. Aug 2008 A1
20080201778 Guo et al. Aug 2008 A1
20080209557 Herley et al. Aug 2008 A1
20080215742 Goldszmidt et al. Sep 2008 A1
20080222728 Chavez et al. Sep 2008 A1
20080222729 Chen et al. Sep 2008 A1
20080263665 Ma et al. Oct 2008 A1
20080295172 Bohacek Nov 2008 A1
20080301810 Lehane et al. Dec 2008 A1
20080307524 Singh et al. Dec 2008 A1
20080313738 Enderby Dec 2008 A1
20080320594 Jiang Dec 2008 A1
20090003317 Kasralikar et al. Jan 2009 A1
20090004427 Sarrafi-Nour et al. Jan 2009 A1
20090007100 Field et al. Jan 2009 A1
20090013408 Schipka Jan 2009 A1
20090031423 Liu et al. Jan 2009 A1
20090036111 Danford et al. Feb 2009 A1
20090037835 Goldman Feb 2009 A1
20090044024 Oberheide et al. Feb 2009 A1
20090044274 Budko et al. Feb 2009 A1
20090064332 Porras et al. Mar 2009 A1
20090077666 Chen et al. Mar 2009 A1
20090083369 Marmor Mar 2009 A1
20090083855 Apap et al. Mar 2009 A1
20090089879 Wang et al. Apr 2009 A1
20090094697 Provos et al. Apr 2009 A1
20090113425 Ports et al. Apr 2009 A1
20090125976 Wassermann et al. May 2009 A1
20090126015 Monastyrsky et al. May 2009 A1
20090126016 Sobko et al. May 2009 A1
20090133125 Choi et al. May 2009 A1
20090144823 Lamastra et al. Jun 2009 A1
20090158430 Borders Jun 2009 A1
20090172815 Gu et al. Jul 2009 A1
20090187992 Poston Jul 2009 A1
20090193293 Stolfo et al. Jul 2009 A1
20090199296 Xie et al. Aug 2009 A1
20090228233 Anderson et al. Sep 2009 A1
20090241187 Troyansky Sep 2009 A1
20090241190 Todd et al. Sep 2009 A1
20090265692 Godefroid et al. Oct 2009 A1
20090271867 Zhang Oct 2009 A1
20090300415 Zhang et al. Dec 2009 A1
20090300761 Park et al. Dec 2009 A1
20090328185 Berg et al. Dec 2009 A1
20090328221 Blumfield et al. Dec 2009 A1
20100005146 Drako et al. Jan 2010 A1
20100011205 McKenna Jan 2010 A1
20100017546 Poo et al. Jan 2010 A1
20100031353 Thomas et al. Feb 2010 A1
20100037314 Perdisci et al. Feb 2010 A1
20100043073 Kuwamura Feb 2010 A1
20100054278 Stolfo et al. Mar 2010 A1
20100058474 Hicks Mar 2010 A1
20100064044 Nonoyama Mar 2010 A1
20100077481 Polyakov et al. Mar 2010 A1
20100083376 Pereira et al. Apr 2010 A1
20100100718 Srinivasan Apr 2010 A1
20100115621 Staniford et al. May 2010 A1
20100132038 Zaitsev May 2010 A1
20100154056 Smith et al. Jun 2010 A1
20100180344 Malyshev et al. Jul 2010 A1
20100192223 Ismael et al. Jul 2010 A1
20100220863 Dupaquis et al. Sep 2010 A1
20100235831 Dittmer Sep 2010 A1
20100251104 Massand Sep 2010 A1
20100281102 Chinta et al. Nov 2010 A1
20100281541 Stolfo et al. Nov 2010 A1
20100281542 Stolfo et al. Nov 2010 A1
20100287260 Peterson et al. Nov 2010 A1
20100299754 Amit et al. Nov 2010 A1
20100306173 Frank Dec 2010 A1
20110004737 Greenebaum Jan 2011 A1
20110025504 Lyon et al. Feb 2011 A1
20110041179 Ståhlberg Feb 2011 A1
20110047594 Mahaffey et al. Feb 2011 A1
20110047620 Mahaffey et al. Feb 2011 A1
20110055907 Narasimhan et al. Mar 2011 A1
20110078794 Manni et al. Mar 2011 A1
20110093951 Aziz Apr 2011 A1
20110099620 Stavrou et al. Apr 2011 A1
20110099633 Aziz Apr 2011 A1
20110113231 Kaminsky May 2011 A1
20110145918 Jung et al. Jun 2011 A1
20110145920 Mahaffey et al. Jun 2011 A1
20110145934 Abramovici et al. Jun 2011 A1
20110167493 Song et al. Jul 2011 A1
20110167494 Bowen et al. Jul 2011 A1
20110173460 Ito et al. Jul 2011 A1
20110219449 St. Neitzel et al. Sep 2011 A1
20110219450 McDougal et al. Sep 2011 A1
20110225624 Sawhney et al. Sep 2011 A1
20110225655 Niemela et al. Sep 2011 A1
20110247072 Staniford et al. Oct 2011 A1
20110265182 Peinado et al. Oct 2011 A1
20110289582 Kejriwal et al. Nov 2011 A1
20110302587 Nishikawa et al. Dec 2011 A1
20110307954 Melnik et al. Dec 2011 A1
20110307955 Kaplan et al. Dec 2011 A1
20110307956 Yermakov et al. Dec 2011 A1
20110314546 Aziz et al. Dec 2011 A1
20120023593 Puder Jan 2012 A1
20120054869 Yen et al. Mar 2012 A1
20120066698 Yanoo Mar 2012 A1
20120079596 Thomas Mar 2012 A1
20120084859 Radinsky et al. Apr 2012 A1
20120110667 Zubrilin et al. May 2012 A1
20120117652 Manni et al. May 2012 A1
20120121154 Xue et al. May 2012 A1
20120124426 Maybee et al. May 2012 A1
20120174186 Aziz et al. Jul 2012 A1
20120174196 Bhogavilli et al. Jul 2012 A1
20120174218 McCoy et al. Jul 2012 A1
20120198279 Schroeder Aug 2012 A1
20120210423 Friedrichs et al. Aug 2012 A1
20120222121 Staniford et al. Aug 2012 A1
20120255015 Sahita et al. Oct 2012 A1
20120255017 Sallam Oct 2012 A1
20120260342 Dube et al. Oct 2012 A1
20120266244 Green et al. Oct 2012 A1
20120278886 Luna Nov 2012 A1
20120297489 Dequevy Nov 2012 A1
20120330801 McDougal et al. Dec 2012 A1
20130014259 Gribble et al. Jan 2013 A1
20130036472 Aziz Feb 2013 A1
20130047257 Aziz Feb 2013 A1
20130074185 McDougal et al. Mar 2013 A1
20130086684 Mohler Apr 2013 A1
20130097699 Balupari et al. Apr 2013 A1
20130097706 Titonis et al. Apr 2013 A1
20130111587 Goel et al. May 2013 A1
20130117852 Stute May 2013 A1
20130117855 Kim et al. May 2013 A1
20130139264 Brinkley et al. May 2013 A1
20130160125 Likhachev et al. Jun 2013 A1
20130160127 Jeong et al. Jun 2013 A1
20130160130 Mendelev et al. Jun 2013 A1
20130160131 Madou et al. Jun 2013 A1
20130167236 Sick Jun 2013 A1
20130174214 Duncan Jul 2013 A1
20130185789 Hagiwara et al. Jul 2013 A1
20130185795 Winn et al. Jul 2013 A1
20130185798 Saunders et al. Jul 2013 A1
20130191915 Antonakakis et al. Jul 2013 A1
20130196649 Paddon et al. Aug 2013 A1
20130227691 Aziz et al. Aug 2013 A1
20130246370 Bartram et al. Sep 2013 A1
20130263260 Mahaffey et al. Oct 2013 A1
20130291109 Staniford et al. Oct 2013 A1
20130298243 Kumar et al. Nov 2013 A1
20140053260 Gupta et al. Feb 2014 A1
20140053261 Gupta et al. Feb 2014 A1
20140130158 Wang et al. May 2014 A1
20140137180 Lukacs et al. May 2014 A1
20140169762 Ryu Jun 2014 A1
20140179360 Jackson et al. Jun 2014 A1
20140328204 Klotsche et al. Nov 2014 A1
20140337836 Ismael Nov 2014 A1
20140351935 Shao et al. Nov 2014 A1
20150096025 Ismael Apr 2015 A1
Foreign Referenced Citations (14)
Number Date Country
2439806 Jan 2008 GB
2490431 Oct 2012 GB
02006928 Jan 2002 WO
0223805 Mar 2002 WO
2007022454 Feb 2007 WO
2007117636 Oct 2007 WO
2008041950 Apr 2008 WO
2008084259 Jul 2008 WO
2011084431 Jul 2011 WO
2011112348 Sep 2011 WO
2012075336 Jun 2012 WO
2012145066 Oct 2012 WO
2013067505 May 2013 WO
2014057542 Apr 2014 WO
Non-Patent Literature Citations (96)
Entry
Liljenstam, Michael , et al., “Simulating Realistic Network Traffic for Worm Warning System Design and Testing”, Institute for Security Technology Studies, Dartmouth College (“Liljenstam”), (Oct. 27, 2003).
Lindorfer, Martina, Clemens Kolbitsch, and Paolo Milani Comparetti. “Detecting environment-sensitive malware.” Recent Advances in Intrusion Detection. Springer Berlin Heidelberg, 2011.
Lok Kwong et al: “DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis”, Aug. 10, 2012, XP055158513, Retrieved from the Internet: URL:https://www.usenix.org/system/files/conference/usenixsecurity12/sec12-final107.pdf [retrieved on Dec. 15, 2014].
Marchette, David J., “Computer Intrusion Detection and Network Monitoring: A Statistical Viewpoint”, (“Marchette”), (2001).
Margolis, P.E. , “Random House Webster's ‘Computer & Internet Dictionary 3rd Edition’”, ISBN 0375703519, (Dec. 1998).
Moore, D. , et al., “Internet Quarantine: Requirements for Containing Self-Propagating Code”, INFOCOM, vol. 3, (Mar. 30-Apr. 3, 2003), pp. 1901-1910.
Morales, Jose A., et al., “Analyzing and exploiting network behaviors of malware.”, Security and Privacy in Communication Networks. Springer Berlin Heidelberg, 2010. 20-34.
Mori, Detecting Unknown Computer Viruses, 2004, Springer-Verlag Berlin Heidelberg.
Natvig, Kurt , “SANDBOXII: Internet”, Virus Bulletin Conference, (“Natvig”), (Sep. 2002).
NetBIOS Working Group. Protocol Standard for a NetBIOS Service on a TCP/UDP transport: Concepts and Methods. STD 19, RFC 1001, Mar. 1987.
Newsome, J. , et al., “Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software”, In Proceedings of the 12th Annual Network and Distributed System Security Symposium (NDSS '05), (Feb. 2005).
Newsome, J. , et al., “Polygraph: Automatically Generating Signatures for Polymorphic Worms”, In Proceedings of the IEEE Symposium on Security and Privacy, (May 2005).
Nojiri, D. , et al., “Cooperation Response Strategies for Large Scale Attack Mitigation”, DARPA Information Survivability Conference and Exposition, vol. 1, (Apr. 22-24, 2003), pp. 293-302.
Oberheide et al., CloudAV: N-Version Antivirus in the Network Cloud, 17th USENIX Security Symposium USENIX Security '08 Jul. 28-Aug. 1, 2008 San Jose, CA.
PCT/US2012/021916 filed Jan. 19, 2012, International Search Report and Written Opinion dated May 10, 2012.
Rautiainen et al. “A Look at Portable Document Format Vulnerabilities”, Information Security Technical Report, Elsevier Advanced Technology, Amsterdam, NL, vol. 14, No. 1, Feb. 1, 2009 (Feb. 1, 2009).
Reiner Sailer, Enriquillo Valdez, Trent Jaeger, Ronald Perez, Leendert van Doorn, John Linwood Griffin, Stefan Berger, sHype: Secure Hypervisor Approach to Trusted Virtualized Systems (Feb. 2, 2005) (“Sailer”).
Silicon Defense, “Worm Containment in the Internal Network”, (Mar. 2003), pp. 1-25.
Singh, S. , et al., “Automated Worm Fingerprinting”, Proceedings of the ACM/USENIX Symposium on Operating System Design and Implementation, San Francisco, California, (Dec. 2004).
Spitzner, Lance , “Honeypots: Tracking Hackers”, (“Spitzner”), (Sep. 17, 2002).
The Sniffer's Guide to Raw Traffic available at: yuba.stanford.edu/~casado/pcap/section1.html, (Jan. 6, 2014).
Thomas H. Ptacek, and Timothy N. Newsham , “Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection”, Secure Networks, (“Ptacek”), (Jan. 1998).
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Final Office Action dated Aug. 26, 2011.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Final Office Action dated Sep. 10, 2013.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Final Office Action dated Sep. 11, 2012.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Non-Final Office Action dated Apr. 11, 2012.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Non-Final Office Action dated Apr. 18, 2011.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Non-Final Office Action dated Dec. 23, 2013.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Non-Final Office Action dated Mar. 14, 2013.
U.S. Appl. No. 12/263,971, filed Nov. 3, 2008 Notice of Allowance dated Jun. 20, 2014.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Advisory Action dated Jun. 19, 2013.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Final Office Action dated Apr. 2, 2014.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Final Office Action dated Jan. 2, 2013.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Non-Final Office Action dated Aug. 10, 2012.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Non-Final Office Action dated Oct. 22, 2013.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Non-Final Office Action dated Sep. 30, 2014.
U.S. Appl. No. 13/011,344, filed Jan. 21, 2011 Notice of Allowance dated Jan. 21, 2015.
U.S. Appl. No. 14/673,292, filed Mar. 30, 2015 Notice of Allowance dated Apr. 29, 2016.
U.S. Pat. No. 8,171,553 filed Apr. 20, 2006, Inter Partes Review Decision dated Jul. 10, 2015.
U.S. Pat. No. 8,291,499 filed Mar. 16, 2012, Inter Partes Review Decision dated Jul. 10, 2015.
Venezia, Paul , “NetDetector Captures Intrusions”, InfoWorld Issue 27, (“Venezia”), (Jul. 14, 2003).
Wahid et al., Characterising the Evolution in Scanning Activity of Suspicious Hosts, Oct. 2009, Third International Conference on Network and System Security, pp. 344-350.
Whyte, et al., “DNS-Based Detection of Scanning Worms in an Enterprise Network”, Proceedings of the 12th Annual Network and Distributed System Security Symposium, (Feb. 2005), 15 pages.
Williamson, Matthew M., “Throttling Viruses: Restricting Propagation to Defeat Malicious Mobile Code”, ACSAC conference, Las Vegas, NV, USA, (Dec. 2002), pp. 1-9.
Yuhei Kawakoya et al: “Memory behavior-based automatic malware unpacking in stealth debugging environment”, Malicious and Unwanted Software (Malware), 2010 5th International Conference on, IEEE, Piscataway, NJ, USA, Oct. 19, 2010, pp. 39-46, XP031833827, ISBN:978-1-4244-8-9353-1.
Zhang et al., The Effects of Threading, Infection Time, and Multiple-Attacker Collaboration on Malware Propagation, Sep. 2009, IEEE 28th International Symposium on Reliable Distributed Systems, pp. 73-82.
“Network Security: NetDetector—Network Intrusion Forensic System (NIFS) Whitepaper”, (“NetDetector Whitepaper”), (2003).
“Packet”, Microsoft Computer Dictionary, Microsoft Press, (Mar. 2002), 1 page.
“When Virtual is Better Than Real”, IEEEXplore Digital Library, available at, http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=990073, (Dec. 7, 2013).
Abdullah, et al., Visualizing Network Data for Intrusion Detection, 2005 IEEE Workshop on Information Assurance and Security, pp. 100-108.
Adetoye, Adedayo , et al., “Network Intrusion Detection & Response System”, (“Adetoye”), (Sep. 2003).
Adobe Systems Incorporated, “PDF 32000-1:2008, Document management—Portable document format—Part 1: PDF 1.7”, First Edition, Jul. 1, 2008, 756 pages.
AltaVista Advanced Search Results. “attack vector identifier”. Http://www.altavista.com/web/results?Itag=ody&pg=aq&aqmode=aqa=Event+Orchestrator . . . , (Accessed on Sep. 15, 2009).
AltaVista Advanced Search Results. “Event Orchestrator”. Http://www.altavista.com/web/results?Itag=ody&pg=aq&aqmode=aqa=Event+Orchestrator . . . , (Accessed on Sep. 3, 2009).
Anonymous, “Inside Adobe Reader Protected Mode—Part 1”—Design, Oct. 5, 2010 (Oct. 5, 2010), Retrieved from the Internet: URL:http://blogs.adobe.com/security/2010/10/inside-adobe-reader-protected-mode-part-1-design.html [retrieved on May 6, 2014].
Apostolopoulos, George; Hassapis, Constantinos; “V-eM: A cluster of Virtual Machines for Robust, Detailed, and High-Performance Network Emulation”, 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, Sep. 11-14, 2006, pp. 117-126.
Aura, Tuomas, “Scanning electronic documents for personally identifiable information”, Proceedings of the 5th ACM workshop on Privacy in electronic society. ACM, 2006.
Baecher, “The Nepenthes Platform: An Efficient Approach to collect Malware”, Springer-Verlag Berlin Heidelberg, (2006), pp. 165-184.
Baldi, Mario; Risso, Fulvio; “A Framework for Rapid Development and Portable Execution of Packet-Handling Applications”, 5th IEEE International Symposium Processing and Information Technology, Dec. 21, 2005, pp. 233-238.
Bayer, et al., “Dynamic Analysis of Malicious Code”, J Comput Virol, Springer-Verlag, France., (2006), pp. 67-77.
Boubalos, Chris , “extracting syslog data out of raw pcap dumps, seclists.org, Honeypots mailing list archives”, available at http://seclists.org/honeypots/2003/q2/319 (“Boubalos”), (Jun. 5, 2003).
Chaudet, C. , et al., “Optimal Positioning of Active and Passive Monitoring Devices”, International Conference on Emerging Networking Experiments and Technologies, Proceedings of the 2005 ACM Conference on Emerging Network Experiment and Technology, CoNEXT '05, Toulouse, France, (Oct. 2005), pp. 71-82.
Chen, P. M. and Noble, B. D., “When Virtual is Better Than Real”, Department of Electrical Engineering and Computer Science, University of Michigan (“Chen”), 2001.
Cisco “Intrusion Prevention for the Cisco ASA 5500-x Series” Data Sheet (2012).
Cisco, Configuring the Catalyst Switched Port Analyzer (SPAN) (“Cisco”), (1992).
Clark, John, Sylvain Leblanc, and Scott Knight. “Risks associated with usb hardware trojan devices used by insiders.” Systems Conference (SysCon), 2011 IEEE International. IEEE, 2011.
Cohen, M.I. , “PyFlag—An advanced network forensic framework”, Digital investigation 5, Elsevier, (2008), pp. S112-S120.
Costa, M. , et al., “Vigilante: End-to-End Containment of Internet Worms”, SOSP '05, Association for Computing Machinery, Inc., Brighton U.K., (Oct. 23-26, 2005).
Crandall, J.R. , et al., “Minos: Control Data Attack Prevention Orthogonal to Memory Model”, 37th International Symposium on Microarchitecture, Portland, Oregon, (Dec. 2004).
Deutsch, P. , “Zlib compressed data format specification version 3.3” RFC 1950, (1996).
Didier Stevens: “Malicious PDF Documents Explained”, Security & Privacy, IEEE, IEEE Service Center, Los Alamitos, CA, US, vol. 9, No. 1, Jan. 1, 2011 (Jan. 1, 2011).
Distler, “Malware Analysis: An Introduction”, SANS Institute InfoSec Reading Room, SANS Institute, (2007).
Dunlap, George W. , et al., “ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay”, Proceedings of the 5th Symposium on Operating Systems Design and Implementation, USENIX Association, (“Dunlap”), (Dec. 9, 2002).
EP 12736373.7 filed Jan. 19, 2012 Extended European Search Report dated May 21, 2014.
Excerpt regarding First Printing Date for Merike Kaeo, Designing Network Security (“Kaeo”), (2005).
Filiol, Eric , et al., “Combinatorial Optimisation of Worm Propagation on an Unknown Network”, International Journal of Computer Science 2.2 (2007).
FireEye Malware Analysis & Exchange Network, Malware Protection System, FireEye Inc., 2010.
FireEye Malware Analysis, Modern Malware Forensics, FireEye Inc., 2010.
FireEye v.6.0 Security Target, pp. 1-35, Version 1.1, FireEye Inc., May 2011.
Gibler, Clint, et al. AndroidLeaks: automatically detecting potential privacy leaks in android applications on a large scale. Springer Berlin Heidelberg, 2012.
Goel, et al., Reconstructing System State for Intrusion Analysis, Apr. 2008 SIGOPS Operating Systems Review, vol. 12 Issue 3, pp. 21-28.
Gregg Keizer: “Microsoft's HoneyMonkeys Show Patching Windows Works”, Aug. 8, 2005, XP055143386, Retrieved from the Internet: URL:http://www.informationweek.com/microsofts-honeymonkeys-show-patching-windows-works/d/d-id/1035069? [retrieved on Jun. 1, 2016].
Heng Yin et al, Panorama: Capturing System-Wide Information Flow for Malware Detection and Analysis, Research Showcase @ CMU, Carnegie Mellon University, 2007.
Hjelmvik, Erik , “Passive Network Security Analysis with NetworkMiner”, (IN)Secure, Issue 18, (Oct. 2008), pp. 1-100.
Idika et al., A-Survey-of-Malware-Detection-Techniques, Feb. 2, 2007, Department of Computer Science, Purdue University.
IEEE Xplore Digital Library Search Results for “detection of unknown computer worms”. Http://ieeexplore.ieee.org/searchresult.jsp?SortField=Score&SortOrder=desc&ResultC . . . , (Accessed on Aug. 28, 2009).
Isohara, Takamasa, Keisuke Takemori, and Ayumu Kubota. “Kernel-based behavior analysis for android malware detection.” Computational Intelligence and Security (CIS), 2011 Seventh International Conference on. IEEE, 2011.
Kaeo, Merike , “Designing Network Security”, (“Kaeo”), (Nov. 2003).
Kevin A Roundy et al: “Hybrid Analysis and Control of Malware”, Sep. 15, 2010, Recent Advances in Intrusion Detection, Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 317-338, XP019150454 ISBN:978-3-642-15511-6.
Kim, H. , et al., “Autograph: Toward Automated, Distributed Worm Signature Detection”, Proceedings of the 13th Usenix Security Symposium (Security 2004), San Diego, (Aug. 2004), pp. 271-286.
King, Samuel T., et al., “Operating System Support for Virtual Machines”, (“King”) (2003).
Krasnyansky, Max , et al., Universal TUN/TAP driver, available at https://www.kernel.org/doc/Documentation/networking/tuntap.txt (2002) (“Krasnyansky”).
Kreibich, C. , et al., “Honeycomb-Creating Intrusion Detection Signatures Using Honeypots”, 2nd Workshop on Hot Topics in Networks (HotNets-II), Boston, USA, (2003).
Kristoff, J. , “Botnets, Detection and Mitigation: DNS-Based Techniques”, NU Security Day, (2005), 23 pages.
Leading Colleges Select FireEye to Stop Malware-Related Data Breaches, FireEye Inc., 2009.
Li et al., A VMM-Based System Call Interposition Framework for Program Monitoring, Dec. 2010, IEEE 16th International Conference on Parallel and Distributed Systems, pp. 706-711.
Continuations (2)
Number Date Country
Parent 14673292 Mar 2015 US
Child 15256367 US
Parent 13011344 Jan 2011 US
Child 14673292 US
Continuation in Parts (1)
Number Date Country
Parent 12263971 Nov 2008 US
Child 13011344 US