System and method for analyzing binary code for malware classification using artificial neural network techniques

Description

1. FIELD

Embodiments of the disclosure relate to the field of cyber security. More specifically, one embodiment of the disclosure relates to a system and computerized method for statically identifying whether an object, such as an executable file for example, is associated with a cyber-attack using artificial neural network techniques, which are often referred to as “deep learning.”

2. GENERAL BACKGROUND

Over the last decade, malware detection has become a pervasive and growing problem, especially given the increased volume of new applications available for download. Currently, malware detection systems are being deployed by companies to thwart cyber-attacks originating from downloaded executable files. These conventional malware detection systems utilize machine learning techniques which examine content (e.g., de-compiled code) of the executable file in connection with signatures associated with known malware. Hence, conventional malware detection systems are reliant on expert analysis in formulating these signatures. Given the static nature of these signatures, however, detecting new (“zero-day”) or polymorphic malware has become increasingly challenging, making it harder to successfully defend a company or an individual user against cyber-attacks.

In some instances, a cyber-attack is conducted by infecting a targeted network device with malware, often in the form of an executable file, which is designed to adversely influence or attack normal operations of the targeted network device (e.g., computer, smartphone, wearable technology, etc.). One type of malware may include bots, spyware, or another executable embedded into downloadable content, which operates within the network device without knowledge or permission by the user or an administrator to exfiltrate stored data. Another type of malware may be designed as an executable file that, during processing, conducts a phishing attack by deceiving the user as to the actual recipient of data provided by that user.

Recently, significant efforts have been expended on creating different types of malware detection systems, including systems using artificial neural networks (generally referred to as a “neural network”). A neural network is logic that is designed and trained to recognize patterns in order to classify incoming data as malicious (malware) or benign. Prior approaches to using neural networks avoided some of the drawbacks of traditional malware detection systems by eliminating the need for labor intensive analyses of previous detected malware by highly trained cyber-security analysts to determine features relevant to malware detection; however, known neural network approaches to malware detection tend to be complicated in application, including training to achieve accurate classifications. It would be desirable to provide enhanced techniques effective in detecting malware with reduced complexity over other neural network approaches.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, show certain aspects of the subject matter disclosed herein and, together with the description, help explain some of the principles associated with the disclosed embodiments and implementations.

FIG. 1 is an exemplary block diagram illustrating logic included within embodiments of a cyber-security system described herein.

FIG. 2 is an exemplary block diagram of training logic for training of a convolutional neural network (CNN) along with encoding logic and the classifier of FIG. 1.

FIGS. 3A-3B are exemplary flowcharts illustrating operations performed by embodiments of a cyber-security system described herein.

FIG. 4A is a first illustrative embodiment of the cyber-security system deploying a computational analysis subsystem, and a classifier collectively trained by training sets to analyze binary code from an executable file.

FIG. 4B is a second illustrative embodiment of the cyber-security system deploying a computational analysis subsystem, an intelligence-driven subsystem and a post-analysis subsystem collectively trained by training sets to analyze binary code from an executable file.

FIG. 5 is an illustrative embodiment of the operations performed by pre-processing logic of FIGS. 4A-4B operating on the binary code extracted from the executable file.

FIGS. 6A-6H provide an illustrative example of operations conducted on an extracted binary code section of an executable file and subsequent representations of the binary code section substantially performed by logic of the computational analysis subsystem of FIGS. 4A-4B.

FIG. 7A is an illustrative embodiment of the network device including software modules that support operability of the first embodiment of the cyber-security system of FIG. 4A.

FIG. 7B is an illustrative embodiment of the network device including software modules that support operability of the second embodiment of the cyber-security system of FIG. 4B.

FIG. 8 is a third illustrative embodiment of the cyber-security system deploying a computational analysis subsystem, an intelligence-driven subsystem and a post-analysis subsystem collectively trained by training sets to analyze binary code from an executable file.

DETAILED DESCRIPTION

Described herein are embodiments of subsystems and methods of a cyber-security system configured to determine whether an object is associated with a cyber-attack (i.e., malicious). One embodiment of the cyber-security system can be used to analyze raw, binary code of the executable file for malware. According to one embodiment of the disclosure, binary code of an incoming object (e.g., an executable file) undergoes feed-forward processing by a convolutional neural network (CNN), trained using supervised learning, to isolate features associated with the binary code that aid in the classification of the executable file as benign or malicious. Significantly, the binary code can be processed in this manner directly, without intermediate analysis or translation. To provide a more robust analysis, CNN-based and intelligence-driven analyses may be performed concurrently (i.e., overlapping at least partially in time), as described below. It is contemplated that other embodiments of the cyber-security system may be implemented that analyze various types of files for malware other than executable files, such as text files, Portable Document Format (PDF) files, Presentation File Format (PPT) files, or scripts, for example. In general, the term “file” may pertain to any file type.

I. Overview

As set forth below, one embodiment of the cyber-security system includes a plurality of subsystems that perform neural network analyses on data based on content from a file (e.g., executable file), and in some embodiments, leverage insight offered by intelligence-driven analyses. One of these subsystems, referred to as a computational analysis subsystem, employs an artificial neural network to automatically determine whether certain features, which are associated with one or more patterns, are present (or absent) in the binary code of an executable file. The presence (or absence) of these features allows the executable file to be classified as malicious or benign. It should be noted that this determination is made without the need to manually pre-identify the specific features to seek within the binary code.

For at least one embodiment of the disclosure, this computational analysis subsystem leverages a deep neural network, such as a convolutional neural network (CNN) for example, which operates on an input based directly on the binary code of the executable file, rather than on a disassembled version of that code. The operations of the CNN are pre-trained (conditioned) using labeled training sets of malicious and/or benign binary code files in order to identify features, corresponding to the binary code of the executable file, that are probative of how the executable file should be classified. Communicatively coupled to and functioning in concert with the CNN, a classifier, operating in accordance with a set of classification rules, receives an output from the CNN and determines a classification assigned to the executable file indicating whether the executable file is malicious or benign.

According to other embodiments of the disclosure, operating in conjunction with the computational analysis subsystem, the cyber-security system may also include an intelligence-driven analysis subsystem, whose operations are more directly influenced by and therefore depend on analyses of previously detected malware performed by highly trained cyber-security analysts. More specifically, based on intelligence generated through analyses of known malicious executable files and benign executable files by highly trained cyber-security analysts, this intelligence-driven subsystem is configured to statically (without execution) identify “indicators” in the executable file through their automatic inspection and evaluation, which permit their classification. In general, an “indicator” is one or more suspicious or anomalous characteristics of the executable file, which may be directed to the content as well as the format or delivery of such content. Accordingly, the prior work of the cyber-security analysts is used to identify the indicators that differentiate malicious from benign executable files and then these analyst results are used to configure the intelligence-driven analysis subsystem. It should be emphasized that the determination of malware in unknown (e.g., previously unanalyzed) executable files proceeds automatically and without human intervention.

Additionally, in embodiments configured to receive network traffic, this intelligence-driven subsystem can also statically identify indicators in communication packets containing the executable file. The inspection and evaluation performed may involve identifying any communication protocol anomalies or suspicious packet content, as well as using signature (hash) matching, heuristics and pattern matching, as well as other statistical or deterministic techniques, in each case, informed and guided by prior work of the analysts.

As an illustrative example, the analysts may identify the anomalies, signatures of known malware, and patterns associated with known malware and other tell-tale attributes, which can be used in generating computer applied rules used in the intelligence-driven analysis subsystem. It is worth noting that none of these analyst results are needed by the computational analysis subsystem, which only requires the labeled training sets of malicious and/or benign binary code files for training purposes. The classifier can then use these results of both the computational analysis and the intelligence-driven analysis to classify the executable file. In some embodiments, the results of the intelligence-driven analysis subsystem can be placed into a format common to the output provided by the computational analysis subsystem, and thereafter, routed to a post-analysis subsystem for classification.

More specifically, embodiments of the disclosure will now be described in greater detail. According to one embodiment of the disclosure, a cyber-security system is configured to analyze an executable file (of any size), where the cyber-security system includes a computational analysis subsystem. As described herein, this computational analysis subsystem includes a pre-processor, a CNN and, in some embodiments, a separate classifier, followed by a message generator. Each of these components may be software running on a network device or multiple (two or more) network devices that collectively operate to determine whether the executable file is associated with a cyber-attack (i.e., malicious) or benign. Herein, an “executable file” refers to a collection of digital data that is not readily readable by humans and, when processed by a processor within a network device, causes performance of a particular task or tasks (e.g., write to memory, read from memory, jump to an address to start a process, etc.). The digital data may include binary code, namely a collection of bit patterns for example, each corresponding to an executable command and/or data, along with other data resources (e.g., values for static variables, etc.). Examples of the binary code, a term often used synonymously with these examples, may include, but are not limited or restricted to, an executable, machine code (e.g., set of machine readable, processor-executable instructions), or object code. The executable file may be provided in a Portable Executable (PE) format, namely a data structure that encapsulates information necessary for a Windows® Operating System (OS) loader to manage the wrapped binary code, although other formats may be used.

The CNN includes a plurality of layers (logic modules) that together implement an overall programmatic function, which is generated and tuned as described below. Each of the layers operates both as a portion of the overall programmatic function and as a plurality of operations executed by kernels (i.e., execution elements sometimes called “neurons”), where the operations of each of the layers implement one or more layer functions. The layout and architecture of the CNN in terms of the number and order of the layers and their respective layer functions, fall within the ordinary skill of practitioners in this art in light of this disclosure, and so only illustrative examples of the architecture will be described herein.

Operating as part of an input layer for the CNN deployed within the computational analysis subsystem, the pre-processor is configured to receive an executable file. In some embodiments, the pre-processor may receive the executable file separately or encapsulated as a plurality of binary packets in transit over a network. The content of the binary packets may be extracted from portions of the binary packets (e.g., payloads), and thereafter, aggregated (reassembled) to produce the executable file. Where some content of the binary packets is encoded and/or compressed, the pre-processor may feature decode logic and/or decompression logic to perform such operations on the content before the content is aggregated to produce the executable file.
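As a minimal sketch of the aggregation described above, the following Python fragment reassembles an executable file from packet payloads, decompressing any compressed payload before aggregation. The packet layout (a tuple of sequence number, payload, and compression flag), the function name reassemble, and the use of zlib compression are assumptions made only for illustration; the disclosure does not specify them.

```python
import zlib

def reassemble(packets):
    """Aggregate packet payloads into one byte string.

    Each packet is a hypothetical (sequence_number, payload, is_compressed)
    tuple. Compressed payloads are decompressed before aggregation.
    """
    ordered = sorted(packets, key=lambda packet: packet[0])
    parts = []
    for _, payload, is_compressed in ordered:
        parts.append(zlib.decompress(payload) if is_compressed else payload)
    return b"".join(parts)

# Packets arrive out of order; the second fragment is compressed.
packets = [
    (2, zlib.compress(b"\x90\x00"), True),
    (1, b"MZ", False),
]
executable = reassemble(packets)
```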

Upon receiving the executable file, the pre-processor is responsible for selecting a section of binary code from the executable file for analysis. In some embodiments, the pre-processor may select a plurality of subsections of the binary code for analysis by the CNN, each subsection being combined (or retained separately) and conditioned for analysis. The disclosure in connection with FIG. 5 will describe this selection process in considerable detail. Of note, this selection process requires neither the employment nor training of an attention mechanism, a component known in the art. The size of each selected section of binary code (and/or from where in the executable file the binary code section is selected) may be a static (constant) value or dynamic based on rules established during the training and dependent on attributes of the binary code such as length or format. After the binary code section(s) is (are) extracted (and padding added if needed), the pre-processor may further encode the binary code section(s) to generate an input for the CNN. The input includes a first representation of the binary code (e.g., input tensor) in a form and format suitable for processing by the CNN. The (byte) encoding may be conducted by a variety of techniques, including “one hot encoding” or “embedding,” as described below.
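The selection, padding and one hot encoding steps may be illustrated by the following minimal Python sketch. The function names select_section and one_hot_encode, the fixed section size, the zero pad value, and the choice of a byte as the encoded unit (giving 256 possible values) are assumptions made for illustration only.

```python
def select_section(binary_code, size, offset=0, pad_byte=0):
    """Extract `size` bytes starting at `offset`, padding if the code is short."""
    section = binary_code[offset:offset + size]
    return section + bytes([pad_byte]) * (size - len(section))

def one_hot_encode(section):
    """Map each byte to a 256-element row holding a single 1 at the byte's value."""
    rows = []
    for value in section:          # iterating over bytes yields integers 0..255
        row = [0] * 256
        row[value] = 1
        rows.append(row)
    return rows

section = select_section(b"MZ\x90", size=4)   # one pad byte is appended
tensor = one_hot_encode(section)              # 4 rows by 256 columns
```

In this sketch the resulting rows play the role of the first representation (input tensor). An embedding would instead map each byte value to a learned vector of lower dimension.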

Communicatively coupled to the pre-processor, the CNN may be logically represented by a plurality of executable logic layers including one or more convolution layers, one or more pooling layers, and one or more fully connected/nonlinearity (FCN) layers. These layers generally represent weighting, biasing and spatial reduction operations performed by their convolution logic, pooling logic and FCN logic deployed within the cyber-security system.

According to one embodiment of the disclosure, each convolution layer is configured to process an incoming representation (e.g., in the case of the first convolution layer of the CNN, the first representation of the binary code) by applying operations executing a portion of the overall programmatic function (referred to as a “programmatic layer function”) on the incoming representation to produce a resultant representation (e.g., an output tensor). These operations may be performed using one or more convolution filters, which are pre-trained using a training set including patterns associated with known benign executable files and/or malicious executable files. The size of the resultant representation may be based on a number of hyper-parameters that influence operations of the convolution layer, including the number, length and width of the convolution filter(s), stride length, and an amount of zero padding.
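To illustrate how these hyper-parameters may determine the size of the resultant representation, the following Python sketch applies a single one-dimensional filter and computes the output length with the standard relation floor((n + 2p - f) / s) + 1, where n is the input length, f the filter length, s the stride and p the zero padding per side. The function names and the example filter values are assumptions; a trained CNN would use learned filter weights.

```python
def conv_output_length(input_length, filter_length, stride=1, zero_padding=0):
    """Length of a 1-D convolution output for the given hyper-parameters."""
    return (input_length + 2 * zero_padding - filter_length) // stride + 1

def conv1d(signal, kernel, stride=1, zero_padding=0, bias=0.0):
    """Slide one filter over a 1-D input (cross-correlation, as is usual in CNNs)."""
    padded = [0.0] * zero_padding + list(signal) + [0.0] * zero_padding
    output = []
    for start in range(0, len(padded) - len(kernel) + 1, stride):
        window = padded[start:start + len(kernel)]
        output.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return output

signal = [1, 2, 3, 4, 5, 6, 7, 8]
features = conv1d(signal, kernel=[1, 0, -1], stride=2, zero_padding=1)
```

With several filters, one such output is produced per filter, so the depth of the resultant representation equals the number of filters.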

After each convolution layer, various operations may be performed on the resultant representation. As one example, after performing a convolution operation on the first representation by a first convolution layer (of the one or more convolution layers), element-wise nonlinear operations (e.g., rectified linear unit or “ReLU”) may be performed on the resultant representation to map all negative values to “0” in order to introduce nonlinearities to the resultant representation.
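The element-wise ReLU operation may be sketched in Python as follows; the function name relu is chosen here for illustration.

```python
def relu(values):
    """Map every negative element to 0 and leave other elements unchanged."""
    return [max(0.0, value) for value in values]

activated = relu([-2.0, 0.0, 3.5])
```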

As another example, after performing a convolution operation in a convolution layer to produce a resultant representation, a pooling layer (of the one or more pooling layers) may transform the resultant representation by reducing the spatial dimensions of the resultant representation provided to the next convolutional layer or the FCN layer. This may be viewed as compressing the data of that resultant representation from the convolution layer. The pooling operation does not affect the depth dimension of the resultant representation, where the depth dimension equates to the number of convolution filters. Sometimes, the pooling operation is referred to as “down-sampling,” given that the reduction in size of the resultant representation leads to loss of location information. However, such information loss may be beneficial for overall CNN performance as the decreased size leads to lesser computational overhead for a next convolutional layer (since embodiments of the CNN likely support multiple convolution layers) or a next FCN layer, e.g., where the pooling is being conducted after the final convolution operation has been performed by the CNN. Different types of pooling may include “max pooling,” “average pooling,” “dynamic pooling,” as known in the art.
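The following Python sketch illustrates that pooling shortens each feature map while leaving the depth dimension unchanged. It assumes max pooling over non-overlapping windows and a resultant representation stored as one list per convolution filter; the function names and values are hypothetical.

```python
def max_pool_1d(feature_map, window):
    """Keep the maximum of each non-overlapping window of a 1-D feature map."""
    return [max(feature_map[i:i + window])
            for i in range(0, len(feature_map) - window + 1, window)]

def pool_all_filters(feature_maps, window):
    """Pool each filter's feature map independently, so depth is preserved."""
    return [max_pool_1d(feature_map, window) for feature_map in feature_maps]

# Depth 3 (three convolution filters), spatial length 4.
feature_maps = [[1, 5, 2, 0], [3, 3, 9, 4], [0, 0, 1, 7]]
pooled = pool_all_filters(feature_maps, window=2)
```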

As yet another example of a weighting and/or biasing operation, the FCN layer receives the resultant representation after convolution and/or pooling operations. The FCN layer applies weights and biases as trained by the training set described above to produce a vector, which may operate as the “output” for the CNN. The FCN layer applies the learned weights and biases to account for different nonlinear combinations and ordering of the features detected during preceding convolution/pooling operations.
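A fully connected/nonlinearity operation of this kind may be sketched in Python as a weighted sum plus bias for each output element, followed by a nonlinearity. The weights, biases and the use of ReLU as the nonlinearity are assumptions for illustration; in practice these values would result from training.

```python
def fully_connected(inputs, weights, biases):
    """Compute one weighted sum plus bias per output element."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise nonlinearity applied after the weighted sums."""
    return [max(0.0, value) for value in values]

inputs = [1.0, 2.0]                     # flattened features from earlier layers
weights = [[0.5, -1.0], [1.0, 1.0]]     # hypothetical learned weights
biases = [0.0, -1.0]                    # hypothetical learned biases
output_vector = relu(fully_connected(inputs, weights, biases))
```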

Communicatively coupled to the CNN, a classifier is configured to receive the output from the CNN and determine a classification assigned to (and stored in memory in association with) the executable file, based, at least in part, on a threat score generated based on the received output from the CNN. The threat score is generated by threat assessment logic, which may perform a sigmoid function or other function to normalize a scalar value. The normalized scalar value represents the threat score within a prescribed range, and the executable file is considered to be malicious when the scalar value exceeds a threshold value within the prescribed range.
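As a minimal sketch of this threat assessment, the following Python fragment normalizes a scalar value with a sigmoid function and compares the resulting threat score with a threshold. The prescribed range of 0 to 1 follows from the sigmoid; the threshold value of 0.5 and the function names are assumptions, since the disclosure does not fix them.

```python
import math

def threat_score(scalar):
    """Normalize a scalar value into the range (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-scalar))

def classify(scalar, threshold=0.5):
    """Label the file malicious when the threat score exceeds the threshold."""
    return "malicious" if threat_score(scalar) > threshold else "benign"

label = classify(2.0)
```

In a deployment, the threshold would likely be tuned to balance false positives against missed detections.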

Additionally, a message generator may be deployed to generate an alert or other message. The alert may be transmitted to a system administrator or cyber-security administrator to report on results of the analysis, that is, a classification of an executable file as malicious and thus associated with a cyber-attack. Where the computational analysis subsystem is incorporated into or is in communication with a network device (such as a laptop, tablet or other endpoint) under user control, the message may be provided (e.g., on screen) to the user of the device. Moreover, the message may be provided to one or more other components (e.g., operating system or agent) running within the network device for example, to influence its operation such as signaling to cause the network device to block processing (e.g., download, loading or execution) of the executable file on the network device.

According to another embodiment of the disclosure, operating concurrently with the computational analysis subsystem described above, the cyber-security system may include an intelligence-driven analysis subsystem, which is configured to (i) receive the executable file, (ii) inspect the executable file (and, in some embodiments and/or deployments, the communication packets carrying the executable file) for indicators associated with a cyber-attack based on intelligence generated by a cyber-security analyst, as described above, (iii) compute features of the executable file for indicators, and (iv) produce an output representing the features.

The static analysis conducted by the intelligence-driven analysis subsystem may involve an inspection of the binary packets based on known (previously detected) malicious executable files and/or benign executable files. The inspection of the binary packets may involve identifying any communication protocol anomalies and suspicious content in the header, payload, etc. The inspection of the payload may include extraction and re-assembly of the executable file, followed by an inspection of the header and other portions of that executable file. Of course, where the executable file is received directly without being carried in communication packets, the packet inspection is not applicable. Thereafter, the inspection can be conducted in a variety of ways, using signature hashes of known malicious executable files, heuristics and pattern matching based on known executable files, or the like. In some embodiments, the results of the intelligence-driven analysis, including the features associated with the detected indicators, may be provided to the post-analysis subsystem.
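Signature hash matching of the kind mentioned above may be sketched in Python as follows. The use of SHA-256, the function names, and the contents of the known-malware hash set are assumptions for illustration; the disclosure does not name a hash function or a signature store.

```python
import hashlib

def file_signature(file_bytes):
    """Return a hexadecimal SHA-256 digest used here as the file's signature."""
    return hashlib.sha256(file_bytes).hexdigest()

def matches_known_malware(file_bytes, known_malicious_hashes):
    """Report whether the file's signature appears in the known-malware set."""
    return file_signature(file_bytes) in known_malicious_hashes

# Hypothetical signature set built from a previously analyzed sample.
known_sample = b"MZ\x90\x00 hypothetical malicious sample"
known_malicious_hashes = {file_signature(known_sample)}

is_indicator = matches_known_malware(known_sample, known_malicious_hashes)
```

An exact hash match of this kind fails whenever a single byte of the file changes, which is one reason the static nature of signatures limits detection of polymorphic malware.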

The concurrent operations of the computational analysis subsystem and the intelligence-driven analysis subsystem complement each other. The intelligence-driven analysis subsystem targets an analysis of context, e.g., anomalous data placement in binary packets, communication protocol anomalies, and/or known malicious and benign patterns associated with malicious and benign executable files. The computational analysis subsystem targets digital bit patterns, independent of context and based on training by a training set including hundreds of thousands or millions of benign and/or malicious executable files. Hence, the computational analysis subsystem is more content centric, and thus, coding tendencies or styles by malware authors that may be missed in the intelligence-driven analysis (absent profound observational skills and too often luck on the part of the analyst) may assist in the detection of zero-day (first time) cyber-attacks.

Herein, the post-analysis subsystem is communicatively coupled to both the computational analysis subsystem and the intelligence-driven analysis subsystem, described above. The post-analysis subsystem may include (i) grouping logic and (ii) a classifier. According to one embodiment of the disclosure, the grouping logic may be configured to perform one or more mathematical or logical operations (e.g., concatenation) on content from the output from the computational analysis subsystem and content from the output from the intelligence-driven analysis subsystem to generate a collective output. The classifier, as described above, is configured to receive the collective output from the grouping logic (subject to further fully connected/nonlinearity operations) and determine a classification assigned to the executable file based, at least in part, on a threat score generated based on the collective output. As also described above, the message generator may be deployed to generate a message to one or more components operating within the network device when the classification of the executable file is determined to be malicious.
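The grouping operation may be sketched in Python as follows, assuming concatenation is the chosen operation. The indicator names, the conversion of detected indicators to a numeric vector (one way of placing them in a format common to the CNN output), and the example values are hypothetical.

```python
def indicators_to_vector(indicators, indicator_names):
    """Encode detected indicators as 1.0 (present) or 0.0 (absent), in a fixed order."""
    return [1.0 if indicators.get(name, False) else 0.0 for name in indicator_names]

def group_outputs(cnn_output, indicator_vector):
    """Concatenate the two subsystem outputs into one collective output."""
    return list(cnn_output) + list(indicator_vector)

indicator_names = ["protocol_anomaly", "signature_match", "suspicious_header"]
indicators = {"signature_match": True}   # result of the intelligence-driven analysis
cnn_output = [0.2, -1.3]                 # output vector of the CNN

collective_output = group_outputs(
    cnn_output, indicators_to_vector(indicators, indicator_names))
```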

In summary, by operating on the binary code and avoiding disassembly operations and attention mechanisms, the computational analysis subsystem (as well as the cyber-security system) may operate with greater operational efficiency during runtime of the network device than previously available. Additionally, where deployed within a network device such as an endpoint device, the computational analysis subsystem (and the cyber-security system containing the computational analysis subsystem) can determine whether a cyber-attack is occurring without significant degradation of the network device's (e.g., endpoint device's) performance, and as a result, may issue alerts in time for action to be taken to contain, mitigate or even block the effects of the cyber-attack. Lastly, by avoiding complete reliance on a preconceived notion as to what features should be sought (a tendency in many conventional approaches), a cyber-security system including the computational analysis subsystem and, in some embodiments, a combination of the computational analysis subsystem and the intelligence-driven analysis subsystem, provides a more holistic analysis of the executable file in detecting an attempted cyber-attack.

II. Terminology

In the following description, certain terminology is used to describe aspects of the invention. For example, in certain situations, the term “logic” is representative of hardware, firmware and/or software that is configured to perform one or more functions. As hardware, logic may include circuitry having data processing or storage functionality. Examples of such processing or storage circuitry may include, but are not limited or restricted to, the following: a processor; one or more graphics processing units (GPUs); one or more processor cores; a programmable gate array; an application specific integrated circuit (ASIC); semiconductor memory; combinatorial logic; or any combination of the above components.

Logic or a logic module may be in the form of one or more software modules, such as a program, a script, a software component within an operating system, an application programming interface (API), a subroutine, a function, a procedure, an applet, a servlet, a routine, source code, object code, a shared library/dynamic load library, or even one or more instructions. These software modules may be stored in any type of a suitable non-transitory storage medium, or transitory storage medium (e.g., electrical, optical, acoustical or other form of propagated signals such as carrier waves, infrared signals, or digital signals). Examples of a “non-transitory storage medium” may include, but are not limited or restricted to a programmable circuit; non-persistent storage such as volatile memory (e.g., any type of random access memory “RAM”); or persistent storage such as non-volatile memory (e.g., read-only memory “ROM”, power-backed RAM, flash memory, etc.), a solid-state drive, hard disk drive, an optical disc drive, or portable memory device. As firmware, the executable code is stored in persistent storage.

The term “object” generally refers to a collection of data, whether in transit (e.g., over a network) or at rest (e.g., stored), often having a logical structure or organization that enables it to be classified for purposes of analysis. According to one embodiment, the object may be an executable file as previously defined, which can be executed by a processor within a network device. The binary code includes one or more instructions, represented by a series of digital values (e.g., logic “1s” and/or “0s”). Herein, the executable file may be extracted from one or more communication packets (e.g., packet payloads) propagating over a network.

A “section” may be generally construed as a portion of content extracted from a particular file. In one embodiment, the “section” may be a collection of binary code of a particular size extracted from an executable file. The section may be comprised of contiguous binary code from the executable file or non-contiguous binary code subsections that may be aggregated to form a single binary code section.

A “network device” generally refers to an electronic device with network connectivity. Examples of an electronic device may include, but are not limited or restricted to the following: a server; a router or other signal propagation networking equipment (e.g., a wireless or wired access point); or an endpoint device (e.g., a stationary or portable computer including a desktop computer, laptop, electronic reader, netbook or tablet; a smart phone; a video-game console; wearable technology such as a smart watch, etc.).

The term “transmission medium” is a physical or logical communication path to or within a network device. For instance, the communication path may include wired and/or wireless segments. Examples of wired and/or wireless segments include electrical wiring, optical fiber, cable, bus trace, or a wireless channel using infrared, radio frequency (RF), or any other wired/wireless signaling mechanism.

The term “computerized” generally represents that any corresponding operations are conducted by hardware in combination with software and/or firmware. Also, the terms “compare” or “comparison” generally mean determining if a match (e.g., a certain level of correlation) is achieved between two items where, in certain instances, one of the items may include a particular signature pattern.

Lastly, the terms “or” and “and/or” as used herein are to be interpreted as inclusive or meaning any one or any combination. Therefore, “A, B or C” or “A, B and/or C” mean “any of the following: A; B; C; A and B; A and C; B and C; A, B and C.” An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.

As this invention is susceptible to embodiments of many different forms, it is intended that the present disclosure is to be considered as an example of the principles of the invention and is not intended to limit the invention to the specific embodiments shown and described.

III. General Architecture

Referring to FIG. 1, an exemplary block diagram illustrating logic deployed within a cyber-security system 100 using a feed-forward, artificial neural network 110 trained using supervised learning is shown. The artificial neural network 110 is a convolutional neural network (CNN) comprised of multiple layers, where each layer performs a specific function that assists in identifying features, namely a collection of values corresponding to patterns of binary code under analysis that may be probative in determining whether an executable file is associated with a cyber-attack. These CNN layers include one or more convolution layers performed by convolution logic 112, one or more pooling layers performed by pooling logic 114, and one or more fully-connected, nonlinearity (FCN) layers performed by FCN logic 116.

More specifically, the CNN 110 produces an output 130 based on a received input 120. The received input 120 includes encoded values each uniquely representing, for example, a corresponding measured unit of the binary code (e.g., nibble, byte, word, etc.). The convolution logic 112 includes a hierarchy of one or more convolution filters operating at each convolution layer to apply a programmatic layer function (e.g., weighting and/or biasing) on an incoming representation to produce a transformed, resultant representation. For a first convolution layer, the convolution logic 112 receives the incoming representation (i.e., the received input 120) and produces a resultant representation (e.g., one or more features extracted from the received input 120). For subsequent convolution layers, the convolution logic 112 receives the incoming representation, which may include the feature(s) produced by the last convolution layer (or feature(s) modified by an intermediary pooling layer as described below) instead of the received input 120. Hence, for each convolution layer, higher-level features may be extracted, where the number of convolution layers may be selected based, at least in part, on (i) accuracy improvements provided by each convolution layer and (ii) time constraints needed to analyze an executable file and conduct potential remediation actions (e.g., blocking, removal, quarantining, etc.) on a malicious executable file.

The pooling logic 114 operates in conjunction with the convolution logic 112. Herein, at a first pooling layer, the pooling logic 114 reduces the spatial dimension (size) of a feature produced by a preceding convolution layer. This “down-sampling” reduces the amount of additional computations needed by the CNN 110 in completing its analysis without significant adverse effects on accuracy of the analysis. Typically, for each pooling layer, a maximum (max) or average pooling technique is used, resulting in a fixed-length tensor that is smaller than the previous convolutional layer. For these techniques, the input, such as the features (represented by feature maps) may be divided into non-overlapping two-dimensional spaces. For average pooling, the averages of the pooling regions are calculated while, for max pooling, the maximum value of each pooling region is selected.
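By way of illustration only, the max and average pooling techniques described above may be sketched as follows. The sketch assumes a one-dimensional feature map and non-overlapping pooling regions of equal width (a simplification of the two-dimensional regions mentioned above); the function name `pool_1d` and its parameters are hypothetical and do not appear in the disclosure.

```python
def pool_1d(values, window, mode="max"):
    """Down-sample a 1-D feature map over non-overlapping regions.

    Assumption: trailing values that do not fill a whole region are dropped.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    pooled = []
    for start in range(0, len(values) - window + 1, window):
        region = values[start:start + window]
        if mode == "max":
            pooled.append(max(region))           # max pooling
        else:
            pooled.append(sum(region) / window)  # average pooling
    return pooled


feature_map = [1.0, 3.0, 2.0, 8.0, 5.0, 4.0]
max_pooled = pool_1d(feature_map, 2, "max")  # [3.0, 8.0, 5.0]
avg_pooled = pool_1d(feature_map, 2, "avg")  # [2.0, 5.0, 4.5]
```

In this sketch, a six-value feature map is reduced to three values by either technique, which reflects the reduction in spatial dimension attributed to the pooling layer.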

Another pooling technique utilized by one or more pooling layers may include dynamic pooling. For dynamic pooling, “k” best features (k≥2) are extracted during pooling, where the “k” value is dynamically calculated based on the length of the input (e.g., in bytes, bits, etc.) and/or the depth of the current pooling layer within the CNN hierarchy. The input (e.g., section of content under analysis) may vary in size during the analysis, and subsequent convolutional and pooling tensors will likewise vary in size in relation to the current depth in the hierarchy and length of the input. The variable-length tensors must be reduced to a common fixed size before interaction with the fully connected/nonlinearity (FCN) logic 116. This dynamic pooling technique allows the classifier to learn and extract a number of features that is proportional to the length of the input, rather than limiting it to a fixed number of features. Furthermore, this approach enables feature extraction to be concentrated in a non-uniform manner across the input, essentially allowing for features to be more densely distributed than in the fixed-length case. The combination of these benefits results in an ability to extract and retain more long-term relationships among the features than would otherwise be possible for arbitrary input lengths.
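As a non-limiting sketch of dynamic pooling, the following example computes “k” from the input length and the depth of the current pooling layer and then retains the “k” largest feature values in their original order. The particular rule used to compute “k” (a share of the input length that shrinks linearly with depth, floored at k=2) is an assumption made for illustration and is not specified by the disclosure; the names `dynamic_k` and `k_max_pool` are likewise hypothetical.

```python
def dynamic_k(input_length, depth, total_depth, k_min=2):
    """Hypothetical rule: k shrinks linearly as the pooling layer gets deeper."""
    remaining_layers = total_depth - depth
    # Integer ceiling division avoids floating-point rounding surprises.
    scaled = -(-(input_length * remaining_layers) // total_depth)
    return max(k_min, scaled)


def k_max_pool(values, k):
    """Keep the k largest values while preserving their original order."""
    k = min(k, len(values))
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


features = [0.1, 0.9, 0.3, 0.7, 0.5, 0.2]
k_first = dynamic_k(len(features), depth=1, total_depth=3)  # 4
k_last = dynamic_k(len(features), depth=3, total_depth=3)   # 2 (the floor)
pooled_first = k_max_pool(features, k_first)  # [0.9, 0.3, 0.7, 0.5]
pooled_last = k_max_pool(features, k_last)    # [0.9, 0.7]
```

Because the final pooling layer in this sketch always yields the floor value of “k”, inputs of different lengths are reduced to a common fixed size before reaching the FCN logic.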

For instance, as an illustrative example, for a first convolution layer of analysis, the convolution logic 112 controls the convolution filter(s) to convolve the incoming representation of a section of the binary code to extract features, which are associated with patterns that may aid in analysis of the executable code for malware. Collectively, the number of features produced by each convolution layer is based on the input and the number of convolution filters selected. Thereafter, for a second (subsequent) convolution layer, the convolution logic 112 applies the convolution filters to the features (or spatially reduced features produced by an interposed pooling layer described above) to extract higher level features. According to this example, the higher level features may include instructions formed by nonlinear combinations of at least some of the features extracted at the first convolution layer. Similarly, for a third (subsequent) convolution layer, the convolution logic 112 applies the convolution filters to the features generated during the second convolution layer (or spatially reduced features produced by an interposed pooling layer described above) to identify even higher level features that are associated with nonlinear combinations of the higher level features extracted by the second convolution layer.

It is contemplated that, after each convolution layer, various operations may be performed on the resultant representation (features) to lessen processing load for the CNN 110. For example, during a first convolution layer, after performing a convolution operation on the incoming representation (i.e., received input 120) by the convolution logic 112, element-wise nonlinear operations may be performed (e.g., by a rectified linear unit or “ReLU”) on the resultant representation. The nonlinear operations map all negative values within the resultant representation to “0” in order to introduce nonlinearities to the resultant representation.

Referring still to FIG. 1, the FCN logic 116 is adapted to perform further nonlinearity operations on the resulting features to further uncover feature combinations that may aid in identifying whether the executable file is or is not associated with a cyber-attack. The resulting features undergo weighting and biasing operations to produce an output that takes into account nonlinear combinations from the entire input volume (e.g., all of the high-level features).

Communicatively coupled to the CNN 110, the classifier 140 is configured to receive the output 130 from the CNN 110 and determine a classification assigned to the executable file. This classification may be based, at least in part, on a threat score 145 generated by threat assessment logic 142, which conducts trained weighting and biasing operations on the received output 130. Such operations translate the received output 130 from a vector (e.g., an ordered sequence of two or more values) into a scalar value, where the scalar value is normalized as the threat score 145 bounded by a prescribed value range (e.g., 0-1; 0-10; 10-100, etc.).
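As one possible illustration of the translation from a vector to a normalized scalar, the sketch below applies a weighted sum and a bias to the received output and then normalizes the result to the 0-1 range with a sigmoid function. The weights, bias, and the name `threat_score` are hypothetical values chosen for the example; in practice these parameters would be learned during training.

```python
import math


def threat_score(output_vector, weights, bias):
    """Collapse a vector to a scalar, then normalize it to the range 0-1."""
    scalar = sum(w * x for w, x in zip(weights, output_vector)) + bias
    return 1.0 / (1.0 + math.exp(-scalar))  # sigmoid normalization


# Hypothetical trained parameters and CNN output, for illustration only.
weights = [0.8, -0.4, 1.2]
bias = -0.5
score = threat_score([1.0, 0.5, 2.0], weights, bias)
neutral = threat_score([0.0, 0.0, 0.0], weights, 0.0)  # 0.5
```

A score bounded by another prescribed range (e.g., 0-10) could be obtained by scaling the sigmoid result accordingly.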

Responsive to detecting the threat score 145 exceeding a prescribed threshold, the message generation logic 150 may generate an “alert” 155 (e.g., a message transmitted to report results of the analysis, especially upon detection of an executable file that may be associated with a cyber-attack). The alert 155 may include metadata associated with the analysis, such as executable file name, time of detection, network address(es) for the executable file (e.g., source IP, destination IP address, etc.), and/or severity of the threat (e.g., based on threat score, targeted network device, frequency of detection of similar executable files, etc.).

Referring now to FIG. 2, an exemplary block diagram of training logic 200 for training certain logic of the cyber-security system 100, including the CNN 110, encoding logic 250 and the classifier 140 of FIG. 1 is shown. Herein, the training logic 200 includes error evaluation logic 210 and weighting adjustment logic 220. The error evaluation logic 210, upon completion of processing of a labeled binary code section 240, compares the value of a threat score generated by the classifier 140 to a known score assigned to the labeled binary code file (e.g., “1” for known malicious binary code file and “0” for known benign binary code file).

According to one embodiment of the disclosure, the error evaluation logic 210 computes a difference (i.e., the error) 230 between the threat score 145 and the known score and provides the error 230 to the weighting adjustment logic 220. Based on the determined error 230, the weighting adjustment logic 220 may alter encoded values set forth in the embedding lookup table 260 stored in memory. Additionally, or in the alternative, the weighting adjustment logic 220 may alter the weighting and/or biasing as applied by (i) the convolution filters 270₁-270_N (N≥1) within the convolution logic 112, (ii) nonlinear analysis logic 280 within the FCN logic 116, and/or (iii) logistic logic 290 within the threat assessment logic 142.
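The error computation and a single weighting adjustment may be sketched as follows for the logistic logic alone. The update rule shown (a gradient step in which each weight is moved in proportion to the error and its input) is an assumption for illustration, since the disclosure does not prescribe a particular adjustment rule; the learning rate and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(features, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def adjust(features, weights, bias, known_score, learning_rate=0.1):
    """One hypothetical adjustment step driven by the error."""
    error = score(features, weights, bias) - known_score  # threat score minus label
    new_weights = [w - learning_rate * error * x
                   for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias, error


features = [1.0, 2.0]             # stand-in for the CNN output
weights, bias = [0.0, 0.0], 0.0   # untrained parameters
known_score = 1.0                 # "1" for a known malicious section

weights, bias, error_before = adjust(features, weights, bias, known_score)
error_after = score(features, weights, bias) - known_score
```

In this sketch the magnitude of the error decreases after the adjustment, which is the intended effect of the tuning described above.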

More specifically, during a training process that may occur on a periodic or aperiodic basis, a labeled training set of malicious and/or benign binary code files 225 is provided to the cyber-security system 100. The binary code files 225 are executable files used for training purposes. The labeled training set includes a plurality of labeled, binary code sections associated with known malicious and/or benign executable files. Each labeled binary code section 240 is provided (as input) to the encoding logic 250. The encoding logic 250 is configured to encode portions of the labeled binary code section 240 (e.g., byte sequences), and thereafter, a representation of these encoded values (input representation) is processed by the convolution logic 112 within the CNN 110. For clarity, the encoding operations are described for byte sequences, although the encoding may be conducted for other measured units of data (e.g., bit, nibble, word, dword, etc.).

The encoding logic 250 may rely on various types of encoding schemes, including “one hot encoding” and “embedding.” For one-hot encoding, the encoding logic 250 substitutes a value of each byte sequence from the labeled binary code section 240 with a corresponding bitwise value from a unity matrix. As an illustrative example, for each byte sequence, the encoding logic 250 stores encoded values that are organized as a 257×257 unity matrix, where the last row/column is an encoded value assigned for padding. Hence, when analyzing the labeled binary code section 240 having a length “L”, a byte value “2” from the labeled binary code section 240 would be encoded with the 257-bit value [0,0,1,0 . . . 0] and the incoming representation 265, provided to the CNN 110 during the training process, would be a 257×L×1 tensor. A “tensor” is a multi-dimensional vector.
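A minimal sketch of the one-hot encoding described above is set forth below. It assumes that index 256 is the value reserved for padding and that the encoded result is stored position by position (an L×257 arrangement, i.e., the transpose of the 257×L layout described above); the names `one_hot_encode`, `PAD`, and `VOCAB` are hypothetical.

```python
VOCAB = 257  # 256 possible byte values plus one padding value
PAD = 256    # assumption: the last index is reserved for padding


def one_hot_encode(section, length):
    """Encode a byte section as `length` one-hot rows of 257 bits each."""
    indices = list(section[:length])
    indices += [PAD] * (length - len(indices))  # pad short sections
    return [[1 if i == index else 0 for i in range(VOCAB)]
            for index in indices]


encoded = one_hot_encode(bytes([2, 255]), length=3)
first_row = encoded[0]  # byte value 2 -> [0, 0, 1, 0, ..., 0]
```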

Another type of encoding scheme is “embedding,” where the embedding lookup table 260 includes encode values for each byte sequence. During a training session, the embedding lookup table 260 is initialized to random values and, based on the machine learning function followed by the weighting adjustment logic 220, these encode values are adjusted. For this embodiment of the disclosure, the embedding lookup table 260 would be sized as a K×257 matrix, where “K” corresponds to the number of dimensions (rows/entries) for the embedding lookup table 260. Hence, each byte sequence of the binary code section 240 would be encoded to produce the input (incoming representation) 265 provided to the CNN 110.
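The embedding scheme may be illustrated by the following sketch, in which a K×257 lookup table is initialized to random values and each byte of a section is replaced by the corresponding K-dimensional column of the table. The initialization range, the seed, and the function names are assumptions made for the example; the adjustment of the table during training is not shown.

```python
import random


def init_embedding_table(k, vocab=257, seed=0):
    """Create a K x 257 table of small random values (hypothetical range)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(vocab)] for _ in range(k)]


def embed(section, table):
    """Replace each byte with its K-dimensional column from the table."""
    return [[row[byte] for row in table] for byte in section]


table = init_embedding_table(k=4)
vectors = embed(bytes([0, 2, 255]), table)  # three vectors of four values
```

Unlike one-hot encoding, the encoded values here are real-valued and may be tuned by the weighting adjustment logic as training proceeds.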

The CNN 110 receives the input 265 and performs operations on the encoded values of the input 265 to produce an output (outgoing representation) 285 that concentrates its analysis on features that may be probative in determining whether the labeled binary code section 240 includes malware, as described above for FIG. 1. Based on the errors 230 determined for each successive, labeled binary code section 240, the weighting adjustment logic 220 may alter one or more weighting parameters of the convolution filters 270₁-270_N utilized by the convolution logic 112 in performing a specific programmatic layer function in efforts to reduce error for successively analyzed binary code sections. The weighting parameter(s) are stored in data store 275 along with biasing parameters, each associated with the application of a particular convolution filter 270₁, . . . or 270_N and applied by the convolution logic 112 as the particular convolution filter 270₁, . . . or 270_N convolves the incoming representation 265, as described above.

Additionally, based on the error 230 determined, the weighting adjustment logic 220 may alter one or more weighting parameters and/or a biasing parameter utilized by the nonlinear analysis logic 280 within the FCN logic 116, which is used in producing the output 285 from the CNN 110. An illustrated example as to how modification of the weighting parameter(s) and/or biasing parameter of the nonlinear analysis logic 280 may influence the output 130 is shown in FIG. 6E.

Lastly, based on the error 230 determined, the weighting adjustment logic 220 may alter one or more weighting parameters and/or a biasing parameter utilized by the logistic logic 290. The logistic logic 290 of the threat assessment logic 142 applies weighting to each parameter of the input 285 along with biasing to produce a scalar value. The scalar value is used by the threat assessment logic 142 to produce the threat score 145 for the labeled binary code section 240, which is used by the error evaluation logic 210 to determine the error 230 for potential “tuning” of (i) the weighting and biasing for the convolution filters 270₁-270_N, (ii) the weighting and biasing for the nonlinear analysis logic 280, and/or (iii) the weighting and biasing for the logistic logic 290 as well as encode values within the embedding lookup table 260 (when embedding encoding is used).

Referring to FIGS. 3A-3B, exemplary flowcharts illustrating operations performed by one embodiment of the cyber-security system 100 are shown. Herein, an executable file is received by a network device deploying the cyber-security system via an interface (block 300). For example, as an illustrative embodiment, the network device may obtain the executable file as a series of packets extracted during transmission over a network. Hence, the interface may be a communication port (and circuitry associated therewith) and/or a network interface (tap) that intercepts the binary packets forming the executable file and either (i) re-routes these binary packets to the network device, (ii) generates a copy of the binary packets associated with the executable file and provides these copied binary packets to the network device, or (iii) reassembles the binary code associated with the executable file prior to providing the binary code to the network device.

Herein, as shown in FIG. 3A, operating as an input layer to the CNN, a pre-processor (shown at 420 in FIG. 4A) of the cyber-security system extracts a section of binary code from the received executable file (operation 305). Additionally, the pre-processor generates an input, namely a representation of the binary code (operation 310). The input is provided to CNN-based logic (shown at 430 of FIG. 4A), which includes the convolution logic, the pooling logic and the FCN logic as described herein. The convolution logic, pooling logic and FCN logic are trained using supervised learning and generate an output in response to that input (operation 315).

The output from the CNN is provided to the classifier (shown at 140 of FIG. 4A) (e.g., to its threat assessment logic), which produces a threat score that indicates a likelihood of the executable file being associated with a cyber-attack (operation 320). Thereafter, the classifier may compare the threat score to different thresholds to determine what actions, if any, are to be taken (operations 325 and 330). As shown, where the threat score falls below a first threshold, the executable file is determined to be benign and no further actions are needed (operation path A). Where the threat score is equal to or exceeds the first threshold, another determination is made as to whether the threat score exceeds a second threshold (operation 335).

As shown in FIGS. 3A-3B, responsive to the threat score exceeding the second threshold, the classifier determines that the executable file is malicious and a message generation logic within the post-analysis subsystem (shown at 475 of FIG. 4B) generates an alert message to warn of a potential cyber-security attack based on detection of the malicious executable file (operations 340 and 360; operation path B). Additionally, or in the alternative, the cyber-security system may perform remediation operations on the malicious executable file to prevent execution of the executable file, such as quarantining the executable file, deleting the executable file, storing the executable file as part of a log and setting permissions to read only (for subsequent analysis), or the like (operation 365).

Referring still to FIGS. 3A-3B, in one embodiment, responsive to the threat score falling within a range between the first and second thresholds, a behavioral analysis is performed on the executable file in order to determine whether to classify the executable file as benign or malicious (operations 335 and 350; operation path C). Also, in one embodiment, responsive to the executable file being classified as malicious (i.e., the threat score exceeding the second threshold), a behavioral analysis is performed on the executable file in order to verify the prior classification. It should be emphasized that embodiments of the invention can accurately detect whether an executable file should be classified as malware without employing behavioral analysis.

In particular, for behavioral analysis, the executable file is executed within a virtual machine instantiated by the network device (or, in other embodiments, in another network device) that is configured with a selected software profile (e.g., certain application(s), operating system, etc.). The selected software profile may be a software profile appropriate to execute the executable, for example, the software profile including operating system and one or more applications matching those used by the network device or a different software profile that is used by computers within an enterprise to which the network device is connected. The behaviors, namely the activity performed by the virtual machine and/or the executable file during execution, are monitored and subsequently analyzed to determine whether the executable file is considered to be malicious based on performance (or omission) of one or more behaviors corresponding to those of known malware (operation 355).

Where the executable file is benign, no further analysis of the executable file may be required (operation path A). However, if the executable file is determined to be malicious, logic within the cyber-security system may prompt the message generation logic to generate an alert message and/or perform remediation operations as described above (operations 360 and 365).
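The two-threshold decision flow of FIGS. 3A-3B may be summarized by the following sketch. The threshold values and the returned labels are hypothetical and are chosen only to illustrate operation paths A, B and C.

```python
def triage(threat_score, first_threshold, second_threshold):
    """Map a threat score to one of the three operation paths."""
    if threat_score < first_threshold:
        return "benign"                # path A: no further action
    if threat_score > second_threshold:
        return "malicious"             # path B: alert and/or remediate
    return "behavioral_analysis"       # path C: score between the thresholds


# Hypothetical thresholds on a 0-1 threat score.
FIRST, SECOND = 0.4, 0.8
decisions = [triage(s, FIRST, SECOND) for s in (0.2, 0.4, 0.8, 0.9)]
```

Consistent with the description above, a score equal to the first threshold, or equal to but not exceeding the second threshold, falls on path C in this sketch.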

Referring now to FIG. 4A, a first illustrative embodiment of the logical operations and architecture of the cyber-security system 100 deploying a computational analysis subsystem 400 trained by training sets to analyze binary code from a file 410 is shown. Herein, according to one embodiment of the disclosure, the file 410 may be an executable. In other embodiments, for analysis by the cyber-security system 100, the file may be another file type such as, for example, a text file, a Portable Document Format (PDF) file, a Presentation File Format (PPT) file, or a script; however, for convenience, the description will continue with respect to executable files as an illustrative example. Herein, the cyber-security system 100 is configured to analyze the executable file 410 (of any size) through the use of a CNN-based analysis. The computational analysis subsystem 400 includes a pre-processor 420 (pre-processing logic 422 and encoding logic 424), CNN-based logic 430 (convolution logic 112, pooling logic 114, and FCN logic 116), and the classifier 140, which includes at least the threat assessment logic 142. Each of these logic components may be software running on a network device or on multiple network devices and collectively operate to determine whether the executable file 410 is malicious (i.e., associated with a cyber-attack) or is benign. The executable file 410 may be provided in a Portable Executable (PE) format including a Common Object File Format (COFF) header and a Section Table, although the executable file 410 may be provided in other formats.

The pre-processing logic 422 operates as part of an input layer of the computational analysis subsystem 400 to receive the executable file 410 as a file or as a collection of binary packets from which content may be extracted and aggregated (reassembled) to produce the executable file 410. It is contemplated that Transmission Control Protocol (TCP) sequence numbers within the binary packets may be relied upon to position the content of these binary packets in a correct order when forming the executable file. It is contemplated that the pre-processing logic 422 may further include decode logic and/or decompression logic to recover the binary code where some of the binary packets are encoded and/or compressed.
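As a simplified illustration of ordering packet content by TCP sequence number, the sketch below sorts segments by sequence number, discards fully retransmitted data, and concatenates the payloads. It assumes that each segment is available as a (sequence number, payload) pair, that all segments belong to a single flow, and that the sequence numbers do not wrap around; the function name `reassemble` is hypothetical.

```python
def reassemble(segments):
    """Concatenate TCP payloads in sequence-number order.

    Assumptions: one flow, no sequence-number wraparound, no missing data.
    """
    data = bytearray()
    next_seq = None
    for seq, payload in sorted(segments, key=lambda item: item[0]):
        if next_seq is None:
            next_seq = seq
        if seq + len(payload) <= next_seq:
            continue  # retransmission of data already placed
        if seq > next_seq:
            raise ValueError("gap in sequence numbers; segment missing")
        data += payload[next_seq - seq:]  # skip any overlapping prefix
        next_seq = seq + len(payload)
    return bytes(data)


# Out-of-order arrival with one duplicate segment.
segments = [(1004, b"PE\x00\x00"), (1000, b"MZ\x90\x00"), (1000, b"MZ\x90\x00")]
rebuilt = reassemble(segments)
```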

Upon receiving the executable file 410, the pre-processing logic 422 is responsible for selecting a section of binary code 415 from the executable file 410 for analysis. In some embodiments, the pre-processing logic 422 may select a plurality of sections of the binary code for analysis by the CNN-based logic 430, each being analyzed, separately or in combination, and conditioned for analysis. The size of the binary code section 415 may be a static (constant) value or dynamic based on rules established during the training and dependent on attributes of the binary code such as length or format.

Referring to FIG. 5, an illustrative embodiment of the operations performed by the pre-processing logic 422 of FIGS. 4A-4B in extracting the binary code section 415 from the executable file 410 is shown. The pre-processing logic 422 initially determines a size of the executable file and determines whether that size exceeds a prescribed length “M” (operations 500 and 505). The metric used in the size calculations may be in the same measured units as the encoding, such as bytes for this illustrative example.

Where the size of executable file is less than a prescribed size (M), such as 100K bytes for example, which may be a static value set for the computational analysis or a value set by an administrator, the entire binary code of the executable file is extracted from the executable file (operation 510). This binary code, with additional padding as desired, is provided to the encoding logic to generate an input (incoming representation) for processing by the CNN (operation 515).

However, if the size of the executable file exceeds the prescribed value (M), the pre-processing logic 422 determines whether the selection of the binary code section is to be directed to contiguous binary code or multiple non-contiguous binary code subsections that can be aggregated as the binary code section (operation 520). By extracting subsections of binary code in lieu of extracting a single contiguous binary code section of size M, the computational analysis subsystem has an ability to analyze a broader range of the executable file. This broader analysis of the executable file, in some cases, may provide increased accuracy in classifying the executable file.

Where the binary code section is to be a single contiguous section of binary code, according to one embodiment of the disclosure, the pre-processing logic extracts “M” bytes at the start of the executable code (which may (or may not) be the entire binary code) (operation 525). These M bytes are provided to the encoding logic for use in generating the input to the CNN-based logic 430 through “one hot encoding,” “embedding” or another encoding technique (operation 535). It is contemplated that, for certain embodiments, the “M” extracted bytes may be a dynamic value that can be altered, at least in part, based on the size of the input (e.g., file size).

Alternatively, where the binary code section is to be produced from binary code at a location different than the starting location or from an aggregate of subsections of binary code, the pre-processing logic 422 receives one or more offsets that denote starting memory address locations from which the pre-processing logic 422 extracts binary code from the executable file. The offsets may be preset based on file format (at positions where known malware tends to reside) or may be set by an administrator having knowledge of current malware insertion trends (operation 530). These subsections of binary code may be aggregated to produce the binary code section that is provided to the encoding logic (operation 535).
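The selection operations of FIG. 5 may be sketched as follows. The sketch assumes that, when offsets are supplied, equally sized subsections are extracted at each offset and aggregated up to “M” bytes; the subsection size, the example offsets, and the name `extract_section` are hypothetical.

```python
def extract_section(binary, m, offsets=None):
    """Select up to `m` bytes of binary code for encoding."""
    if len(binary) <= m:
        return binary              # small file: use the entire binary code
    if not offsets:
        return binary[:m]          # single contiguous section from the start
    chunk = m // len(offsets)      # assumption: equal-sized subsections
    section = b"".join(binary[o:o + chunk] for o in offsets)
    return section[:m]             # aggregate of non-contiguous subsections


data = bytes(range(20))
whole = extract_section(data, m=32)                       # all 20 bytes
contiguous = extract_section(data, m=8)                   # bytes 0-7
aggregated = extract_section(data, m=8, offsets=[0, 12])  # bytes 0-3 and 12-15
```

Padding of a section shorter than the prescribed length is left to the encoding step in this sketch.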

Referring back to FIG. 4A, upon receipt of the binary code section 415, the encoding logic 424 of the pre-processor 420 encodes binary code section 415 to generate a representation 425 of the binary code section 415 (referred to as the “input” 120 in FIG. 1) in a form and format suitable for processing by the CNN-based logic 430. The (byte) encoding may be conducted by a variety of techniques, including “one hot encoding” or “embedding,” as described above.

Communicatively coupled to the encoding logic 424, the CNN-based logic 430 conducts operations represented by a plurality of executable logic layers (“layers”) including one or more convolution layers by the convolution logic 112, one or more pooling layers by the pooling logic 114, and one or more fully connected/nonlinearity (FCN) layers by the FCN logic 116. Each of these convolution layers is configured to (i) process an incoming representation, such as input 425 for the first convolution layer, and (ii) apply operations in accordance with a programmatic layer function to produce a resultant representation 432. These operations may be performed using one or more convolution filters, which are pre-trained using a training set including patterns associated with benign executable files and/or malicious executable files as described above. Each resultant representation 432 is produced by convolving, based on a selected stride for the convolution, each filter over each incoming representation (e.g., input 425 or resultant representations 432 for subsequent convolution layers). The convolution layers are independent and may be performed successively as shown by feed-forward arrow 450 and/or after a pooling layer as referenced by feed-forward arrow 455. The depiction of feed-forward arrows 450 and 455 is merely for convenience to represent that multiple, independent convolution layers and one or more independent pooling layers may be performed by the computational analysis subsystem 400.

More specifically, after each convolution layer, certain operations may be performed on the resultant representation 432 until a final output 436 is produced by the CNN-based logic 430. For example, after the convolution logic 112 performs a convolution operation on the incoming representation (e.g., input 425), element-wise nonlinear operations (e.g., rectified linear unit or “ReLU”) may be performed on the resultant representation 432 to provide nonlinearities to the resultant representation 432 for that convolution layer.
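A minimal sketch of a single convolution filter applied with a selected stride, followed by the element-wise ReLU operation, is provided below. It assumes a one-dimensional incoming representation with a single value per position and, as is conventional for CNNs, slides the filter without flipping it; the filter weights, bias, and function names are hypothetical.

```python
def conv1d(signal, kernel, bias=0.0, stride=1):
    """Slide one filter over the signal, applying weighting and biasing."""
    width = len(kernel)
    return [
        sum(k * s for k, s in zip(kernel, signal[i:i + width])) + bias
        for i in range(0, len(signal) - width + 1, stride)
    ]


def relu(values):
    """Map every negative value to 0, leaving other values unchanged."""
    return [max(0.0, v) for v in values]


signal = [1.0, 2.0, 0.0, -1.0, 3.0]
kernel = [1.0, -1.0]                   # hypothetical filter weights

resultant = conv1d(signal, kernel)     # [-1.0, 2.0, 1.0, -4.0]
activated = relu(resultant)            # [0.0, 2.0, 1.0, 0.0]
strided = conv1d(signal, kernel, stride=2)  # [-1.0, 1.0]
```

A larger stride produces a shorter resultant representation, as the last line of the sketch shows.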

Additionally, after a convolution layer produces a resultant representation 432, a pooling layer may perform operations on the resultant representation 432. The pooling operation is conducted to reduce the spatial dimensions of the resultant representation 432 prior to providing this transformed resultant representation 434 to a next convolution layer (as identified by feed-forward arrow 455) or to the FCN logic 116. Hence, as shown, the resultant representation 432 via feed-forward arrow 450 or transformed resultant representation 434 via feed-forward arrow 455 may operate as the incoming representation for a next convolution layer.

As yet another example of a weighting and/or biasing operation, the FCN logic 116 receives a resultant output representation from a convolution layer 432 or pooling layer 434, and thereafter, applies weights and biases once or in an iterative manner 460 to produce an output (vector) 436 from the CNN-based logic 430. Although not shown, it is contemplated that the FCN logic 116 may operate as an intermediary operation between convolution layers.

Communicatively coupled to the CNN-based logic 430, the classifier 140 is configured to receive the output 436 from the CNN-based logic 430 and classify the executable file, based, at least in part, on a threat score. The threat score is generated by threat assessment logic 142, which may perform a sigmoid function or other function to produce a scalar value, which is used to generate the normalized threat score representing a level of maliciousness as analyzed by the computational analysis subsystem 400.

Referring now to FIG. 4B, a second illustrative embodiment of the cyber-security system 100 deploying the pre-processor 420 and CNN-based logic 430 of the computational analysis subsystem 400 of FIG. 4A, an intelligence-driven analysis subsystem 450 and a post-analysis subsystem 475 to analyze the executable file 410 is shown. Herein, operating concurrently (overlapping at least partially in time) or sequentially with the computational analysis subsystem 400, the intelligence-driven analysis subsystem 450 is configured to receive the executable file 410 and inspect the executable file for indicators associated with a cyber-attack. This inspection is conducted by static analysis logic 460 residing in the network device that is also performing the computational analysis described (see FIG. 7B) or residing in a different network device.

Where the executable file 410 is received in its entirety, the static analysis logic 460 is configured to conduct an analysis of the contents of the executable file 410 without any re-assembly. However, where the executable file 410 is received as a plurality of binary packets, the static analysis logic 460 is further configured to analyze the content of the binary packets forming the executable file 410 to identify any communication protocol anomalies and/or suspicious content in these packets. For example, with respect to payload inspection of the binary packets, the contents of the payloads may be extracted and reassembled to form the executable file for inspection. The header and other portions of the binary packets may be inspected separately.

According to one embodiment of the disclosure, the indicators may be based on intelligence generated by cyber-security analysts and captured in digital signatures (hashes) of known malicious executable files, heuristics and pattern matching based on known executable files, or the like. The comparison of the known indicators associated with malicious and/or benign executable files with the contents of the executable file 410 enables a determination as to whether the executable file 410 is malicious, such as including malware. Thereafter, the static analysis logic 460 produces an output 462 representing features computed from the detected indicators.

In some embodiments, the output 462 from the static analysis logic 460 may be provided to a static encoding logic 465. As a result, the static encoding logic 465 encodes the representative features into a format compatible with the format utilized by the computational analysis subsystem 400. In particular, the encoding may be based, at least in part, on the category of the feature.

More specifically, the static encoding logic 465 translates Boolean, numeric and categorical features detected by the static analysis logic 460 and creates a vector of real values. For instance, where the feature is a Boolean value (true or false), the static encoding logic 465 translates or encodes the Boolean value as a digital “1” or “0”. For numeric values, the static analysis logic 460 may convert a numeric value into a different type of numeric value, while categorical features may be encoded in accordance with the “one-hot encoding” technique (each categorical feature would be represented by a unique, encoded value). Hence, the static encoding logic 465 produces an output that, after undergoing nonlinear operations by FCN logic 470 and some pre-processing (e.g., normalization, scaling, whitening), is provided to the post-analysis subsystem 475 in a format similar to and compatible with output 436 from the FCN logic 116.
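The translation of Boolean, numeric and categorical features into a vector of real values may be sketched as follows. The feature names (`has_signature`, `section_count`, `packer`) and the category list are hypothetical examples and are not taken from the disclosure; normalization and the subsequent FCN operations are not shown.

```python
def encode_static_features(features, categories):
    """Flatten mixed-type features into one vector of real values."""
    vector = []
    for name, value in features.items():
        if isinstance(value, bool):            # check bool before int
            vector.append(1.0 if value else 0.0)
        elif isinstance(value, (int, float)):
            vector.append(float(value))        # numeric conversion
        else:                                  # categorical: one-hot encoding
            vector.extend(1.0 if value == c else 0.0
                          for c in categories[name])
    return vector


# Hypothetical features reported by the static analysis logic.
features = {"has_signature": True, "section_count": 5, "packer": "upx"}
categories = {"packer": ["none", "upx", "other"]}
encoded = encode_static_features(features, categories)
# encoded == [1.0, 5.0, 0.0, 1.0, 0.0]
```

The Boolean test precedes the numeric test because, in Python, a Boolean value is also an integer.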

Herein, according to one embodiment of the disclosure, the post-analysis subsystem 475 includes grouping logic 480, FCN logic 485 to provide nonlinearity to the collective output 482 produced by the grouping logic 480, and the threat assessment logic 142. The grouping logic 480 combines the outputs 436 and 472 of these two subsystems into a result (e.g., concatenated result) to which nonlinear combinations of the outputs 436 and 472 from each subsystem are analyzed in determining a result provided to the threat assessment logic 142 to determine the threat score.

As mentioned above, the concurrent operations of the computational analysis subsystem 400 and the intelligence-driven analysis subsystem 450 complement each other. The intelligence-driven analysis subsystem 450 targets an analysis of the context of the executable file, e.g., anomalous data placement in binary packets, communication protocol anomalies, and known malicious and benign patterns associated with malicious and benign executable files. The computational analysis subsystem 400 targets digital bit patterns, independent of the context being analyzed by the intelligence-driven analysis subsystem 450. Hence, the computational analysis subsystem 400 is more content centric, which may better detect coding tendencies or styles by malware authors. The computational analysis subsystem 400 provides further assistance in the detection of zero-day (first time) cyber-attacks, where the malware is unknown and has not been previously detected, and in some cases, never analyzed previously.

Herein, the post-analysis subsystem 475 is communicatively coupled to both the computational analysis subsystem 400 and the intelligence-driven analysis subsystem 450, described above. The post-analysis subsystem 475 may include (i) grouping logic 480 and (ii) the classifier 140. According to one embodiment of the disclosure, the grouping logic 480 may be configured to perform mathematical or logical operations (e.g., concatenation) on content from the received outputs 436 and 472 to generate the collective output 482. The classifier 140, as described above, is configured to receive the collective output 482 from the grouping logic 480 and determine a classification assigned to the executable file 410 based, at least in part, on a threat score for collective output 482 in a manner as described above. The message generator (at 150 of FIG. 1) and/or remediation logic (at 760 of FIG. 7A) may be deployed to generate alerts and remediate malicious executable files as also described above.
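As a minimal sketch of the concatenation option mentioned above, the grouping operation may be expressed in Python as shown below. The function name `group_outputs` and the sample values are assumptions made for illustration; the disclosure also contemplates other mathematical or logical operations.

```python
from typing import List, Sequence


def group_outputs(
    cnn_output: Sequence[float], static_output: Sequence[float]
) -> List[float]:
    """Concatenate the two subsystem outputs into one collective vector."""
    return list(cnn_output) + list(static_output)


# Hypothetical outputs from the two subsystems.
collective = group_outputs([0.102, 0.0], [1.0, 0.0, 0.35])
```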

Hence, by deploying the general operability of the computational analysis subsystem 400 together with the intelligence-driven analysis subsystem 450, a more robust technique for classifying executable files is provided.

Referring to FIGS. 6A-6H, an illustrative example of operations conducted on an extracted binary code section of an executable file and subsequent representations of the binary code section substantially performed by logic of the computational analysis subsystem 400 of FIGS. 4A-4B is shown. For simplicity, as shown in FIG. 6A, a 6-byte executable file 600 is to be analyzed, although it is contemplated that the executable file 600 may be of any size (e.g., tens or hundreds of thousands of bytes), as the operations would be performed in a similar manner as described below.

Upon receiving the executable file 600, the pre-processing logic is responsible for extracting a section of binary code 610 from the executable file 600 for analysis. Herein, the section of binary code 610 is set to ten (10) bytes, which is larger in size than the executable file 600. As a result, padding 612 is added to actual extracted binary code 614 to produce the binary code section 610 as shown in FIG. 6B.
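The extraction and padding step may be sketched in Python as follows. This is an illustrative assumption only: the disclosure does not state the padding value or where the padding is placed, so the sketch appends zero bytes after the extracted code, and the name `extract_section` is hypothetical.

```python
def extract_section(data: bytes, section_len: int, pad_byte: int = 0) -> bytes:
    """Return exactly section_len bytes: the leading bytes of data, padded if short."""
    section = data[:section_len]  # truncates when the file is longer
    missing = section_len - len(section)  # zero when no padding is needed
    return section + bytes([pad_byte]) * missing


# A 6-byte file extended to a 10-byte section, as in the example above.
section = extract_section(b"\x4d\x5a\x90\x00\x03\x00", 10)
```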

Referring to FIG. 6C, the binary code section 610 (with padding) is received by the encoding logic, which generates an input 620 in a form and format suitable for processing by the CNN. Herein, for this illustrative embodiment, the byte encoding is conducted in accordance with an embedding encode scheme, where the encoding logic substitutes each byte sequence with “K” element pairs 622 maintained in a K-dimensional embedding lookup table 260. The element pairs 622 are set and adjusted by the training logic during a training session.
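A simple sketch of such an embedding lookup, written in Python, is given below. The table contents are random placeholders standing in for trained values, and the 256-row table size (one row per possible byte value) and the function names are assumptions for illustration.

```python
import random
from typing import List


def make_embedding_table(k: int, vocab: int = 256, seed: int = 0) -> List[List[float]]:
    """Build a vocab x k table of placeholder values (training would adjust these)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(vocab)]


def embed(section: bytes, table: List[List[float]]) -> List[List[float]]:
    """Replace each byte with its k-element row from the lookup table."""
    return [table[b] for b in section]  # iterating bytes yields integers 0..255


table = make_embedding_table(k=2)
embedded = embed(bytes([77, 90, 144, 0, 3, 0, 0, 0, 0, 0]), table)
```

Because the substitution is a pure lookup, identical byte values always map to identical element pairs, regardless of their position in the section.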

Communicatively coupled to the encoding logic, the CNN may be logically represented by a plurality of layers, including one or more convolution layers (see FIGS. 6D-6E), one or more pooling layers (see FIG. 6F), and one or more fully connected/nonlinearity (FCN) layers (see FIG. 6G). For this example, for simplicity, a sequential operation flow for each layer is discussed without any iterations. For instance, operations of a convolution layer and pooling layer are discussed without the output of the pooling layer being provided as an input (incoming representation) to another convolution layer featuring hyper-parameters and convolution filters that may differ from a prior convolution layer. Of course, it is contemplated that the operations may be performed in other embodiments in a similar manner, but with different inputs.

Herein, as an illustrative embodiment, the parameters for the convolution layer are set as follows: (i) the number of convolution filters (M₀) is set to “2” (M₀=2); (ii) amount of lengthwise, zero padding (P) permitted is equal to “1” (P=1); (iii) stride (S) is set to “1” (S=1); (iv) the length (F) of each convolution filter is set to “3” (F=3); (v) the height (K) of the convolution filter is set to the dimension (K) of the embedded matrix, which is “2” (K=2). Based on these settings, the input 620 provided to the convolution logic 112 of the convolution layer (see FIGS. 4A-4B) would be a 2×10×1 tensor while an output 630 (1×10×2 tensor) from the convolution layer is comprised of a first feature map 632 and a second feature map 634.
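The length of each feature map follows from the standard relationship between input length, filter length, padding and stride. The Python sketch below (the function name `conv_output_length` is hypothetical) applies that relationship to the settings above and yields the length of ten reflected in the 1×10×2 output tensor.

```python
def conv_output_length(n: int, f: int, p: int, s: int) -> int:
    """Positions visited by a length-f filter over n elements with padding p, stride s."""
    return (n - f + 2 * p) // s + 1


# Settings from the illustrative embodiment: 10 inputs, F=3, P=1, S=1.
feature_map_length = conv_output_length(n=10, f=3, p=1, s=1)
```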

As shown in FIG. 6D, with a stride of “1”, a first convolution filter 636 (with a bias of “1”) operates on a first grouping of three parameter pairs of the input 620 [0,0; −0.2,−0.1; 0.2,0.5] to produce a convolution value “0.6”. Next, the first convolution filter 636 shifts by a single element pair and operates on the next grouping of parameter pairs [−0.2,−0.1; 0.2,0.5; −0.2,−0.1]. Such operations continue for the first convolution filter 636 to produce the first feature map 632 of the output 630. These operations are similarly performed using the second convolution filter 638 (with a bias of “0”) to produce the second feature map 634 of the output 630. As illustrated, convolution operations on the grouping [−0.2,−0.1;0.3,0.5;1.4,−0.1] are shown to produce convolution values of “4.92” and “0.35,” respectively. As yet, the relevance of these convolution values is unclear as pooling and/or other weighting/biasing operations are subsequently performed.
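The sliding-window operation may be sketched in Python as follows. The filter weights and input values used here are hypothetical, since the weights of filters 636 and 638 are shown only in the figures, so the sketch does not attempt to reproduce the convolution values quoted above.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv1d(
    x: Sequence[Sequence[float]],
    filters: Sequence[Matrix],
    biases: Sequence[float],
    pad: int = 1,
    stride: int = 1,
) -> Matrix:
    """Slide each F x K filter along a sequence of K-element rows."""
    k = len(x[0])
    zero_rows = [[0.0] * k for _ in range(pad)]
    padded = zero_rows + [list(row) for row in x] + zero_rows
    f = len(filters[0])
    feature_maps: Matrix = []
    for weights, bias in zip(filters, biases):
        fmap = []
        for start in range(0, len(padded) - f + 1, stride):
            window = padded[start:start + f]
            # weighted sum over the F x K window, plus the filter's bias
            total = bias + sum(
                weights[i][j] * window[i][j] for i in range(f) for j in range(k)
            )
            fmap.append(total)
        feature_maps.append(fmap)
    return feature_maps


# Hypothetical input (10 element pairs) and two hypothetical 3 x 2 filters.
inputs = [[-0.2, -0.1], [0.2, 0.5]] * 5
all_ones = [[1.0, 1.0]] * 3
all_halves = [[0.5, 0.5]] * 3
maps = conv1d(inputs, [all_ones, all_halves], biases=[1.0, 0.0])
```

With zero padding of one element pair on each side, the first window covers the padding row and the first two element pairs, which is why the first grouping in the example above begins with [0,0].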

After the above-described convolution operation is performed by convolution logic as shown in FIG. 6D, an element-wise nonlinear operation, such as a rectified linear unit (ReLU) operation as known in the art, may be performed on the output 630 from the convolution logic (i.e. each element for each feature map 632 and 634). The element-wise nonlinear operation maps all negative values within the output 630 to a zero (“0”) element value in order to provide nonlinearities to the output 630 and produce a rectified, nonlinear output 640, including rectified features 642 and 644.
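For illustration, the element-wise ReLU operation may be written in Python as shown below; the function name and sample feature maps are hypothetical.

```python
from typing import List


def relu(feature_maps: List[List[float]]) -> List[List[float]]:
    """Map every negative element to zero and leave other elements unchanged."""
    return [[max(0.0, value) for value in fmap] for fmap in feature_maps]


rectified = relu([[0.6, -1.2, 4.92], [0.35, -0.01, 0.0]])
```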

Thereafter, the pooling logic associated with the pooling layer may perform operations on the nonlinear output 640 in order to reduce its spatial dimensions. As shown in FIG. 6F, the pooling operation does not affect the height (K=2) dimension of the nonlinear output 640, where the height dimension equates to the number of convolution filters. Herein, for this illustrative example, max pooling operations are conducted where the pooling parameters include (i) the pooling length (PL) being set to “2” (PL=2) and (ii) the pooling stride (PS) being set to “2” (PS=2), which reduces the processing elements of the features 650 and 652 by fifty percent (50%). These reduced features 650 and 652 are “flattened” by producing a single pooling vector 654, which is provided to the fully connected/nonlinearity logic 116 of FIGS. 4A-4B.
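A minimal Python sketch of max pooling with PL=2 and PS=2, followed by flattening, is given below. The function names and the sample values are assumptions; with two rectified features of ten elements each, the sketch yields five elements per feature and a flattened vector of ten elements.

```python
from typing import List


def max_pool1d(fmap: List[float], length: int = 2, stride: int = 2) -> List[float]:
    """Keep the largest value in each window of the given length."""
    return [
        max(fmap[i:i + length]) for i in range(0, len(fmap) - length + 1, stride)
    ]


def flatten(feature_maps: List[List[float]]) -> List[float]:
    """Join the pooled feature maps end to end into one vector."""
    return [value for fmap in feature_maps for value in fmap]


# Two hypothetical rectified features, ten elements each.
features = [
    [0.6, 0.0, 4.92, 1.0, 0.0, 0.3, 0.2, 0.9, 0.0, 0.1],
    [0.35, 0.0, 0.0, 0.7, 0.4, 0.0, 0.0, 0.0, 0.5, 0.2],
]
pooled = [max_pool1d(f) for f in features]
pooling_vector = flatten(pooled)
```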

Referring to FIG. 6G, the input into the fully connected/nonlinearity logic is the pooling vector 654 having a length (H) of “10” elements. One of the hyper-parameters set for the fully connected/nonlinearity logic is the number of hidden units (H₀), which sets the dimension of a weighting matrix (H×H₀) 660. As shown, the weighting matrix 660 is a 10×2 matrix. The bias parameter 662 is set during the training session, which is a vector of length “2” (H₀). Based on the input pooling vector 654, the output 665 from the CNN-based logic 430 is equal to [0.102, −0.032], and after conducting an element-wise nonlinear (e.g., ReLU) operation, the output 665 ([0.102, 0]) is determined. The output 665 from the CNN-based logic 430 (convolution and pooling) is a set of high level features that are useful for distinguishing benign binary code (goodware) from malware. An element in this tensor captures information contained in a contiguous byte sequence of the input 415, where the spatial extent of the byte sequence is dependent on convolution and pooling parameters. This positional information is lost when the output 655 is then run through an FCN layer.
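The fully connected operation may be sketched in Python as a matrix product with a bias, followed by ReLU. The weights, bias and input in this sketch are hypothetical and were chosen only so that the pre-activation values equal the [0.102, −0.032] quoted above; they are not the values of the weighting matrix 660 or the bias parameter 662.

```python
from typing import List, Sequence


def dense(
    x: Sequence[float], weights: Sequence[Sequence[float]], bias: Sequence[float]
) -> List[float]:
    """Compute x . W + b for an H-element input and an H x H0 weighting matrix."""
    return [
        bias[j] + sum(x[i] * weights[i][j] for i in range(len(x)))
        for j in range(len(bias))
    ]


def relu_vector(values: Sequence[float]) -> List[float]:
    """Element-wise ReLU: negative elements become zero."""
    return [max(0.0, v) for v in values]


# Hypothetical 10-element input, 10 x 2 weighting matrix and length-2 bias.
x = [1.0] * 10
w = [[0.01, -0.01] for _ in range(10)]
b = [0.002, 0.068]
pre_activation = dense(x, w, b)   # approximately [0.102, -0.032]
hidden = relu_vector(pre_activation)
```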

Communicatively coupled to the CNN-based logic 430, a classifier is configured to receive the output 665 from the CNN-based logic 430 and determine a classification assigned to the executable file. As shown in FIG. 6H, this classification may be accomplished by generating a threat score 678 based on the received output 665. As shown, a weighting vector 670 and a bias 672 are “tuned” during the training session, where the output 665 processed with the weighting vector 670 and the bias 672 provides a scalar value 674. Based on the scalar value 674, a threat score 678 is generated using a scoring function 676 (e.g., sigmoid function as shown), which identifies the likelihood of the executable file 600 being associated with a cyber-attack.
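As an illustrative sketch, the scoring step may be expressed in Python as a weighted sum passed through a sigmoid function. The weighting vector and bias below are hypothetical stand-ins for the trained values 670 and 672.

```python
import math
from typing import Sequence


def threat_score(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Reduce the feature vector to a scalar, then squash it into (0, 1)."""
    scalar = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-scalar))  # sigmoid scoring function


# Hypothetical weights and bias applied to the example output [0.102, 0].
score = threat_score([0.102, 0.0], weights=[2.0, -1.0], bias=0.5)
```

Because the sigmoid function is bounded between zero and one, the resulting score can be read as a likelihood and compared against a threshold when assigning a classification.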

Referring now to FIG. 7A, an illustrative embodiment of a network device 700 including software modules that support operability of the first embodiment of the cyber-security system of FIG. 4A is shown. According to this embodiment of the disclosure, the network device 700 comprises one or more hardware processors 710 (generally referred to as “processor”), a non-transitory storage medium 720, and one or more communication interfaces 730 (generally referred to as “interface”). These components may be at least partially encased in a housing 740, which may be made entirely or partially of a rigid material (e.g., hard plastic, metal, glass, composites, or any combination thereof) that protects these components from environmental conditions.

The processor 710 is a multi-purpose, processing component that is configured to execute logic 750 maintained within the non-transitory storage medium 720 operating as a data store. As described below, the logic 750 may include logic 752 controlling operability of the computational analysis subsystem and logic 754 to control operability of the classifier. As shown, the computational analysis subsystem logic 752 includes, but is not limited or restricted to, (i) pre-processing logic 422, (ii) encoding logic 424, (iii) convolution logic 112, (iv) pooling logic 114, and/or (v) FCN logic 116. The classifier logic 754 includes threat assessment logic 142, message generation logic 150, and optionally remediation logic 760 and/or behavioral analysis logic 770.

One example of processor 710 includes one or more graphic processing units (GPUs). Alternatively, processor 710 may include another type of processor such as one or more central processing units (CPUs), an Application Specific Integrated Circuit (ASIC), a field-programmable gate array, or any other hardware component with data processing capability.

According to one embodiment of the disclosure, as shown, the interface 730 is configured to receive incoming data propagating over a network, including the executable file 410 and at least temporarily store the executable file 410 in a data store 755. The executable file 410 may be received, as data packets, directly from the network or via a network tap or Switch Port Analyzer (SPAN) port, also known as a mirror port. Processed by the processor 710, the pre-processing logic 422 may extract and aggregate (reassemble) data from the packets to produce the executable file 410, and thereafter, select a section of the binary code for analysis.

Referring still to FIG. 7A, the encoding logic 424 is responsible for encoding the binary code section 415 to generate a representation of the binary code section 415 in a form and format that is suitable for processing by the convolution logic 112, pooling logic 114 and FCN logic 116, as illustrated in FIG. 4A and described above.

The classifier logic 754 includes the threat assessment logic 142 that is configured to receive an output from the computational analysis subsystem logic 752. From the output, the threat assessment logic 142 determines a classification assigned to the executable file 410, as described above. The message generation logic 150 is configured to produce alert messages to warn of potential cyber-attacks while the remediation logic 760 is configured to mitigate the effects of the cyber-attack or halt the cyber-attack by preventing further operations by the network device caused by the executable file 410.

The behavioral analysis logic 770 may be stored in the non-transitory storage medium 720 and may be executed in response to the computational analysis being unable to determine whether the executable file 410 is malicious or benign or to verify any such determinations. As a result, the behavioral analysis logic 770 creates a virtual machine (VM) environment and the executable file 410 is processed within the VM environment. The behaviors of the VM and the executable file 410 are monitored to assess whether the executable file 410 is malicious or benign based on the monitored behaviors.
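One way to decide when the computational analysis is inconclusive is to compare the threat score against two thresholds, as sketched in Python below. The thresholds, labels and function name are assumptions made for illustration; the disclosure does not specify how an inconclusive result is detected.

```python
def triage(score: float, benign_below: float = 0.2, malicious_above: float = 0.8) -> str:
    """Classify on a confident score; otherwise defer to behavioral analysis."""
    if score >= malicious_above:
        return "malicious"
    if score <= benign_below:
        return "benign"
    return "behavioral_analysis"  # inconclusive: process the file within a VM


decisions = [triage(s) for s in (0.05, 0.5, 0.93)]
```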

Referring now to FIG. 7B, an illustrative embodiment of the network device 700 including computational analysis subsystem logic 752 of FIG. 7A, along with logic 780 associated with the intelligence-driven analysis subsystem and logic 790 associated with the post-analysis subsystem logic is shown. Although not shown in detail, the executable file 410 and the binary code section 415 may be temporarily stored in a data store.

The network device 700 performs concurrent analysis of the executable file 410 using both the computational analysis subsystem logic 752 and the intelligence-driven analysis subsystem logic 780. The operations of the computational analysis subsystem logic 752 are described above. Operating concurrently with the computational analysis subsystem logic 752, the intelligence-driven analysis subsystem logic 780 is configured to receive the executable file 410 and inspect the executable file 410 for indicators associated with a cyber-attack. This inspection is conducted by static analysis logic 460, which is configured to conduct an analysis of the contents of the executable file 410 without any re-assembly. However, where the executable file 410 is received as a plurality of binary packets, the static analysis logic 460 is further configured to analyze the content of the binary packets forming the executable file 410 to identify any communication protocol anomalies and/or indicators (suspicious content) in these packets. The header and other portions of the binary packets may be inspected separately from the payload including the executable file 410.

According to one embodiment of the disclosure, the indicators may be based on intelligence generated by cyber-security analysts and captured in digital signatures (hashes) of known malicious executable files, heuristics and pattern matching based on known executable files, or the like. The comparison of the known indicators associated with malicious and/or benign executable files with the contents of the executable file 410 enables a determination as to whether the executable file includes malware. Thereafter, the static analysis logic 460 produces an output including features representing the detected indicators, which is provided to the static encoding logic 465.

As shown in FIG. 7B, the static encoding logic 465 encodes the features into a format compatible with the format utilized by the computational analysis subsystem logic 752. In particular, the encoding may be based, at least in part, on the category of the feature, as described above. The encoded features, which may be represented as a vector of numbers, undergo pre-processing (e.g., normalization, scaling, whitening) along with consideration of nonlinear combinations of the features using the FCN logic 470 before being provided to the post-analysis subsystem logic 790.

Herein, according to this embodiment of the disclosure, the post-analysis subsystem logic 790 includes grouping logic 480, the FCN logic 485 to provide nonlinearity to the output of the grouping logic 480 and the threat assessment logic 142. The grouping logic 480 combines the results of these two subsystems, such as through concatenation, and the combined result is analyzed by the threat assessment logic 142 to determine the threat score used in determining whether the executable file 410 is malicious or benign.

As mentioned above, the concurrent operations of the computational analysis subsystem logic 752 and the intelligence-driven analysis subsystem logic 780 complement each other. The intelligence-driven analysis subsystem logic 780 targets an analysis of the context of the executable file 410, e.g., anomalous data placement in binary packets, communication protocol anomalies, and known malicious and benign patterns associated with malicious and benign executable files. The computational analysis subsystem logic 752, however, targets the digital bit patterns, independent of the context being analyzed by the intelligence-driven analysis subsystem logic 780, as described above.

Referring now to FIG. 8, a third illustrative embodiment of the cyber-security system 100 deploying the computational analysis subsystem 400, the intelligence-driven analysis subsystem 450 and another type of post-analysis subsystem 800 is shown. Herein, the post-analysis subsystem 800 includes a first classifier 810 communicatively coupled to the FCN logic 116 to produce a first threat score 815 based on the features extracted by the computational analysis subsystem 400. Similarly, the post-analysis subsystem 800 includes a second classifier 820 to produce a second threat score 825 based on indicators detected by the static analysis logic 460. Hence, the computational analysis subsystem 400 and the intelligence-driven analysis subsystem 450 may operate concurrently.

A threat determination logic 830 is configured to receive the first score 815 from the computational analysis subsystem 400 and the second score 825 from the intelligence-driven analysis subsystem 450. Based on these scores, the threat determination logic 830 computes a resultant threat score that represents a threat level based on the collective analyses of the computational analysis subsystem 400 and the intelligence-driven analysis subsystem 450.
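The disclosure does not specify how the two scores are combined. Purely as an assumed example, the Python sketch below uses a weighted average, with a hypothetical function name and weighting; other combinations, such as taking the maximum of the two scores, would fit the description equally well.

```python
def resultant_threat_score(
    first_score: float, second_score: float, first_weight: float = 0.5
) -> float:
    """Blend two threat scores in [0, 1] into one score in [0, 1]."""
    if not 0.0 <= first_weight <= 1.0:
        raise ValueError("first_weight must lie between 0 and 1")
    return first_weight * first_score + (1.0 - first_weight) * second_score


# Hypothetical scores from the first and second classifiers.
resultant = resultant_threat_score(0.9, 0.5)
```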

In the foregoing description, the invention is described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. For instance, the selective system call monitoring may be conducted on system calls generated by logic outside the guest image.

Claims

1. A system, implemented with at least one processor and at least one memory including software that, when executed by the at least one processor, detects whether an executable file is associated with a cyber-attack, the system comprising: a pre-processor configured to (i) select a section of binary code, included as part of the executable file and corresponding to executable machine code, in lieu of a disassembled version of the binary code and (ii) generate a first representation of the section of the binary code; a deep neural network including a convolutional neural network communicatively coupled to the pre-processor, the convolutional neural network (CNN) is configured to process a CNN input being the first representation of the section of the binary code by at least applying a plurality of weighting operations executing a programmatic function on the first representation to produce a CNN output, the convolutional neural network is further configured to identify patterns in the first representation operating as the CNN input and produce the CNN output; a classifier communicatively coupled to the convolutional neural network, the classifier being configured to (i) receive the CNN output including one or more patterns for use in determining whether the first representation is associated with a cyber-attack, (ii) receive an output from an intelligence-driven analysis subsystem operating concurrently with the deep neural network, wherein the output is based on static analysis of the executable file, and (iii) determine a classification assigned to the file based, at least in part, on a threat score generated based on the received CNN output from the convolutional neural network; and a message generator configured to generate a message in response to determining the classification of the executable file as being associated with a cyber-attack.
2. The system of claim 1, wherein the pre-processor separates the section of the binary code into a first subsection and a second subsection, and each of the first and second subsections having corresponding first representations processed separately by the convolutional neural network to generate the CNN output provided to the classifier, and the classifier determines a classification of the executable file based on both of the corresponding CNN outputs.
3. The system of claim 1, wherein the executable file comprises a Portable Executable (PE) file.
4. The system of claim 3, wherein the section of the binary code comprises a predetermined number of bytes from one of a starting location of the PE file or an offset from the starting location of the PE file.
5. The system of claim 1 being communicatively coupled to the intelligence-driven analysis subsystem and a post-analysis subsystem, wherein: the intelligence-driven analysis subsystem to (i) receive the executable file, (ii) inspect the executable file for indicators associated with a cyber-attack, and (iii) produce a second output representing features associated with the detected indicators, and the post-analysis subsystem to receive the second output from the intelligence-driven analysis subsystem and the received CNN output from the convolutional neural network, the post-analysis subsystem including the classifier.
6. The system of claim 5, wherein the post-analysis subsystem includes grouping logic to concatenate information associated with the received second output and the received CNN output and the classifier communicatively coupled to the grouping logic to receive a representation of the concatenated information and determine the classification assigned to the executable file based, at least in part, on the threat score generated from the representation of the concatenated information.
7. The system of claim 1, wherein the pre-processor selects the section of the binary code as comprising either (i) the binary code in its entirety when the binary code has a length less than a first number of bytes or (ii) a portion of the binary code less than the binary code in its entirety when the binary code has a length greater than the first number of bytes, wherein the portion of the binary code being a fixed number of bytes.
8. The system of claim 7, wherein responsive to the binary code having a length greater than the first number of bytes, the pre-processor selecting the fixed number of bytes as the code section, the fixed number of bytes comprises either (i) the fixed number of contiguous bytes within the binary code, or (ii) a first number of bytes and a second number of bytes non-contiguous from the first number of bytes collectively forming the fixed number of bytes.
9. The system of claim 1, wherein the pre-processor separates the binary code into one or more code sections of the binary code along a predefined format for the executable file in accordance with an applicable specification, the one or more code sections include the section of binary code.
10. The system of claim 9, wherein the predefined format is set for a Portable Executable (PE) file.
11. The system of claim 1 being implemented as an endpoint device including the at least one processor and the at least one memory including remediation logic, wherein the remediation logic preventing execution of the executable file by the at least one processor upon receiving the message from the message generator.
12. The system of claim 1, wherein the convolutional neural network operates directly on the section of the binary code without disassembly of the binary code.
13. The system of claim 1, wherein the convolutional neural network comprises a plurality of convolutional layers including a first convolution layer that receives the first representation, one or more intermediary layers, and an output layer that generates the CNN output.
14. The system of claim 13, wherein the one or more intermediary layers comprises one or more pooling layers including a first pooling layer configured to perform a nonlinear down-sampling on a layer output from a preceding one of the plurality of convolutional layers to produce a representation compressed relative to the layer output.
15. The system of claim 14, wherein the pre-processor further comprises pre-processing logic and encoding logic communicatively coupled to the first convolution layer of the convolutional neural network, the pre-processing logic and the encoding logic receive the section of the binary code and transforms the section of the binary code into a matrix-based format for processing by the first convolution layer.
16. The system of claim 15, wherein the classifier comprises a logistic function that generates the threat score.
17. The system of claim 1, wherein the classifier further includes concatenation logic and a score generator, and wherein the concatenation logic of the classifier being communicatively coupled to the intelligence-driven analysis subsystem being configured to (i) receive the executable file, and (ii) operate concurrently with the deep neural network to detect static features associated with a cyber-attack in the executable file; and the score generator of the classifier to assign the threat score based on a concatenation produced by the concatenation logic of the detected static features provided by the intelligence-driven analysis subsystem and the CNN output provided by the convolutional neural network.
18. The system of claim 17, wherein the concatenation logic of the classifier to provide the detected features in a representation having a format as used by the CNN output provided by the convolutional neural network to the score generator.
19. The system of claim 1, wherein the message generated by the message generator comprises information identifying the executable file to cause the at least one processor to preclude execution of the executable file.
20. The system of claim 1, wherein a size of the section may differ based on a size of the CNN input.
21. The system of claim 1, wherein the CNN output comprises a vector reflecting patterns indicative of the first representation.
22. The system of claim 21, wherein the classifier converts the feature vector into a scalar that includes the threat score.
23. The system of claim 21, wherein the classifier determines the file is malicious when the threat score exceeds a threshold.
24. A system for detecting whether an executable file including binary code is associated with a cyber-attack, the system comprising: an intelligence-driven analysis subsystem to (i) receive the executable file, (ii) inspect and compute features of the executable file for indicators associated with a cyber-attack where the features are associated with one or more data patterns, and (iii) produce a first output representing the detected features; a computational analysis subsystem including a convolutional neural network (CNN) to (i) receive a CNN input being a first representation of at least one section of binary code from the executable file as input, and (ii) process the first representation of the section to produce a second output, the convolutional neural network is configured to identify patterns in the first representation operating as the CNN input and producing the second output representative of the patterns; and a post-analysis subsystem communicatively coupled to both the intelligence-driven analysis subsystem and the computational analysis subsystem, the post-analysis subsystem comprises a classifier being configured to (i) receive a first output from the intelligence-driven analysis subsystem and a second output from the convolutional neural network and (ii) determine a classification assigned to the executable file, wherein the classifier is communicatively coupled to the computational analysis subsystem, the classifier being configured to (i) receive the second output including one or more patterns for use in determining whether the first representation is associated with a cyber-attack, (ii) receive the first output from the intelligence-driven analysis subsystem operating concurrently with the convolutional neural network, wherein the first output is based on static analysis of the executable file, and (iii) determine the classification assigned to the executable file based, at least in part, on a threat score generated based on at least the received first output and the received second output.
25. The system of claim 24, wherein the post-analysis subsystem comprises (i) a concatenation logic to concatenate content from the first output and content from the second output to generate a collective output, and (ii) the classifier configured to (a) receive the collective output from the concatenation logic and (b) determine the classification assigned to the executable file based, at least in part, on the threat score generated based on the collective output being formed by the content of the first output and the content of the second output.
26. The system of claim 24, wherein the computational analysis subsystem comprises the convolutional neural network being configured to (i) receive the CNN input being the first representation of the at least one section of binary code from the executable file as input, and (ii) process the first representation of the at least one section of the binary code by at least applying a plurality of weighting operations executing a programmatic function on the first representation to produce the second output.
27. The system of claim 26, wherein the programmatic function being trained using a training set including labeled patterns associated with either benign executable files or executable files associated with a cyber-attack.
28. The system of claim 26, wherein the computational analysis subsystem further comprises a pre-processor communicatively coupled to the interface, the pre-processor being configured to (i) select the at least one section of binary code from the executable file and (ii) generate the first representation of the binary code within the at least one section.
29. The system of claim 28, wherein the computational analysis subsystem being implemented as an endpoint device including a processor and a memory that comprises the pre-processor, the convolutional neural network, the classifier and a message generator, the message generator including remediation logic, wherein the remediation logic, upon receiving the message from the message generator, preventing execution of the executable file by the processor.
30. The system of claim 24, wherein the computational analysis subsystem is deployed as part of cloud services.
31. The system of claim 30, wherein the intelligence-driven analysis subsystem is deployed within a private network communicatively coupled to the computational analysis subsystem over a public network.
32. The system of claim 24, wherein the computational analysis subsystem is deployed within an endpoint device and the intelligence-driven analysis subsystem is deployed as part of cloud services.
33. The system of claim 24, wherein the intelligence-driven analysis subsystem identifying the features in the executable file that identify whether the executable file is suspicious or malicious using at least any one or more of (i) signature comparison based on signature hashes of known malicious or benign executable files or (ii) heuristics and pattern matching based on known malicious or benign executable files.
34. A method for classifying a file as a benign file or a malicious file associated with a cyber-attack, comprising: receiving the file, the file comprises a binary code, the binary code corresponding to executable machine code; selecting a section of the binary code from the file; encoding the binary code section to produce a first representation of the binary code section; processing the first representation of the binary code section by a convolutional neural network, the processing performed by the convolutional neural network includes applying a plurality of weighting operations executing a programmatic function on the first representation to produce an output including a threat score for use in determining whether the file is associated with a cyber-attack, the programmatic function being trained using a training set including labeled patterns associated with at least one of (i) benign files and (ii) malicious files; and determining a classification assigned to the file based, at least in part, on a threat score generated based on the received output from the convolutional neural network and an output from an intelligence-driven analysis subsystem operating concurrently with the deep neural network, wherein the output is based on static analysis of the executable file.
35. The method of claim 34, wherein the selecting of the section of the binary code comprises separating the binary code into a first subsection and a second subsection, wherein both the first subsection and the second subsection correspond to the first representation processed by the convolutional neural network to generate the CNN output provided to the classifier.
36. The method of claim 34, wherein the file comprises a Portable Executable (PE) file.
37. The method of claim 36, wherein the section of the binary code comprises a predetermined number of bytes from one of a starting location of the PE file or an offset from the starting location of the PE file.
38. The method of claim 37, wherein the section of the binary code comprises either (i) the binary code in its entirety when the binary code has a length less than a first number of bytes or (ii) a portion of the binary code less than the binary code in its entirety when the binary code has a length greater than the first number of bytes, wherein the portion of the binary code is a fixed number of bytes.
39. The method of claim 38, wherein responsive to the binary code having a length greater than the first number of bytes, selecting the fixed number of bytes as the code section, the fixed number of bytes comprises either (i) the fixed number of contiguous bytes within the binary code or (ii) a first number of bytes and a second number of bytes non-contiguous from the first number of bytes collectively forming the fixed number of bytes.
40. The method of claim 34 being conducted by an endpoint device including a processor and a memory including remediation logic, wherein the remediation logic prevents execution of the file by the processor.
41. The method of claim 34, wherein the convolutional neural network operates directly on the section of the binary code without disassembly of the binary code.
42. The method of claim 34, wherein the convolutional neural network is deployed as part of cloud services.
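
The section selection recited in claims 37 to 39 and the encoding step of claim 34 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the `head_len` parameter, the padding value 256, and the choice to return the whole code when its length equals the limit are all assumptions made for illustration only.

```python
from typing import List, Optional


def select_code_section(binary_code: bytes, max_len: int,
                        head_len: Optional[int] = None) -> bytes:
    """Select a section of the binary code (hypothetical helper).

    - Code no longer than max_len is returned in its entirety.
    - Longer code yields exactly max_len bytes: either one contiguous
      run from the start, or a head plus a non-contiguous tail.
    """
    if len(binary_code) <= max_len:
        # Assumption: the equal-length case is treated like the short case.
        return binary_code
    if head_len is None:
        # Contiguous option: the first max_len bytes of the file.
        return binary_code[:max_len]
    if not 0 < head_len < max_len:
        raise ValueError("head_len must be between 1 and max_len - 1")
    tail_len = max_len - head_len
    # Non-contiguous option: first head_len bytes plus last tail_len bytes.
    return binary_code[:head_len] + binary_code[-tail_len:]


def encode_section(section: bytes, length: int, pad_value: int = 256) -> List[int]:
    """Encode bytes as integers 0..255, padded to a fixed length.

    The padding value 256 is an assumed out-of-range marker, so padding
    cannot be confused with a real byte value.
    """
    values = list(section[:length])
    return values + [pad_value] * (length - len(values))
```

For example, `select_code_section(data, 4, head_len=1)` on a ten-byte input keeps the first byte and the last three bytes, which together form the fixed number of bytes.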

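The signature comparison of claim 33 and the final determining step of claim 34 can be sketched in the same way. The claims state only that the classification is based, at least in part, on the threat score and on the output of the intelligence-driven analysis subsystem. The precedence rule below (a signature match overrides the score), the 0.5 threshold, the use of SHA-256, and all function names are assumptions for illustration, not the claimed method.

```python
import hashlib
from typing import Optional, Set


def signature_verdict(file_bytes: bytes,
                      known_malicious: Set[str],
                      known_benign: Set[str]) -> Optional[str]:
    """Compare the file's SHA-256 hash with known signature hashes.

    Returns "malicious", "benign", or None when the hash is unknown.
    """
    digest = hashlib.sha256(file_bytes).hexdigest()
    if digest in known_malicious:
        return "malicious"
    if digest in known_benign:
        return "benign"
    return None


def classify(threat_score: float, verdict: Optional[str],
             threshold: float = 0.5) -> str:
    """Combine the CNN threat score with the signature verdict.

    Assumed rule: a signature match takes precedence; otherwise the
    threat score is compared with an assumed threshold.
    """
    if verdict is not None:
        return verdict
    return "malicious" if threat_score >= threshold else "benign"
```

Under this assumed rule, a file with an unknown hash and a threat score of 0.9 is classified as malicious, while a file whose hash matches a known benign signature stays benign regardless of the score.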
Non-Patent Literature Citations (61)

Entry
PCT/US2018/055508 filed Oct. 11, 2018 International Search Report and Written Opinion dated Dec. 12, 2018.
“Mining Specification of Malicious Behavior”—Jha et al, UCSB, Sep. 2007 https://www.cs.ucsb.edu/.about.chris/research/doc/esec07.sub.--mining.pdf-.
“Network Security: NetDetector—Network Intrusion Forensic System (NIFS) Whitepaper”, (“NetDetector Whitepaper”), (2003).
“When Virtual is Better Than Real”, IEEEXplore Digital Library, available at, http://ieeexplore.ieee.org/xpl/articleDetails.isp?reload=true&arnumbe- r=990073, (Dec. 7, 2013).
Abdullah, et al., Visualizing Network Data for Intrusion Detection, 2005 IEEE Workshop on Information Assurance and Security, pp. 100-108.
Adetoye, Adedayo , et al., “Network Intrusion Detection & Response System”, (“Adetoye”), (Sep. 2003).
Apostolopoulos, George; hassapis, Constantinos; “V-eM: A cluster of Virtual Machines for Robust, Detailed, and High-Performance Network Emulation”, 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, Sep. 11-14, 2006, pp. 117-126.
Aura, Tuomas, “Scanning electronic documents for personally identifiable information”, Proceedings of the 5th ACM workshop on Privacy in electronic society. ACM, 2006.
Baecher, “The Nepenthes Platform: An Efficient Approach to collect Malware”, Springer-verlag Berlin Heidelberg, (2006), pp. 165-184.
Bayer, et al., “Dynamic Analysis of Malicious Code”, J Comput Virol, Springer-Verlag, France., (2006), pp. 67-77.
Boubalos, Chris , “extracting syslog data out of raw pcap dumps, seclists.org, Honeypots mailing list archives”, available at http://seclists.org/honeypots/2003/q2/319 (“Boubalos”), (Jun. 5, 2003).
Chaudet, C. , et al., “Optimal Positioning of Active and Passive Monitoring Devices”, International Conference on Emerging Networking Experiments and Technologies, Proceedings of the 2005 ACM Conference on Emerging Network Experiment and Technology, CoNEXT '05, Toulousse, France, (Oct. 2005), pp. 71-82.
Chen, P. M. and Noble, B. D., “When Virtual is Better Than Real, Department of Electrical Engineering and Computer Science”, University of Michigan (“Chen”) (2001).
Cisco “Intrusion Prevention for the Cisco ASA 5500-x Series” Data Sheet (2012).
Cohen, M.I. , “PyFlag—An advanced network forensic framework”, Digital investigation 5, Elsevier, (2008), pp. S112-S120.
Costa, M. , et al., “Vigilante: End-to-End Containment of Internet Worms”, SOSP '05, Association for Computing Machinery, Inc., Brighton U.K., (Oct. 23-26, 2005).
Didier Stevens, “Malicious PDF Documents Explained”, Security & Privacy, IEEE, IEEE Service Center, Los Alamitos, CA, US, vol. 9, No. 1, Jan. 1, 2011, pp. 80-82, XP011329453, ISSN: 1540-7993, DOI: 10.1109/MSP.2011.14.
Distler, “Malware Analysis: An Introduction”, SANS Institute InfoSec Reading Room, SANS Institute, (2007).
Dunlap, George W. , et al., “ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay”, Proceeding of the 5th Symposium on Operating Systems Design and Implementation, USENIX Association, (“Dunlap”), (Dec. 9, 2002).
FireEye Malware Analysis & Exchange Network, Malware Protection System, FireEye Inc., 2010.
FireEye Malware Analysis, Modern Malware Forensics, FireEye Inc., 2010.
FireEye v.6.0 Security Target, pp. 1-35, Version 1.1, FireEye Inc., May 2011.
Goel, et al., Reconstructing System State for Intrusion Analysis, Apr. 2008 SIGOPS Operating Systems Review, vol. 42 Issue 3, pp. 21-28.
Gregg Keizer: “Microsoft's HoneyMonkeys Show Patching Windows Works”, Aug. 8, 2005, XP055143386, Retrieved fom the Internet: URL:http://www.informationweek.com/microsofts-honeymonkeys-show-patching-windows-works/d/d-id/1035069? [retrieved on Jun. 1, 2016].
Heng Yin et al, Panorama: Capturing System-Wide Information Flow for Malware Detection and Analysis, Research Showcase @ CMU, Carnegie Mellon University, 2007.
Hiroshi Shinotsuka, Malware Authors Using New Techniques to Evade Automated Threat Analysis Systems, Oct. 26, 2012, http://www.symantec.com/connect/blogs/, pp. 1-4.
Idika et al., A-Survey-of-Malware-Detection-Techniques, Feb. 2, 2007, Department of Computer Science, Purdue University.
Isohara, Takamasa, Keisuke Takemori, and Ayumu Kubota. “Kernel-based behavior analysis for android malware detection.” Computational intelligence and Security (CIS), 2011 Seventh International Conference on. IEEE, 2011.
Kaeo, Merike , “Designing Network Security”, (“Kaeo”), (Nov. 2003).
Kevin A Roundy et al: “Hybrid Analysis and Control of Malware”, Sep. 15, 2010, Recent Advances in Intrusion Detection, Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 317-338, XP019150454 ISBM:978-3-642-15511-6.
Khaled Salah et al: “Using Cloud Computing to Implement a Security Overlay Network”, Security & Privacy, IEEE, IEEE Service Center, Los Alamitos, CA, US, vol. 11, No. 1, Jan. 1, 2013 (Jan. 1, 2013).
Kim, H. , et al., “Autograph: Toward Automated, Distributed Worm Signature Detection”, Proceedings of the 13th Usenix Security Symposium (Security 2004), San Diego, (Aug. 2004), pp. 271-286.
King, Samuel T., et al., “Operating System Support for Virtual Machines”, (“King”), (2003).
Kreibich, C., et al., “Honeycomb-Creating Intrusion Detection Signatures Using Honeypots”, 2nd Workshop on Hot Topics in Networks (HotNets-II), Boston, USA, (2003).
Kristoff, J. , “Botnets, Detection and Mitigation: DNS-Based Techniques”, NU Security Day, (2005), 23 pages.
Lastline Labs, The Threat of Evasive Malware, Feb. 25, 2013, Lastline Labs, pp. 1-8.
Li et al., A VMM-Based System Call Interposition Framework for Program Monitoring, Dec. 2010, IEEE 16th International Conference on Parallel and Distributed Systems, pp. 706-711.
Lindorfer, Martina, Clemens Kolbitsch, and Paolo Milani Comparetti. “Detecting environment-sensitive malware.” Recent Advances in Intrusion Detection. Springer Berlin Heidelberg, 2011.
Marchette, David J., “Computer Intrusion Detection and Network Monitoring: A Statistical Viewpoint”, (“Marchette”), (2001).
Moore, D. , et al., “Internet Quarantine: Requirements for Containing Self-Propagating Code”, INFOCOM, vol. 3, (Mar. 30-Apr. 3, 2003), pp. 1901-1910.
Morales, Jose A., et al., “Analyzing and exploiting network behaviors of malware.”, Security and Privacy in Communication Networks. Springer Berlin Heidelberg, 2010. 20-34.
Mori, Detecting Unknown Computer Viruses, 2004, Springer-Verlag Berlin Heidelberg.
Natvig, Kurt , “SANDBOXII: Internet”, Virus Bulletin Conference, (“Natvig”), (Sep. 2002).
NetBIOS Working Group. Protocol Standard for a NetBIOS Service on a TCP/UDP transport: Concepts and Methods. Std 19, RFC 1001, Mar. 1987.
Newsome, J., et al., “Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software”, In Proceedings of the 12th Annual Network and Distributed System Security Symposium (NDSS '05), (Feb. 2005).
Nojiri, D. , et al., “Cooperation Response Strategies for Large Scale Attack Mitigation”, DARPA Information Survivability Conference and Exposition, vol. 1, (Apr. 22-24, 2003), pp. 293-302.
Oberheide et al., CloudAV: N-Version Antivirus in the Network Cloud, 17th USENIX Security Symposium (USENIX Security '08), Jul. 28-Aug. 1, 2008, San Jose, CA.
Reiner Sailer, Enriquillo Valdez, Trent Jaeger, Ronald Perez, Leendert van Doorn, John Linwood Griffin, Stefan Berger, sHype: Secure Hypervisor Approach to Trusted Virtualized Systems (Feb. 2, 2005) (“Sailer”).
Silicon Defense, “Worm Containment in the Internal Network”, (Mar. 2003), pp. 1-25.
Singh, S. , et al., “Automated Worm Fingerprinting”, Proceedings of the ACM/USENIX Symposium on Operating System Design and Implementation, San Francisco, California, (Dec. 2004).
Thomas H. Ptacek, and Timothy N. Newsham , “Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection”, Secure Networks, (“Ptacek”), (Jan. 1998).
Venezia, Paul , “NetDetector Captures Intrusions”, InfoWorld Issue 27, (“Venezia”), (Jul. 14, 2003).
Vladimir Getov: “Security as a Service in Smart Clouds—Opportunities and Concerns”, Computer Software and Applications Conference (COMPSAC), 2012 IEEE 36th Annual, IEEE, Jul. 16, 2012 (Jul. 16, 2012).
Wahid et al., Characterising the Evolution in Scanning Activity of Suspicious Hosts, Oct. 2009, Third International Conference on Network and System Security, pp. 344-350.
Whyte, et al., “DNS-Based Detection of Scanning Worms in an Enterprise Network”, Proceedings of the 12th Annual Network and Distributed System Security Symposium, (Feb. 2005), 15 pages.
Williamson, Matthew M., “Throttling Viruses: Restricting Propagation to Defeat Malicious Mobile Code”, ACSAC Conference, Las Vegas, NV, USA, (Dec. 2002), pp. 1-9.
Yuhei Kawakoya et al: “Memory behavior-based automatic malware unpacking in stealth debugging environment”, Malicious and Unwanted Software (Malware), 2010 5th International Conference on, IEEE, Piscataway, NJ, USA, Oct. 19, 2010, pp. 39-46, XP031833827, ISBN:978-1-4244-8-9353-1.
Zhang et al., The Effects of Threading, Infection Time, and Multiple-Attacker Collaboration on Malware Propagation, Sep. 2009, IEEE 28th International Symposium on Reliable Distributed Systems, pp. 73-82.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., . . . & Bengio, Y. (Jun. 2015). Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning (pp. 2048-2057).
Zhang, X., Zhao, J., & LeCun, Y. (2015). Character-level convolutional networks for text classification. In Advances in neural information processing systems (pp. 649-657).

Related Publications (1)

	Number	Date	Country
	20190132334 A1	May 2019	US

System and method for analyzing binary code for malware classification using artificial neural network techniques
