Systems and methods for automated cybersecurity analysis of extracted binary string sets

Description

FIELD

Embodiments of the disclosure relate to the field of cybersecurity. More specifically, certain embodiments of the disclosure relate to a system, apparatus and method for an automated analysis of an extracted set of strings.

BACKGROUND

Over the last decade, malicious software (malware) has become a pervasive problem for Internet users and system administrators of network devices. To counter this increasing problem, computer files are often inspected to verify that they do not contain any malware. Malware analysts, reverse engineers, forensic investigators, and incident responders have developed an arsenal of tools to dissect malware and examine it for potential threats or other indications of source.

A “string” is a data type that comprises any finite sequence of characters (i.e., letters, numerals, symbols and punctuation marks). Data types are frequently used in programming languages as a way of categorizing data. Data types can differ according to the programming language used; however, strings are implemented as a data type in virtually every programming language. The characters within strings are typically encoded in accordance with the American Standard Code for Information Interchange (ASCII) standard, which establishes a relationship between the binary values stored within data and a pre-established set of characters. Other encodings and standards can be used to format strings, including the Extended Binary Coded Decimal Interchange Code (EBCDIC) and UNICODE. Strings can be used for many purposes within computer files, including, for example, encoding text relating to an error message that is displayed to the user upon triggering, a registry key, a uniform resource locator (URL) link, or a directory location for where to copy or store data within a computer system.

Malware analysis tools can examine strings contained within software binaries, namely any type of executable code including an application, script or any set of instructions. This examination may aid in gathering clues about the binary's function, threat level, design detection methods, and how containment of any potential damage may be achieved. For example, strings that contain filenames, internet protocol (IP) addresses, Uniform Resource Locators (URLs), domain names or the like may constitute indicators of compromise, and thus, are associated with a higher relevance to cybersecurity than strings that contain, for example, random sequences of characters. By analyzing suspicious binaries with a string extractor, a listing of the strings found within that binary can be generated. However, as the complexity of software and other binaries increases, the number of strings to be reviewed as well as the effort required to determine relevance also increases. Hence, there is a need for a system to automatically locate and analyze sets of strings contained within various suspicious binaries under review.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:

FIG. 1A depicts an exemplary system diagram of a cloud-based automated string analysis system in accordance with various embodiments of the invention.

FIG. 1B depicts an exemplary system diagram of a device-based automated string analysis system in accordance with an embodiment of the invention.

FIG. 1C depicts an exemplary hardware block diagram of an automated string analysis device in accordance with an embodiment of the invention.

FIG. 2 depicts an exemplary block diagram of an automated string analysis process in accordance with an embodiment of the invention.

FIG. 3 depicts an exemplary block diagram of automated prediction model generation utilizing string feature extraction in accordance with an embodiment of the invention.

FIG. 4A depicts an exemplary simplified list of extracted strings prior to automated analysis in accordance with an embodiment of the invention.

FIG. 4B depicts an exemplary simplified ranked list of extracted strings after automated analysis in accordance with an embodiment of the invention.

FIG. 5 depicts an exemplary flowchart of an automated process of extracted string set analysis in accordance with an embodiment of the invention.

DETAILED DESCRIPTION

Various embodiments of the disclosure relate to an automated system and/or process configured to analyze extracted string sets. This can be accomplished by generating a ranked list of extracted strings for use by various security systems and users within the cybersecurity field. According to one embodiment of the disclosure, the ranked list of extracted strings can be generated by ordering the extracted strings contained within a binary based on a generated threat detection score that corresponds to the likelihood of the string being associated with a risky or otherwise malicious action within the binary.

One of the tools malware analysts typically use when attempting to examine strings located within suspicious binaries is STRINGS.EXE from Sysinternals (a business unit of the Microsoft Corporation of Redmond, Wash.). STRINGS.EXE is an analytic software tool that is configured to receive a passed-in binary and scan it for embedded ASCII or UNICODE strings located within. However, certain analytic tools, such as STRINGS.EXE for example, may simply scan, extract, and generate an unordered list of strings that were located within the passed-in binary. No further analysis is done. By default, certain analytic tools, such as STRINGS.EXE, identify strings as any sequence of characters comprising three or more consecutive characters followed by a null terminator. This type of indiscriminate string identification typically leads to the generation of noisy data sets since many of the extracted strings can be irrelevant, and thus obscure the highly relevant strings within the extracted string set.

For example, a set of consecutive bytes within a binary may be interpreted as a set of ASCII characters by STRINGS.EXE and thus be added to the list of extracted strings. However, the consecutive bytes may not actually represent a string of ASCII characters relevant for malware analysis, but instead represent irrelevant data such as a memory address, central processing unit (CPU) instruction, or other non-string data utilized within a program. As a result, string sets generated from analytic tools, such as STRINGS.EXE for example, often require human analysts to manually examine the extracted string sets in order to determine if relevant strings for malware analysis are present. The process of extracted string set analysis, which includes understanding and scoring the relevance of various extracted strings, along with manually generating a threat score, often requires highly experienced human analysts. As a result, obtaining quality security-relevant scored data can be time consuming and expensive. Often, within an extracted string set, the frequency of relevant strings is disproportionately lower than that of irrelevant strings. Additionally, during the manual analysis, variations may exist between the subjective opinions of various human analysts as to what strings constitute potential threats compared to other strings, based on differing past experiences or biases.
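
The indiscriminate extraction behavior described above can be illustrated with a minimal Python sketch. It is not the STRINGS.EXE implementation: the function name `extract_ascii_strings`, the sample bytes, and the simplification of accepting any run of three or more printable ASCII bytes (without checking for a null terminator or handling UNICODE) are assumptions made for illustration only.

```python
import re

def extract_ascii_strings(data: bytes, min_len: int = 3) -> list[str]:
    """Return every run of printable ASCII bytes that is at least min_len long."""
    # 0x20-0x7e covers the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical binary fragment: a URL, four x86 push opcodes (0x50-0x53),
# a ret opcode (0xc3), and an imported library name.
sample = (
    b"\x00\x01http://example.test/a\x00"
    b"\x50\x51\x52\x53\xc3"
    b"KERNEL32.dll\x00"
)

strings = extract_ascii_strings(sample)
print(strings)
```

In this sample the opcode bytes 0x50 through 0x53 happen to fall in the printable range, so they are extracted as "PQRS" alongside the two meaningful strings, which is the kind of noise the preceding paragraph describes.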

As the complexity of software and other binaries increases, the amount of time needed to manually analyze the extracted string sets also grows. Furthermore, as these extracted string sets grow in size, human error during the manual analysis process can also increase. For example, a human analyst may inadvertently skip over relevant strings during the manual review of the string set due to fatigue. Having an automated process to analyze extracted string sets may aid malware analysis by freeing up human analysts to examine other threat indicators of a suspicious binary under analysis.

Generating heuristic rules to robustly account for all possible variations of string combinations that may be extracted from suspicious binaries would be a monumental task. Thus, embodiments disclosed herein utilize automated machine learning frameworks to analyze extracted string sets and generate a ranked list output based on generated threat prediction scores. Many of the embodiments utilize an automated learning to rank (LTR) method to generate a potential threat score and utilize this score to create a ranking for each string extracted from a suspicious binary. LTR methods incorporate supervised machine learning procedures that utilize previously scored data to generate a prediction model that can then be used to predict a score for a new, previously unanalyzed data set (i.e., a query) which can then be ranked based on the predicted score. Since the ranking (in many embodiments described herein) is related to the predicted threat scores of the extracted strings, the LTR ranking can be utilized to generate a ranked list of extracted strings. The ranked list is an arrangement (i.e., ordering) of the strings according to rank, that is, a sequential arrangement based on the predicted scores of the extracted strings. For example, strings at the beginning of the ranked list can correspond to strings with a higher threat prediction score than subsequent strings in the list. As a result, rankings generated from the disclosed automated methods can be used to generate a ranked list of extracted strings which can subsequently be incorporated into a threat warning for further analysis or presentation to a threat detection system or human analyst.

LTR methods typically generate a prediction model function from known training data sets. The generated prediction model function can then receive new input data and output a score associated with the input data. The generation of a prediction model function is typically done by utilizing a large set of training data that have previously been analyzed and scored, often by a human analyst. Training data can be obtained from historical data generated from prior analyses. Once the prediction model has been generated, new data sets may be processed with the prediction model in order to generate predictive scores for these new data sets without the need for human intervention. Embodiments herein utilize prediction models generated via automated machine learning methods to generate prediction scores associated with potential threat levels. By ranking strings extracted from a suspicious binary based on the predicted potential threat levels, the ranked strings at the beginning portion of the ranked string list are more likely to be relevant to further malware analysis compared to the strings in the later portion of the list.
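
As a rough illustration of the prediction phase described above, the following Python sketch orders extracted strings by a predicted score, highest first. The function `toy_score` is a hypothetical hand-written placeholder standing in for a trained prediction model function; its rules and score values are assumptions and are not taken from any embodiment.

```python
from typing import Callable

def toy_score(s: str) -> float:
    """Hypothetical stand-in for a trained prediction model function."""
    lowered = s.lower()
    if lowered.startswith(("http://", "https://")):
        return 0.9  # assumed: URL-like strings are treated as highly relevant
    if lowered.endswith((".dll", ".exe")):
        return 0.6  # assumed: file names are treated as moderately relevant
    return 0.1      # everything else is treated as likely noise

def rank_strings(strings: list[str],
                 score_fn: Callable[[str], float]) -> list[tuple[float, str]]:
    """Pair each string with its predicted score and sort by descending score."""
    scored = [(score_fn(s), s) for s in strings]
    # sort() is stable, so strings with equal scores keep their extraction order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

ranked = rank_strings(["abc", "KERNEL32.dll", "http://example.test/a"], toy_score)
for score, text in ranked:
    print(f"{score:.1f}  {text}")
```

In practice the scoring function would be the learned model rather than fixed rules; only the sort-by-predicted-score step is meant to carry over.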

Extracted strings can be expressed as feature data relating to features of the extracted strings. Feature data is typically represented as a number or other designation that correlates with a particular characteristic of the string. For example, a feature could be associated with the string that denotes the number of characters in the string, how many characters of a certain type are present, or if the string reads as natural language (denoting higher relevance) or as gibberish (denoting lower relevance). This type of feature extraction can be accomplished by utilizing natural language processing tools. In this way, some embodiments may generate a machine learning prediction model that utilizes feature data to further minimize the influence of irrelevant strings or random sequences of characters not probative of a cyberattack or otherwise meaningful to cybersecurity. Additionally, in certain embodiments, the automated LTR prediction model may utilize similar string feature data comprised within training data to create a prediction model that can analyze an extracted string against historical threat prediction scores, along with string feature data to generate threat prediction scores with increased accuracy.
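
The kind of feature data described above can be sketched in Python as follows. The feature names and the use of a vowel ratio as a crude proxy for whether a string reads as natural language are assumptions chosen for illustration; an embodiment could use different features or dedicated natural language processing tools.

```python
def string_features(s: str) -> dict[str, float]:
    """Map a string to a small dictionary of numeric features."""
    letters = [ch for ch in s if ch.isalpha()]
    digits = sum(ch.isdigit() for ch in s)
    vowels = sum(ch.lower() in "aeiou" for ch in letters)
    return {
        "length": float(len(s)),                       # number of characters
        "letters": float(len(letters)),                # alphabetic characters
        "digits": float(digits),                       # numeric characters
        "other": float(len(s) - len(letters) - digits),  # symbols, punctuation, spaces
        # Assumed heuristic: ordinary words tend to contain a moderate share of vowels.
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
    }

features = string_features("abc123")
print(features)
```

Each extracted string would be converted to such a feature vector before being passed to the prediction model, both when training and when scoring new strings.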

In certain embodiments, the generated prediction model may utilize a gradient boosted decision tree (GBDT) method for the machine learning prediction model. LTR systems can be understood as a pairwise classification system, meaning that the system evaluates pairs of items (e.g., extracted strings) from a set at a time and iteratively computes the optimal ranking for all pairs of items (e.g., extracted strings) within a set to come up with a final ranking for the entire set. GBDT methods generally incorporate individual decision trees to facilitate prediction score generation by using a weighted sum of the leaves of each decision tree. GBDT methods can classify each pair of extracted strings as correctly or incorrectly ranked, and use the optimal ordering of each pair of extracted strings to come up with the final ranking for all of the extracted strings within the extracted string set.
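
The two ideas in the preceding paragraph, scoring by a weighted sum of decision tree leaves and judging a ranking pair by pair, can be sketched in Python as below. This is not a trained GBDT: the two depth-one trees in `STUMPS`, their weights, thresholds and leaf values, and the feature names are all hypothetical values picked by hand.

```python
from itertools import combinations

# Each entry is (weight, (feature_name, threshold, left_leaf, right_leaf)).
# All values are hypothetical; a real model would learn them from training data.
STUMPS = [
    (1.0, ("length", 8.0, -0.5, 0.5)),
    (0.5, ("has_scheme", 0.5, 0.0, 1.0)),
]

def leaf_value(stump, feats):
    """Return the leaf reached by one depth-one tree."""
    name, threshold, left, right = stump
    return left if feats[name] < threshold else right

def ensemble_score(feats):
    """Weighted sum of the leaf values of every tree."""
    return sum(weight * leaf_value(stump, feats) for weight, stump in STUMPS)

def misordered_pairs(scores, labels):
    """Count pairs whose predicted order disagrees with the labeled relevance."""
    bad = 0
    for i, j in combinations(range(len(scores)), 2):
        if labels[i] == labels[j]:
            continue  # equally relevant pairs carry no ordering constraint
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        if scores[hi] <= scores[lo]:
            bad += 1
    return bad

examples = [
    {"length": 21.0, "has_scheme": 1.0},
    {"length": 4.0, "has_scheme": 0.0},
    {"length": 12.0, "has_scheme": 0.0},
]
scores = [ensemble_score(f) for f in examples]
print(scores, misordered_pairs(scores, [2, 0, 1]))
```

A pairwise training objective would adjust the trees to drive the number of misordered pairs down on the training data; the sketch only shows how such pairs are counted for a fixed model.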

Once a suspicious binary has been fully processed and the associated extracted strings ranked with corresponding predicted threat values, a threat warning can be generated. Threat warnings may utilize predetermined rule sets or thresholds to process the ranked extracted string set. In some embodiments, the suspicious binary is initially analyzed in response to a user request (where the “user” may be, e.g., a computer user, security analyst or system admin) and the threat warning is then utilized for the generation of a threat report that is presented to the user. In other embodiments, in response to a predicted threat level that is beyond a predetermined threshold, the threat warning may be utilized to create a remedial action. In certain cases, a score for a single string (e.g., a reference to a particular, known sensitive memory address, etc.) may be enough to generate a remedial action on the entire binary.

Threat warnings can be utilized to generate threat reports, emails, or other communications presented (e.g., displayed or sent) to a user informing them of the results of the analyzed extracted strings within the suspicious binary. These threat warnings can be utilized, in various embodiments, to auto-generate the threat report, email, or other communication informing the user of the results of various analyses. In certain embodiments, the threat warnings can also be reported to outside third parties.

Threat warnings may also be utilized to update remedial action behaviors as responses to newly determined threats as they are identified, such as, but not limited to, a new malware attack having a particularly new threat pattern. In some embodiments, remedial actions may be taken without human intervention. In further embodiments, the string analysis logic may be given a set of pre-defined thresholds and/or rules that may empower the generated threat warnings to initiate remedial actions immediately based on the predicted threat data derived from the suspicious binary. Remedial actions may include, but are not limited to, quarantining the suspicious binary within a system, or halting any processing of the binary.
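
A minimal Python sketch of threshold-based threat warning generation is shown below. The class `ThreatWarning`, the function `build_warning`, the action names, and the threshold values 0.5 and 0.9 are assumptions for illustration; the input is assumed to be a list of (score, string) pairs already sorted by descending score.

```python
from dataclasses import dataclass, field

@dataclass
class ThreatWarning:
    action: str        # "none", "report", or "quarantine" (hypothetical labels)
    max_score: float
    top_strings: list = field(default_factory=list)

def build_warning(ranked, report_at=0.5, remediate_at=0.9, top_n=3):
    """Apply pre-defined thresholds to a ranked (score, string) list."""
    if not ranked:
        return ThreatWarning("none", 0.0)
    max_score = ranked[0][0]  # list is sorted, so the first score is the highest
    if max_score >= remediate_at:
        action = "quarantine"  # a single high-scoring string is enough
    elif max_score >= report_at:
        action = "report"
    else:
        action = "none"
    return ThreatWarning(action, max_score, [s for _, s in ranked[:top_n]])

warning = build_warning([(0.95, "http://example.test/a"), (0.2, "PQRS")])
print(warning)
```

Keeping the thresholds as parameters reflects the description above, in which the string analysis logic is given pre-defined thresholds or rules rather than fixed values.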

It should be understood that threat warnings are utilized by varying types of users with unique needs. For example, malware analysts are typically skilled at reading strings. Such malware analysts typically examine string threats as leads for deeper analysis of the suspicious binary which can lead to a variety of outcomes including classification of the malware, verification that the suspicious binary is in fact malware, or mapping of the malware to a certain family. Security operations center (SOC) analysts generally respond to alerts coming into the system. These alerts are typically examined to determine if a suspicious binary needs to be escalated for further review. SOC analysts can benefit from concise, pre-generated threat reports incorporating ranked string lists that can focus their attention on relevant strings. Specifically, utilizing the methods and systems described herein can reduce the time needed for intervention by a SOC analyst. Incident response (IR) consultants conduct investigations of specific incidents of intrusions and other cyberattacks in progress or having occurred, and how to remediate them. For these users, having a focused ranked set of strings can help determine where the malware may be on other areas or systems within the network. Finally, threat intelligence analysts typically want to know if a piece of discovered malware is related to other known pieces of malware. By looking at a ranked string list, the threat intelligence analyst may be able to see similar strings as those found in previous malware, which can help indicate similar origins.

Another aspect of the invention is that the resulting prediction models that are generated for predicting threats on new suspicious binaries can be analyzed, verified, and compared quantitatively. By using such quantitative methods, the prediction model's performance can be assessed and given a value to compare to other models. In some embodiments, the quantitative method utilized is a mean normalized discounted cumulative gain (MNDCG) method that generates a score of each item within a generated prediction model. Broadly, this method examines the magnitude of each string's relevance summed over the entire string set, which can be represented as a non-negative number called the “cumulative gain”. The MNDCG method can then discount these results within the prediction model in a typically logarithmic fashion so as to reflect the goal of having the most relevant strings appear, for example, towards the top of the predicted ranking. That output is normalized so the results of the MNDCG method can be compared to other generated prediction models of varying size. Finally, the quantitative evaluation can limit a certain number of strings of a binary from appearing at (or near, e.g., within a predetermined distance (i.e., number of strings) from) the top of the ranked string list to help limit the computational requirements needed (as some suspicious binaries may have thousands or tens of thousands of strings). For example, the quantitative analysis results may limit the output to the first 100 strings of a binary within a ranked string list, but could be adjusted via a user interface by an analyst examining the suspicious binary.
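
One common way to compute such a metric is sketched in Python below. The sketch assumes a linear gain (the relevance label itself), a base-2 logarithmic discount, a cutoff of k strings per binary, and that the mean is taken over binaries; these are widely used conventions but are assumptions here, since the description above does not fix the exact formula.

```python
import math

def dcg(relevances, k):
    """Discounted cumulative gain over the first k positions."""
    # Position 0 is divided by log2(2) = 1, position 1 by log2(3), and so on.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

def ndcg(relevances_in_predicted_order, k):
    """DCG of the predicted order divided by the DCG of the ideal order."""
    ideal = sorted(relevances_in_predicted_order, reverse=True)
    ideal_dcg = dcg(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant strings at all, so there is nothing to rank
    return dcg(relevances_in_predicted_order, k) / ideal_dcg

def mean_ndcg(per_binary_relevances, k=100):
    """Average NDCG over the ranked string lists of several binaries."""
    values = [ndcg(rels, k) for rels in per_binary_relevances]
    return sum(values) / len(values)

# Relevance labels listed in the order the model ranked the strings.
perfect = [3, 2, 1, 0]
reversed_order = [0, 1, 2, 3]
print(ndcg(perfect, 100), ndcg(reversed_order, 100))
print(mean_ndcg([perfect, reversed_order]))
```

Because each per-binary value is normalized to the range from 0 to 1, binaries with very different numbers of strings contribute comparably to the mean.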

It is understood that the processes described herein provide a more efficient and robust method of providing ranked string sets for malware analysis in an automated fashion. The automated generation of threat prediction scores on new, previously unranked string sets, which can be utilized to rank and generate threat warnings, can provide a more accurate threat assessment of suspicious binaries as well as increase efficiency by reducing the time needed for a human analyst to review the set. This facilitates the practical application of providing more efficient malware detection.

I. Terminology

In the following description, certain terminology is used to describe features of the invention. For example, in certain situations, the term “logic” is representative of hardware, firmware or software that is configured to perform one or more functions. As hardware, logic may include circuitry such as one or more processors (e.g., a microprocessor, one or more processor cores, a virtual central processing unit, a programmable gate array, a microcontroller, an application specific integrated circuit, etc.), wireless receiver, transmitter and/or transceiver circuitry, semiconductor memory, combinatorial logic, or other types of electronic components.

As software, logic (or “engines” in certain descriptions) may be in the form of one or more software modules, such as executable code in the form of an executable application, an application programming interface (API), a subroutine, a function, a procedure, an applet, a servlet, a routine, source code, object code, a shared library/dynamic load library, or one or more instructions. These software modules may be stored in any type of a suitable non-transitory storage medium, cloud-based storage medium, or transitory storage medium (e.g., electrical, optical, acoustical or other form of propagated signals such as carrier waves, infrared signals, or digital signals). Examples of non-transitory storage mediums may include, but are not limited or restricted to a programmable circuit; a semiconductor memory; non-persistent storage such as volatile memory (e.g., any type of random access memory “RAM”); persistent storage such as non-volatile memory (e.g., read-only memory “ROM”, power-backed RAM, flash memory, phase-change memory, etc.), a solid-state drive, hard disk drive, an optical disc drive, or a portable memory device. As firmware, the executable code is stored in persistent storage.

The term “malware” is directed to software that produces an undesirable behavior upon execution, where the behavior is deemed to be “undesirable” based on customer-specific rules, manufacturer-based rules, or any other type of rules formulated by public opinion or a particular governmental or commercial entity. This undesired behavior may include a communication-based anomaly or an execution-based anomaly that (1) alters the functionality of a network device executing application software in a malicious manner; (2) alters the functionality of a network device executing application software without any malicious intent; and/or (3) provides an unwanted functionality which is generally acceptable in other contexts.

The term “object” generally refers to content in the form of an item of information having a logical structure or organization that enables it to be classified for purposes of analysis for malware. One example of the object may include an email message or a portion of the email message. Another example of the object may include a storage file or a document such as a PHP or other dynamic file, a word processing document such as Word® document, or other information that may be subjected to cybersecurity analysis. The object may also include an executable such as an application, program, code segment, a script, dynamic link library “dll,” URL link, or any other element having a format that can be directly executed or interpreted by logic within the network device. Network content such as webpages and other downloaded content may be further examples of objects analyzed for malware.

The term “binary” embraces a computer program code that represents text, computer processor instructions, or any other data using a two-symbol system, such as, for example, “0” and “1” from the binary number system. A binary code assigns a pattern of binary digits, also known as bits, to each instruction. Binary codes are used to encode data, such as each digit or character, into bit strings of fixed-width or variable width, depending on the implementation. The term “binary”, as used herein, may also designate an executable or interpretable computer processor instruction, regardless of whether in a two-symbol system, depending on context of its use in this description. For example, a “binary” may refer to any non-text file, but which may nevertheless comprise embedded text as strings. One example of a non-text file that may contain embedded strings is a text-editor file that comprises not only the text within the document, but also includes data related to formatting the text within the program. Binaries may include a variety of types of objects, such as executables, applications, programs, scripts, etc. It is understood that the term binary may include partial, corrupt, or otherwise incomplete files.

The term “cloud-based” generally refers to a hosted service that is remotely located from a data source and configured to receive, store and process data delivered by the data source over a network, including a self-hosted and third-party hosted service. Cloud-based systems may be configured to operate as a public cloud-based service, a private cloud-based service or a hybrid cloud-based service. A “public cloud-based service” may include a third-party provider that supplies one or more servers to host multi-tenant services. Examples of a public cloud-based service include Amazon Web Services® (AWS®), Microsoft® Azure™, and Google® Compute Engine™. In contrast, a “private” cloud-based service may include one or more servers that host services provided to a single subscriber (enterprise) and a hybrid cloud-based service may be a combination of both a public cloud-based service and a private cloud-based service.

The term “network device” should be generally construed as electronics with data processing capability and/or a capability of connecting to any type of network, such as a public network (e.g., Internet), a private network (e.g., a wireless data telecommunication network, a local area network “LAN”, etc.), or a combination of networks. Examples of a network device may include, but are not limited or restricted to, the following: a server or other stand-alone electronic device, a mainframe, a firewall, a router; an info-entertainment device, industrial controllers, vehicles, or a client device (e.g., a laptop, a smartphone, a tablet, a desktop computer, a netbook, gaming console, a medical device, or any general-purpose or special-purpose, user-controlled electronic device).

Lastly, the terms “or” and “and/or” as used herein are to be interpreted as inclusive or meaning any one or any combination. Therefore, “A, B or C” or “A, B and/or C” mean “any of the following: A; B; C; A and B; A and C; B and C; A, B and C.” An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.

As this invention is susceptible to embodiments of many different forms, it is intended that the present disclosure is to be considered as an example of the principles of the invention and not intended to limit the invention to the specific embodiments shown and described.

II. System Architecture

Referring to FIG. 1A, an exemplary system diagram of a cloud-based automated string analysis system 100A is shown. The string analysis can be accomplished within a private virtual cloud system 135, which, in an embodiment, may be provided within a larger public cloud service 130A. In many embodiments, a client device 140A communicatively coupled to the public cloud service 130A via a network 120 can receive a suspicious binary from a source device 110 which is also communicatively coupled to the network 120. In response to receiving a suspicious binary, the client device 140A can send the binary to the public cloud service 130A which forwards it to the private virtual cloud 135.

The private virtual cloud 135 and associated resources can be generated as part of an Infrastructure-as-a-Service (IaaS) model and comprise at least one instance of a vCPU 136 and a memory 137 communicatively coupled to the vCPU 136. It would be understood by those skilled in the art that the private virtual cloud 135 may comprise a variable number of vCPUs and memory stores as needed based on various factors including, but not limited to, the computing resources currently available on the system, or the current computational demands placed upon the private virtual cloud 135.

The memory 137 can have string analysis logic to direct the vCPU 136 to process the received suspicious binary. In response, the vCPU 136 extracts the strings from the suspicious binary, generates a prediction model for evaluating the extracted strings, and processes the suspicious binary through the prediction model function to generate a list of prediction scores that can be utilized to generate a ranked list of strings taken from the suspicious binary that correlate to the perceived threat of each string. The ranked list of extracted strings can then be utilized to generate an overall threat warning for the suspicious binary. In certain embodiments, the ranked list of extracted strings may be sent back to the client device 140A for further processing. In various embodiments, the string analysis of the suspicious binary may result in a determination that the binary poses an immediate threat such that remedial action should be taken, which may then occur or be communicated to the client device 140A for further action. In further embodiments, the resulting ranked string list may be sent to a third party, such as cybersecurity vendors, or other external threat analysts for evaluation in a threat report. In certain embodiments, the memory 137 comprises prediction model generation logic that may utilize public data, non-public data, or a mixture of both to generate a prediction model function that can be accessed or otherwise provided to a client device 140A for supplementing string analysis logic within the client device 140A. In certain embodiments, the prediction model may be sent to an analyst station or admin for continued action such as further threat analysis and/or remediation.

In some embodiments, the private virtual cloud 135 may be accessed by a network security appliance 150 which may require assistance in evaluating the threat of a suspicious binary received from a source device 110. Embodiments relating to the network security appliance 150 can behave similarly to the embodiments of the client device 140A such that communication between the network security appliance 150 and private virtual cloud 135 is analogous to the communication between the client device 140A and private virtual cloud 135.

Referring to FIG. 1B, a system diagram of a device-based automated string analysis system 100B is shown. The string analysis device 130B can be located within a client network 160 and be in communication with at least one client device 140B. As the client network 160 receives a suspicious binary from the source device 110 over a network 120, the client device 140B can pass the binary to the string analysis device 130B for processing.

Similar to the embodiments discussed above with respect to FIG. 1A, communication between the client device 140B and the string analysis device 130B may be analogous to the above description of the communication between the private virtual cloud 135 and the client device 140A of FIG. 1A. As would be understood by those skilled in the art, the client network 160 may comprise any number of devices, including multiple client or network devices. Additionally, the string analysis device 130B may also operate as a subsystem of a larger network security appliance within the client network 160. In further embodiments, the string analysis device 130B may be implemented as a virtual instance (i.e., software) that runs on the client device 140B.

Referring to FIG. 1C, a hardware block diagram of an automated string analysis device 130B is shown. The string analysis device 130B comprises a network interface 131 which can be utilized to connect to the network 120 for communication similar to the discussion of FIG. 1A. The string analysis device 130B further comprises a processor 132, memory 170, and training data store 133 which are all in communication with each other. The memory further comprises a plurality of logics including, but not limited to, string extraction logic 171, prediction model generation logic 172, prediction model verification logic 173, ranking logic 174, and reporting logic 175.

The string analysis device 130B can receive suspicious binaries from the network 120 via the network interface 131. In response, the processor 132 can be instructed by the string extraction logic 171 to extract the strings found within the suspicious binary. Once extracted, the prediction model generation logic 172 can utilize training data within the training data store 133 to generate a prediction model. It is contemplated that various embodiments of the string analysis device 130B may utilize the prediction model generation logic 172 to retrieve a pre-generated prediction model from an external source over the network 120 instead of generating a new model internally. In fact, certain embodiments of the string analysis device 130B may not comprise a training data store 133 and instead can retrieve data (if needed) via a remote connection over the network 120.

Once generated or retrieved, the prediction model can be verified via the prediction model verification logic 173. The verification may be accomplished using verification data either extracted or derived from the training data, or via a specialized set of verification data that may be stored on the training data store 133 or on a remote device. The ranking logic 174 can be utilized to rank a set of strings based on the prediction model either during the verification process or during general analysis of strings extracted from suspicious binaries. The output of the ranking logic 174 can be analyzed by the reporting logic 175 to determine if a report should be generated, and if so, what actions to take. For example, the ranking of a set of strings may require that a threat report be sent to an analyst for further evaluation. Ranked string sets may also be determined to contain strings that trigger a pre-determined rule (such as strings that are directed to specific, crucial locations within memory) that requires the reporting logic 175 to trigger at least one remedial action, sometimes independently without human intervention. As those in the art will understand, the reporting logic 175 can be configured to generate reports and respond to various threats in a variety of ways that can minimize the potential threat of malware determined to be contained within the suspicious binary.

It should be understood that although certain embodiments are highlighted in discussion of FIGS. 1A-1C, a wider variety of embodiments are possible and contemplated by this application. In fact, based on the desired application and layout, a mixture of client devices, source devices, and other components can be utilized in order to provide an automated system to analyze extracted strings.

III. Training and Prediction

Referring to FIG. 2, an exemplary block diagram of an automated string analysis process 200 is shown. Broadly, the process of string analysis can be understood to comprise two phases: training and predicting. Before predictions can be made, a prediction model 250 must be generated in the training phase 210. In order to better visualize the string analysis process 200, the elements associated with the training phase 210 are bounded by a dashed line. Training data binaries 220 are gathered and processed through a string extractor 230. The output of the string extractor 230 comprises a set of strings which are then, in many embodiments, evaluated by human analysts for relevance related to potential malware threats.

A plurality of these analyzed string sets 241-243 are conceptually shown in FIG. 2 as lists comprising rows of two associated elements. The left “X” element represents a string extracted from the training data binaries 220, while the right “Y” element corresponds to a score assigned to the string by a human analyst. Scores are often assigned as a non-negative integer number that corresponds to the string's relevance for malware analysis. In some embodiments, additional elements may be present in the analyzed string sets 241-243 that represent values associated with features of the string.

Each element in the analyzed string sets 241-243 comprises both a superscript and a subscript. The superscript denotes the rank of the element in the binary so the first row has a superscript of 1, the second row has a superscript of 2, and so on until the last row u is reached. The analyzed string sets 241-243 may have different lengths, which are denoted by the variables u, v, and w. The subscript denotes the number of the training file within the training data 240. Therefore, the first analyzed string set 241 has a subscript of 1 on every element, the second analyzed string set 242 has a subscript of 2 on every element, up to the last analyzed string set 243 which has a subscript of m denoting that the number of analyzed string sets within the training data 240 can be variable and may include large numbers of sets. In fact, in order to increase the robustness of the prediction model 250 generated from the training data 240, the training data 240 may include a large number of previously analyzed and ranked string sets (“labelled” string sets).
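By way of illustration only, the labelled string sets described above might be held in memory as shown in the following minimal sketch. The names LabelledString, AnalyzedStringSet, and flatten_training_data, as well as the example strings and scores, are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LabelledString:
    """One row of an analyzed string set: a string X and its analyst score Y."""
    text: str   # the extracted string (the "X" element)
    score: int  # non-negative relevance score (the "Y" element)


@dataclass
class AnalyzedStringSet:
    """All labelled strings from one training binary."""
    file_index: int             # 1..m, the subscript in FIG. 2
    rows: List[LabelledString]  # list position + 1 plays the role of the superscript


def flatten_training_data(sets: List[AnalyzedStringSet]) -> List[Tuple[int, int, str, int]]:
    """Return (file_index, row_index, text, score) for every labelled string."""
    flat = []
    for s in sets:
        for row_index, row in enumerate(s.rows, start=1):
            if row.score < 0:
                raise ValueError("scores are assumed to be non-negative")
            flat.append((s.file_index, row_index, row.text, row.score))
    return flat


# Illustrative data only: two sets of different lengths (u = 3, v = 2).
training_data = [
    AnalyzedStringSet(1, [LabelledString("http://example.test/payload", 7),
                          LabelledString("GetProcAddress", 3),
                          LabelledString("xQ9!z", 0)]),
    AnalyzedStringSet(2, [LabelledString("cmd.exe /c del", 6),
                          LabelledString("This program cannot be run in DOS mode", 1)]),
]
flat = flatten_training_data(training_data)
```

Keeping the file index and row index alongside each pair preserves the grouping by training binary, which a ranking-oriented learner may need when strings are compared only within the same binary.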

With the training data 240 comprising a plurality of analyzed string sets 241-243 generated from the training data binaries 220, the string analysis process 200 can generate a prediction model 250 that can be utilized to create prediction scores on subsequent binary strings. In many embodiments, prior to predicting subsequent strings, the prediction model 250 can undergo a verification process. Verification can occur in many ways, but may be accomplished by directing the prediction model to process at least one (but likely many) verification data sets. A verification data set can be a set of strings that has previously been analyzed and ranked manually but is presented to the prediction model 250 without that ranking. Upon processing of the verification data set by the prediction model 250, the process 200 or an analyst may compare the sorted data set 270 generated from the verification data set to the known ranking of the verification data set. In this way, the prediction model 250 can be verified for accuracy and, in certain embodiments, may be adjusted based on the processing of the verification data set. In further embodiments, the verification process may occur automatically based on a set of pre-determined heuristics or thresholds.
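One way such a comparison between the model's ordering and the known ranking could be quantified is with normalized discounted cumulative gain (NDCG), which the claims name as a possible quantitative analysis method. The following is a minimal sketch that assumes the common linear-gain form of DCG (relevance divided by the base-2 logarithm of position plus one); the function names and the 0.9 acceptance threshold are hypothetical.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevances, start=1))


def ndcg(known_scores: Sequence[float], predicted_scores: Sequence[float]) -> float:
    """NDCG of the ordering induced by predicted_scores, judged by known_scores."""
    if len(known_scores) != len(predicted_scores):
        raise ValueError("score lists must have the same length")
    # Order string indices by descending predicted score (stable for ties).
    order = sorted(range(len(predicted_scores)), key=lambda i: -predicted_scores[i])
    achieved = dcg([known_scores[i] for i in order])
    ideal = dcg(sorted(known_scores, reverse=True))
    # If every known score is zero, any ordering is as good as any other.
    return achieved / ideal if ideal > 0 else 1.0


def model_is_verified(known_scores: Sequence[float],
                      predicted_scores: Sequence[float],
                      threshold: float = 0.9) -> bool:
    """Hypothetical acceptance rule: accept the model if NDCG meets the threshold."""
    return ndcg(known_scores, predicted_scores) >= threshold
```

An NDCG of 1.0 means the predicted ordering matches an ideal ordering of the analyst scores; lower values indicate that highly relevant strings were placed too far down the list.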

The subsequent binary strings for processing by the prediction model 250 may be pre-ranked string sets used to verify the prediction model 250 for accuracy in ranking (as described above), or unranked string sets encountered for example in a private network and potentially representing cybersecurity threats. The embodiment shown in FIG. 2 illustrates decision trees similar to GBDT methods that create an ordered ranking of the analyzed string sets 241-243 within the training data 240. Once the ranking is complete and the prediction model has been generated, the training phase is complete and the process 200 moves into the prediction phase.

A suspicious binary 221 can be processed through the string extractor 230 to generate a new list of strings, commonly called a query 260. As denoted in FIG. 2, the query 260 has a superscript to denote the initial ordering of the strings (typically the order in which they were discovered during the string extraction). The number of elements in the query 260 can vary based on the size of the suspicious binary 221 and does not have to be similar in size to the analyzed string sets 241-243 in the training data 240. When the query 260 has been fully generated, it can be passed to the prediction model 250 for processing and prediction. The prediction model 250 typically takes each element of the query 260 and assigns a score (denoted as Ŷ) that is predicted for the string.

Once each element of the query 260 has been processed and received a score from the prediction model 250, the string analysis process 200 can perform a ranking of the elements based on the predicted scores for each element. The final ranked string list 270 comprises rows of a string element “X” and a predicted score element Ŷ. The superscript on each element of the row denotes its ranking within the ranked string list 270. Since the score elements correspond to the string's relevance for malware analysis, it can be understood that strings with a higher ranking will have more relevance for further analysis than strings with a lower ranking. The ranked string list 270 can then be utilized to create a threat warning which may be utilized to generate a threat report or other remedial action in response to the presence of a predicted score higher than a pre-determined threshold. The generation of prediction models utilizing string features is discussed in more detail below.
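The scoring and ranking steps described above can be sketched as follows. This is an illustrative sketch only: rank_query and exceeds_threshold are hypothetical names, and toy_predict is a placeholder standing in for the prediction model 250, not an actual model.

```python
from typing import Callable, List, Tuple


def rank_query(query: List[str], predict: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Score each string and return (string, predicted score) rows, highest score first."""
    scored = [(s, predict(s)) for s in query]
    # sorted() is stable, so strings with equal scores keep their extraction order.
    return sorted(scored, key=lambda row: row[1], reverse=True)


def exceeds_threshold(ranked: List[Tuple[str, float]], threshold: float) -> bool:
    """True if the top-ranked string has a predicted score above the threshold."""
    return bool(ranked) and ranked[0][1] > threshold


def toy_predict(s: str) -> float:
    """Placeholder for the prediction model 250 (assumption for illustration)."""
    return 5.0 if "http" in s else 1.0


ranked = rank_query(["abc", "http://x.test", "def"], toy_predict)
```

Because the list is sorted in descending order, only the first row needs to be inspected to decide whether any string exceeds the pre-determined threshold.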

Referring to FIG. 3, an exemplary block diagram of automated prediction model generation utilizing string feature extraction is shown. Typically, a machine learning system 350 will access a set of training data 240 to begin the process of generating a prediction model 250. As discussed above, the training data 240 typically consists of a series of extracted string sets that have been previously analyzed and scored by human analysts or automated heuristics. The training data 240 is typically comprised of pairs of extracted strings and associated threat scores. The number of extracted strings within the training data 240 can be of any size, including, but not limited to, tens of millions of strings. Large amounts of training data can often lead to more accurate predictions. For example, as a training data set increases in size, biases within the human analysts who scored the training data are often diminished. As more data points associated with the training data are obtained or utilized within the prediction model 250, the overall error in the predicted score of a new query will generally be reduced. Data points may be expressed through the use of string features.

In many embodiments, the machine learning system 350 can process the strings associated with the training data 240, and extract features from the strings via a feature extractor 340. In other embodiments, the training data 240 already comprises feature data associated with each string. The machine learning system 350 can then generate a prediction model 250 which can accept a new query with extracted strings and associated string features and generate a predicted threat score and ranking to associate with that string.

Once received, a suspicious binary 221 can be analyzed by first extracting the strings within the suspicious binary 221 via the string extractor 230. Each located string within the suspicious binary 221 can then be processed by a string feature extractor 340 to determine various features 341-343. It should be understood that the number of features extracted can vary depending on the needs of the application. For example, fewer features can be extracted and utilized when computational resources or time are limited. However, more features may be utilized if the extracted string set is very large and increased differentiation between the string threat levels is needed.

By way of example and not limitation, a first feature 341 may be the length of the string. Strings of increased length may correspond to varying levels of threats. A second feature 342 can relate to the type of string. Strings that have been determined to contain natural language elements may be given a certain value compared to strings comprising random characters. String features can be derived from any meaningful distinction that varies between strings and that can be assigned a numerical value for comparison within the prediction model 250. Once all features 341-343 have been extracted from the set of strings located within the suspicious binary 221, the resulting query can be passed to the machine learning system 350 for processing within the prediction model 250 which generates a prediction score for each string. These generated prediction scores can be utilized to create a ranked string list for use in generating threat warnings. A simplified example of a ranked string list that can be generated with this method is discussed below.
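As a hedged illustration of how strings might be mapped to numerical features, the sketch below computes the length feature named above together with three crude, hypothetical stand-ins for a "type" feature. The feature names, and the choice of letter ratio, whitespace, and URL prefix as proxies, are assumptions made for this example only.

```python
import string
from typing import Dict

_LETTERS = set(string.ascii_letters)


def extract_features(s: str) -> Dict[str, float]:
    """Map one extracted string to a small dictionary of numerical features."""
    n = len(s)
    letters = sum(ch in _LETTERS for ch in s)
    return {
        # Length of the string (the first feature described in the text).
        "length": float(n),
        # Fraction of ASCII letters: a rough proxy for text versus random bytes.
        "letter_ratio": letters / n if n else 0.0,
        # Presence of a space: a rough proxy for natural-language content.
        "has_space": float(" " in s),
        # Simple prefix check for URL-like strings.
        "looks_like_url": float(s.lower().startswith(("http://", "https://"))),
    }


features = extract_features("http://a.test")
```

Every feature is returned as a float so that the resulting vectors can be passed directly to a numerical learner.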

Referring to FIG. 4A, an exemplary simplified list of extracted strings 400A prior to automated analysis is shown. The list of extracted strings 400A omits showing all rankings in order to reduce complexity, but it is understood that every slot position between the first slot and the last slot corresponds to an extracted string. In the list of extracted strings 400A, there are various strings that can be classified as having high, mid, or low relevance while some strings are classified as being irrelevant with regard to malware analysis. It is understood that the placement of each string in an unranked set of extracted strings generally corresponds to the order in which the strings were located and processed by the string extractor. It should also be understood that the determination of relevance is typically not known at this point prior to analysis and ranking, but is present in FIG. 4A to highlight the processing of the list from an unranked state to the ranked state shown in FIG. 4B.

In this example, the first slot in the list of extracted strings 400A corresponds to a first string that has a mid-relevance 410, the second slot corresponds to an irrelevant string 420, and the third slot corresponds to a high relevance string 430. Lower in the list, the fifteenth slot of the list of extracted strings 400A corresponds to a first low relevance string 440. Further down, the twenty-third slot corresponds to another high relevance string 450, while the last slot corresponds to a third high relevance string 460. It should be understood that the list of extracted strings 400A can be of any length depending on the number of strings located within a given suspicious binary. The list of extracted strings 400A can be processed by the string analysis logic to create a ranked list of extracted strings such as the one in FIG. 4B.

Referring to FIG. 4B, an exemplary simplified ranked list of extracted strings after automated analysis is shown. Similar to the list of extracted strings 400A of FIG. 4A, the ranked list of extracted strings 400B omits showing all rankings in order to reduce complexity, but it is understood that every ranked slot between the first slot and the last slot corresponds to a string. The ranked list of extracted strings 400B is further understood to be comprised of the same strings as the list of extracted strings 400A prior to automated analysis, and that these strings have been sorted to different slots based upon their relevance as determined by their associated threat prediction scores.

As can be seen in the ranked list of extracted strings after automated analysis 400B, the ranking of the given strings corresponds to a descending order from high relevance strings, down to mid relevance strings, then to low relevance strings, and then concluding with irrelevant strings. The first three ranked slots correspond to the first, second and third high relevance strings 430, 450, 460. The fifteenth slot corresponds to the mid relevance string 410 which was originally located first in the unranked list of extracted strings 400A. The twenty-third slot corresponds to the low relevance string 440, while the last ranked slot in the ranked list of extracted strings 400B corresponds to the irrelevant string 420 that was previously located in the second slot of the list of extracted strings 400A.

As can be understood, the ranked list of extracted strings 400B can be of any length, and its length typically corresponds both to the number of strings extracted from the suspicious binary and to the number of slots in the list of extracted strings 400A. However, it is contemplated that certain embodiments may generate a ranked list of extracted strings 400B that is smaller than the unranked list of extracted strings 400A due to the process of eliminating strings that are below a certain relevance threshold or ones that are duplicative. Once generated, the ranked list of extracted strings 400B can then be utilized to generate a threat warning suitable for further malware analysis.

IV. String Analysis Process

Referring now to FIG. 5, a flowchart of the automated extracted string analysis process 500 is shown. The process 500 typically begins with the receiving of a suspicious binary (block 510). In many embodiments, this binary can be received from a source device, which could be a hardware-based source, a virtualized source, or any source that is communicatively coupled with the string analysis logic. However, as described above, the source of the received binary can often be from the client device or network security device that is evaluating the binary for malware.

Upon reception of a suspicious binary, the process 500 can utilize a tool to extract the strings from the suspicious binary (block 520). The tool utilized may be, for example, STRINGS.EXE, but may instead be any suitable analytic tool that can generate a list of strings located within a given binary. In a number of embodiments, the generated list of strings is further processed to extract features from the strings (block 530). Features are often processed as a numerical value corresponding to a certain property of the string. As discussed above, features may be related to various aspects of the string including, but not limited to, length, type, frequency, or similarity to previous features. Upon extracting the features from the generated list of strings, the data is ready for processing within a prediction model.
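For illustration, a simple string extractor of the kind described above can be sketched in a few lines. The sketch below finds runs of printable ASCII bytes only; it does not attempt to reproduce the behavior of any particular tool, it ignores other encodings such as UNICODE, and the function name and the default minimum length of four characters are assumptions.

```python
import re
from typing import List


def extract_ascii_strings(data: bytes, min_length: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_length bytes long, in file order."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # Bytes 0x20 through 0x7e are the printable ASCII characters, including space.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Illustrative input only: printable runs separated by non-printable bytes.
sample = b"\x00\x01hello\x00ab\x00http://x.test\xff"
found = extract_ascii_strings(sample)
```

Because matches are returned in the order they occur in the input, the resulting list has the same "order of discovery" property as the unranked string lists discussed with respect to FIG. 4A.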

The process 500 often requires that a set of training data be retrieved from a training data source (block 540). In many embodiments, the training data is stored in a memory communicatively coupled to the string analysis logic and can be retrieved directly. In other embodiments, the training data may be available through a remote service and must be requested to be retrieved. Once the training data has been retrieved, a prediction model may be generated based on the training data (block 550). The prediction model can typically be logic that receives new extracted strings and generates a prediction score based on the training data used to generate the prediction model. Often, this can be achieved through the use of a GBDT method. Because the training data comprises scores relating to threat levels associated with previously analyzed extracted strings, the prediction score generated by the prediction model corresponds to a prediction for the perceived level of threat of the new extracted strings.
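To make the idea of a gradient boosted decision tree (GBDT) model concrete, the following is a deliberately small, pure-Python sketch that boosts depth-one regression trees ("stumps") under a squared-error loss. It is an illustration under stated assumptions rather than the model of any embodiment: a practical system would more likely rely on an established machine learning library, and the function names, hyperparameters, feature vectors, and scores shown here are hypothetical.

```python
from typing import List, Sequence, Tuple

# (feature index, threshold, value if feature <= threshold, value otherwise)
Stump = Tuple[int, float, float, float]
Model = Tuple[float, List[Stump], float]


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Find the single split that minimizes squared error on the residuals."""
    best: Stump = (0, float("inf"), _mean(residuals), 0.0)  # no split: predict the mean
    best_err = sum((r - best[2]) ** 2 for r in residuals)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not right:
                continue  # splitting at the maximum value separates nothing
            lv, rv = _mean(left), _mean(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best, best_err = (j, threshold, lv, rv), err
    return best


def stump_predict(stump: Stump, row: Sequence[float]) -> float:
    j, threshold, lv, rv = stump
    return lv if row[j] <= threshold else rv


def fit_gbdt(X: List[List[float]], y: List[float],
             n_rounds: int = 50, learning_rate: float = 0.1) -> Model:
    """Fit an additive model of stumps to feature rows X and scores y."""
    base = _mean(y)
    stumps: List[Stump] = []
    preds = [base] * len(y)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row) for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def gbdt_predict(model: Model, row: Sequence[float]) -> float:
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Made-up features [length, looks_like_url] and analyst scores, for illustration.
X = [[4.0, 0.0], [30.0, 1.0], [6.0, 0.0], [25.0, 1.0]]
y = [0.0, 7.0, 1.0, 6.0]
model = fit_gbdt(X, y, n_rounds=100, learning_rate=0.2)
```

Each boosting round fits a new stump to whatever error the current ensemble still makes, so the predicted score for a new string is the base score plus a sum of many small corrections.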

Once the prediction model is generated, it can then be verified through the use of verification data (block 555). Typically, the verification can be accomplished by running a set of previously verified string data through the prediction model and comparing the results of the prediction model to the results previously determined on the same string data. As a result of the verification process, the prediction model may be adjusted, which can be accomplished either manually or via an automated adjustment process based on a set of pre-established thresholds or heuristics. Once verified, the string analysis ranking logic can process each string from within the extracted string set to generate a prediction score that is associated with each analyzed string (block 560). In certain embodiments, the prediction score is a non-negative value that can range between a lower and an upper bound. For example, the threat levels could be assigned a number between zero and seven, with a prediction score of seven indicating a higher predicted threat than a prediction score of zero.

Upon completion of the prediction score generation, the process 500 can generate a new string list that ranks the order of the analyzed extracted strings based upon the threat prediction score (block 570). In certain embodiments, the process of generating the ranked list of strings may delete duplicate strings, or add an indicator within the list to each string to denote the number of occurrences of the extracted string within the ranked set. In this way, the ranked list of strings can be displayed in a more efficient manner. In various embodiments, the process of generating the ranked list of extracted strings may limit the number of strings within the ranked set based on factors including, but not limited to, the total number of extracted strings within the original set, or whether a string exceeds a certain threat threshold value. For example, a ranked list of extracted strings may only comprise the first one-hundred entries even if the binary under analysis comprises thousands of strings. The generated ranked list of extracted strings may then be utilized in various ways.
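The duplicate handling and list limiting described above can be sketched as follows. This is a minimal illustration that assumes the input is already sorted by descending score; condense_ranked_list and its parameters are hypothetical names.

```python
from collections import Counter
from typing import List, Optional, Tuple


def condense_ranked_list(ranked: List[Tuple[str, float]],
                         max_entries: Optional[int] = None,
                         min_score: Optional[float] = None) -> List[Tuple[str, float, int]]:
    """Collapse duplicates and optionally limit a ranked list.

    ranked must already be sorted by descending score. Each output row is
    (string, score, number of occurrences in the input list).
    """
    counts = Counter(s for s, _ in ranked)
    seen = set()
    condensed: List[Tuple[str, float, int]] = []
    for s, score in ranked:
        if max_entries is not None and len(condensed) >= max_entries:
            break  # entry limit reached
        if s in seen:
            continue  # duplicate of a string already listed
        if min_score is not None and score <= min_score:
            continue  # does not exceed the threat threshold
        seen.add(s)
        condensed.append((s, score, counts[s]))
    return condensed


ranked_example = [("a", 7.0), ("b", 5.0), ("a", 7.0), ("c", 1.0)]
condensed_example = condense_ranked_list(ranked_example)
```

Recording the occurrence count next to each surviving string keeps the information that a string was repeated without repeating it in the displayed list.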

The process 500 can utilize the ranked list of extracted strings to generate an overall threat warning (block 580). As discussed above, threat warnings can have a variety of uses such as generating remedial actions, or providing data for the generation of an overall threat report. Based upon the requirements of the user, the presence of strings that exceed a pre-determined value can create a trigger for a network security system to take immediate remedial action such as, but not limited to, quarantining the suspicious binary within a system, or halting processing of the binary.
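A threshold-driven threat warning of the kind described above might be assembled as in the following sketch. The ThreatWarning structure, the two thresholds, and the action labels are assumptions made for illustration; the labels are plain strings, and the sketch performs no actual quarantine or halting.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ThreatWarning:
    top_score: float
    flagged: List[Tuple[str, float]] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)


def build_threat_warning(ranked: List[Tuple[str, float]],
                         report_threshold: float,
                         action_threshold: float) -> ThreatWarning:
    """Summarize a ranked string list and decide which responses to request."""
    # Strings whose predicted score exceeds the reporting threshold.
    flagged = [(s, score) for s, score in ranked if score > report_threshold]
    top = max((score for _, score in ranked), default=0.0)
    warning = ThreatWarning(top_score=top, flagged=flagged)
    if top > action_threshold:
        # Hypothetical labels for immediate remedial actions.
        warning.actions += ["quarantine_binary", "halt_processing"]
    elif flagged:
        warning.actions.append("send_report_to_analyst")
    return warning


urgent = build_threat_warning([("x", 6.5), ("y", 2.0)], report_threshold=4.0, action_threshold=6.0)
```

Separating the decision (which actions are requested) from the execution of those actions keeps the reporting step easy to test and lets the surrounding network security system decide how each label is carried out.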

Threat reports may incorporate the threat warning data in order to aid a human analyst's further threat analysis of the suspicious binary (block 590). As discussed above, the type of data and report required by the analyst can vary depending on the nature of the analysis sought. For example, the string rankings can be used in an analyst's further investigation and analysis of the cybersecurity threat represented by the binary through analysis of the order of the strings within the ranked strings. The binary analysis results may further be combined with other cybersecurity information and indicators of compromise to aid in determining whether a cyberattack is occurring and/or the remediation appropriate to mitigate the attack and its damage.

The binary utilized in the string ranking analysis may also be subjected to further static and/or dynamic analysis. These types of analysis employ a two-phase malware detection approach to detect malware contained in suspicious binaries or other network traffic monitored in real-time. In a first or “static” phase, a heuristic is applied to a suspicious binary or other object that exhibits characteristics associated with malware. In a second or “dynamic” phase, the suspicious objects are processed within one or more virtual machines and in accordance with a specific version of an application or multiple versions of that application associated with the binary. These methods offer a two-phase, malware detection solution with options for concurrent processing of two or more versions of an application in order to achieve significant reduction of false positives while limiting time for analysis. Static and dynamic analysis techniques that can be utilized in accordance with embodiments of the invention are described in U.S. Pat. No. 9,241,010 issued Jan. 19, 2016 and U.S. Pat. No. 10,284,575, issued May 7, 2019, the disclosures of which are hereby incorporated by reference in their entirety.

In the foregoing description, the invention is described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims.

Claims

1. An automated computerized method for analyzing a set of extracted strings relevant for cybersecurity threat detection comprising: processing a binary with a string-extraction logic, wherein the string extraction logic is configured to locate strings within a received binary and output an extracted string set of the located strings; processing the extracted string set with a prediction model generated from a set of training data to determine a threat prediction score for each located string within the extracted string set; ranking the located strings within the extracted string set based upon the determined threat prediction score; and outputting a ranked string list based upon the located strings' ranking, wherein prior to the processing of the extracted string set, the prediction model is generated based on at least the set of training data including a plurality of previously analyzed extracted string sets, each element of the previously analyzed extracted string sets comprises at least one extracted string and a corresponding previously determined threat prediction score.
2. The method of claim 1, wherein the located strings associated with a higher threat prediction score appear in the ranked string list before strings associated with a lower threat prediction score.
3. The method of claim 1, wherein in response to the outputting of the ranked string list, generating a threat warning comprising additional cybersecurity threat data associated with the ranked string list.
4. The method of claim 3, wherein, in response to the threat warning exceeding a first pre-determined threshold, generating a threat report incorporating the ranked string list.
5. The method of claim 4, wherein the threat report only incorporates strings from the ranked string list that exceed a second pre-determined threshold.
6. The method of claim 4, wherein the set of strings incorporated within the threat report does not comprise duplicate strings.
7. The method of claim 3, wherein in response to the threat warning exceeding a pre-determined threshold, a remedial action is conducted.
8. The method of claim 1, wherein the method is practiced at least partially within a cloud-based computing environment.
9. An automated computerized method for analyzing a set of extracted strings relevant for cybersecurity threat detection comprising: processing a binary with a string extraction logic, wherein the string extraction logic is configured to locate strings within a received binary and output an extracted string set of the located strings; processing the extracted string set with a prediction model generated from a set of training data to determine a threat prediction score for each located string within the extracted string set; ranking the located strings within the extracted string set based upon the determined threat prediction score; and outputting a ranked string list based upon the rankings of the located strings, wherein the prediction model utilized to generate the ranked string list is further processed within a quantitative analysis system to generate a first comparative score suitable for comparison with a second comparative score associated with a second prediction model utilized to generate a second ranked string list in order to assess the validity of the prediction model utilized to generate the ranked string list.
10. The method of claim 9, wherein the quantitative analysis system utilizes normalized discounted cumulative gain methods.
11. An automated system for analyzing a set of extracted strings relevant for cybersecurity threat detection comprising: a processor; and a transitory storage medium communicatively coupled to the processor, the transitory storage medium includes string analysis logic configured to: process a binary with a string extraction logic, wherein the string extraction logic is configured to locate strings within a received binary and output an extracted string set of the located strings; process the extracted string set with a prediction model to determine a threat prediction score for each located string within the extracted string set; rank the located strings within the extracted string set based upon the determined threat prediction score; and output a ranked string list based upon the ranking of the located strings, wherein the prediction model is generated, prior to the processing of the extracted string set based on at least a set of training data comprising a plurality of previously analyzed extracted string sets and wherein each element of the previously analyzed extracted string sets comprises at least one extracted string and a corresponding previously determined threat prediction score.
12. The system of claim 11, wherein the located strings associated with a higher threat prediction score appear in the ranked string list before strings associated with a lower threat prediction score.
13. The system of claim 11, wherein in response to the outputting of the ranked string list, a threat warning is generated comprising additional cybersecurity threat data associated with the ranked string list.
14. The system of claim 13, wherein a threat report incorporates the ranked string list and is generated in response to the threat warning exceeding a first pre-determined threshold.
15. The system of claim 14, wherein the threat report only incorporates strings from the ranked string list that exceed a second pre-determined threshold.
16. The system of claim 14, wherein the ranked string list incorporated within the threat report does not comprise duplicate strings.
17. The system of claim 15, wherein remedial action is taken in response to the threat warning exceeding a third pre-determined threshold.
18. The system of claim 11, wherein the prediction model utilized to generate the ranked string list is further processed within a quantitative analysis system to generate a first comparative score suitable for comparison with a second comparative score associated with a second prediction model utilized to generate a second ranked string list in order to assess the validity of the prediction model utilized to generate the ranked string list.
19. The system of claim 18, wherein the quantitative analysis system utilizes normalized discounted cumulative gain.
20. The system of claim 11, wherein the system is at least partially operated within a cloud-based computing environment.
21. The system of claim 20, wherein the processor is operated within a virtual computing environment.
22. An automated system for analyzing a set of extracted strings relevant for cybersecurity threat detection comprising: a processor; and a transitory storage medium communicatively coupled to the processor, the transitory storage medium comprises: a string extraction logic to process a binary to locate strings within the binary and output an extracted string set of the located strings; a prediction model logic configured to retrieve a prediction model generated with a set of training data and verified with a set of verification data; a ranking logic configured to rank the located strings within the extracted string set based on a prediction score generated by the prediction model for each located string; and a reporting logic configured to generate a threat warning comprising data generated from the ranked string list wherein the threat warning is formatted for a human analyst to perform further analysis.
23. The automated system of claim 22, wherein the located strings associated with a higher threat prediction score appear in the ranked string list before strings associated with a lower threat prediction score.
24. The automated system of claim 22, wherein the reporting logic to generate the threat warning comprising additional cybersecurity threat data associated with the ranked string list in response to an outputting of the ranked string list.
25. The automated system of claim 22, wherein responsive to the threat warning exceeding a pre-determined threshold, generating the threat report incorporating the ranked string list.
