Malicious software (i.e., malware) is used by cybercriminals to harm legitimate people and businesses in many ways, including interrupting public services, stealing data (e.g., confidential and secure data such as personally identifiable information), and stealing financial resources. Cybercriminals and malware are an ever-present issue for any entity utilizing computing technology. Cybercriminals exploit many technologies, including everyday types of office documents (e.g., word processing documents, spreadsheet documents, presentation documents, and the like), to deliver malware. These everyday documents represent a large threat to entities, and a favored choice of cybercriminals, because of their widespread usage. Zero-day malware attacks exploit unknown security flaws and vulnerabilities, so cybercriminals often use these everyday documents to deliver zero-day malware. These malicious files present a substantial risk to organizations because they often initiate the first stage of an attack, triggering execution of the malware.
Once a user opens or gains access to an infected document, any malware included in the document is executed. The malware in such a document may initiate the attack by installing unwanted malicious software on the user's device, opening access to otherwise secure data locations, and the like. Existing technologies use strategies such as static or signature-based detections, but these strategies often do not detect stealthy malware hidden in this type of everyday office document. Particularly, zero-day malware is difficult to identify and is not detected using only static or signature-based detections because static and signature-based detections use previously known information about malware to detect the malware. By definition, zero-day malware is previously unknown. Additionally, other novel malware, older malware strains that have been modified, or polymorphic malware (i.e., malware that continually changes to evade detection) are not typically detectable using only static or signature-based detections. Accordingly, improvements are needed to ensure that malware hidden in everyday office documents is detected and contained prior to inadvertent execution by the user.
To address the limitations described above, a network security system that detonates documents securely in a sandbox is used to analyze the documents and determine whether the documents are clean or malicious. The system extracts static information about the document (e.g., information about the document itself), dynamic information about the document (e.g., behaviors observed when the document is opened in the sandbox), and character strings from images in the document while accessing the document in the sandbox. The system uses artificial intelligence (AI) models to analyze the dynamic information. The AI models are trained to predict whether the document includes malware. Further, analysis of the character strings provides a heuristic evaluation of whether the document contains malware. A verdict engine of the system combines the heuristic evaluation and the output of the AI model to classify the document as malicious or clean. Security policies can then be applied based at least in part on the classification.
In particular, a system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes a computer-implemented method that can be performed by a network security system. The network security system intercepts a request to access a document and, in response, obtains the document. The network security system detonates the document in a sandbox. In response to the detonating, the network security system extracts dynamic information about the document and character strings from images in the document during the detonating in the sandbox. The network security system provides the dynamic information about the document as input to an artificial intelligence model trained to provide an output indicating a prediction of whether the document contains malware based on the input (e.g., the dynamic information and, in some embodiments, static information). The network security system also generates a heuristic score based on comparing the character strings extracted from the document to a batch of phishing keywords. The network security system provides the output of the artificial intelligence model and the heuristic score as input to a verdict engine, where the verdict engine combines the output of the artificial intelligence model and the heuristic score to classify the document as either malicious or clean. Based on the classification, the network security system implements a security policy. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
Implementations may include one or more of the following features. Optionally, extracting the dynamic information about the document may include analyzing behavior of the document during detonation in the sandbox and extracting data from the behavior. The dynamic information may include a set of behavior features of the document exhibited during the detonating, a size of a process tree spawned by the detonating, a signature vector having a dimension for each of a number of known software signatures where the value of each dimension indicates whether the document invoked the software signature in the sandbox, and severity scores for each of the software signatures the document invoked. Optionally, the set of behavior features may include frequently visited files, frequently visited paths, pathways explored by processes in the sandbox, or a combination thereof.
In some embodiments, providing the dynamic information about the document as input may include generating a feature vector representing the dynamic information and providing the feature vector as the input to the artificial intelligence model. In some embodiments, the artificial intelligence model may include a gradient boosting tree algorithm.
In some embodiments, extracting the character strings from the images in the document may include analyzing the document during detonation in the sandbox with optical character recognition. Optionally, the heuristic score may include a count of the matches identified during the comparing of the character strings to the batch of phishing keywords.
Optionally, a heuristic rule is triggered based on the heuristic score exceeding a threshold heuristic value, and the output of the artificial intelligence model may include a prediction score indicating the prediction. In such embodiments, the verdict engine classifies the document as malicious based on determining that the heuristic rule is triggered and that the prediction score exceeds a pre-defined prediction threshold value. Further, the verdict engine in such embodiments classifies the document as clean based on determining that the heuristic rule is not triggered or that the prediction score does not exceed the pre-defined prediction threshold value.
Optionally, the method may further include extracting static information about the document. The static information may be provided to the artificial intelligence model as input along with the dynamic information, in some embodiments. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.
In the drawings, like reference characters generally refer to like parts throughout the different views. Also, the drawings are not necessarily to scale, with an emphasis instead generally being placed upon illustrating the principles of the technology disclosed. In the following description, various implementations of the technology disclosed are described with reference to the following drawings.
To more accurately detect malware in documents, more than simply static or signature-based analysis is used. As discussed above, everyday documents including word processing documents (e.g., MICROSOFT WORD®), presentation documents (e.g., MICROSOFT POWERPOINT®), and spreadsheet documents (e.g., MICROSOFT EXCEL®) are often exploited by cybercriminals to attack individuals and enterprises. These cybercriminals embed malware in the documents or otherwise configure the documents to access and initiate execution of malware on the target computers. Identifying the infected documents prior to execution (e.g., opening the document) on the target computers is ideal but difficult. Existing detection often relies on static or signature-based detection, but cybercriminals are constantly evolving their techniques to evade detection. Further, novel malware including zero-day malware, older malware strains that have been modified, or polymorphic malware (i.e., malware that continually changes to evade detection) are not detectable using standard static or signature-based detection methods.
To increase detection of these types of novel malware and avoid infection to unsuspecting computing devices, the present disclosure includes a cloud-based network security system (NSS) with a document malware detection engine. The document malware detection engine uses a sandbox in which the documents in question are detonated (i.e., opened) and analyzed. Because sandboxes are isolated and secure, detonating the document in the sandbox avoids infection. Nonetheless, the documents can be analyzed to detect many static and dynamic parameters. Additionally, while open in the sandbox, text from images in the document can be analyzed, for example using optical character recognition (OCR) technology. Character strings can be captured based on analysis of the images.
The document malware detection engine includes an artificial intelligence (AI) model trained to analyze the dynamic characteristics captured about the document while it was open in the sandbox. In some embodiments, the static information/characteristics are also included in the AI model analysis. The AI model is trained to analyze the characteristics and make a prediction as to whether the document includes malware. For example, the AI model may provide a score indicating the probability that the document includes malware.
The document malware detection engine includes a heuristic analyzer configured to compare the character strings captured from the images in the document against a list of phishing keywords and phrases. The heuristic analyzer can generate a heuristic score, such that a score exceeding a threshold value may be a heuristic trigger, indicating that the document likely contains malware.
The document malware detection engine includes a verdict engine that ingests the heuristic score and the prediction from the AI model to classify the document as either clean or malicious. Once classified by the verdict engine, the NSS may apply security policies to the document based on the classification.
Advantageously, the disclosed document malware detection engine uses a sandbox to isolate the document during access to avoid infecting any unsuspecting computing systems while still maintaining the ability to analyze the file. While the document is in the sandbox, it is analyzed to detect static information (e.g., information not requiring the document to be opened) and dynamic information (e.g., information generated and captured by opening the document), and data embedded in images is extracted and analyzed. The AI model trained to analyze the dynamic information (and static information in some embodiments) provides a prediction of whether the document includes malware. Alone, this prediction provides satisfactory results. However, in combination with the heuristic trigger based on the data embedded in images, the detection rate doubled in testing. The increased rate of detection reduces the number of infected computing systems and saves computing resources as well as the human resources spent mitigating infected computing systems.
Endpoints 105 comprise user devices including desktops, laptops, mobile devices, and the like. The mobile devices include smartphones, smart watches, and the like. Endpoints 105 may also include internet of things (IoT) devices. Endpoints 105 may include any number of components including those described with respect to computing device 700 of
Endpoint routing client 110 routes network traffic transmitted from its respective endpoint 105 to the network security system 125. Depending on the type of device for which endpoint routing client 110 is routing traffic, endpoint routing client 110 may use or be a virtual private network (VPN), such as VPN on demand or per-app-VPN, that uses certificate-based authentication. For example, for some devices having a first operating system, endpoint routing client 110 may be a per-app-VPN, or a set of domain-based VPN profiles may be used. For other devices having a second operating system, endpoint routing client 110 may be a cloud director mobile app. Endpoint routing client 110 can also be an agent that is downloaded using e-mail or silently installed using mass deployment tools. As mentioned above, endpoint routing client 110 may be implemented in a gateway through which all traffic from endpoints 105 travels to leave an enterprise network, for example. In any implementation, endpoint routing client 110 routes traffic generated by endpoints 105 to network security system 125.
Public network 115 may be any public network including, for example, the Internet. Public network 115 couples endpoints 105, destination domain servers 120, and network security system 125 such that any may communicate with any other via public network 115. While not depicted for simplicity, public network 115 may also couple many other devices for communication including, for example, other servers, other private networks, other user devices, and the like (e.g., any other connected devices). The communication path can be point-to-point over public network 115 and may include communication over private networks (not shown). In some embodiments, endpoint routing client 110 might be delivered indirectly, for example, via an application store (not shown). Communications can occur using a variety of network technologies, for example, private networks, Virtual Private Network (VPN), multiprotocol label switching (MPLS), local area network (LAN), wide area network (WAN), Public Switched Telephone Network (PSTN), Session Initiation Protocol (SIP), wireless networks, point-to-point networks, star networks, token ring networks, hub networks, the Internet, or the like. Communications may use a variety of protocols. Communications can use appropriate application programming interfaces (APIs) and data interchange formats, for example, Representational State Transfer (REST), JavaScript Object Notation (JSON), Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), Java Message Service (JMS), Java Platform Module System, and the like. Additionally, a variety of authorization and authentication techniques, such as username/password, Open Authorization (OAuth), Kerberos, SecurID, digital certificates, and more, can be used to secure communications.
Destination domain servers 120 include any domain servers available on public network 115. Destination domain servers 120 may include, for example, hosted services such as cloud computing and storage services, financial services, e-commerce services, or any type of applications, websites, or platforms that provide cloud-based storage or web services. At least some destination domain servers 120 may provide or store documents that endpoints 105 access (e.g., store, manipulate, download, upload, open, or the like).
Network security system 125 may provide network security services to endpoints 105. Endpoint routing client 110 may route traffic addressed to destination domain servers 120 from the endpoints 105 to network security system 125 to enforce security policies. Based on the security policy enforcement, the traffic may then be routed to the addressed destination domain server 120, blocked, modified, or the like. While network security system 125 is shown as connected to endpoints 105 via public network 115, in some embodiments, network security system 125 may be on a private network with endpoints 105 to manage network security on premises. Network security system 125 may implement security management for endpoints 105. The security management may include data loss prevention (DLP) and protection from other security vulnerabilities including document malware. For simplicity, the features of network security system 125 related to detecting document malware are shown while other security features are not described in detail. Network security system 125 may be implemented as a cloud-based service and accordingly may be served by one or more server computing systems that provide the cloud-based services that are distributed geographically across data centers, in some embodiments. Network security system 125 may be implemented in any computing system or architecture that can provide the described capabilities without departing from the scope of the present disclosure. Network security system 125 may include, among other security features, sandbox 130, AI malware detection engine 135, and security policy enforcer 140. While a single network security system 125 is depicted for simplicity, any number of network security systems 125 may be implemented in security environment 100 and may include multiple instances of sandbox 130, AI malware detection engine 135, and security policy enforcer 140 for handling multiple clients or enterprises on a per-client basis, for example.
Sandbox 130 is a secure, isolated environment in which a document may be detonated (i.e., opened or launched). Sandbox 130 allows the AI malware detection engine 135 to detonate the document securely such that if it contains malware, the malware is contained and does not harm or infect any client computing systems, including endpoints 105. The documents may be any office documents including, for example, word processing documents (e.g., MICROSOFT WORD®), spreadsheet documents (e.g., MICROSOFT EXCEL®), presentation documents (e.g., MICROSOFT POWERPOINT®), or the like. Sandbox 130 isolates all running programs and is configured to have tightly controlled resources so that any malware is contained and does not infect the hosting server. While in the sandbox, static information about the document may be extracted. Further, during detonation, dynamic behaviors of the document can be observed and analyzed. For example, data about files and paths visited, static data about the document, processes spawned by detonation of the document, and the like can be safely obtained. Additionally, character strings can be extracted from images in the document safely. For example, optical character recognition (OCR) can be used to extract the character strings. The extracted and obtained data, including the static information and dynamic behavior data as well as the character strings, can be used by AI malware detection engine 135 to classify the document as clean or malicious as described in further detail throughout.
AI malware detection engine 135 analyzes documents requested by endpoints 105 to determine or predict whether the documents contain malware (i.e., are malicious). AI malware detection engine 135 obtains the requested document from the destination domain server 120 indicated in the access request. Upon obtaining the document, it is detonated in sandbox 130 and its behavior is observed. Static and dynamic data is captured while the document is open in sandbox 130. In some embodiments, a process in sandbox 130 extracts the desired data and provides it to AI malware detection engine 135. For example, the process may generate a report containing the relevant static and dynamic data and provide the report to AI malware detection engine 135. In some embodiments, a data analyzer of AI malware detection engine 135 analyzes the document while it is open in sandbox 130 and extracts the static and dynamic data. In some embodiments, the static data about the document may be obtained outside of sandbox 130 since some static data may be extracted without opening the document. Details of the static and dynamic data that is extracted or obtained are discussed in more detail with respect to
Security policy enforcer 140 enforces security policies on all outgoing transactions intercepted by network security system 125 from endpoints 105. Security policy enforcer 140 may identify security policies to apply to outgoing transactions based on, for example, the user account that the outgoing transaction originates from, the endpoint 105 (i.e., user device) that the outgoing transaction originates from, the destination server addressed, the type of communication protocol used, the type of transaction (e.g., document download, document upload, login transaction, or the like), data included in the traffic (e.g., data in the packet), or any combination. Further, security policies may be applied based on classification of a document access request by AI malware detection engine 135. For example, if AI malware detection engine 135 classifies a requested document as malicious, security policy enforcer 140 may block the access request. In some embodiments, other security actions may be performed, other security policies may be applied based on the classification, or the like. For example, a notification of the malicious classification may be presented to the user. As another example, if the document is classified as clean, other security policies may be applied. In all cases, security policy enforcer 140 may identify relevant security policies for the outgoing transaction and apply the security policies. The security policies may include document malware specific policies as well as any other security policies implemented by the organization or entity. Accordingly, security policy enforcer 140 may identify and enforce any other security policies (e.g., security policies other than those related to document malware classification). After applying the security policies, the outgoing transaction may be blocked, modified, or transmitted to the destination domain server 120 specified in the outgoing transaction.
In use, endpoint 105 generates an outgoing transaction to a destination domain server 120. Endpoint routing client 110 routes the outgoing transaction to network security system 125. Network security system 125 intercepts the outgoing transaction and determines whether the transaction includes a document access request. If not, the outgoing transaction is routed to security policy enforcer 140. If so, the outgoing transaction is routed to AI malware detection engine 135. AI malware detection engine 135 analyzes the requested document by detonating it in sandbox 130 and extracting relevant information about the behavior of the document. Based on the analysis, AI malware detection engine 135 classifies the document as clean or malicious and provides the classification with the outgoing transaction to security policy enforcer 140. Security policy enforcer 140 enforces relevant security policies, some of which may be related to the document classification. Based on enforcement of the relevant security policies, network security system 125 may block the outgoing transaction, modify the outgoing transaction, or transmit the outgoing transaction to the addressed destination domain server 120.
Destination domain servers 120, network security system 125, sandbox 130, and security policy enforcer 140 remain as described with respect to
Ingestion engine 210 receives outgoing transaction 205 as it arrives based on being routed from endpoint routing client 110. As outgoing transactions are routed to network security system 125, ingestion engine 210 receives each outgoing transaction 205. Ingestion engine 210 may perform various filtering processes depending on the outgoing transaction 205. For the purposes of detecting document malware, ingestion engine 210 may determine whether outgoing transaction 205 includes a document access request. For example, ingestion engine 210 may review packet header information to determine the destination domain server 120 to which outgoing transaction 205 is directed. Based, for example, on the destination domain server 120 being a document storage service, ingestion engine 210 may determine outgoing transaction 205 includes a document access request. As another example, ingestion engine 210 may analyze the payload of outgoing transaction 205 to determine outgoing transaction 205 includes a document access request. In any case, upon determining outgoing transaction 205 includes a document access request, ingestion engine 210 sends outgoing transaction 205 to AI malware detection engine 135. If, however, ingestion engine 210 determines outgoing transaction 205 does not include a document access request, ingestion engine 210 routes outgoing transaction 205 directly to security policy enforcer 140.
File retriever 215 is responsible for obtaining a copy of the target document to which the user requested access. File retriever 215 receives outgoing transaction 205 from ingestion engine 210 when ingestion engine 210 routes outgoing transaction 205 to AI malware detection engine 135. In some embodiments, if ingestion engine 210 determined the file location of the document requested, the file location may be provided separately with outgoing transaction 205 from ingestion engine 210 so that file retriever 215 need not repeat the analysis of outgoing transaction 205. Otherwise, file retriever 215 analyzes outgoing transaction 205 to determine where the requested document is located. Upon determining the file location, file retriever 215 requests the document from destination domain server 120. In some embodiments, file retriever 215 generates a request to download the document using user login credentials from the user associated with outgoing transaction 205. File retriever 215 obtains the document from destination domain server 120 and provides the document to file detonator 220.
File detonator 220 is responsible for detonating (i.e., opening) the document in sandbox 130. File detonator 220 receives the document from file retriever 215. Upon receipt, file detonator 220 may configure sandbox 130 as needed for accessing the document. Note that many outgoing transactions 205 may be analyzed simultaneously; accordingly, each document is opened in its own isolated instance of sandbox 130. File detonator 220 may, for example, configure settings, parameters, or the types of data to be captured, and in some embodiments the settings, parameters, and types of data are configured based on the type of file. Once sandbox 130 is configured, file detonator 220 opens the document in sandbox 130.
Data analyzer 225 is responsible for distributing the retrieved data from the document to the relevant components so that AI malware detection engine 135 generates a classification for the document. In some embodiments, data analyzer 225 analyzes the document while it is open in sandbox 130. Data analyzer 225 may use, for example, optical character recognition (OCR) to analyze the images in the document to extract any character strings or text in the images. Further, data analyzer 225 may obtain other static and dynamic data from the document while it is open in sandbox 130. In some embodiments, data analyzer 225 may extract the static data from the document outside of sandbox 130. In some embodiments, processes within sandbox 130 may obtain the static and dynamic data from the document in sandbox 130 as well as perform the OCR to analyze the images in the document. In such embodiments, sandbox 130 may include a process that generates a report that includes all the extracted, observed, identified, and captured data about the document while in sandbox 130. In any case, data analyzer 225 obtains the dynamic data from opening (i.e., executing or detonating) the document in sandbox 130. If processes within sandbox 130 capture the static and dynamic data as well as the character strings from the OCR analysis, the processes may generate a report and send the report, or store the report and the character strings in a specific destination folder on network security system 125. Data analyzer 225 can retrieve the report and character strings from the destination folder or otherwise obtain the report and character strings from sandbox 130.
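For illustration only, the following minimal Python sketch shows the image-text extraction step, assuming the open-source Pillow and pytesseract libraries; the disclosure does not name a particular OCR implementation, and the helper name is hypothetical.

    from PIL import Image
    import pytesseract

    def extract_strings_from_images(image_paths):
        """Run OCR over images pulled from the detonated document and
        return the non-empty character strings found in each image."""
        strings = []
        for path in image_paths:
            text = pytesseract.image_to_string(Image.open(path))
            strings.extend(line.strip() for line in text.splitlines() if line.strip())
        return strings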
Data analyzer 225 may generate a feature vector using the relevant static and dynamic data captured in sandbox 130 to provide to AI model 230. Static data may include the number of pages in the document, the name of the user who last saved the document, the author, the title, the creation time, keywords, template information, and the like. Some static data may be identified without opening the document; static data generally represents information about the document itself, including its current state. Dynamic information, described further below, represents behaviors of the document that are observed upon opening the document. The feature vector generated by data analyzer 225 may include, for example, a dimension including a count of behavior features that were exhibited by the document when opened in sandbox 130. A selection of features that may be included in the count may include, for example: apistats, dll_loaded, regkey_opened, regkey_read, regkey_written, regkey_deleted, file_loaded, directory_enumerated, file_exists, file_opened, file_deleted, file_moved, file_created, file_failed, file_written, file_copied, file_recreated, file_read, mutex, command_line, guid, wmi_query, directory_created, directory_removed, resolves_host, connects_ip, connects_host, downloads_file, fetches_url. In some embodiments, some features may be determined to have higher relevance, and those features may be weighted to count for more value in the count, for example. The following table includes further explanation of each of the behavior features listed above.
The feature vector may include a count of the behavior features exhibited by the document, a dimension for each behavior feature indicating whether it was exhibited by the document, or otherwise represent the behavior features exhibited by the document in the feature vector.
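As a minimal sketch, the behavior-feature dimensions might be assembled as follows, assuming the sandbox report is a Python dictionary keyed by the feature names listed above; the weighting values are illustrative assumptions rather than values from the disclosure.

    # Behavior feature names from the list above.
    BEHAVIOR_FEATURES = [
        "apistats", "dll_loaded", "regkey_opened", "regkey_read", "regkey_written",
        "regkey_deleted", "file_loaded", "directory_enumerated", "file_exists",
        "file_opened", "file_deleted", "file_moved", "file_created", "file_failed",
        "file_written", "file_copied", "file_recreated", "file_read", "mutex",
        "command_line", "guid", "wmi_query", "directory_created", "directory_removed",
        "resolves_host", "connects_ip", "connects_host", "downloads_file", "fetches_url",
    ]

    # Hypothetical weights for features deemed more relevant (values are illustrative).
    WEIGHTS = {"downloads_file": 2.0, "fetches_url": 2.0}

    def behavior_dimensions(report):
        """Return one 0/1 dimension per behavior feature plus a weighted count."""
        exhibited = [1 if report.get(name) else 0 for name in BEHAVIOR_FEATURES]
        weighted_count = sum(WEIGHTS.get(name, 1.0) * bit
                             for name, bit in zip(BEHAVIOR_FEATURES, exhibited))
        return exhibited, weighted_count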
Additionally, data analyzer 225 may include a dimension in the feature vector representing the process tree. For example, a value indicating the size of the process tree may be used as a dimension. The size of the process tree may be determined based on the number of processes and subprocesses that are spawned in response to opening the document. For example, process A may create child processes B and C, and child process C may create a child process D. In this case, the size of the process tree is four (4).
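A minimal sketch of the process-tree-size computation, assuming each sandboxed process is reported as a dictionary listing its child processes (the report schema is an assumption):

    def process_tree_size(process):
        """Count a process and all of its descendants."""
        return 1 + sum(process_tree_size(child) for child in process.get("children", []))

    # The example above: A spawns B and C, and C spawns D, giving a size of 4.
    root = {"name": "A", "children": [{"name": "B"},
                                      {"name": "C", "children": [{"name": "D"}]}]}
    assert process_tree_size(root) == 4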
Data analyzer 225 may include one or more dimensions representing frequently visited files and paths of the document. For example, malicious documents may access (e.g., load, read, open, delete, or the like) some directories or files which are less frequently accessed by benign documents. To fully analyze this behavior, the dynamic data may include particular information about frequently visited files and paths. The training associated with this dimension of the feature vector is described in more detail with respect to
Data analyzer 225 further includes, in the feature vector, signature-related features associated with the document that are captured during detonation of the document in sandbox 130. For example, a one-hot vector having a dimension for each of a number of known signatures may be included in the feature vector. Each dimension may have a value of zero (0) or one (1) indicating whether or not the specific signature was invoked by the document detonation in sandbox 130. Each known signature may have a known severity score. In some embodiments, the severity scores for each invoked signature can be summed to provide a total severity score as a dimension in the feature vector.
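For illustration, the signature dimensions might be computed as follows; the signature names and severity scores are hypothetical placeholders, not signatures from the disclosure.

    # Hypothetical signature list and severity scores (illustrative only).
    KNOWN_SIGNATURES = ["injects_remote_process", "disables_security_tool", "spawns_shell"]
    SEVERITY = {"injects_remote_process": 8, "disables_security_tool": 6, "spawns_shell": 4}

    def signature_dimensions(invoked_signatures):
        """Return the one-hot vector of invoked signatures and the summed severity."""
        one_hot = [1 if sig in invoked_signatures else 0 for sig in KNOWN_SIGNATURES]
        total_severity = sum(SEVERITY[sig] for sig in KNOWN_SIGNATURES
                             if sig in invoked_signatures)
        return one_hot, total_severity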
After analyzing the data from sandbox 130, data analyzer 225 generates the feature vector as discussed above and submits it to AI model 230. AI model 230 is trained (as discussed in more detail with respect to
As previously discussed, data analyzer 225 may also obtain character strings, for example using OCR, from images in the document while it was open in sandbox 130. Data analyzer 225 provides the character strings to heuristic analyzer 235. Heuristic analyzer 235 may analyze the character strings by, for example, comparing the character strings to a batch of known phishing keywords and/or phrases to identify matches. In some embodiments, partial matches may be included. In some embodiments, a count of the matches may be used to generate a heuristic score. In some embodiments, partial matches are used and may be weighted to account for a smaller portion of the score than a complete or exact match. Heuristic analyzer 235 may return a heuristic score in some embodiments. In other embodiments, heuristic analyzer 235 may determine whether to indicate a heuristic trigger based on a heuristic rule, for example, the heuristic score exceeding a threshold value. In other words, the heuristic rule may be triggered if sufficient matches to known phishing keywords and phrases are identified in images within the document. The result (e.g., a heuristic trigger, a heuristic score, a true or false indicator, or the like) from heuristic analyzer 235 is returned to data analyzer 225. Data analyzer 225 provides the heuristic analyzer 235 result and the output from AI model 230 to verdict engine 240.
Verdict engine 240 is responsible for classifying the document as clean or malicious. For example, upon receipt of the heuristic score and the threat score from AI model 230, verdict engine 240 may combine the two values to classify the document. In some embodiments, based on the threat score exceeding a threshold threat value and the heuristic score exceeding a threshold heuristic value, verdict engine 240 classifies the document as malicious. In some embodiments, upon receiving a prediction from AI model 230 that the document includes malware and the heuristic trigger having been triggered, verdict engine 240 classifies the document as malicious. In other words, when both AI model 230 indicates malware and heuristic analyzer 235 indicates malware, verdict engine 240 may classify the document as malicious. Similarly, if verdict engine 240 receives a threat value from AI model 230 that is below a threshold threat value and a heuristic score that is below a threshold heuristic value, verdict engine 240 may classify the document as clean. Further, if either AI model 230 or heuristic analyzer 235 indicates no malware (e.g., one or the other returns a score falling below the respective threshold value), verdict engine 240 may classify the document as clean. In other words, unless both AI model 230 and heuristic analyzer 235 indicate malware in the document, the document is classified by verdict engine 240 as clean. After classification, verdict engine 240 provides outgoing transaction 205 and the classification to security policy enforcer 140.
Security policy enforcer 140 enforces security policies on outgoing transaction 205 based at least in part on the classification from verdict engine 240. For example, if the document is classified as malicious, outgoing transaction 205 may be blocked, a notification may be sent to the user, the document may be quarantined, a notification may be sent to administrators, or the like. Further, any combination of security policies may be applied. If the document is classified as clean, security policies may be applied including forwarding outgoing transaction 205 to the destination domain server 120, limiting the user's ability to share, modify, or delete the document based on user privileges, or the like.
AI model 230 may be or use a gradient boosting tree algorithm (e.g., eXtreme Gradient Boosting (XGBoost)). AI model 230 may include trees 330a, 330b, and 330n (collectively referred to as trees 330), indicating that there may be any number of trees 330. In some embodiments, AI model 230 includes one hundred forty (140) decision trees having a maximum depth of sixteen (16). AI model 230 uses trees 330 to generate threat score 315. The details of how gradient boosting tree algorithms work are not described here as they are known in the art. However, further details of how AI model 230 is trained specifically to generate threat score 315 are described with respect to
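A minimal sketch of configuring such a model with the parameters stated above (one hundred forty trees with a maximum depth of sixteen), assuming the open-source xgboost package; the remaining hyperparameters are left at library defaults.

    from xgboost import XGBClassifier

    ai_model = XGBClassifier(
        n_estimators=140,             # one hundred forty decision trees
        max_depth=16,                 # maximum tree depth of sixteen
        objective="binary:logistic",  # output a probability usable as a threat score
    )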
Data analyzer 225 further obtains character strings 310. Character strings 310 may represent data extracted from the document while it was open in sandbox 130. For example, optical character recognition may be used to analyze images in the document to extract text from the images. Other character strings may be extracted from the document as well, including from metadata or plain text in the document. The selection of character strings 310 is sent by data analyzer 225 to heuristic analyzer 235.
Heuristic analyzer 235 may include a batch of known phishing keywords, phrases, or a combination thereof. Heuristic analyzer 235 may compare each of the character strings 310 to the batch of known phishing keywords and phrases. Each match may be counted to generate a score. For example, each match may increase the score by one (1). In some embodiments, partial matches may be included. In some embodiments, partial matches may count for less in the score than exact matches. In some embodiments, the score may be used as heuristic score 320. In other embodiments, the score may be used to determine whether it exceeds a threshold heuristic value. If the score exceeds the threshold heuristic value, it may be considered a heuristic trigger, and heuristic score 320 may indicate a binary value (e.g., zero (0) for no heuristic trigger and one (1) for the heuristic trigger). In either case, heuristic score 320 is generated by heuristic analyzer 235.
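A minimal sketch of this scoring, assuming a small illustrative keyword batch, a partial-match weight of 0.5, and a threshold of 3; all three values are assumptions rather than values from the disclosure.

    # Hypothetical phishing keyword batch (illustrative only).
    PHISHING_KEYWORDS = {"verify your account", "password expired", "enable content"}

    def heuristic_score(strings, threshold=3.0, partial_weight=0.5):
        """Score OCR-extracted strings against the keyword batch; an exact phrase
        match counts 1.0, a partial match counts less, and the heuristic trigger
        fires when the total score exceeds the threshold."""
        text = " ".join(s.lower() for s in strings)
        score = 0.0
        for phrase in PHISHING_KEYWORDS:
            if phrase in text:
                score += 1.0                  # exact match
            elif sum(word in text for word in phrase.split()) >= len(phrase.split()) / 2:
                score += partial_weight       # partial match counts for less
        return score, score > threshold       # (heuristic score 320, trigger)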
Threat score 315 generated by AI model 230 and heuristic score 320 generated by heuristic analyzer 235 are provided to verdict engine 240. In some embodiments, AI model 230 sends threat score 315 and heuristic analyzer 235 sends heuristic score 320 directly to verdict engine 240. In some embodiments, AI model 230 sends threat score 315 and heuristic analyzer 235 sends heuristic score 320 back to data analyzer 225, and data analyzer 225 sends both to verdict engine 240. In either case, verdict engine 240 receives threat score 315 and heuristic score 320. Verdict engine 240 combines threat score 315 and heuristic score 320 to generate classification 325, classifying the document as either clean or malicious. Verdict engine 240 may be an AI classifier that takes the threat score 315 and the heuristic score 320 as input and outputs a classification. Verdict engine 240 may, in some embodiments, be a simpler algorithm that determines whether there is a heuristic trigger. The heuristic trigger may either be determined by heuristic analyzer 235 and indicated by heuristic score 320 being a positive value (e.g., one (1)), indicating the heuristic trigger, or verdict engine 240 may determine whether the heuristic trigger is positive by comparing the heuristic score 320 to a threshold heuristic value. In either case, a rule may be used to determine whether the heuristic trigger happened with respect to the document. Further, verdict engine 240 may compare threat score 315 against a threshold threat value such that when the threat score 315 exceeds the threshold threat value it indicates that the document includes malware. Verdict engine 240 may classify the document as malicious when the heuristic trigger happens and the threat score 315 indicates the document includes malware (e.g., threat score 315 exceeds the threshold threat value). Otherwise, verdict engine 240 may classify the document as clean. In other words, both heuristic analyzer 235 and AI model 230 may need to indicate the document includes malware for verdict engine 240 to classify the document as malicious. In other embodiments, verdict engine 240 may classify the document as clean when both AI model 230 and heuristic analyzer 235 indicate the document does not include malware and otherwise classify the document as malicious. In yet other embodiments, when threat score 315 falls within an intermediate range, heuristic score 320 may determine the classification (e.g., when the heuristic trigger does not happen, the document is classified as clean, and when the heuristic trigger does happen, the document is classified as malicious). In such embodiments, when threat score 315 falls below the intermediate range, the document is classified as clean, and when threat score 315 falls above the intermediate range, the document is classified as malicious. As can be seen, verdict engine 240 may combine threat score 315 and heuristic score 320 in any way to classify the document as clean or malicious.
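Two of the combination strategies described above, expressed as a minimal sketch; the threshold values and the bounds of the intermediate range are illustrative assumptions.

    THREAT_THRESHOLD = 0.5  # illustrative threshold threat value

    def classify_both_required(threat_score, heuristic_triggered):
        """Malicious only when both the AI model and the heuristic indicate malware."""
        if heuristic_triggered and threat_score > THREAT_THRESHOLD:
            return "malicious"
        return "clean"

    def classify_with_intermediate_range(threat_score, heuristic_triggered,
                                         low=0.3, high=0.7):
        """Let the heuristic trigger decide when the threat score falls in an
        intermediate range (the low/high bounds are illustrative)."""
        if threat_score < low:
            return "clean"
        if threat_score > high:
            return "malicious"
        return "malicious" if heuristic_triggered else "clean"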
Upon completing analysis, verdict engine 240 generates classification 325 and sends it to security policy enforcer 140 for security policy enforcement of the outgoing transaction 205.
At step 415, the document is obtained. For example, file retriever 215 may retrieve the file from the addressed destination domain server 120. In some embodiments, the user credentials used for outgoing transaction 205 may be used to obtain the file.
At step 420, the document is detonated (i.e., opened) in a sandbox of the network security system. For example, file detonator 220 may open the file in sandbox 130.
At step 425, in response to detonating the document, dynamic information about the document is extracted. Static information may also be extracted. For example, processes running in sandbox 130 may extract behavior feature data, signature data, process data, and the like. The processes may generate a report providing the relevant information. In some embodiments, data analyzer 225 may obtain the report such that data analyzer 225 retrieves all the data for analyzing whether the document includes malware. Data analyzer 225 can then extract the relevant data from the report. In some embodiments, data analyzer 225 probes sandbox 130 while the document is open and extracts the information, including the static and dynamic data, directly.
At step 430, character strings from images in the document are extracted while the document is open in the sandbox. For example, OCR technology may be used to extract character strings from images in the document. In some embodiments, a process within sandbox 130 is configured to launch the OCR technology and provide the information in a report (e.g., the same report providing the static and dynamic data or a different report).
At step 435, the dynamic information about the document is provided as input to an artificial intelligence model trained to provide an output indicating a prediction of whether the document contains malware based on the input. For example, data analyzer 225 generates feature vector 305 and provides it as input to AI model 230. AI model 230 is trained to generate threat score 315 as output. In some embodiments, the static information is also provided as a portion of the input to the artificial intelligence model.
At step 440, a heuristic score is generated based on comparing the character strings extracted from images in the document to a batch of phishing keywords. For example, data analyzer 225 sends character strings 310 to heuristic analyzer 235 to compare character strings 310 to a batch of phishing keywords and phrases (collectively, phishing keywords). Heuristic analyzer 235 generates heuristic score 320. Heuristic score 320 may be a count of the matches, a weighted count of the matches and partial matches, a number indicating whether a heuristic trigger is triggered (e.g., the count of matches exceeds a threshold heuristic value), or any value that indicates a likelihood that the document includes malware based on the comparison of character strings 310 to the batch of phishing keywords.
At step 445, the output of the artificial intelligence model and the heuristic score are input to a verdict engine that combines the output and the heuristic score to classify the document as either clean or malicious. For example, threat score 315 and heuristic score 320 are input to verdict engine 240, which uses them to generate classification 325.
At step 450, a security policy is implemented based on the classification of the document. For example, security policy enforcer 140 may implement one or more security policies on outgoing transaction 205 based on whether verdict engine 240 provided classification 325 indicating the document was clean or malicious. If, for example, the document is classified as clean, security policy enforcer 140 may implement different security policies than if the document is classified as malicious. As one example, if classified as malicious, security policy enforcer 140 may block transmission of outgoing transaction 205. However, if classified as clean, security policy enforcer 140 may apply further security policies to outgoing transaction 205.
After initial setup, including identifying frequently visited files and paths that are discussed in more detail with respect to
Behavior features 520 may include any or all of the behavior features discussed in the table above with respect to
Process tree size 525 may be a dimension of feature vector 305 that indicates an integer value of the size of the process tree spawned by detonating the document. The process tree size may indicate malware if, for example, a large process tree is spawned.
Frequently visited files and paths 530 may be identified and included as one or more dimensions of feature vector 305. Specifically, files and paths known to be frequently visited by malicious files, together with the particular behavior features malicious files exhibit with respect to those known paths and files, are used to generate a count for one dimension of feature vector 305. In other words, when a document exhibits a behavior feature known to be often exhibited in a known path (e.g., file_created in path: directory1/subdirectory2/subdirectory3), the count increases by one. Similarly, files and paths known to be frequently visited by benign files, together with the particular behavior features benign files exhibit with respect to those known paths and files, are used to generate a count for another dimension of feature vector 305. Those dimensions make up the frequently visited files and paths 530 data included in feature vector 305. Additional details about how the frequently visited files and paths are identified for inclusion are described with respect to
Signature features 535 includes data regarding signatures that are invoked by detonation of the document. Each signature may be associated with a corresponding dimension of feature vector 305. When the signature is invoked, the dimension indicates the invocation (e.g., has a value of one (1)), and when the signature is not invoked, the dimension indicates no invocation (e.g., has a value of zero (0)). The list of signatures included may be based on published signatures known in network security.
Signature severity 540 may include the severity information for the signatures invoked by the document. For example, each signature that is tracked in signature features 535 includes a severity score. A dimension of feature vector 305 may include a sum of the signature severity scores for the signatures invoked by the document.
Once feature vector 305 is generated for a given training data sample from training sample data 515, it is input to AI model 230. AI model 230 generates output 545 (e.g., threat score 315). The training system compares output 545 against the ground truth 550 for the given training sample. For example, each training sample may be labeled as malicious or clean. When output 545 indicates that the training sample is malicious, but the ground truth 550 indicates the document is clean, the AI model 230 is wrong, and that information is provided as feedback 555 to AI model 230. Similarly, when AI model 230 is correct, feedback 555 indicates the correct classification. AI model 230 is trained until it reaches an acceptable level of accuracy. Once trained, it is deployed for use in a production environment.
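A minimal sketch of this training step, assuming the feature vectors and clean/malicious ground-truth labels have already been assembled as arrays; the validation split is an illustrative choice, and the gradient boosting library handles the per-sample feedback internally.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    def train_ai_model(feature_vectors, labels):
        """Fit the classifier on labeled samples (1 = malicious, 0 = clean) and
        report accuracy on a held-out validation set."""
        X = np.asarray(feature_vectors)
        y = np.asarray(labels)
        X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
        model = XGBClassifier(n_estimators=140, max_depth=16,
                              objective="binary:logistic")
        model.fit(X_train, y_train)
        print("validation accuracy:", model.score(X_val, y_val))
        return model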
Once a set of paths is identified as visited by benign and malicious samples, the paths are analyzed. The sample 600 shown in
Each trimmed path is scored with an odds ratio as shown in
For each specific trimmed path (e.g., the first trimmed path 615), a value M0 is calculated indicating the total number of malicious samples that visited the trimmed path, and a second value M1 is calculated indicating the number of malicious samples that did not visit the trimmed path. Similarly, a value B0 is calculated indicating the total number of benign samples that visited the trimmed path, and a value B1 is calculated indicating the total number of benign samples that did not visit the trimmed path. Odds1 is calculated as M0 divided by M1 (i.e., odds1=M0/M1), and odds2 is calculated as B0 divided by B1 (i.e., odds2=B0/B1). The odds ratio is then calculated by dividing odds1 by odds2 (i.e., odds ratio=odds1/odds2).
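The same computation expressed as a minimal sketch; the small smoothing constant is an assumption added to avoid division by zero, which the passage above does not address.

    def odds_ratio(m0, m1, b0, b1, eps=1e-9):
        """m0/m1: malicious samples that did/did not visit the trimmed path;
        b0/b1: benign samples that did/did not visit it."""
        odds1 = m0 / (m1 + eps)       # odds1 = M0 / M1
        odds2 = b0 / (b1 + eps)       # odds2 = B0 / B1
        return odds1 / (odds2 + eps)  # odds ratio = odds1 / odds2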
In addition to generating the malicious list 640 and the benign list 635 indicating trimmed paths and files frequently visited by the malicious and benign documents, a list of behavior features for each frequently visited path in the malicious list 640 can be identified, and a list of behavior features for each frequently visited path in the benign list 635 can be identified. For example, behavior features including file_exists, file_created, file_failed, file_written, file_recreated, file_read, and file_opened may be included in the behavior features relevant to the frequently visited paths and files in benign list 635. Behavior features including directory_enumerated, file_exists, file_opened, file_created, file_failed, file_read, directory_created, file_written, file_deleted, directory_removed, and file_recreated may be included in the behavior features relevant to the frequently visited paths and files in malicious list 640.
Accordingly, to generate frequently visited files and paths 530 in feature vector 305 for each training and production sample, the dynamic information is collected by detonating the document in sandbox 130. A count of how many times the document performs one of the relevant behavior features in a path of the malicious list 640 may be one dimension of feature vector 305. A count of how many times the document performs one of the relevant behavior features in a path of the benign list 635 may be another dimension of feature vector 305.
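A minimal sketch of these two dimensions, assuming the sandbox report yields (behavior feature, trimmed path) event pairs and that malicious list 640 and benign list 635 map each trimmed path to its relevant behavior features; the schema is an assumption.

    def path_feature_counts(events, malicious_list, benign_list):
        """events: iterable of (behavior_feature, trimmed_path) pairs observed
        during detonation. Each list maps trimmed path -> set of relevant
        behavior features. Returns (malicious_count, benign_count)."""
        events = list(events)  # allow a generator to be iterated twice
        malicious_count = sum(1 for feature, path in events
                              if feature in malicious_list.get(path, ()))
        benign_count = sum(1 for feature, path in events
                           if feature in benign_list.get(path, ()))
        return malicious_count, benign_count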
Computing device 700 is suitable for implementing processing operations described herein related to security enforcement and document malware detection, with which aspects of the present disclosure may be practiced. Computing device 700 may be configured to implement processing operations of any component described herein including the user system components (e.g., endpoints 105 of
Non-limiting examples of computing device 700 include smart phones, laptops, tablets, PDAs, desktop computers, servers, blade servers, cloud servers, smart computing devices including television devices and wearable computing devices including VR devices and AR devices, e-reader devices, gaming consoles and conferencing systems, among other non-limiting examples.
Processors 710 may include general processors, specialized processors such as graphical processing units (GPUs) and digital signal processors (DSPs), or a combination. Processors 710 may load and execute software 740 from memory 735. Software 740 may include one or more software components such as sandbox 130, AI malware detection engine 135, security policy enforcer 140, endpoint routing client 110, or any combination including other software components. In some examples, computing device 700 may be connected to other computing devices (e.g., display device, audio devices, servers, mobile devices, remote devices, VR devices, AR devices, or the like) to further enable processing operations to be executed. When executed by processors 710, software 740 directs processors 710 to operate as described herein for at least the various processes, operational scenarios, and sequences discussed in the foregoing implementations. Computing device 700 may optionally include additional devices, features, or functionality not discussed for purposes of brevity. For example, software 740 may include an operating system that is executed on computing device 700. Computing device 700 may further be utilized as endpoints 105 or any of the cloud computing systems in security environment 100 (
Referring still to
Memory 735 may include any computer-readable storage device readable by processors 710 and capable of storing software 740 and data stores 745. Data stores 745 may include data stores that maintain security policies used by security policy enforcer 140, for example. Memory 735 may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, cache memory, or other data. Examples of storage media include random access memory, read only memory, magnetic disks, optical disks, flash memory, virtual memory and non-virtual memory, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other suitable storage media, except for propagated signals. In no case is the computer-readable storage device a propagated signal.
In addition to computer-readable storage devices, in some implementations, memory 735 may also include computer-readable communication media over which at least some of software 740 may be communicated internally or externally. Memory 735 may be implemented as a single storage device but may also be implemented across multiple storage devices or sub-systems co-located or distributed relative to each other. Memory 735 may include additional elements, such as a controller, capable of communicating with processors 710 or possibly other systems.
Software 740 may be implemented in program instructions and among other functions may, when executed by processors 710, direct processors 710 to operate as described with respect to the various operational scenarios, sequences, and processes illustrated herein. For example, software 740 may include program instructions for executing document malware detection (e.g., AI malware detection engine 135, AI model 230, heuristic analyzer 235, verdict engine 240, data analyzer 225, file retriever 215, file detonator 220, sandbox 130) or security policy enforcement (e.g., security policy enforcer 140) as described herein.
In particular, the program instructions may include various components or modules that cooperate or otherwise interact to conduct the various processes and operational scenarios described herein. The various components or modules may be embodied in compiled or interpreted instructions, or in some other variation or combination of instructions. The various components or modules may be executed in a synchronous or asynchronous manner, serially or in parallel, in a single threaded environment or multi-threaded, or in accordance with any other suitable execution paradigm, variation, or combination thereof. Software 740 may include additional processes, programs, or components, such as operating system software, virtual machine software, or other application software. Software 740 may also include firmware or some other form of machine-readable processing instructions executable by processors 710.
In general, software 740 may, when loaded into processors 710 and executed, transform a suitable apparatus, system, or device (of which computing device 700 is representative) overall from a general-purpose computing system into a special-purpose computing system customized to execute specific processing components described herein as well as process data and respond to queries. Indeed, encoding software 740 on memory 735 may transform the physical structure of memory 735. The specific transformation of the physical structure may depend on various factors in different implementations of this description. Examples of such factors may include, but are not limited to, the technology used to implement the storage media of memory 735 and whether the computer-storage media are characterized as primary or secondary storage, as well as other factors.
For example, if the computer-readable storage devices are implemented as semiconductor-based memory, software 740 may transform the physical state of the semiconductor memory when the program instructions are encoded therein, such as by transforming the state of transistors, capacitors, or other discrete circuit elements constituting the semiconductor memory. A similar transformation may occur with respect to magnetic or optical media. Other transformations of physical media are possible without departing from the scope of the present description, with the foregoing examples provided only to facilitate the present discussion.
Communication interfaces 720 may include communication connections and devices that allow for communication with other computing systems (not shown) over communication networks (not shown). Communication interfaces 720 may also support interfacing between the processing components described herein. Examples of connections and devices that together allow for inter-system communication may include network interface cards or devices, antennas, satellites, power amplifiers, RF circuitry, transceivers, and other communication circuitry. The connections and devices may communicate over communication media, such as metal, glass, air, or any other suitable medium, to exchange communications with other computing systems or networks of systems. The aforementioned media, connections, and devices are well known and need not be discussed at length here.
Communication interfaces 720 may also include associated user interface software executable by processors 710 in support of the various user input and output devices discussed below. Separately or in conjunction with each other and other hardware and software elements, the user interface software and user interface devices may support a graphical user interface, a natural user interface, or any other type of user interface, for example, which enables front-end processing, including rendering, of user interfaces, such as a user interface that is used by a user on endpoint 105. Exemplary applications and services may further be configured to interface with processing components of computing device 700 that enable input and output of other types of signals (e.g., audio output, handwritten input) in conjunction with operation of exemplary applications or services (e.g., a collaborative communication application or service, electronic meeting application or service, or the like) described herein.
Input devices 725 may include a keyboard, a mouse, a voice input device, a touch input device for receiving a touch gesture from a user, a motion input device for detecting non-touch gestures and other motions by a user, gaming accessories (e.g., controllers and/or headsets), and other comparable input devices and associated processing elements capable of receiving user input. Output devices 715 may include a display, speakers, haptic devices, and the like. In some cases, the input and output devices may be combined in a single device, such as a display capable of displaying images and receiving touch gestures. The aforementioned user input and output devices are well known in the art and need not be discussed at length here.
Communication between computing device 700 and other computing systems (not shown) may occur over a communication network or networks and in accordance with various communication protocols, combinations of protocols, or variations thereof. Examples include intranets, internets, the Internet, local area networks, wide area networks, wireless networks, wired networks, virtual networks, software defined networks, data center buses, computing backplanes, or any other type of network, combination of networks, or variation thereof. The aforementioned communication networks and protocols are well known and need not be discussed at length here. However, some communication protocols that may be used include, but are not limited to, the Internet protocol (IP, IPv4, IPv6, etc.), the transmission control protocol (TCP), and the user datagram protocol (UDP), as well as any other suitable communication protocol, variation, or combination thereof.
The computing device 700 has a power supply 730, which may be implemented as one or more batteries. The power supply 730 may further include an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries. In some embodiments, the power supply 730 may not include batteries and the power source may be an external power source such as an AC adapter.
The aforementioned discussion is presented to enable any person skilled in the art to make and use the technology disclosed and is provided in the context of a particular application and its requirements. Various modifications to the disclosed implementations will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other implementations and applications without departing from the spirit and scope of the technology disclosed. Thus, the technology disclosed is not intended to be limited to the implementations shown but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.” As used herein, the terms “connected,” “coupled,” or any variant thereof means any connection or coupling, either direct or indirect, between two or more elements; the coupling or connection between the elements can be physical, logical, or a combination thereof. Additionally, the words “herein,” “above,” “below,” and words of similar import, when used in this application, refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number, respectively. The word “or” in reference to a list of two or more items covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.
The phrases “in some embodiments,” “according to some embodiments,” “in the embodiments shown,” “in other embodiments,” and the like generally mean the particular feature, structure, or characteristic following the phrase is included in at least one implementation of the present technology and may be included in more than one implementation. In addition, such phrases do not necessarily refer to the same embodiments or different embodiments.
The above Detailed Description of examples of the technology is not intended to be exhaustive or to limit the technology to the precise form disclosed above. While specific examples for the technology are described above for illustrative purposes, various equivalent modifications are possible within the scope of the technology, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative implementations may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified to provide alternatives or subcombinations. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed or implemented in parallel or may be performed at different times. Further, any specific numbers noted herein are only examples; alternative implementations may employ differing values or ranges.