ANALYSES AND AGGREGATION OF DOMAIN BEHAVIOR FOR EMAIL THREAT DETECTION BY A CYBER SECURITY SYSTEM

NOTICE OF COPYRIGHT

A portion of this disclosure contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the material subject to copyright protection as it appears in the United States Patent & Trademark Office's patent file or records, but otherwise reserves all copyright rights whatsoever.

FIELD

Embodiments of the design provided herein generally relate to electronic mail (email) security, and in particular, a cyber security appliance configured to analyze and detect email threats based, at least in part, on Domain-Based Message Authentication, Reporting, and Conformance (DMARC) reports and/or domain data shared across a fleet of cyber security appliances.

BACKGROUND

In the cyber security environment, firewalls, endpoint security methods and other tools such as SIEMs and sandboxes are deployed to enforce specific policies and provide protection against certain threats. These tools currently form an important part of an organization's cyber defense strategy, but they are insufficient in the new age of cyber threats where intelligent threats modify their behavior and actively seek to avoid detection. Cyber threats, including email borne cyber threats, can be subtle and rapidly cause harm to a network as well as the branding of the organization, especially where third-party malicious actors represent that it is acting on behalf of an organization when, in fact, they are using the goodwill of the organization as a “hook” for a cyberattack.

Currently, cyber security protections utilize Sender Policy Framework (SPF). SPF is an authentication process that identifies authorized mail servers permitted to send an email message (generally referred to as an “email”) on behalf of a given domain of interest (generally referred to as the “monitored domain”). The SPF process helps to solve the problem of how to identify authorized and unauthorized email sources for an organization or user. When an organization sets up SPF, it helps Internet Service Providers (ISPs), email security vendors, and other email providers to validate an organization's or user's email communication and distinguish authorized communications from spoofed emails or phishing attacks attempting to impersonate association with the monitored domain. The authorized email servers for the monitored domain are identified in the SPF record, which is a type of Domain Name System (DNS) text record.

One shortcoming with the SPF process is that it validates the authenticity of an originating server by analyzing a Return-Path value for the email rather than its From header. The Return-Path is the address used by recipient email servers to communicate with the sender if a transmission error occurs (e.g., bounce back notification). Hence, an email recipient may be subject to spoofing and/or phishing as he or she may not notice the false email address (and false domain) in the From header if the malicious email is delivered.

Besides SPF, DomainKeys Identified Mail (DKIM) is another authentication process used to prevent email spoofing, where DKIM adds cryptographic signatures to emails to prevent modification of their content during transit. The DKIM authentication process may be used to avoid certain types of cyber threats that may effective if the content of the message is changed; however, DKIM is not directed to authenticating the original source (content in the From header) of the email.

A cyber security appliance deploying logic configured to analyze email authentication reports, which may include authentication status for emails propagating to/from the monitored domain (and its subdomains), is needed to enhance security against third-party email threats to the organization or user associated with the monitored domain and to counter these third-party email threats more rapidly. In efforts to further reduce the likelihood of erred determinations (e.g., false-positive or false-negative), a global intelligence data store is needed that aggregates domain data across a fleet of cyber security appliances.

SUMMARY

A cyber security appliance and its models and modules have been developed to enhance protection of a domain and all of its subdomains (generally referred to as “monitored domain”) associated with an organization or a user (generally referred to as “domain registrant”) against cyber threats including cyberattacks. In protecting a network system accessible via the monitored domain, the cyber security appliance may be a physical device or a software instance configured with software components including, but not limited or restricted to an email module, a network module, one or more machine learning (ML) models, a cyber threat analyst module, an autonomous response module, and a communication module with input/output (I/O) ports.

According to one embodiment of the disclosure, the email module includes email report analytic logic, which is configured to (i) receive one or more email authentication reports, such as a DMARC aggregate report and/or a DMARC forensic report (collectively referred to as “DMARC reports”) from an Internet Service Provider (ISP) and (ii) conduct analytics on the content of the DMARC reports to determine whether certain emails pose a threat to the network system. Herein, the email module can use the content of the DMARC reports to improve security posture and visibility by detecting, tracking, and profiling third-party services across email, Software-as-a-Service (SaaS), and/or various attack surfaces.

According to one embodiment of the disclosure, the cyber security appliance is configured to query and obtain the DMARC reports as well as obtain access to SPF records, DKIM records and/or DMARC records. Herein, the SPF record is a published record that identifies the servers authorized to send emails on behalf of the monitored domain. If an email is sourced by a server that is not identified in the SPF record associated with the monitored domain, then the emails are determined to be sent from an unauthorized server. Published on the DNS server (and publicly available), a DMARC record features instructions that guide mail receiving servers as to what action or actions are to be undertaken if an incoming email associated with the monitored domain fails authentication. The instructions correspond to a policy set as a parameter of the DMARC record for the monitored domain. Some or all of the gathered information may be used to track and profile third-party services.

More specifically, when queried, an ISP will send the DMARC reports associated with the monitored domain to an email address identified in the DMARC record pertaining to the monitored domain and configured by the domain registrant (owner). In some cases, this email address is referred to as the domain registrant's “No-Reply DMARC address.” The DMARC reports contains information to discern the authenticity of emails observed by the ISP as being sent on behalf of the monitored domain.

More specifically, the DMARC reports includes content that identifies what emails have been sent on the domain registrant's behalf and information as to whether or not these emails have passed email authentication checks, which may include DMARC, SPF and/or DKIM. The DMARC reports may be used by the cyber security appliance to improve network system health (e.g., real-time identification of cyber threats directed to the monitored domain), as well as determining whether any email authentication processes deployed by the network system, including DMARC, has been misconfigured.

The contents of the DMARC reports from the ISP may be routed to email reporting analysis logic deployed within the cyber security appliance, which is configured to determine malicious activity by third-party services (e.g., email servers controlled by a user or organization nominated to send on behalf of the domain registrant). For example, Company A resources (e.g., Salesforce® resources or LinkedIn® resources) may be permitted to send emails on behalf of the domain registrant. As a result, if any emails came through Company A's email servers and pass the DMARC check, then Company A is allowed to send emails on behalf of the user or organization. Emails from legitimate third-party services are normally allowed to continue transmission; however, given the presence of third-party services increases the known attack surface for the monitored domain, processes are needed to detect if somebody might be pretending to be the third-party vendor. The detection and alerting of new third parties through DMARC reports and adjustment of DMARC records can be used to enhance system security as well as supply chain security through identified risks and specific configuration of security tools and identifying supply chain attacks.

Additionally, the cyber security appliance is also configured to use scores and metric outputs for domains that can be aggregated across all cyber security email deployments. Unlike simple domain-wide metrics, which may include whether a domain is considered rare or popular along with the age of the domain, fleet-wide (behavioral) metrics can include more in-depth metrics such as (i) the global popularity of the particular domain, (ii) whether the domain regularly fails SPF, (iii) the kind of emails the domain sends, (iv) the number of people sending emails from the domain, (v) the style, tone, and formatting of the emails, (vi) how frequently the domain appears across all cyber security appliance deployments, and/or (vii) how frequently the domain is seen in anomalous connectivity across all of the cyber security appliance deployments.

Herein, the fleet-wide metrics can be collated and may be provided to systems to perform further analytics on the fleet-wide metrics, from all products, for an individual domain. The fleet-wide domain analysis results can be (1) accessible as a database for analysts, (2) in the form of threat feeds for clients/alerts for clients, and (3) utilized by all products for rarity scoring.

A global baseline can be seeded into new deployments of any kind by incorporating a dynamic list of domains that are considered popular and how strangely they may behave, as opposed to having a hard-coded list of a number of common domains. The fleet-wide domain analysis results can be used to detect emerging threats if a domain is seen as behaving unusually in the email domain. A domain may be seen as behaving unusually based on when it begins breaching more Artificial Intelligence (AI) models or the Internet Protocol (IP) addresses associated with the domain shows up in other streams being analyzed by the cyber security appliance.

The detection of unusual behavior can be fed into the inoculation module implemented within the cyber security appliance, which may be configured to block malicious activity on the monitored domain referenced by cyber security appliance and/or multiple domains referenced by the fleet of cyber security appliances. This can be used as a preemptive threat detection system that can allow a cyber security appliance across the fleet to respond before a corresponding user even knows they have been compromised.

The fleet-wide domain analysis can provide a query-able location from all email streams being analyzed by the cyber security appliance, everything that is understood about a domain and the domain's behavior, such as the users behind that domain in terms of emails.

These and other features of the design provided herein can be better understood with reference to the drawings, description, and claims, all of which form the disclosure of this patent application.

DRAWINGS

The drawings refer to some embodiments of the design provided herein in which:

FIG. 1A illustrates a block diagram of an embodiment of a cyber security appliance that comprises an email module evaluate the characteristics of the emails and email authentication reports to uncover malicious emails.

FIG. 1B illustrates a block diagram of the email module deployed within the cyber security appliance of FIG. 1A.

FIG. 2A is a flow diagram of illustrative operations of an embodiment of the email report analysis logic shown in FIG. 1B.

FIG. 2B is a flow diagram of illustrative operations of an embodiment of the email report analysis logic shown in FIG. 1B in determining malicious emails and policy extraction for the handling of email senders of the malicious emails.

FIG. 3 illustrates a block diagram of an embodiment of the cyber security appliance with probes and detectors monitoring email activity and network activity with a global domain intelligence data store to provide increased context for domain administrators in handling email-based cyberattacks.

FIG. 4 illustrates a flow diagram of an embodiment of the email report analysis logic utilizing DMARC reports to assist in deployment, globally track, identify and combat potentially malicious third-party behavior, and improve attack surface management.

FIG. 5 illustrates a flow diagram of the fleet-wide domain analysis for detecting cyber threats occurring across all cyber security email deployments.

FIG. 6 illustrates an example of the contents of the global domain intelligence data store.

FIG. 7 illustrates a block diagram of an embodiment of example autonomous actions that the autonomous response module can be configured to take without a human initiating that action.

FIG. 8 illustrates an example cyber security appliance protecting an example network.

FIG. 9 illustrates a block diagram of an embodiment of one or more computing devices that can be a part of an embodiment of the Artificial Intelligence based cyber security appliance discussed herein.

While the design is subject to various modifications, equivalents, and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will now be described in detail. It should be understood that the design is not limited to the particular embodiments disclosed, but—on the contrary—the intention is to cover all modifications, equivalents, and alternative forms using the specific embodiments.

DESCRIPTION

In the following description, numerous specific details associated with aspects of embodiments of the disclosure are set forth in order to provide a thorough understanding of the present design. It will be apparent, however, to one of ordinary skill in the art that the present design can be practiced without these specific details. In other instances, well known components or methods have not been described in detail, but may be represented as part of a block diagram in order to avoid unnecessarily obscuring the present design. Further, specific numeric references, such as use of the terms “first” and “second” for example, should not be interpreted as a literal sequential order. Rather, these numeric references may be used to denote different features (e.g., components, operations, functionality, etc.) that are merely exemplary. Also, the features implemented in one embodiment are not restricted to that embodiment, but rather, may be implemented in another embodiment where logically possible. The specific details can be varied from and still be contemplated to be within the spirit and scope of the present design.

In general, a cyber security appliance is configured to analyze email information to detect a cyber threat. More specifically, the cyber security appliance is configured to use Domain-Based Message Authentication, Reporting, and Conformance (DMARC) reports and DMARC records to improve security posture and visibility by detecting, tracking, and profiling third-party services across email, SaaS, and other attack surfaces.

The cyber security appliance can pull and/or query resources for information generated in accordance with a security protocol to increase email security and assist in set-up and management of these authentication processes. More specifically, the cyber security appliance is configured to utilize DMARC authentication, namely a security protocol that enhances email security by providing additional features of detecting, tracking, and profiling a third-party service that is falsely representing that it is part of an organization or authorized to operate on behalf of the organization for communications with other third-party destinations. Stated differently, DMARC authentication monitors SPF and/or DKIM authentication process results, provides a supplemental authentication process for detecting a misalignment between an email sender and the email address (From header content) as it appears to the recipient of the email, and enables domain registrants (owners) to establish policies as to how to manage emails that fail any or all of the email authentication processes (SPF/DKIM/DMARC).

Herein, the SPF authentication process, relying on the content of the SPF records, can be used to confirm the identity of an authorized server. The DKIM authentication process can be used to confirm content within the emails remains unchanged during transit, where the DKIM record (stored in a DNS record of the DNS server) includes a public key of the public-private key pair for use in digital signature validation for the email. To provide further email security, the cyber security appliance may be adapted to obtain DMARC reports generated in accordance with the DMARC authentication. The DMARC reports includes content identifying who is attempting to convey association with the monitored domain in emails. In particular, the DMARC reports includes SPF and/or DKIM authentication details (e.g., source network identifier, etc.) along with pass/fail status of such authentication. Through the collective operation of DMARC authentication along with SPF authentication and/or the DKIM authentication, with automated analytics of the DMARC reports, the cyber security appliance provides greater protection against many types of cyberattacks, including spoofing and phishing emails.

Upon detecting an email communication that has failed SPF, DKIM and/or DMARC authentication, the cyber security appliance may determine that the email is associated with a potential cyber threat. In response, the cyber security appliance may conduct one or more remedial actions (remediations) in order to protect the user or organization (e.g., enterprise network, computing device, etc.) referenced by the monitored domain against the cyber threat.

This collective security process, performed in accordance with the DMARC authentication in combination with SPF authentication and/or DKIM authentication, assists domain administrators (e.g., cyber security professionals, network administrators, etc.) in confirming misconfiguration of an email authentication process relied upon by the domain registrant and Security Operations Centers (SOC) in deciding whether certain remedial action(s) are warranted upon confirmation of attempted conveyance of a malicious email. Also, this collective security process provides a security tool for enhanced supply chain security that may be presented to potential customers. Furthermore, this collective security process is better adapted to protect the “brand” of the domain registrant, which is subjected to a greater threat due to an increased attack surface when one or more third-party vendors are authorized to send emails on behalf of an organization or user.

Additionally, the cyber security appliance is configured to use scores and metric outputs for domains, where aggregated context for email deployments within cyber security appliances across a fleet of cyber security appliances (e.g., two or more cyber security appliances) deployed to protect email systems and to detect trends that may be associated with one or more targeted campaigns of malicious emails occurring in a fleet of cyber security appliances. The aggregation of the context from the fleet of cyber security appliances increases confidence in a successful determination of a cyber threat.

A cyber security appliance deploying an email module, inclusive of email report analysis logic, can apply the techniques and mechanisms herein to provide better detection of email campaigns, better detection of spoofing or phishing cyberattacks, more generally higher fidelity, and minimizing alert fatigue.

Referring to FIG. 1A, an illustrative block diagram of an embodiment of a cyber security appliance 100 with an email module 120 and a cyber threat analyst module 125 is shown. For this embodiment, the email module 120 may be configured to conduct analytics on one or more email authentication reports 122 received from an Internet Services Provider (ISP) via a communication module 165 featuring input/output (I/O) ports. The email authentication reports 122 are used to identify certain cyber threats, including spoofing and phishing.

According to one embodiment of the disclosure, the email authentication reports may correspond to DMARC reports, namely a DMARC aggregate report 123 and/or a DMARC forensic report 124. The DMARC aggregate report 123 does not include details regarding each email's content, but rather, it includes, for each observed email, (i) a source network identifier (e.g., source IP address) of the email, (ii) SPF authentication status (pass/fail), (iii) DKIM authentication status (pass/fail), (iv) DMARC authentication status (pass/fail), and/or (v) the domain policy for emails that fail the email authentication. The DMARC forensic report 124 contains information about individual emails that fail authentication. This information may include (i) From email address, (ii) recipient email address, (iii) email source IP address, (iv) SPF and DKIM authentication results, (v) email subject line, and/or (vi) time stamp (e.g., when email received).

As stated above, the email module 120 conducts analytics on the content of the DMARC reports. The analytics may include, but are not limited or restricted to (i) determining characteristics of the emails, (ii) identifying whether any of the email authentications (DMARC/SPF/DKIM) have failed, and (iii) automatically generating results that may signal components, such as the autonomous response module 140 and/or an inoculation module 170 for example, to issue an alert or conduct a remedial action to increase security of the network system to which the domain pertains. The type of remedial action may be selected based on a threat risk determined from the DMARC reports, where the threat risk may be measured as a parameter based the frequency of emails from third-party services into or from the monitored domain that fail DMARC authentication, SPF authentication, and/or DKIM authentication.

For instance, as an illustrative example, the email module 120 is configured to conduct analytics on content associated with DMARC reports pertaining to the monitored domain. The DMARC reports 122 are received from an ISP over input/output (I/O) ports 165, where the delivery time/frequency may be set in a DMARC record set by the domain administrator. From the DMARC aggregate report 123, the email module 120 is configured to determine what emails identified as being sent from the monitored domain, if any, failed an email authentication process. This increased visibility as to which third-party services are falsely representing an affiliation associated with the monitored domain enables the domain registrant to conduct actions to stop or mitigate unauthorized use of the monitored domain.

Additionally, the email module 120 may be configured to reference some or all of the machine learning model (generally referred to as “AI model(s) 160). According to one embodiment of the disclosure, the AI model(s) 160 may be trained on email characteristics and can evaluate the characteristics of the emails, identify malicious emails, and automatically generate results that may initiate alerts or conduct remediation operations on components associated with the malicious activity.

According to another embodiment of the disclosure, some, or all of the AI model(s) 160 may be trained on a normal pattern of life of email activity and user activity associated with an email system. A determination is made of a threat risk parameter that factors in the likelihood that a chain of one or more unusual behaviors of the email activity and user activity under analysis fall outside of derived normal benign behavior. If so, the autonomous response module 140 can be used, rather than a human taking an action, to cause one or more autonomous actions to be undertaken to manage the cyber threat.

As an optional feature, in lieu of or in addition to operations by the email module 120, the cyber threat analyst module 125 may be configured to conduct analytics on the DMARC reports 122. Furthermore, for this embodiment, the cyber threat analyst module 125 is configured to conduct analytics on behaviors (user and email activities) to determine if such activities denote a cyber threat as set forth in U.S. patent application Ser. No. 18/117,348, filed Mar. 3, 2023, the entire content of which is incorporated by reference herein.

The cyber security appliance 100 can protect an email system with components including the email module 120. As shown in FIG. 1B, the email module 120 may include email threat detector logic 180, email similarity classifier logic 182, targeted campaign classifier logic 184, self-correcting similarity classifier logic 190, impersonation detector logic 192, email report analysis logic 195, and/or email authentication set-up logic 197. The operations of these logic units are described below.

More specifically, as shown in FIG. 1A, the cyber security appliance 100 is configured to protect against cyber threats from an e-mail system as well as its network. The cyber security appliance 100 features (i) a trigger module 105, (ii) a gather module 110, (iii) the email module 120, (iv) the cyber threat analyst module 125, (v) an assessment module 130, (vi) a user interface and (display) formatting module 135, (vii) the autonomous response module 140, (viii) a (local) data store 145, (ix) a network module 155, (x) a network & email coordinator module 157, and (xi) the AI models 160A-160D. The AI models 160A-160D include a first AI model 160B trained on characteristics of an email itself and its related data, a second AI model 160C trained on potential cyber threats, a third AI model 160D trained on how to conduct cyber threat investigations including parsing and analysis of email authentication reports, and one or more AI models 160A each trained on the pattern of life of different users, devices, network and email activities, and interactions between entities in the system (which includes AI models that are trained on the normal pattern of life of email activity and user activity associated with at least the email system).

The trigger module 105, operating in cooperation with the email module 120, the network module 155 and the AI model(s) 160, may detect an event (e.g., time-based signal to initiate a query for email authentication report(s), a detected options that causes initiation of the query, etc.) and/or an alert based on unusual or suspicious behavior/activity is occurring. The email module 120, further cooperating with AI model(s) 160, may be configured to conduct analytics on the content of the DMARC reports 122 as well as evaluate characteristics of the monitored emails to locate potential malicious emails, and thereafter, trigger an alert or initiate a remedial action.

Accordingly, the gather module 110 operates in response to the trigger module 105 detecting a specific events and/or alert. The content associated with the DMARC reports 122 may be gathered from the ISPs and maintained within the local data store 145. Other data that may be useful in the email security analysis (e.g., domain-based metrics, fleet-wide behavioral metrics from a global domain intelligence data store 150, etc.) as well as historic data from the local data store 145 may be collected and passed to the email module 120 and/or the cyber threat analyst module 125.

According to one embodiment of the disclosure, the email module 120, the network module 155, and the network & email coordinator module 157 may be portions of the cyber threat analyst module 125 or separate modules by themselves. In another embodiment of the disclosure, each of the email threat detector logic 180, the email similarity classifier logic 182, the targeted campaign classifier logic 184, the self-correcting similarity classifier logic 190, the impersonation detector logic 192, the email report analysis logic 195 and the email authentication set-up logic 197 forming the email module 120 may be separate logic units with the email module 120 or may be logic sub-units being part of a logic unit (e.g., email threat detector logic 180).

The cyber security appliance 100 uses various probes to collect domain data, which is provided to the local data store 145 and, in response to a triggering event (e.g., time-based prompt, change in the local data store 145, external query, etc.), at least a portion of domain data may be further provided to a global domain intelligence data store 150. According to one embodiment, the domain data may also include content available from the DMARC reports, which may identify the monitored domain along with characteristics of the emails observed to be associated with the domain as well as domain characteristics.

Additionally, or in the alternative, the email module 120, the cyber threat analyst module 125 and/or one or more of the AI models 160A-160D may be configured to receive the domain data. For this deployment, the cyber threat analyst module 125 may use the collected domain data to draw an understanding of email activity in the email system as well as updates a training for the one or more AI models 160A trained on this email system and its users. For example, email traffic can be collected by putting probe hooks into the e-mail application, the email server, such as Outlook® or Gmail® email servers, and/or monitoring the internet gateway from which the e-mails are routed through. Additionally, probes may collect network data and metrics via one of the following methods: port spanning the organization's existing network equipment; inserting or re-using an in-line network tap, and/or accessing any existing repositories of network data (see FIG. 3).

The email module 120 and the network module 155 may be configured to communicate and exchange information with the AI models 160A-160D. Additionally, the cyber threat analyst module 125 cooperates with the two or more modules to analyze the wide range of metadata from the observed email communications.

For example, the cyber threat analyst module 125 can receive an input from two or more AI modules in the cyber security appliance 100. The cyber threat analyst module 125 factors in the input from at least each of these analyses above in a wide range of metadata from observed email communications to detect and determine when a deviation from the normal pattern of life of email activity and user activity associated with the network and its email domain is occurring. In response to a deviation, the cyber threat analyst module 125 cooperates with the autonomous response module 140 to determine what autonomous action to take to remedy against a potentially malicious email. The cyber threat analyst module 125 may also reference and communicate with one or more AI models 160C trained on cyber threats in the email system. The cyber threat analyst module 125 may also reference the one or more AI models 160A that are trained on the normal pattern of life of email activity and user activity associated with the email system.

The cyber threat analyst module 125 can reference these various trained AI models 160A-160D and data from the network module 155, the email module 120, and the trigger module 105. The cyber threat analyst module 125, cooperating with the assessment module 130, may be configured to determine a threat risk parameter that factors in how the chain of unusual behaviors correlate to potential cyber threats and ‘what is a likelihood of this chain of one or more unusual behaviors of the email activity and user activity under analysis that fall outside of derived normal benign behavior;’ and thus, is malicious behavior.

Any or all of the AI models 160A-160D can be self-learning models using unsupervised learning and trained on a normal behavior of different aspects of the system, for example, email activity and user activity associated with an email system. The self-learning models of normal behavior (e.g., AI model(s) 160A) are regularly updated. The self-learning model of normal behavior is updated when new input data is received that is deemed within the limits of normal behavior. A normal behavior threshold is used by the model as a moving benchmark of parameters that correspond to a normal pattern of life for the network system. The normal behavior threshold is varied according to the updated changes in the network system allowing the model to spot behavior on the network system that falls outside the parameters set by the moving benchmark.

Referring still to FIG. 1A, the cyber security appliance 100 may also include one or more AI models 160B trained on gaining an understanding of a plurality of characteristics on an email itself and its related data including classifying the properties of the email and its metadata. The cyber threat analyst module 125 can also reference the AI model(s) 160B trained on an email itself and its related data to determine if an email or a set of emails under analysis have potentially malicious characteristics. The cyber threat analyst module 125 can also reference the AI model(s) 160C trained on cyber threats and their characteristics and symptoms to determine if an email or a set of emails under analysis are likely malicious. The cyber threat analyst module 125 can also factor this email characteristics analysis into its determination of the threat risk parameter.

The network module 155 cooperates with the AI model(s) 160A trained on a normal behavior of users, devices, and interactions between them, on a network, which is tied to the email system. The cyber threat analyst module 125 can also factor this network analysis into its determination of the threat risk parameter.

A user interface has one or more windows to display network data and one or more windows to display emails and cyber security details about those emails through the same user interface on a display screen, which allows a domain administrator to pivot between network data and email cyber security details within one platform, and consider them as an interconnected whole rather than separate realms on the same display screen.

According to the embodiment illustrate in FIG. 1A, the cyber security appliance 100 may use at least four separate AI models 160A-160D. The AI model(s) 160A may be trained on specific aspects of the normal pattern of life for the system such as devices, users, network traffic flow, outputs from one or more cyber security analysis tools analyzing the system, etc. The AI model(s) 160C may also be trained on characteristics and aspects of all manner of types of cyber threats and/or characteristics of emails themselves. The AI model(s) 160D may be trained on how and operations to perform when conducting cyber threat investigations, cyber threat analytics, and/or tuning/setting DMARC, SPF or DKIM parameters.

The email module 120 is configured to monitor email activity and the network module 155 is configured to monitor network activity, where both of these modules may be fed their data to a network & email coordinator module 157 to correlate causal links between these activities to supply this input into the cyber threat analyst module 125.

Again, the cyber threat analyst module 125 is configured to receive an input from at least each of the two or more modules above. The cyber threat analyst module 125 factors in the input from each of these analyses above to use a wide range of metadata from observed email communications to detect and determine when the deviation from the normal pattern of life of email activity and user activity associated with the network and its email domain is occurring, and then determine what autonomous action to take to remedy against a potentially malicious email. Again, the cyber threat analyst module 125 may factor in the input from each of these analyses above including comparing emails to the AI model trained on characteristics of an email itself and its related data to detect and determine when the deviation indicates a potentially malicious email.

The cyber threat analyst module 125 detects deviations from a normal pattern of life of email activity and user activity associated with the network and its email domain based on at least one or more AI models determining the normal pattern of life of email activity and user activity associated with the network and its email domain; rather than, ahead of time finding out what a ‘bad’ email signature looks like and then preventing that known bad′ email signature.

Based on analytic results from the email module 120 and/or the cyber threat analyst module 125, the cyber security appliance 100 takes actions to counter detected potential cyber threats. The autonomous response module 140, rather than a human taking an action, can be configured to cause one or more autonomous actions to be taken to contain the cyber-threat when the threat risk parameter from the cyber threat analyst module 125 is equal to or above an actionable threshold. The email module 120 and/or the cyber threat analyst module 125 cooperates with the autonomous response module 140 to cause one or more autonomous actions to be taken to contain the cyber threat, in order to improve computing devices in the email system by limiting an impact of the cyber-threat from consuming unauthorized CPU cycles, memory space, and power consumption in the computing devices via responding to the cyber-threat without waiting for some human intervention.

As further shown in FIG. 1A, the autonomous response module 140 may be configured to operate with the inoculation module 170. The inoculation module 170 can deploy defensive actions against a potentially malicious source (e.g., malicious server, etc.). The inoculation module 170 may be configured to conduct a defensive or offensive action in response to the email module 120 identifying third-party services transmitting malicious emails representing the monitored domain. The defensive actions may include alerts to cyber security appliance or a fleet-wide alert to multiple cyber security appliances, blacklisting the IP address of the malicious source to block further communications therefrom, or the like. An offensive action may include a Denial of Service (DoS) attack on the malicious server in which the cyber security appliance associated with the monitored domain exhausts the available sockets of the malicious server or a coordinated DoS attack by cyber security appliances of different domains.

Referring to FIG. 1B, the cyber security appliance 100 can have an email module 120 that features the email threat detector logic 180, the email similarity classifier logic 182, the targeted campaign classifier logic 184, the self-correcting similarity classifier logic 190, and the impersonation detector logic 192, the email report analysis logic 195, and/or the email authentication set-up logic 197 operating with the email report analysis logic 195.

In particular, the email authentication set-up logic 197 may be configured to notify and assist domain administrators in response to the email report analysis logic 195 identifying potential DNS configuration problems from the DMARC or SPF authentication results. For example, in response to determining that the number or frequency of errors exceeds a prescribed threshold (e.g., the number or frequency of false-negative or false-positive findings based on a review of results from the DMARC, DKIM or SPF authentications is greater than the prescribed threshold), the email authentication set-up logic 197 may be configured to alert a domain administrator to re-evaluate the parameters associated with these authentication processes. The email authentication set-up logic 197 may be further configured to provide guidance as to potential configuration issues based on prior modifications made in similarly situated networks and/or AI model(s) 160D, where the prior modification data is maintained (e.g., maintained in the global domain intelligence data store 150).

The email report analysis logic 195 may be configured to extract content from one or more of the DMARC reports 122 for example. Based on the extracted content, the email report analysis logic 195 provides visibility of the current threat landscape, which allows the domain administrator to alter the DNS-published DMARC record to tune its policy in handling emails that failed email authentication processes such as SPK and/or DKIM.

As an illustrative example, based on the DMARC aggregate report 123 of FIG. 1A, the email report analysis logic 195 may identify emails that have failed DMARC, SPF and/or DKIM authentication and obtain a source network identifier associate with email sender of each potentially malicious email. The source network identifier may include the source Internet Protocol (IP) address for the email sender, an autonomous system number (ASN), or the like. Thereafter, the email report analysis logic 195 may confirm that actions were taken in compliance with a policy established in the DMARC record for failure of the DMARC authentication. The email report analysis logic 195 may further determine if the email authentication failures have increased to warrant a change in DMARC policy (e.g., change from ‘none’ to ‘reject’ or “quarantine’), a change in frequency of receiving of the DMARC policy to more closely monitor email traffic associated with the monitored domain, or the like. The current DMARC policy may influence what actions are performed by the autonomous response module 140 and/or the inoculation module 170.

Furthermore, the email threat detector logic 180 can use machine learning to cluster similar emails deemed malicious. The email threat detector logic 180 has (1) an email similarity classifier logic 182 configured to analyze a group of emails, under analysis, in order to cluster emails with similar characteristics in the group of emails. The email threat detector logic 180 has (2) a targeted campaign classifier logic 184 configured to (i) analyze the clustered emails with similar characteristics to check whether the clustered emails with similar characteristics are a) coming from a same threat actor b) going to a same intended recipient, and c) any combination of both. The targeted campaign classifier logic 184 will also verify whether the clustered emails with similar characteristics are deemed malicious. The email threat detector logic 180 is configured to analyze information from the email similarity classifier logic 182 and/or the targeted campaign classifier logic 184, which when combined can cluster the corresponding emails into a campaign of malicious emails. The email threat detector logic 180 is configured to analyze information from the email similarity classifier logic 182 and/or the targeted campaign classifier logic 184 in order to provide an early warning system of a targeted campaign of malicious emails. An email threat detector logic 180 working in a corporate environment is configured to cluster inbound emails that have similar indices/metrics as well as from a same source, e.g. (1) sent from the same person, entity, group, (2) sent from and/or sent to a common specific geographic area, (3) sent from and/or sent to a common entity (i.e., an email campaign).

The email threat detector logic 180 can check when there are changes over time changes in the AI modeling and its sophisticated anomaly scoring. The early warning system in the email threat detector logic 180 can start looking for trends and anomalies within the AI model breaches. Each client's cyber security appliance has a set of AI models and policies that can be breached.

The targeted campaign classifier logic 184 can determine a likelihood that two or more highly similar emails would be (i) sent from or (ii) received by a collection of users in the email domain under analysis in the same communication or in multiple communications within a substantially simultaneous time period. The targeted campaign classifier logic 184 module can determine a likelihood that two or more highly similar emails that are being (i) sent from or (ii) received by a collection of users in the email domain under analysis in the same communication or in a given time frame, based on at least (i) historical patterns of communication between those users, and (ii) how rare the collection of users under analysis all would send and/or receive this highly similar email in roughly the substantially simultaneous time frame. The normal pattern of life of email activity and user activity associated with the network and its email domain can be used by the targeted campaign classifier logic 184 on a mass email association to create a map of associations between users in the email domain to generate the probabilistic likelihood that the two or more users would be included in the highly similar emails.

The cloud platform can aggregate those targeted campaigns of malicious emails centrally with a centralized fleet aggregator 305 (see FIG. 3). The centralized fleet aggregator 305 looks for these trends, anomalies, and then from that the centralized mechanism can drive detected trends like autonomous action responses by transmitting that information back out to local cyber security appliances deployed throughout the fleet. The aggregation of that data which is fed to an AI classifier to ascertain whether this email campaign is occurring in certain region(s), across certain industries, sent from a same entity, sent from a same geographic location, etc. The centralized fleet aggregator 305 puts this information into a format which is usable for the fleet of deployed cyber security appliances in general, as well as notices sent to marketing and customer support.

The email similarity classifier logic 182 and the targeted campaign classifier logic 184 cooperate in the email threat detector logic 180 to provide an early warning system to predict a sustained and malicious email campaign by analyzing, for example, a type of action taken by the autonomous response on a set of emails with similar overlapping features. The early warning system in the email threat detector logic 180 is configured to predict a sustained, email campaign of actually malicious emails by analyzing the type of action taken by the autonomous response on a set of emails with many overlapping features, and factoring in a pattern of email analysis occurring across a fleet of two or more cyber security appliances deployed and protecting email systems to detect trends. The “early warning” system can be a fleetwide approach that tries to detect trends across all of our deployed cyber security appliances, with the individual email threat detector logic 180 in the local cyber security appliance trying to do so on a per cyber security appliance basis. The email threat detector logic 180 can detect campaigns early, before they are written about; and thus, generate reports to the end user about a new email campaign.

One or more AI models 160A-160C communicatively couple to the email threat detector logic 180 (e.g., See FIG. 1A). The one or more AI models 160A-160C are configured to analyze the emails under analysis and then output results to detect malicious emails. The email threat detector logic 180 is configured to cooperate with the one or more AI models 160A-160C to identify emails that are deemed malicious. The existing framework of cyber threat detection via the modules and models and autonomous response via the autonomous response module 140 to mitigate the detected threat in the email domain is capable of successfully identifying and reacting to malicious emails. Autonomous action responses are decided on an email-by-email basis and include, for example, holding a message for further investigation, sending it to the junk folder, disabling hyperlinks, etc. before delivery to the destination inbox.

The email threat detector logic 180 and autonomous response module 140 can cooperate to analyze what level of autonomous action is initiated by the autonomous response module 140 to mitigate emails in the cluster of emails with similar characteristics compared to a historical norm of autonomous actions to past groups of clusters of emails with similar characteristics. The comparison is made and when different and more severe than the historical norm, then the email threat detector logic 180 can consider the different and more severe autonomous action taken on the cluster of emails as a factor indicating that the targeted campaign of malicious emails is underway. The email threat detector logic 180 uses both statistical approaches and machine learning. The email threat detector logic 180 also tracks and compares the historical data. The email threat detector logic 180 has the machine learning aspect to fit and create sensible bounds of what we would expect to see within each of these periods. The email threat detector logic 180 can look at the mean and medium as well as machine learning modeled normal pattern of behavior as bound indicators of whether there is a campaign and how serious the campaign is.

The email threat detector logic 180 can look at time periods within a given time frame to detect pretty quickly whether this email network being protected is getting a campaign of emails building up and occurring, by looking at and comparing to the machine learning averages as well as the mathematical means and median values, etc. to the current numbers. The email threat detector logic 180 can detect, for example, an uptick in more severe autonomous responses and that is indicative of building up to an ongoing email attack campaign. The uptick in severity of the autonomous responses that the autonomous response module 140 takes is more severe than what the system normally and/or historically sees in this organization's email domain. The malicious actor conducting the ongoing email attack campaign generally sends test bad emails to figure out what defenses and vulnerabilities the organization's email domain has before the full en-masse sending of emails occurs. Also, the email report analysis logic 195 can look at elapsed time periods to quickly determine whether a query is needed to DNS and/or one or more ISPs to obtain email authentication reports to ensure that the domain interaction is policed over during a prescribed time frame.

The email threat detector logic 180 and/or the email report analysis logic 195 may be configured to be communicatively coupled to a user interface, which allows a user to select a time frame to use for analysis or one or more select default time frames preconfigured for the email threat detector logic 180 and/or the email report analysis logic 195.

Although not shown, a communication module deployed within the email module 120 may be configured to encrypt and securely communicate information from the email threat detector logic 180 and/or the email report analysis logic 195 over a network to a centralized fleet aggregator 305 of FIG. 3 (external to the cyber security appliance 100) that is configured to cooperate with a database to collate metrics. The centralized fleet aggregator 305 is further configured to analyze the metrics to detect trends, including targeted campaigns of malicious emails occurring in a fleet of instances of the cyber security appliances, and then send the trend data back to the fleet of instances of the cyber security appliance in an actionable way. The centralized fleet aggregator 305 can use a tracker mechanism to track each email campaign of malicious emails or spoofing/phishing attacks determined by the email report analysis logic 195. This information may be returned back to the cyber security appliance 100 for storage in local data store 145.

Referring FIG. 2A, a flow diagram of an embodiment of the email report analysis logic 195 leveraging email authentication report content to determine malicious emails for handling by logic within the cyber security appliance is shown. Herein, the email report analysis logic 195 is configured to initiate queries to resources 200 external to the cyber security appliance 100 for one or more email authentication reports. As shown, the email report analysis logic 195 may initiate a query message 220 to one or more ISP 210 requesting the DMARC reports 122 associated with a domain identified in the query message 220 (generally referred to as “monitored domain”). The content of the DMARC reports 122 may include source network identifiers of emails that were sent from or on behalf of the monitored domain, identification of pass/fail status of one or more authentication processes by certain emails (e.g., pass/fail DMARC authentication, SPF authentication, DKIM authentication, etc.). Optionally, besides the DMARC reports 122, the email report analysis logic 195 may initiate a query message 230 to the DNS (e.g., ISP hosted or separate) for the SPF record 235 (e.g., nslookup command for SPF text record) and/or a query message 240 for the DKIM record 245.

Referring now to FIG. 2B, a flow diagram of illustrative operations of an embodiment of the email report analysis logic 195 shown in FIG. 1B in determining malicious emails and policy extraction for the handling of email senders of the malicious emails is shown. Herein, after receiving the DMARC reports (block 250), the email report analysis logic 195 is configured to extract content from the DMARC reports, such as a status (pass/fail) for DMARC, SPF and DKIM authentications conducted on emails pertaining to the monitored domain (block 255). For each email entry within the DMARC aggregate report, the email report analysis logic 195 determines whether any emails, during DMARC analysis by external resources (e.g., remote email servers), failed any of the email authentications since transmission of the prior DMARC aggregate report (blocks 260, 265 and 270). If no new failures, the email report analysis logic 195 determines there are no cyber threats directed to the monitored domain can be confirmed. However, in response to detection of one or more email authentication failures, the email report analysis logic 195 obtains the source network identifier(s) associated with the email(s) that failed the email authentication (block 280).

From the DMARC aggregate report, the email report analysis logic 195 may be configured to extract and re-evaluate the policy associated with failed authentication of an email (block 285). For example, in accordance with one embodiment of the disclosure, the policy may be uniform among the ISPs, where all email authentication failures are managed in the same fashion. However, in accordance with another embodiment of the disclosure, different policies may be assigned to different DMARC records published by different DNS servers. The different policies may be based on the characteristics associated with typical email senders using the DNS for the ISP, determined geographic location of the email sender or IP address and its affiliation to a particular DNS server, or the like. The email report analysis logic 195 may provide data to the autonomous response module 140 and/or the inoculation module 170 of FIG. 1 to take appropriate action (block 290).

Referring back to FIG. 1B, the email threat detector logic 180 includes the impersonation detector logic 192, which is configured to analyze whether either a nexus exists or just a complete mismatch exists between a display-name and an addr-spec field of an email under analysis as a factor in detecting whether the email under analysis is malicious. Namely, the mismatch determination assists the impersonation detector logic 192 to determine whether the email is potentially part of the targeted campaign of similar malicious emails—or just spam email that is regularly received by an email mailbox and does not constitute a cyber threat that corrupts a computing device and attempts to obtain their credentials and/or personal information of a user of the computing device.

In accordance with RFC 5322 & 2822 section 3.4.1, both RFCs give an example meaning of an addr-spec field of an email. An “addr-spec” field may include a specific Internet identifier that contains a locally interpreted string followed by the at-sign character (“@”, ASCII value 64) followed by an Internet domain. A mismatch of a display-name to addr-spec does not surely indicate malicious intent; many addr-specs are innocently unrelated to their sender's name, e.g., “Tony Laws sunderlandfan73@hotmail.com.” However, if an email is suspicious in another respect (e.g., it has been assigned a moderately high phishing inducement score by another AI classifier and/or AI model breach) the context renders the potentially innocent mismatch between display-name to addr-spec as additional evidence of malicious intent.

This impersonation detector logic 192 is useful in catching when the display name appears to have no relation to the email address and/or the same email address has a history of multiple different display names across different emails with little to no relationship between the display name and the email address. The impersonation detector logic 192 also has exclusions for things that do not actually look like that part of the name—title e.g., CEO, Dr., Mr., Ms., etc.

The email similarity classifier logic 182 can be configured to analyze the group of emails based on a set of multiple similar indices for the clustered emails with similar characteristics. The email similarity classifier logic 182 can be configured to cooperate with a self-correcting similarity classifier logic 190. The self-correcting similarity classifier logic 190 is configured to apply at least one of (i) a mathematical function and (ii) a graphing operation to identify outlier emails from the cluster and then remove the outlier emails from the cluster; and thus, from the targeted campaign of malicious emails, based on a variation in the output results generated by the machine learning models 160A-160C for the outlier emails compared to a remainder of the emails in the cluster of emails with similar characteristics, even though the outlier emails shared similar characteristics with the remainder of the emails in the cluster of emails with similar characteristics.

The email threat detector logic 180 can use a real-time, self-correcting similarity classifier logic 190 for emails as a factor in detecting whether an email is malicious; and therefore, potentially part of a campaign of malicious email, or not malicious and probably a spam email.

The email module 120 may contain additional modules. For example, although not shown, a similarity scoring module may be deployed, where the similarity scoring module compares an incoming email, based on a semantic similarity of multiple aspects of the email to a cluster of different metrics derived from known bad emails to derive a similarity score between an email under analysis and the cluster of different metrics derived from known bad emails. An email layout change predictor module analyzes changes in an email layout of an email of a user in that email domain to assess whether malicious activity is occurring to an email account of that user, based on the changes in the email layout of the email deviating from a historical norm. The email layout change predictor module detects anomaly deviations by considering two or more parameters of an email selected from a group consisting of a layout of the email, a formatting of the email, a structure of an email body including any of the content, language-usage, subjects, and sentence construction within the email body in order to detect a change in behavior of the email sender under analysis that is indicative of their account being compromised. An image-tracking link module cooperates with an image-tracking link detector to analyze the link properties that describe the link's visual style and appearance accompanying the link to detect whether the tracking link is intentionally being hidden as well as a type of query requests made by the tracking link to determine if this tracking link is a suspicious covert tracking link.

The cyber security appliance 100 may be hosted on a computing device, on one or more servers, and/or in its own cyber-threat appliance platform (e.g., see FIG. 3).

Referring now FIG. 3, a block diagram of an embodiment of the cyber security appliance 100 monitoring email activity and network activity based on content gathered from the cyber security appliance 100 and other resources such as the global domain intelligence data store 150 is shown. Herein, the cyber security appliance 100 operates to monitor email and/or network activity over a network system 300 and to collect and process email authentication reports from remote resources such as an ISP 310 and/or DNS server 320. The network system 300 may include the cyber security appliance 100 that is communicatively coupled to various computing devices 330 (e.g., desktop computer(s), laptop computer(s), smart phone(s), smart watch(es), wearable(s), etc.), firewall(s) 340, network infrastructure 350 (e.g., switches, routers, bridges, etc.), server(s) 360, database(s) 370, Internet gateway(s) 380 for communicatively coupling to the ISP(s) 310, DNS server 320, and/or the global domain intelligence data store 150.

Herein, as shown in FIGS. 1A and 3, the cyber security appliance 100 includes the email module 120, which uses probes, including a set of detectors, to monitor email activity. Likewise, the network module 155 uses the probes, including a set of detectors, to monitor network activity and can reference the AI model(s) 160 to identify unusual network activity by users, devices, and interactions between them or the internet which is subsequently tied to the email system. Information associated with the monitored network activity provides an additional input to the cyber threat analyst module 125 in order to determine the threat risk parameter (e.g., a score or probability) indicative of the level of threat. A particular user's network activity may be highly correlated to email activity because the network module 155 observes network activity, and the network & email coordinator module 157 may assess particular user's email activity to make an appraisal of potential email threats with a resulting threat risk parameter tailored for different users in the e-mail system. The network module tracks each user's network activity and sends that to the network & email coordinator module 157 to interconnect the network activity and email activity to closely inform one-another's behavior and appraisal of potential email threats.

Referring now to FIG. 4, a flow chart diagram of an embodiment of the email report analysis logic 195 of FIG. 1B utilizing DMARC reports to assist in the deployment, global tracking, identifying, and combating potentially malicious third-party behavior and improve attack surface management is shown. Upon receiving the DMARC reports for the monitored domain from the ISP (block 400), the email report analysis report logic performs actions on each of the DMARC reports.

More specifically, upon receiving the DMARC forensic report (block 410), the email report analysis logic determines and identifies malicious senders trying to impersonate a member of the domain or falsely conveying that it is authorized to send emails on behalf of the domain (block 420). Upon identifying the malicious senders, the email report analysis logic prompts the generation of actions to be conducted (block 430).

According to one embodiment of the disclosure, one action may involve creation and transmission of a message (e.g., email, text message, etc.) to the domain administrator to perform actions to protect the brand” of the organization or user associated with the monitored domain. Another action may involve notifying the autonomous response module of the malicious emails handle actions such as blocking further emails from the malicious sender to email servers associated with the monitored domain and/or increasing cyber security protections for certain computing devices within the network system. Yet another action is for the domain administrator to conduct employee training exercises similar to the cyber threats detected to heighten awareness of on-going cyber threats (block 430).

The email report analysis logic is further configured to receive and parse the DMARC aggregate report to understand potential failures in email authentication for emails directed to or directed from the monitored domain and to identify potentially malicious third-party services (block 440). The parsing and extraction of content provided by the DMARC aggregate report (generally referred to as “DMARC data”) may be utilized in a number of operations.

For instance, the DMARC data can be displayed to assist the domain administrator in the adjustment (e.g., reconfiguration) of the email authentication processes such as SPF, DKIM, and DMARC (block 450). For example, the DMARC data may identify failed authentications that suggest an improper setting (e.g., SPF record does not list an authorized email server) or may suggest adjustment of the DMARC policy to a more aggressive response to failed email authentications.

Additionally, or in the alternative, the DMARC data may be aggregated and the source network identifiers may be extracted from the aggregated DMARC data. The source network identifiers (e.g., source IP address) may be used to determine third-party services associated with those emails that have failed one or more of the email authentications, namely the suspected malicious emails per the DMARC aggregate report (block 460).

As an illustrative example, the source network identifiers associated with suspected malicious emails may be shared with the domain administrator in order to assist in further deployments of cyber security protections to assist in the deployment of the same (block 470). In addition, this DMARC data may assist in attack service management as the information will identify vulnerabilities in the network system associated with the monitored domain. The DMARC data may further provide insight as to additional cyber security products that could provide further protection against potential spoofing or phishing attacks that may have been uncovered by the DMARC aggregate report (block 480).

Also, the DMARC data may be utilized to globally track third-party behavior pertaining to the actors associated with the source network identifiers pertaining to the suspected malicious emails uncovered through one or more failed email authentications (block 490). This may enable a domain administrator to tailor the cyber security protections to monitor emails more closely from a certain source and/or heighten authentication protections.

Referring now to FIG. 5, an illustrative flow diagram of the fleet-wide domain analysis for detecting cyber threats based on scores and metrics outputs from domains that are aggregated across all cyber security email deployments is shown. Herein, a fleet of cyber security appliances 500, namely a plurality of cyber security appliances associated with different domains, is configured to gather information associated with email activity. The gathered information may include content extracted from the DMARC reports and/or other content gathered by each of the cyber security appliances. For example, each cyber security appliance may be configured to gather domain-wide metrics 510 such as email activity metrics associated with a given domain (e.g., total number of emails sourced by the domain, etc.), the ASN assigned to the domain registrant, authentication results for emails sent by third-party services on behalf of the monitored domain, network identifiers (IP addresses) for email activity, and other metrics. The domain-wide metrics 510 from each of the cyber security appliances are provided to the global domain intelligence data store 150.

The global domain intelligence data store 150 operates as a depository for domain-wide metrics gathered by different cyber security appliances, where the aggregate of the domain-wide metrics may be used to create fleet-wide (behavioral) metrics. Herein, the global domain intelligence data store 150 includes metric generation logic 520, which is configured to parse, aggregate, and perform statistical computations on the aggregated domain-wide metrics in order to generate fleet-wide metric that are more in-depth than the domain-wide metrics.

For instance, as an illustrative embodiment, the fleet-wide metrics may include, but are not limited or restricted to (1) global popularity of a particular domain, (2) whether emails that are sourced by the domain or sent pretending to be authorized by the domain regularly fails SPF or another email authentication process, (3) the kind of emails the domain sends, (4) the number of people sending emails from the domain, (5) the style, tone, and formatting of the emails, (6) how frequently the domain appears across all cyber security appliance deployments, and/or (7) how frequently the domain is seen in anomalous connectivity across all the cyber security appliance deployments. These fleet-wide metrics can be collated, maintained in a fleet-wide metric data store 530, and provided to other systems for further fleet-wide analytics, especially those metrics that identify unusual behaviors being detected in the domains 540.

Additionally, the fleet-wide metrics (or a portion of these metrics) are accessible in the fleet-wide metric data store 530 for analysts and other cyber security products. For example, the fleet-wide metrics may be utilized to confirm proper operations of the cyber security appliances and to provide context as supplemental information for “threat hunt” in which cyber threats captured by the cyber security appliance are coincidentally detected by an analyst during a threat hunting operation. Also, certain fleet-wide metrics can also be utilized by cyber security products for rarity scoring, where rarity may suggest operations by malicious actor attempting to gain access in the network system utilizing the domain.

As shown in FIG. 6, an illustrated example of the categorization of data within the global domain intelligence data store 150 is shown. The global domain intelligence data store 150 includes email activity metrics 600, ASN statistical metrics 610, SPF statistical metrics 620, location statistical metrics 630, tag metrics 640, connection (source) IPs metrics 650, internal contact statistical metrics 660, and external contact statistical metrics 670. The global domain intelligence data store 150 preserves these metrics and allows the cyber security appliance and other cyber security appliances access to better understand the overall threat landscape.

Herein, the global domain intelligence data store 150 may be an aggregation of metrics associated with a plurality of domains. The plurality of domains may be selected based on region (e.g., county, state, district, city, country, etc.). Hence, domains associated with domain registrants in a particular region provide their metrics to a particular global domain intelligence data store 150. Alternatively, the plurality of domains may be selected based on field of industry. Alternatively, the plurality of domains may be selected based on a subscription for access to the contents of the global domain intelligence data store 150. This aggregation allows better understanding of the overall threat landscape and perhaps additional understanding of the threat landscape associated with a particular industry or particular region.

Referring back to FIG. 5, the global domain intelligence data store 150 may be configured to update by each of the cyber security appliances and disseminated throughout the fleet. The updated data 550 is distributed to each of the cyber security appliances for local storage. A portion of the updated data 550 may be used for training of AI models and/or may be accessed by domain administrators 560. The immediate and live access to the global domain intelligence data store enables the domain administrators to develop strategies to manage cyber security threats that are ongoing and present in the current threat landscape.

Referring to FIG. 7, an illustrative block diagram of an embodiment of example autonomous actions automatically conducted by the autonomous response module 140 and/or the inoculation module 170 of FIG. 1A without a human initiating that action is shown. The autonomous response module 140 and/or the inoculation module 170 is configurable, via a user interface, to know when it should take the autonomous actions to contain a cyber threat when malicious activity is determined by the email module 120 or the cyber threat analyst module 125. According to one embodiment, the autonomous response module 140 and the inoculation module 170 operate as an administrative tool, configurable through the user interface, to program/set what autonomous actions are to perform in response to signaling from the email module 120 and/or cyber threat analyst module 125.

The following selection of example actions to be performed in response to the email module 120 receiving the DMARC reports, which identifies a third-party service is falsely representing itself as a user or a member of an organization associated with the monitored domain or authorized to operate on behalf of the user or organization for email communications. These actions may be categorized into authentication adjustment 700, cyber security defensive actions 730, and/or cyber security countermeasures 740.

More specifically, the authentication adjustment 700 may include DMARC authentication adjustment 710, SPF authentication adjustment 715, and/or DKIM authentication adjustment 720. For DMARC authentication adjustment 710, the autonomous response module 140 or the inoculation module 170 may cause the cyber security appliance 100 to alter the published DMARC record pertaining to the monitored domain. The alteration may involve changing a tag value that selects the DMARC policy in email servers handling emails that fail the DMARC authentication. For example, the alteration may change from no action (“none” policy) to a more aggressive action (“reject” or “quarantine”). Also, the frequency of the conveyance of the DMARC aggregate report may be changed to increase the reporting time from daily to multiple times per day so that the domain administrator can monitor the current threat landscape more closely.

For SPF authentication adjustment 715, the autonomous response module 140 or the inoculation module 170 may cause the cyber security appliance 100 to alter the SPF record pertaining to the monitored domain. The alteration may involve adding or removing an authorized server from its listing. Such changes may be temporary to effectively disable usage of an authorized server that may be compromised by a malicious actor. Similarly, for DKIM authentication adjustment 720, the autonomous response module 140 or the inoculation module 170 may cause the cyber security appliance 100 to alter the DKIM record pertaining to the monitored domain.

The cyber security defensive actions 730 may include actions to fortify cyber security defenses of computing devices associated with the network system associated with the monitored domain. The cyber security countermeasures 740 may include actions that are directed to neutralize operability of identified malicious servers as described above. These actions may be similar to the actions by the autonomous response module 140 and/or the inoculation module 170 in response to signaling from the cyber threat analyst module 125.

Other exemplary actions based on signaling from the cyber threat analyst module 125 may be categorized into delivery actions, attachment actions, link actions, header, and body actions, etc., which appear on the dashboard and can be taken by or at least suggested to be taken by the autonomous response module 140 and/or the inoculation module 170 when the threat risk parameter is equal to or above a configurable set point set by a domain administrator. Examples of these other actions 745 are described below.

Hold Message 750: The autonomous response module 140 has held the message before delivery due to suspicious content or attachments. Held emails can be reprocessed and released by an operator after investigation. The email will be prevented from delivery, or if delivery has already been performed, removed from the recipient's inbox. The original mail will be maintained in a buffered cache by the data store and can be recovered, or sent to an alternative mailbox, using the ‘release’ button in the user interface.

Lock Links 755: The autonomous response module 140 replaces the URL of a link such that a click of that link will first divert the user via an alternative destination. The alternative destination may optionally request confirmation from the user before proceeding. The original link destination and original source will be subject to additional checks before the user is permitted to access the source.

Convert Attachment 760: The autonomous response module 140 converts one or more attachments of this email to a safe format, flattening the file typically by converting into a PDF through initial image conversion. This delivers the content of the attachment to the intended recipient, but with vastly reduced risk. For attachments which are visual in nature, such as images, PDFs and Microsoft Office formats, the attachments will be processed into an image format and subsequently rendered into a PDF (in the case of Microsoft Office formats and PDFs) or into an image of the original file format (if an image). In some email systems, the email attachment may be initially removed and replaced with a notification informing the user that the attachment is undergoing processing. When processing is complete the converted attachment will be inserted back into the email.

Double Lock Links 765: The autonomous response module 140 replaces the URL with a redirected Email link. If the link is clicked, the user will be presented with a notification to that user that they are not permitted to access the original destination of the link. The user will be unable to follow the link to the original source, but their intent to follow the link will be recorded by the data store via the autonomous response module 140.

Strip Attachments 770: The autonomous response module 140 strips one or more attachments of this email. Most file formats are delivered as converted attachments; file formats which do not convert to visible documents (e.g., executables, compressed types) are stripped to reduce risk. The ‘Strip attachment’ action will cause the system to remove the attachment from the email and replace it with a file informing the user that the original attachment was removed.

Junk action 775: The autonomous response module 140 will ensure the email classified as junk or other malicious email is diverted to the recipient's junk folder, or other nominated destination such as ‘quarantine’.

The types of actions and specific actions conducted by the autonomous response module 140 and/or the inoculation module 170 may be customizable for different users and parts of the system; and thus, configurable for the domain administrator to approve/set for the autonomous response module 140 and/or the inoculation module 170 to automatically take those actions and when to automatically take those actions.

For instance, the autonomous response module 140 may have access to a library of response action types of actions and specific actions the autonomous response module 140 is capable of, including focused response actions selectable through the user interface that are contextualized to autonomously act on specific email elements of a malicious email, rather than a blanket quarantine or block approach on that email, to avoid business disruption to a particular user of the email system. The autonomous response module 140 is able to take measured, varied actions towards those email communications to minimize business disruption in a reactive, contextualized manner.

The autonomous response module 140 may work hand-in-hand with the AI models to neutralize malicious emails, and deliver preemptive protection against targeted, email-borne attack campaigns in real time.

The cyber threat analyst module 125 cooperating with the autonomous response module 140 can detect and contain, for example, an infection in the network, recognize that the infection had an email as its source, and identify and neutralize that malicious email by either removing that from the corporate email account inboxes, or simply stripping the malicious portion of that before the email reaches its intended user. The autonomous actions range from flattening attachments or stripping suspect links, through to holding emails back entirely if they pose a sufficient risk.

The cyber threat analyst module 125 can identify the source of the compromise and then invoke an autonomous response action by sending a request to the autonomous response model. This autonomous response action will rapidly stop the spread of an emerging attack campaign and give human responders the crucial time needed to catch up.

In an embodiment, initially, the autonomous response module 140 can be run in human confirmation mode—all autonomous, intelligent interventions must be confirmed initially by a human operator. As the autonomous response module 140 refines and nuances its understanding of an organization's email behavior, the level of autonomous action can be increased until no human supervision is required for each autonomous response action. Most security teams will spend little time in the user interface once this level is reached. At this time, the autonomous response module 140 response action neutralizes malicious emails without the need for any active management. The autonomous response module 140 may take one or more proactive or reactive action against emails, which are observed as potentially malicious. Actions are triggered by threat alerts or by a level of anomalous behavior as defined and detected by the cyber-security system and offer highly customizable, targeted response actions to email threats that allows the end user to remain safe without interruption. Suspect email content can be held in full, autonomously with selected users exempted from this policy, for further inspection or authorization for release. User behavior and notable incidents can be mapped, and detailed, comprehensive email logs can be filtered by a vast range of metrics compared to the model of normal behavior to release or strip potentially malicious content from the email.

Referring now to FIG. 8, an example of the AI-based cyber security appliance 100 using a cyber threat analyst module 125 to protect an example network is illustrated. The example network systems 850 uses a cyber security appliance 100. The system depicted is a simplified illustration, which is provided for ease of explanation. The network system 850 comprises a first computer system 810 within a building, which uses the threat detection system to detect and thereby attempt to prevent threats to computing devices within its bounds.

The first computer system 810 comprises three computers 1, 2, 3, a local server 4, and a multifunctional device 5 that provides printing, scanning and facsimile functionalities to each of the computers 1, 2, 3. All of the devices within the first computer system 810 are communicatively coupled via a Local Area Network 6. Consequently, all of the computers 1, 2, 3 are able to access the local server 4 via the LAN 6 and use the functionalities of the MFD 5 via the LAN 6.

The LAN 6 of the first computer system 810 is connected to the Internet 820, which in turn provides computers 1, 2, 3 with access to a multitude of other computing devices 18 including server 830 and second computer system 840. The second computer system 840 also includes two computers 41, 42, connected by a second LAN 43.

In this exemplary embodiment of the cyber security appliance 100, computer 1 on the first computer system 810 has the electronic hardware, modules, models, and various software processes of the cyber security appliance 100; and therefore, runs threat detection for detecting threats to the first computer system. As such, the computer system includes one or more processors arranged to run the steps of the process described herein, memory storage components required to store information related to the running of the process, as well as a network interface for collecting the required information for the probes and other sensors collecting data from the network under analysis.

The cyber security appliance 100 in computer 1 builds and maintains a dynamic, ever-changing model of the ‘normal behavior’ of each user and machine within the computer system 810. The approach is based on Bayesian mathematics, and monitors all interactions, events, and communications within the computer system 810—which computer is talking to which, files that have been created, networks that are being accessed.

For example, computer 2 is based in a company's San Francisco office and operated by a marketing employee who regularly accesses the marketing network, usually communicates with machines in the company's U.K. office in second computer system 840 between 9.30 AM and midday, and is active from about 8:30 AM until 6 PM.

The same employee virtually never accesses the employee time sheets, very rarely connects to the company's Atlanta network, and has no dealings in South-East Asia. The security appliance takes all the information that is available relating to this employee and establishes a ‘pattern of life’ for that person and the devices used by that person in that system, which is dynamically updated as more information is gathered. The model of the normal pattern of life for an entity in the network under analysis is used as a moving benchmark, allowing the cyber security appliance 100 to spot behavior on a system that seems to fall outside of this normal pattern of life, and flags this behavior as anomalous, requiring further investigation and/or autonomous action.

The cyber security appliance 100 is built to deal with the fact that today's attackers are getting stealthier, and an attacker/malicious agent may be ‘hiding’ in a system to ensure that they avoid raising suspicion in an end user, such as by slowing their machine down.

The Artificial Intelligence model(s) in the cyber security appliance 100 builds a sophisticated ‘pattern of life’—that understands what represents normality for every person, device, and network activity in the system being protected by the cyber security appliance 100.

The self-learning algorithms in the AI can, for example, understand each node's (user account, device, etc.) in an organization's normal patterns of life in about a week, and grows more bespoke with every passing minute. Conventional AI typically relies solely on identifying threats based on historical attack data and reported techniques, requiring data to be cleansed, labelled, and moved to a centralized repository. The detection engine self-learning AI can learn “on the job” from real-world data occurring in the system and constantly evolves its understanding as the system's environment changes. The Artificial Intelligence can use machine learning algorithms to analyze patterns and ‘learn’ what is the ‘normal behavior’ of the network by analyzing data on the activity on the network at the device and employee level. The unsupervised machine learning does not need humans to supervise the learning in the model but rather discovers hidden patterns or data groupings without the need for human intervention. The unsupervised machine learning discovers the patterns and related information using the unlabeled data monitored in the system itself. Unsupervised learning algorithms can include clustering, anomaly detection, neural networks, etc. Unsupervised Learning can break down features of what it is analyzing (e.g., a network node of a device or user account), which can be useful for categorization, and then identify what else has similar or overlapping feature sets matching to what it is analyzing.

The cyber security appliance 100 can use unsupervised machine learning to works things out without pre-defined labels. In the case of sorting a series of different entities, such as animals, the system analyzes the information and works out the different classes of animals. This allows the system to handle the unexpected and embrace uncertainty when new entities and classes are examined. The modules and models of the cyber security appliance 100 do not always know what they are looking for, but can independently classify data and detect compelling patterns.

The cyber security appliance 100's unsupervised machine learning methods do not require training data with pre-defined labels. Instead, they are able to identify key patterns and trends in the data, without the need for human input. The advantage of unsupervised learning in this system is that it allows computers to go beyond what their programmers already know and discover previously unknown relationships. The unsupervised machine learning methods can use a probabilistic approach based on a Bayesian framework. The machine learning allows the cyber security appliance 100 to integrate a vast number of weak indicators/low threat values by themselves of potentially anomalous network behavior to produce a single clear overall measure of these correlated anomalies to determine how likely a network device is to be compromised. This probabilistic mathematical approach provides an ability to understand valuable information, amid the noise of the network—even when it does not know what it is looking for.

The cyber security appliance 100 can use a Recursive Bayesian Estimation to combine these multiple analyzes of different measures of network behavior to generate a single overall/comprehensive picture of the state of each device, the cyber security appliance 100 takes advantage of the power of Recursive Bayesian Estimation (RBE) via an implementation of the Bayes filter.

Using RBE, the cyber security appliance 100's AI models are able to constantly adapt themselves, in a computationally efficient manner, as new information becomes available to the system. The cyber security appliance 100's AI models continually recalculate threat levels in the light of new evidence, identifying changing attack behaviors where conventional signature-based methods fall down.

Training a model can be accomplished by having the model learn good values for all of the weights and the bias for labeled examples created by the system, and in this case; starting with no labels initially. A goal of the training of the model can be to find a set of weights and biases that have low loss, on average, across all examples.

The AI classifier can receive supervised machine learning with a labeled data set to learn to perform their task as discussed herein. An anomaly detection technique that can be used is supervised anomaly detection that requires a data set that has been labeled as “normal” and “abnormal” and involves training a classifier. Another anomaly detection technique that can be used is an unsupervised anomaly detection that detects anomalies in an unlabeled test data set under the assumption that the majority of the instances in the data set are normal, by looking for instances that seem to fit least to the remainder of the data set. The model representing normal behavior from a given normal training data set can detect anomalies by establishing the normal pattern and then evaluate the likelihood of a test instance under analysis to be generated by the model. Anomaly detection can identify rare items, events or observations which raise suspicions by differing significantly from the majority of the data, which includes rare objects as well as things like unexpected bursts in activity.

Referring to FIG. 9, an illustrative block diagram of an embodiment of one or more computing devices that can be a part of an embodiment of the AI-based cyber security appliance 100 discussed is shown. Herein, the computing device may include one or more processors or processing units 920 to execute instructions, one or more memories 930-932 to store information, one or more data input components 960-963 to receive data input from a user of the computing device 900, one or more modules that include the management module, a network interface 9 communication circuit) 970 to establish a communication link to communicate with other computing devices external to the computing device, one or more sensors where an output from the sensors is used for sensing a specific triggering condition and then correspondingly generating one or more preprogrammed actions, a display monitor 991 to display at least some of the information stored in the one or more memories 930-932 and other components. Note, portions of this design implemented in software 944, 945, 946 are stored in the one or more memories 930-932 and are executed by the one or more processing units 920. The processing unit(s) 920 may have one or more processing cores, which couples to a system bus 921 that couples various system components including the system memory 930. The system bus 921 may be any of several types of bus structures selected from a memory bus, an interconnect fabric, a peripheral bus, and a local bus using any of a variety of bus architectures.

Computing device 900 typically includes a variety of non-transitory storage medium. The non-transitory storage medium can be any available media that can be accessed by computing device 900 and includes both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, the non-transitory storage medium may include, but is not limited to, a programmable circuit; a semiconductor memory; non-persistent storage such as volatile memory (e.g., any type of random access memory “RAM”); persistent storage such as non-volatile memory (e.g., read-only memory “ROM”, power-backed RAM, flash memory, phase-change memory, or other memory technologies), a solid-state storage, a hard disk storage, an optical disc storage, a portable memory device, or storage instances.

The methods and systems shown in the Figures and discussed in the text herein can be coded to be performed, at least in part, by one or more processing components with any portions of software stored in an executable format on a non-transitory storage medium. Thus, any portions of the method, apparatus and system implemented as software can be stored in one or more non-transitory storage mediums in an executable format to be executed by one or more processors.

A network system can be, wholly or partially, part of one or more of the server or client computing devices in accordance with some embodiments. Components of the network system can include, but are not limited to, a processing unit having one or more processing cores, a system memory, and a system bus that couples various system components including the system memory to the processing unit.

In an example, a volatile memory drive 941 is illustrated for storing portions of the software such as operating system 944, application programs 945, other executable software 946, and program data 947.

A user may enter commands and information into the computing device 900 through input devices such as a keyboard, touchscreen, or software or hardware input buttons 962, a microphone 963, a pointing device and/or scrolling input component, such as a mouse, trackball, or touch pad 961. The microphone 963 can cooperate with speech recognition software. These and other input devices are often connected to the processing unit 920 through a user input interface 960 that is coupled to the system bus 921, but can be connected by other interface and bus structures, such as a lighting port, game port, or a universal serial bus (USB). A display monitor 991 or other type of display screen device is also connected to the system bus 921 via an interface, such as a display interface 990. In addition to the display monitor 991, computing devices may also include other peripheral output devices such as speakers 997, a vibration device 999, and other output devices, which may be connected through an output peripheral interface 995.

The computing device 900 can operate in a networked environment using logical connections to one or more remote computers/client devices, such as a remote computing device 980. The remote computing device 980 can a personal computer, a mobile computing device, a server, a router, a network PC, a peer device, or other common network node, and typically includes many or all of the elements described above relative to the computing device 900. The logical connections can include a personal area network (PAN) 972 (e.g., Bluetooth®), a local area network (LAN) 971 (e.g., Wi-Fi), and a wide area network (WAN) 973 (e.g., cellular network). Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. A browser application and/or one or more local apps may be resident on the computing device and stored in the memory.

When used in a LAN networking environment, the computing device 900 is connected to the LAN 971 through the network interface 970, which can be, for example, a Bluetooth® or Wi-Fi adapter. When used in a WAN networking environment (e.g., Internet), the computing device 900 typically includes some means for establishing communications over the WAN 973. With respect to mobile telecommunication technologies, for example, a radio interface, which can be internal or external, can be connected to the system bus 921 via the network interface 970, or other appropriate mechanism. In a networked environment, other software depicted relative to the computing device 900, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, remote application programs 985 as reside on remote computing device 980. It will be appreciated that the network connections shown are examples and other means of establishing a communications link between the computing devices that may be used. It should be noted that the present design can be conducted on a single computing device or on a distributed system in which different portions of the present design are conducted on different parts of the distributed network system.

Note, an application described herein includes but is not limited to software applications, mobile applications, and programs routines, objects, widgets, plug-ins that are part of an operating system application. Some portions of this description are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to convey the substance of their work most effectively to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. These algorithms can be written in a number of different software programming languages such as Python, C, C++, Java, HTTP, or other similar languages. Also, an algorithm can be implemented with lines of code in software, configured logic gates in hardware, or a combination of both. In an embodiment, the logic consists of electronic circuits that follow the rules of Boolean Logic, software that contain patterns of instructions, or any combination of both. A module may be implemented in hardware electronic components, software components, and a combination of both. A module is a core component of a complex system consisting of hardware and/or software that is capable of performing its function discretely from other portions of the entire complex system but designed to interact with the other portions of the entire complex system. Note, many functions performed by electronic hardware components can be duplicated by software emulation. Thus, a software program written to accomplish those same functions can emulate the functionality of the hardware components in the electronic circuitry.

Unless specifically stated otherwise as apparent from the above discussions, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers, or other such information storage, transmission or display devices.

The term “message” generally refers to as information placed in a prescribed format that is transmitted in accordance with a suitable delivery protocol or accessible through a logical data structure such as an Application Programming Interface (API) or a web service or service such as a portal. Examples of a message may include one or more packets, frames, header/body data structure, or any other series of bits having the prescribed, structured format.

The term “coupled” is defined as meaning connected either directly to the component or indirectly to the component through another component.

While the foregoing design and embodiments thereof have been provided in considerable detail, it is not the intention of the applicant(s) for the design and embodiments provided herein to be limiting. Additional adaptations and/or modifications are possible, and, in broader aspects, these adaptations and/or modifications are also encompassed. Accordingly, departures may be made from the foregoing design and embodiments without departing from the scope afforded by the following claims, which scope is only limited by the claims when appropriately construed.

	Number	Date	Country
	63350781	Jun 2022	US
	63396105	Aug 2022	US

ANALYSES AND AGGREGATION OF DOMAIN BEHAVIOR FOR EMAIL THREAT DETECTION BY A CYBER SECURITY SYSTEM

Information

Publication Number

Date Filed

Date Published

Inventors

Original Assignees

CPC

International Classifications

Abstract

Description

Claims

RELATED APPLICATIONS

Provisional Applications (2)