The present invention relates generally to providing an indicator on a user agent (e.g., a web browser) indicating a level of security or trust for a particular website. More particularly, the present invention is directed to evaluating multiple aspects indicative of security and trustworthiness of a website, and combining these evaluations to classify an overall risk level in connection with the site.
Computer users typically use user agent applications to access documents or data resources that are available over a computer network to which their computer is connected. Such resources are identified by a Uniform Resource Identifier (URI), usually a Uniform Resource Locator (URL), which identifies the resource uniquely and provides the information necessary for locating and accessing the resource. A web browser is a type of user agent commonly used to navigate the World Wide Web (i.e., the system of interlinked hypertext documents accessible on the Internet), in order to access a particular information resource (or “webpage”) and present it to the user.
However, many information resources are made available on the World Wide Web with malicious intent. As one example, “phishing” refers to a technique whereby a webpage masquerades as a popular and trustworthy website such as a bank site, an auction site, or a retail shopping website. Some phishing sites attempt to trick a computer user into providing confidential information (e.g., password, account number, social security number, etc.). Other phishing sites try to cause the user to download “malware,” i.e., malicious software which is intended to disrupt operation of the user's computer or gather sensitive information therefrom. Examples of malware include computer viruses, worms, spyware, and Trojan horses.
Existing user agents such as web browsers are capable of notifying a computer user of certain attributes of a data resource or website which are indicative of a level of security and a possibility of risk. For instance, it is known for a browser to display a padlock icon in the address field for websites that utilize encryption, according to the Hypertext Transfer Protocol Secure (HTTPS) scheme, for site authentication and bidirectional encryption. Also, existing browsers are able to display an alert when the user attempts to navigate to a known phishing or malware site.
While existing browsers are capable of evaluating different characteristics of a website relating to security or risk, such browsers present these evaluations as separate pieces of information, without any attempt to combine them into a single assessment of the risk level associated with the website.
The present invention is directed toward a computer implemented method and a device for evaluating and quantifying several aspects of security and risk associated with accessing a network data resource (e.g., webpage), and combining these quantifications to classify an overall safety level of the data resource. Particularly, a user agent (e.g., web browser) may be programmed to display this safety level to the user when an attempt is made to visit or access the data resource. In further embodiments of the invention, the user agent may also implement a precautionary measure if deemed appropriate based on the level of risk. Such precautionary measure may be selected from displaying a warning to the user and blocking access to the data resource. In addition, the user agent may display the individual quantifications to the user in combination with the overall level of risk.
The present invention is directed toward a computer implemented method and a device for classifying the safety level, or level of risk, involved with accessing a particular data resource on a network. The method may typically be implemented as part of a user agent, e.g., a web browser, to analyze the safety (risk) level associated with a data resource (e.g., webpage or website).
Specific embodiments are described hereinafter in which the user agent is a web browser for accessing particular data resources, such as webpages or documents, over the Internet using the HTTP (Hypertext Transfer Protocol). However, the principles described hereinafter could more broadly be applied to other types of user agents, data resources, and networks. Therefore, when the terms “browser,” “web browser,” and the like are used hereinafter to describe various aspects or principles of the present invention, it should be recognized that such aspects or principles are also applicable to other types of user agents. Likewise, when terms such as “webpage,” “document,” “website,” and the like are used hereinafter to describe certain aspects or principles of the invention, the same aspects or principles are also applicable to other types of data resources.
According to an exemplary embodiment, the user agent or browser is programmed to provide each active browsing context (e.g., tab or window) with its own “security advisor,” i.e., the capability of classifying the safety level of the webpage to which the tab/window has been directed. This security advisor may further consist of numerous “observers,” each having the task of monitoring, discovering, and/or relaying a specific type of information of interest. The information gathered by the observers is used to update a transitory knowledge base of the active page. Further, the security advisor contains multiple quantifiers, each of which is programmed to analyze and evaluate a particular aspect of security or trust in regard to the webpage or site, based on the current body of knowledge as maintained in the transitory knowledge base. Through the use of these quantifiers, multiple categories relating to security or trustworthiness may be evaluated. These categories may include, e.g., Encryption, Certificate, Familiarity, Reputation, and Inquisitiveness of the site. There may be a quantifier for each of these categories. Further, as the name implies, each of the quantifiers may quantify its evaluation, in order to produce a score for the corresponding category.
In addition to the observers, transitory knowledge base, and quantifiers, the security advisor may include a “risk assessor.” This risk assessor receives the scores from the quantifiers and combines them into a single classification of the safety level. According to a particular exemplary embodiment, this risk assessor may be programmed to apply a set of rules to the scores, in order to classify the safety level associated with the webpage or site. Also, in addition to classifying the safety level, the risk assessor may also determine whether a precautionary measure needs to be taken based on the application of the rules. For instance, if the safety level is classified as SUSPICIOUS or UNSAFE, the risk assessor may decide that it is necessary either to display a warning to the user or block access to the webpage or site.
According to a further exemplary embodiment, one or more of the observers in the security advisor may continuously react to events (e.g., the arrival of network data, changes in the webpage structure, or even a regular timer) by updating the transitory knowledge base, even after a webpage or site has been classified as safe. Further, whenever the body of knowledge of the active webpage (as stored in the transitory knowledge base) changes, this may trigger a new evaluation by one or more of the quantifiers, and also a new classification of the safety level by the risk assessor. This allows for a real-time analysis of the safety level of the webpage or site so that, whenever an observer makes a discovery that might affect the safety level, this information would trigger a new analysis of the risk involved with the webpage.
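By way of non-limiting illustration, the re-evaluation behavior described above may be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the names `SecurityAdvisor`, `observe`, `encryption_quantifier`, and `toy_classifier` are hypothetical, and the single toy quantifier and its threshold are invented for the example rather than taken from any embodiment.

```python
from typing import Any, Callable, Dict, Optional

Knowledge = Dict[str, Any]
Quantifier = Callable[[Knowledge], int]

_MISSING = object()  # sentinel: distinguishes "no entry yet" from a stored None


class SecurityAdvisor:
    """Hypothetical per-tab advisor that re-classifies when knowledge changes."""

    def __init__(self, quantifiers: Dict[str, Quantifier],
                 classify: Callable[[Dict[str, int]], str]) -> None:
        self.knowledge: Knowledge = {}          # transitory knowledge base
        self.quantifiers = quantifiers          # one quantifier per category
        self.classify = classify                # stands in for the risk assessor
        self.safety_level: Optional[str] = None
        self.evaluations = 0                    # counts triggered evaluations

    def observe(self, key: str, value: Any) -> None:
        """Called by an observer whenever it discovers or relays a fact."""
        if self.knowledge.get(key, _MISSING) == value:
            return  # body of knowledge unchanged: no new evaluation
        self.knowledge[key] = value
        self._reevaluate()

    def _reevaluate(self) -> None:
        scores = {name: q(self.knowledge) for name, q in self.quantifiers.items()}
        self.safety_level = self.classify(scores)
        self.evaluations += 1


def encryption_quantifier(knowledge: Knowledge) -> int:
    # Toy scoring, assumed for illustration only.
    return 10 if knowledge.get("scheme") == "https" else 0


def toy_classifier(scores: Dict[str, int]) -> str:
    # Toy rule, assumed for illustration only.
    return "SAFE" if scores["encryption"] >= 5 else "RISKY"


advisor = SecurityAdvisor({"encryption": encryption_quantifier}, toy_classifier)
advisor.observe("scheme", "https")    # first discovery triggers an evaluation
level_after_first = advisor.safety_level
advisor.observe("scheme", "https")    # same fact again: nothing is re-evaluated
advisor.observe("scheme", "http")     # changed fact triggers a new evaluation
```

In this sketch a page that was initially classified as safe is re-classified as soon as an observer reports a changed fact, which mirrors the real-time analysis described above.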
The aforementioned components including the security advisor, observers, and risk assessor may be coded as any combination of rules, functions, routines, and/or sub-routines programmed into a user agent, or alternatively may be coded separately from the user agent, e.g., as an “add-on.”
The memory 102, which may include ROM, RAM, flash memory, hard drives, or any other combination of fixed and removable memory, stores the various software components of the system. The software components in the memory 102 may include a basic input/output system (BIOS) 141, an operating system 142, various computer programs 143 including applications and device drivers, various types of data 144, and other executable files or instructions such as macros and scripts 145. For instance, the computer programs 143 stored within the memory 102 may include any number of applications, including the user agent and other program(s) used for implementing the principles of the present invention, as well as any other programs (e.g., widgets) designed to be executed in a user agent environment.
In
The video interface device 104 is connected to a display unit 120 which may be an external monitor or an integrated display such as an LCD display. The display unit 120 may have a touch sensitive screen and in that case the display unit 120 doubles as a user input device. The user input device aspects of the display unit 120 may be considered as one of the local devices 110 communicating over a communication port 103.
The network interface device 105 provides the device 100 with the ability to connect to a network in order to communicate with a remote device 130. The communication network, which in
It will be understood that the device 100 illustrated in
In an exemplary embodiment, various aspects of the present invention may be incorporated into, or used in connection with, the components and/or functionality making up a user agent or browser installed as an application on a device 100.
The user agent or browser 200 presents the user with a user interface 201 that may be displayed on the display unit 120 shown in
In any case, the URL may be received by a window and input manager 203 that represents the input part of the user interface 201 associated with, or part of, the user agent 200. The URL may then be forwarded to a document manager 204, which manages the data received as part of the document identified by the URL.
The document manager 204 forwards the URL to a URL manager 205, which instructs a communication module 206 to request access to the identified resource. The communication module 206 may be capable of accessing and retrieving data from a remote device 130 such as a server over a network using the hypertext transfer protocol (HTTP), or some other protocol such as HTTP Secure (HTTPS) or file transfer protocol (FTP). The communication module 206 may also be capable of accessing data that is stored in local memory 102.
If communication outside the device 100 is required to be encrypted, e.g. as specified by the protocol used to access the URL, encryption/decryption module 207 handles communication between the URL manager 205 and the communication module 206.
The data received by the communication module 206 in response to a request is forwarded to the URL manager 205. The URL manager 205 may then store a copy of the received content in local memory 102 using a cache manager 208 which administers a document and image cache 209. If the same URL is requested at a later time, the URL manager 205 may request it from the cache manager 208, which will retrieve the cached copy from the cache 209 (unless the cached copy has been deleted) and forward the cached copy to the URL manager 205. Accordingly, it may not be necessary to retrieve the same data again from a remote device 130 when the same URL is requested a second time.
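By way of non-limiting illustration, the cache lookup described above may be sketched in Python as follows. The class names, method names, and the `fake_fetch` stand-in for the communication module are hypothetical and are used only to show the order of operations.

```python
from typing import Callable, Dict, Optional


class CacheManager:
    """Hypothetical stand-in for a cache manager administering a cache."""

    def __init__(self) -> None:
        self._cache: Dict[str, bytes] = {}

    def get(self, url: str) -> Optional[bytes]:
        return self._cache.get(url)

    def put(self, url: str, content: bytes) -> None:
        self._cache[url] = content

    def evict(self, url: str) -> None:
        self._cache.pop(url, None)


class UrlManager:
    """Hypothetical stand-in for a URL manager that consults the cache first."""

    def __init__(self, fetch_remote: Callable[[str], bytes],
                 cache: CacheManager) -> None:
        self._fetch_remote = fetch_remote
        self._cache = cache
        self.remote_requests = 0  # counts retrievals from the remote device

    def load(self, url: str) -> bytes:
        cached = self._cache.get(url)
        if cached is not None:
            return cached                      # served from the cached copy
        content = self._fetch_remote(url)      # cached copy absent or deleted
        self.remote_requests += 1
        self._cache.put(url, content)
        return content


def fake_fetch(url: str) -> bytes:
    # Placeholder for a network request; returns fixed content.
    return b"<html></html>"


cache = CacheManager()
manager = UrlManager(fake_fetch, cache)
manager.load("http://example.com/")   # retrieved remotely, then cached
manager.load("http://example.com/")   # served from the cache
requests_before_evict = manager.remote_requests
cache.evict("http://example.com/")    # cached copy deleted
manager.load("http://example.com/")   # retrieved remotely again
```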
The URL manager 205 forwards the data received from the communication module 206 or cache 209 to a parser 210 capable of parsing content such as HTML, XML and CSS. The parsed content may then, depending on the type and nature of the content, be processed further by an ECMAScript engine 211, a module for handling a document object model (DOM) structure 212, and/or a layout engine 213.
This processing of the retrieved content is administered by the document manager 204, which may also forward additional URL requests to the URL manager 205 as a result of the processing of the received content. These additional URLs may, e.g., specify images or other additional files that should be embedded in the document specified by the original URL.
When the data representing the content of the specified document has been processed it is forwarded from the document manager 204 in order to be rendered by a rendering engine 214 and displayed on the user interface 201.
The various modules thus described are executed by the CPU 101 of device 100 as the CPU 101 receives instructions and data over the system bus(es) 106. The communication module 206 communicates with the remote device 130 using the network interface 105. The functionality of various modules in
It will further be understood that, while the user agent 200 described above may be implemented as an application program 143, some of the user agent's 200 functionality may also be implemented as part of the operating system 142 or even the BIOS 141 of the device 100. The content received in response to a URL request may be data 144, script 145, or a combination thereof as further described below.
Principles of the present invention will be described below in connection with the particular example of a web browser 200 as illustrated in
Reference is now made to
The term “security advisor” is used herein to describe each implementation of the process 30 illustrated in
Referring to
After the relevant URL is obtained, the corresponding webpage or website is evaluated according to operation S310. Particularly, according to S310, various pieces of data which might be relevant to the security or risk involved with the active website or page may be collected by functional units which are called “observers” in this specification, and then multiple categories relating to security or risk are evaluated according to the collected data. The term “quantifier” is used in this specification to describe the functionality whereby each one of these categories is evaluated. As such, multiple observers and quantifiers may be employed by the security advisor to perform the evaluations of S310.
The types of data which may be collected by the observers in S310 may include e.g., document content or patterns therein, details of security protocols (e.g., Secure Sockets Layer (SSL)) utilized by the site, details on security policy declared by the site (e.g., HSTS), results of site authentication, characteristics of the web server hosting the site, whether the site attempts to collect geolocation data from the browser 200, details of the site's certificate chain, the site's reputation among third parties, and/or whether any of the site's software matches a malware registry. Other types of information collected by observers may include user input actions while browsing the site, the user's browsing history information, and details regarding the browser's network or Wi-Fi connection.
The set of categories which are evaluated by quantifiers in S310 may include, e.g., encryption employed by the site, nosiness or inquisitiveness of the site, familiarity with the site, reputation, quality and validity of the site's certificate chain, content of the webpage, or any subset thereof. Other categories which are not listed herein may also be evaluated by a quantifier. Further, in performing its evaluation, each quantifier may be configured to perform one or more of the following analyses: look for a particular pattern in the content or structure of the document residing at the website, examine details of a certificate employed by the site, ask a third party's opinion of the site, analyze a history of past evaluations and/or risk classifications of the site, compare a “fingerprint” of the site with a previous visit, etc.
The score produced by each quantifier may be a numeric quantification, but this is not strictly necessary. It is contemplated that a quantifier could produce another type of score which expresses a degree to which a certain characteristic or quality is exhibited by the webpage or site. This could be done by choosing from a finite set of choices, which is representative of a range or scale. An example of this is a quantifier which produces a score by choosing one of HIGH, MEDIUM, and LOW. Another example is a quantifier which chooses one of POOR, AVERAGE, and EXCELLENT. It is not strictly necessary for this set of choices to be representative of a range or scale, as long as the score is obtained by selecting from a predetermined set of choices. E.g., the score may be produced simply by classifying something according to a limited number of classifications (as long as such classifications are understood by the “risk assessor” described below).
A more detailed discussion of the observers, the quantifiers, and the categories which are evaluated in operation S310 will be provided below in connection with
Referring again to
Particular examples of such rules (formatted as If-Then statements) are provided in Table 1 below. In these examples, it is assumed that quantifiers are provided for the respective categories of “Certificate Quality,” “Reputation,” and “Degree of Inquisitiveness”; and that each of these categories is scored on a scale of 0 to 10.
However, as mentioned earlier, it is not strictly necessary for the scores to be numeric. Instead, they can be other types of scores indicative of a scale or a range. For example, the scores could be chosen from the following set: TERRIBLE, POOR, AVERAGE, GOOD, and EXCELLENT. If this is the case, the same rules as above could be described differently, as indicated in Table 2 below:
According to an exemplary embodiment, the rules applied in operation S320 could classify the safety level according to a set of classifications, each indicative of a degree of risk involved in accessing the webpage or site. For instance, in the above tables, the rules are based on an embodiment in which the safety level is classified as UNSAFE, RISKY, or SAFE. Another example of a set of classifications indicative of risk is SUSPICIOUS, POOR, NORMAL, and EXCELLENT. Examples of this second set of classifications are illustrated in
It should be noted that Tables 1 and 2 merely provide examples in regard to the types of scores and the classifications of safety level. In fact, it is contemplated that there could be many more (and more complex) rules. As such, Tables 1 and 2 above should not be construed as being limiting on the types of rules, scores, and safety level classifications. For instance, the scores may be generated from any range (e.g., a scale of 1 to 100), and there may be more (or possibly fewer) classifications of safety level. Also, while the rule examples above directly classify the safety level into classifications such as UNSAFE, RISKY, and SAFE, this may not be the case. For instance, it would be possible to configure the rules to quantify the safety level, e.g., as a number between 1 and 100. It would also be possible to provide an additional rule to define numeric ranges for classifications such as UNSAFE, RISKY, and SAFE, and then determine in which of these ranges the numeric value of the safety level fits.
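By way of non-limiting illustration, the application of If-Then rules to category scores may be sketched in Python as follows. The rules and thresholds shown here are hypothetical and invented purely for the example; they are not a reproduction of the rules in the tables. The sketch assumes the three categories named above, each scored from 0 to 10, and assumes that rules are checked in order with the first match deciding the classification.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, int]
Rule = Tuple[Callable[[Scores], bool], str]  # (If-condition, Then-classification)

# Hypothetical rules, checked in order; the first matching condition wins.
# All thresholds are illustrative assumptions.
RULES: List[Rule] = [
    (lambda s: s["certificate_quality"] <= 2 or s["reputation"] <= 2, "UNSAFE"),
    (lambda s: s["inquisitiveness"] >= 8 and s["reputation"] <= 5, "RISKY"),
    (lambda s: s["certificate_quality"] <= 5, "RISKY"),
]
DEFAULT_LEVEL = "SAFE"  # assumed classification when no rule fires


def classify_safety_level(scores: Scores, rules: List[Rule] = RULES) -> str:
    """Return the classification of the first rule whose condition holds."""
    for condition, level in rules:
        if condition(scores):
            return level
    return DEFAULT_LEVEL


good_site = {"certificate_quality": 9, "reputation": 8, "inquisitiveness": 1}
bad_cert = {"certificate_quality": 1, "reputation": 8, "inquisitiveness": 1}
nosy_site = {"certificate_quality": 9, "reputation": 4, "inquisitiveness": 9}

levels = [classify_safety_level(s) for s in (good_site, bad_cert, nosy_site)]
```

Keeping the rules as data, separate from the function that applies them, is one way to allow the rule set to be updated or replaced without changing the code that evaluates it.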
Referring again to
Referring again to
Furthermore, it may be decided that, in addition to displaying the classified safety level, another precautionary measure may be needed. Other precautionary measures may include (but are not limited to) displaying a more detailed warning or explanation to a user, and blocking access to the relevant webpage or website. If either (or both) of these additional precautionary measures are deemed warranted, they could be activated according to operation S342. It is possible for multiple precautionary measures to be activated in S342. For instance, if access to the website is blocked, it might also be useful to display a warning explaining why the site has been blocked.
Furthermore, the user may be given an option of overriding any precautionary measures implemented, so that he/she can proceed to browse the site. This is illustrated in
For instance, consider the example where the safety level is classified as one of SUSPICIOUS, POOR, NORMAL, and EXCELLENT. The security advisor may determine that the precautionary measure of displaying a warning is necessary, if the safety level is classified as POOR. On the other hand, if the safety level is classified as SUSPICIOUS (which is indicative of the highest degree of risk), then the security advisor may determine that it is necessary to block the site. As such, the classified safety level, as determined by application of the rules in S320, may ultimately be determinative of whether, and what type of, precautionary measure is needed.
Furthermore, it is also possible for the rules in S320 to determine the content of the warning which might be displayed. For instance, at least some of these rules may be configured to determine a particular subcategory of risk which is relevant to the webpage or site. Such subcategory may be determined only when the classified safety level is indicative of a threshold level of risk (e.g., POOR) or higher.
For instance, each of the aforementioned subcategories may be indicative of a specific type of risk that is associated with the webpage. Examples of these specific types of risk may include:
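By way of non-limiting illustration, the selection of a precautionary measure from the classified safety level may be sketched in Python as follows. The names `SafetyLevel`, `Measure`, and `select_measures` are hypothetical. The sketch also assumes, as one possible design, that a blocked site is accompanied by a warning and that a user override suppresses all measures.

```python
from enum import Enum
from typing import List


class SafetyLevel(Enum):
    SUSPICIOUS = "SUSPICIOUS"   # highest degree of risk
    POOR = "POOR"
    NORMAL = "NORMAL"
    EXCELLENT = "EXCELLENT"


class Measure(Enum):
    DISPLAY_WARNING = "display_warning"
    BLOCK_ACCESS = "block_access"


def select_measures(level: SafetyLevel, user_override: bool = False) -> List[Measure]:
    """Map a classified safety level to zero or more precautionary measures."""
    if user_override:
        return []  # assumed behavior: the user has chosen to proceed anyway
    if level is SafetyLevel.SUSPICIOUS:
        # Block the site and also explain why it has been blocked.
        return [Measure.BLOCK_ACCESS, Measure.DISPLAY_WARNING]
    if level is SafetyLevel.POOR:
        return [Measure.DISPLAY_WARNING]
    return []  # NORMAL and EXCELLENT need no additional measure


measures_for_poor = select_measures(SafetyLevel.POOR)
measures_for_suspicious = select_measures(SafetyLevel.SUSPICIOUS)
```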
Examples of rules for determining whether a subcategory of risk exists (e.g., potential man-in-the-middle attack (MITM), phishing, or fraud) are provided below in Table 3. In these examples, it is assumed that the safety level is classified as one of SUSPICIOUS, POOR, NORMAL, and EXCELLENT:
(It should be noted that Tables 1-3 above are merely intended to provide general examples of the types of classification rules that could be applied in accordance with the present invention, and they are not intended to be limiting as to the format or syntax in which the rules are written or implemented. Furthermore, these classification rules may be stored somewhere in the memory 102 of the computing device 100 (or some other persistent storage) in such manner as to be updated and/or replaced as necessary.)
The figures provide various examples of warnings which may be displayed in the indicator 56 on the basis of some of these subcategories. For example,
Referring again to
Particularly,
Now, a more detailed description will be made as to the operation of the observers and the quantifiers, and the possible types of data and categories which are evaluated in order to classify the safety level of an active webpage. As part of this description, reference will be made to
As shown in
As described earlier, in an exemplary embodiment, the security advisor contains numerous functional units, referred to as “observers,” for monitoring, discovering, and relaying respective pieces of information to be assembled into the transitory knowledge base in S3110.
At least some of the observers may be implemented as, e.g., functional units within the web browser code capable of utilizing Application Programming Interfaces (APIs) in the operating system (OS) of the computing device 100 to obtain certain pieces of information. Particularly, such APIs may be used to intercept network traffic from the website and capture HTTP and Secure Sockets Layer (SSL) information. Such information may include the SSL certificate and headers transmitted by the site.
However, other observers may be programmed to collect information from the browser 200, e.g., by accessing the cache 209, browsing history, user inputs, etc. Other observers may attempt to analyze the webpage content to detect certain patterns. Furthermore, other observers may try to obtain data from third parties.
Furthermore, some observers may utilize information obtained by another observer to gather more information. For instance, upon receiving the SSL certificate of the site, an observer may attempt to authenticate the certificate at a centralized certificate authority (CA). Another observer may use the certificate to vet the website at a phishing registry.
In
However, the category evaluations are not limited to an analysis of data gathered by the observers. It is contemplated that certain categories may also be evaluated using historical data pertaining to the results of previous analyses performed by the security advisor. For example, as shown in
Referring again to
According to operation S3120, an evaluation is performed on an “Encryption” category. In this evaluation, a quantifier attempts to quantify the strength or effectiveness of the encryption algorithm employed by the website to protect the data (such as HTTP requests) which the browser 200 transmits to the site. Particularly, this evaluation is performed in order to determine how well the website protects such data from eavesdropping by malicious parties. According to an exemplary embodiment, this evaluation may analyze the SSL connection over which each HTTP request (for the main document as well as any embedded resources) is transmitted from the browser 200 to the website. Also, certain contents of the webpage, particularly, the input forms, may be analyzed to deduce the encryption status of any HTTP requests which will be sent when, or if, the form data is submitted. As a result of such analyses, the quantifier can detect the particular type of encryption employed
It should further be noted that, in addition to generating a numeric (or other type of) score, the Encryption quantifier may also be configured to generate a textual description of the encryption status of the site. Further, such a textual description could be displayed to the user along with a visual indication representing the score of the Encryption category (e.g., a bar extending around the perimeter of a circle, the extent of which is representative of the score). For instance, as shown in
Furthermore, if the user would like even more details into the encryption status of the site, it would be possible to provide additional textual descriptions, e.g., in a new dialog box. An example of this is shown in
Referring again to
According to S3140, a “Familiarity” category is evaluated. As mentioned earlier, this evaluation may look at the following factors:
The more times that a website has been visited by the user, the more likely it is that the user is familiar with the site, thereby positively affecting the Familiarity score (so that it indicates a lower degree of risk). On the other hand, factors which are indicative of a change in the website will negatively affect the Familiarity score. I.e., if the site does not look the same as before, and/or does not carry the same fingerprint as on a previous visit, this will affect the score so as to indicate an increased degree of risk.
For purposes of this specification, a “fingerprint” refers to a particular set of data (e.g., header data) that is transmitted by the website, which is not expected to change (or change much) unless some characteristic in the website has changed (e.g., the site is utilizing different software, the site is hosted on a different server). According to an exemplary embodiment, operation S3140 generates a fingerprint by extracting and storing, on each visit, a certain set of header data from the site which is indicative of the type of software being run. If this stored data does not match the header data extracted during the next visit, this indicates that the website is not running the same software as before, thereby negatively affecting the Familiarity score. A similar fingerprinting technique based on header data can be used to determine whether the website is hosted on the same server as before. If the server has changed, this will also negatively affect the score so as to indicate a higher degree of risk.
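By way of non-limiting illustration, a header-based fingerprint comparison may be sketched in Python as follows. The choice of the `Server` and `X-Powered-By` response headers, the use of a SHA-256 digest, and the function names are assumptions made for the example; the specification does not state which header data is used or how it is stored.

```python
import hashlib
from typing import Mapping

# Assumed set of headers indicative of the software being run.
FINGERPRINT_HEADERS = ("server", "x-powered-by")


def fingerprint(headers: Mapping[str, str]) -> str:
    """Digest the selected header values; all other headers are ignored."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    parts = [f"{name}={normalized.get(name, '')}" for name in FINGERPRINT_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def fingerprint_changed(stored: str, headers: Mapping[str, str]) -> bool:
    """True if the current visit's headers no longer match the stored fingerprint."""
    return stored != fingerprint(headers)


# First visit: extract and store the fingerprint.
stored = fingerprint({"Server": "nginx", "Date": "Mon, 01 Jan 2024 00:00:00 GMT"})

# Later visit, same software: volatile headers such as Date do not matter.
same_software = fingerprint_changed(
    stored, {"server": "nginx", "Date": "Tue, 02 Jan 2024 00:00:00 GMT"})

# Later visit, different software: the fingerprint no longer matches.
different_software = fingerprint_changed(stored, {"Server": "Apache"})
```

A mismatch such as `different_software` would then be one of the factors that negatively affects the Familiarity score.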
Referring again to
Also, similar to Encryption, the quantifier for the Reputation category could also be configured to generate a textual description regarding the website's reputation. Also, this textual description may be displayed in the Safety Report section 58 of the dialog box, along with a visual indication of the Reputation score (e.g., a bar extending around a perimeter of a circle, the extent of which is representative of the score). Examples of this are illustrated in
In operation S3160 of
It will be assumed for purposes of this example that the Certificate Quality score is scored on a scale of 0 to 10. If the site does contain an SSL certificate, the score is initially set to a value of 5, and then modified according to the remaining factors. For instance, the quantifier may deduct 5 points from the score if the certificate is deemed to be untrusted. Further, the quantifier may deduct 4 points from the score for an insecure signature algorithm and an insecure key. On the other hand, 1 point could be added to the score if the key is solid. Moreover, 5 points may be added if the quantifier determines that the size of the public key meets a certain threshold. Also, if the quantifier can successfully verify that the SSL certificate has not been revoked, then 2 points may be added to the score. Furthermore, if the certificate fully satisfies the requirements for an Extended Validation (EV) certificate, then 4 points can be added to the score (EV certificates are a class of SSL certificates, which are issued by a Certificate Authority to an entity who satisfies a very extensive set of identity verification criteria).
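By way of non-limiting illustration, the scoring example above may be sketched in Python as follows. Several details are assumptions made for the sketch and are labeled as such in the comments: a site without a certificate is given a score of 0, the 4-point deduction is applied once if either the signature algorithm or the key is insecure, the public key size threshold is taken to be 2048 bits, and the result is clamped to the 0 to 10 scale.

```python
from dataclasses import dataclass

MIN_PUBLIC_KEY_BITS = 2048  # assumed threshold; the text does not give a value


@dataclass
class CertificateFacts:
    """Hypothetical record of what the observers learned about the certificate."""
    has_certificate: bool
    trusted: bool = True
    insecure_signature_algorithm: bool = False
    insecure_key: bool = False
    solid_key: bool = False
    public_key_bits: int = 0
    revocation_check_passed: bool = False
    extended_validation: bool = False


def certificate_quality_score(cert: CertificateFacts) -> int:
    if not cert.has_certificate:
        return 0  # assumption: no certificate yields the lowest score
    score = 5  # initial value when an SSL certificate is present
    if not cert.trusted:
        score -= 5
    if cert.insecure_signature_algorithm or cert.insecure_key:
        score -= 4  # assumption: deducted once if either is insecure
    if cert.solid_key:
        score += 1
    if cert.public_key_bits >= MIN_PUBLIC_KEY_BITS:
        score += 5
    if cert.revocation_check_passed:
        score += 2
    if cert.extended_validation:
        score += 4
    return max(0, min(10, score))  # assumption: clamp to the 0-10 scale


plain = certificate_quality_score(CertificateFacts(has_certificate=True))
untrusted = certificate_quality_score(
    CertificateFacts(has_certificate=True, trusted=False, insecure_key=True))
ev = certificate_quality_score(CertificateFacts(
    has_certificate=True, solid_key=True, public_key_bits=2048,
    revocation_check_passed=True, extended_validation=True))
```

Clamping is included because the listed additions can sum to more than 10 and the listed deductions can take the score below 0.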
Further, in this example, it is also possible to convert the Certificate Quality score (as well as the other category scores) to make the application of the rules (in S320) simpler. For instance, the final score could be mapped to one of the following labels: TERRIBLE (score=0, 1, or 2); POOR (score=3, 4, or 5); GOOD (score=6, 7, or 8); and EXCELLENT (score=9 or 10). This could simplify the rules and make them easier to understand (e.g., If the Certificate Quality score is EXCELLENT, then . . . ).
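By way of non-limiting illustration, the mapping from a numeric score to a label may be sketched in Python as follows. The function name `score_to_label` is hypothetical, and rejecting scores outside the 0 to 10 scale is an assumption of the sketch.

```python
def score_to_label(score: int) -> str:
    """Map a 0-10 category score to one of the four labels listed above."""
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score}")  # assumed behavior
    if score <= 2:
        return "TERRIBLE"
    if score <= 5:
        return "POOR"
    if score <= 8:
        return "GOOD"
    return "EXCELLENT"


labels = [score_to_label(s) for s in range(11)]
```

A rule can then test a label instead of a numeric range, e.g., checking whether `score_to_label(score) == "EXCELLENT"`.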
It should be noted that the other categories evaluated in operation S310 could be scored in a manner similar to that described in the above example regarding Certificate Quality.
Referring once again to
Some of the aforementioned factors can be analyzed using historical information of previous safety level classifications. As mentioned above in regard to
Further, as shown in
While particular embodiments are described above for purposes of example, the present invention covers any and all obvious variations as would be readily contemplated by those skilled in the art.
The present application claims domestic priority under 35 U.S.C. 119(e) to U.S. Provisional Application No. 61/876,166 filed on Sep. 10, 2013, the entire contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
61876166 | Sep 2013 | US