Application threat modeling

Information

  • Patent Application
  • 20080028065
  • Publication Number
    20080028065
  • Date Filed
    July 26, 2006
  • Date Published
    January 31, 2008
Abstract
A method and system for analyzing data relating to a website including the content and architecture of the website are provided. All relevant site related information is cataloged. Then “attack points” or vectors used by a hacker within the site are determined. Based on the above, a calculation of a relevant level of security for each attack point is determined.
Description

BRIEF DESCRIPTION OF THE DRAWING


FIG. 1 shows a block diagram of the present invention.



FIG. 2 shows a system of the present invention.



FIG. 3 shows a flowchart of the present invention.



FIG. 4 shows a diagram of the present invention.





DETAILED DESCRIPTION OF THE INVENTION

In order to better understand the present invention, the following definitions or working definitions are listed in Table I below:









TABLE I
Definition of Terms

Resource: typically a file on a web server that can create a web page.
Resource Attributes: characteristics of a resource.
Interactive Resources: resources that perform a function of some kind (as opposed to being a flat file on the web server).
Non-interactive resources: exemplified non-interactive resources are pages that contain static text and perhaps a few images and do not require the web server to do anything other than have the server feed the flat file to a browser. The user cannot do anything to this flat file because the web server does not interact with anything.
Crawler: the part of a Spider program or search engine that searches data prior to vulnerability assessment.









A resource may also be a JavaScript link that creates a page. Resources are not limited to files that comprise web pages: a resource may also be a configuration file, or a file that does not serve content but rather performs some function. All substantial resource “types” are listed below in Table II.









TABLE II
Exemplified Types of Resources

1. HTML
2. Application content (e.g. PHP, ASP, Java, CFM, etc.)
3. JavaScript
4. Images
5. Text
6. Compressed files (e.g. zip, tar.gz, etc.)
7. Archive/backup files (e.g. .bak, etc.)
8. Log files
9. Database driven content (e.g. site.com/resource.php?resource=      )
10. Include files









Resource attributes are the characteristics of a resource. For example, a resource (web page) may contain some images as well as content that comes from a database, and may require a cookie in order to browse the page. In this example, three attributes need to be cataloged: images, a database connection, and a cookie. Further examples of resource attributes are listed below in Table III.









TABLE III
Examples of Resource Attributes

0. URL/Form Parameters
1. Cookies
2. Forms
3. Email id
4. JavaScript functions
5. Authentication points
6. Query string (e.g. for a database)
7. Hidden fields
8. Comments
9. Scripts
10. Applets/Objects









Examples of Interactive Resources include database driven content, which is “interactive” because it requires the web server to communicate with the database and retrieve something specific. An attacker typically focuses on Interactive Resources because the attacker can modify the request the web server issues in order to attempt some form of attack by interacting with the backend systems that run the web site.


On the other hand, non-interactive resources are typically a page that contains static text and perhaps a few images. A non-interactive resource does not require the web server to do anything other than having the server feed the flat file to a browser. The user cannot do anything to this flat file because the web server does not interact with anything.


A crawler is responsible for, among other things, crawling the entire site. The crawler is the foundation for all scan activity since it provides the data that the present invention processes further. If the crawler cannot build a proper catalog of all site contents, the present invention will not be able to act on the site (i.e. attack it to perform a vulnerability assessment, including the generation of a report).


The Application Threat Modeling Process

Referring to FIG. 1, the threat model begins with a crawling phase that uses an automated spidering engine 10 to actuate each link of the application. Links are identified through pattern recognition and parsing JavaScript of every response's HTML page. The engine 10 stores each link in memory and in an XML file.
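The link identification step described above can be illustrated with a minimal sketch. It is not the spidering engine 10 itself: the class name `LinkExtractor`, the helper names, the regular expression used for JavaScript, and the XML element names are all assumptions made for illustration, and real JavaScript parsing would need to be considerably more thorough.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed pattern: URLs assigned to location / location.href in inline JavaScript.
JS_URL_PATTERN = re.compile(r"""location(?:\.href)?\s*=\s*["']([^"']+)["']""")


class LinkExtractor(HTMLParser):
    """Collects links from anchors, form actions, script sources and inline JavaScript."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []          # in-memory store, in discovery order
        self._in_script = False

    def _add(self, target):
        url = urljoin(self.base_url, target)   # resolve relative links
        if url not in self.links:
            self.links.append(url)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._add(attrs["href"])
        elif tag == "form" and attrs.get("action"):
            self._add(attrs["action"])
        elif tag == "script":
            self._in_script = True
            if attrs.get("src"):
                self._add(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Pattern recognition over inline JavaScript only.
        if self._in_script:
            for match in JS_URL_PATTERN.finditer(data):
                self._add(match.group(1))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def links_to_xml(links):
    """Serializes the collected links to an XML string (element names are assumed)."""
    root = ET.Element("links")
    for url in links:
        ET.SubElement(root, "link").text = url
    return ET.tostring(root, encoding="unicode")
```

Calling `extract_links(page_html, page_url)` on each response and passing the result to `links_to_xml` mirrors the two storage targets named in the text (memory and an XML file); writing the string to disk is left out of the sketch.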


Upon completion of the crawl, the spidering engine 10 passes the collected links to an analysis engine 12 that identifies attributes (e.g. attributes listed in Table III) that can be used to calculate exposure. Some of the attributes are cookies set by the “Set-Cookie” header, forms, hidden input fields, POST data, URL parameters, e-mail addresses, and HTML comments. The analysis engine 12 counts the raw number of attributes per link and the overall count for the application. Once the attributes have been identified, the exposure is calculated and a report 14 is generated for analysis. The spidering engine 10 and the analysis engine 12 may be controlled by a micro-controller 16.
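A minimal sketch of the per-link and application-wide attribute counts is given below. It covers only some of the attributes named above (cookies, forms, hidden fields, URL parameters, e-mail addresses, and HTML comments). The function names, the counter keys, and the e-mail pattern are assumptions for illustration and are not taken from the analysis engine 12.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlsplit

# Assumed, deliberately simple e-mail pattern.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class AttributeCounter(HTMLParser):
    """Counts forms, hidden input fields, comments and e-mail addresses in one page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.counts["forms"] += 1
        elif tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            self.counts["hidden_fields"] += 1

    def handle_comment(self, data):
        self.counts["comments"] += 1

    def handle_data(self, data):
        found = len(EMAIL_PATTERN.findall(data))
        if found:
            self.counts["emails"] += found


def count_attributes(url, set_cookie_headers, html):
    """Returns the raw attribute counts for a single link."""
    counter = AttributeCounter()
    counter.feed(html)
    counter.close()
    counts = counter.counts
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    if params:
        counts["url_parameters"] += len(params)
    if set_cookie_headers:
        counts["cookies"] += len(set_cookie_headers)
    return counts


def count_application(per_link_counts):
    """Adds the per-link counts into an overall count for the application."""
    total = Counter()
    for counts in per_link_counts:
        total.update(counts)
    return total
```

Keeping the per-link counters separate from the application total matches the two levels of counting described in the text, and lets a report attribute the exposure to individual links.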


Referring to FIG. 2, a network 18 such as the Internet or World Wide Web is provided. A first server 20, storing data relating to at least one web page, is coupled to network 18. Server 20 may comprise the present invention's method implemented in computer readable instructions. Typically, the present invention's method implemented in computer readable instructions is controlled by a second server 22 coupled to network 18, executing instructions by way of network 18.


Referring to FIG. 3, a flowchart 30 of the present invention is shown. A crawler is provided to work on a site 32. Application Threat Modeling is determined substantially from the crawl data, and not from any other vulnerability assessment (VA) data. Thus, the application threat modeling of the present invention is calculated based on the architecture of a crawled site as analyzed by the crawler portion of the present invention. The crawler will essentially execute every link 34 on a web site to catalog every file/resource on the site 36. The crawler will also catalog the resource's attributes (as shown in Table III) relating to the site 38.
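The cataloging loop of steps 34 and 36 can be sketched as a breadth-first traversal. This is only one plausible way to visit every link once; the function name `crawl` and the `fetch_links` callable (which stands in for requesting a page and extracting its links) are hypothetical.

```python
from collections import deque


def crawl(start_url, fetch_links):
    """Visits every reachable link once and returns the catalog in visit order.

    fetch_links is a hypothetical callable: given a URL, it returns the
    links found in that resource.
    """
    catalog = []
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        catalog.append(url)              # catalog the resource
        for link in fetch_links(url):    # execute each link it contains
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return catalog
```

The `seen` set keeps the crawl from looping on pages that link back to each other, which is common on real sites.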


A determination is made as to whether each cataloged resource is interactive or static (non-interactive) 40. All static, non-interactive resources are then discarded 42. What is left is the interactive content, referred to herein as Attack Points 44. Attack Points 44 are resources that possess attributes that an attacker could interact with (targeting the web server, application server or database), such as a form field, a database connection or a hidden field.
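The filtering of steps 40 through 44 can be sketched as follows. The `Resource` type, the attribute labels, and the set of attributes treated as interactive are assumptions for illustration; the text names only a form field, a database connection, and a hidden field as examples.

```python
from dataclasses import dataclass, field

# Assumed labels for the example attributes named in the text.
INTERACTIVE_ATTRIBUTES = {"form_field", "database_connection", "hidden_field"}


@dataclass
class Resource:
    url: str
    attributes: set = field(default_factory=set)


def is_attack_point(resource):
    """A resource is interactive if it has at least one attackable attribute."""
    return bool(resource.attributes & INTERACTIVE_ATTRIBUTES)


def find_attack_points(catalog):
    """Discards static resources and keeps the Attack Points."""
    return [resource for resource in catalog if is_attack_point(resource)]
```

For instance, a page with only an `"image"` attribute would be discarded, while a search page carrying `"form_field"` would be kept as an Attack Point.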


As shown in FIG. 4, crawler engine 10 essentially executes every link on a web site 50 to catalog every file/resource on the site 50. The links range from link-1 52 . . . to link-i 54 . . . to link-n 56.


One often refers to application threat modeling as a “qualitative analysis” of the target site. It does not contain any discrete vulnerability information (what is often called “quantitative analysis”), but rather focuses on the structure and content of the site and how that may have an impact on future, or emerging, security threats. This is what the present invention teaches.


A good example of why Attack Points 44 are a concern is shown with a site that has many form fields. While the application's processing of such form inputs may be secure at this time, any change to the site (such as a new application or a modification to one) could possibly introduce a form-based attack vulnerability. Additionally, a new attack could be devised so that it might affect form inputs that interact with such applications. Here we see that even though they may currently be secure, the sheer existence of such resources (i.e. form fields on a web page) creates a persistent concern that must be monitored and considered throughout the application life-cycle.


Additionally, the application threat modeling of the present invention allows security personnel to understand what their application security program should include to best secure their web sites. Since not all web sites have the same security exposure or security concerns, it is important to make sure that the organization is aligning their security programs with relevant security exposure. An exemplified technical explanation of the above using two types of web sites is shown below:

  • (a) An e-commerce site is likely to be heavily driven by databases and runs by utilizing many types of inputs. These inputs typically are not form data; rather, they may range from the quantity of an item being purchased to a price variable. The site applications must process these requests in order to perform the commerce function of selling things. However, if the site does not have a robust set of “input validation filters”, it is possible that an attacker could modify input values to exploit the applications. This could result in purchasing an item for less money, among other possible exploits. These types of sites are highly dependent on input validation filters to prevent such attacks and, thus, are a suitable candidate for the application of the present invention.
  • (b) A very different site would be a company extranet that allows partners and vendors to obtain documents such as contracts or pricing information. This site most likely contains mostly flat files, thus input validation attacks may be entirely impossible. It is nonetheless critical that this site's data not fall into the wrong hands. Therefore, access to the site is the important concern, since it creates pressure on development and quality assurance (QA) to utilize robust authentication, authorization, and encryption techniques that restrict access to this data.


The above examples show that not all sites are created equal. The application threat modeling of the present invention is designed to communicate this information so that a company's security, development, and QA teams may understand how their online business model is affected by such security threats. Simply put, the present invention gives them information that they need but previously did not have, so that they can align their security-related efforts toward securing their web business.


The crawler also records response codes, web server platforms, and external site links (including the data that is being sent via SSL and plaintext).


Application Threat Modeling Security Exposure Calculation

As mentioned, once the present invention has cataloged all the interactive site content and its attributes, it performs a calculation to determine the extent of “security exposure”. It is critical to point out that this calculation is subjective, in that different people have different preconceived notions regarding the security field. Therefore, while a paranoid individual might find even the slightest bit of exposure to be an unacceptable threat, another individual might not care that 100% of the site can be hacked through an abundance of attack vectors.


The present invention creates a rudimentary exposure scoring calculation that provides a perceived level of security exposure. The exposure is correlated with otherwise unused information into report 14 which communicates or answers the questions of:

  • 1. How much exposure to an attack does a site have?
  • 2. What resources/attributes make up that exposure?


    With the above in mind, the exposure calculation is based on two things:
  • 1. The ratio of Attack Points to non-Attack Points
  • 2. The types of attackable resource attributes


    An application's exposure is calculated based on each attack point:











Exposure = Sum of (Minimum((APweight * APtotal), APceiling))

or

$$\text{Exposure} = \sum_{i=1}^{n} \min\bigl(\text{APweight} \cdot \text{APtotal},\ \text{APceiling}\bigr) \qquad (1)$$







Here the sum runs over each type of attack point. For each type, APtotal denotes the total number of attack points of that type present in the application, and APweight is a weighting factor that is predetermined by a user. An attack point type can contribute no more than a maximum value (APceiling) to the exposure rating: its contribution is the lesser of its score (APweight * APtotal) and its ceiling. The sum of all attack point scores represents the exposure rating.
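Equation 1 can be worked through with a short sketch. The type names, weights, totals, and ceilings below are hypothetical values chosen only to show the arithmetic; the source does not specify any particular weights or ceilings, which are left to the user.

```python
from dataclasses import dataclass


@dataclass
class AttackPointType:
    name: str
    total: int       # APtotal: number of attack points of this type
    weight: float    # APweight: user-defined weighting factor
    ceiling: float   # APceiling: maximum contribution of this type

    def score(self):
        # The contribution is capped at the ceiling.
        return min(self.weight * self.total, self.ceiling)


def exposure(attack_point_types):
    """Sums the capped scores of all attack point types (equation 1)."""
    return sum(ap.score() for ap in attack_point_types)


# Hypothetical example values.
example = [
    AttackPointType("forms", total=12, weight=2.0, ceiling=20.0),         # min(24.0, 20.0) = 20.0
    AttackPointType("cookies", total=3, weight=1.5, ceiling=10.0),        # min(4.5, 10.0)  = 4.5
    AttackPointType("hidden_fields", total=4, weight=3.0, ceiling=15.0),  # min(12.0, 15.0) = 12.0
]
print(exposure(example))  # 36.5
```

In this example the ceiling only affects the forms entry: without it, twelve forms would contribute 24.0 and dominate the rating, which is the effect the cap is meant to limit.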


Other technologies may capture the above-mentioned data in many forms; some may capture only part of the data, and others may capture all of it. The data alone, however, is not the whole invention herein. Rather, the invention lies in correlating how the site construction does or does not create a security concern, by way of a novel report 14 that correlates the parameters of a site automatically.


A human user or technician can perform the present invention. However, the present invention teaches an automatic process wherein human intervention during processing is not necessary. In other words, the present invention teaches a method of computer readable automatic data processing where no human operator is needed for generating the report 14 based upon equation 1.


Unlike prior art systems, such as the '737 patent that operates at OSI levels 4, 5, and 6, the Web Application Scanner of the present invention operates at level 7 and generally only connects to the two web server ports (e.g. 80 and 443) in order to exercise the custom web application and the application's HTML pages. The present invention operates on a different network stack level, automating the manual input techniques an application tester would apply against the content of custom and dynamically generated HTML applications. In other words, the present invention does not test the level 6 input of the server.


The present invention is associated with a Web Application Scanner. A Web Application Scanner generally only connects to the two web server ports (e.g. 80 and 443) in order to exercise the custom web application that is accessed through it. The present invention only scans the web application content at level 7 of the network protocol stack and not the web server at layer 6 or lower. These packets for different levels are constructed differently and do not cross stack boundaries.


It is important to note that while the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions in addition to a variety of other forms. Further, the present invention applies equally, regardless of the particular type of signal bearing media that is actually used to carry out the distribution. Examples of computer readable media include recordable-type media such as a floppy disc, a hard disk drive, a RAM, a CD-ROM, a DVD-ROM, and a flash memory card, and transmission-type media such as digital and analog communications links, or wired or wireless communication links using transmission forms such as radio frequency and light wave transmissions. The computer readable media may take the form of coded formats that are decoded for actual use in a particular data processing system.


Accordingly, it is to be understood that the embodiments of the invention herein described are merely illustrative of the application of the principles of the invention. Reference herein to details of the illustrated embodiments is not intended to limit the scope of the claims, which themselves recite features regarded as essential to the invention.

Claims
  • 1. A method for modeling a threat to a site, comprising the steps of: a) recording substantially all related information relevant to understanding how a hacker may attack the site; b) determining a set of attack points based upon said related information; c) giving each attack point a set of values; and d) performing a calculation based upon said set of values to determine a relevant level of security exposure for a particular attack point.
  • 2. The method of claim 1 further comprising a summary of all of the given values.
  • 3. The method of claim 1 further comprising a generation of an exposure report.
  • 4. The method of claim 1, wherein said level of security comprises: none, low, medium, or high.