The present disclosure relates to managing network security and similar devices.
Evaluating lines of network policies across different network devices is complicated, particularly when trying to assign access rules to policy classifications. For example, a firewall determines whether to permit or deny traffic based upon the list of rules contained within its configuration file. An individual rule specifies the type of traffic that the firewall permits or denies, based on attributes such as protocol, source network, source port, destination port, destination network, interface used, etc. A user would like to work with sets of these rules that have been combined into logical units referred to herein as “policies”.
Overview
Presented herein are techniques for creating a policy block comprised of a group of lines of rules/statements across configuration files for network devices. An algorithm is provided that determines when multiple policies are to be merged together into one policy. This is particularly useful when deploying one or more configuration policies on the plurality of network devices. In one embodiment, data is uploaded from a network that includes a plurality of network devices. The data represents policy rules configured on the plurality of network devices. The data representing the policy rules is compared for similarities in order to group together policy rules based on their similarities. Data is stored representing a plurality of clusters, each cluster representing a group of policy rules that have been grouped together. One or more configuration policies are generated to be applied across the plurality of network devices using the data representing each of the plurality of clusters, while maintaining context of policy rule processing.
With reference to
The cloud-based management system 100 includes a management entity 110 including one or more computer servers 112(1)-112(M) that execute software to perform the operations described herein. An example of a hardware configuration for management entity 110 is described in more detail below in connection with
Customer datacenter/network 120 includes a plurality of network security devices or products (also referred to as network security appliances) 130(1)-130(P). Within a customer datacenter there are one or more resources 140 and one or more actors 150. The resources 140 may include servers, databases, and the actors 150 are users or processes using a computing device (personal computer, SmartPhone, etc.) that may seek access to one or more of the resources 140. The resources and actors may also reside outside the customer datacenter itself, e.g., in the Internet. The network security devices 130(1)-130(P) control access of the actors 150 to the resources 140 according to network security policies, e.g., sets of one or more network security rules configured on the respective network security devices.
As described herein, data is sent to the network 120 to deploy one or more configuration policies on the plurality of network security devices 130(1)-130(P).
An administrator 180 may log onto a log-in web page 185 served by one of the servers 112(1)-112(M) in order to enter commands to deploy policy configurations on the network security devices 130(1)-130(P).
As an example, given a set of firewall configuration files for a given user, it would be beneficial to automatically group the (possibly) thousands of individual rules into a manageable number of policies, where each policy has an internal cohesion. As used herein, a “block” or “policy block” is a group of lines of rules/statements across configuration files for one or more network devices. The goal of the techniques presented herein is to group lines into a policy block, and in particular to determine when to take multiple policies and merge them together into one policy, and in so doing, divine the intent of the policy writer. A group of access rules is used to perform a network security task or function.
Different configurations for network devices have different sequences of rules. An automated process is presented herein that aligns these different sequences of rules to group them.
Reference is now made to
Phase I
Every pair of rules across configuration files is assigned a value that represents how close the two rules are, referred to as its “similarity score”. Specifically, the process starts when a new configuration file is received.
A configuration for a security appliance may take the form of:
access-list MAIL extended permit tcp host 10.1.1.38 host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.31 eq smtp host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.32 eq smtp host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.33 eq smtp host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.34 eq smtp host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.35 eq smtp host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.36 eq smtp host 10.1.1.7 eq smtp
access-list MAIL extended permit tcp host 10.1.1.37 eq smtp host 10.1.1.7 eq smtp
Phase I consists of operations 205, 210 and 215. At 205, a determination is made as to whether there are other configuration files (e.g., Config B) in the database for a security appliance. If so, then at 210, a rule-wise comparison is made between rules in the two configuration files (e.g., Config A and Config B). At 215, a score from the rule-wise comparison is logged in a rule-wise comparison matrix. Operations 210 and 215 are repeated until every rule in a configuration file is compared with every rule in the other configuration file. The output of operation 215 is a comparison matrix, and it is supplied as input to Phase II. A comparison matrix contains similarity scores between configurations.
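The rule-wise comparison of Phase I can be sketched as follows. This is a minimal illustration only: the description does not fix a particular scoring mechanism, so the token-set Jaccard overlap used in `similarity` is an assumption, and the names `tokenize`, `similarity`, and `comparison_matrix`, as well as the second rule of Config B, are hypothetical.

```python
from typing import List


def tokenize(rule: str) -> List[str]:
    """Split an access rule line into whitespace-separated tokens."""
    return rule.split()


def similarity(rule_a: str, rule_b: str) -> float:
    """Assumed scoring: Jaccard overlap of the two token sets, in [0, 1]."""
    a, b = set(tokenize(rule_a)), set(tokenize(rule_b))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def comparison_matrix(config_a: List[str], config_b: List[str]) -> List[List[float]]:
    """Score every rule of Config A against every rule of Config B."""
    return [[similarity(ra, rb) for rb in config_b] for ra in config_a]


config_a = [
    "access-list MAIL extended permit tcp host 10.1.1.31 eq smtp host 10.1.1.7 eq smtp",
    "access-list MAIL extended permit tcp host 10.1.1.32 eq smtp host 10.1.1.7 eq smtp",
]
config_b = [
    "access-list MAIL extended permit tcp host 10.1.1.31 eq smtp host 10.1.1.7 eq smtp",
    "access-list WEB extended permit tcp any host 10.1.1.9 eq https",  # hypothetical rule
]

# matrix[i][j] is the similarity score of rule i in Config A and rule j in Config B.
matrix = comparison_matrix(config_a, config_b)
```

Identical rules score 1.0, rules that differ only in one host address score close to 1, and unrelated rules score much lower. Any other scoring function with values between 0 and 1 could be substituted without changing the later phases.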
Phase II
In Phase II, given the comparisons from Phase I, “sub-classifications” (also referred to herein as “sub-classes”) are built for every pair of configuration files. For a given pair of configuration files (A,B), a sub-classification contains a set of lines from file A and a set of lines from file B whose match scores are above a variable threshold. The threshold is configured at 220 and matches that are above the threshold are flagged as “interesting” in operation 225.
At operation 225, this alignment of comparison scores can be performed by any modular array alignment algorithm. An alignment algorithm partially matches two arrays so as to maximize the scores of aligned values while minimizing how much the arrays need to be disturbed from their starting formation to form the alignment. Many examples of such algorithms come from the computational biology space, where there are many known algorithms for solving the problem of local DNA sequence alignment (Smith-Waterman, for example). The Smith-Waterman algorithm performs local sequence alignment for determining similar regions between two strings or nucleotide or protein sequences. The Smith-Waterman algorithm compares segments of all possible lengths and optimizes a similarity measure. Traditional alignment problems treat the matches as binary (complete match or mismatch), whereas in this case most scores will be fuzzy (i.e. scores between 0 and 1), requiring slight modifications.
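One way to adapt a Smith-Waterman style local alignment to fuzzy scores is sketched below. It is an illustration under stated assumptions, not the specific modification contemplated above: the reward for aligning two lines is taken to be the similarity score minus the threshold, the gap penalty is a fixed constant, and the function name `local_align` and both default parameter values are hypothetical.

```python
from typing import List, Tuple


def local_align(
    scores: List[List[float]], threshold: float = 0.5, gap: float = 0.5
) -> Tuple[float, List[Tuple[int, int]]]:
    """Local alignment over a fuzzy score matrix (scores[i][j] in [0, 1]).

    Aligning line i of file A with line j of file B is rewarded with
    scores[i][j] - threshold, so pairs below the threshold are penalized.
    Returns the best alignment score and the aligned (i, j) index pairs.
    """
    n = len(scores)
    m = len(scores[0]) if n else 0
    h = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = h[i - 1][j - 1] + (scores[i - 1][j - 1] - threshold)
            h[i][j] = max(0.0, diag, h[i - 1][j] - gap, h[i][j - 1] - gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)

    # Trace back from the best cell until the local alignment score drops to 0.
    pairs: List[Tuple[int, int]] = []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        diag = h[i - 1][j - 1] + (scores[i - 1][j - 1] - threshold)
        if h[i][j] == diag:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif h[i][j] == h[i - 1][j] - gap:
            i -= 1  # line of A left unaligned
        else:
            j -= 1  # line of B left unaligned
    pairs.reverse()
    return best, pairs


# Rows: three lines of file A. Columns: four lines of file B.
scores = [
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.8, 0.1],
    [0.1, 0.0, 0.1, 1.0],
]
best, pairs = local_align(scores)
```

In this example the three lines of file A align with lines 1 through 3 of file B, and the aligned index pairs form one candidate sub-classification for the pair (A,B).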
Phase III
The sub-classifications from Phase II are merged into full classifications across all of the configuration files, which will eventually become the basis of a user's policy. Reference is now made to
At 230, the set of pairwise matches from operation 225 of Phase II (
As shown in
More concretely, assume a user has selected a line a for which they wish to see suggested classifications. Taking the configuration file A containing line a, an incidence matrix is constructed between lines in configuration file A and the sub-classifications from Phase II which refer to configuration file A, where a cell (x,C) will have a value of 1 if and only if line x was contained within sub-classification C, else 0.
Starting from line a, a block of lines is built out surrounding that line which contains the greatest percentage score in the incidence matrix. A set of lines will have a higher score if those lines share the same sub-classifications (incidences). Given that block, sub-classes which use that block may be merged to arrive at a final suggested classification.
Each incidence matrix compares lines in a single configuration file against sub-classifications (output from Phase II) of that configuration file. For example, configuration file A will result in an incidence matrix of the lines of configuration file A against sub-classes drawn from the results of Phase II on the pairs (A,B), (A,C), etc., for all other configuration files. A cell (x,cAB) in the matrix will have a value of 1 if and only if line x (from configuration file A) was contained within subclass cAB (generated between configuration files A and B), else 0. By these operations, the lines of configuration file A can find the optimal sub-classes to merge to find their ideal classification across the set of configuration files.
This operation may be performed multiple times with different weights to arrive at different suggestions for the user. For a global classification of all lines across all configuration files, the above algorithm is run for each line and the highest scoring results are chosen.
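The incidence matrix and the block built out from a selected line can be sketched as follows. The description leaves the exact scoring and growth strategy open, so this sketch makes several assumptions: the block is contiguous, its score is the fraction of 1-cells over the block's lines and the sub-classes that contain the selected line, and growth is greedy with a hypothetical cut-off `min_score`. All function names are illustrative.

```python
from typing import Dict, List, Set, Tuple


def incidence_matrix(num_lines: int, sub_classes: Dict[str, Set[int]]) -> Dict[str, List[int]]:
    """One column per sub-class: cell is 1 if the line belongs to it, else 0."""
    return {
        name: [1 if x in members else 0 for x in range(num_lines)]
        for name, members in sub_classes.items()
    }


def block_score(matrix: Dict[str, List[int]], lo: int, hi: int, columns: List[str]) -> float:
    """Fraction of 1-cells over lines lo..hi (inclusive) and the given columns."""
    cells = [matrix[c][x] for c in columns for x in range(lo, hi + 1)]
    return sum(cells) / len(cells) if cells else 0.0


def grow_block(
    matrix: Dict[str, List[int]], target: int, num_lines: int, min_score: float = 0.9
) -> Tuple[int, int, List[str]]:
    """Greedily extend a contiguous block around `target` while its score stays >= min_score."""
    columns = [c for c, col in matrix.items() if col[target] == 1]
    lo = hi = target
    while True:
        candidates = []
        if lo > 0:
            candidates.append((block_score(matrix, lo - 1, hi, columns), lo - 1, hi))
        if hi < num_lines - 1:
            candidates.append((block_score(matrix, lo, hi + 1, columns), lo, hi + 1))
        if not candidates:
            break
        score, new_lo, new_hi = max(candidates)
        if score < min_score:
            break
        lo, hi = new_lo, new_hi
    return lo, hi, columns


# Configuration file A has six lines; the sub-classes come from Phase II on
# the pairs (A,B), (A,C) and (A,D). The memberships are illustrative.
sub_classes = {"cAB": {1, 2, 3}, "cAC": {1, 2, 3, 4}, "cAD": {5}}
matrix = incidence_matrix(6, sub_classes)
lo, hi, merged = grow_block(matrix, target=2, num_lines=6)
```

Here the block around line 2 grows to lines 1 through 3, and the sub-classes cAB and cAC, which both cover that block, are the candidates to merge into the suggested classification. Re-running with a different `min_score` is one simple way to obtain alternative suggestions.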
Consider two rules r and r′. Consider configuration files that have rules as: [ . . . r,r′ . . . ], [ . . . r,r″,r′], [ . . . r′,r . . . ].
The rules r,r′ should form a policy block. However, in the case [ . . . r,r′ . . . ], [ . . . r . . . r′] or the case [ . . . r,r′ . . . ], [ . . . r . . . ], [ . . . r′ . . . ] the rules r,r′ should not be a block.
Allow SMTP traffic to Object Group Exchange servers
Allow POP3 traffic to Object Group Exchange servers
Allow HTTPS traffic to Object Group Exchange servers
These lines should probably be grouped.
Block all ICMP traffic
Allow port 15672 to server 10.0.0.1
Allow port 3712 to server 192.168.1.13
These lines should probably not be grouped.
The foregoing present mechanisms by which many configuration files can be correlated and simplified into a set of policy classes. A single rule with exact text matching is easy to find in configuration files. The methods presented herein allow for the ability to create a block of rules, with some variations within the configuration file and across configuration files and yet be able to extract a block of rules common to such group of configuration files, where number of files, variations within and between files are parameters. This helps in creating a more uniform policy. In other words, as a result of the blocks of rules that are created, it is possible to generate one or more configuration policies to be applied across a plurality of network devices.
As explained above, the methods presented herein involve pairwise comparisons, though Phase III achieves a larger convergence where there are “merges” based on the previous phases. The “pairs” being compared in Phase I and Phase II are different. In Phase I, single-line rules from configurations are compared for similarity.
In Phase II, two configuration files are compared to find what are called sub-classifications, which are blocks of rules common between the two files. For example, the lines a, b, and c may be found close to each other in configuration A. The sub-classification algorithm may then detect that similar (according to Phase I) lines a′, b′, and c′ are found in configuration B. A sub-classification may be formed between A and B for this pair ({a,b,c},{a′,b′,c′}).
Phase III does not involve pairwise comparisons. Phase III takes these sub-classifications and a target rule, and creates a classification, i.e., a block of common rules across all files (F1, F2, . . . Fn) that contains this target rule.
To clarify, the method is independent of the specific scoring mechanisms. In Phase I, the algorithm generates a scoring matrix comparing individual rules using the scoring mechanism referred to above (or any other scoring mechanism).
Reference is now made to
A goal of policy block creation is to classify a “chunk” of access rules and try to identify a similar policy block. It is desirable to unify access groups and create a single point of management. Hoping to find exact matches across network device configuration files will prove futile, because at some point some network administrator may have changed some policy somewhere. Also, different local area networks in an organization's network may have differing configurations. A generalized mechanism is useful to identify similar access groups, and allow for parameterization and access group management. Access groups are a logical collection of access lists, e.g., access control lists (ACLs). Access groups determine the entire access policy for a particular network component, such as allow Network Time Protocol (NTP) in, but do not allow Simple Mail Transfer Protocol (SMTP) access, etc.
The process 300 shown in
Classification speed and optimization. If speed cannot be achieved, the process seeks to delay classification until a user explicitly decides to classify a policy block.
Classification sensitivity. The algorithm should not be sensitive to the order of input among classes.
Use classification as a means to promote parameterization.
Potentially allow classifying different files simultaneously.
Delegate as much logic as possible to a database.
The process 300 of
For the sake of efficiency, the following assumptions are made in connection with the process 300. A new access group (collection of policy rule statements or access rules) is either (a) allowed to join one/many clusters, or (b) creates a new cluster. Whenever a cluster is created, there are exactly two members in the cluster. Access groups with a very high incidence of similarity are combined together.
A stamp is generated for each natural grouping in a network device, e.g., a firewall. The process 300 involves attempting to create clusters that are similar to each other. This reduces the problem space significantly. Then once these clusters are created, similarity comparisons are run within those as new candidate rules are analyzed to determine whether to join new rules to a cluster.
The process 300 begins at 302 when an access group is introduced into the system. At 305, all shared access groups are identified and coalesced and a call is made to determine if there are any exact match intersection sets. That is, a comparison is made between the new access group and all existing access groups to determine if there is an exact match. At 310, it is determined whether the response/output of the comparison is empty. If the response is not empty, then at 315, a cluster of inconsistent objects is created, which all have the same ACLs but some of them may be out of order. On the other hand, if it is determined at 310 that the response is empty, then a similarity detection is initiated at 320. Specifically, at 320, a set membership query is created that is run against all existing similarity clusters. This is the same Jaccard coefficient call, but this time on a cluster rather than other access groups. This generates a query to determine whether the access group is a member of an existing cluster. At 325, it is determined whether the response/output of operation 320 is empty. If the response is empty, then a flow is entered to create a cluster. Specifically, at 330, a set intersection call is made with a relatively large Jaccard threshold. This is a query that calculates the Jaccard coefficient with another access group. It filters access groups by their Jaccard coefficient. So if the coefficient is ‘high’ (above the threshold referred to herein) it is determined that this is a ‘similar’ access group and therefore a cluster is created with it. At 335, it is determined whether the response of operation 330 is empty. At 340, if the response is not empty, then a new cluster is added to the system with the two access groups that were undergoing comparison.
If at operation 325, it is determined that the response is not empty, then at 345, using the first cluster, an association is created between the on-boarded access group and that cluster. After learning from operation 320 that an access group can belong to a cluster, operation 345 simply adds this access group to the cluster. An access group is included in a cluster by creating an “association.” Data model associations are a way of creating relations in non-relational databases.
At 350, it is determined whether the intersection (determined at 320) is an exact match. If it is not an exact match, then at 355, the cluster's stamp set is calculated. In operation 320, where it is queried whether an access group can belong to a cluster, the cluster needs a set of stamps (just like the one that the access group has). The stamp set of a cluster is the intersection of the set of stamps of all access groups in the cluster. Moreover, operation 355 is performed also after operation 340.
Next, two asynchronous operations are performed at 360 and 365. At 360, a similarity between access rules is computed, such as by the process depicted in
At 370 and 375, it is determined whether there are other access group clusters. At 370, if there are other access group clusters, then the flow returns to operation 345 and at 375, if there are other access group clusters, the flow returns to operation 340. In other words, both of operations 340 and 345 are repeated if there are other access group clusters that can be merged with the current access group cluster. If there are no more access group clusters, then the process ends at 380.
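The core of the on-boarding flow, namely the cluster membership query, the pairwise Jaccard query, and the recomputation of the cluster's stamp set, can be sketched as follows. This is a simplified illustration: it omits the exact-match branch (operations 305 through 315) and the asynchronous operations, it joins only the first matching cluster, and the class names, method names, and the cluster threshold of 0.8 are assumptions (only the lower access-group threshold of about 0.6 is suggested by the description).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two stamp sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class Cluster:
    members: List[str]
    stamps: FrozenSet[str]  # intersection of the stamp sets of all members


class Onboarder:
    def __init__(self, cluster_threshold: float = 0.8, group_threshold: float = 0.6):
        self.cluster_threshold = cluster_threshold  # assumed value
        self.group_threshold = group_threshold
        self.groups: Dict[str, FrozenSet[str]] = {}
        self.clusters: List[Cluster] = []

    def onboard(self, name: str, stamps: Iterable[str]) -> str:
        stamp_set = frozenset(stamps)
        outcome = "unclustered"
        # Operations 320/325: is the access group a member of an existing cluster?
        for cluster in self.clusters:
            if jaccard(stamp_set, cluster.stamps) >= self.cluster_threshold:
                cluster.members.append(name)                 # operation 345
                cluster.stamps = cluster.stamps & stamp_set  # operation 355
                outcome = "joined"
                break
        else:
            # Operations 330/335/340: look for one similar access group and,
            # if found, create a new cluster with exactly two members.
            for other, other_stamps in self.groups.items():
                if jaccard(stamp_set, other_stamps) >= self.group_threshold:
                    self.clusters.append(Cluster([other, name], stamp_set & other_stamps))
                    outcome = "created"
                    break
        self.groups[name] = stamp_set
        return outcome


ob = Onboarder()
outcomes = [
    ob.onboard("ag1", {"1", "2", "3", "4", "5"}),
    ob.onboard("ag2", {"1", "2", "3", "4", "6"}),
    ob.onboard("ag3", {"1", "2", "3", "4"}),
    ob.onboard("ag4", {"9"}),
]
```

In this run the first access group has nothing to compare against, the second forms a two-member cluster with it, the third joins that cluster, and the fourth stays unclustered.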
Furthermore, as is apparent from
Access Group Data Model
Below an example of an access group data model is provided. The active rules in an access group are: RuleAction, protocol, sourceAddress, sourcePort, destinationAddress, destinationPort. A set intersection is performed on access rule stamps (denoted “accessRuleStamps”). The stamp could be as simple as a hash computation on an access rule or some small numeric representation of some access rule. In the example data model below, two access rules are shown, each having RuleAction, Protocol, sourceAddress, sourcePort, destinationAddress, destinationPort. For example, the stamp for the first access rule is denoted “12345” and the stamp for the second access rule is denoted “45678”.
The stamp can be abstract. It does not have to be tied to an access rule. It could be based on the number of external IP addresses, for example.
The “numAccessRules” quantity indicates the number of access rules in a set or group and should be equal to the number of access rule stamps for the set. In the example below, “numAccessRules” is equal to 2 because there are two access rules in the set or group. The quantity numAccessRules is useful when deciding whether to compare two sets of access rules because if the quantity numAccessRules greatly differs between two given sets, it is not worthwhile to compare them because it is unlikely if not impossible to result in a similarity greater than a reasonable threshold, such as 0.7.
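The reasoning behind this pre-filter can be made concrete. Assuming the similarity is the Jaccard coefficient of the two stamp sets and that numAccessRules equals the number of distinct stamps in each set, the intersection can be no larger than the smaller set and the union no smaller than the larger set, so the coefficient is at most min/max of the two counts. The helper name `worth_comparing` is hypothetical.

```python
def worth_comparing(num_a: int, num_b: int, threshold: float = 0.7) -> bool:
    """Return False when the size ratio alone rules out reaching the threshold.

    Upper bound used: Jaccard(A, B) <= min(|A|, |B|) / max(|A|, |B|).
    """
    larger = max(num_a, num_b)
    if larger == 0:
        return True  # two empty sets are trivially identical
    return min(num_a, num_b) / larger >= threshold


checks = {
    (10, 8): worth_comparing(10, 8),  # bound 0.8: comparison may succeed
    (10, 5): worth_comparing(10, 5),  # bound 0.5: comparison cannot reach 0.7
}
```

Because the check uses only the stored counts, it can be evaluated before any stamp sets are loaded or intersected.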
Access Group Data Model Example
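The example data model itself is not reproduced in this text. The following is a hypothetical sketch of how such a record might look, using only the field names and the two stamp values mentioned above; the record name, the rule contents, and the `is_consistent` helper are illustrative assumptions rather than the original example.

```python
access_group = {
    "name": "example-access-group",  # hypothetical name
    "accessRules": [
        {
            "ruleAction": "permit",
            "protocol": "tcp",
            "sourceAddress": "10.1.1.31",
            "sourcePort": "smtp",
            "destinationAddress": "10.1.1.7",
            "destinationPort": "smtp",
        },
        {
            "ruleAction": "permit",
            "protocol": "tcp",
            "sourceAddress": "10.1.1.32",
            "sourcePort": "smtp",
            "destinationAddress": "10.1.1.7",
            "destinationPort": "smtp",
        },
    ],
    # One stamp per access rule, in the same order as accessRules.
    "accessRuleStamps": ["12345", "45678"],
    "numAccessRules": 2,
}


def is_consistent(group: dict) -> bool:
    """numAccessRules should equal both the rule count and the stamp count."""
    return (
        group["numAccessRules"]
        == len(group["accessRules"])
        == len(group["accessRuleStamps"])
    )


consistent = is_consistent(access_group)
```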
Below is an example of a cluster data model. The cluster is a representation of all of its members, that is, all of the rules that have been grouped together due to similarities. The name given to this example cluster data model is “outside-ad-in”. Noteworthy is that the example cluster data model below includes the parameters “maxNumAccessRules” and “minNumAccessRules”. maxNumAccessRules represents the maximum number of access rules that any member (child) of this cluster has, and minNumAccessRules is the minimum number of access rules that any member has. Also, accessRuleStamp is the intersection of the access rule stamps in the cluster.
To avoid having the access rule stamp diminish rapidly for a cluster, a higher Jaccard threshold may be set for the cluster membership phase. Also, to encourage creation of a cluster block initially, the Jaccard threshold for the access-group query can be lower (e.g., 0.6).
Cluster Data Model Example
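The example cluster data model is likewise not reproduced in this text. A hypothetical sketch of how a cluster record could be derived from its members is shown below. Only the cluster name “outside-ad-in” and the parameter names come from the description; the member names, the third stamp value, the "members" and "accessRuleStamps" keys, and the `build_cluster` helper are assumptions.

```python
from typing import Dict, List


def build_cluster(name: str, members: List[Dict]) -> Dict:
    """Summarize member access groups into a cluster record."""
    stamp_sets = [set(m["accessRuleStamps"]) for m in members]
    sizes = [m["numAccessRules"] for m in members]
    return {
        "name": name,
        "members": [m["name"] for m in members],
        "maxNumAccessRules": max(sizes),
        "minNumAccessRules": min(sizes),
        # Stamp set of the cluster: intersection of all member stamp sets.
        "accessRuleStamps": sorted(set.intersection(*stamp_sets)),
    }


# Two hypothetical member access groups.
members = [
    {"name": "fw1-outside-in", "accessRuleStamps": ["12345", "45678"], "numAccessRules": 2},
    {"name": "fw2-outside-in", "accessRuleStamps": ["12345", "45678", "99999"], "numAccessRules": 3},
]
cluster = build_cluster("outside-ad-in", members)
```

Storing the minimum and maximum rule counts on the cluster lets a membership query skip clusters whose members are far larger or smaller than the candidate access group.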
In one form, the data from the plurality of network devices is uploaded from a remote location (e.g., a customer datacenter/network) to a centralized management entity (e.g., the management entity 110). Further still, the management entity may send data to the network to deploy the one or more configuration policies on the plurality of network devices. The data representing the policy rules may be obtained from configuration files for the plurality of network devices.
The comparing operation 420 may involve comparing every pair of policy rules across configuration files for the plurality of network devices to generate a similarity score indicating similarity between two rules of a given pair of policy rules. Further still, the comparing operation 420 may involve generating a plurality of sub-classifications, each containing a set of policy rule statements from a first configuration file and a second set of policy rule statements from a second configuration file whose similarity scores are above a threshold. Moreover, the plurality of sub-classifications may be combined across the configuration files for the plurality of network devices. In combining, for a given policy rule statement of the first configuration file, an incidence matrix is generated between lines of the first configuration file and a plurality of sub-classifications that refer to the first configuration file.
Turning now to
The processor(s) 510 may be a microprocessor or microcontroller (or multiple instances of such components). The network interface unit(s) 512 may include one or more network interface cards that enable network connectivity.
The memory 514 may include read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physically tangible (i.e., non-transitory) memory storage devices. Thus, in general, the memory 514 may comprise one or more tangible (non-transitory) computer readable storage media (e.g., memory device(s)) encoded with software or firmware that comprises computer executable instructions. For example, control software 516 includes logic to implement the operations described herein in connection with
Administrator 180 may interact with management entity 110 by way of a user device 520 that connects by way of a network (local area network (LAN) and/or wide area network (WAN)) 525 with the management entity 110. The user device 520 may be a personal computer (laptop, desktop), tablet computer, SmartPhone, etc.
In summary, a policy block is a group of lines of rules/statements across configuration files. Lines of rules/statements are grouped into a policy block. The operations described herein determine when multiple policies are to be merged together into one policy. Automatic grouping of access-list policies (using similarity algorithms) has proven invaluable for raising the low level firewall policies to the business policy level allowing better analysis of policies, on a single firewall and across firewalls and other network security products.
Thus, in one form, a method is provided comprising: uploading from a network that includes a plurality of network devices, data representing policy rules configured on the plurality of network devices; comparing the data representing the policy rules for similarities in order to group together policy rules based on their similarities; storing data representing a plurality of clusters, each cluster representing a group of policy rules that have been grouped together; and generating one or more configuration policies to be applied across the plurality of network devices using the data representing each of the plurality of clusters, while maintaining context of policy rule processing.
In another form, an apparatus is provided comprising: a network interface unit configured to enable communications over a network that includes a plurality of network devices; and a processor coupled to the network interface unit, wherein the processor is configured to: upload from the network, data representing policy rules configured on the plurality of network devices; compare the data representing the policy rules for similarities in order to group together policy rules based on their similarities; store data representing a plurality of clusters, each cluster representing a group of policy rules that have been grouped together; and generate one or more configuration policies to be applied across the plurality of network devices using the data representing each of the plurality of clusters, while maintaining context of policy rule processing.
In yet another form, one or more computer readable storage media are provided encoded with software comprising computer executable instructions and when the software is executed operable to perform operations comprising: uploading from a network that includes a plurality of network devices, data representing policy rules configured on the plurality of network devices; comparing the data representing the policy rules for similarities in order to group together policy rules based on their similarities; storing data representing a plurality of clusters, each cluster representing a group of policy rules that have been grouped together; and generating one or more configuration policies to be applied across the plurality of network devices using the data representing each of the plurality of clusters, while maintaining context of policy rule processing.
The above description is intended by way of example only. Although the techniques are illustrated and described herein as embodied in one or more specific examples, it is nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made within the scope and range of equivalents of the claims.
This application claims priority to U.S. Provisional Application No. 62/278,654, filed Jan. 14, 2016, the entirety of which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20170208094 A1 | Jul 2017 | US |
Number | Date | Country | |
---|---|---|---|
62278654 | Jan 2016 | US |