1. Field of Invention
The present invention relates in general to the field of computer networking. More specifically, embodiments of the present invention relate to systems and methods for the management of requests for Uniform Resource Locators (URLs) in computer networks.
2. Description of the Background Art
Many organizations use URL filtering software to prevent employees from accessing websites that are not relevant to their work or contain objectionable material. URL filtering involves blocking/allowing access to the site to which a URL points. Conventionally, URL filtering is performed at a firewall. After filtering, the request is sent to the server which hosts the website. On receiving a request for a URL from a requesting computer, the firewall sends the URL to a URL filtering server. The URL filtering server holds policies that define access rights for websites. In other words, rules that allow and deny access to websites, based on their URLs, are stored in the URL filtering server. On receiving the URL from the firewall, the URL filtering server checks the URL for the access rights and sends a response to the firewall. Based on the response, the firewall allows or denies the URL. If the URL is allowed by the URL filtering server, the firewall forwards the original request for the URL to a webserver, which responds with the contents of the website to which the URL points. If the URL is denied, the firewall sends an access denied webpage to the requesting computer.
The method for URL filtering, as described above, is process intensive as it involves processing at the firewall and the URL filtering server. Further, if the response from the URL filtering server is delayed, the requesting computer resends multiple requests for the URL. The method is not applicable for Virtual Private Networks (VPNs). VPNs are networks that use the Internet for communication between intranets of organizations, but are secure and cannot be accessed by computers that are not part of a VPN. Therefore, the access rights of each VPN have to be defined separately. In summary, the method of URL filtering is slow, wastes network resources and is not applicable to different types of networks.
Embodiments of the present invention provide a system for managing requests for URLs in a computer network. The system comprises a firewall, at least one URL filtering server and a webserver. The firewall comprises an exclusive domains list, which defines the filtering of URLs. In further embodiments, the firewall also includes an IP cache list for storing the responses from the URL filtering server. In further embodiments, the firewall also includes a response buffer for buffering the response of the webserver.
Embodiments of the present invention also provide a method for managing requests for URLs. Requests for URLs are scanned and the URLs are extracted from the requests. The URL is checked for in at least one exclusive domains list stored in a firewall. In case the exclusive domains list disallows the URL, the firewall blocks the URL. However, in case the exclusive domains list allows the URL, the URL is allowed.
Embodiments of the present invention also provide a method for controlling web access through a firewall comprising determining by a firewall that one of a plurality of URL filtering servers is not operable, and switching by the firewall to an operable URL filtering server.
Further provided by embodiments of the present invention is a method for controlling web access of an organization comprising determining by a firewall if a URL filtering server is not operable. The method may additionally comprise denying all web access through the firewall after the determining by the firewall that the URL is not allowed, and allowing all web access through the firewall after said determining by the firewall that the URL is not allowed.
Further provided by embodiments of the present invention is an apparatus for filtering URL in a firewall comprising a processor and a machine-readable medium including instructions executable by the processor for: (i) sending through a firewall an HTTP request to a webserver, (ii) creating a URL request, (iii) sending the created URL request to a URL filtering server for determining if the URL request is acceptable or unacceptable, and (iv) buffering a response from the webserver until the URL filtering server determines if the URL is acceptable or unacceptable.
Embodiments of the present invention also provide an apparatus for storing a URL in a firewall comprising a processor, and a machine-readable medium including instructions executable by the processor for: (i) sending through a firewall an HTTP request to a webserver, (ii) creating a URL request, (iii) determining if the URL request is acceptable or unacceptable, and (iv) storing the URL acceptance or denial in the firewall.
Embodiments of the present invention also provide an apparatus for controlling web access through a firewall comprising a processor, and a machine-readable medium including instructions executable by the processor for: (i) determining by a firewall that one of a plurality of URL filtering servers is not operable, and (ii) switching by the firewall to an operable URL filtering server.
Embodiments of the present invention also provide an apparatus for controlling web access of an organization comprising a processor, and a machine-readable medium including instructions executable by the processor for determining by a firewall if a URL filtering server is not operable.
Embodiments of the present invention also provide a system for filtering URL in a firewall comprising means for sending through a firewall an HTTP request to a webserver, means for creating a URL request, means for sending the created URL request to a URL filtering server for determining if the URL request is acceptable or unacceptable, and means for buffering a response from the webserver until the URL filtering server determines if the URL is acceptable or unacceptable.
Embodiments of the present invention also provide a system for storing a URL in a firewall comprising means for sending through a firewall an HTTP request to a webserver, means for creating a URL request, means for determining if the URL request is acceptable or unacceptable, and means for storing the URL acceptance or denial in the firewall.
Embodiments of the present invention also provide a system for controlling web access through a firewall comprising means for determining by a firewall that one of a plurality of URL filtering servers is not operable, and means for switching by the firewall to an operable URL filtering server.
These provisions together with the various ancillary provisions and features which will become apparent to those artisans possessing skill in the art as the following description proceeds are attained by devices, assemblies, systems and methods of embodiments of the present invention, various embodiments thereof being shown with reference to the accompanying drawings, by way of example only, wherein:
In the description herein for embodiments of the present invention, numerous specific details are provided, such as examples of components and/or methods, to provide a thorough understanding of embodiments of the present invention. One skilled in the relevant art will recognize, however, that an embodiment of the invention can be practiced without one or more of the specific details, or with other apparatus, systems, assemblies, methods, components, materials, parts, and/or the like. In other instances, well-known structures, materials, or operations are not specifically shown or described in detail to avoid obscuring aspects of embodiments of the present invention.
The present invention provides a method, a system and a computer program product for Uniform Resource Locator (URL) filtering in a computer network. URL filtering involves blocking/allowing access to the website to which a URL or a domain name points.
Exclusive domains list 206 comprises access rights for commonly requested URLs, that is, URLs that computers often request from firewall 104. In an exemplary embodiment of the present invention, these URLs are decided based on a statistical analysis of the requests from the computers in a predefined period of time, for example, in a month. Further, a network administrator can modify exclusive domains list 206 to include specific URLs. Examples of URLs present in exclusive domains list 206 include URLs for important information sources, for popular e-mail providers and for search engines. An organization can also allow the URL for its own website. Similarly, exclusive domains list 206 can disallow access to websites that contain objectionable material. Further, exclusive domains list 206 can comprise complete and partial domain names. An example of a complete domain name is ‘www.yahoo.com’. If exclusive domains list 206 disallows ‘www.yahoo.com’, then computers cannot access the Yahoo website or pages that are part of the same domain name, for example ‘www.yahoo.com/news’ and ‘www.yahoo.com/mail’. An example of a partial domain name is ‘.cisco.com’. If exclusive domains list 206 allows ‘.cisco.com’, then computers can access the Cisco website, i.e., ‘www.cisco.com’, and also other pages that are part of the Cisco domain name, for example ‘www.cisco.com/products’ and ‘www.cisco.com/services’. Further, URLs that are variants of the partial domain name are also allowed. Therefore, computers can also access, for example, ‘people.cisco.com’ and ‘newsroom.cisco.com’.
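The matching of complete and partial domain names described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the described embodiments: the function names host_of, matches and lookup are hypothetical, the list is assumed to be a mapping from an entry to an allow/disallow flag, and a leading dot is assumed to mark a partial domain name.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased host of a URL given with or without a scheme."""
    if "://" not in url and not url.startswith("//"):
        # Lets urlsplit treat 'www.cisco.com/products' as host plus path.
        url = "//" + url
    return (urlsplit(url).hostname or "").lower()


def matches(entry: str, url: str) -> bool:
    """Check one list entry against a requested URL."""
    host = host_of(url)
    entry = entry.lower()
    if entry.startswith("."):
        # Partial domain name (assumed convention: leading dot).
        # Matches the bare domain and every host below it.
        return host == entry[1:] or host.endswith(entry)
    # Complete domain name: the host must match exactly; any path is covered.
    return host == entry


def lookup(exclusive_domains: dict, url: str):
    """Return True (allow), False (disallow), or None when no entry matches."""
    for entry, allowed in exclusive_domains.items():
        if matches(entry, url):
            return allowed
    return None


# Hypothetical list contents taken from the examples in the text.
example_list = {"www.yahoo.com": False, ".cisco.com": True}
```

A result of None is assumed here to mean that the list holds no rule for the URL, so the decision would have to come from elsewhere.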
In accordance with one embodiment of the present invention, IP cache list 204 and exclusive domains list 206 are stored in Non-Volatile Random Access Memories (NVRAMs). IP cache list 204 and exclusive domains list 206 can also be stored in other forms of storage, such as compact flash cards or hard disk drives.
In one embodiment of the present invention, URLs in IP cache list 204 are stored as a hash table. In the hash table, URLs are divided into categories or buckets that are substantially of equal size. Usage of a hash table for storing URLs reduces the time for searching for a URL in IP cache list 204. In another embodiment, URLs in IP cache list 204 and exclusive domains list 206 are stored in an array.
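A minimal sketch of such a bucketed hash table is given below. The class name BucketedUrlTable, the choice of CRC-32 as the hash function and the default of 64 buckets are assumptions made for illustration only; the description does not specify them.

```python
import zlib


class BucketedUrlTable:
    """Hypothetical hash table that spreads URLs over a fixed set of buckets."""

    def __init__(self, num_buckets: int = 64):
        self._buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, url: str) -> list:
        # CRC-32 is an assumed hash function; any well-distributed hash
        # keeps the buckets at roughly equal size.
        index = zlib.crc32(url.encode("utf-8")) % len(self._buckets)
        return self._buckets[index]

    def put(self, url: str, allowed: bool) -> None:
        bucket = self._bucket(url)
        for i, (stored, _) in enumerate(bucket):
            if stored == url:
                bucket[i] = (url, allowed)  # overwrite an existing entry
                return
        bucket.append((url, allowed))

    def get(self, url: str):
        # Only one bucket is scanned, not the whole list of URLs.
        for stored, allowed in self._bucket(url):
            if stored == url:
                return allowed
        return None
```

A lookup scans a single bucket, whereas the array embodiment would in the worst case compare against every stored URL.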
The time taken in searching for a URL in exclusive domains list 206 or IP cache list 204 is dependent on the number of URLs in exclusive domains list 206 or IP cache list 204. Therefore, in an exemplary embodiment of the present invention, the number of URLs in exclusive domains list 206 and IP cache list 204 is restricted to 5000 each.
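The size restriction can be illustrated with a capped list such as the one sketched below. The description gives the limit of 5000 entries but does not say what happens when the limit is reached; evicting the least recently used entry is an assumption of this sketch, as is the class name CappedUrlList.

```python
from collections import OrderedDict

MAX_ENTRIES = 5000  # limit stated for the exemplary embodiment


class CappedUrlList:
    """Hypothetical URL list that never holds more than max_entries URLs."""

    def __init__(self, max_entries: int = MAX_ENTRIES):
        self._max = max_entries
        self._entries = OrderedDict()

    def put(self, url: str, allowed: bool) -> None:
        if url in self._entries:
            self._entries.move_to_end(url)
        self._entries[url] = allowed
        if len(self._entries) > self._max:
            # Assumed policy: drop the least recently used entry.
            self._entries.popitem(last=False)

    def get(self, url: str):
        if url not in self._entries:
            return None
        self._entries.move_to_end(url)  # mark as recently used
        return self._entries[url]

    def __len__(self) -> int:
        return len(self._entries)
```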
At step 620, HTTP module 202 checks whether the URL is allowed or not. HTTP module 202 decides whether the URL is allowed or disallowed on the basis of the contents of IP cache list 204, exclusive domains list 206 or the response of URL filtering server 108. If the URL is allowed, HTTP module 202 sends the contents of the website to which the URL points, to computer 102 at step 622. In case the URL is not allowed, HTTP module 202 blocks the URL at step 624. This means that the buffered contents of the website stored in response buffer 210 are removed. In case the contents of the website are not received from webserver 106, HTTP module 202 closes the connection to webserver 106. Webserver 106 then rejects the contents of the website when they arrive. Further, HTTP module 202 sends an access denied page, as shown in
In accordance with another embodiment of the present invention, system 100 further comprises a plurality of secondary URL filtering servers. The plurality of secondary URL filtering servers enables control of web access in, for example, an organization, through firewall 104. In case URL filtering client 208 determines that URL filtering server 108 is not operable, URL filtering client 208 sends the URL to a secondary URL filtering server. URL filtering server 108 is inoperable if, for example, the TCP connection between URL filtering server 108 and URL filtering client 208 is disconnected. Secondary URL filtering servers ensure that even when URL filtering server 108 is inaccessible, requests for URLs are served. In case no response is received from the secondary URL filtering server, URL filtering client 208 sends the URL to another secondary URL filtering server. Further, in case none of the secondary URL filtering servers sends a response to URL filtering client 208, system 100 serves the request for the URL based on an ‘allow mode’. If the ‘allow mode’ is set to ‘on’ and no response is received from any URL filtering server, then all requests for URLs are served. In case the ‘allow mode’ is set to ‘off’ and no response is received from any URL filtering server, then all requests for URLs are disallowed. In this case, the access denied page informs computer 102 that no URL filtering server is active, and hence, all requests are disallowed.
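The failover behaviour and the ‘allow mode’ fallback can be sketched as below. In this sketch each URL filtering server is modelled, purely as an assumption, by a callable that returns True or False, returns None when it sends no response, or raises ConnectionError when its connection is down; the function name filter_with_failover is hypothetical.

```python
from typing import Callable, Iterable, Optional

# Assumed model of a filtering server: URL in, verdict (or None) out.
FilterServer = Callable[[str], Optional[bool]]


def filter_with_failover(
    url: str, servers: Iterable[FilterServer], allow_mode: bool
) -> bool:
    """Ask the primary server first, then each secondary server in turn."""
    for query in servers:
        try:
            verdict = query(url)
        except ConnectionError:
            continue  # server not operable, e.g. TCP connection lost
        if verdict is not None:
            return verdict  # first server that responds decides
    # No server responded: 'allow mode' on serves the request, off denies it.
    return allow_mode


# Hypothetical servers used only to exercise the sketch.
def down_server(url: str) -> Optional[bool]:
    raise ConnectionError("connection to filtering server lost")


def silent_server(url: str) -> Optional[bool]:
    return None


def denying_server(url: str) -> Optional[bool]:
    return False
```

For example, filter_with_failover("www.example.com", [down_server, silent_server], allow_mode=True) falls through both servers and serves the request because the allow mode is on.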
Access rights for URLs can be defined on the basis of the users within an organization. For example, an organization may wish to prevent its employees from visiting the website of a competitor organization. However, the management of the organization may want to view the website to identify the research interests of the competitor. In this case, access rights to the URL for the website have to be different for different users. As mentioned earlier, URL filtering client 208 sends the IP address of computer 102 or the username of the user of computer 102 to URL filtering server 108. In an exemplary embodiment of the present invention, URL filtering server 108 stores access rights for URLs based on user permissions. URL filtering server 108 decides whether computer 102 (or the user of computer 102) is allowed to view the requested website or not. This system for allowing access to websites based on user permissions can be implemented with the help of user authentication systems and protocols, such as NT LAN Manager (NTLM), Lightweight Directory Access Protocol (LDAP), Terminal Access Controller Access Control System (TACACS), and Remote Authentication Dial-In User Service (RADIUS).
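A per-user decision of this kind might look like the following sketch. The class UserPolicyStore, the notion of groups, and the rule that a host with no entry is allowed are all assumptions introduced for illustration; authentication of the requester (for example through LDAP or RADIUS) is assumed to have happened already and is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class UserPolicyStore:
    """Hypothetical store of access rights kept by a URL filtering server."""

    # Maps a username or an IP address to a group name.
    groups: dict = field(default_factory=dict)
    # Maps a host to the set of groups that may view it.
    allowed_groups: dict = field(default_factory=dict)

    def is_allowed(self, requester: str, host: str) -> bool:
        permitted = self.allowed_groups.get(host)
        if permitted is None:
            return True  # assumption: hosts without a rule are allowed
        # Unknown requesters have no group and are therefore denied.
        return self.groups.get(requester) in permitted


# Hypothetical data mirroring the competitor-website example in the text.
store = UserPolicyStore(
    groups={"alice": "management", "10.0.0.7": "employees"},
    allowed_groups={"www.competitor.example": {"management"}},
)
```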
VPNs use routing and forwarding tables to route IP data packets between the various computers that are a part of the VPNs. These tables also support routing and forwarding IP data packets to and from the Internet. Routing and forwarding IP data packets in VPNs with the help of routing and forwarding tables is known as VPN routing and forwarding (VRF). VRF tables are stored at Provider Edge (PE) routers. These routers act as interfaces between VPNs and MPLS networks of network services providers.
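The idea of keeping a separate routing and forwarding table per VPN can be sketched as below. This is a simplified model and not a router implementation: the class name VrfTables, the VPN names and the next-hop labels are hypothetical, and choosing the most specific matching prefix (longest-prefix match) is an assumption based on common IP forwarding practice, since the description does not state the selection rule.

```python
import ipaddress


class VrfTables:
    """Hypothetical set of per-VPN routing tables held at a PE router."""

    def __init__(self):
        self._tables = {}  # VPN name -> list of (network, next hop)

    def add_route(self, vrf: str, prefix: str, next_hop: str) -> None:
        network = ipaddress.ip_network(prefix)
        self._tables.setdefault(vrf, []).append((network, next_hop))

    def lookup(self, vrf: str, destination: str):
        """Return the next hop for a destination within one VPN, or None."""
        address = ipaddress.ip_address(destination)
        candidates = [
            (network, hop)
            for network, hop in self._tables.get(vrf, [])
            if address in network
        ]
        if not candidates:
            return None
        # Assumed rule: the most specific (longest) prefix wins.
        return max(candidates, key=lambda c: c[0].prefixlen)[1]


# Two VPNs may reuse the same private address range without conflict.
tables = VrfTables()
tables.add_route("vpn-a", "10.0.0.0/8", "pe1-to-site-a")
tables.add_route("vpn-a", "10.1.0.0/16", "pe1-to-site-a2")
tables.add_route("vpn-b", "10.0.0.0/8", "pe1-to-site-b")
```

Because each VPN has its own table, the same destination address can resolve to different next hops depending on the VPN in which the packet arrived.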
As shown in
The present invention offers many advantages. Presence of an exclusive domains list and an IP cache list reduces the involvement of URL filtering servers while filtering URLs. This reduces the amount of processing. Further, as access rights for a URL are obtained at the firewall itself, the time for filtering is reduced. Finally, multiple requests for URLs, due to network delays, are reduced.
Although the invention has been discussed with respect to specific embodiments thereof, these embodiments are merely illustrative, and not restrictive, of the invention. For example, firewall 104 can be embodied in any computing device such as a router to manage the request for URLs.
Although specific protocols have been used to describe embodiments, other embodiments can use other transmission protocols or standards. Use of the terms ‘peer’, ‘client’, and ‘server’ can include any type of device, operation, or other process. The present invention can operate between any two processes or entities including users, devices, functional systems, or combinations of hardware and software. Peer-to-peer networks and any other networks or systems where the roles of client and server are switched, change dynamically, or are not even present, are within the scope of the invention.
Any suitable programming language can be used to implement the routines of the present invention including C, C++, Java, assembly language, etc. Different programming techniques such as procedural or object oriented can be employed. The routines can execute on a single processing device or multiple processors. Although the steps, operations, or computations may be presented in a specific order, this order may be changed in different embodiments. In some embodiments, multiple steps shown sequentially in this specification can be performed at the same time. The sequence of operations described herein can be interrupted, suspended, or otherwise controlled by another process, such as an operating system, kernel, etc. The routines can operate in an operating system environment or as stand-alone routines occupying all, or a substantial part, of the system processing.
Also in the description herein for embodiments of the present invention, a portion of the disclosure recited in the specification contains material, which is subject to copyright protection. Computer program source code, object code, instructions, text or other functional information that is executable by a machine may be included in an appendix, tables, figures or in other forms. The copyright owner has no objection to the facsimile reproduction of the specification as filed in the Patent and Trademark Office. Otherwise all copyright rights are reserved.
A “computer” for purposes of embodiments of the present invention may include any processor-containing device, such as a mainframe computer, personal computer, laptop, notebook, microcomputer, server, personal data manager (also referred to as a personal information manager or “PIM”), smart cellular or other phone, so-called smart card, set-top box, or any of the like. A “computer program” may include any suitable locally or remotely executable program or sequence of coded instructions which are to be inserted into a computer, well known to those skilled in the art. Stated more specifically, a computer program includes an organized list of instructions that, when executed, causes the computer to behave in a predetermined manner. A computer program contains a list of ingredients (called variables) and a list of directions (called statements) that tell the computer what to do with the variables. The variables may represent numeric data, text, audio or graphical images. If a computer is employed for synchronously presenting multiple video program ID streams, such as on a display screen of the computer, the computer would have suitable instructions (e.g., source code) for allowing a user to synchronously display multiple video program ID streams in accordance with the embodiments of the present invention. Similarly, if a computer is employed for presenting other media via a suitable directly or indirectly coupled input/output (I/O) device, the computer would have suitable instructions for allowing a user to input or output (e.g., present) program code and/or data information respectively in accordance with the embodiments of the present invention.
A “computer-readable medium” for purposes of embodiments of the present invention may be any medium that can contain, store, communicate, propagate, or transport the computer program for use by or in connection with the instruction execution system, apparatus, system or device. The computer readable medium can be, by way of example only but not by limitation, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, system, device, propagation medium, or computer memory. The computer readable medium may have suitable instructions for synchronously presenting multiple video program ID streams, such as on a display screen, or for providing for input or presenting in accordance with various embodiments of the present invention.
Reference throughout this specification to “one embodiment”, “an embodiment”, or “a specific embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention and not necessarily in all embodiments. Thus, respective appearances of the phrases “in one embodiment”, “in an embodiment”, or “in a specific embodiment” in various places throughout this specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics of any specific embodiment of the present invention may be combined in any suitable manner with one or more other embodiments. It is to be understood that other variations and modifications of the embodiments of the present invention described and illustrated herein are possible in light of the teachings herein and are to be considered as part of the spirit and scope of the present invention.
Further, at least some of the components of an embodiment of the invention may be implemented by using a programmed general purpose digital computer, by using application specific integrated circuits, programmable logic devices, or field programmable gate arrays, or by using a network of interconnected components and circuits. Connections may be wired, wireless, by modem, and the like.
It will also be appreciated that one or more of the elements depicted in the drawings/figures can also be implemented in a more separated or integrated manner, or even removed or rendered as inoperable in certain cases, as is useful in accordance with a particular application. It is also within the spirit and scope of the present invention to implement a program or code that can be stored in a machine-readable medium to permit a computer to perform any of the methods described above.
Additionally, any signal arrows in the drawings/Figures should be considered only as exemplary, and not limiting, unless otherwise specifically noted. Furthermore, the term “or” as used herein is generally intended to mean “and/or” unless otherwise indicated. Combinations of components or steps will also be considered as being noted, where terminology is foreseen as rendering the ability to separate or combine unclear.
As used in the description herein and throughout the claims that follow, “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
The foregoing description of illustrated embodiments of the present invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed herein. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes only, various equivalent modifications are possible within the spirit and scope of the present invention, as those skilled in the relevant art will recognize and appreciate. As indicated, these modifications may be made to the present invention in light of the foregoing description of illustrated embodiments of the present invention and are to be included within the spirit and scope of the present invention.
Thus, while the present invention has been described herein with reference to particular embodiments thereof, a latitude of modification, various changes and substitutions are intended in the foregoing disclosures, and it will be appreciated that in some instances some features of embodiments of the invention will be employed without a corresponding use of other features without departing from the scope and spirit of the invention as set forth. Therefore, many modifications may be made to adapt a particular situation or material to the essential scope and spirit of the present invention. It is intended that the invention not be limited to the particular terms used in the following claims and/or to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include any and all embodiments and equivalents falling within the scope of the appended claims.