The present invention generally relates to communication networks such as the Internet. More particularly, the invention addresses the problem of identifying sources of traffic and making network operators and service providers aware of the applications delivered over their networks by these traffic sources.
Network operators and Internet Service Providers (ISPs) are facing an increasing need to monitor and control traffic and applications that are delivered over their networks by specific sources. Identification and a better understanding of the applications that cause traffic increases in the operator's network will enable the operator or ISP to negotiate and install source specific traffic policies in its network.
An existing tool for monitoring and controlling traffic is called Deep Packet Inspection (DPI) or Complete Packet Inspection, described for instance in Wikipedia at the following URL:
http://en.wikipedia.org/wiki/Deep_packet_inspection
DPI consists in creating a packet inspection point in the data path where packet inspection hardware can identify the type of traffic to which a packet belongs. Knowing the traffic category to which a packet belongs, for instance TCP (Transmission Control Protocol) or HTTP (Hypertext Transfer Protocol), does not make it possible to identify the source of the traffic, let alone the application that delivers the packet. Further, DPI devices are installed in the data path and therefore have to inspect and process the packets within very tight delay constraints, i.e. real-time processing at typical speeds of 10 to 40 Gbps (Gigabits per second) in today's networks. DPI devices hence require high processing power and are therefore rather costly hardware solutions that do not meet the network operator's requirements in terms of identifying the sources and applications of traffic.
Known improvements of DPI consist in correlating the contents or behaviour of multiple packets in order to obtain more detailed information on the HTTP or TCP flows. By correlating certain redirects, or by correlating the content of data packets with the URL that was used to retrieve an HTTP service or with the IP address and MAC address of the subscriber's residential gateway, more advanced DPI devices may be able to obtain or reconstruct more detailed information on the HTTP or TCP flows. However, such correlation techniques further increase the real-time processing requirements for DPI devices, making these devices even more complex and costly, and still do not make it possible to identify the exact source of traffic, the application that delivers the traffic, or the content of the traffic.
In summary, although DPI devices make it possible to classify traffic into categories such as HTTP, P2P, etc., they do not meet today's requirements for identifying traffic, sources, and applications, and they involve complex and costly hardware for real-time packet processing in the data path.
It is an objective of the present invention to provide a method and device that resolve the above-mentioned drawbacks of existing traffic monitoring solutions. In particular, it is an objective to provide a method and device that make it possible to identify the source, application or content of traffic in more detail, in order to enable network operators and ISPs to install and apply source-specific policies in their networks.
According to the present invention, the above identified objectives are realized through a method for building a source database by a scout agent with network connectivity as defined by claim 1, the method comprising for a traffic source in the network the steps of:
Thus, a scout agent, i.e. an application or set of software programs installed in a data centre with network connectivity, according to the invention populates and maintains a database of addresses, ports, protocols and application traffic profiles for every important traffic source, e.g. server, on the network. In the case of the Internet, the address information corresponds to the IP address of the traffic source, the port information corresponds to the source port number, and the protocol information corresponds to TCP (Transmission Control Protocol) or UDP (User Datagram Protocol). The application traffic profile information contains all important cross-layer information of the IP traffic sources and must therefore at least identify the application(s) supported by the IP traffic source, the codecs used, and a description of the temporal properties of the sourced IP traffic such as the average bit rate, burst size, jitter, etc. The approach in accordance with the invention, based on a source database, is fundamentally different from the DPI approach based on real-time packet inspection in the critical data path. In comparison with traditional DPI, the scout agent and the resulting source database according to the current invention provide increased specificity of the traffic sources and applications, and they need not be placed in the critical data path. As a consequence, their processing requirements and cost are substantially below those of traditional DPI devices, whereas their accuracy in identifying and characterizing traffic sources and applications is much better. An advantage thereof is that the source database built according to the present invention can be used to generate and apply traffic policing rules for individual traffic sources or traffic sources from a service provider.
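By way of illustration only, the kind of record held in such a source database could be sketched as follows. This is a minimal sketch and not the claimed implementation: the class names `TrafficProfile` and `SourceDatabase`, the field names and the units are assumptions introduced here, and only the keyed items (address, source port, protocol) and the profile contents (applications, codecs, temporal properties) are taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Key of a record: (address, source port, protocol), as in the description.
SourceKey = Tuple[str, int, str]


@dataclass
class TrafficProfile:
    """Application traffic profile; field names and units are hypothetical."""
    applications: List[str]      # e.g. ["video"] or a specific application name
    codecs: List[str]            # e.g. ["h264"]
    avg_bit_rate_bps: float      # average bit rate of the sourced traffic
    burst_size_bytes: int        # burst size
    jitter_ms: float             # jitter


@dataclass
class SourceDatabase:
    """In-memory stand-in for the source database (illustrative only)."""
    records: Dict[SourceKey, TrafficProfile] = field(default_factory=dict)

    def store(self, address: str, port: int, protocol: str,
              profile: TrafficProfile) -> None:
        # Protocol names are normalised so "tcp" and "TCP" share one record.
        self.records[(address, port, protocol.upper())] = profile

    def lookup(self, address: str, port: int,
               protocol: str) -> Optional[TrafficProfile]:
        # Returns None when the traffic source has not been learned yet.
        return self.records.get((address, port, protocol.upper()))
```

Keying the records on address, port and protocol means a policy or reporting function can look up a profile directly from the fields of a packet or flow record, without inspecting payload.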
As is indicated by claim 2, the application traffic profile information in the method according to the invention at least comprises:
The information indicative for the supported application may for instance identify the type of application, e.g. video or audio, or may be more specific and identify for instance the exact video application like Hulu, YouTube, iTunes, BitTorrent, etc. The information indicative for the codec used may for instance identify the encoding mechanism, like mp4, h264, etc. in case of video traffic, or mp3, wav, etc. in case of audio. The information indicative for temporal properties of the traffic can be extracted from the Quality of Service profile of the traffic source, and will typically contain parameters like the average video bit rate, the burst size, jitter, etc. It is noted that the scout agent may deduce the Quality of Service profile of a traffic source by acting as a client application and monitoring the application behaviour in terms of its traffic properties.
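As a hedged illustration of how a scout agent acting as a client might derive such temporal properties from the packets it receives, consider the sketch below. The description does not define how bit rate, burst size or jitter are computed, so the definitions used here are assumptions: jitter is taken as the mean absolute deviation of the inter-arrival gaps, and a burst is a run of packets whose gaps do not exceed a hypothetical threshold `burst_gap_s`.

```python
from statistics import mean
from typing import Dict, List, Tuple


def temporal_properties(packets: List[Tuple[float, int]],
                        burst_gap_s: float = 0.05) -> Dict[str, float]:
    """Summarise a trace of (arrival_time_seconds, size_bytes) packets.

    The trace must be sorted by arrival time. All definitions below are
    illustrative assumptions, not taken from the description.
    """
    if len(packets) < 2:
        raise ValueError("need at least two packets")
    duration = packets[-1][0] - packets[0][0]
    if duration <= 0:
        raise ValueError("trace must span a positive time interval")

    # Average bit rate over the observed interval.
    total_bits = 8 * sum(size for _, size in packets)

    # Jitter: mean absolute deviation of the inter-arrival gaps.
    gaps = [b[0] - a[0] for a, b in zip(packets, packets[1:])]
    mean_gap = mean(gaps)
    jitter = mean(abs(g - mean_gap) for g in gaps)

    # Burst size: most bytes seen in a run of closely spaced packets.
    burst = current = packets[0][1]
    for gap, (_, size) in zip(gaps, packets[1:]):
        current = current + size if gap <= burst_gap_s else size
        burst = max(burst, current)

    return {
        "avg_bit_rate_bps": total_bits / duration,
        "burst_size_bytes": burst,
        "jitter_s": jitter,
    }
```

The resulting values could then populate the temporal part of the application traffic profile for the traffic source that was contacted.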
Optionally, as defined by claim 3, the method according to the invention further comprises:
Thus, the scout agent may optionally also collect application metadata such as the name of the application or service, the company offering the service, the content delivery network, the domain offering the service, the URLs or links involved in delivering the service, the applications involved in delivering the service, the geographical location of the servers involved in delivering the service, the delivered content, the company that is the source of the content, etc. Thanks to such information, the source database will not only be useful for generating and installing traffic policy rules, but will also be useful to build and deliver detailed reports on the traffic from specific sources or applications, e.g. to the network operator or service providers.
As is indicated by claim 4, application metadata in the context of the current invention may comprise one or more of the following:
The application name may for instance be iTunes, Hulu, YouTube, iPlayer, a web browser name, etc. Information indicative for the geographic location may be the state(s) or province(s) wherein the IP addresses or range of IP addresses used by all servers involved in the delivery of the content are registered. Information indicative for the owner or creator could be the name of the company that is the source of the content, like for instance NBC, RTL, etc. The content delivery network may be identified by its domain name, for instance akamai.com, limelight.com, etc. The invention is obviously not limited to these examples of application metadata.
Optionally, as defined by claim 5, the steps of learning and storing may be triggered manually, based on user instruction.
Indeed, in order to instruct the scout agent what traffic sources to contact and build profiles of in the source database, the scout agent may be configured manually with the addresses of important content sources, e.g. popular video websites.
Alternatively, as defined by claim 6, the steps of learning and storing may be triggered automatically, based on instruction of the traffic source.
As an alternative to manual configuration, the scout agent may receive automated instructions identifying important content sources. These automated instructions may be received from flow monitoring processes that run in the network and discover what addresses of services are popular, as is indicated by claim 7.
Also optionally, as defined by claim 8, the steps of learning and storing for the traffic source may be repeated in an event-driven manner.
Thus, updates of the source database may be triggered by events.
Alternatively, as defined by claim 9, the steps of learning and storing for the traffic source may be repeated periodically.
Hence, as an alternative to event-based updates of the source database, the content of the database may be updated at a regular pace or frequency.
In addition to a method for building a source database as defined by claim 1, the current invention also applies to a scout agent for building a source database as defined by claim 10, the scout agent having means for network connectivity and further comprising:
The scout agent typically will be an application or set of software programs installed in a data centre with network connectivity, either centralized or distributed, either fixed or mobile. The scout agent is manually configured to contact traffic sources, receives instructions from a flow monitoring process running in the network to contact certain popular traffic sources, or spiders across websites to detect and identify popular sources of for instance video and audio traffic. The scout agent further uses a scripted application to contact the traffic sources and collect the source information (address, ports and protocols) and application meta-information.
Further, the present invention also relates to the resulting source database as defined by claim 11, adapted to store upon instruction of a scout agent with network connectivity for a traffic source in a network:
The information that the scout agent 101 collects includes all the detailed information that is available to an application user. In other words, it contains all important cross-layer information of IP sources and applications, including, besides network address and protocol information, the application traffic profiles, information on the content delivered via the applications, and information on the companies that are involved in the full delivery chain of the application.
In more detail, the scout agent 101 learns the network information like IP addresses, ports and protocol information (UDP/TCP) of important applications sources like 104 or VIDEO APPL, content delivery networks like 103 or CDN, and servers or content sources. The IP addresses of the latter servers or content servers may for instance be learned from index sites (INDEX SITES), peer-to-peer trackers (P2P TRACKERS) and peer-to-peer applications (P2P APPL) 105 as is indicated by arrow 151 in
The scout agent 101 collects IP source information and application meta-information. The scout agent 101 is an application or set of software programs running in a data center with Internet connectivity. The scout agent 101 can be mobile or fixed, can be centralized or distributed over different geographical locations in the Internet, and may be event-driven or periodically triggered.
There are two processes that instruct the scout agent 101 what IP sources to contact and build application traffic profiles of: a manual process and an automated process. In the manual process, a user instructs or configures the scout agent 101 to contact certain popular video websites and content sources. In the automated process, the scout agent 101 receives automated instructions of important IP sources from a monitoring process that runs in the network and logs IP flow information, like for instance NetFlow, sFlow, IPFIX or cflowd. This monitoring process will discover what IP addresses of services are popular in the network. The scout agent 101 thereupon will translate the IP flow information into application level contact information (e.g. a web URL) of the service that was the source of the IP flow, using an Autonomous System Number database like 111 or ASN DB, i.e. a database that contains a mapping between IP address ranges, autonomous systems and organizations. The scout agent 101 further uses a scaled-down web-browser client to contact the application or service, and a scripted application client to contact services, for instance using a modified version of iTunes, iPlayer, etc. The scout agent 101 thus spiders across websites and servers to find out about links to videos.
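The mapping step from a flow's IP address to an organization via an Autonomous System Number database could be sketched as shown below. This is an illustrative sketch only: the `AsnEntry` type, the `lookup_asn` function and the longest-prefix-match choice are assumptions made here, the sketch handles IPv4 ranges only, and any address ranges, AS numbers and organization names passed to it are supplied by the caller rather than taken from a real ASN database.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class AsnEntry(NamedTuple):
    """One row of a hypothetical ASN database."""
    network: ipaddress.IPv4Network  # IP address range
    asn: int                        # autonomous system number
    organization: str               # organization owning the range


def lookup_asn(db: List[AsnEntry], address: str) -> Optional[AsnEntry]:
    """Return the most specific entry whose range contains the address.

    A linear scan is used for clarity; a real database would more likely
    use a prefix tree or a sorted index.
    """
    ip = ipaddress.ip_address(address)
    matches = [entry for entry in db if ip in entry.network]
    # Longest prefix wins when ranges overlap; None when nothing matches.
    return max(matches, key=lambda e: e.network.prefixlen, default=None)
```

The organization returned by such a lookup is what would let the scout agent move from a bare IP flow record to application-level contact information for the service behind it.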
As a result, the IP source database 100 shall contain all relevant cross-layer information about IP sources. For video sources, the IP source data in source database 100 may for instance be organized and associated as follows:
Although the embodiment focuses on video services, it will be appreciated by any person skilled in the art that similar type of information can be collected for any other type of service.
The IP source database 100 can be used to generate network management signals (e.g. SNMP traps) based on application traffic, route or police traffic based on policy rules derived from the IP source database 100, and correlate network flow information with the IP source database 100 to build detailed reports for operators or ISPs. Usage of the IP source database for these purposes is described in detail in a counterpart patent application of the same applicant entitled “Network Management Method and Agent” that is incorporated herein by reference.
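As a rough illustration of the reporting use, flow records exported by a monitoring process can be joined against the source database to attribute traffic volume to applications. The sketch below is an assumption-laden simplification: the flow record layout `(source IP, source port, protocol, bytes)`, the function name `traffic_report` and the use of a plain dictionary as the source database are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical flow record: (source IP, source port, protocol, byte count).
Flow = Tuple[str, int, str, int]
# Hypothetical source database view: (IP, port, protocol) -> application name.
SourceDb = Dict[Tuple[str, int, str], str]


def traffic_report(flows: Iterable[Flow], source_db: SourceDb) -> Dict[str, int]:
    """Total the bytes per application for the given flow records."""
    totals: Dict[str, int] = defaultdict(int)
    for src_ip, src_port, protocol, nbytes in flows:
        # Flows from sources the scout agent has not learned are grouped
        # under "unknown" rather than dropped.
        application = source_db.get((src_ip, src_port, protocol), "unknown")
        totals[application] += nbytes
    return dict(totals)
```

Because the join uses only header fields already present in flow records, it can run offline on exported data rather than in the data path.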
As will be explained in the following paragraph, the scout agent 201 contacts the Hulu server 202 (s.hulu.com) and logs all redirects that lead to the actual video server (80.154.118.29) that delivers the video stream. In other words, the scout agent 201 learns that a service is associated with a link (or URL) that leads to a video server 5-tuple (IP source address, IP destination address, IP source port, IP destination port, protocol). The scout agent 201 finds out upfront that some links on the Hulu website lead to video clips by monitoring incoming packets and traffic, by manual instruction or via an automated process. Such an automated process will detect that a link on a page is using semantics that indicate a video, e.g. the file type in the link or any other tag in the link. The scout agent 201 discovers that an incoming stream is video for instance by recognizing the encoding of the data.
As is indicated by arrow 211, the scout agent 201 with IP address 192.168.0.106 acts as a client and requests content info for the Daily Show episode from the Hulu server 202, s.hulu.com whose IP address 209.130.205.59 was learned through monitoring packets conveying video traffic or alternatively was configured manually. The Hulu server 202 knows only the URL of the Akamai CDN element 206 holding the requested content item, i.e. “cp47346.edgefcs.net”. Subsequently, the scout agent 201 needs to resolve this URL to an IP destination address. The scout agent 201 thereupon contacts the Domain Name Server or DNS 203 to resolve the URL “cp47346.edgefcs.net” of the Akamai CDN element 206. This is indicated by arrow 212 in
Just like with Hulu, the scout agent 301 learns that requests from a certain geo-location to a certain YouTube video clip will lead to the IP 5-tuple of a Google CDN video server. The scout agent 301 updates the IP source database continuously. This means that the scout agent continuously finds out about changes in the IP 5-tuple information and in the services that are delivered from these IP traffic sources.
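The continuous update described for Hulu and YouTube can be pictured as a compare-and-store step performed each time the scout agent revisits a link. The sketch below is only one possible reading: the `FiveTuple` type, the keying on (geo-location, link) and the function name `update_source` are assumptions introduced for illustration.

```python
from typing import Dict, NamedTuple, Tuple


class FiveTuple(NamedTuple):
    """IP 5-tuple in the order given in the description."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


# Hypothetical key: (geo-location of the requester, service link).
LinkKey = Tuple[str, str]


def update_source(db: Dict[LinkKey, FiveTuple], geo: str, link: str,
                  observed: FiveTuple) -> bool:
    """Record the 5-tuple observed for a link requested from a geo-location.

    Returns True when the entry is new or differs from what was stored,
    i.e. when the database actually changed.
    """
    key = (geo, link)
    changed = db.get(key) != observed
    if changed:
        db[key] = observed
    return changed
```

A True return value is the point at which an event-driven deployment could notify consumers of the database, such as a policy generator, that a traffic source has moved.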
Initially, the scout agent 301 with IP address 192.168.0.106 contacts the YouTube server 302 with IP address 208.65.153.253 and requests content info for the Daily Show episode. This is indicated by arrow 311 in
Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied with various changes and modifications without departing from the scope thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. In other words, it is contemplated to cover any and all modifications, variations or equivalents that fall within the scope of the basic underlying principles and whose essential attributes are claimed in this patent application. It will furthermore be understood by the reader of this patent application that the words “comprising” or “comprise” do not exclude other elements or steps, that the words “a” or “an” do not exclude a plurality, and that a single element, such as a computer system, a processor, or another integrated unit may fulfil the functions of several means recited in the claims. Any reference signs in the claims shall not be construed as limiting the respective claims concerned. The terms “first”, “second”, “third”, “a”, “b”, “c”, and the like, when used in the description or in the claims are introduced to distinguish between similar elements or steps and are not necessarily describing a sequential or chronological order. Similarly, the terms “top”, “bottom”, “over”, “under”, and the like are introduced for descriptive purposes and not necessarily to denote relative positions.
It is to be understood that the terms so used are interchangeable under appropriate circumstances and embodiments of the invention are capable of operating according to the present invention in other sequences, or in orientations different from the one(s) described or illustrated above.
| Number | Date | Country | Kind |
|---|---|---|---|
| 09305528.3 | Jun 2009 | EP | regional |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/EP10/58047 | 6/9/2010 | WO | 00 | 12/13/2011 |