Proxy media caching techniques typically use uniform resource locators (URLs) found in requests for media content to index and search media content in a cache. However, the same media content can be identified by multiple different URLs, which can cause the same media content to be stored multiple times in the same cache. For example, when different video files contain the same video content but are encoded in different formats (e.g., as a Flash video file, a Moving Picture Experts Group (MPEG) file, or a Windows Media Video (WMV) file), these video files are separately indexed and stored in the cache by their corresponding URLs. In addition, with the use of Content Delivery Networks (CDNs), a request for a specific media file may be translated into a different URL domain or path on each subsequent request. Consequently, conventional proxy media caching techniques often result in the same media content being stored as multiple different entries in the same proxy cache.
Systems and methods for proxy media caching are disclosed. A method for providing media content to a client device in accordance with an embodiment of the invention involves receiving at a proxy a response to a request for media content, generating a fingerprint from a sample of media content contained in the response, searching a cache using the fingerprint, and if a cache hit occurs, causing cached media content, which is associated with the cache hit, to be sent to the client device. This allows the cache to be organized according to fingerprints that are reflective of the actual media content, which simplifies cache organization and reduces the redundant caching of media content.
A method for providing media content to a client device in accordance with another embodiment of the invention involves receiving at a Hypertext Transfer Protocol (HTTP) proxy an HTTP response to an HTTP request for media content, generating a fingerprint from a sample of media content contained in the HTTP response, searching a cache using the fingerprint, and if a cache hit occurs, causing cached media content, which is associated with the cache hit, to be sent to the client device.
A proxy in accordance with an embodiment of the invention receives a response containing media content. The proxy includes a cache management module configured to generate a fingerprint from a sample of the received media content and to search a cache using the fingerprint. If a cache hit occurs, cached media content, which is associated with the cache hit, is sent to the client device.
A non-transitory computer readable medium in accordance with an embodiment of the invention stores program instructions executable by a processor, which when executed by the processor, perform the steps of receiving at a proxy a response to a request for media content from a client device, generating a fingerprint from a sample of media content contained in the response, searching a cache using the fingerprint, and if a cache hit occurs, causing cached media content, which is associated with the cache hit, to be sent to the client device.
Other aspects and advantages of embodiments of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrated by way of example of the principles of the invention.
Throughout the description, similar reference numbers may be used to identify similar elements.
It will be readily understood that the components of the embodiments as generally described herein and illustrated in the appended figures could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of various embodiments, as represented in the figures, is not intended to limit the scope of the present disclosure, but is merely representative of various embodiments. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by this detailed description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present invention should be or are in any single embodiment. Rather, language referring to the features and advantages is understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment. Thus, discussions of the features and advantages, and similar language, throughout this specification may, but do not necessarily, refer to the same embodiment.
Furthermore, the described features, advantages, and characteristics of the invention may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, in light of the description herein, that the invention can be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments of the invention.
Reference throughout this specification to “one embodiment,” “an embodiment,” or similar language means that a particular feature, structure, or characteristic described in connection with the indicated embodiment is included in at least one embodiment. Thus, the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
The client device 102 is any networked device including, without limitation, a mobile phone, a smart phone, a personal digital assistant (PDA), a tablet, a set-top box, a video player, a laptop, or a personal computer (PC). In one embodiment, the client device is a wireless device that can support at least one of various different radio frequency (RF) communications protocols, including without limitation, Global System for Mobile communications (GSM), Universal Mobile Telecommunications System (UMTS), Code Division Multiple Access (CDMA), Worldwide Interoperability for Microwave Access (WiMax) and communications protocols as defined by the 3rd Generation Partnership Project (3GPP) or the 3rd Generation Partnership Project 2 (3GPP2), 4G Long Term Evolution (LTE) and IEEE 802.16 standards bodies. Although some wireless communications protocols are identified herein, it should be understood that the present disclosure is not limited to the cited wireless communications protocols. In another embodiment, the client device is a wired device that communicates with the proxy 104 through a communication interface, such as analog modem, ISDN modem or terminal adapter, DSL modem, cable modem, Ethernet/IEEE 802.3 interface, or a combination thereof. In another embodiment, the client device is connected to the proxy via a combination of wireless and wired communication interfaces.
The proxy 104 is in the data communications path between the client device 102 and the origin server 108 and is implemented in, for example, a proxy server or gateway. In one embodiment, the proxy is a transparent proxy that passes requests and responses (e.g., HTTP requests and responses) between client devices such as the client device 102 and host servers such as the origin server 108 without modifying the requests and responses. A proxy that simply passes requests and responses is often referred to as a gateway or tunneling proxy. In another embodiment, the proxy is a non-transparent proxy that can modify requests and responses between client devices and host servers in order to provide additional services. For example, a non-transparent proxy may provide media caching services, group annotation services, media type transformation services, or protocol reduction services. In one embodiment, the proxy is an HTTP proxy that can parse HTTP requests and HTTP responses. In the embodiment depicted in
The proxy 104 is coupled to the cache 106, which may be located in the same server as the proxy or may be located in a physically separate computer system. The cache is a storage device and/or storage system that stores data. In the embodiment depicted in
In one embodiment, the proxy 104 and the cache 106 are part of an access network 114, which provides a communications interface for the client device 102 to access the Internet or an intranet. Typical access networks include wireless service provider networks (e.g., that offer 3G, 4G and/or WiFi access) and Internet Service Providers (ISPs, e.g., that offer dial-up, DSL, and cable modem access). A private enterprise network can also serve as the access network if client devices within the private enterprise network can access the Internet through the private enterprise network. In one embodiment, the access network is a wireless service provider network that provides a wireless communications interface for the client device. The wireless service provider network is accessible on a subscription basis (e.g., prepaid or post-paid) as is known in the field. In an embodiment, the wireless service provider network is a closed domain that is accessible only by a subscriber (e.g., a user of the client device) that is in good standing with the operator of the wireless service provider network. The wireless service provider network may include a radio access network (not shown) and an Internet gateway (not shown). The radio access network includes one or more base stations to facilitate communications among wireless devices that are within a communication range of the base stations. Each base station has at least one RF transceiver and the base stations communicate with the wireless devices using RF communication signals.
The radio access network facilitates network communications among multiple wireless devices within the same wireless service provider network and between wireless devices in other wireless service provider networks and provides interfaces to facilitate communications with other entities, such as a Public Switched Telephone Network (PSTN), a Wide Area Network (WAN), the Internet, Internet servers, hosts, etc., which are outside of the wireless service provider network. In an embodiment, the wireless service provider network is operated by a single wireless service provider, such as, for example, AT&T, VERIZON, T-MOBILE, or SPRINT. The Internet gateway (not shown) of the access network provides a gateway for communications between the client device 102 and Internet-connected hosts and/or servers, which can also be referred to as the “cloud.” The Internet gateway may include a Serving General Packet Radio Service (GPRS) Support Node (SGSN) and a Gateway GPRS Support Node (GGSN). For example, the Internet gateway can be a Wireless Application Protocol (WAP) gateway that converts the WAP protocol used by the access network (such as a wireless service provider network) to the HTTP protocol used by the Internet. In an embodiment, the Internet gateway enables the client device to access multimedia content, such as HTML, compact HTML (cHTML), and extensible HTML (xHTML), which is stored on Internet-connected hosts and/or servers. In this way, the access network provides access to the Internet for its subscribers.
The origin server 108 can be any device or system that hosts digital content, which can be stored in various formats, such as video files, audio files, and/or text files. In one embodiment, the origin server is an Internet-connected host or server that hosts Internet accessible content elements. The origin server may be a web server that can be accessed via, for example, HTTP, Internet Message Access Protocol (IMAP), or File Transfer Protocol (FTP). A content element is any set of digital data suitable for transfer in a networked environment, such as video files, markup language files, scripting language files, music files, image files or any other type of resource that can be located and addressed through, for example, the Internet.
Conventional proxy networks for wireless carriers or CDNs typically cache multiple different versions (e.g., different formats and/or different resolutions) of the same media content, which may include video content, audio content, image content, or other types of content, to accommodate the configurations and needs of various client devices. For example, conventional URL-based proxy cache systems will store multiple different versions of the same video content because each different version of the video content is identified by a different URL. Using this approach, limited cache resources are often consumed by redundant media content. In accordance with an embodiment of the invention, content fingerprints generated from media content are used to index the media content in a cache and to search the cache so that cached media content can be identified regardless of what version of the media content is received at the proxy. For example, video fingerprints generated from video content are used to index the video content in a cache and to search the cache so that cached video content can be identified regardless of what version of the video content is received at the proxy. Accordingly, media fingerprint caching can reduce the redundant storage of media content (e.g., video content) at a proxy, can reduce the load on a proxy backhaul network, and can provide a better user experience for users of wireless carriers and CDNs. An embodiment of a video fingerprint caching technique is described in more detail below.
Referring now to
Upon receiving the HTTP response, the proxy 104 determines if the HTTP response contains video content. For example, the proxy can check the content-type field of the response header. Once it is determined that the HTTP response contains video content, the cache management module 110 generates a video fingerprint from the video content and searches the cache 106 using the fingerprint. In an embodiment, the video fingerprint is an identifier that is generated from a sample of the video content that is carried in the HTTP response. In an embodiment, metadata related to the video content is used along with the video content sample to generate the video fingerprint. The video fingerprint is then used in a cache search operation to see if there is any cached video content that matches the fingerprint.
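The content-type check described above can be illustrated with a minimal sketch. The function name `is_video_response` and the representation of the response header as a plain dictionary are assumptions made for illustration only; the specification does not prescribe how the header is parsed.

```python
from typing import Dict


def is_video_response(headers: Dict[str, str]) -> bool:
    """Return True if the Content-Type header names a video media type.

    Hypothetical helper: header names are matched case-insensitively, and
    any parameters after ';' are ignored.
    """
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Keep only the media type, e.g. "video/mp4" from "video/mp4; codecs=...".
            media_type = value.split(";", 1)[0].strip().lower()
            return media_type.startswith("video/")
    # No Content-Type header present: treat the response as non-video.
    return False
```

A proxy built along these lines would invoke the fingerprint step only for responses for which this check returns true.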
Whether cached video content is sent to the client device depends on whether a cache hit or a cache miss occurs. Likewise, whether the video content contained in the HTTP response is indexed by the fingerprint and stored in the cache 106 depends on whether a cache hit or a cache miss occurs. A cache hit occurs if the fingerprint matches an entry in the cache, for example, a previously indexed fingerprint. If a cache hit occurs, cached video content that is stored in the database 112 is sent to the client device 102. Cached video content can be sent to the client device via the cache management module 110 or through another communications path such as directly from the cache.
When a cache hit occurs and the cached video content and the requested video content have different formats, the cache management module 110 can send the cached video content to the client device 102 if the client device can support the format in which the cached video content is encoded. Alternatively, the cache management module 110 can dynamically re-multiplex or transcode the cached video content on the fly to a format that is supported by the client device or to the format specified in the HTTP request. If the cache management module 110 cannot dynamically re-multiplex or transcode the cached video content, the cache management module 110 can request that the origin server 108 return video content in a format that is supported by the client device. Alternatively, the cache management module 110 can request that the origin server return video content that is in the format specified in the HTTP request.
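The format handling on a cache hit can be summarized as a three-way decision. The sketch below is illustrative only: the `Action` enumeration, the `choose_action` function, and the use of short format labels such as "flv" are hypothetical names, not elements of the described embodiments.

```python
from enum import Enum
from typing import Set


class Action(Enum):
    SEND_CACHED = "send cached content as-is"
    TRANSCODE = "re-multiplex or transcode cached content"
    REFETCH = "request a supported format from the origin server"


def choose_action(cached_format: str, client_formats: Set[str],
                  can_transcode: bool) -> Action:
    """Pick how to satisfy a cache hit when formats may differ."""
    if cached_format in client_formats:
        # The client can play the cached encoding directly.
        return Action.SEND_CACHED
    if can_transcode:
        # Convert the cached copy on the fly to a supported format.
        return Action.TRANSCODE
    # Fall back to the origin server for a supported format.
    return Action.REFETCH
```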
If a cache miss occurs, the video content in the received HTTP response can be sent to the client device 102 from the proxy 104. Additionally, if a cache miss occurs, a copy of the video content in the HTTP response can be stored in the cache and indexed by its fingerprint for future use. In an embodiment, additional criteria may be evaluated to determine whether or not to cache the video content. For example, a measure of the popularity of the video content may be used to determine whether or not to cache the video content.
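The hit and miss handling described above can be sketched with an in-memory dictionary standing in for the cache 106. The class name `FingerprintCache` and its `serve` method are hypothetical, and the sketch always stores content on a miss, omitting optional criteria such as popularity.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FingerprintCache:
    # Maps a fingerprint string to the cached video bytes.
    entries: Dict[str, bytes] = field(default_factory=dict)

    def serve(self, fingerprint: str, response_body: bytes) -> Tuple[bytes, bool]:
        """Return (content to send to the client, whether a cache hit occurred)."""
        cached = self.entries.get(fingerprint)
        if cached is not None:
            # Cache hit: serve the stored copy instead of the response body.
            return cached, True
        # Cache miss: index the response body by its fingerprint and forward it.
        self.entries[fingerprint] = response_body
        return response_body, False
```

In this sketch, a second response that yields the same fingerprint is answered with the first stored copy, even if its body bytes differ.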
In an embodiment, a TCP connection between the proxy 104 and the origin server 108 is established in order to communicate the video content between the proxy and the origin server. When a cache hit occurs and cached video content is served from the proxy, the cache management module 110 can terminate the TCP connection or cause the termination of the TCP connection. Terminating the TCP connection when the video content is served from the proxy helps to preserve connection resources at the proxy.
As described above, a video fingerprint is generated from video content that is contained in a response (e.g., an HTTP response) and is used to search the cache 106. Because multiple different versions of the same video content (e.g., different formats and/or different resolutions) can have matching video fingerprints, the video fingerprint technique allows the detection of cached video content even if a different version of the video content is received at the proxy 104.
In operation, the fingerprint generator 516 receives a sample of the video content in the HTTP response. Examples of the video content include, without limitation, a Flash video (e.g., FLV) file, an MPEG file, and a WMV file. The fingerprint generator generates the fingerprint from the sample of the video content, such as a sample of an encoded video frame of the video content, which is carried in the body portion of the HTTP response. In an example, the video fingerprint is a string of data, such as a hexadecimal string, that is used as a key to index video content in the cache 106 and to perform cache lookups. The sample of the video content may be a sample of at least one encoded video frame of the video content. For example, the sample of the video content may include one or more selected video frames of the video content. In an embodiment, the fingerprint generator selects a configurable number of bytes (e.g., 100 k bytes) of compressed video frame data from the video content and generates the video fingerprint from the selected bytes. Compressed video frame data refers to raw video image data that has not been decoded into a displayable video image. Voice samples can be used in addition to video frame data to generate a video fingerprint. In an embodiment, the fingerprint generator selects a configurable number of bytes (e.g., 100 k bytes) of compressed video frame data and voice samples from the video content and generates the video fingerprint from the selected bytes. Although 100 k bytes of data is given as an example, the amount of video content included in the sample is configurable. In addition, the location from which the sample is obtained is configurable.
The location from which the sample is obtained may be determined with respect to time, e.g., by a time stamp or a time offset (e.g., several milliseconds) from a time stamp, or with respect to a number of bytes, e.g., the first 100 k bytes of video content or 100 k bytes of video content starting from a known byte offset (e.g., 32 k bytes into the body portion of the HTTP response). In an embodiment, a configurable amount of video data sampled at a particular location of the returned video content (e.g., the first 100 k bytes of video content) is used to generate a video fingerprint. The generated video fingerprint is compared to an index of previously stored video fingerprints that were generated from received video content, which was sampled at the same location of each received video clip. For example, 100 k bytes of video data taken from the beginning (e.g., from a time stamp of zero) of a sports highlight clip carried in the response can be used to generate a video fingerprint. The generated video fingerprint is compared to an index of previously stored video fingerprints that were generated from 100 k bytes of video data that was taken from the beginning (e.g., from a time stamp of zero) of previously received video clips. In an embodiment, the samples that are used to generate video fingerprints are taken at the same location for each HTTP response that includes video content.
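As a minimal sketch of byte-offset sampling, the following function takes a configurable number of bytes from a configurable offset and reduces them to a hexadecimal string. The function name `make_fingerprint`, the choice of SHA-256, and the reading of "100 k bytes" as 100 × 1024 bytes are all assumptions; the specification does not name a particular hash algorithm. The sketch also hashes whatever bytes it is given, so extracting the compressed video frame data from its container, which the described embodiments rely on, is outside its scope.

```python
import hashlib


def make_fingerprint(video_data: bytes, offset: int = 0,
                     sample_size: int = 100 * 1024) -> str:
    """Return a hexadecimal fingerprint of a fixed-location sample.

    SHA-256 is an assumed stand-in for the hash function; `offset` and
    `sample_size` model the configurable sample location and amount.
    """
    sample = video_data[offset:offset + sample_size]
    return hashlib.sha256(sample).hexdigest()
```

Because the sample is always taken at the same location, data that follows the sampled range does not affect the resulting fingerprint.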
As illustrated in
In an embodiment, instead of using metadata in the header portion of the HTTP response to generate a video fingerprint, metadata in the header portion of the HTTP response is used as an extra criterion for determining if there is a video content match. In this embodiment, the video fingerprint is generated from the sample of the video content in the body portion and not from metadata in the header portion. If the video fingerprint matches a previously indexed video fingerprint (e.g., identical hexadecimal strings), metadata in the header portion of the HTTP response can be compared with metadata in the cache that is associated with the previously indexed video fingerprint as an extra criterion to determine if a match is appropriate. If the difference between the metadata in the header portion of the HTTP response and the metadata in the cache is within an acceptable threshold, it is determined that cached video content matches the returned video content. For example, when the metadata is the value in the “content-length” field, a match may be appropriate if the length of the cached video content and the length of the received video content are within an acceptable threshold, e.g., within 5% of each other.
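The content-length criterion can be sketched as follows. The function name `lengths_within_threshold` is hypothetical, and measuring the difference relative to the larger of the two lengths is an assumption, since the description states only that the lengths are within, e.g., 5% of each other.

```python
def lengths_within_threshold(cached_length: int, received_length: int,
                             threshold: float = 0.05) -> bool:
    """Return True if two content lengths differ by at most `threshold`.

    The difference is measured relative to the larger length (an assumed
    convention). Non-positive lengths never match.
    """
    if cached_length <= 0 or received_length <= 0:
        return False
    larger = max(cached_length, received_length)
    return abs(cached_length - received_length) / larger <= threshold
```

In such a scheme, a fingerprint match would be accepted as a cache hit only when this secondary check also passes.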
In an embodiment, a cache hit occurs if a video fingerprint matches an entry in the cache 106. For example, a video fingerprint matches a previously stored video fingerprint if the two video fingerprints have the identical digital value, e.g., identical hexadecimal string.
Content providers may store multiple different versions of the same video content (e.g., a sports highlight or a news broadcast) on their origin servers. The different versions are typically characterized by different formats and/or different resolutions that are provided to different client devices and/or different bandwidth environments, such as dial-up, DSL, cable modem access, 3G, 4G, WiFi, etc. Using a video fingerprint generated from video content to index video content in a cache and to search the cache allows the detection of cached video content regardless of what version of the video content is received at the proxy 104. For example, using the video fingerprint technique, two different versions of the same video content, e.g., one formatted in MPEG and the other formatted in Flash, can produce matching video fingerprints. In contrast, a conventional URL-based proxy cache system will store multiple different versions of the same video content (e.g., different formats and/or different resolutions) because each different version of the video content is identified by a different URL. However, because different versions of the same video can produce matching fingerprints, a single copy of a particular video can be stored in the cache and indexed by its fingerprint. Subsequent cache searches with a matching video fingerprint will produce a cache hit even if the format of the returned video content does not match the format of the cached video content. If a different version of a particular video is needed, cached video can be transcoded on the fly and served to a client. Therefore, using a video fingerprint generated from video content to index video content in a cache and to search the cache can eliminate the need to store multiple different versions of the same video in a proxy cache. As a result, the video fingerprint technique can reduce cache volume at the proxy and thereby allow a wider variety of content to be stored in the cache 106.
In addition, the video fingerprint technique can reduce the load on a proxy backhaul network by eliminating redundant downloads of videos. Furthermore, the video fingerprint technique can provide a better user experience to clients because more content can be served directly from the proxy cache.
Although
At step 604, the proxy 104 receives an HTTP response that the origin server 108 sends in response to the HTTP request. In this case, the HTTP response contains a header portion and a body portion, which includes the video content identified in the HTTP request. In the embodiment depicted in
At step 606, the proxy 104 creates a fingerprint based on video data of the returned video content or a combination of video data of the returned video content and metadata in the response, as described above with respect to
At step 608, the proxy 104 performs a cache lookup in the cache 106 using the generated fingerprint. The cache lookup is, for example, performed by comparing the generated fingerprint to an index of previously stored fingerprints, as will be described below in more detail with regard to
After the cache lookup at step 608, the proxy 104 receives a response from the cache 106 indicating whether or not a matching cached video content has been found for the generated fingerprint. The cached video content may be encoded in the same format as the returned video content of the HTTP response or the cached video content may be encoded in a format that is different from the returned video content. A positive response indicates that a cache hit has occurred. If the response is positive, the HTTP response can be fulfilled from the proxy and the requested video content is sent from the proxy to the client device 102. A negative response indicates that a cache miss has occurred. If the response is negative, i.e., the video content element represented by the video fingerprint is not cached, the HTTP response is sent to the client device from the proxy. For example, the entire video content element is downloaded from the origin server and forwarded to the requesting client device. It may also be necessary to download a newer version of the requested video content even if the video content element is present in the cache if it is determined that the video content element in the cache is out of date. For example, the content element in the cache may have an expiration date that is used to determine the validity of the entry. In an embodiment, the downloaded video content element is placed into the cache.
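The lookup outcome, including the expiration check, can be sketched as below. The `CacheEntry` type, the `lookup` function, and the use of an absolute expiration time in seconds are illustrative assumptions; the description states only that an entry may have an expiration date that is used to determine its validity.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    content: bytes
    expires_at: float  # assumed representation: seconds since the epoch


def lookup(index: Dict[str, CacheEntry], fingerprint: str,
           now: Optional[float] = None) -> Optional[bytes]:
    """Return cached content on a valid hit, or None on a miss or stale entry."""
    current = time.time() if now is None else now
    entry = index.get(fingerprint)
    if entry is None:
        return None  # cache miss: the response is forwarded from the origin
    if entry.expires_at <= current:
        # Entry is out of date: drop it so a newer version can be cached.
        del index[fingerprint]
        return None
    return entry.content
```

A return value of None corresponds to the negative response described above, in which case the content downloaded from the origin server is forwarded to the client device and may be placed into the cache.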
As shown in
Although some of the embodiments of the invention are described with respect to video caching techniques, the above-described video caching techniques can also be applied to other media types, including, for example, audio and/or image media. Although the operations of the method(s) herein are shown and described in a particular order, the order of the operations of each method may be altered so that certain operations may be performed in an inverse order or so that certain operations may be performed, at least in part, concurrently with other operations. In another embodiment, instructions or sub-operations of distinct operations may be implemented in an intermittent and/or alternating manner.
It should also be noted that at least some of the operations for the methods may be implemented using software instructions stored on a computer useable storage medium for execution by a computer. As an example, an embodiment of a computer program product includes a computer useable storage medium to store a computer readable program that, when executed on a computer, causes the computer to perform operations, as described herein.
Furthermore, embodiments of at least portions of the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The computer-useable or computer-readable medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device), or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include a compact disk with read only memory (CD-ROM), a compact disk with read/write (CD-R/W), and a digital versatile disk (DVD).
In an embodiment, the functionalities of the proxy 104, the cache management modules 110, 510, the fingerprint generator 516, and/or the hash function unit 518 are performed by a computer that executes computer readable instructions.
In the above description, specific details of various embodiments are provided. However, some embodiments may be practiced with less than all of these specific details. In other instances, certain methods, procedures, components, structures, and/or functions are described in no more detail than to enable the various embodiments of the invention, for the sake of brevity and clarity.
Although specific embodiments of the invention have been described and illustrated, the invention is not to be limited to the specific forms or arrangements of parts so described and illustrated. The scope of the invention is to be defined by the claims appended hereto and their equivalents.
This application is entitled to the benefit of provisional U.S. Patent Application Ser. No. 61/620,315, filed Apr. 4, 2012, which is incorporated herein by reference.
Number | Date | Country
---|---|---
61620315 | Apr 2012 | US