1. Field
Example embodiments relate generally to caching media, and particularly to providing hierarchical caching for online media.
2. Related Art
Online video is presently the largest component, or one of the largest components, of internet traffic. Caching such video places additional requirements on traditional caching algorithms. Hit-rate has traditionally been the key performance metric used to compare such algorithms. Increasingly, newer caches use solid-state drive (SSD) memory for storage due to its access speed and reliability. SSD memory, however, may support only a limited number of write cycles over its lifetime. Replacement rate, or the number of write cycles per request, is therefore another key performance metric that may be of interest with such memory. According to some online video usage data, a significant part of online video is used to rebroadcast recent TV shows, which are initially quite popular but rapidly drop in popularity. How well a caching algorithm performs under such rapid changes in popularity is another new performance metric. Online caching typically stores video in small chunks of between 2 and 10 seconds each. Further, each video may be encoded at multiple video quality levels, further increasing the number of chunks per video. In all, online video can increase the number of files that can be requested by end users in a caching system a thousandfold or more.
Many conventional caching algorithms proposed in research over the last 30 years have not been implemented in commercial caches. Prominent examples of schemes that have been implemented include simple schemes like least recently used (LRU) and a few others like greedy dual size frequency (GDSF) and least frequently used with dynamic aging (LFU-DA).
According to one or more example embodiments, a method includes receiving, at a first cache device, a request to send a first asset to a second device; determining whether the first asset is stored at the first cache device; when the determining whether the first asset is stored at the first cache device indicates that the first asset is not stored at the first cache device, obtaining, at the first cache device, the first asset, performing a comparison operation based on an average inter-arrival time of the first asset with respect to the first cache device and a characteristic time of the first cache device, the characteristic time of the first cache device being an average period of time assets cached at the first cache device are cached before being evicted from the first cache device, and determining whether or not to cache the obtained first asset at the first cache device based on the comparison; and sending the obtained first asset to the second device.
The first cache may be one of a plurality of cache devices of a content distribution network, the plurality of cache devices being divided into a plurality of hierarchy levels, and the method may further include for each cache device of each hierarchy level, from among the plurality of cache devices divided into the plurality of hierarchy levels, determining, with respect to the cache device, an average inter-arrival time of a requested asset, when a request for the requested asset is received, and determining a characteristic time of the cache device, when the cache evicts an asset.
The method may further include determining the characteristic time of the first cache device by, initializing the characteristic time of the first cache device as a value which is higher than the initialized value of the average inter-arrival time of the first asset with respect to the first cache device, and updating the characteristic time of the first cache device based on exponentially weighted moving average of periods of time assets cached at the first cache device are cached before being evicted from the first cache device in accordance with a least recently used (LRU) cache eviction policy.
The method may further include gently increasing the characteristic time of the first cache device by applying a gentle increase operation on the characteristic time of the first cache device, when the first cache evicts an asset.
The method may further include determining the average inter-arrival time of the first asset with respect to the first cache device by, initializing the average inter-arrival time of the first asset with respect to the first cache device as a value lower than the initialized value of the characteristic time of the first cache device, and updating the average inter-arrival time of the first asset with respect to the first cache device based on exponentially weighted moving average of periods of time between consecutive receipts of requests, at the first cache device, to send the first asset to another device.
The method may further include assigning the first asset to a first database, when an initial request for the first asset is received at the first cache device, when a second request is received consecutively with respect to the initial request, determining an inter-arrival time of the first asset based on times at which the initial and second requests were received at the first cache device, and assigning the first asset to an inter-arrival time database, the inter-arrival time database storing arrival times of requests corresponding to assets, the inter-arrival time database being different than the first database; and demoting the first asset from the inter-arrival time database to the first database when, the inter-arrival time of the first asset becomes greater than a reference value, or the inter-arrival time of the first asset is the largest inter-arrival time among inter-arrival times of assets currently assigned to the inter-arrival time database at a point in time when a new asset is added to the inter-arrival time database and a total number of the assets currently assigned to the inter-arrival time database is greater than a database capacity value.
According to one or more example embodiments, a first cache device may include a processing unit including a processor, the first cache device being programmed to perform, with the processor, operations including, receiving, at the first cache device, a request to send a first asset to a second device; determining whether the first asset is stored at the first cache device; when the determining whether the first asset is stored at the first cache device indicates that the first asset is not stored at the first cache device, obtaining, at the first cache device, the first asset, performing a comparison operation based on an average inter-arrival time of the first asset with respect to the first cache device and a characteristic time of the first cache device, the characteristic time of the first cache device being an average period of time assets cached at the first cache device are cached before being evicted from the first cache device, and determining whether or not to cache the obtained first asset at the first cache device based on the comparison; and sending the obtained first asset to the second device.
The operations the first cache is programmed to perform may further include, determining, with respect to the first cache device, an average inter-arrival time of a requested asset, when a request for the requested asset is received, and determining a characteristic time of the first cache device, when the cache evicts an asset.
The operations the first cache is programmed to perform may further include, determining the characteristic time of the first cache device by, initializing the characteristic time of the first cache device as a value which is higher than the initialized value of the average inter-arrival time of the first asset with respect to the first cache device, and updating the characteristic time of the first cache device based on exponentially weighted moving average of periods of time assets cached at the first cache device are cached before being evicted from the first cache device in accordance with a least recently used (LRU) cache eviction policy.
The updating the characteristic time of the first cache device further includes, gently increasing the characteristic time of the first cache device by applying a gentle increase operation on the characteristic time of the first cache device, when the first cache evicts an asset.
The operations the first cache is programmed to perform may further include determining the average inter-arrival time of the first asset with respect to the first cache device by, initializing the average inter-arrival time of the first asset with respect to the first cache device as a value lower than the initialized value of the characteristic time of the first cache device, and updating the average inter-arrival time of the first asset with respect to the first cache device based on exponentially weighted moving average of periods of time between consecutive receipts of requests, at the first cache device, to send the first asset to another device.
The operations the first cache is programmed to perform may further include, assigning the first asset to a first database, when an initial request for the first asset is received at the first cache device, when a second request is received consecutively with respect to the initial request, determining an inter-arrival time of the first asset based on times at which the initial and second requests were received at the first cache device, and assigning the first asset to an inter-arrival time database, the inter-arrival time database storing arrival times of requests corresponding to assets, the inter-arrival time database being different than the first database; and demoting the first asset from the inter-arrival time database to the first database when, the inter-arrival time of the first asset becomes greater than a reference value, or the inter-arrival time of the first asset is the largest inter-arrival time among inter-arrival times of assets currently assigned to the inter-arrival time database at a point in time when a new asset is added to the inter-arrival time database and a total number of the assets currently assigned to the inter-arrival time database is greater than a database capacity value.
According to one or more example embodiments, a content distribution system may include a plurality of first cache devices, the plurality of first cache devices being divided into a plurality of hierarchy levels, each of the plurality of first cache devices being programmed to perform a first caching operation, respectively, such that, for each one of the plurality of first cache devices, the first caching operation includes, receiving, at the first cache device, a request to send a first asset to a second device; determining whether the first asset is stored at the first cache device; when the determining whether the first asset is stored at the first cache device indicates that the first asset is not stored at the first cache device, obtaining, at the first cache device, the first asset, performing a comparison operation based on an average inter-arrival time of the first asset with respect to the first cache device and a characteristic time of the first cache device, the characteristic time of the first cache device being an average period of time between receipt of last requests for, and eviction of, assets cached at the first cache device, and determining whether or not to cache the obtained first asset at the first cache device based on the comparison; and sending the obtained first asset to the second device.
According to one or more example embodiments, a method of operating a content distribution network, the content distribution network including a plurality of first cache devices, the plurality of first cache devices being divided into a plurality of hierarchy levels, may include performing a first caching operation for each of the plurality of first cache devices, respectively, such that, for each one of the plurality of first cache devices divided into the plurality of hierarchy levels, the first caching operation includes, receiving, at the first cache device, a request to send a first asset to a second device; determining whether the first asset is stored at the first cache device; when the determining whether the first asset is stored at the first cache device indicates that the first asset is not stored at the first cache device, obtaining, at the first cache device, the first asset, performing a comparison operation based on an average inter-arrival time of the first asset with respect to the first cache device and a characteristic time of the first cache device, the characteristic time of the first cache device being an average period of time between receipt of last requests for, and eviction of, assets cached at the first cache device, and determining whether or not to cache the obtained first asset at the first cache device based on the comparison; and sending the obtained first asset to the second device.
At least some example embodiments will become more fully understood from the detailed description provided below and the accompanying drawings, wherein like elements are represented by like reference numerals, which are given by way of illustration only and thus are not limiting of example embodiments and wherein:
Various example embodiments will now be described more fully with reference to the accompanying drawings in which some example embodiments are shown.
Detailed illustrative embodiments are disclosed herein. However, specific structural and functional details disclosed herein are merely representative for purposes of describing at least some example embodiments. Example embodiments may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.
Accordingly, while example embodiments are capable of various modifications and alternative forms, embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit example embodiments to the particular forms disclosed, but on the contrary, example embodiments are to cover all modifications, equivalents, and alternatives falling within the scope of example embodiments. Like numbers refer to like elements throughout the description of the figures. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
It will be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may be present. In contrast, when an element is referred to as being “directly connected” or “directly coupled” to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., “between” versus “directly between”, “adjacent” versus “directly adjacent”, etc.).
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises”, “comprising,”, “includes” and/or “including”, when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may in fact be executed substantially concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
Exemplary embodiments are discussed herein as being implemented in a suitable computing environment. Although not required, exemplary embodiments will be described in the general context of computer-executable instructions, such as program modules or functional processes, being executed by one or more computer processors or CPUs. Generally, program modules or functional processes include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
The program modules and functional processes discussed herein may be implemented using existing hardware in existing communication networks. For example, program modules and functional processes discussed herein may be implemented using existing hardware at existing network elements or control nodes (e.g., an eNB shown in
In the following description, illustrative embodiments will be described with reference to acts and symbolic representations of operations (e.g., in the form of flowcharts) that are performed by one or more processors, unless indicated otherwise. As such, it will be understood that such acts and operations, which are at times referred to as being computer-executed, include the manipulation by the processor of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the memory system of the computer, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art.
The end user 110 may be embodied by, for example, an electronic user device examples of which include a mobile device, smart phone, laptop, tablet, or a personal computer. The end user 110 is capable of receiving content stored at the origin server 140 via CDN 130. The end user 110, the CDN 130 and the origin server 140 may be connected to each other through, for example, the internet.
The CDN 130 includes caches 135A˜F. Caches 135A˜F each include storage for storing media content. Caches 135A˜F may be embodied, together, in groups, or individually in, for example, servers, routers, or wireless communications network components including, for example, base stations (BSs), evolved node Bs (eNBs), or radio network controllers (RNCs). Though CDN 130 is illustrated as only including six caches 135A˜F, the CDN 130 may include any number of caches. Further, in the example illustrated in
The origin server 140 is a server that provides content in response to content requests. For example, the origin server 140 may store content corresponding to one or more videos which may be requested by the end user 110 for streaming. In this case, the origin server 140 may receive content requests associated with the particular videos, for example from a cache within the CDN 130, and the origin server 140 may respond to the requests by providing the requested content. Though, for the purpose of simplicity, only one origin server 140 is illustrated, data network 100 may include any number of origin servers.
The caches in the CDN 130 may be organized in a hierarchical cache structure.
As used herein, the term “asset” refers to data that may be stored in a cache or provided by an origin server, and may be requested by a user. For example, with respect to online video, an example of an asset is a 2-10 second chunk of video data stored at an origin server, that may be requested by a user and may be stored at one or more caches.
According to one or more example embodiments, the caches in the CDN 130 may be organized in the hierarchical cache structure shown in
An example structure of the network elements of data network 100 will now be discussed below with reference to
Referring to
The transmitting unit 252, receiving unit 254, memory unit 256, and processing unit 258 may send data to and/or receive data from one another using the data bus 259.
The transmitting unit 252 is a device that includes hardware and any necessary software for transmitting signals including, for example, control signals or data signals via one or more wired and/or wireless connections to other network elements in data network 100.
The receiving unit 254 is a device that includes hardware and any necessary software for receiving signals including, for example, control signals or data signals via one or more wired and/or wireless connections to other network elements in the data network 100.
The memory unit 256 may be any device capable of storing data including magnetic storage, flash storage, etc.
The processing unit 258 may be any device capable of processing data including, for example, a processor.
According to at least one example embodiment, any operations described herein, for example with reference to any of
Examples of the network element 251 being programmed, in terms of software, to perform any or all of the functions described herein as being performed by any of a user, a cache, or server described herein will now be discussed below. For example, the memory unit 256 may store a program including executable instructions corresponding to any or all of the operations described herein with reference to
Examples of the network element 251 being programmed, in terms of hardware, to perform any or all of the functions described herein as being performed by a user, a cache, or server will now be discussed below. Additionally or alternatively to executable instructions corresponding to the functions described with reference to
An overview of hierarchical caching according to one or more example embodiments will now be discussed below.
With respect to many conventional caching algorithms, some of the impediments to practical use may have been a perceived complexity of implementation of the algorithms which can impede throughput performance and an expectation that the value of additional caching performance gains such as increased hit-rate may be outweighed by factors such as ability to adapt to asset popularity changes.
One or more example embodiments use new caching algorithms for online video. The new caching algorithms have low implementation complexity and have improved hit-rate and replacement rate. The new caching algorithms have been evaluated based on simulations that involve typical popularity of assets and realistic changes using video on-demand (VoD) statistics. Some studies have indicated that many assets are quite popular in the beginning of their lifecycle, but drop in popularity at an exponential rate and are at a fraction of their initial popularity within days after their introduction. Based on these, not only hit-rate, but also byte hit-rate and replacement rate are used to evaluate caching algorithms according to one or more example embodiments. Though one or more example embodiments of caching algorithms are described herein with reference to caching online video, one or more example embodiments described herein for caching online video may also apply to other types of media transferred through communications networks like the internet including, for example, audio, pictures, video games, and 3-D objects and models.
A method of providing hierarchical caching according to one or more example embodiments will now be discussed with respect to
According to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that works in a local manner in a given cache without the need for global information. Further, according to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that is an O(1) algorithm with respect to the number of assets or nodes.
According to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that is relatively quick to respond to changes in popularity when previously very popular assets drop their value quickly.
According to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that provides improved hit-rate performance.
According to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that does not use the size of an asset in the decision to evict as this may cause undesirable behaviour for online video.
According to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that runs at each cache in a given hierarchy independent of the algorithm in other caches. Each cache estimates the inter-arrival time of each asset locally. Each cache also calculates its own characteristic time, which is defined as the average time an asset stays in cache before it is evicted. For example, the characteristic time of a cache may be determined by the cache in accordance with known methods using LRU, by determining the average of several eviction times corresponding to several assets where, for each asset, the eviction time for that asset may be determined when the cache is about to evict the asset, as the difference between the current time and the time that asset was last requested. One or more example embodiments of the above-referenced caching algorithm will now be discussed in greater detail below.
According to one or more example embodiments, a method of providing hierarchical caching uses a caching algorithm that evicts assets in accordance with the known LRU scheme. For example, according to one or more example embodiments, once the caching algorithm determines an asset is to be evicted from a cache, the asset selected for eviction is the least recently used asset.
Table 1 below describes variables that are defined for each asset.
Table 2 below describes variables that are defined for each cache.
Table 3 below describes variables that are defined for the whole cache hierarchy.
Table 4 below describes initial conditions that, according to one or more example embodiments, are set before the caching algorithm used by the method of providing hierarchical caching illustrated in
Equation (1) below illustrates the manner in which TCjk, the characteristic time value of a cache (j,k), is calculated.
$$TC_{jk} = w_{TC} \times (TS - P_{ijk}) + (1 - w_{TC}) \times TC_{jk} \quad (1)$$
Equation (1) calculates the characteristic time TCjk as an exponential moving average of the times assets stay in cache (j,k) without being requested before being evicted from the cache (j,k), for example, in accordance with an LRU cache eviction policy. For example, the characteristic time TCjk may be calculated as an exponential moving average of the times between receipt of the last requests for assets in cache (j,k) and eviction of those assets from cache (j,k). According to one or more example embodiments, in accordance with known LRU methods for operating caches, the value Pijk is updated to the current time by the LRU function itself whenever a cache receives a request for the ith asset.
Equation (2) below illustrates the manner in which the average inter-arrival time of asset i for a cache (j,k), Tijk, is calculated. According to one or more example embodiments, the value Tijk may be calculated whenever an asset is requested from any cache, whether it is a leaf cache or a higher layer cache.
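By way of non-limiting illustration, the update of equation (1) may be sketched in Python as follows. The function name, the argument names, and the weight value used in the example call are hypothetical and are chosen only for illustration; no particular weight value is implied by the example embodiments.

```python
def update_characteristic_time(tc, now, last_request_time, w_tc):
    """Return the updated characteristic time TC_jk per equation (1).

    tc                -- current characteristic time TC_jk
    now               -- current time TS, taken when an asset is evicted
    last_request_time -- P_ijk, the time the evicted asset was last requested
    w_tc              -- weight w_TC of the exponential moving average
    """
    eviction_age = now - last_request_time  # TS - P_ijk
    return w_tc * eviction_age + (1.0 - w_tc) * tc


# Hypothetical values: an asset last requested at t=940 is evicted at t=1000.
new_tc = update_characteristic_time(tc=100.0, now=1000.0,
                                    last_request_time=940.0, w_tc=0.25)
```

A larger weight makes the characteristic time follow recent evictions more closely, while a smaller weight smooths the estimate over more evictions.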
$$T_{ijk} = w_{IA} \times (TS - P_{ijk}) + (1 - w_{IA}) \times T_{ijk} \quad (2)$$
Equation (2) calculates an exponential moving average of the inter-arrival time of asset i using the weight wIA.
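By way of non-limiting illustration, the per-asset update of equation (2) may be sketched in Python as follows. The class name `AssetStats`, the function name `record_request`, and the numeric values are hypothetical. The sketch assumes that the value Pijk is set to the current time immediately after the value Tijk is updated.

```python
from dataclasses import dataclass


@dataclass
class AssetStats:
    avg_inter_arrival: float  # T_ijk
    last_request_time: float  # P_ijk


def record_request(stats, now, w_ia):
    """Update T_ijk per equation (2), then set P_ijk to the current time TS."""
    gap = now - stats.last_request_time  # TS - P_ijk
    stats.avg_inter_arrival = w_ia * gap + (1.0 - w_ia) * stats.avg_inter_arrival
    stats.last_request_time = now
    return stats.avg_inter_arrival


# Hypothetical values: requests arrive at t=120 and t=130 for the same asset.
stats = AssetStats(avg_inter_arrival=40.0, last_request_time=100.0)
first = record_request(stats, now=120.0, w_ia=0.5)
second = record_request(stats, now=130.0, w_ia=0.5)
```

Because only two values are kept per asset, the update takes constant time per request regardless of the number of assets.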
According to one or more example embodiments, after calculating the value TCjk in accordance with equation (1), the characteristic time TCjk is gently increased so that the characteristic time TCjk does not get stuck at a low number. The characteristic time TCjk is gently increased in accordance with equation (3) below.
$$TC_{jk} = TC_{jk} + GS \times (TS - PL_{jk}) \quad (3)$$
According to one or more example embodiments, after gently increasing the value TCjk in accordance with equation (3), the current value for PLjk is set to the current time as is illustrated by equation (4) below.
$$PL_{jk} = TS \quad (4)$$
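By way of non-limiting illustration, the gentle increase of equations (3) and (4) may be sketched in Python as follows. The sketch reads PLjk as the time at which the gentle increase was last applied, which is an interpretation suggested by equation (4). The function name and the numeric values, including the value of GS, are hypothetical.

```python
def gentle_increase(tc, now, last_increase_time, gs):
    """Apply equations (3) and (4).

    Returns a pair (new TC_jk, new PL_jk). The increase is proportional to
    the time elapsed since the previous increase, scaled by the factor GS.
    """
    new_tc = tc + gs * (now - last_increase_time)  # equation (3)
    new_last_increase_time = now                   # equation (4)
    return new_tc, new_last_increase_time


# Hypothetical values: 100 time units have passed since the last increase.
tc, pl = gentle_increase(tc=50.0, now=200.0, last_increase_time=100.0, gs=0.01)
```

With a small GS the increase is slight for any single eviction, yet it accumulates over time, so the characteristic time can recover after a run of short eviction times.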
An example of a caching algorithm used by the method of providing hierarchical caching according to one or more example embodiments is described below in the form of pseudo code by algorithm 1. According to one or more example embodiments, algorithm 1 may be performed, individually, by each cache in the hierarchical tree structure of CDN 130. For example, every time a cache from among the caches of the CDN 130 receives a request for an asset i, the cache performs Algorithm 1 to determine if the requested asset needs to be cached by the cache.
An example use of Algorithm 1 will be discussed in greater detail below with respect to
Referring to
As is shown in Algorithm 1, upon receiving the request for the asset x in step S305, the cache 135B may calculate the value Tijk using equation (2). Next, the cache 135B may set the value for Pijk as the current time. Next, the cache 135B may proceed to step S310. Whenever the cache 135B evicts an asset, it may calculate the value TCjk using equation (1) and then the cache 135B may gently increase the value TCjk using equation (3).
In step S310, the cache determines whether or not the asset for which the request was received in step S305 is stored (i.e., cached) in the cache. For example, in step S310, the cache 135B may determine whether the asset x is already cached in the cache 135B. If the asset x is currently stored in the cache 135B, then the cache 135B proceeds to step S315.
In step S315, the cache provides the asset requested in step S305 and updates the LRU database of the cache. For example, in step S315, the cache 135B may provide the asset x to the network element that requested the asset in step S305, the cache 135D. Further, in accordance with known LRU methods of operating a cache, the cache 135B may update an LRU database within the cache 135B (e.g., within the memory unit 256 of the cache 135B) that stores timestamps indicating, respectively, times of the most recent uses of each of the assets presently stored in the cache.
Returning to step S310, if the cache determines in step S310 that the asset requested in step S305 is not included in the cache, the cache proceeds to step S320. For example, if, in step S310, the cache 135B determines that the asset x is not currently stored in the cache 135B, then the cache 135B proceeds to step S320.
In step S320, the cache retrieves the asset requested in step S305. For example, in step S320, the cache 135B may send a request for the asset x to the parent of the cache 135B, cache 135A. In response to the request sent by the cache 135B, the cache 135B may receive the asset x from the cache 135A. For example, the cache 135A may be storing the asset x already, or the cache 135A may retrieve the asset x from the origin server 140 before providing the asset x to the cache 135B.
Next, in step S325, the cache performs a comparison based on the values TCjk and Tijk and determines whether or not to store the asset x based on the comparison. For example, in step S325, the cache 135B may compare the value TCjk to the value Tijk in accordance with equation (5) below.
$$T_{ijk} < TC_{jk} \quad (5)$$
If the result of the comparison operation in step S325 is true, the cache 135B may proceed to step S335 before proceeding to step S330. In step S335, the cache 135B stores the asset x in the cache 135B. In step S330, the cache 135B provides the asset to the network element that requested the asset, cache 135D.
If the result of the comparison operation in step S325 is false, the cache 135B may proceed directly to step S330 without proceeding to step S335. Accordingly, if the result of the comparison operation in step S325 is false, the cache 135B may provide the asset x to the cache 135D without caching the asset x in the cache 135B. Thus, according to one or more example embodiments, caches in the CDN 130 may determine whether or not to cache a retrieved asset in accordance with the comparison defined by equation (5).
After providing the requested asset x in step S330, the cache 135B returns to step S305 to await receipt of the next asset request.
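By way of non-limiting illustration, steps S305 through S335 may be sketched in Python as follows. This is a sketch under stated assumptions and is not Algorithm 1 itself: the class name `AdmissionCache`, the callback `fetch_from_parent`, the stand-in `origin` function, the capacity counted in number of assets, and all initial values and weights are hypothetical. In particular, the sketch assumes that an asset seen for the first time starts with Tijk equal to `t_init` and Pijk equal to the current time.

```python
from collections import OrderedDict


class AdmissionCache:
    """Sketch of one cache (j,k): LRU eviction plus the T_ijk < TC_jk test."""

    def __init__(self, capacity, fetch_from_parent, tc_init, t_init,
                 w_tc=0.1, w_ia=0.1, gs=0.001, start_time=0.0):
        self.capacity = capacity            # assumed: counted in assets, >= 1
        self.fetch_from_parent = fetch_from_parent
        self.tc = tc_init                   # TC_jk, assumed to start above t_init
        self.t_init = t_init                # assumed initial T_ijk of a new asset
        self.w_tc, self.w_ia, self.gs = w_tc, w_ia, gs
        self.last_increase = start_time     # PL_jk
        self.store = OrderedDict()          # cached assets, least recent first
        self.stats = {}                     # asset id -> (T_ijk, P_ijk)

    def request(self, asset_id, now):
        # S305: update T_ijk with equation (2), then set P_ijk to the time TS.
        t, last = self.stats.get(asset_id, (self.t_init, now))
        t = self.w_ia * (now - last) + (1.0 - self.w_ia) * t
        self.stats[asset_id] = (t, now)
        # S310 / S315: on a hit, serve the asset and refresh its LRU position.
        if asset_id in self.store:
            self.store.move_to_end(asset_id)
            return self.store[asset_id]
        # S320: on a miss, obtain the asset from the parent cache or origin.
        data = self.fetch_from_parent(asset_id, now)
        # S325 / S335: cache the asset only if equation (5) holds.
        if t < self.tc:
            self._store(asset_id, data, now)
        # S330: provide the asset to the requester in either case.
        return data

    def _store(self, asset_id, data, now):
        if self.capacity < 1:
            return
        while len(self.store) >= self.capacity:
            victim, _ = self.store.popitem(last=False)  # least recently used
            _, victim_last = self.stats[victim]
            # Equation (1), then the gentle increase of equations (3) and (4).
            self.tc = self.w_tc * (now - victim_last) + (1.0 - self.w_tc) * self.tc
            self.tc += self.gs * (now - self.last_increase)
            self.last_increase = now
        self.store[asset_id] = data


def origin(asset_id, now):
    return "data:" + asset_id  # stand-in for the origin server


# A two-level hierarchy: the leaf forwards its misses to the parent cache.
parent = AdmissionCache(capacity=4, fetch_from_parent=origin,
                        tc_init=100.0, t_init=50.0)
leaf = AdmissionCache(capacity=2, fetch_from_parent=parent.request,
                      tc_init=100.0, t_init=50.0)
payload = leaf.request("x", now=0.0)
```

Each cache object uses only its own statistics, consistent with the caching algorithm running locally at each cache. The per-asset statistics dictionary in this sketch grows without bound; a bounded store for inter-arrival statistics would be needed in practice.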
Though
A new database, which may be included in caches of the CDN 130 (e.g., within the memories 256 of the caches of the CDN 130) in addition to the LRU databases of the caches, will now be discussed below.
As is discussed above, caches of the CDN 130 may include LRU databases in accordance with known LRU methods for operating caches. Further, each of the caches of the CDN 130 may also include a database for storing inter-arrival times of various assets, for example, for calculating the average inter-arrival time of asset i, Tijk.
Given that the number of assets in a library can be very large, sometimes much larger than the number of assets that can be stored in a cache, it may be desirable to use more than one database to store the inter-arrival time statistics. For example, according to one or more example embodiments, the caches of the CDN 130 may include a main database of inter-arrival times and an additional database of inter-arrival times. In accordance with one or more example embodiments, the main database of inter-arrival times for a cache does not contain inter-arrival times that are more than a few times the TCjk value for that cache; for example, 3 times. According to one or more example embodiments, the exact value above which inter-arrival times are not included in the main inter-arrival time database may be based on the popularity distribution of assets stored at that cache.
The additional database may be a “one occurrence” database that is used to store assets which have seen only a single request so far and assets that are demoted from the main inter-arrival time database. For example, when a cache sees a second request for an asset, the cache may calculate an inter-arrival time based on the timings of the first and second requests, and the cache may place that asset into the main inter-arrival time database based on the newly calculated inter-arrival time. Further, upon placing the asset which has the newly calculated inter-arrival time into the main database, the asset with the largest inter-arrival time in the main database may be removed from the main database. In order to return to the main database, the removed asset may need to enter the “one occurrence” database first, and then be promoted to the main inter-arrival time database in the manner discussed above.
The additional database may be a “one occurrence” database that is used to store assets which have seen only a single request so far. When the cache sees a second request for an asset in the “one occurrence” database, the cache may calculate an inter-arrival time based on the timings of the first and second requests, then the cache may place that asset into the main inter-arrival time database based on the newly calculated inter-arrival time while also deleting that entry from the “one occurrence” database. Further, upon placing the asset which has the newly calculated inter-arrival time into the main database, the asset that was the least recently used in the main database may be removed from the main database. Least recently used means the asset whose last request was the oldest among all assets. In order to return to the main database, the removed asset may need to enter the “one occurrence” database first, and then be promoted to the main inter-arrival time database in the manner discussed above.
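By way of non-limiting illustration, the threshold rule for the main database may be sketched in Python as follows. The function name `split_by_threshold` and the dictionary layout are hypothetical, and the default factor of 3 is only the example value mentioned above.

```python
def split_by_threshold(inter_arrival_times, characteristic_time, factor=3.0):
    """Split entries by the rule T_ijk <= factor * TC_jk.

    inter_arrival_times -- dict mapping asset id to its inter-arrival time
    Returns (entries to keep in the main database, entries to leave out).
    """
    limit = factor * characteristic_time
    keep = {a: t for a, t in inter_arrival_times.items() if t <= limit}
    leave_out = {a: t for a, t in inter_arrival_times.items() if t > limit}
    return keep, leave_out


# Hypothetical values: with TC_jk = 100, the limit is 300 time units.
keep, leave_out = split_by_threshold({"a": 10.0, "b": 400.0}, 100.0)
```

An asset whose inter-arrival time is several times the characteristic time would fail the caching comparison in any case, so leaving its entry out of the main database saves space without affecting caching decisions.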
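By way of non-limiting illustration, the variant in which the least recently used entry is removed from the main database may be sketched in Python as follows. The class name `InterArrivalTracker`, its method names, and the capacity and weight values are hypothetical. The sketch leaves the “one occurrence” database unbounded for brevity, which is an assumption of the sketch only.

```python
from collections import OrderedDict


class InterArrivalTracker:
    """Sketch of a "one occurrence" database feeding a bounded main database."""

    def __init__(self, main_capacity, w_ia=0.1):
        self.main_capacity = main_capacity
        self.w_ia = w_ia
        self.once = {}             # asset id -> time of its single request
        self.main = OrderedDict()  # asset id -> (inter-arrival time, last request)

    def record(self, asset_id, now):
        """Record a request; return the inter-arrival estimate, or None."""
        if asset_id in self.main:
            t, last = self.main[asset_id]
            t = self.w_ia * (now - last) + (1.0 - self.w_ia) * t
            self.main[asset_id] = (t, now)
            self.main.move_to_end(asset_id)  # mark as most recently used
            return t
        if asset_id in self.once:
            # Second request: promote to the main database and delete the
            # entry from the "one occurrence" database.
            first = self.once.pop(asset_id)
            t = now - first
            self.main[asset_id] = (t, now)
            if len(self.main) > self.main_capacity:
                self.main.popitem(last=False)  # remove least recently used
            return t
        # First request: only the arrival time is known so far.
        self.once[asset_id] = now
        return None


tracker = InterArrivalTracker(main_capacity=1, w_ia=0.5)
tracker.record("a", 0.0)   # first request, stored in "one occurrence"
tracker.record("a", 10.0)  # second request, promoted to the main database
```

An asset removed from the main database has no entry in either database, so its next request places it in the “one occurrence” database again before it can be promoted.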
Example embodiments being thus described, it will be obvious that embodiments may be varied in many ways. Such variations are not to be regarded as a departure from example embodiments, and all such modifications are intended to be included within the scope of example embodiments.
This application claims the benefit of provisional U.S. Application No. 62/064,631 filed on Oct. 16, 2014, the disclosure of which is hereby incorporated by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
62064631 | Oct 2014 | US |