The present invention relates to a prediction method and device, and in particular to a method and device for predicting the number of views of broadcast media on a broadcast channel at a future instance in time.
Advertisements during TV or radio broadcasts are usually billed by the broadcaster in advance for a particular number of views. If the actual number of views is less than what was expected when the billing was made, the broadcaster will have to repeat the advertisement at a later time, or refund the money which was overpaid. Conversely, if the actual number of views is more than what was billed for at the time, the broadcasting air time has not been efficiently used.
One important research question for broadcast services is user behavior, and one aspect of this is knowledge of the number of viewers watching a specific TV channel (e.g. views per hour). This knowledge can be used, for example, to optimize advertisement insertions made by TV broadcasters, or to control bandwidth or other network parameters.
In that respect knowledge about the future number of views at particular points in time can help to plan and distribute the insertions of the advertisements to maximize profit. The embodiments described herein present a method and apparatus to predict the number of views of broadcast media at a future instance in time, for example future viewing hours.
According to a first aspect there is provided a method in a device for predicting at a current time n the number of views of broadcast media on a broadcast channel at a future instance in time n+k. The method comprises receiving past values representing the actual number of views of the broadcast media at particular instances in time, and defining a time window previous to the current time n. The method comprises analysing the past values received during the time window to determine if the time window is corrupt, and utilising the time window to predict the number of views of the broadcast channel at the future instance in time, n+k, depending on whether the time window is corrupt.
According to another aspect there is provided a device for predicting at a current time n the number of views of broadcast media on a broadcast channel at a future instance in time n+k. The device comprises a processor and a memory, the memory containing instructions executable by the processor. The processor is operable to receive past values representing the actual number of views of the broadcast media at particular instances in time, and define a time window previous to the current time n. The processor is operable to analyse the past values received during the time window to determine if the time window is corrupt, and utilise the time window to predict the number of views of the broadcast channel at the future instance in time, n+k, depending on whether the time window is corrupt.
As described previously, the present embodiments relate to a method and apparatus for predicting at a current time n the number of views of broadcast media on a broadcast channel at a future instance in time n+k. In some embodiments described herein, the number of views of broadcast media is described in terms of viewers of TV channels. It is noted, however, that the views may relate to other forms of access to broadcast media, such as listeners to an audio broadcast. In some examples described herein, the predictions for future instances in time are hourly views per TV channel. It is noted, however, that the embodiments are intended to cover other instances or periods in time, including views over different periods of time.
It is also noted that references herein to broadcast media are intended to cover both live broadcast of media, and the streaming or downloading of media content at a later point in time.
In
This prediction can be done using an autoregressive (AR) prediction model, which may also be referred to as a linear predictive (LP) model. This type of model is an efficient technique for predicting future values of a time series based on a linear combination of past values, for example a weighted linear combination of past values. One example of such an AR prediction model is described below.
For the time series {x(m): m=1,2,3, . . . }, the AR model at time n for predicting x(n+k) from the current and past values x(n), x(n−1), . . . x(n−P), where P+1 is the number of AR prediction coefficients, is given by
$$\hat{x}(n+k) = \sum_{p=0}^{P} a_{n,p}\, x(n-p) \qquad (1)$$
where the coefficients $a_{n,0}, a_{n,1}, \ldots, a_{n,P}$ are the prediction coefficients, and k is the prediction step (k≥1). The optimal prediction coefficients at time n, $\hat{a}_{n,0}, \hat{a}_{n,1}, \ldots, \hat{a}_{n,P}$, are evaluated as the coefficients that minimize the least squares prediction error over the times n, n−1, . . . n−L for some L>>P. The optimal prediction coefficients may therefore be given as:
$$\hat{a}_n = \arg\min_{a_n} \sum_{l=0}^{L} \Big( x(n-l) - \sum_{p=0}^{P} a_{n,p}\, x(n-l-k-p) \Big)^{2} \qquad (2)$$
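To help clarify the relationship between the length of the analysis window, the length of the basis vectors used in the AR prediction and the AR model order P, the AR prediction equation is displayed in the following matrix format (2a):
$$
\begin{bmatrix} x(n) \\ x(n-1) \\ \vdots \\ x(n-L) \end{bmatrix}
\approx
\begin{bmatrix}
x(n-k) & x(n-k-1) & \cdots & x(n-k-P) \\
x(n-k-1) & x(n-k-2) & \cdots & x(n-k-P-1) \\
\vdots & \vdots & \ddots & \vdots \\
x(n-k-L) & x(n-k-L-1) & \cdots & x(n-k-P-L)
\end{bmatrix}
\begin{bmatrix} a_{n,0} \\ a_{n,1} \\ \vdots \\ a_{n,P} \end{bmatrix}
\qquad (2a)
$$
This helps to find the AR parameter vector $a_n$ that minimizes the norm of the prediction error:
$$\hat{a}_n = \arg\min_{a_n} \big\lVert x_n - [\,x_{n-k},\, x_{n-k-1},\, \ldots,\, x_{n-k-P}\,]\, a_n \big\rVert^{2}$$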
As can be seen above the length of the basis vectors used in the AR prediction is L+1, the number of AR prediction coefficients is P+1 and the data analysis window is (n″:n)=(n−k−P−L:n), which makes the length of the data analysis window equal to k+P+L+1.
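By way of illustration only, the least-squares fit and the k-step prediction described above might be sketched as follows. This is a minimal sketch assuming NumPy and a zero-based array `x` of past view counts; the function names `build_ar_system` and `fit_and_predict` are hypothetical and not part of the described embodiments.

```python
import numpy as np


def build_ar_system(x, n, k, P, L):
    """Build the (L+1) x (P+1) regressor matrix and the L+1 targets.

    Row l pairs the target x[n-l] with x[n-l-k], ..., x[n-l-k-P], so the
    oldest sample touched is x[n-k-P-L] and the window holds k+P+L+1 samples.
    """
    if n - k - P - L < 0:
        raise ValueError("need k+P+L+1 samples up to and including index n")
    x = np.asarray(x, dtype=float)
    targets = np.array([x[n - l] for l in range(L + 1)])
    regressors = np.array(
        [[x[n - l - k - p] for p in range(P + 1)] for l in range(L + 1)]
    )
    return regressors, targets


def fit_and_predict(x, n, k, P, L):
    """Fit the coefficients by least squares, then predict x(n+k)."""
    regressors, targets = build_ar_system(x, n, k, P, L)
    coeffs, *_ = np.linalg.lstsq(regressors, targets, rcond=None)
    x = np.asarray(x, dtype=float)
    # Prediction step: weighted sum of x(n), x(n-1), ..., x(n-P).
    latest = np.array([x[n - p] for p in range(P + 1)])
    return coeffs, float(latest @ coeffs)
```

For a strictly 24-hour periodic series and k=24, such a fit would be expected to reproduce the value observed one period earlier.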
The solution to equation (2) can be found using a standard least squares estimation problem. The solution can be given by:
$$\hat{a}_n = A^{-1} X \qquad (3)$$
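where the data matrix A is representative of the data window and can be expressed as:
$$
A = \begin{bmatrix}
\langle x_{n-k}, x_{n-k} \rangle & \cdots & \langle x_{n-k}, x_{n-k-P} \rangle \\
\vdots & \ddots & \vdots \\
\langle x_{n-k-P}, x_{n-k} \rangle & \cdots & \langle x_{n-k-P}, x_{n-k-P} \rangle
\end{bmatrix} \qquad (4)
$$
and the vector X can be expressed as:
$$X = \big[\, \langle x_{n-k}, x_n \rangle,\ \langle x_{n-k-1}, x_n \rangle,\ \ldots,\ \langle x_{n-k-P}, x_n \rangle \,\big]^{T} \qquad (5)$$
where $\langle x_n, x_m \rangle$ is the inner product between the vectors $x_n$ and $x_m$, wherein the vector $x_n$ is given by:
$$x_n = [\, x(n),\ x(n-1),\ \ldots,\ x(n-L) \,]^{T} \qquad (6)$$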
The matrix inversion shown in equation (3) can be achieved, for example, through Singular Value Decomposition (SVD). In this technique the data matrix A can be rewritten as:
$$A = U S V^{T} \qquad (7)$$
In Equation (7) U and V are unitary matrices and S is a diagonal matrix with non-negative real numbers on the diagonal. Therefore the matrix S can be written as:
$$S = \operatorname{diag}(s_1, s_2, \ldots, s_L) \qquad (8)$$
Since the inversion of a diagonal matrix such as S is trivial, the inversion of A of equation (3) can be expressed as:
$$A^{-1} = V S^{-1} U^{T} \qquad (9)$$
Equations (9) and (3) can then be used to determine the coefficients to be used in Equation (1) to predict the future value x(n+k).
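As a non-limiting illustration, the SVD-based inversion of equations (7) and (9) and its use in equation (3) might be sketched as follows. The sketch assumes NumPy, and assumes that A is the matrix of inner products between the lagged vectors and that the right-hand side of equation (3) is the vector of inner products between the lagged vectors and the target vector; the names `invert_via_svd` and `ar_coefficients` are hypothetical.

```python
import numpy as np


def invert_via_svd(A):
    """Invert a square matrix as V S^-1 U^T and return the singular values too."""
    U, s, Vt = np.linalg.svd(A)
    if s[-1] == 0.0:
        raise np.linalg.LinAlgError("a singular value is zero; A is not invertible")
    A_inv = Vt.T @ np.diag(1.0 / s) @ U.T
    return A_inv, s


def ar_coefficients(regressors, targets):
    """Solve the normal equations for the AR coefficients.

    regressors: (L+1) x (P+1) matrix whose columns are the lagged vectors.
    targets:    vector of L+1 values to be predicted.
    """
    A = regressors.T @ regressors   # assumed form of the data matrix A
    rhs = regressors.T @ targets    # assumed form of the vector X in equation (3)
    A_inv, s = invert_via_svd(A)
    return A_inv @ rhs, s
```

The singular values returned alongside the coefficients are the same quantities that a decision metric based on the matrix S could reuse without further computation.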
Theoretically, the larger the time window the more accurate the prediction, and therefore the optimal solution would be to use all past data in the AR model. However, this assumes that the data statistics do not evolve with time. This is not the case for real data recordings. The time window could therefore be shortened to follow the signal dynamics, but in this case the model parameters can be easily affected by outliers or corrupted data. Solutions to this problem of the present embodiments are discussed with reference to the remaining Figures.
As mentioned above, it will be appreciated that a broadcast channel may comprise a TV channel or audio broadcast channel and that the broadcast may be live, on-demand, or any other streaming service.
In step 201 the method comprises receiving past values representing the actual number of views of the broadcast media at particular instances in time. For example, this step may comprise receiving continual historical data relating to the actual number of views of broadcast media at particular instances in time (for example hourly viewer figures over a period of time in the past).
In step 202 the method comprises defining a time window previous to the current time n.
In step 203 the method comprises analysing the past values received during the time window to determine if the time window is corrupt. An example of this analysis is discussed in more detail with reference to
In step 204 the method comprises utilising the time window to predict the number of views of the broadcast channel at the future instance in time, n+k, depending on whether the time window is corrupt.
Thus, a defined time window is selected for a prediction, but a check is performed to determine whether or not the defined time window is to be then utilised for the prediction, depending upon whether or not the time window is corrupt.
In the embodiments described herein, the time window is defined between a first time n″ (corresponding to the beginning of the time window) and a second time n′ (corresponding to the end of the time window).
For example the second time n′ can be the same as the current time n. In such an embodiment the time window therefore spans a period immediately prior to the current time n.
In other embodiments, the second time n′ is prior to the current time n. In such an embodiment the time window therefore spans a period prior to the current time n. The difference between the current time n and the second time n′ may relate, for example, to a processing delay between the end of the time window and the current time n when the prediction is performed. Alternatively, the difference between the current time n and the second time n′ may relate, for example, to a time shift required in order to match the position of the time window which is to be used as the basis for analysis with the future time instance n+k that is being predicted (for example in a scenario where k is not equal to 24 hours).
In some embodiments, the past values are extracted from received data in a pre-processing stage. The past values can be extracted and filtered from the received data. This method and pre-processing may be performed by a single processor. In other embodiments, the method and the pre-processing may be performed in different modules or devices.
In step 301 the device determines whether or not the time window is corrupt. In some embodiments the time window is corrupt if it contains any erroneous data.
Erroneous data may refer to outliers or anomalies in the data. For example, for a TV broadcasting channel, the number of views may exhibit unpredictable behaviour if a particular channel is broadcasting media of unusually high interest, for example a particularly shocking news broadcast. This will have the effect of reducing the number of views on other broadcast channels and increasing the number of views on the channel broadcasting the news. Such data is treated as erroneous because it does not reflect the typical consumer response to the media usually broadcast at that particular time.
Other examples of erroneous data can be envisaged, for example hardware failures, which can lead to the number of views being unrealistically constant over a period of time. In such scenarios data or part of the data can be lost or incomplete due to bad network conditions or software failure. In these scenarios different modules can be responsible for data reporting, collection, processing and storage. The incomplete data might reflect a typical consumer response to the media in time, but not in terms of the exact number of times the media is viewed. It is noted that the detection of erroneous data in a time window may include other techniques, including ones which are specific or related in some way to the type of broadcast media or TV channel.
If it is determined in step 301 that the time window is not corrupt, the method passes to step 302 wherein the device classifies the time window as a usable analysis window and utilises the usable analysis window to predict the number of views of the broadcast channel at the future instance in time, n+k.
This is described in more detail with reference to
In
Returning to
In some embodiments, for example the embodiment shown in
As can be seen in
For example, in some embodiments, the usable analysis window is the most recent usable analysis window, i.e. the last time window from the most recent iteration of the method which found the time window to be non-corrupt. Therefore, in this embodiment the usable analysis window may be the time window immediately before the corruption point 402.
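Purely as an illustrative sketch, the fall-back to the most recent usable analysis window might look as follows. The names `choose_analysis_window` and `looks_stuck` are hypothetical, and the corruption check used here (a window holding a single repeated value) is only a stand-in assumption for whichever check an embodiment applies.

```python
def choose_analysis_window(candidate, last_usable, is_corrupt):
    """Return (window_to_use, updated_last_usable).

    A non-corrupt candidate becomes the new usable analysis window;
    otherwise the most recent usable window is reused.
    """
    if not is_corrupt(candidate):
        return candidate, candidate
    if last_usable is None:
        raise ValueError("no usable analysis window is available yet")
    return last_usable, last_usable


def looks_stuck(window):
    """Hypothetical check: every value in the window is identical."""
    return len(set(window)) <= 1
```

Calling the function once per iteration and carrying the second return value forward keeps the last non-corrupt window available for later iterations.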
In some embodiments, for example the embodiment shown in
In
In
In step 501 the method in the device comprises deriving a data matrix representative of the time window.
For example the matrix A as described above in equation (4).
In step 502, the method comprises determining whether a decision metric, for example D, based on the data matrix is above or below a threshold value, for example Θ.
If the decision metric D is above the threshold value Θ, the method passes to step 503 wherein it is determined that the time window is not corrupt.
If the decision metric D is below the threshold value Θ, the method passes to step 504 wherein it is determined that the time window is corrupt.
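Steps 501 to 504 might, for illustration, be sketched as follows. This is a sketch under stated assumptions rather than the definitive metric: the data matrix is assumed to be the matrix of inner products between the lagged vectors of the window, and D is assumed to be the sum of the logarithms of its singular values (a log-volume), which is one form consistent with the description. The function names and the `floor` parameter are hypothetical.

```python
import numpy as np


def data_matrix(window, k, P, L):
    """Step 501: inner-product matrix of the lagged vectors (assumed form)."""
    w = np.asarray(window, dtype=float)
    if w.size != k + P + L + 1:
        raise ValueError("window must hold k+P+L+1 samples, oldest first")
    last = w.size - 1  # index of the newest sample in the window
    lagged = np.array(
        [[w[last - l - k - p] for p in range(P + 1)] for l in range(L + 1)]
    )
    return lagged.T @ lagged


def decision_metric(A, floor=1e-300):
    """Assumed metric D: sum of the logs of the singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    # Clamp to a tiny positive floor so an exactly zero singular value
    # gives a very negative D rather than minus infinity.
    return float(np.sum(np.log(np.maximum(s, floor))))


def window_is_corrupt(window, k, P, L, theta):
    """Steps 502 to 504: the window is corrupt when D falls below theta."""
    return decision_metric(data_matrix(window, k, P, L)) < theta
```

With this assumed metric, a window of constant values yields linearly dependent columns and a strongly negative D, so it would be classified as corrupt.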
For example, when using the autoregressive model as described previously the decision metric, D, may be as follows:
where A* is the usable time window, or the valid time window as described in
The decision metric D, may be based, for example, on readily available eigenvalues of the diagonal matrix S, for example:
The decision metric, D, is related to the hypervolume defined by the columns of the diagonal matrix, S. This metric is motivated by the fact that when the determinant of a matrix approaches zero (the hypervolume degenerates and flattens), this indicates that the columns of the matrix are linearly dependent. In other words, D in equation (11) becomes negative when the prediction matrix A begins to approach a state of linear dependence between the basis vectors of the matrix.
The decision metric, D, in equation (11) is only one of many possible decision metrics that can be used. One can also envision other decision metrics involving the singular values of the A matrix such as the ratio between the highest and lowest singular values or a measure characterizing the distribution of the singular values.
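Two such alternative metrics might be sketched as follows, by way of example only. The sketch assumes NumPy; the names `singular_value_ratio` and `singular_value_entropy` are hypothetical, and the entropy is merely one possible measure of how the singular values are distributed.

```python
import numpy as np


def singular_value_ratio(A, floor=1e-300):
    """Ratio between the highest and lowest singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(s[0] / max(s[-1], floor))


def singular_value_entropy(A):
    """Entropy of the normalised singular values.

    Evenly spread singular values give a high entropy; a single dominant
    singular value gives an entropy close to zero.
    """
    s = np.linalg.svd(A, compute_uv=False)
    total = s.sum()
    if total == 0.0:
        return 0.0
    p = s / total
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))
```

It is noted that a large ratio indicates near linear dependence, so a threshold test on the ratio would run in the opposite direction to the test on D.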
The threshold Θ in equation (10) could be set, for example, to a fixed value or adapted to the level of past D values.
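One way in which the threshold might be adapted to the level of past D values is sketched below. The rule used (mean of the past values minus a multiple of their standard deviation, with a fixed fall-back while little history exists) is an assumption chosen for illustration, as are the name `adaptive_threshold` and all parameter defaults.

```python
import statistics


def adaptive_threshold(past_d, fixed=120.0, min_history=10, n_sigmas=3.0):
    """Return a threshold derived from past D values.

    Falls back to the fixed threshold until min_history values are available.
    """
    if len(past_d) < min_history:
        return fixed
    mean = statistics.fmean(past_d)
    spread = statistics.pstdev(past_d)
    return mean - n_sigmas * spread
```

In such a scheme only D values from windows classified as usable would typically be appended to the history, so that corrupt windows do not drag the threshold down.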
The above described embodiments predict at a current time n the number of views of the broadcast channel at a future instance in time, n+k, using an autoregressive prediction model as previously described. It will, however, be appreciated that the various analysis windows used to predict the number of views of the broadcast channel at the future instance in time n+k may be used with any other suitable prediction model.
Where an autoregressive model is used the time window may have a size L+k+P+1, where L+1 is the length of basis vectors used in the autoregressive prediction model, P+1 is the number of coefficients used in the autoregressive prediction model, and k is the time between the current time n and the future instance in time n+k.
In some embodiments the value of L is significantly greater than the value of P and typically increases with increased value of k. In some examples the value of L is at least twice the value of k.
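The relationship between the window size and the first and second times of the time window might be illustrated as follows. The function name `window_bounds` and the `shift` parameter (the difference between the current time n and the second time n′) are hypothetical and introduced only for this sketch.

```python
def window_bounds(n, k, P, L, shift=0):
    """Return (n_start, n_end), the inclusive bounds of the time window.

    n_end corresponds to the second time n' and n_start to the first time n''.
    The window holds L+k+P+1 samples.
    """
    if k < 1 or P < 0 or L < 0 or shift < 0:
        raise ValueError("expected k >= 1 and non-negative P, L and shift")
    n_end = n - shift
    n_start = n_end - (k + P + L)
    if n_start < 0:
        raise ValueError("not enough past values for this window")
    return n_start, n_end
```

With shift=0 the window ends at the current time n; a positive shift moves the whole window back, for example to account for a processing delay.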
In an example embodiment for a prediction of k=24 hours (i.e. 24 hours ahead), a fixed threshold for the decision metric D was used, Θ=120. This may be optimal for a model with P=30 and L=240 i.e. with a time window of length L+P+k+1=240+30+24+1=295 hours. The inventors have found that for k=24 hours ahead in time, by analysing the time window to check that it is not corrupt the average correlation coefficient improves from 0.73 to 0.86. This can be seen in the graph of
The prediction of the number of views at the future instance in time can be used, for example, to determine the advertisement to be placed at that future instance in time. This helps to alleviate the problem of the actual number of views being less or more than what was expected when the billing for the advertisement was made, as the advertisement can be placed at a time when the prediction shows that the number of views is at least close to the number billed for.
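As a simple illustration of such a use, a slot might be chosen from the predicted views as sketched below. The selection rule (the slot that meets the billed number with the least surplus, or otherwise the slot with the highest prediction) and the name `pick_slot` are assumptions made for this sketch only.

```python
def pick_slot(predicted_views, billed_views):
    """Pick a future slot for an advertisement.

    predicted_views maps a slot label (for example an hour) to its
    predicted number of views.
    """
    if not predicted_views:
        raise ValueError("no predictions available")
    meeting = {s: v for s, v in predicted_views.items() if v >= billed_views}
    if meeting:
        # Least surplus over the billed number of views.
        return min(meeting, key=lambda s: meeting[s] - billed_views)
    # No slot is predicted to reach the billed number: take the closest one.
    return max(predicted_views, key=predicted_views.get)
```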
It will also be appreciated that the prediction of the number of views at the future instance in time may also be used for other applications, for example to predict traffic flow for bandwidth purposes or controlling other network parameters, or other service enhancing functions.
The device 700 comprises a processor 701 and memory 702, the memory 702 containing instructions executable by the processor 701. The processor is operable to: receive past values representing the actual number of views of the broadcast media at particular instances in time; define a time window previous to the current time n; analyse the past values received during the time window to determine if the time window is corrupt; and utilise the time window to predict the number of views of the broadcast channel at the future instance in time, n+k, depending on whether the time window is corrupt. The processor 701 may be adapted to perform other method steps described herein.
It is noted that the device 700 may be a stand-alone device, or form part of another device in a communications network, including for example part of a cloud based node.
As will be seen from the above, the embodiments described herein can be used to predict future hourly views, for example using an AR model. Since the signal statistics evolve with time, the analysis window for building the AR model can also be shifted with time, by defining a time window previous to a current time n. This time window is checked before it is applied to the prediction model. If the data is determined to be erroneous, another time window is selected (for example a data-optimized analysis window of fixed size), for example a model derived from the last good analysis window, or alternatively a reduced or enlarged time window to avoid the effect of the erroneous data.
The embodiments described herein therefore enable the number of views of broadcast media on a broadcast channel at a future instance in time to be predicted more accurately.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. The word “comprising” does not exclude the presence of elements or steps other than those listed in a claim, “a” or “an” does not exclude a plurality, and a single processor or other unit may fulfil the functions of several units recited in the claims. Any reference signs in the claims shall not be construed so as to limit their scope.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2016/071374 | 9/9/2016 | WO | 00 |