Operators involved in distributing services, in particular video services, need to supply the end user with a given level of quality on the user's terminal in order to avoid devaluing the content of the service.
Operators must also minimize service storage costs and/or delivery costs by controlling both a method that is used for rate-reduction video encoding, and also the resources of the transmission network that are allocated to conveying the video service.
Those encoding and transmission methods have an impact on the quality played back to the user, which impact varies, depending both on how the methods are configured and on the content of the video service.
Furthermore, the development of digital technologies has led to a very wide variety of terminals capable of playing back video images being made available to the public. Such terminals present a very wide variety of capabilities, for example going from the small screen of a portable terminal to the large screen of a TV set.
Numerous methods are known for allocating resources for rate-reduction encoding or digital transmission. The differences between those methods lie in the quality measurements used, the means for acting on quality (i.e. the resources on which the methods act), and sometimes also in the optimization algorithms implemented.
There are three main known types of quality measurement:
Those measurements are used to control various mechanisms for acting on the encoded video stream:
Methods of allocating or optimizing resources are of two types:
A particular field that uses techniques of adjusting resources is digital TV. At present, the adjustment of encoding rate takes no account of the characteristics of the various types of terminal that might be involved, and it is performed manually, at least in part:
Alternatively, the use of variable rate-reduction encoding methods and of statistical multiplexing methods makes it possible to obtain a rate that is variable as a function solely of the content of the video. Nevertheless, it is still necessary to set a minimum and a maximum acceptable data rate manually, said rates being selected as a function of the content of the program.
The operators involved in distributing video services need to provide the end user with a given level of quality, while minimizing the cost involved in storing and/or delivering the service, which involves adjusting how resources are used. As mentioned above, there thus exist techniques for allocating resources that are at least partially manual and techniques that are automatic. All of those known techniques present at least one drawback.
Many encoding resource adjustment techniques are presently manual, at least in part. For example, digital TV operators decide on the encoding rate for each program or each type of program and set their parameters manually. In practice, programs with a great deal of movement, such as sports programs, require a data rate that is greater than that required by other programs. Alternatively, the use of variable rate-reduction encoding methods and of statistical multiplexing methods makes it possible to obtain a rate that is variable as a function of the content of the video. Nevertheless, it is still necessary to act manually to set an acceptable minimum rate and an acceptable maximum rate. Furthermore, the quality criterion used for adjusting the encoding rate is a parameter derived from the complexity of the image and not a measurement of quality as perceived after encoding. Finally, that method does not take account of the characteristics specific to the terminal for the purpose of adjusting encoding parameters.
Existing techniques for automatically allocating transmission resources are all based on measuring transmission quality at network level. Unfortunately, that type of measurement is not very representative of the perceived quality as played back to the user. As a result, an operator relying on such non-perceptual measurements cannot guarantee a given level of perceived quality to the end user, and therefore cannot use transmission resources optimally.
The present invention provides a method and a system for selecting the configuration of rate-reduction video encoding and the allocation of resources at transmission network level.
The intended object is to play back a given level of video quality on a terminal and to optimize the use of storage and/or transmission resources. To do this, the method associates techniques for measuring the perceived video quality and possibly for optimization by vector quantizing, where appropriate.
The present invention proposes a method and a system for selecting the rate-reduction video encoding configuration and for allocating resources at transmission network level on the basis of the quality perceived at the terminal, and possibly also on the basis of the characteristics of the user's terminal.
The desired object is to play back a given level of video quality at the terminal and to optimize the use of storage and/or transmission resources.
To do this, the method associates techniques for measuring the perceived video quality and for performing optimization. The measurements of perceived quality can be obtained from decoded video images, instead of from the compressed video stream.
The invention thus provides a method of transmitting an audio and/or video program at varying bit rates over a transmission channel, the method implementing an adjustment of at least one encoding and/or transmission parameter as a function of at least one setpoint vector having at least one dimension representing a desired quality for reception by said end user.
A said transmission parameter may be the bit rate and/or the type of modulation and/or the transmission power.
Said adjustment is implemented from a deterministic relationship between the desired quality of reception and the encoding and/or transmission parameter(s).
Alternatively, said adjustment is implemented as a function of the distance between said setpoint vector and a measurement vector representing said reception quality as measured at said end user.
The quality of reception may be measured on a sequence of determined durations of said program. In particular, said adjustment is implemented by modifying the transmission power P as a function of a distance between the setpoint vector and the measurement vector.
In any event, said adjustment may be implemented as a function of at least one parameter concerning the content of the program. A content parameter may be an activity parameter and/or a parameter tied to the name of the program and/or to the type of the program. Said adjustment may also be implemented as a function of a parameter characteristic of the terminal.
A parameter characteristic of the terminal may be the resolution of an image displayed on said terminal and/or its passband.
The method may generate a dictionary from a training set comprising NZ vectors R characterizing the data of NZ tests, each vector Rz (z varying from 1 to NZ) of a test of rank z resulting from the union of a vector Qz representing the perceived quality of said test of rank z, a vector Pz representing the encoding and/or transmission parameter(s) of said test of rank z, and optionally a vector Tz representing the parameter(s) of the terminal of said test of rank z, and/or a vector Cz representing the content parameter(s) of said program.
In a first variant applicable when the number NZ is not very large, the dictionary is made up of the vectors of the training set. It is made up of a group of N vectors (N=NZ). The maximum number NZ of vectors for which this variant is applicable depends strongly on the characteristics of the application (e.g. the number of requests per second for a search for an optimum vector) and on implementation constraints (e.g. the computation capacity and the memory capacity that can be allocated to the process of searching for the optimum vector). For example, if it is desired to perform no more than 10,000 vector comparisons per second, it will then be possible to perform no more than 100 searches for optimum vectors per second in a list of NZ=100 vectors, or only 50 searches for optimum vectors per second in a list of NZ=200 vectors.
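The arithmetic in the example above can be checked with a short sketch. It assumes an exhaustive search, in which one search costs one comparison per dictionary vector; the function name is hypothetical.

```python
def max_searches_per_second(comparison_budget, dictionary_size):
    """Searches per second that fit in a budget of vector comparisons per second.

    Assumption: each search compares the input with every one of the
    `dictionary_size` vectors (exhaustive search).
    """
    if dictionary_size <= 0:
        raise ValueError("dictionary_size must be positive")
    return comparison_budget // dictionary_size


print(max_searches_per_second(10_000, 100))  # NZ=100
print(max_searches_per_second(10_000, 200))  # NZ=200
```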
Otherwise, the dictionary is obtained from said training set by a vector classification algorithm, and is made up of a group of N vectors (with N<NZ) presenting minimum mean distortion relative to the NZ vectors of the training set. The number of vectors N of a dictionary to be used depends strongly on the characteristics of the application and on implementation constraints (as for the selection of NZ in the first variant), and is also a function of a compromise between the accuracy of the dictionary and its size. The larger the dictionary, the greater its accuracy, thereby giving the system better performance. In practice, a dictionary having N=20 to 40 vectors can be suitable for a training set comprising 10 different encoded sequences at two different resolutions and 10 different data rates (giving NZ=200 configurations).
After the dictionary has been built, the adjustment may be performed by vector quantization to determine the vector of the dictionary that corresponds best to a constraint vector representing at least the desired quality.
Said vector may be constituted by the union of a vector representing the desired quality and a vector representing at least one content parameter and/or a vector representing at least one parameter of the terminal.
In another variant, which does not make use of vector quantizing but which involves measuring the quality perceived for the transmitted program at terminal level, said adjustment, e.g. of the transmission power P, is implemented in steps of size DP as a function of the difference between the measured perceived quality Q and the target quality QC for the video program.
The method being:
Advantageously, the step size DP is variable as a function of the type of content associated with the video program.
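A stepwise adjustment of this kind might be sketched as follows. The update rule (raise P by DP when the measured quality is below target, lower it when above), the power limits, and the per-content step table are all assumptions made for illustration; they are not taken from the text.

```python
# Hypothetical step sizes DP per content type (arbitrary power units).
STEP_BY_CONTENT = {"slow": 0.5, "medium": 1.0, "fast": 2.0}


def adjust_power(p, q_measured, q_target, step, p_min, p_max):
    """Move the transmission power P one step of size DP towards the target quality.

    Assumed rule: measured quality below target -> raise P;
    above target -> lower P (to save resources); equal -> leave P unchanged.
    The result is clamped to the range [p_min, p_max].
    """
    if q_measured < q_target:
        p += step
    elif q_measured > q_target:
        p -= step
    return min(p_max, max(p_min, p))


# Fast-moving content is given a larger step in this sketch.
new_p = adjust_power(10.0, q_measured=60, q_target=70,
                     step=STEP_BY_CONTENT["fast"], p_min=0.0, p_max=20.0)
print(new_p)
```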
Other characteristics and advantages of the invention appear on reading the following description with reference to the drawings, in which:
The quality of a video service played back to the end user is clearly influenced by the method used for data rate reduction encoding, the resources allocated to the service in the transmission network, and the capacities of the display terminal.
1) Methods of Video Compression or of Rate-Reduction Encoding:
They enable a binary information stream representing video images to be adapted to the capacities of equipment situated downstream: network equipment, terminal equipment. However these methods lead to losses of information: the images played back after decoding are not identical to the original images. This can lead to visible degradation of the images as decoded, thus having an impact on the quality of service delivered to the end user.
The extent to which a coding degradation is visible varies as a function of numerous parameters: the content of the video signal; the bit rate of the encoded binary stream; spatial resolution; the frequency with which images are refreshed, etc. In order to play back a desired level of quality, the parameters of the rate-reduction encoding method must therefore be selected with care.
2) Transmitting the Binary Stream Reduced by the Rate-Reduction Encoding to the Terminal Via a Transmission Network:
This transport may be accompanied by loss of binary information. Methods of receiving and then decoding the stream in a terminal then play back video signals that may suffer from visible degradation, thereby having an impact on the quality of service delivered to the end user.
The extent to which the degradation due to transmission is visible varies as a function of numerous parameters: content of the video signal, allocated bit rate or allocated transmission power, transmission protocol (by packet, with or without correction, . . . ), the distribution and the magnitude of losses, the type of information that is lost, etc. The invention proposes maintaining the level of quality delivered to the user while minimizing the use of network resources by adjusting transmission parameters to the requested quality or to the measured quality as compared with the requested quality.
3) The Terminal:
The characteristics of the binary stream and of the video need to be adapted to the processing and display capacities of the display terminal. For example, there is no point in sending a video stream at a resolution greater than the resolution of the screen of the terminal, or one that requires computation capacity exceeding that available in the terminal for receiving or decoding the stream. The characteristics of the terminal thus constitute constraints that need to be taken into account when selecting parameters for the video compression method.
By selecting parameters of the rate-reduction encoding method with respect to quality level in accordance with the invention, it is possible for the video service supplier to guarantee a perceived quality level. In addition, such selection taking account of the characteristics of the terminal enables the operator to minimize the resources needed for storing and/or transmitting the service.
Adjustment of the transmission parameters makes it possible to adapt to a change in the characteristics of the transmission channel in order to maintain perceived quality.
The invention makes it possible to obtain significant data rate savings. The quality perceived after rate-reduction encoding depends very greatly on the encoding rate and on the type of content: to obtain a given quality level, a scene containing movement and fine detail requires a greater data rate than a scene with little movement (said to be less complex).
Without the proposed method of resource allocation that implements measuring perceived quality, there is no way of knowing the quality that is played back on the basis of measurements performed at network level, such as measurements of video stream rates or of binary error rate. One known solution for obtaining good quality is then to allocate the rate needed for guaranteeing the quality of the most complex sequence, and to use that allocation regardless of the particular sequence in question. Under such circumstances,
The sensitivity of a video stream transmitted over a digital network varies depending on the type of video content. The presence of movement has a large influence on the extent to which degradations generated by transmission errors are visible. With transmission over an Internet protocol (IP) network, it can be seen that for a given number of IP packets that are lost, the drop in quality is greater for video sequences having content with a large amount of movement.
Use can be made of this, in practice, in a method of adjusting the transmission power from a UMTS transmitter in order to give priority to video streams that are “complex”, i.e. that present a large amount of movement.
The methods of measuring the applicable video perception quality are those making use of the data coming from the video decoding process:
Reference can be made in particular to the patent applications filed by Télédiffusion de France and published under the numbers EP 1 020 085 and PCT WO 2004/047451, the PCT case being entitled “A method and a system for measuring the degradation to a video image that is introduced by rate-reduction encoding” for examples of these two types of method.
Methods of measuring quality with complete reference are not applicable since they require both the pixels of the video images received after transmission and the pixels of the images before transmission.
The purpose of the optimization procedure is to control the use of resources by seeking an encoding or transmission configuration that enables a given level of perceived quality to be reached. One or other of the following two techniques can be used:
1) using a database of representative instances of the relationship between perceived quality and the encoding or transmission network configuration. A search engine using vector quantization searches for the instance in the database that corresponds best to the desired perceived quality under the (unavoidable) present conditions while minimizing the resources requested of the network;
2) using a determined logical or empirical relationship to make a calculation in advance, giving the relationship between perceived quality and the encoding or transmission network configuration under consideration.
The optimization procedure can be performed by vector quantization.
Vector quantization is a technique that associates a point X (or vector) in t-dimensional space with the closest point Ui=QV(X) from amongst a set of N vectors U1 . . . N known as a dictionary, where closeness is measured in terms of a distance Δ:
U1 . . . N=(Uj, j=1 . . . N) (1)
QV(X)=Ui such that Δ(X,Ui)≦Δ(X,Uk) for all k=1 . . . N (2)
where Δ(X,U) is the distance between the vectors X and U (3)
That technique for modeling complex processes has been used by way of example with image encoding. The image is initially subdivided into subsets such as rectangular blocks of pixels, and then for each block of pixels, vector quantization consists in searching for the block of pixels in the dictionary (referred to as a vector) that is closest. Only an index or an address for the vector is transmitted to the image decoder, which decoder reconstitutes the image because it knows the dictionary and the corresponding vector identifiers.
The concept of distance or distortion between two vectors is introduced in order to search the dictionary for the vector that is closest. Several distances have been proposed for optimizing vector quantization and for maximizing fidelity with the initial signals.
The distance or distortion known as quadratic error is one that is in the most widespread use for vector quantizing.
where (A,B) are two vectors of dimension t.
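For reference, the quadratic error between two vectors of dimension t is conventionally written as the sum of squared component differences. The form below is the usual textbook definition and is given here as an assumption; a normalized variant (for example divided by t) is equally possible.

```latex
\Delta(A,B) = \sum_{i=1}^{t} \left(a_i - b_i\right)^2,
\qquad A = (a_1, \ldots, a_t), \quad B = (b_1, \ldots, b_t)
```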
The use of the vector quantization technique relies on two main steps that are interdependent:
1) forming the dictionary on the basis of a training set; and
2) searching for the nearest neighbor using an appropriate distance.
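The nearest-neighbor search of step 2) can be illustrated with a minimal sketch. It assumes the unnormalized quadratic error as the distance and an exhaustive scan of the dictionary; the function names and the sample dictionary are hypothetical.

```python
def quadratic_error(a, b):
    """Sum of squared component differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(x, dictionary):
    """Return the index of the dictionary vector closest to x (exhaustive search)."""
    return min(range(len(dictionary)),
               key=lambda k: quadratic_error(x, dictionary[k]))


# Hypothetical three-vector dictionary in a two-dimensional space.
dictionary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
index = quantize((8.0, 1.0), dictionary)
print(index, dictionary[index])
```

Only the index needs to be passed on when both sides know the dictionary, which is the property the image-encoding example above relies on.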
The way in which those two steps are used in the invention for controlling the perceived quality of a video service encoded by rate reduction and transmitted digitally is described in succession below in this document.
Generating the dictionary DB constitutes a step that is prior to any optimization of the encoding and transmission configuration by vector quantization. The dictionary is a database DB containing representative instances Uk=U1 . . . N of the relationship between perceived quality and the encoding or transmission network configuration for certain characteristics of the given video content and terminal.
In order to generate the dictionary, a set of tests needs to be performed. The data characterizing the tests consist in a training set {Rk} that is used by a specific procedure for dictionary construction (
Each of the NZ tests is identified by its number z. Each test gives a particular instance of the relationship between the measured perceived quality Qz and the encoding and transmission parameters Pz for the characteristics of the given terminal Tz and video content Cz. Appropriately selecting the various tests performed makes it possible to reach a dictionary that presents high performance.
For this purpose, in order to enable the relationship between the various parameters to be modeled well, the parameters Pz, Tz, and Cz are caused to vary, firstly over a range corresponding to operating conditions in practice, and secondly in such a manner as to obtain the desired perceived quality levels Qz (
Qz, Pz, Tz, and Cz are vectors in the most general case:
Qz=(VQ1,z, . . . , VQnq,z) (5)
with nq=number of quality parameters, and
Pz=(VP1,z, . . . , VPnp,z) (6)
with np=number of encoding and transmission parameters, and
Tz=(VT1,z, . . . , VTnt,z) (7)
with nt=number of terminal parameters, and
Cz=(VC1,z, . . . , VCnc,z) (8)
with nc=number of content parameters.
Each training vector Rz of dimension t comes from the union of Qz, Pz, Tz, and Cz. It characterizes the data set associated with test z (perceived quality, encoding and transmission parameters, terminal parameters, and content parameters):
Rz=Qz∪Pz∪Tz∪Cz=(V1,z, . . . , Vt,z) (9)
with t=nq+np+nt+nc
∪=union
The set of vectors Rz, 1≦z≦NZ constitutes the training set (Table 1). A specific procedure is applied to the training set in order to generate the dictionary of representative instances Uk with 1≦k≦N. Two situations are possible:
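Equation (9) can be read as an ordered concatenation of the four component vectors, as in the sketch below. Treating the union as a concatenation in the fixed order Q, P, T, C is an assumption, and the numeric values are made up for illustration.

```python
def training_vector(q, p, t, c):
    """Build Rz by concatenating Qz, Pz, Tz and Cz in a fixed order.

    Assumption: the union of equation (9) is an ordered concatenation,
    so the dimension of the result is nq + np + nt + nc.
    """
    return tuple(q) + tuple(p) + tuple(t) + tuple(c)


# Hypothetical test z: one quality index, one bit rate (kbit/s),
# one terminal parameter (screen width in pixels), one activity value.
r_z = training_vector(q=(72.0,), p=(128.0,), t=(352.0,), c=(0.4,))
print(r_z, len(r_z))
```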
U1 . . . N=(Rk,k=1 . . . NZ) and N=NZ (10)
The limit of the number of combinations can be set freely, e.g. using implementation criteria such as the size of the database or the computation time needed by the optimization model in order to find the optimum configuration.
Classification algorithms are used. Several authors have proposed solutions for classifying dictionaries: dynamic clouds, or the LBG algorithm. The number N of vectors of the dictionary is selected depending on the initial number of vectors in the training set, the precision of the modeling, and implementation constraints.
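A classification step of this kind can be sketched as follows. This is a simplified generalized Lloyd iteration (the refinement loop at the core of the LBG algorithm), not a full LBG implementation: it omits the codebook-splitting initialization, and the seeding rule, function names, and sample data are assumptions for illustration.

```python
def quadratic_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_codebook(training, n, iterations=20):
    """Reduce a training set of NZ vectors to a dictionary of n vectors."""
    # Assumed seeding: n evenly spaced training vectors.
    step = len(training) / n
    codebook = [tuple(training[int(k * step)]) for k in range(n)]
    for _ in range(iterations):
        # 1) Assign each training vector to its nearest codebook vector.
        cells = [[] for _ in range(n)]
        for r in training:
            k = min(range(n), key=lambda j: quadratic_error(r, codebook[j]))
            cells[k].append(r)
        # 2) Replace each codebook vector by the centroid of its cell.
        updated = []
        for k, cell in enumerate(cells):
            if cell:
                dim = len(cell[0])
                updated.append(tuple(sum(v[i] for v in cell) / len(cell)
                                     for i in range(dim)))
            else:
                updated.append(codebook[k])  # keep the vector of an empty cell
        if updated == codebook:  # converged
            break
        codebook = updated
    return codebook


# Hypothetical training set with two well-separated groups.
training = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0),
            (11.0, 11.0), (11.0, 13.0), (13.0, 11.0), (13.0, 13.0)]
codebook = lloyd_codebook(training, n=2)
print(codebook)
```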
The dictionary obtained by the classification procedure constitutes the database DB (
Naturally, it is possible at the least to make use of a training vector that takes account only of perceived quality and the encoding and transmission parameter. Nevertheless, it is advantageous to take account of content. The parameters of the terminal do not need to be taken into account except when the users of an intended application are diverse and when it is possible to obtain the parameter for the terminal of a given user.
The following step consists in searching for the encoding and transmission configuration.
The first step has generated a dictionary that is representative of the relationship between the measured perceived quality and the encoding and transmission network configuration for certain characteristics of the given video content and terminal.
The second step makes use of the dictionary to find an encoding and transmission configuration that guarantees a certain target quality QC for the end user. To do this, the module RECH searches for said configuration in the database DB (
The data represented by
Q=(VQ1, . . . , VQnq) (11)
where nq=the number of quality parameters VQi
A time and date stamp representative of the time the video content is presented is also associated with this vector Q.
For example, the vector QC may be of dimension nqc=1 and may contain a single value gqc corresponding to the target quality to be achieved for the audiovisual service (e.g. the target quality purchased by the user by contract with the supplier of the audiovisual service).
The vector Q must necessarily contain an audiovisual quality value gq obtained by measurement to enable the method of optimization by vector quantization to operate, by comparing gq with gqc. Q may be of dimension greater than the dimension nqc of QC, for example nqc=1, but nq=3 in the configuration where Q contains three values Q=(aq, vq, gq) corresponding respectively to the quality obtained by measuring the audio (aq), the video (vq), and the audiovisual (gq) signals.
QC=(VQC1, . . . , VQCnqc) (12)
For example nqc=1, QC=target quality index in the range 30 to 95.
T=(VT1, . . . , VTnt) (13)
For example nt=1, parameter VT1=screen resolution.
C=(VC1, . . . , VCnc) (14)
For example nc=1, parameter VC1=activity of a video sequence or the type of the sequence (slow, fast, medium).
P=(VP1, . . . , VPnp) (15)
For example, np=1, 2, or 3, VP1, VP2, VP3=transmission power and/or bit rate and/or passband.
The process for searching for the optimum configuration for encoding and transmission consists in extracting the vector P that gives the encoding and transmission configuration to be used so as to deliver the quality of service to the user as defined by the vector QC representative of the target quality under the current conditions of constraints represented by the values Q, T, and C. The advantage of the vector quantization method is that there is no need to measure the perceived quality Q other than while building the dictionary.
The search process is subdivided into three sub-steps:
a) forming a constraint vector O. The date and time associated with the vector Q is associated with the constraint vector O. This date and time is representative of the time at which the video content is presented;
b) vector quantizing on the constraint vector O to find the vector Uk of the dictionary that corresponds best to the constraint vector O presented at the input; and
c) extracting the vector P of parameters for the encoding and transmission system.
Sub-Step a) Forming the Constraint Vector O
The vector O representing the current set of operating constraints on the system is constituted, in the highest-performance case, by the union of the vectors T and C and a combination Q′ of the vectors Q and QC. Each parameter of the vector O must be unique, while the parameters of the vector QC are all present in the vector Q. The final objective is to find the encoding parameter vector P that enables a target quality as defined by QC to be obtained.
Q′=QC∪{VQi such that VQi does not exist in QC} with VQi defined by Q=(VQ1, . . . , VQnq) (16)
For example, when QC is of dimension nqc=1 and contains a single value gqc corresponding to the target quality to be achieved, and Q is of dimension nq=3 and contains three values Q=(aq, vq, gq) corresponding respectively to the measured audio (aq), video (vq), and audiovisual (gq) qualities, the vector Q′ resulting from applying equation (16) is Q′=(gqc), corresponding to the constraint of the audiovisual quality to be obtained by the encoding and transmission system.
The vector O is then formed by the union of T, C, Q′. The resulting vector is of dimension h:
O=Q′∪T∪C=(VO1, . . . , VOh) (17)
where h=nq+nt+nc
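Sub-step a) might be sketched as below, with each vector held as a mapping from parameter name to value so that the uniqueness requirement on O can be checked. The sketch follows equation (16) as written: a target value replaces the measured value of the same name, and measured values without a target are carried over. The parameter names and numbers are hypothetical.

```python
def combine_quality(q_measured, q_target):
    """Form Q' from Q and QC, reading equation (16) literally."""
    q_prime = dict(q_target)
    for name, value in q_measured.items():
        if name not in q_target:
            q_prime[name] = value
    return q_prime


def constraint_vector(q_measured, q_target, terminal, content):
    """Form O as the union of Q', T and C, rejecting duplicate parameters."""
    o = {}
    for part in (combine_quality(q_measured, q_target), terminal, content):
        for name, value in part.items():
            if name in o:
                raise ValueError(f"duplicate parameter: {name}")
            o[name] = value
    return o


# Hypothetical inputs: measured quality 62, target quality 70.
o = constraint_vector({"gq": 62.0}, {"gq": 70.0},
                      {"resolution": 352.0}, {"activity": 0.4})
print(o)
```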
Sub-Step b) Vector Quantizing
Vector quantizing causes the input vector O of parameters VOi to correspond with the dictionary vector U that is the closest to the constraint vector O presented at the input. Vector quantizing proper is performed on a sub-vector Sk of each vector Uk. The vector O contains only a subset of the parameters of the vectors Uk. The parameters of Uk that are not present in O are the encoding and transmission parameters Pk associated with said set of constraints O. Each vector Sk is thus defined by:
Sk={Vi such that Vi exists in O} with Vi defined by Uk=(V1, . . . , Vt) (18)
Minimizing the distortion between the incident vector O and all the sub-vectors Sk of the vectors U1 . . . N of the dictionary is then performed. It serves to identify the vector U that corresponds best with the constraint vector O.
Sub-Step c) Extracting Encoding and Transmission Parameters
The parameters of U that are not present in O are the encoding and transmission parameters P associated with said set of constraints O. It therefore suffices to extract from U the vector P that represents the encoding and transmission parameters and that is thus defined by:
P={Vi such that Vi does not exist in O} with Vi defined by U=(V1, . . . , Vt) (19)
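Sub-steps b) and c) can be combined in a short sketch: the distortion is computed only over the parameters that O and each dictionary vector have in common, and the remaining parameters of the best match are returned as P. The unweighted quadratic error, the parameter names, and the sample dictionary are assumptions; in practice the components would probably need scaling so that no single parameter dominates the distance.

```python
def search_configuration(o, dictionary):
    """Find the dictionary vector closest to O and extract its parameters P.

    `o` and each dictionary entry map parameter names to values.
    Every entry is assumed to contain all the parameters named in `o`.
    """
    def distortion(u):
        # Quadratic error restricted to the sub-vector shared with O.
        return sum((u[name] - value) ** 2 for name, value in o.items())

    best = min(dictionary, key=distortion)
    # P: the parameters of the best vector that are not constraints.
    return {name: value for name, value in best.items() if name not in o}


# Hypothetical dictionary: quality index, screen width, bit rate in kbit/s.
dictionary = [
    {"gq": 55.0, "resolution": 176.0, "rate": 64.0},
    {"gq": 70.0, "resolution": 176.0, "rate": 128.0},
    {"gq": 70.0, "resolution": 352.0, "rate": 256.0},
]
p = search_configuration({"gq": 68.0, "resolution": 352.0}, dictionary)
print(p)
```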
The entire operation of the search procedure is shown in
Once the parameters of the vector P have been found, together with certain parameters of the vector U found by vector quantizing in sub-step b), if necessary, it is then possible to apply them to the rate-reduction encoding process and to the transmission process.
Some parameters considered as constraint parameters, and thus present in the vector O, can also be parameters that are useful for defining the transmission configuration.
For example, we consider the situation in which it is desired to optimize the video encoding configuration by acting on two parameters, namely spatial resolution and encoding rate. If there are two different types of terminal, corresponding to two possible spatial resolutions for the screen, and if those terminals are not capable of displaying correctly video that is encoded at a resolution other than the resolution of their own screens, then the resolution parameter becomes a constraint for the method of encoding the video with rate reduction. The only parameter of the vector P is then the encoding rate. Nevertheless, the encoding resolution (as imposed by the terminal) must also be applied to the encoding method in order to ensure that the optimization method is exhaustive.
The database DB also has a function of storing data generated by the module for measuring perceived quality, together with optimization decisions taken by the module RECH. For this purpose, the database DB stores the vectors O and Pi shown in
An alternative to vector quantizing is to perform calculation by implementing a relationship that is logically or empirically determined in advance, giving the relationship between perceived quality and the encoding or transmission network configuration under consideration. The optimization procedure f gives the encoding and transmission parameters P that are to be used to obtain a target quality QC, given the characteristics of the terminal T and of the video content C, and given the presently measured quality level Q (
P=f(QC,Q,T,C) (20)
Under these circumstances, all of the knowledge needed for the optimization procedure is thus contained in the deterministic relationship, located in the module RECH. The database DB does not contain data relating to the optimization procedure.
The optimization approach using a deterministic relationship is advantageous since it does not require a database, which might be very large. In contrast, a deterministic relationship can be determined easily only when the number of configurations is small.
The approach using vector quantizing and a database of representative instances is more advantageous when there are numerous configurations.
The invention applies particularly well to providing video sequences on demand from a server, using vector quantization to determine the optimal encoding rate as a function of the resolution of the terminal, the quality requested by the end user, and the type of sequence desired.
This application makes use of the invention to select the data rate for pre-encoded video sequences stored on a video server from amongst a certain number of possible values. The resolution of the user terminal and the desired quality level are taken into account so as to minimize the data rate needed for supplying the service, thus leading to optimum utilization of the transmission network. The transmission network used may, for example, be of one of the following types: Internet protocol (IP); digital video broadcasting (DVB); or universal mobile telecommunications system (UMTS).
The application can use an optimization procedure based on vector quantization, as described above.
Using the same notation, this application thus defines the parameters Q, QC, T, P, and C as follows:
Otherwise, it is possible to characterize the video content by a parameter for the activity of the image in one or more sub-sequences of a few seconds in a sequence.
Two variant ways of building the dictionary are described below, depending on whether the video content is defined by content name or by the type of content in the dictionary contained in the DB module.
The first variant makes use of content name and is described with reference to
1) A number of source video signals are acquired and encoded by rate reduction. The encoding is performed using all possible terminal resolutions, and using a plurality of data rates selected from a range corresponding to the capabilities of the terminals and of the transmission network. In the present example, the CIF and QCIF resolutions are used, with transmission channel data rates lying in the range 48 kbit/s to 384 kbit/s, for example, with a step size of 10 kbit/s, being applied for each of those two resolutions.
2) Each stream is evaluated by the perceived quality measurement module. The quality Qz characterizing the encoded video sequence is the mean quality measured over the sequence.
3) The encoded video streams are stored on a video server. The other data constituting the dictionary stored in DB includes the quality Q, the data rate P of the transmission channel, the resolution T of the terminal, and the content name C. It is therefore not necessary in this example to use a classification procedure since the size of the dictionary remains modest.
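The three steps above can be sketched as follows. This is a minimal illustration only: the function names, the record layout, and the placeholder quality function are assumptions introduced here, and a real system would call the encoder and the perceived quality measurement module instead.

```python
from itertools import product

RESOLUTIONS = ("CIF", "QCIF")
# 48, 58, ..., 378 kbit/s: the 48-384 kbit/s range sampled with a 10 kbit/s step.
RATES_KBITS = range(48, 385, 10)


def build_dictionary(content_names, measure_quality):
    """Return one record (Q, P, T, C) per encoded stream.

    `measure_quality` is a hypothetical callable standing in for the
    encode-then-measure chain; it returns the mean quality over the sequence.
    """
    dictionary = []
    for name, resolution, rate in product(content_names, RESOLUTIONS, RATES_KBITS):
        q = measure_quality(name, resolution, rate)
        dictionary.append({"Q": q, "P": rate, "T": resolution, "C": name})
    return dictionary


def placeholder_quality(name, resolution, rate):
    # Stand-in only: NOT a perceptual model, just a value that grows with rate.
    return min(100.0, rate / 4.0)


entries = build_dictionary(["football", "kayak"], placeholder_quality)
```

With two contents, two resolutions, and 34 rates, the sketch produces 136 records, which illustrates why the dictionary stays small enough to be searched without a classification step.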
The dictionary can then be used by the module RECH to find the data rate needed, as explained above.
A second variant using content type instead of content name is described below with reference to
To do this, a classification procedure is preferable so as to group together the various quality measurements Qz carried out under the same viewing and encoding conditions Tz and Pz for a plurality of sequences that are different, but all of the same type Cz, with this being stored in a single vector Q, T, P, C. In this embodiment, the classification procedure used is preferably the LBG algorithm with distance as presented below in equation (21).
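The grouping idea can be illustrated with a deliberately simplified stand-in. The sketch below does not implement the LBG algorithm; it only averages the quality measurements that share the same (T, P, C) triple, which shows how several sequences of the same type collapse into a single dictionary vector. The function name and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def group_by_conditions(measurements):
    """Collapse measurements (Qz, Tz, Pz, Cz) into one vector (Q, T, P, C)
    per distinct (T, P, C) triple, using the mean quality as Q.

    Simplified stand-in for a classification procedure, not LBG itself.
    """
    groups = defaultdict(list)
    for qz, tz, pz, cz in measurements:
        groups[(tz, pz, cz)].append(qz)
    return [(mean(qs), t, p, c) for (t, p, c), qs in groups.items()]


# Invented measurements for two sport sequences and one news sequence.
vectors = group_by_conditions([
    (60, "CIF", 128, "sport"),
    (70, "CIF", 128, "sport"),
    (80, "QCIF", 64, "news"),
])
```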
Initially, the user accesses a list of content stored on a video server, the content being identified by name and type, e.g. by means of an Internet browser; the user selects a content and a desired level of quality and makes a request. Thereafter, the mechanism using the invention takes place in three stages without intervention by the user:
1) The user terminal sends to the module RECH its own characteristics, the characteristics of the selected content, the desired quality level QC, and possibly the most recent quality measurement Q.
2) The module RECH searches by means of vector quantization in the database for the encoding rate P that exists for the content C on the video server that will ensure the requested quality QC, at the resolution imposed by the terminal, and it sends this information to the terminal. The parameters as received and then sent to the terminal are also stored in the database, e.g. for subsequent analysis.
3) The user terminal accesses the content C selected by the user at the rate P selected by RECH, and the user obtains the requested content with quality QC.
It should be observed that the linear distance between two vectors A and B that may be used herein for vector quantization is simpler to implement than is the quadratic distance of equation (4).
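As an illustration of the difference, the sketch below assumes that the linear distance is the sum of absolute coordinate differences and that the quadratic distance is the sum of squared differences. Equation (4) is not reproduced in this passage, so both definitions are assumptions rather than the exact formulas of the description.

```python
def linear_distance(a, b):
    """Sum of absolute coordinate differences (assumed form of the linear distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def quadratic_distance(a, b):
    """Sum of squared coordinate differences (assumed form of the quadratic distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Two illustrative vectors (quality, bit rate in kbit/s).
A = (70, 128)
B = (60, 192)
d_lin = linear_distance(A, B)      # 10 + 64
d_quad = quadratic_distance(A, B)  # 100 + 4096
```

The linear form needs only subtractions and additions, which is one way of reading the remark that it is simpler to implement.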
RECH receives the characteristics of the terminal T and of the content C, and possibly also the quality measurements Q. It stores these measurements in the database DB via a database management system (DBMS). Thereafter RECH performs a search for the best encoding or transmission configuration P on the basis of the dictionary also stored in DB. The configuration P is sent to the equipment concerned.
In a variant of this first application in which the optimum encoding rate is selected as a function of the resolution of the terminal and of the requested quality, the method of the invention is used to minimize the rate at which video sequences are encoded while taking account solely of the quality level that is to be reached, this leading to optimum utilization of the transmission network. This approach is particularly applicable when the utilization conditions of the video service, and in particular the type of terminal, and the type of service content, vary little. This applies for example with an on-request video service for viewing on television type terminals by users.
The invention uses the same optimization procedure based on vector quantization, as described above for said first application, and with the same notation. The main difference is that the parameters T and P are empty. Vector quantization is then based on:
The same methods can be used for building the dictionary and for optimizing the parameters P, in this case restricted to encoding rate.
In another variant, the method of the invention can be used for example to adjust the transmission power as a function of desired quality and possibly also as a function of the type of content, without implementing vector quantization.
This application adjusts the transmission power level of the service from a transmitter of the UMTS access network as a function of perceived quality instead of as a function of standard network level parameters as used in UMTS, such as the signal-to-noise ratio Eb/No. The idea is to maintain a given quality level and not a target bit error rate.
The sensitivity of a video stream transmitted over a digital network varies depending on the type of video content. The presence of movement has a large influence on the extent to which degradation caused by transmission errors is visible. In the proposed implementation, the invention takes advantage of this property to react only when that is needed in order to maintain the perceived quality.
Under such circumstances, the invention can make use of an optimization procedure based on a deterministic algorithm, as described above. There is then no training procedure leading to a dictionary. Using the same notation as above, this application defines the parameters Q, QC, T, P, and C as follows:
The user accesses a list of content stored on a video server, e.g. by using an Internet browser, the content being identified by name and by type; the user selects a content and a desired quality level. Thereafter, the mechanism using the invention takes place in three stages without intervention by the user:
1) The terminal sends periodically to the module RECH the most recent quality measurement Q, the desired quality level QC, and in the second variant, the characteristics C of the selected content.
2) The module RECH applies the optimization procedure on the basis of C, QC, and Q in order to discover the power P needed for ensuring the requested quality QC under the conditions of quality as presently perceived for the content C. This power P is applied to the video service transmission network.
3) The parameters received from the terminal and then sent to the network are also stored in the database, e.g. for subsequent analysis.
This optimization procedure that does not make use of vector quantization acts on power as a function of measured perceived quality Q. The greater the departure of the measured quality Q from the target quality QC, the greater the amount of variation in the power.
The procedure periodically calculates the new power P, e.g. once every second, on the basis of the current power Pold. This can be summarized as follows:
A step size DP for increasing the power is defined.
The function sign(X) returns the sign of X. Thus, power is increased when Q<QC. For example DP may represent 1% to 5% of the power.
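The update rule itself is not reproduced in this passage. A minimal sketch, assuming the rule P = Pold + DP·sign(QC − Q) with DP expressed as a fraction of the current power, could look like this; the function names and the default step of 2% are illustrative assumptions.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def update_power(p_old, q, qc, step_fraction=0.02):
    """One periodic update of the transmission power.

    Assumed rule: P = Pold + DP * sign(QC - Q), with DP = step_fraction * Pold.
    Power rises when measured quality Q is below target QC, and falls when above.
    """
    dp = step_fraction * p_old
    return p_old + dp * sign(qc - q)


raised = update_power(100.0, q=60, qc=70)   # quality too low: power goes up
lowered = update_power(100.0, q=80, qc=70)  # quality too high: power goes down
held = update_power(100.0, q=70, qc=70)     # on target: power unchanged
```

Under this assumed rule the step per period is fixed, so a larger gap between Q and QC translates into more consecutive steps in the same direction and therefore a larger cumulative change in power.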
The method can also be implemented to take account simultaneously of the quality measured at the terminal and the type of content, without having recourse to vector quantization.
This variant takes advantage of the variation in sensitivity of a video stream to transmission errors depending on the type of video content. When transmitting over an IP or a UMTS network, it is found that for a given number of IP packets that are lost, the drop in quality is greater for video sequences having content with a large amount of movement.
This second variant of the optimization procedure takes advantage of this property by using the following procedure:
Two step sizes for increasing power are defined, one for each type of content: DP_sport>DP_news.
If C=sport, DP=DP_sport, e.g. 2% to 10% of the power. Else DP=DP_news, e.g. 1% to 5% of the power.
The function sign(X) returns the sign of X. Thus, power is increased when Q<QC.
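The content-dependent variant can be sketched in the same way. The update rule P = Pold + DP·sign(QC − Q) is again an assumption, as are the function names and the two step fractions, which are simply picked from inside the ranges quoted above.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


# Illustrative step sizes as fractions of the current power (DP_sport > DP_news).
DP_SPORT = 0.05
DP_NEWS = 0.02


def update_power_for_content(p_old, q, qc, content_type):
    """One periodic power update with a step that depends on the content type.

    Assumed rule: P = Pold + DP * sign(QC - Q), where DP is larger for sport
    because high-motion content is more sensitive to transmission errors.
    """
    fraction = DP_SPORT if content_type == "sport" else DP_NEWS
    return p_old + fraction * p_old * sign(qc - q)


p_sport = update_power_for_content(100.0, q=60, qc=70, content_type="sport")
p_news = update_power_for_content(100.0, q=60, qc=70, content_type="news")
```

For the same quality shortfall, the sport stream receives the larger correction, which matches the intent of defining DP_sport>DP_news.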
For the first variant of the first application (selecting the optimum encoding rate as a function of the resolution of the terminal and the requested quality, using content name):
The table shows a real example of a portion of a dictionary used for searching for the optimum rate as a function of a target quality and of a content designated by its name, with a display resolution constraint. The extract shown is valid for five different contents encoded in a combination of two resolutions and four rates. These contents have the following names: football, kayak, wood, TV news, and cartoon.
The coordinates of each line in Table 1, i.e. of each vector of the dictionary, can be associated with the definitions of the vectors Q, C, T, and P defined above for equations (5), (6), (7), and (8), as follows:
Q=(pqos) and nq=1
pqos=sequence quality in the range 1 to 100
C=(sequence name) and nc=1
T=(image size) and nt=1
P=(bit rate) and np=1
Applying the vector quantization procedure with a vector QC containing a target quality value of pqos then makes it possible to select the optimum value for the “bit rate” parameter. In this example, the distance between two “image size” (or “bit rate”) coordinates is zero if the two coordinates are equal; otherwise it can be selected, for example, as being equal to 100 so as to have an order of magnitude comparable to the pqos coordinate.
Then, as mentioned above in the description (sub-step c), the encoding configuration sent to the encoder is made up of the vector P possibly together with some of the elements of the vector U. In the present example, the parameters “image size” and “bit rate” constitute this configuration.
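A nearest-vector search of this kind can be sketched as follows. The dictionary rows are invented for illustration and are not taken from Table 1; the function names are hypothetical, and applying the same 0-or-100 rule to the content name coordinate is an assumption made here.

```python
CATEGORY_MISMATCH = 100  # penalty for unequal labels, comparable in scale to pqos

# Hypothetical rows (Q, C, T, P): pqos, sequence name, image size, bit rate in kbit/s.
DICTIONARY = [
    (45, "football", "QCIF", 64),
    (62, "football", "QCIF", 128),
    (78, "football", "QCIF", 256),
    (55, "football", "CIF", 128),
    (74, "football", "CIF", 256),
    (70, "TV news", "QCIF", 64),
    (85, "TV news", "QCIF", 128),
]


def label_distance(a, b):
    """Zero when two labels are equal, otherwise the fixed mismatch penalty."""
    return 0 if a == b else CATEGORY_MISMATCH


def find_bit_rate(target_quality, content, image_size, dictionary=DICTIONARY):
    """Return the bit rate P of the dictionary vector closest to (QC, C, T)."""
    def distance(row):
        q, c, t, _p = row
        return (abs(q - target_quality)
                + label_distance(c, content)
                + label_distance(t, image_size))
    return min(dictionary, key=distance)[3]


rate_qcif = find_bit_rate(75, "football", "QCIF")
rate_cif = find_bit_rate(60, "football", "CIF")
```

Because a label mismatch costs as much as the whole pqos range, the search effectively stays within the rows matching the requested content and image size. Note that a nearest-vector search can return a row whose quality is slightly below the target.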
For the second variant of the first application (selecting the optimum encoding rate as a function of the resolution of the terminal and of the requested quality, while using content type):
The table gives a real example of a dictionary used for searching for the optimum rate as a function of a target quality and of a content type (news or sport) with a display resolution constraint.
The coordinates of each line of Table 2, i.e. of each vector of the dictionary, can be associated with the definitions of the vectors Q, C, T, and P as defined above in equations (5), (6), (7), and (8), as follows:
Q=(pqos) and nq=1
C=(content type) and nc=1
T=(image size) and nt=1
P=(bit rate) and np=1
Applying the vector quantizing procedure with a vector QC containing a target quality value of pqos then enables the optimum value to be selected for the “bit rate” parameter. In the present example, the distance between two image size or bit rate coordinates is zero if the two coordinates are equal, otherwise it can be selected to be equal to 100, for example, in order to present an order of magnitude comparable to the pqos coordinate.
Thereafter, as mentioned above in the description (sub-step c), the encoding configuration sent to the encoder is made up of the vector P possibly together with some of the elements of the vector U. In the present example, the “bit rate” and “image size” parameters constitute this configuration.
Number | Date | Country | Kind |
---|---|---|---|
0413321 | Dec 2004 | FR | national |

 | Number | Date | Country |
---|---|---|---|
Parent | PCT/FR05/02868 | Nov 2005 | US |
Child | 11808804 | | US |