This invention relates generally to the technical field of multimedia streaming over wired and wireless networks.
Multimedia applications (such as mobile television, video on demand, IPTV, video conferencing, digital video broadcasting (DVB), audio/video streaming, two-way video telephony, real-time gaming and the like) are steadily gaining popularity and acceptance, especially among mobile users. This development is mainly due to wireless networks which, as an extension of the existing wired infrastructure, offer mobility and portability to the end-user. Hence, great attention is paid to satisfying Quality of Service (QoS) requirements, with a view to the widespread adoption of multimedia applications.
Nevertheless, multimedia data transmission is subject to multiple constraints that severely limit the QoS offered to end-users. These constraints stem mainly from several key requirements, specific to the nature of multimedia applications as compared with other kinds of applications, that need to be satisfied so as to provide reliable and efficient transmission:
Moreover, apart from the bandwidth scarcity due to an increased number of users, multimedia delivery to wireless receivers is particularly challenging
To address these problems, various error-control strategies based on a feedback mechanism, i.e. from the receiver to the source, have been proposed. This feedback mechanism conveys information regarding the path characteristics and receiver behavior (estimated at the receiver) to the transmitter (i.e. the source). To do so, the receiver sends channel quality measurements (for example the available bandwidth, the status of the stream path, the loss rate) toward the source. A common technique implementing this feature is to use, for example,
Reported network information is then utilized by the source to optimize the transport of multimedia streams (rate adaptation, transcoding, packet drop, frame drop, or layer drop in the case of a scalable stream), for example through an RTCP-based traffic-encoding adjustment at the application layer.
In the recent literature, different proposals for cross-layer approaches utilizing this feedback can be found. These approaches aim to coordinate and optimize, jointly or separately, the performance of the layers by adapting their behavior to the constantly varying reported feedback.
On the other hand, one can mention that
A further problem is the complexity of managing feedback between a mobile receiver and the source of the multimedia application.
One object of the present invention is to improve end-user QoS in multimedia applications without using feedback messages from the receiver.
Another object of the present invention is to ensure reliable transport of multimedia streams using feedback from the MAC layer.
Another object of the present invention is to dispense with applicative feedback (e.g. RTCP).
Another object of the present invention is to estimate the channel condition in a time-varying channel without applicative feedback.
Another object of the present invention is to effectively use the bandwidth allocated for the transmission, over wired and wireless networks, of a real-time multimedia application.
Another object of the present invention is to enable the derivation of the desired QoS metrics without using applicative feedback.
Another object of the present invention is to provide channel-adaptive source coding and error-control schemes without using applicative feedback.
Another object of the present invention is to provide channel-adaptive source coding and error-control schemes that cope with bandwidth variations and data losses.
Another object of the present invention is to provide improved QoS for multimedia applications over a variety of channel conditions.
Another object of the present invention is to control the data-encoding rate without using applicative feedback.
Another object of the present invention is to provide a mechanism that can improve the QoS of real-time multimedia applications over wireless networks.
Another object of the present invention is to provide a method for mitigating delay and packet loss ratio during the transmission of multimedia data over wired and/or wireless networks.
The objects, advantages and other features of the present invention will become more apparent from the following disclosure and claims. The following non-restrictive description of preferred embodiments is given for the purpose of exemplification only with reference to the accompanying drawing in which
The present invention is directed to addressing the effects of one or more of the problems set forth above. The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an exhaustive overview of the invention. It is not intended to identify key or critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is discussed later.
The present invention relates to a video packets scheduling method for multimedia streaming toward a receiver provided with a video decoder, via a transmission chain including an access point and a proxy, said proxy provided with post-encoder buffers and with a controller, said access point provided with a buffer of lower layer, said method comprising a resolution step of an optimisation problem controlling the state of the buffer in the access point and the state of the post-encoder buffers in the proxy.
In accordance with a broad aspect, feedback messages transmitted from the receiver are not considered in the controller for video packets scheduling.
In accordance with another broad aspect, the optimization problem is formulated in the framework of a discrete time Markov Decision Process.
The present invention further relates to a controller for video packets scheduling from post-encoder buffers to a buffer of lower layer in an access point, said video packets to be streamed to a receiver provided with a video decoder, said controller programmed for solving an optimization problem controlling the state of the buffer in the access point and the state of the post-encoder buffers (31) in the proxy.
The present invention further relates to a computer program product adapted to perform the method cited above.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings. It should be understood, however, that the description herein of specific embodiments is not intended to limit the invention to the particular forms disclosed.
It may of course be appreciated that in the development of any such actual embodiments, implementation-specific decisions should be made to achieve the developer's specific goal, such as compliance with system-related and business-related constraints. It will be appreciated that such a development effort might be time consuming but may nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.
The video sequence 1 may come from different sources such as a storage device (a database, a multimedia server, or a video server, for example) or a live camera feed.
The mobile station 7 is any user equipment able to receive and play a multimedia stream. A smart-phone, a tablet, a computer, a Personal Digital Assistant (PDA), and a laptop are non-limiting examples of such a mobile station 7.
The wireless network 12 may be a wireless IP network, a Wireless Personal Area Network, a Wireless Local Area Network, a Wireless Metropolitan Area Network, a Wireless Wide Area Network, or more generally any mobile device network, which may result from the combination of more than one wireless network.
More generally, the video sequence 1 is routed to the mobile station 7 via a wired network 10 that includes
Within the streaming server 2, the video sequence 1 is segmented into frames encoded into
Persons skilled in the art will readily realize that, in the case of H264 AVC, wherein there are two kinds of inter-frames (the predicted frame, commonly denoted P, and the bi-predicted frame, commonly denoted B), L is equal to 2. Further, H264 AVC may be seen as a particular case of H264 SVC wherein the scalability used is temporal scalability. Accordingly, hereafter and for the sake of generality, the notation L and the term H264 SVC are used.
In the case of an H.264 SVC scalability scheme, Access Units (AU) are the basic processing units (macroblock(s), slices, or frame(s)), consisting of the base layer and its corresponding enhancement layers.
Encoding parameters (quantization steps, frame rate and the like) are controlled by the streaming server 2, independently of the remainder of the transmission chain. Each scalable layer of each encoded frame is packetized (for example into RTP, UDP, or IP packets), then delivered via an over-provisioned core network 10 to L post-encoder buffers 31 (one per layer) situated in the proxy 3. The controller 32 performs layer filtering within the proxy 3: for each layer, packets may be sent, kept, or dropped.
Sent packets are fed to the MAC buffer 4 (or more generally to a buffer 4 of lower layer) in the access point 11 after being segmented into Packet Data Units (PDUs). PDUs are then transmitted to the mobile station 7, which stores correctly received PDUs in its own MAC buffer 71. Packet de-encapsulation and buffering in one of the L buffers 71 at the application layer of the mobile station 7 are done as soon as all corresponding PDUs have been received. Complete or incomplete AUs are then processed by the video decoder 72 of the mobile station 7. Outdated packets are dropped without being decoded.
Among the components of the access point 11, particular attention is paid to its MAC (or lower layer) buffer 4. To make this apparent, the remaining components of the access point 11 (the MAC scheduler 5, the physical layer, the radio front-end for example) are assumed to belong to the channel 13. This channel 13 further comprises the wireless channel 6, the physical layer of the mobile station 7, and the MAC buffer 71 (i.e. the MAC layer managing ACK/NACK procedures) of the mobile station 7.
The MAC buffer 4 of the access point 11 sends feedback (link 34 on
Based on only the reported feedbacks from the MAC buffer 4 of the access point 11, a scalable layer filtering process, operated by the controller 32 of the proxy 3, is designed in such a way that
Accordingly, without any additional observation of the state of the channel 13, the controller 32 has to perform scalable layer filtering (noting that some scalability layers may be dropped) using only observations made on the MAC buffer 4. Alternatively or in combination, other observation points, instead of the MAC buffer 4, such as an RLC buffer or a PDCP buffer, may be adopted.
In other words, by observing only the evolution of the fullness of the last buffer in the path before the wireless channel 12, namely the MAC buffer 4, the controller 32 of the proxy 3 can decide which packets/layers to transmit. That is to say, the controller 32 performs a video packet scheduling algorithm without RTCP feedback messages from the mobile station 7 to the proxy 3. Only feedback provided by the ACK/NACK may be exploited to derive the state of the channel (link 57 on
The above items are achieved through the resolution of an optimization problem formulated in the framework of a discrete-time Markov Decision Process (MDP).
Thus, the design of an efficient layer filtering process is translated into the framework of a discrete-time MDP (R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998). Specifically, an MDP is defined as a 4-tuple (S,A,P,R), wherein
Some policy π(s) ∈ A, s ∈ S, maximizing the immediate reward (myopic policy) or a discounted sum of future rewards (foresighted policy), then has to be found.
The state of the controlled system consists of gathering
The state of the system is thus s=(s_l^e, s^m, h).
The channel is modeled by a first-order Markov process with n states, with known transition probability p(h_{t+1}|h_t) and stationary probability p(h_t). h_t can represent, for example, the rate available during the considered time slot. Two hypotheses concerning the knowledge of the state of the channel are considered:
With regard to the action α, the proxy 3 may, at time t, send, hold, or drop packets for each layer l. The action α_{l,t} ∈ A taken for the l-th layer between time t and t+1 represents the number of packets transmitted from the post-encoder buffer 31 to the MAC buffer 4 (when α_{l,t}>0), or the number of dropped packets (when α_{l,t}<0). If α_{l,t}=0, packets are kept in the post-encoder buffer 31. The vector gathering all actions is α=(α_1, ..., α_L) ∈ A^L.
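The send/hold/drop semantics of a per-layer action can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `apply_action`, the use of `deque` objects as buffers, and the choice to send or drop from the head of the queue are all hypothetical.

```python
from collections import deque

def apply_action(post_encoder_buffer, mac_buffer, action):
    """Apply one per-layer action to a post-encoder buffer.

    action > 0 : move up to `action` packets to the MAC buffer (send)
    action == 0: keep all packets where they are (hold)
    action < 0 : discard up to `-action` packets (drop)
    Returns the pair (sent, dropped). Head-of-queue order is an assumption.
    """
    sent = dropped = 0
    if action > 0:
        for _ in range(min(action, len(post_encoder_buffer))):
            mac_buffer.append(post_encoder_buffer.popleft())
            sent += 1
    elif action < 0:
        for _ in range(min(-action, len(post_encoder_buffer))):
            post_encoder_buffer.popleft()
            dropped += 1
    return sent, dropped

# Hypothetical two-layer example: send 2 base-layer packets, drop 1 enhancement packet.
layers = [deque(["b0", "b1", "b2"]), deque(["e0", "e1"])]
mac = deque()
alpha = (2, -1)  # one action per layer
results = [apply_action(buf, mac, a) for buf, a in zip(layers, alpha)]
```

The joint action is simply the tuple of per-layer actions, applied layer by layer; a real controller would also have to account for segmentation into PDUs, which this sketch ignores.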
Once all states S and actions A have been identified, one has to determine the transition probability matrix P(s,s′,α). To that end, two cases are distinguished as follows:
P_1(s_t, s_{t+1}, α_t) = Pr(s^e_{t+1}, s^m, h_{t+1} | s^e_t, s^m, h_t, α_t)   (1)
In order to limit the size of these transition matrices P_1(s_t, s_{t+1}, α_t), one can quantize the values that may be taken by the states S in a more or less coarse way, so as to reach a compromise between complexity and description accuracy.
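The trade-off between quantization coarseness and state-space size can be illustrated with a short sketch. The uniform quantizer, the function names, and the example figures (three layers, four levels per buffer, two channel states) are assumptions chosen for illustration only; the disclosure does not fix a particular quantization rule.

```python
def quantize_fullness(fullness, capacity, n_levels):
    """Map a buffer fullness in [0, capacity] to a level in [0, n_levels - 1].

    Uniform quantization is an assumption; values outside the range are clipped.
    """
    if capacity <= 0 or n_levels < 1:
        raise ValueError("capacity must be positive and n_levels at least 1")
    clipped = min(max(fullness, 0), capacity)
    level = int(clipped * n_levels / capacity)
    return min(level, n_levels - 1)  # fullness == capacity falls in the top level

def state_space_size(n_layers, levels_per_buffer, n_channel_states):
    """Number of joint states: L post-encoder buffers, one MAC buffer, one channel."""
    return (levels_per_buffer ** n_layers) * levels_per_buffer * n_channel_states

# Hypothetical sizing: 3 layers, 4 levels per buffer, 2 channel states.
size = state_space_size(n_layers=3, levels_per_buffer=4, n_channel_states=2)
```

Since each transition matrix has one row and one column per joint state, doubling the number of levels per buffer multiplies the matrix dimensions by 2^(L+1), which is why a coarse quantization may be preferred.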
Concerning the reward function R(s,s′,α) at time t, the layer filtering process (i.e. the scheduling algorithm), performed by the controller 32, chooses an action that maximizes the QoS (notably, the video quality) at the receiver side (i.e. at the mobile station 7). To that end, the average Peak Signal to Noise Ratio (PSNR) of the decoded frames is maximized.
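For reference, the PSNR metric mentioned above can be computed as in the following sketch, which uses the standard definition 10·log10(MAX²/MSE). The representation of a frame as a flat sequence of samples, the 8-bit peak value of 255, and the function names are assumptions made for illustration.

```python
import math

def psnr(reference, decoded, max_value=255):
    """PSNR in dB between two equally sized sample sequences (8-bit assumed)."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)

def average_psnr(reference_frames, decoded_frames, max_value=255):
    """Arithmetic mean of per-frame PSNR values (infinite if any frame is identical)."""
    values = [psnr(r, d, max_value) for r, d in zip(reference_frames, decoded_frames)]
    return sum(values) / len(values)

# Every sample is off by one, so the MSE equals 1.
example = psnr([10, 20, 30, 40], [11, 21, 31, 41])
```

Computing this metric requires the decoded frames, which are only available at the receiver; a transmitter-side controller therefore needs a proxy for it.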
In order to avoid the variability of the delay between the time at which an AU is filtered and the time at which it is displayed, an alternative reward function R(s,s′,α), which penalizes dropped packets as well as buffer overflow and underflow according to the system constraints, is built. This reward function is expressed as follows:
where E[.] denotes the expectation function.
The positive parameters γ_l, α_l, and β_l, with l=1, ..., L, trade off the importance of the various constraints. The reward function (3) involves several parts: the first is linked to the number of transmitted SNR layers, the others to the constraints on the post-encoder buffers 31 and the MAC buffer 4. The reward function (3) is a function of the state of the post-encoder buffers 31 and the state of the MAC buffer 4.
Assuming that increasing the amount of transmitted packets increases the received quality, the transmission reward should help to maximize the amount of transmitted packets.
The parameters γ_l make it possible to give a higher priority to packets belonging to the base layer than to those of the enhancement layers. For the constraints on the post-encoder buffers 31 and the MAC buffer 4, ρ1(.) and ρ2(.) provide positive rewards for satisfactory buffer states and negative rewards for states that should be avoided.
The policy π, as a mapping from joint states to joint actions in the considered system, indicates the number of scalable layers to transmit, knowing the state of the post-encoder buffers 31 and of the MAC buffer 4.
The optimal foresighted policy consists in finding the optimal stationary Markov policy π* corresponding to the optimal state-value function defined as
where 0≤α<1 is the discount factor, which defines the relative importance of present and future rewards. The optimal foresighted policy may be obtained by value or policy iteration algorithms (R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998). The value in (4) is updated and iterated for all states s until it converges, i.e. until the left-hand side equals the right-hand side (which is the Bellman equation for this problem). When α=0, one gets a myopic policy, maximizing only the immediate reward.
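The value iteration procedure referred to above can be sketched generically as follows. This is a textbook formulation, not the disclosed controller: the dictionary layout of the model, the function names, and in particular the two-state toy model with its rewards are hypothetical and chosen only so that the myopic and foresighted policies differ. The `discount` argument plays the role of the discount factor α.

```python
def q_value(outcomes, values, discount):
    """Expected return of one action; outcomes = [(probability, next_state, reward)]."""
    return sum(p * (r + discount * values[s2]) for p, s2, r in outcomes)

def value_iteration(transitions, discount, tol=1e-9, max_iter=10_000):
    """Iterate the Bellman update until the largest change falls below `tol`.

    transitions[state][action] is a list of (probability, next_state, reward).
    Returns (state values, greedy policy).
    """
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iter):
        new_values = {
            s: max(q_value(o, values, discount) for o in actions.values())
            for s, actions in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, discount))
        for s, actions in transitions.items()
    }
    return values, policy

# Hypothetical toy model: MAC buffer either "ok" or "full"; rewards are invented.
toy_mdp = {
    "ok":   {"send": [(1.0, "full", 2.0)],  "hold": [(1.0, "ok", 1.5)]},
    "full": {"send": [(1.0, "full", -5.0)], "hold": [(1.0, "ok", 0.0)]},
}
_, myopic = value_iteration(toy_mdp, discount=0.0)
values, foresighted = value_iteration(toy_mdp, discount=0.9)
```

In this toy model the myopic policy sends whenever the buffer is not full, because sending yields the larger immediate reward, whereas the foresighted policy holds in order to avoid the low-reward "full" state.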
Accordingly, the proposed algorithm controls the level of buffers 4 at MAC (link 34 on
The proposed algorithm jointly performs packet scheduling and buffer management in both the application and MAC layers at the transmitter side, according to a cross-layer control mechanism.
The controller 32 retrieves information concerning the MAC buffer 4, solves the above MDP optimization problem, and then derives updated operational parameters for the physical and application layers.
The use of the optimized parameters permits to
Accordingly, the proposed method permits to
The proposed method is a scheduling algorithm for AVC or SVC video streaming that provides quality-aware, adaptive, and selective frame/packet transmission.
The performance of the proposed layer filtering process has been evaluated on several video sequences.
As illustration of the gain achievable with the disclosed method,
Five curves are plotted on
The settings used in this illustrative example are:
for all layers.
The channel state transition probabilities are p11=0.9 and p00=0.8, resulting in an average channel rate of
Four possible actions per layer are considered at each time instant A={−1,0,1,2}
Known average source and encoder characteristics have been considered, leading to known average packet lengths in each layer.
In the two cases considered above, the evolution of the PSNR of the decoded sequence obtained with a myopic policy (α=0) and with a foresighted policy (α=0.9) is represented in
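Two of the settings above lend themselves to a quick numerical check: the stationary probabilities implied by p11=0.9 and p00=0.8 for a two-state channel, and the size of the joint action set built from A={−1,0,1,2}. The sketch below assumes a two-state chain with states labeled 0 and 1; the function names are hypothetical, and the number of layers passed to `joint_actions` is an illustrative value, not one taken from the settings.

```python
from itertools import product

def stationary_two_state(p00, p11):
    """Stationary probabilities (pi0, pi1) of a two-state Markov chain.

    p00 and p11 are the probabilities of staying in state 0 and state 1.
    """
    leave0 = 1.0 - p00
    leave1 = 1.0 - p11
    total = leave0 + leave1
    if total == 0:
        raise ValueError("chain never changes state; stationary law is not unique")
    return leave1 / total, leave0 / total

def joint_actions(per_layer_actions, n_layers):
    """All joint action vectors, one per-layer action for each of the layers."""
    return list(product(per_layer_actions, repeat=n_layers))

pi0, pi1 = stationary_two_state(p00=0.8, p11=0.9)

# n_layers=3 is a hypothetical value used only to show how the action set grows.
actions = joint_actions((-1, 0, 1, 2), n_layers=3)
```

With these transition probabilities the chain spends one third of the time in state 0 and two thirds in state 1; the average channel rate then follows by weighting the per-state rates with these probabilities.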
When the applicative feedback is used, an average gain of about 1.5 dB is obtained with the foresighted policy compared to the myopic one. This gain is mainly due to more packets of the first enhancement layer reaching the receiver.
When no applicative feedback is used (the target of the embodiment), with the myopic policy, the MAC buffer 4 is in the overflow state about 46% of the time, sometimes exceeding the maximum buffer size. This situation results in the loss of some PDUs, which induces a notable decrease in the received video quality. With the foresighted policy, the MAC buffer 4 is in the overflow state about 25% of the time but never loses PDUs.
Using the foresighted policy without applicative feedback results in a loss of 0.5 dB in PSNR compared with the case where an applicative feedback strategy is deployed. The availability of the state of the MAC buffer 4 thus provides a reasonable estimate of the state of the channel, allowing a satisfactory regulation of the received video quality.
Accordingly, the maximum PSNR is obtained when the disclosed method is introduced, and particularly, in this example, for frames after the 50th.
It is worth noting that the above-described method addresses conventional codecs known to the person skilled in the art, such as H264 AVC, H264/SVC, any further version of them, or any equivalent codec.
The above-described method is well suited to multimedia streaming over wireless networks, where channel conditions change frequently, and leads to considerably reduced packet loss rates, especially under heavy traffic conditions.
Other QoS features (IntServ/DiffServ for example) and congestion control mechanisms, especially those designed for multimedia data transmission in wired networks, may be combined with the herein described method.
| Number | Date | Country | Kind |
|---|---|---|---|
| 11290068.3 | Jan 2011 | EP | regional |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/EP2012/050084 | 1/4/2012 | WO | 00 | 12/19/2013 |