Computer networks, such as the Internet, have revolutionized the way in which people obtain information. For example, modern computer networks support the use of e-mail communications for transmitting information between people who have access to the computer network. Increasingly, systems are being developed that enable the exchange of data that has a real-time component over a network. For example, a video stream may be transmitted between communicatively connected computers, and network conditions may affect how the information is presented to the user.
Those skilled in the art and others will recognize that data is transmitted over a computer network in packets. Unfortunately, packet loss occurs when one or more packets being transmitted over the computer network fail to reach their destination. Packet loss may be caused by a number of factors, including, but not limited to, an over-utilized network, signal degradation, packets being corrupted by faulty hardware, and the like. When packet loss occurs, performance issues may become noticeable to the user. For example, in the context of a video stream, packet loss may result in “artifact” or distortions that are visible in a sequence of video frames.
The amount of artifact and other distortions in the video stream is one of the factors that have the strongest influence on overall visual quality. However, one deficiency with existing systems is an inability to objectively measure the amount of predicted artifact in a video stream. Developers could use information obtained by objectively measuring artifact to make informed decisions regarding the various tradeoffs needed to deliver quality video services. Moreover, those skilled in the art and others will recognize that when packet loss occurs, various error recovery techniques may be implemented to prevent degradation of the video stream. However, these error recovery techniques have their own trade-offs with regard to consuming network resources and affecting video quality. When modifications to the properties of a video stream are made, it would be beneficial to be able to objectively measure how these modifications will affect the quality of video services. In this regard, it would also be beneficial to objectively measure how error recovery techniques will impact the quality of a video stream to determine, among other things, whether the error recovery should be performed.
Another deficiency with existing systems is an inability to objectively measure the amount of artifact in the video stream and dynamically modify the encoding process based on the observed data. For example, during the transmission of a video stream, packet loss rates or other network conditions may change. However, with existing systems, encoders that compress frames in a video stream may not be able to identify how to modify the properties of the video stream to account for the network conditions.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Aspects of the present invention are directed at improving the quality of a video stream that is transmitted between networked computers. In accordance with one embodiment, a method is provided that dynamically modifies the properties of the video stream based on network conditions. In this regard, the method includes collecting quality of service data describing the network conditions that exist when a video stream is being transmitted. Then, the amount of predicted artifact in the video stream is calculated using the collected data. In response to identifying a triggering event, the method may modify the properties of the video stream to more accurately account for the network conditions.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
The present invention may be described in the general context of computer-executable instructions, such as program modules, being executed by computers. Generally described, program modules include routines, programs, widgets, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
Although the present invention will be described primarily in the context of systems and methods that modify the properties of a video stream based on observed network conditions, those skilled in the art and others will appreciate that the present invention is also applicable in other contexts. In any event, the following description first provides a general overview of a system in which aspects of the present invention may be implemented. Then, an exemplary routine that dynamically modifies the properties of a video stream based on observed network conditions is described. The examples provided herein are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Similarly, any steps described herein may be interchangeable with other steps or combinations of steps in order to achieve the same result. Accordingly, the embodiments of the present invention described below should be construed as illustrative in nature and not limiting.
Now with reference to FIG. 1, an exemplary system in which aspects of the present invention may be implemented will be described.
In the embodiment illustrated in FIG. 1, computers that exchange a video stream each maintain an encoder/decoder 110, network devices 112, and a control layer 116 that includes quality controllers 118 and an error recovery component 120.
Once the encoder/decoder 110 compresses the video stream by reducing redundancy of image data within a sequence of frames, the network devices 112 and associated media transport layer 113 components (not illustrated) may be used to transmit the video stream. In this regard, frames of video data may be packetized and transmitted in accordance with standards dictated by the real-time transport protocol (“RTP”). Those skilled in the art and others will recognize that RTP is one exemplary Internet standard protocol that may be used for the transport of real-time data. In any event, when the video stream is received, the encoder/decoder 110 on the receiving computer 104 causes the stream to be decoded and presented to a user on the rendering device 114. In this regard, the rendering device 114 may be any device that is capable of presenting image data including, but not limited to, a computer display (e.g., CRT or LCD screen), a television, monitor, printer, etc.
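By way of illustration only, the following Python sketch shows how an encoded frame might be split into RTP packets. The 12-byte header layout follows RFC 3550; the payload budget and the field values are assumptions chosen for the example and are not dictated by the protocol.

```python
import struct

RTP_VERSION = 2
MAX_PAYLOAD = 1200  # assumed per-packet payload budget, not part of RTP itself

def packetize_frame(frame: bytes, seq: int, timestamp: int,
                    ssrc: int, payload_type: int = 96):
    """Split one encoded video frame into RTP packets (RFC 3550 header)."""
    packets = []
    chunks = [frame[i:i + MAX_PAYLOAD] for i in range(0, len(frame), MAX_PAYLOAD)]
    for i, chunk in enumerate(chunks):
        marker = 1 if i == len(chunks) - 1 else 0   # marker bit flags the frame's last packet
        byte0 = RTP_VERSION << 6                    # version=2, no padding/extension, CC=0
        byte1 = (marker << 7) | payload_type
        header = struct.pack("!BBHII", byte0, byte1,
                             (seq + i) & 0xFFFF, timestamp, ssrc)
        packets.append(header + chunk)
    return packets
```

All packets of a frame share one timestamp, while the sequence number increments per packet; this is what allows a receiver to detect which packets, and therefore which frames, were lost.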
The control layer 116 provides quality of service support for applications with real-time properties, such as applications that support the transmission of a video stream. In this regard, the quality controllers 118 provide quality of service feedback by gathering statistics associated with a video stream including, but not limited to, packet loss rates, round trip times, and the like. By way of example only, the data gathered by the quality controllers 118 may be used by the error recovery component 120 to identify packets that will be re-transmitted when error recovery is performed. In this regard, data that adheres to the real-time transport protocol may be periodically transmitted between users that are exchanging a video stream. The components of the control layer 116 may be used to modify properties of the video stream based on collected quality of service information. Those skilled in the art and others will recognize that, while specific components and protocols have been described with reference to FIG. 1, they should be construed as exemplary and not limiting.
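A quality controller of the kind described above might accumulate loss and round-trip statistics as sketched below; the class name and the smoothing factor are illustrative and do not correspond to components depicted in the figures.

```python
class QualityController:
    """Tracks a packet loss rate and a smoothed round-trip time estimate."""

    def __init__(self, rtt_alpha: float = 0.125):  # smoothing factor, as in TCP's SRTT
        self.expected = 0          # packets expected, inferred from sequence numbers
        self.received = 0          # packets actually received
        self.highest_seq = None
        self.srtt = None
        self.rtt_alpha = rtt_alpha

    def on_packet(self, seq: int) -> None:
        if self.highest_seq is None:
            self.highest_seq = seq
            self.expected = 1
        elif seq > self.highest_seq:
            self.expected += seq - self.highest_seq   # gaps count as losses
            self.highest_seq = seq
        self.received += 1

    def on_rtt_sample(self, rtt: float) -> None:
        self.srtt = rtt if self.srtt is None else (
            (1 - self.rtt_alpha) * self.srtt + self.rtt_alpha * rtt)

    @property
    def loss_rate(self) -> float:
        return 1.0 - self.received / self.expected if self.expected else 0.0
```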
Now with reference to FIG. 2, the frame types that may be included in a video stream and the manner in which packet loss affects each frame type will be described.
The amount of data in each frame is visually depicted in FIG. 2.
Similar to the description provided above, when a packet associated with an SP-frame is lost, the error may persist to other frames. For example, as depicted in the timeline 250, when the SP-frame 206 experiences packet loss at event 256, the error persists until event 254, when the next I-frame 204 is received. Because fewer frames depend on an SP-frame than on an I-frame, the impact of packet loss on an SP-frame is correspondingly smaller. When a P-frame experiences packet loss, only the B-frames and other P-frames that reference the lost P-frame are impacted by the error. Finally, errors in B-frames do not persist, since B-frames are not referenced by other frame types.
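The persistence rules described above can be made concrete with a small simulation. The GOP layout below is hypothetical, and a lost P-frame is approximated as affecting every subsequent frame up to the next I- or SP-frame, a simplification of the dependency rule stated above.

```python
def impacted_frames(gop, lost):
    """Return indices of frames visibly affected when frame `lost` is dropped.

    Rules from the text: an I- or SP-frame error persists until the next
    I-frame; a P-frame error affects later frames up to the next I- or
    SP-frame (simplified); a B-frame error affects only itself.
    """
    kind = gop[lost]
    impacted = {lost}
    if kind == "B":
        return impacted
    stop = {"I"} if kind in ("I", "SP") else {"I", "SP"}
    for i in range(lost + 1, len(gop)):
        if gop[i] in stop:
            break
        impacted.add(i)
    return impacted

gop = ["I", "B", "P", "B", "SP", "B", "P", "B"]     # hypothetical GOP layout
print(sorted(impacted_frames(gop, 4)))              # SP lost -> [4, 5, 6, 7]
print(sorted(impacted_frames(gop, 5)))              # B lost  -> [5]
```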
As described above with reference to FIG. 2, the extent to which an error caused by packet loss persists depends on the type of frame affected. In accordance with one embodiment, Equation 1 provides a mathematical model for calculating the amount of predicted artifact in a video stream when error recovery is not performed, where:
N_B = number of B-frames in one Group of Pictures;
N_GOP = number of frames in a Group of Pictures;
N_PG = number of P-frames between consecutive I-I, I-SP, SP-SP, or SP-I frames;
N_SP = number of SP-frames in one Group of Pictures;
P_B = B-frame loss probability;
P_I = I-frame loss probability;
P_P = P-frame loss probability; and
P_SP = SP-frame loss probability.
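Equation 1 itself is not reproduced in the text above. By way of illustration only, the following sketch evaluates one plausible model assembled from these definitions, in which each frame type contributes its loss probability multiplied by the number of frames over which an error of that type persists, per the persistence rules described with reference to FIG. 2. It should not be construed as Equation 1 itself; every per-term weighting below is an assumption.

```python
def predicted_artifact(n_gop, n_sp, n_b, n_pg, p_i, p_sp, p_p, p_b):
    """Expected number of artifact-bearing frames per GOP (assumed model)."""
    n_p = n_gop - 1 - n_sp - n_b        # assumes one I-frame per GOP; the rest are P-frames
    segment = n_gop / (n_sp + 1)        # average frames between consecutive I/SP anchors
    art_i = p_i * n_gop                 # an I-frame error persists across the whole GOP
    art_sp = p_sp * n_sp * segment      # an SP error persists toward the next I-frame (assumed)
    art_p = p_p * n_p * (n_pg + 1) / 2  # a P error persists within its segment, on average (assumed)
    art_b = p_b * n_b                   # a B error affects only that frame
    return art_i + art_sp + art_p + art_b
```

For example, with N_GOP = 30, N_SP = 2, N_B = 15, N_PG = 4, and a uniform loss probability of 0.01, this model predicts 0.95 artifact-bearing frames per GOP.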
Similar to Equation 1, Equation 2 contains a mathematical model that may be used to calculate the predicted artifact. However, the mathematical model depicted in Equation 2 applies when error recovery is being performed. For example, error recovery may be performed when computers that are transmitting a video stream are configured to re-send packets of a video frame that are corrupted in transit. In this regard, Equation 1 provides a formula for calculating the predicted artifact in a principal video stream that is initially transmitted between computers when the video stream consists of the four frame types described above with reference to FIG. 2. Equation 2, in contrast, accounts for the re-transmission of lost packets, where:
P_I = I-frame loss probability;
P_SP = SP-frame loss probability;
P_P = P-frame loss probability;
P_B = B-frame loss probability; and
RTT = round trip time.
Those skilled in the art and others will recognize that the mathematical models provided above with regard to Equations 1 and 2 should be construed as exemplary and not limiting. For example, these mathematical models assume that a video stream consists of I-frames, P-frames, SP-frames, and B-frames. However, as mentioned previously, a video stream may consist of fewer or additional frame types and/or a different set of frame types than those described above. In these instances, variations on the mathematical models provided above may be used to calculate the predicted artifact in a video stream. Moreover, Equations 1 and 2 are described in the context of calculating the amount of predicted artifact. The “artifact percentage” for a video stream may be calculated using the mathematical models described above by dividing the predicted artifact by the number of frames in a Group of Pictures (“GOP”).
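Continuing the hypothetical model above, the artifact percentage follows directly, and the error recovery contemplated by Equation 2 can be approximated by shrinking each loss probability to the chance that both the original packet and its single re-transmission are lost. The re-transmission model is an assumption; the RTT term in Equation 2 presumably bounds whether a re-sent packet can arrive before its frame must be rendered.

```python
def artifact_percentage(predicted, n_gop):
    """Artifact percentage: predicted artifact divided by the frames per GOP."""
    return 100.0 * predicted / n_gop

def loss_with_one_resend(p):
    """Assumed Equation 2-style adjustment: a packet remains lost only if
    the original transmission and its single re-send are both lost."""
    return p * p
```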
With reference now to FIGS. 3-4, exemplary distributions that depict the artifact percentage calculated under different network conditions will be described.
As FIG. 3 illustrates, the distributions 302-304 represent the artifact percentage calculated at different packet loss rates.
In accordance with one embodiment, ranges of predicted artifact associated with the distributions 302-304 may be used to set the properties of a video stream. For example, when error recovery is being performed and the artifact percentage represented in the distribution 304 is identified as being less than ten (10) percent, a video stream may be transmitted in accordance with a first set of properties. The properties that may be modified based on the identified range of artifact percentage include, but are not limited to, the distribution of frame types (e.g., the percentage and frequency of I-frames, SP-frames, P-frames, B-frames), the frame rate, the size of frames and packets, the application of redundancy in channel coding including the extent to which forward error correction (“FEC”) is applied for each frame type, etc. In this regard, by objectively measuring the predicted artifact in a video stream, more informed decisions may be made regarding how the video stream should be transmitted. For example, as the amount of predicted artifact increases, the properties of the video stream may be modified to include a higher percentage of B-frames, thereby improving video quality at higher packet loss rates. Moreover, if the artifact percentage represented in the distribution 304 is identified as corresponding to a different range, the video stream may be transmitted in accordance with another set of video properties.
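By way of illustration, the range-based selection described above may be expressed as a lookup table. Only the ten (10) percent boundary appears in the text; the remaining thresholds and the property values below are hypothetical.

```python
# (upper bound on artifact %, property set), ordered from best to worst
# conditions. All values other than the 10% boundary are hypothetical.
PROPERTY_PROFILES = [
    (10.0,  {"b_frame_pct": 20, "frame_rate": 30, "fec": False}),
    (25.0,  {"b_frame_pct": 40, "frame_rate": 24, "fec": True}),
    (100.0, {"b_frame_pct": 60, "frame_rate": 15, "fec": True}),
]

def select_properties(artifact_pct):
    """Pick the first profile whose upper bound covers the artifact level."""
    for upper_bound, props in PROPERTY_PROFILES:
        if artifact_pct < upper_bound:
            return props
    return PROPERTY_PROFILES[-1][1]
```

Note that the profiles raise the percentage of B-frames as the predicted artifact grows, mirroring the example given above.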
In accordance with one embodiment, ranges of predicted artifact obtained using the distributions 402-408 may be established to set properties of a video stream. For example, in some instances, a content provider guarantees a certain quality of service for a video stream. Based on information represented in the distributions 402-408, the predicted artifact percentage at different frame rates, packet loss rates, and other network properties may be identified. By identifying the predicted artifact percentage, the frame rate may be adjusted so that the quality of service guarantee is satisfied. In this regard, the frame rate may be reduced in order to produce a corresponding reduction in artifact.
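Similarly, the frame-rate adjustment described above may be sketched as a search over candidate rates. The `predict_artifact_pct` callback is an assumption standing in for a model derived from distributions such as 402-408.

```python
def meet_guarantee(predict_artifact_pct, candidate_rates, max_artifact_pct):
    """Return the highest frame rate whose predicted artifact satisfies the
    quality of service guarantee, falling back to the lowest rate.

    `predict_artifact_pct(rate)` is an assumed callback mapping a frame
    rate to a predicted artifact percentage under current conditions.
    """
    for rate in sorted(candidate_rates, reverse=True):
        if predict_artifact_pct(rate) <= max_artifact_pct:
            return rate
    return min(candidate_rates)
```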
The examples provided with regard to FIGS. 3-4 should be construed as exemplary and not limiting, as the predicted artifact may be used to inform the selection of other video stream properties without departing from the scope of the claimed subject matter.
Increasingly, a video stream is transmitted over multiple network links. For example, a multi-point control unit is a device that supports a video conference between multiple users. In this regard, FIG. 7 illustrates a networking environment 700 in which a multi-point control unit 701 supports a video conference between a sending device 702 and receiving devices 704-708.
Now with reference to FIG. 8, exemplary components of the sending device 702 and the multi-point control unit 701 that may be used to implement aspects of the present invention will be described.
In this exemplary embodiment, a video stream encoded by the encoder/decoder 802 on the sending device 702 is transmitted to the switcher 810. When received, the switcher 810 routes the encoded video stream to each of the rate matchers 812. For each device that will receive the video stream, one of the rate matchers 812 applies algorithms to the encoded video stream that allow the same content to be reproduced on devices that communicate data at different bandwidths. Once the rate matchers 812 have applied the rate-matching algorithms, the video stream is transmitted to the receiving devices 704-708, where it may be decoded for display to the user.
Unfortunately, existing systems may set the properties of the video stream to the lowest common denominator to accommodate the device that maintains the worst connection in the networking environment 700. Moreover, transmission of a video stream using the multi-point control unit 701 may not scale to large numbers of endpoints. For example, when the sending device 702 transmits a video stream to the multi-point control unit 701, the data may be forwarded to each of the receiving devices 704-708 over the downstream network connections 712-716, respectively. When packet loss occurs on the downstream network connections 712-716, requests to re-send lost packets may be transmitted back to the sending device 702, if error recovery is being performed. However, since the sending device 702 is supporting error recovery for all of the receiving devices 704-708, the sending device 702 may be overwhelmed with requests. More generally, as the number of endpoints participating in the video conference increases, the negative consequences of performing error recovery also increase. Thus, objectively measuring video quality and setting the properties of a video stream to account for network conditions is particularly applicable in the context of a multi-point control unit that manages a video conference. However, while aspects of the present invention may be described as being implemented in the context of a multi-point control unit, those skilled in the art and others will recognize that aspects of the invention will apply in other contexts.
The channel quality controllers 814 on the multi-point control unit 701 communicate with the channel quality controllers 806 on the sending device 702 and the receiving devices 704-708. In this regard, the channel quality controllers 814 monitor bandwidth, RTT, and packet loss on each of their respective communication channels. The video conference controller 816 may obtain data from each of the channel quality controllers 806 and set the properties of one or more video streams. In this regard, the video conference controller 816 may communicate with the rate matchers 812 and the local quality controllers 808 to set the properties for encoding the video stream on the sending device 702. These properties may include, but are not limited to, frame and data transmission rates, GOP values, the distribution of frame types, error recovery, redundancy in channel coding, frame and/or packet size, and the like.
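A minimal sketch of this monitoring relationship follows, with one record per communication channel; the class and field names are illustrative rather than components depicted in FIG. 8, and each controller is assumed to expose a `channel_id` and a `snapshot()` method.

```python
from dataclasses import dataclass

@dataclass
class ChannelStats:
    """Per-channel measurements reported by a channel quality controller."""
    bandwidth_kbps: float
    rtt_ms: float
    loss_rate: float

def collect_channel_stats(controllers):
    """Gather the latest statistics from every channel quality controller."""
    return {c.channel_id: c.snapshot() for c in controllers}
```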
Aspects of the present invention may be implemented in the video conference controller 816 to tune the properties with which video data is transmitted between sending and receiving devices. In accordance with one embodiment, the properties of a video stream are modified dynamically based on observed network conditions. For example, the video conference controller 816 may obtain data from each of the respective channel quality controllers 806 that describes the observed network conditions. Then, calculations may be performed to determine whether a reduction of artifact in the video stream may be achieved. For example, using the information described with reference to FIGS. 3-4 and the mathematical models provided above, a set of video properties that reduces the predicted artifact given the observed network conditions may be identified.
In accordance with one embodiment, the video conference controller 816 communicates with the rate matchers 812 for the purpose of dynamically modifying the properties of the video stream that is transmitted from the sending device 702. To this end, data that describes the network conditions on the downstream network connections 712-716 is aggregated on the multi-point control unit 701. Then, an optimized set of video properties with which to encode the video stream on the sending device 702 is identified. For example, using a mathematical model described above, a set of optimized video properties that accounts for the network conditions observed on the downstream network connections is identified. Then, aspects of the present invention cause the video stream to be encoded on the sending device 702 in accordance with the optimized set of video properties for transmission on the network connection 710. In this regard, the video conference controller 816 may communicate with the rate matchers 812 and the local quality controllers 808 to set the properties for encoding the video stream on the sending device 702.
In accordance with another embodiment, the video conference controller 816 communicates with the rate matchers 812 for the purpose of dynamically modifying the properties of one or more video streams that are transmitted from the multi-point control unit 701. In this regard, data that describes the network conditions on at least one downstream network connection is obtained. For example, using a mathematical model described above, a set of optimized video properties that accounts for the network conditions observed on a downstream network connection is identified. Then, aspects of the present invention cause the video stream to be transcoded on the multi-point control unit 701 in accordance with the optimized set of video properties for transmission on the appropriate downstream network connection. To this end, the video conference controller 816 may communicate with the rate matchers 812 to set the properties for transcoding video streams on the multi-point control unit 701.
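The two embodiments described above differ chiefly in where the optimization is applied: once at the sending device 702 using aggregated downstream conditions, or per link when transcoding on the multi-point control unit 701. The sketch below illustrates both, assuming an `optimize` callback that maps channel statistics to a property set; aggregating by the worst-performing link is one possible policy, not a requirement of the embodiments.

```python
def sender_encode_properties(stats_by_link, optimize):
    """One upstream property set, driven here (as an assumption) by the
    worst-performing downstream link."""
    worst = max(stats_by_link.values(), key=lambda s: s.loss_rate)
    return optimize(worst)

def transcode_properties(stats_by_link, optimize):
    """Per-link property sets for transcoding on the multi-point control
    unit, so each downstream connection gets its own optimized stream."""
    return {link: optimize(stats) for link, stats in stats_by_link.items()}
```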
In yet another embodiment, aspects of the present invention aggregate data obtained from the sending and receiving devices 702-708 to improve video quality. For example, those skilled in the art and others will recognize that redundancy in channel coding may be implemented when transmitting a video stream. On one hand, redundancy in channel coding adds robustness to the transmission of a video stream by allowing techniques such as forward error correction to be performed. On the other hand, redundancy in channel coding has drawbacks that may negatively impact video quality, as additional network resources are consumed to transmit data redundantly. By way of example only, aspects of the present invention may aggregate information obtained from the sending and receiving devices 702-708 to determine whether and how the sending device 702 will implement redundancy in channel coding. For example, packet loss rates observed in transmitting data to the receiving devices 704-708 may be aggregated on the multi-point control unit 701. Then, calculations are performed to determine whether redundancy in channel coding will be implemented given the tradeoff of redundantly transmitting data in a video stream. In this example, aspects of the present invention may be used to determine whether redundancy in channel coding will result in improved video quality given the observed network conditions and configuration of the network.
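The redundancy trade-off may be framed as a comparison of the predicted artifact with and without forward error correction, net of the bandwidth the redundancy consumes. The sketch below is an illustrative framing only; the `artifact_at` callback and the single-recovery loss model are assumptions.

```python
def should_enable_fec(p_loss, fec_overhead, bandwidth_headroom, artifact_at):
    """Enable FEC only when the redundancy fits in the available bandwidth
    and actually lowers the predicted artifact.

    `artifact_at(loss_rate)` is an assumed callback mapping a residual loss
    rate to predicted artifact; a single-recovery FEC scheme is modeled
    (roughly) as requiring two losses before a packet is unrecoverable.
    """
    if fec_overhead > bandwidth_headroom:
        return False                  # redundancy would starve the stream itself
    residual_loss = p_loss * p_loss   # assumed recovery model
    return artifact_at(residual_loss) < artifact_at(p_loss)
```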
With reference now to FIG. 9, an exemplary routine 900 that dynamically modifies the properties of a video stream based on observed network conditions will be described.
At block 902, the transmission of video data is initiated using default properties. As mentioned previously, aspects of the present invention may be implemented in different types of networks, including wide and local area networks that utilize protocols developed for the Internet, wireless networks (e.g., cellular networks, IEEE 802.11, Bluetooth networks), and the like. Moreover, a video stream may be transmitted between devices and networks that maintain different configurations. For example, as mentioned previously, a sending device may merely transmit a video stream over a peer-to-peer network connection. Alternatively, in the example described above with reference to FIGS. 7-8, a multi-point control unit 701 may manage the transmission of a video stream to multiple receiving devices. In each instance, default properties appropriate for the network configuration may be selected when the transmission is initiated.
Those skilled in the art and others will recognize that the capabilities of a network affect how a video stream may be transmitted. For example, in a wireless network, the rate at which data may be transmitted is typically less than the rate in a wired network. Aspects of the present invention may be applied in an off-line context to establish default properties for transmitting a video stream given the capabilities of the network. In this regard, an optimized set of properties that minimizes artifact in the video stream may be identified for each type of network and/or configuration that may be encountered. For example, the distributions depicted in FIGS. 3-4 may be used to identify the set of default properties that minimizes the predicted artifact for a given type of network.
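The result of such off-line analysis might be stored as a table of default properties keyed by network type, as in the hypothetical sketch below; the entries are illustrative only.

```python
# Hypothetical defaults identified off-line for each network type.
DEFAULT_PROPERTIES = {
    "wired_lan":       {"frame_rate": 30, "gop_size": 60, "fec": False},
    "wireless_80211":  {"frame_rate": 24, "gop_size": 30, "fec": True},
    "cellular":        {"frame_rate": 15, "gop_size": 30, "fec": True},
}

def default_properties(network_type):
    # Fall back to the most conservative profile for unknown networks.
    return DEFAULT_PROPERTIES.get(network_type, DEFAULT_PROPERTIES["cellular"])
```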
Once the transmission of the video stream is initiated, the network conditions are observed and statistics that describe the network conditions are collected, at block 904. As mentioned previously, quality controllers on devices involved in the transmission of a video stream may provide quality of service feedback in the form of a set of statistics. These statistics may include packet loss rates, round trip times, available and consumed bandwidth, or any other data that describes a network variable. In accordance with one embodiment, data transmitted in accordance with the Real-time Transport Control Protocol (“RTCP”) is utilized to gather statistics that describe the network conditions. However, the control data may be obtained using other protocols without departing from the scope of the claimed subject matter.
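For concreteness, the following sketch reads the loss and jitter statistics carried in an RTCP receiver report, as laid out in RFC 3550; it is simplified to a single report block, whereas real reports may carry several.

```python
import struct

def parse_rtcp_receiver_report(data: bytes) -> dict:
    """Extract loss and jitter statistics from an RTCP RR (one report block)."""
    if data[1] != 201:                 # packet type 201 = receiver report (RFC 3550)
        raise ValueError("not an RTCP receiver report")
    # The report block starts after the 8-byte header: source SSRC (4 bytes),
    # fraction lost (1), cumulative packets lost (3, signed), extended highest
    # sequence number (4), interarrival jitter (4), LSR (4), DLSR (4).
    source_ssrc, = struct.unpack_from("!I", data, 8)
    fraction_lost = data[12]
    cumulative_lost = int.from_bytes(data[13:16], "big", signed=True)
    jitter, = struct.unpack_from("!I", data, 20)
    return {
        "source_ssrc": source_ssrc,
        "loss_rate": fraction_lost / 256.0,   # 8-bit fixed-point fraction
        "cumulative_lost": cumulative_lost,
        "jitter": jitter,
    }
```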
As illustrated in FIG. 9, at block 906, the amount of predicted artifact in the video stream is calculated using the statistics collected at block 904. For example, the mathematical models described above with reference to Equations 1 and 2 may be used to calculate the predicted artifact given the observed network conditions.
As illustrated in FIG. 9, at block 908, a determination is made regarding whether a triggering event is identified. In one embodiment, a triggering event is identified when the observed network conditions change such that modifying the properties of the video stream would reduce the predicted artifact. If a triggering event is not identified, the routine 900 proceeds back to block 904. Conversely, if a triggering event is identified, the routine 900 proceeds to block 910.
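By way of example only, the determination at block 908 might be implemented with a dead band so that small fluctuations in the predicted artifact do not trigger repeated re-configuration; the five (5) percent band below is an assumption.

```python
class TriggerDetector:
    """Flags a triggering event when the predicted artifact drifts far
    enough from the level the current properties were chosen for."""

    def __init__(self, hysteresis_pct: float = 5.0):  # assumed dead band
        self.baseline_pct = None
        self.hysteresis_pct = hysteresis_pct

    def check(self, artifact_pct: float) -> bool:
        if self.baseline_pct is None:
            self.baseline_pct = artifact_pct
            return False
        if abs(artifact_pct - self.baseline_pct) >= self.hysteresis_pct:
            self.baseline_pct = artifact_pct   # re-baseline after triggering
            return True
        return False
```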
At block 910, the properties of the video stream are modified to account for the observed network conditions. Similar to the off-line context described above (at block 902), the distributions depicted in FIGS. 3-4 and the mathematical models described above may be used to identify a set of properties that minimizes the predicted artifact given the observed network conditions. Then, the routine 900 proceeds back to block 904, and the network conditions continue to be observed while the video stream is transmitted in accordance with the modified properties.
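Putting the blocks of the routine 900 together, the adaptation loop might resemble the following sketch; the `stream`, `controller`, `detector`, and `optimize` objects are the hypothetical helpers assumed in the earlier sketches, and the polling interval is an assumption.

```python
import time

def run_adaptation_loop(stream, controller, detector, optimize,
                        interval_s: float = 1.0):
    """Blocks 904-910 as a loop: collect statistics, estimate the predicted
    artifact, and modify the stream's properties on a triggering event."""
    while stream.is_active():
        stats = controller.snapshot()                   # block 904
        artifact_pct = stream.predict_artifact(stats)   # block 906
        if detector.check(artifact_pct):                # block 908
            stream.apply_properties(optimize(stats))    # block 910
        time.sleep(interval_s)
```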
While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.