The present invention is directed, in general, to systems and methods for dynamically scaling digital video data based on network conditions and client performance, and more specifically to scaling video data streamed utilizing the Real-time Transport Protocol.
Streaming video over the Internet has become widespread. Many popular websites, such as YouTube, a service of Google, Inc. of Mountain View, Calif., and WatchESPN, a service of ESPN of Bristol, Conn., utilize streaming video in order to provide video and television programming to consumers who cannot access, or do not have, a traditional television.
The Transmission Control Protocol (TCP) is a protocol for transmitting a stream of bytes over IP networks. TCP provides reliable, ordered delivery between endpoints on a network. TCP is designed to ensure accurate delivery, requiring that the receiving computer acknowledge received data and that lost data be retransmitted before the data is delivered to the receiving application. This acknowledgement process, while ensuring reliable, ordered delivery, can cause delays of up to several seconds if transmission errors occur.
A network control protocol to stream data over a network utilizing TCP is the Real Time Messaging Protocol (RTMP), developed by Macromedia, which is now owned by Adobe, Inc. of San Jose, Calif. RTMP is the protocol used to stream Flash video between a Flash player and a server. RTMP is used in several websites, including YouTube. RTMP is TCP-based, allowing for persistent connections and low-latency communication. An RTMP client sends and receives data streams over the persistent connection.
The Real-time Transport Protocol (RTP) is a standardized packet format for delivering multimedia data over Internet Protocol (IP) networks. RTP commonly utilizes the User Datagram Protocol (UDP) as the transport layer; however, TCP may also be utilized for the transport layer. RTP is used in situations where stream data, such as audio or video data, must be transported end-to-end in real-time. RTP is optimized for speed of transmission rather than reliability; however, RTP provides the ability to correct for common errors in data transferred over IP networks, such as jitter and data that has arrived out of sequence. RTP also contains a sub-protocol, the Real-time Transport Control Protocol (RTCP), which is used to specify quality of service feedback and synchronization between various RTP media streams. Several RTCP message types are defined: sender report (SR), receiver report (RR), source description (SDES), end of participation (BYE), and application-specific message (APP). The definition and implementation of each type of message is described in Internet Engineering Task Force RFC 3550, the entirety of which is incorporated by reference.
One network control protocol to stream data over a network utilizing RTP is the Real Time Streaming Protocol (RTSP), used by QuickTime Streaming Server, a product of Apple, Inc. of Cupertino, Calif., and Helix Universal Server, a product of RealNetworks of Seattle, Wash. RTSP is used to establish and control media sessions between endpoints, such as between a media server and a client machine. The client machines can issue commands, such as play, pause, and stop, to enable the real-time control of playback of media files stored on the server.
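As a minimal sketch of how a receiver might distinguish the RTCP message types listed above, the following Python fragment reads the fixed four-byte header of an RTCP packet. The packet type values 200 through 204 are those assigned in RFC 3550; the names RtcpPacketType and rtcp_packet_type are illustrative only and are not taken from any embodiment.

```python
import struct
from enum import IntEnum


class RtcpPacketType(IntEnum):
    """RTCP packet type values assigned in RFC 3550."""
    SR = 200    # sender report
    RR = 201    # receiver report
    SDES = 202  # source description
    BYE = 203   # end of participation
    APP = 204   # application-specific message


def rtcp_packet_type(packet: bytes) -> RtcpPacketType:
    """Return the type of an RTCP packet from its fixed header (illustrative helper)."""
    if len(packet) < 4:
        raise ValueError("RTCP header requires at least 4 bytes")
    # First byte: version (2 bits), padding (1 bit), count or subtype (5 bits).
    first_byte, packet_type, _length = struct.unpack("!BBH", packet[:4])
    if first_byte >> 6 != 2:
        raise ValueError("unsupported RTCP version")
    # Raises ValueError for packet types outside the five listed above.
    return RtcpPacketType(packet_type)
```

A receiver could use the returned value to route RR messages to network-performance handling and APP messages to application-specific handling.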
Scalable Video Coding (SVC) is an extension of the H.264/MPEG-4 AVC video compression standard. SVC enables the encoding of a video bitstream that additionally contains one or more sub-bitstreams. The sub-bitstreams are derived from the video bitstream by dropping packets of data from the video bitstream, resulting in a sub-bitstream of lower quality and lower bandwidth than the original video bitstream. SVC supports three forms of scaling a video bitstream into sub-bitstreams: temporal scaling, spatial scaling, and quality scaling. Each of these scaling techniques can be used individually or combined depending on the specific video system.
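The derivation of a sub-bitstream by dropping packets can be sketched as a simple filter. The sketch below assumes each packet has already been tagged with the temporal, dependency (spatial), and quality layer identifiers that SVC carries in its NAL unit headers; the LayeredPacket class and extract_sub_bitstream function are hypothetical names, and a real extractor would also need to honor inter-layer dependencies and parameter sets.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LayeredPacket:
    """Hypothetical container for one packet of a scalable bitstream."""
    temporal_id: int    # temporal layer (frame rate)
    dependency_id: int  # spatial layer (resolution)
    quality_id: int     # quality layer (fidelity)
    payload: bytes


def extract_sub_bitstream(
    packets: Iterable[LayeredPacket],
    max_temporal: int,
    max_dependency: int,
    max_quality: int,
) -> List[LayeredPacket]:
    """Keep only packets at or below the requested layers; drop the rest."""
    return [
        p for p in packets
        if p.temporal_id <= max_temporal
        and p.dependency_id <= max_dependency
        and p.quality_id <= max_quality
    ]
```

Lowering any one of the three limits corresponds to temporal, spatial, or quality scaling respectively, and the limits can be combined.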
Systems and methods for dynamically scaling streaming video data based on network conditions and client performance in accordance with embodiments of the invention are disclosed. In one embodiment of the invention, a system for streaming data includes a network renderer configured to stream video data having a first maximum bitrate utilizing the Real-time Transport Protocol (RTP), and a network client configured to connect to the network renderer, wherein the network client is configured to measure network performance and video decoding performance and to send network and video decoder performance data to the network renderer utilizing the Real-time Transport Control Protocol (RTCP), and wherein the network renderer is configured to stream video data having a second maximum bitrate in response to the network and video decoding performance data received from the network client.
In another embodiment of the invention, the video is encoded utilizing Scalable Video Coding (SVC).
In an additional embodiment of the invention, the network client is configured to send video decoder performance information utilizing an RTCP APP message.
In yet another additional embodiment of the invention, the network client is configured to send network performance information utilizing an RTCP RR message.
In still another embodiment of the invention, the network renderer is configured to transmit video decoder configuration information to the network client utilizing RTCP.
Yet another embodiment of the invention includes a network client including a video decoder, where the video decoder is configured to decode video data, wherein the network client is configured to receive video data utilizing RTP, wherein the network client is configured to collect network and video decoder performance information, and wherein the network client is configured to send network and video decoder performance information using the network connection utilizing RTCP.
In still another embodiment of the invention, the video decoder is configured to decode SVC-encoded video data.
In yet still another embodiment of the invention, the network client is configured to send video decoder information utilizing an RTCP APP message.
In still another embodiment of the invention, the network client is configured to send network performance information utilizing an RTCP RR message.
In yet another additional embodiment of the invention, the network client is configured to receive video decoder information via RTCP and to update the video decoder configuration based upon the video decoder information.
Still another embodiment of the invention includes streaming video data, involving streaming video data having a first maximum bitrate from a network renderer to a network client utilizing RTP, receiving performance information regarding client performance and network performance utilizing RTCP, and streaming video data having a second maximum bitrate in response to the network and video decoding performance information received from the network client.
In another embodiment of the invention, the video data is encoded utilizing SVC.
In yet another embodiment of the invention, the video data having a second maximum bitrate is a sub-bitstream of the encoded SVC video data.
In still another embodiment of the invention, streaming video data further involves constructing an RTCP APP message containing performance information regarding client performance.
In yet another embodiment of the invention, streaming video data further involves constructing an RTCP RR message containing performance information regarding network performance.
In still yet another embodiment of the invention, streaming video data further involves sending an updated decoder configuration.
Yet another embodiment of the invention includes receiving streaming video data, involving receiving video data using a network client via RTP, decoding video data using a video decoder configured to decode video, analyzing video decoder performance, analyzing network performance, and sending network and video decoder performance information to a network renderer via RTCP.
In yet another embodiment of the invention, analyzing decoding performance comprises analyzing frame type and time to decode one frame.
In still another embodiment of the invention, analyzing video decoder performance comprises constructing an RTCP APP message.
In another further embodiment of the invention, analyzing network performance comprises constructing an RTCP RR message.
Still yet another embodiment of the invention includes a machine readable medium containing processor instructions, where execution of the instructions by a processor causes the processor to perform a process involving receiving video data via a network connection, decoding video data, analyzing video decoder performance, analyzing network performance, and sending network and video decoder performance information to a network renderer via RTCP.
In a further embodiment of the invention, the process performed by a processor executing the instructions contained on the machine readable medium further comprises constructing an RTCP APP message containing video decoder performance information.
In yet another further embodiment of the invention, the process performed by a processor executing the instructions contained on the machine readable medium further comprises constructing an RTCP RR message containing network performance information.
Turning now to the drawings, systems and methods for streaming video data via Real-time Transport Protocol (RTP) so that the bitrate of the streamed video adapts in response to measurements of network and decoder performance in accordance with embodiments of the invention are illustrated. In several embodiments of the invention, a network renderer is connected to a plurality of network clients and the network renderer is configured to provide streaming video data encoded using adaptive video formats to the network clients based upon measurements performed by the network clients concerning network and decoder performance. In a number of embodiments of the invention, Scalable Video Coding (SVC) is used to encode and decode the adaptive video format. However, any streaming system in which a network renderer can adjust the bandwidth utilized in streaming video can be utilized in accordance with embodiments of the invention.
In many embodiments of the invention, the network client is configured to send measured performance information to the network renderer. In a number of embodiments of the invention, the network client is configured to send measured performance information to the network renderer utilizing RTCP. In several embodiments of the invention, the network renderer is configured to use performance information to scale the video quality and to provide an updated decoder profile to a network client. In several embodiments of the invention, decoder profiles are provided to the network client utilizing RTCP. By utilizing standard RTP and RTCP messages, backward compatibility with legacy systems is maintained while allowing scalable video data to be used. Systems and methods for streaming video data in accordance with embodiments of the invention are discussed further below.
Video data networks in accordance with embodiments of the invention are configured to adapt the bitrate of the video transmitted to network clients based upon measurement of network and decoder performance. A video data network in accordance with an embodiment of the invention is illustrated in
Each network client 104 contains a video decoder 106. As is discussed further below, in many embodiments of the invention, the network client 104 is configured to measure the performance of the video decoder 106. In several embodiments of the invention, the video decoder 106 measures its own performance and sends performance data to the network client 104. In a number of embodiments of the invention, the network client 104 is configured to measure performance of the network connection with the network renderer 102. As is discussed further below, the network client 104 is configured to send performance data to the network renderer 102. In many embodiments of the invention, the performance data is sent utilizing RTCP.
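One way a network client might gather decoder performance data, based on the frame type and the time taken to decode each frame, is sketched below. The DecodeStats class, its method names, and the notion of "load" as average decode time divided by the frame interval are assumptions made for illustration; the embodiments do not prescribe a particular metric.

```python
from collections import defaultdict
from statistics import mean
from typing import DefaultDict, List


class DecodeStats:
    """Hypothetical accumulator of per-frame decode times, keyed by frame type."""

    def __init__(self) -> None:
        self._times: DefaultDict[str, List[float]] = defaultdict(list)

    def record(self, frame_type: str, seconds: float) -> None:
        """Store the time taken to decode one frame of the given type (e.g. 'I', 'P', 'B')."""
        self._times[frame_type].append(seconds)

    def average(self, frame_type: str) -> float:
        """Average decode time in seconds for one frame type."""
        return mean(self._times[frame_type])

    def load(self, frame_rate: float) -> float:
        """Assumed metric: mean decode time as a fraction of the frame interval."""
        all_times = [t for times in self._times.values() for t in times]
        return mean(all_times) * frame_rate
```

In use, the client would wrap each decoder call with a timer such as time.perf_counter() and pass the elapsed time to record(); a load well below 1.0 would suggest the decoder is underutilized.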
In many embodiments of the invention, network clients can include consumer electronics devices such as DVD players, Blu-ray players, televisions, set top boxes, video game consoles, tablets, and other devices that are capable of connecting to a server via RTP and playing back encoded media. The basic architecture of a network client in accordance with an embodiment of the invention is illustrated in
Although a specific architecture of a video data network is shown in
Processes for streaming video data in accordance with embodiments of the invention allow for modification of the video stream transmitted to a network client in response to measurements of network and video decoder performance. A process for streaming scalable video data in accordance with an embodiment of the invention is illustrated in
In a number of embodiments of the invention, the video decoder performance information received (214) is contained in an RTCP APP message having the following syntax: a 2-bit protocol version, a 1-bit padding flag, a 5-bit APP packet sub-type, an 8-bit packet type, a 16-bit packet length, a 32-bit SSRC, a 32-bit unique name of the APP packet, and variable-length application-dependent data. In several embodiments, the network performance information is contained in an RTCP RR message as defined in IETF RFC 3550. In many embodiments of the invention, the process 200 repeats until an RTCP BYE message is received. In several embodiments of the invention, the process 200 repeats until there is no device available to receive the data. In a number of embodiments of the invention, the process 200 repeats until all data has been sent.
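The APP message syntax described above can be sketched in Python as follows. The field widths and the packet type value 204 follow RFC 3550, which also specifies that the 16-bit length field holds the packet length in 32-bit words minus one and that the application-dependent data is a multiple of 32 bits long. The function name build_rtcp_app, the four-character name "DPRF", and the payload layout used in the usage note are hypothetical and are not defined by the embodiments.

```python
import struct

RTCP_VERSION = 2   # protocol version defined in RFC 3550
RTCP_PT_APP = 204  # packet type for application-specific messages


def build_rtcp_app(ssrc: int, name: bytes, data: bytes, subtype: int = 0) -> bytes:
    """Build an RTCP APP packet without padding (illustrative helper)."""
    if len(name) != 4:
        raise ValueError("APP name must be exactly 4 bytes")
    if not 0 <= subtype < 32:
        raise ValueError("subtype must fit in 5 bits")
    if len(data) % 4 != 0:
        raise ValueError("application data must be a multiple of 32 bits")
    # Version in the top 2 bits, padding bit left at 0, subtype in the low 5 bits.
    first_byte = (RTCP_VERSION << 6) | subtype
    # Length field: total size in 32-bit words, minus one.
    length_words = (12 + len(data)) // 4 - 1
    header = struct.pack("!BBHI4s", first_byte, RTCP_PT_APP, length_words, ssrc, name)
    return header + data
```

For example, a client could report a hypothetical frame type code and a decode time in microseconds with build_rtcp_app(0x12345678, b"DPRF", struct.pack("!II", 1, 8000), subtype=1), which yields a 20-byte packet.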
Although a specific process for streaming video data in response to measurements of network and decoder performance is shown in
In many embodiments of the invention, a network client generates performance information concerning both network performance and decoder performance and sends messages to a network renderer regarding the generated performance information. As discussed above with respect to
A process for determining and sending performance information in accordance with an embodiment of the invention is illustrated in
Although a specific process for determining and sending performance information is shown in
In several embodiments of the invention, a network renderer dynamically scales the data sent based on the performance of a network client. A process for dynamically scaling video data in accordance with an embodiment of the invention is illustrated in
A determination (416) is then made concerning whether the scaled video data quality level has changed. If the scaled video data quality level has not changed, scaled video data is sent (422). If the scaled video data quality level has changed, the video data quality is updated (418). For example, if the RTCP APP message contains information indicating that the video decoder is underutilized, the scaled video data quality level may be increased to better take advantage of the video decoder. Similarly, if the RTCP RR message contains information indicating that packet drops are high, the scaled video data quality level may be decreased in order to improve performance. In many embodiments of the invention, the video data quality corresponds to a sub-bitstream of video data encoded using SVC. In a number of embodiments of the invention, a network renderer updates (418) the video data quality. In several embodiments of the invention, a video encoder updates the video data quality. In many embodiments of the invention, an updated decoder configuration may be necessary to decode the updated video data and an updated decoder configuration is sent (420). The updated decoder configuration contains information required to decode the scaled video data, such as frame size, frame rate, the encoding used, and other relevant information. In several embodiments of the invention, the updated decoder configuration is an SVC decoder profile. In a number of embodiments of the invention, the updated decoder configuration is sent (420) utilizing RTCP. Scaled video data at the updated quality is then sent (422). In many embodiments of the invention, the updated decoder configuration is sent (422) with the scaled video data. In several embodiments of the invention, the data is sent utilizing RTP.
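The quality determination described above can be sketched as a small decision function. This is only one possible policy: the function name next_quality_level, the threshold values, the single-step adjustment, and the choice to give packet loss priority over decoder load are all assumptions for illustration and are not specified by the embodiments.

```python
def next_quality_level(
    current: int,
    num_levels: int,
    decoder_load: float,
    fraction_lost: float,
    low_load: float = 0.5,    # assumed: below this, the decoder is underutilized
    high_loss: float = 0.05,  # assumed: above this, packet drops are considered high
) -> int:
    """Return the quality level to stream next, in the range 0 .. num_levels - 1."""
    if fraction_lost > high_loss:
        # High packet loss reported (e.g. via an RTCP RR message): step down.
        return max(current - 1, 0)
    if decoder_load < low_load:
        # Decoder underutilized (e.g. per an RTCP APP message): step up.
        return min(current + 1, num_levels - 1)
    return current


def quality_changed(current: int, proposed: int) -> bool:
    """True when an updated decoder configuration may need to be sent."""
    return current != proposed
```

When quality_changed returns True, the renderer in this sketch would select the corresponding sub-bitstream and send an updated decoder configuration before or along with the scaled video data.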
Although a specific process for dynamically scaling video data is shown in
Although the present invention has been described in certain specific aspects, many additional modifications and variations would be apparent to those skilled in the art. It is therefore to be understood that the present invention may be practiced otherwise than specifically described without departing from the scope and spirit of the present invention. Thus, embodiments of the present invention should be considered in all respects as illustrative and not restrictive.