The present invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, and like reference numerals designate like structural elements.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well known process operations and implementation details have not been described in detail in order to avoid unnecessarily obscuring the invention.
Streaming clients 150 access data streaming server (DSS) 120 to receive audio, video, and other multimedia content such as images, documents, and annotations in real time. Any media can be transmitted using DSS 120 provided it can be encoded in a plurality of bitrates. In one embodiment, DSS 120 may receive audio, video, and other multimedia content directly from conferencing server 110 via a local area network (LAN) connection 146. Streaming clients 150 generally will be able to receive a combined audio stream made up of audio streams from all conference participants 130. However, in one embodiment, each streaming client 150 can only view one high audio/video stream from one of the conference participants 130, or a composite video stream formed from a plurality of video streams, and generally cannot choose which conference participant 130 to view. Furthermore, streaming clients may be given an opportunity to ask questions by sending a signal along reverse path 157 to DSS 120 indicating a desire to ask a question, and then, after permission is granted, a low bit-rate video and/or audio signal can be sent from the streaming clients to DSS 120 encoding the individual's question.
To select which video feed to send to streaming clients 150, and to permit questions or feedback from streaming clients 150, a special conference participant, referred to herein as controller client 140, is provided with a control panel. Controller client 140 is connected to both conferencing server 110 and DSS 120. Using the control panel, controller client 140 can designate and communicate to conferencing server 110 which video stream from conference participants 130 to send to streaming clients 150. In addition, controller client 140 connects with DSS 120 to interact with streaming clients 150. Such interaction includes assistance with setup, e.g., confirming that their audio and video signals are being received, and selecting which audio and/or video feed from streaming clients 150 to send to conference participants 130 when a streaming client asks a question. Other interactions are also possible, such as chatting, i.e., the sending and receiving of instant text messages between participants. Additional details of conferencing system 100 are provided in related U.S. patent application Ser. No. 11/457,285, which is incorporated herein by reference.
I/O ports 164 can be connected to external devices, including user interface 174 and network interface 176. User interface 174 may include user interface devices, such as a keyboard, video screen, and a pointing device such as a mouse. Network interface 176 may include one or more network interface cards (NICs) for communicating via an external network.
DSS 120 receives the mixed audio and composite video signal from conferencing server 110. The format of the audio and video signal may vary depending on the particular implementation. For example, the audio and video signals may be provided in a compressed format or an uncompressed format. In one embodiment, the audio and video signals are transmitted to DSS 120 as high quality, high bitrate compressed signals via network connection 146. The high quality compressed audio and video signals may be retransmitted by DSS 120, as they are received, directly to one or more streaming clients 150 that are capable of receiving and processing high quality, high bitrate signals. For these recipients, the audio and video signals may pass through audio and video codecs 121, 122 to packetizers 123, 124, and to transmit circuit 125. It should be noted that codecs 121, 122, packetizers 123, 124, and transmit circuit 125 may each be implemented as hardware or software components of DSS 120. In one embodiment, the codecs, the packetizers, and the transmit and receive circuits are all implemented as hardware components, which operate at the direction of a software server application.
Some of streaming clients 150 may not be capable of receiving high quality high bitrate audio and video signals, either because they lack sufficient network bandwidth to accommodate the signals, or because they lack sufficient available processor power to decode the high quality signals in real time. By “real time” it is meant that the incoming data can be processed as fast as it is received, without increasing lag times between receipt of data representing a specific frame of video, and actual display of that frame of video. For streaming clients that are not capable of receiving the high quality high bitrate audio and video signals, audio and video codecs 121, 122 decode the audio and video signals, respectively, and re-encode them at higher compression to produce a lower quality, lower bitrate signal.
Because there may be many, e.g., 40 or more, streaming clients, each with different available bandwidth and processing power, it may not be possible to finely tailor the bitrate to optimize content quality for each streaming client. Therefore, DSS 120 may transmit audio and video data in a predetermined number of bitrates. In one embodiment, DSS 120 retransmits the high-quality high bitrate audio and video signals received from conferencing server 110 and generates two lower quality, lower bitrate signals for one or more streaming clients 150 that cannot receive the high quality high bitrate signal. For example, a second data stream may be generated by codecs 121, 122 that is half the bitrate of the high bitrate signal, and a third data stream may be generated by codecs 121, 122 that is a fourth of the bitrate of the high bitrate data stream. The reduced bitrate data streams may be generated by using a higher-compression algorithm, and/or by dropping or combining audio channels, and/or by dropping frames to reduce the refresh rate of a video. Receive circuit 128 receives communications from streaming clients 150, as will be described in further detail below, and passes these communications to controller 126, which may be a hardware or software component of DSS 120. Controller 126 identifies, for each streaming client 150, which level of bitrate it is capable of receiving.
While DSS 120 is presented herein in the context of a video conferencing system, it should be recognized that it may be implemented in other ways. For example, DSS 120 may be used to broadcast a live sporting event over the Internet in real time.
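The three-level arrangement described above can be illustrated with a minimal Python sketch. The function names, the kilobit-per-second units, and the `capacity_kbps` argument are hypothetical and are not part of the described embodiment; the sketch only assumes that each client is assigned the highest of the three predetermined bitrates that it can accommodate.

```python
def bitrate_levels(high_bitrate_kbps):
    """Return the three predetermined bitrates: full, one half, one fourth."""
    return [high_bitrate_kbps, high_bitrate_kbps / 2, high_bitrate_kbps / 4]


def select_level(levels, capacity_kbps):
    """Pick the highest predetermined bitrate not exceeding the capacity.

    Assumption: a client that cannot accommodate even the lowest bitrate
    is still assigned the lowest level rather than being disconnected.
    """
    for rate in levels:  # levels are ordered from highest to lowest
        if rate <= capacity_kbps:
            return rate
    return levels[-1]
```

For example, with a 1000 kbps source stream, a client estimated at 600 kbps would be placed on the 500 kbps stream.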
In operation 206, the DSS monitors the number of unprocessed data packets in a manner that is described in more detail below with reference to
In operation 208, if the number of unprocessed packets has not increased, then the procedure flows to operation 212, to determine if the number of unprocessed packets has reduced. If the number of unprocessed packets has reduced according to the algorithm described below with reference to
In operation 214, DSS 120 determines whether there is significant available bandwidth in the connection from the server to the streaming clients. The determination as to whether there is significant available bandwidth may be performed as described below with reference to
Operation 216 is performed when there is insufficient available bandwidth as determined in operation 214, or when the number of unprocessed packets is not reduced as determined in operation 212. In operation 216, the current data stream sent to the streaming clients is maintained. In operation 220, as will be described in further detail below, in some circumstances one or more streaming clients are randomly selected to increase the bitrate if the operation of DSS 120 is stable over a selected or predetermined period of time. The procedure then ends as indicated by done block 222.
The number of the TCP packets of the streaming multimedia data received by the receiver is obtained from the receiver, preferably as a Real-time Transport Control Protocol (RTCP) receiver report packet periodically sent by the receiver and received by receive circuit 128. The number of the TCP packets of the video data transmitted by video conferencing system 100 is obtained from video conferencing system 100. In a preferred embodiment, the RTCP reporting interval is two seconds, and the numbers of packets are counted starting with an initialization event, such as the start of the current video conferencing session.
Returning to
The procedure illustrated by flowchart 300 benefits from the stability of the value of D. Therefore, in a preferred embodiment, when a new value of D is calculated, it is compared to the previous value of D. If the new value of D falls inside an estimate window surrounding the previous value of D, then the new value of D is discarded, and the previous value of D is used. In one embodiment, the estimate window is D ± one standard deviation of DIFF. In this embodiment, the standard deviation of DIFF is computed as the median absolute deviation of the previous 50 values of DIFF, although other computation methods can be used.
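The estimate-window test can be expressed compactly. The following Python sketch is illustrative only; the function name is hypothetical, and treating a value exactly on the window boundary as inside the window is an assumption.

```python
def stabilize_d(previous_d, new_d, sdev):
    """Keep the previous D when the new D lies within previous_d +/- sdev.

    Assumption: the window boundary itself counts as inside the window.
    """
    if abs(new_d - previous_d) <= sdev:
        return previous_d  # new value discarded, previous value retained
    return new_d
```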
In operation 306, the standard deviation SDev of the packets of video data in transit is estimated. In one embodiment, the standard deviation SDev is computed as the median absolute deviation of the previous 50 values of DIFF, although other computation methods can be used. During initialization, an insufficient number of values of DIFF are available. In this case, the standard deviation SDev is computed as the average of the highest and lowest values of DIFF until 7 samples of DIFF have been received, although other computation methods can be used. Thereafter, the standard deviation SDev may be computed as described above.
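A minimal Python sketch of the estimate described for operation 306 follows. The function and constant names are hypothetical. The sketch assumes that the median absolute deviation is taken about the median of the window, which the description does not specify.

```python
from statistics import median

WINDOW = 50       # number of previous DIFF values used once enough exist
MIN_SAMPLES = 7   # below this count, the initialization estimate is used


def estimate_sdev(diff_history):
    """Estimate SDev from the most recent DIFF values."""
    recent = list(diff_history)[-WINDOW:]
    if not recent:
        raise ValueError("at least one DIFF value is required")
    if len(recent) < MIN_SAMPLES:
        # Initialization: average of the highest and lowest DIFF values.
        return (max(recent) + min(recent)) / 2
    # Median absolute deviation (assumed to be about the window median).
    center = median(recent)
    return median([abs(x - center) for x in recent])
```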
A plurality of threshold values is determined based on DIFF and D. Furthermore, a counter I is maintained for each threshold. In one embodiment, four thresholds are used, and counters I1, I2, I3, and I4 are maintained. In addition, a counter I5 may be used to count the number of receiver reports for which no video bit rate adjustments are made.
In operation 308, it is determined whether DIFF exceeds the sum of D and two times SDev. If DIFF does exceed D+2SDev, then the procedure flows to operation 310, wherein controller 126 increments counter I1, and the procedure flows to operation 312. In operation 312, it is determined whether I1=3. If so, then DIFF>D+2SDev for three consecutive RTCP receiver reports, and the procedure flows to operation 314, wherein controller 126 identifies that there is an increased number of unprocessed packets. This information is used to determine the outcome of operation 208 of flowchart 200 in
In operation 314, the procedure pauses for a period of time to allow for any changes made to the bitrate in accordance with flowchart 200 of
If, in operation 308, DIFF≦D+2SDev, then the procedure flows to operation 320 wherein counter I1 is reset to zero to ensure that counter I1 counts only consecutive RTCP receiver reports where DIFF>D+2SDev. The procedure then flows to operation 322.
In operation 322, it is determined whether DIFF exceeds the sum of the value of D and the standard deviation SDev. If DIFF>D+SDev, then the procedure flows to operation 324 wherein controller 126 increments counter I2. The procedure then flows to operation 326 to determine whether I2=5. If I2=5, then DIFF>D+SDev for five consecutive RTCP receiver reports and the procedure flows to operation 314 as described above. If I2≠5, then the procedure jumps to operation 330.
If at operation 322 DIFF≦D+SDev, the procedure flows to operation 328 wherein counter I2 is reset to zero to ensure that counter I2 counts only consecutive RTCP receiver reports where DIFF>D+SDev. The procedure then flows to operation 330.
In operation 330, it is determined whether DIFF exceeds the value of D. If DIFF>D, then the procedure flows to operation 332 wherein controller 126 increments counter I3. The procedure then flows to operation 334 wherein it is determined whether I3=9. If I3=9, then DIFF>D for nine consecutive RTCP receiver reports and the procedure flows to operation 314 as described above. If I3≠9, then the procedure jumps to operation 338.
If, at operation 330, DIFF≦D, the procedure flows to operation 336 wherein counter I3 is reset to zero to ensure that counter I3 counts only consecutive RTCP receiver reports where DIFF>D. The procedure then flows to operation 338.
In operation 338, it is determined whether DIFF is less than the value of D. If DIFF<D, then the procedure flows to operation 340 wherein controller 126 increments counter I4. The procedure then flows to operation 342 to determine whether I4=6. If I4=6, then DIFF<D for six consecutive RTCP receiver reports and the procedure flows to operation 344 to indicate a reduced number of unprocessed packets. If, in operation 342, I4≠6, then the procedure jumps to operation 348.
In operation 344, the indication that there are a reduced number of unprocessed packets is used in flowchart 200 (
If, at operation 338, DIFF≧D, then the procedure flows to operation 346 wherein counter I4 is reset to zero to ensure that counter I4 counts only consecutive RTCP receiver reports where DIFF<D. The procedure then flows to operation 348.
In operation 348, I5 is incremented. At this state, the number of unprocessed packets has neither increased nor decreased significantly. For example, DIFF may be fluctuating from less than D to greater than D, generally indicating stable operation. To ensure that the bitrate does not stabilize at an unnecessarily low value, the counter I5 is used to determine whether the number of unprocessed packets has remained stable for 16 consecutive values of DIFF (that is, for 16 RTCP receiver report packets). In operation 350, it is determined whether I5=16. If I5=16, then the procedure flows to operation 344 which is described above. Otherwise, the procedure returns to operation 302. It should be noted that other threshold values of each of the counters I1 through I5 may be used.
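The counter logic of operations 308 through 350 may be easier to follow as code. The Python sketch below is one possible rendering under stated assumptions, not a definitive implementation: the class name, method names, and result strings are hypothetical; a count that has not yet reached its limit is assumed to fall through to the next comparison; and all counters are assumed to restart after an indication is raised, which the description above does not state.

```python
from dataclasses import dataclass

INCREASED = "increased"   # corresponds to operation 314
REDUCED = "reduced"       # corresponds to operation 344
NO_CHANGE = "no_change"   # procedure returns to operation 302


@dataclass
class UnprocessedPacketMonitor:
    i1: int = 0  # consecutive reports with DIFF > D + 2*SDev
    i2: int = 0  # consecutive reports with DIFF > D + SDev
    i3: int = 0  # consecutive reports with DIFF > D
    i4: int = 0  # consecutive reports with DIFF < D
    i5: int = 0  # reports processed without any indication

    def _signal(self, result):
        # Assumption: every counter restarts once an indication is raised.
        self.i1 = self.i2 = self.i3 = self.i4 = self.i5 = 0
        return result

    def update(self, diff, d, sdev):
        """Process one RTCP receiver report and return an indication."""
        if diff > d + 2 * sdev:          # operation 308
            self.i1 += 1                 # operation 310
            if self.i1 == 3:             # operation 312
                return self._signal(INCREASED)
        else:
            self.i1 = 0                  # operation 320
        if diff > d + sdev:              # operation 322
            self.i2 += 1                 # operation 324
            if self.i2 == 5:             # operation 326
                return self._signal(INCREASED)
        else:
            self.i2 = 0                  # operation 328
        if diff > d:                     # operation 330
            self.i3 += 1                 # operation 332
            if self.i3 == 9:             # operation 334
                return self._signal(INCREASED)
        else:
            self.i3 = 0                  # operation 336
        if diff < d:                     # operation 338
            self.i4 += 1                 # operation 340
            if self.i4 == 6:             # operation 342
                return self._signal(REDUCED)
        else:
            self.i4 = 0                  # operation 346
        self.i5 += 1                     # operation 348
        if self.i5 == 16:                # operation 350
            return self._signal(REDUCED)
        return NO_CHANGE
```

In this sketch, a caller would invoke `update` once per receiver report with the current DIFF, D, and SDev values and act only on results other than `NO_CHANGE`.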
As mentioned previously, DSS server 120 (
Increasing bitrate requires ensuring not only that there is a reduced number of unprocessed packets, but also that there is sufficient bandwidth availability to accommodate the increased bitrate. Because all streaming clients 150 must be categorized in a defined number of categories, e.g., 3 categories, according to their capability to receive and process streaming data, DSS is not able to slowly increase bitrates by incremental amounts individually for each streaming client, as described in related U.S. patent application Ser. No. 11/051,674, which is directed to streaming media from conference server 110 to conference participants 130 (
Returning to
In operation 462, it is determined whether the slope of the best-fit line is negative. If the slope is negative, then the procedure flows to operation 464 in which it is determined that additional bandwidth is available. This determination is used in operation 214 of the procedure illustrated by flowchart 200 in
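The slope test of operation 462 can be sketched with an ordinary least-squares fit. The Python code below is illustrative only. It assumes that the fitted samples are equally spaced in time, so that the sample index serves as the horizontal axis; the function names are hypothetical, and the choice of which measurements to fit is left to the caller.

```python
def best_fit_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def additional_bandwidth_available(values):
    """Operation 462: a negative slope indicates additional bandwidth."""
    return best_fit_slope(values) < 0
```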
The algorithm described above with reference to
Because the measurements are only estimates and network conditions are always changing, the number of bitrate increases may be limited. In one embodiment, the minimum time between bitrate increases for a connection is equal to two minutes divided by the number of the bitrate level, counted from the highest bitrate level. Thus, if a streaming client is dropped from the highest bitrate to the second highest bitrate level, then increases are limited to one every 2 minutes/2 = 1 minute. If a streaming client is dropped from the highest bitrate to the third highest bitrate level, then increases are limited to one every 2 minutes/3 = 40 seconds.
In one embodiment, the number of streaming clients connected to the DSS whose bitrate can be increased may be limited over a selected interval of time. For example, the number of streaming clients whose bitrate can be increased may be limited to one every five seconds. This gives the network a short time to stabilize after each increase, and allows up to 12 streaming clients to increase their bitrate every minute.
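The arithmetic in the two examples above can be checked with a short Python sketch. The function name is hypothetical, and the level numbering (2 for the second highest bitrate, 3 for the third highest) is taken from the divisors used in the examples.

```python
def seconds_between_increases(level, base_seconds=120.0):
    """Interval between bitrate increases for a client at the given level.

    level: 2 for the second highest bitrate, 3 for the third highest.
    """
    if level < 2:
        raise ValueError("a client at the highest level cannot be increased")
    return base_seconds / level
```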
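One way to enforce such a limit is a simple server-wide gate, sketched below in Python. The class and method names are hypothetical, and the current time is passed in by the caller (in seconds) so that the behavior is easy to verify.

```python
class IncreaseLimiter:
    """Allow at most one client bitrate increase per min_interval seconds."""

    def __init__(self, min_interval=5.0):
        self.min_interval = min_interval
        self._last_increase = None  # time of the most recent allowed increase

    def try_increase(self, now):
        """Return True if an increase may proceed at time `now`."""
        if (self._last_increase is not None
                and now - self._last_increase < self.min_interval):
            return False
        self._last_increase = now
        return True
```

Polled once per second for one minute, this gate permits 12 increases, matching the figure given above.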
In addition, an embodiment may include a burst detection routine to reduce the impact of data bursts. For example, when the streaming data includes video data, an I-frame may be periodically transmitted when the video includes significant motion. As generally known in the art of video compression, an I-frame contains sufficient information to construct a complete video frame without relying on data previously transmitted for previous video frames. As such, I-frames are significantly larger in terms of data requirements than typical video frames. When such a burst occurs, controller 126 can reduce the bitrate in each stream by half or cause a lower bitrate data stream to be selected for each connected client, and maintain that value for 3 RTCP receiver report packets before resuming normal bitrates.
Occasionally, it is possible that the network stabilizes at a bitrate that is not optimal. The algorithm may behave as though the network is optimized when, in actuality, the bitrate can be increased. To account for this, in one embodiment, when there have not been many network adjustments for a period of time, a connection to one of the streaming clients is randomly picked and switched to the data stream having the next higher bitrate. As long as this is not performed too often, it may help each connection to reach its optimum bitrate level.
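The random selection can be sketched as follows in Python. The function name and the dictionary layout are hypothetical. The sketch assumes that level 0 denotes the highest bitrate and that only clients below the highest level are eligible to be picked.

```python
import random


def promote_random_client(levels_by_client, rng=random):
    """Move one randomly chosen client to the next higher bitrate level.

    levels_by_client maps a client identifier to its level index, where
    0 is assumed to be the highest bitrate. Returns the chosen client,
    or None when every client is already at the highest level.
    """
    candidates = sorted(c for c, lvl in levels_by_client.items() if lvl > 0)
    if not candidates:
        return None
    chosen = rng.choice(candidates)
    levels_by_client[chosen] -= 1  # one step toward the highest bitrate
    return chosen
```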
With the above embodiments in mind, it should be understood that the invention can employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared and otherwise manipulated. Further, the manipulations performed are often referred to in terms such as producing, identifying, determining, or comparing.
Any of the operations described herein that form part of the invention are useful machine operations. The invention also relates to a device or an apparatus for performing these operations. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
The invention can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data, which can thereafter be read by a computer system. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes and other optical and non-optical data storage devices. The computer readable medium can also be distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
Embodiments of the present invention can be processed on a single computer, or using multiple computers or computer components which are interconnected. A computer, as used herein, shall include a standalone computer system having its own processor(s), its own memory, and its own storage, or a distributed computing system, which provides computer resources to a networked terminal. In some distributed computing systems, users of a computer system may actually be accessing component parts that are shared among a number of users. The users can therefore access a virtual computer over a network, which will appear to the user as a single computer customized and dedicated for a single user.
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
This application is related to U.S. patent application Ser. No. 10/192,130 filed on Jul. 10, 2002 and entitled “Method and Apparatus for Controllable Conference Content via Back-Channel Video Interface;” U.S. patent application Ser. No. 10/192,080 filed on Jul. 10, 2002 and entitled “Multi-Participant Conference System with Controllable Content Delivery Using a Client Monitor Back-Channel;” U.S. patent application Ser. No. 11/051,674 filed on Feb. 4, 2005 and entitled “Adaptive Bit-Rate Adjustment of Multimedia Communications Channels Using Transport Control Protocol;” U.S. patent application Ser. No. 11/199,600 filed on Aug. 9, 2005 and entitled “Client-Server Interface to Push Messages to the Client Browser;” U.S. patent application Ser. No. 11/340,062 filed on Jan. 25, 2006 and entitled “IMX Session Control and Authentication;” and U.S. patent application Ser. No. 11/457,285 filed on Jul. 13, 2006 and entitled “Large Scale Real-Time Presentation of A Network Conference Having a Plurality Of Conference Participants;” all of which are incorporated herein by reference.