Streaming of variable bit rate multimedia content

Description

TECHNICAL FIELD

[0001] The systems and methods described herein relate to streaming variable bit rate multimedia content. More particularly, the systems and methods described herein relate to streaming variable bit rate multimedia content at a constant bit rate negotiated between a server and a client.

BACKGROUND

[0002] Traditional approaches to multimedia streaming provide content that is encoded at a constant bit rate and transmitted from a server to one or more clients over a constant bit rate channel. For example, standard telephone quality audio content is often encoded at 64K bits per second (b/s). Therefore, a telephone transmission channel must have a throughput of at least 64K b/s to properly stream the audio content.

[0003] In contrast, it is sometimes more advantageous to encode video content using a variable bit rate, the reason being that for certain types of video content—such as movies—the information that needs to be encoded varies over time. An idle segment of a movie contains less information than a segment where there is a great deal of action. It may be inefficient to encode such content at a constant bit rate. Depending on the bit rate chosen, idle segments can under-utilize an available bit rate, while action segments may not have sufficient bit rate to be encoded with the same quality as the idle segments.

[0004] Encoding a movie at a bit rate that varies as necessary to obtain the desired quality allows the movie to be encoded at a constant quality. However, variable bit rate content can pose problems for streaming media applications. One of the main problems is that the throughput of the transmission channel is often limited. If the instantaneous bit rate of the encoded content is ever higher than the throughput of the channel, the content cannot be streamed, even if the average bit rate of the content is less than the channel's throughput.

[0005] Another problem encountered is that transmission channels sometimes require that a constant bit rate be reserved for the streaming content application. In such cases, the channel will not be utilized efficiently. To stream the content, the reserved bit rate must be at least as large as the peak bit rate of the content. That means that during periods when the bit rate of the content does not fully reach the peak, a portion of the reserved bit rate is not being utilized, and that too much bandwidth was reserved in the first place.

SUMMARY

[0006] The systems and methods described herein provide for improved streaming of variable bit rate multimedia content. The described systems and methods solve the problems cited above by streaming the variable bit rate content at a constant bit rate that is negotiated between a server and a client. As a result, the content is streamed at a constant bit rate so the transmission channel is utilized in an efficient manner, solving the second problem mentioned above. If the client is able to buffer the streamed content (to a hard drive, for example) the content can be streamed at a rate that is significantly less than its peak bit rate. The content can even be streamed at a rate that is less than its average bit rate, which solves the first problem mentioned above.

[0007] The systems and methods described herein also provide techniques to reduce the size of a client buffer utilized in the streaming process. Reducing the size of the client buffer also reduces the startup delay (the buffer size divided by the streaming rate) experienced in the streaming process. These techniques solve two additional problems traditionally encountered in streaming multimedia content, namely, a large buffer size and a long startup delay.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] The same numbers are used throughout the drawings to reference like features and components.

[0009] FIG. 1 is a block diagram of an exemplary server and encoder in accordance with the systems and methods described herein.

[0010] FIG. 2 is a block diagram of an exemplary client in accordance with the systems and methods described herein.

[0011] FIG. 3 is a flow diagram of an encoder methodological implementation described herein.

[0012] FIG. 4 is a flow diagram of a server methodological implementation described herein.

[0013] FIG. 5 is a flow diagram of a client methodological implementation described herein.

[0014] FIG. 6a is a flow diagram of a client methodological implementation described herein for calculating a buffer size.

[0015] FIG. 6b is a flow diagram of a client methodological implementation described herein for calculating a buffer size.

[0016] FIG. 7 is an example of a computing operating environment capable of implementing the server side and/or the client side of the systems and methods described herein.

DETAILED DESCRIPTION

[0017] Systems and methods for improved streaming of variable bit rate multimedia content are described herein. One or more exemplary implementations of systems and methods for streaming of variable bit rate multimedia content are shown. The described systems and methods are exemplary only and are not meant to limit the scope of the appended claims. Other embodiments not specifically described herein may be implemented within the scope of the appended claims.

[0018] The implementations of systems and methods for streaming variable bit rate multimedia content are shown embodied in one or more of three components: a server, an encoder and a client. It is noted that the encoder may be embodied separately from the server, although a typical embodiment of a server including an encoder is shown.

[0019] Advanced System Format (ASF)

[0020] The exemplary implementations shown herein utilize the Advanced System Format (ASF) for storing coordinated multimedia data for its ability to deliver data over a wide variety of networks and for its additional suitability for local playback. However, it is noted that another format may be used without departing from the scope of the appended claims. Those skilled in the art will recognize any variations in the described implementations that may be necessary to adapt the implementations to utilize another file format.

[0021] An ASF file is composed of one or more media streams. A file header specifies properties of an entire file, along with stream-specific properties. Multimedia data, stored after the file header, references a particular media stream number to indicate its type and purpose. The delivery and presentation of all media stream data is aligned to a common timeline.

[0022] ASF is a multimedia presentation file format. It supports live and on-demand multimedia content. ASF files may be edited, but ASF is specifically designed for streaming and/or local playback.

[0023] ASF files are logically composed of three types of top-level objects: a header object, a data object and an index object. The header object is mandatory and must be placed at the beginning of every ASF file. The data object is also mandatory and must follow the header object. The index object (or objects) is/are optional but are useful in providing time-based random access into ASF files.

[0024] The following descriptions will focus on a file header object and file data objects and may be referred to simply as a header and one or more data packets. Although shown in one or more simplified diagrams, below, the header object and file data objects shown comport with ASF specifications © 2001-2003 MICROSOFT CORP.

[0025] Exemplary Server/Encoder System

[0026] FIG. 1 is a block diagram depicting an exemplary server 100 that includes a multimedia content encoder 102 and memory 104. The server 100 also includes a processor 106, an input/output (I/O) module 108 and a network interface 110 through which the server 100 communicates with one or more client computers over a network.

[0027] The I/O module 108 may be any one or more modules that provide data to the server or assist in the transmission of data from the server 100, including but not limited to a parallel port, a serial port, a USB (Universal Serial Bus) port, an infrared (IR) port, a user input device or the like. The network interface 110 typically comprises a network interface card but may also comprise a modem, a wireless network interface device, or the like.

[0028] The server 100 also includes a mass storage device 112 such as a hard disk drive, an optical drive or the like, that may be used to store one or more applications, buffer content, etc. Additionally, the server 100 includes other miscellaneous hardware 114 that is typically found on a server and may be required to cooperate with other elements of the server described below to provide the functionality described herein.

[0029] The encoder 102 and the memory 104 are shown as having several elements therein. It is noted that these elements may be integrated within the server 100 elements or in a separate encoder embodiment. The elements shown in the encoder 102 and the memory 104 may comprise hardware modules, software modules and/or data or a combination of hardware and software. Furthermore, the elements and functionality attributed to the server memory 104 and the encoder 102, respectively, may be distributed among the server 100, the encoder 102 and the memory 104 in any other fashion that provides similar functionality as described below.

[0030] As shown in the present example, the memory stores an operating system 120 that controls the functionality and coordination thereof for the server 100 and its elements. The memory 104 also stores miscellaneous software 122 that may include application programs, control programs or the like that may be necessary in addition to the specific modules discussed below to carry out one or more of the functions described herein.

[0031] The memory 104 also stores a streaming module 124 and a speed multiplier module 126 that are configured to stream content to one or more client computers. The memory 104 further stores a client stream selection 128, which holds the streams selected by a client for transmission to that client, and a client speed multiplier 130, which holds a speed multiplier requested by a client for streaming and an actual speed multiplier utilized by the server 100 for streaming content to the client. The client stream selection 128 and the client speed multiplier 130 will be discussed in greater detail below.

[0032] A header 132 is stored in a content buffer 118 of the memory 104 and is associated with a series of data packets (data packet 1 134(1) through data packet n 134(n)). The header 132 stores multiple buffer (B) values 136, multiple streaming rate (R) values 138, and a variable bit rate flag 140 that, when set, indicates that the data packets 134 include variable bit rate data.

[0033] Each data packet 134 includes a send time 142 that is used to identify a time at which the server 100 is to transmit each respective data packet to a client. The send time 142 is utilized to cause variable bit rate content to be streamed at a variable bit rate to the client. However, if the client requests that the content should be streamed at a non-real-time rate and the data packets 134 contain variable bit rate content, then the server 100 will ignore the send times 142 and stream the content at a constant bit rate. This function is described in greater detail below.
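The pacing decision described above can be illustrated with a minimal Python sketch. It is not the server's actual implementation: the `Packet` class, the `transmit_times` function and the numbers are hypothetical, and the constant rate is assumed to be the total average bit rate multiplied by the speed multiplier. Cases the text does not address (for example, constant bit rate content at a non-real-time speed) simply fall back to the stored send times here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    send_time: float  # seconds, as stored in the data packet
    size_bits: int    # payload size in bits


def transmit_times(packets: List[Packet], vbr_flag: bool,
                   speed: float, total_avg_bps: float) -> List[float]:
    """Return the time at which each packet would be transmitted."""
    if vbr_flag and speed != 1.0:
        # Non-real-time request for VBR content: ignore the stored send
        # times and pace the packets back to back at a constant bit rate.
        rate_bps = total_avg_bps * speed  # assumed constant rate
        times = []
        bits_sent = 0
        for p in packets:
            times.append(bits_sent / rate_bps)
            bits_sent += p.size_bits
        return times
    # Otherwise honor the send times stored in the packets.
    return [p.send_time for p in packets]


packets = [Packet(0.0, 1000), Packet(1.0, 3000), Packet(2.0, 500)]
print(transmit_times(packets, vbr_flag=True, speed=0.5, total_avg_bps=2000.0))
print(transmit_times(packets, vbr_flag=True, speed=1.0, total_avg_bps=2000.0))
```

In the first call the large second packet delays the third one, because the packets are spaced by size rather than by their stored send times.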

[0034] As shown in the present example, the encoder 102 includes multimedia content 150 that may be stored in memory or streamed in to the encoder 102 from an outside source. A compression module 152 encodes the multimedia content 150 to store in the memory 104 as the series of data packets 134. The header 132 and the data packets 134 shown stored in the memory 104 comprise a data file 135.

[0035] A variable bit rate (VBR) flag module 154 sets the VBR flag 140 in the header 132 if the multimedia content is variable bit rate content and the client hasn't requested a constant bit rate, as previously discussed.

[0036] The encoder 102 also includes a buffer size calculation module 160 that is configured to calculate how much data a client needs to buffer to stream content at a given rate as the encoder 102 compresses the multimedia content 150. The buffer calculation module 160 uses a “leaky bucket” model that takes into account the multiple buffer (B) values 136 and the multiple rate (R) values 138. Such a buffer calculation model is described in U.S. patent application Ser. No. ______ by Ribas-Corbera and Chou, co-inventors named in the present application, filed Ser. No. ______ and assigned to MICROSOFT® CORP, the assignee of the present invention(s). Said patent application is hereby incorporated by reference.

[0037] The “leaky bucket” model is an allegorical reference to variable bit rate data flowing into a figurative “bucket” that may be represented as one or more values in the content buffer 118. There is a hole in the bucket, and data flows out through the hole at a constant bit rate (R). At moments when R is less than the instantaneous bit rate, data will accumulate in the bucket. The bucket can only hold a certain amount of data, B, before it overflows. Initially, the bucket is empty, at least in the implementations described herein. It is noted, however, that one or more other implementations may start with an initial buffer fullness of a nonzero value.
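The leaky bucket model can be sketched in a few lines of Python. This is an illustrative simulation only, not the calculation of the incorporated application: the function name `min_bucket_size`, the fixed-length intervals and the sample data are assumptions. Bits are assumed to arrive at the start of each interval and to drain at the constant rate R over the interval; the smallest B that never overflows is then the peak fullness observed.

```python
from typing import Iterable


def min_bucket_size(bits_per_interval: Iterable[float], rate_bps: float,
                    interval_s: float = 1.0) -> float:
    """Smallest bucket size B that does not overflow when draining at rate R."""
    fullness = 0.0  # the bucket starts empty
    peak = 0.0
    for bits in bits_per_interval:
        fullness += bits                 # variable bit rate data flows in
        peak = max(peak, fullness)       # track the highest level reached
        # Data leaks out at the constant rate; the level cannot go negative.
        fullness = max(0.0, fullness - rate_bps * interval_s)
    return peak


# Hypothetical content: bits produced in each one-second interval.
content = [100, 400, 400, 100]
for rate in (100, 200, 400):
    print(rate, min_bucket_size(content, rate))
```

For this sample content the required bucket size shrinks as the drain rate grows, which is the trade-off the (R, B) pairs in the header capture.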

[0038] In at least one implementation described herein, the encoder 102 chooses fourteen (14) different streaming rates (R values 138) and calculates the smallest buffer size (B values 136) for each R such that the bucket does not overflow. The encoder stores the fourteen values for R and B in the header 132 (R values 138 and B values 136).

[0039] Additionally, the encoder 102 calculates a pair of R and B values for each stream of the multimedia content 150 (included in buffer values 136 and streaming rate values 138). For example, two R and B values are calculated for the audio stream and two R and B values are calculated for the video stream. One of these R values is referred to herein as the average bit rate of the stream (Ravg(n)) and the other value is referred to herein as the peak bit rate of the stream (Rpeak(n)). The bit rates correspond to an average buffer size for the stream (Bavg(n)) and a peak buffer size for the stream (Bpeak(n)). It is noted, however, that in practice these values may not exactly correspond to true average and peak bit rates. Furthermore, it is noted that it is not necessarily true that the average rates and buffer sizes are always smaller than the peak rates and buffer sizes.

[0040] It is also noted that more than two rate/buffer size pairs may be used for each stream. For example, one particular implementation may utilize four rate values and four corresponding buffer size values. Those skilled in the art will readily understand how the use of more than two rate/buffer size pairs can be integrated into the description provided herein.

[0041] The elements of the server 100 and the functionality thereof will be discussed in greater detail, below, with respect to explanation of one or more methodological implementations in which said elements are involved.

[0042] Exemplary Client System

[0043] FIG. 2 is a block diagram of an exemplary client 200 in accordance with the systems and methods for improved streaming of variable bit rate multimedia content described herein. The client 200 includes a processor 202, memory 204, a decoder 206 and a playback engine 208. The decoder 206 is configured to decode and/or decompress multimedia content encoded and streamed to the client 200 from a server (e.g. server 100, FIG. 1). The playback engine 208 is configured to render the multimedia content on the client 200.

[0044] The client 200 also includes one or more input/output (I/O) modules 210 through which data is transmitted into and out of the client 200. The I/O module(s) 210 can include but is not limited to a port (parallel, USB, serial, IR), a speaker connection, or the like. The client 200 also includes a network interface card (NIC) 212 that the client 200 uses to communicate with one or more other systems across a network, such as the Internet (not shown). Although the NIC 212 is shown, it is noted that other means of network communication may be used in place of the NIC 212, such as a modem or other device.

[0045] The client 200 also includes a display 214 on which the multimedia content (video) may be rendered, a user input module 216 that may include a keyboard, mouse, stylus, etc., and other miscellaneous hardware 218 that may be necessary to accomplish functionally peripheral tasks associated with the techniques described herein.

[0046] A mass storage device 220 is included in the client 200 and is used to store electronic data, such as application software and other miscellaneous software that may be used to implement the systems and methods described herein.

[0047] The memory 204 stores an operating system 230 and has a content buffer 232. The content buffer 232 is also depicted as being stored on the mass storage device 220 as it may be stored in any available memory of a sufficient size. Also stored in the memory 204 is a stream selection module 234 that is configured to select one or more streams from multimedia content received from a server. A throughput determination module 236 stored in the memory 204 is configured to determine throughput for a channel with another system, such as a server.

[0048] The client 200 uses the channel throughput to determine which streams it wants to receive. If the same content has been encoded in multiple bit rates, the channel throughput will influence which streams the client selects. However, it is noted that the client 200 will typically select at least one audio stream and one video stream even if the channel throughput is not enough to stream the streams in real time.

[0049] A header acquisition module 238 is configured to retrieve header data 240 from a header associated with a stream from a server or other system that the client 200 can use to more efficiently determine streaming parameters. In one implementation, only certain information from the header is obtained while in another implementation, the entire header may be acquired.

[0050] A bit rate module 242 is configured to calculate a sum of an average bit rate of each selected stream, i.e. a total average bit rate (TAR 244). A speed multiplier module 246 calculates a requested speed multiplier 248 from channel throughput and the total average bit rate 244 to determine a speed to request from a server (in at least one implementation, this is done in a “Speed” header of a “Play” command).

[0051] An actual speed multiplier 250 is stored in the memory 204 and represents a speed multiplier contained in a server's response to a “Play” request (typically in a “Speed” header). This accounts for occasions in which a server may limit the speed multiplier to a lower value than what the client 200 requested.

[0052] The bit rate module 242 is also configured to calculate a streaming bit rate, which is included in streaming rate (R) values 252 stored in the memory 204. The memory 204 also stores buffer (B) values 254 that are used (as described below) in the calculation of the size of the content buffer 232.

[0053] A buffer calculation module 260 calculates how much data the client must buffer (i.e. the size of the content buffer 232) before the client can begin to play, or render, the multimedia content. The content will arrive from a server at a constant bit rate but the playback engine 208 will consume the data at a variable bit rate. Depending on the difference between the peak and average bit rates, different amounts of content data may need to be buffered before playback can begin. If not enough data has been buffered, the playback engine 208 will run out of data and the audio and video will exhibit undesirable artifacts.
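To make the underflow condition concrete, the following Python sketch checks whether a given amount of pre-buffered data lets playback finish without running dry. It is a simplified model rather than the buffer calculation module 260 itself: the function name `playback_underflows`, the one-second granularity and the sample figures are assumptions, and data arriving during a second is assumed to be usable within that same second.

```python
from typing import Sequence


def playback_underflows(consumed_bits: Sequence[float], arrival_bps: float,
                        prebuffer_bits: float) -> bool:
    """Return True if the playback engine runs out of data at any second."""
    total = sum(consumed_bits)           # size of the whole content
    arrived = min(total, prebuffer_bits)  # data buffered before playback
    played = 0.0
    for need in consumed_bits:
        # Content keeps arriving at a constant rate until all of it is here.
        arrived = min(total, arrived + arrival_bps)
        if played + need > arrived:
            return True                  # not enough data for this second
        played += need
    return False


# Hypothetical content consumed at a variable rate, delivered at 200 b/s.
demand = [100, 400, 400, 100]
for prebuffer in (0, 200, 300):
    print(prebuffer, playback_underflows(demand, 200, prebuffer))
```

With these figures, pre-buffering 300 bits is enough, while 200 bits or less leads to an underflow during the high-rate segment.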

[0054] Further aspects of the elements and functions shown and described in FIGS. 1 and 2 will be discussed below, with respect to methodological implementations of the systems and methods shown and described herein.

[0055] Methodological Implementation—Encoder

[0056] FIG. 3 is a flow diagram 300 depicting one exemplary methodological implementation of an encoder in accordance with the systems and methods described herein. In the following discussion, continuing reference will be made to the elements and reference numerals of FIGS. 1 and 2.

[0057] At block 302, the encoder 102 acquires multimedia data from a content file or from an outside source that transmits the content to the server 100. The encoder 102 compresses (i.e. encodes) the content at block 304 and creates the data file 135 in the content buffer 118 (block 306).

[0058] In one implementation described herein, the buffer size calculation module 160 of the encoder 102 chooses fourteen (14) different streaming rate values (R) 138 (block 308) and calculates the smallest value of buffer value (i.e. size) B 136 for each R 138 so that the “bucket” in the model utilized does not overflow at block 310.

[0059] These buffer values 136 and rate values 138 are calculated across all streams in the multimedia content 150, and they include any non-data overhead from the file. Additionally, the buffer size calculation module 160 calculates a pair of R and B values 138, 136 for each stream at block 312. Each pair of R values 138 includes an average bit rate and a peak bit rate. For example, an average bit rate and a corresponding buffer value and a peak bit rate and a corresponding buffer value are calculated for an audio stream. Likewise an average bit rate and a corresponding buffer value and a peak bit rate and a corresponding buffer value are calculated for a video stream.

[0060] If the data file 135 contains variable bit rate content (“Yes” branch, block 314), then the VBR flag module 154 of the encoder 102 sets the VBR flag 140 in the header 132 at block 316. The encoder 102 then stores the rate values 138 and buffer values 136 in the header at block 318. If the data file 135 does not contain variable bit rate content (“No” branch, block 314), then the encoder 102 stores the rate values 138 and buffer values 136 in the header at block 318.

[0061] Methodological Implementation—Server

[0062] FIG. 4 is a flow diagram 400 depicting one exemplary methodological implementation of a server in accordance with the systems and methods described herein. In the following discussion, continuing reference will be made to the elements and reference numerals of FIGS. 1 and 2.

[0063] At block 402, the server 100 receives client stream selections 128 from the client 200. The client stream selections 128 are streams from the data file 135 that the client 200 has selected to receive. At block 404, the server 100 receives the client speed multiplier 130 requested by the client 200. The speed multiplier module 126 determines the actual client speed multiplier 130 that will be used in the streaming process (block 406). The client speed multiplier 130 may remain unchanged from the requested value or it may be altered to better accommodate server and client parameters.

[0064] The actual client speed multiplier 130 is then transmitted to the client 200 at block 408. After the client 200 sets up to receive the streaming data, the server 100 streams the multimedia content to the client 200 at block 410.
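One simple policy for block 406 is for the server to cap the requested multiplier at a server-side limit. The Python sketch below assumes exactly that; the function name `actual_speed_multiplier` and the `server_max` parameter are hypothetical, and a real server could weigh other server and client parameters.

```python
def actual_speed_multiplier(requested: float, server_max: float) -> float:
    """Speed multiplier the server will use, given the client's request."""
    if requested <= 0 or server_max <= 0:
        raise ValueError("speed multipliers must be positive")
    # Leave the request unchanged unless it exceeds the server's limit.
    return min(requested, server_max)


print(actual_speed_multiplier(2.0, 1.5))  # request lowered to the limit
print(actual_speed_multiplier(0.8, 1.5))  # request accepted as-is
```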

[0065] Methodological Implementation—Client

[0066] FIG. 5 is a flow diagram 500 depicting one exemplary methodological implementation of a client in accordance with the systems and methods described herein. In the following discussion, continuing reference will be made to the elements and reference numerals of FIGS. 1 and 2.

[0067] At block 502, the throughput determination module 236 of the client 200 determines the throughput of a channel established between the client 200 and the server 100. This determination may be done by any method known in the art.

[0068] At block 504, the header acquisition module 238 obtains the header data 240 from the header 132 that the client 200 will need to calculate an appropriate streaming rate and buffer size. The stream selection module 234 uses the buffer values 136 and the streaming rate values 138 retrieved from the header 132, together with the channel throughput, to select one or more streams for streaming (block 506) and notifies the server 100 of the stream selection(s). Although the buffer values 136 and the streaming rate values 138 are used in the selection process, it is noted that other considerations are also used in determining which streams to select for streaming (e.g. whether an audio stream in English is preferred over an audio stream in French).

[0069] At block 508, the bit rate module 242 derives the total average bit rate 244 of the selected streams by summing the average bit rate of each selected stream. It is noted that this bit rate is not one of the fourteen (14) bit rates selected in the “leaky bucket” method described above; rather, it is derived from the average bit rates in the rate pairs calculated for each stream.

[0070] The requested speed multiplier 248 is derived from the channel throughput and the total average bit rate 244. More specifically, in the present example, the requested speed multiplier 248 is the channel throughput divided by the total average bit rate 244. If the channel throughput is insufficient to stream the content in real time, the speed multiplier will have a value that is less than one (1). Furthermore, the speed multiplier module 246 may also be configured to reduce the speed multiplier by some predetermined amount (e.g. 15%) to account for network transmission overhead.
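The calculation of the requested speed multiplier can be written out as a short Python sketch. The function name `requested_speed_multiplier`, the `overhead` parameter and the sample figures are assumptions made for illustration; the 15% reduction is the example value given in the text.

```python
from typing import Iterable


def requested_speed_multiplier(throughput_bps: float,
                               avg_rates_bps: Iterable[float],
                               overhead: float = 0.15) -> float:
    """Channel throughput divided by the total average bit rate,
    reduced by a predetermined fraction for transmission overhead."""
    total_avg = sum(avg_rates_bps)  # total average bit rate of the selection
    if total_avg <= 0:
        raise ValueError("total average bit rate must be positive")
    return (throughput_bps / total_avg) * (1.0 - overhead)


# Hypothetical selection: a 64 kb/s audio stream and a 336 kb/s video
# stream over a 300 kb/s channel. The result is below 1, meaning the
# content cannot be streamed in real time.
print(requested_speed_multiplier(300_000, [64_000, 336_000]))
```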

[0071] After the requested speed multiplier 248 is sent to the server 100, the server 100 returns the actual speed multiplier 250 that will be used in the streaming process at block 512. The buffer calculation module 260 proceeds to calculate the appropriate buffer size that should be used in the streaming process at block 514. The process of block 514 is described in greater detail, below, with respect to FIGS. 6a and 6b.

[0072] FIGS. 6a and 6b are flow diagrams 600, 620 that, together, depict a client methodological implementation of buffer size calculation as described above. The flow diagram 600 of FIG. 6a derives an intermediate buffer size, Btot′, from the leaky bucket pairs of rates/buffer sizes selected earlier. The flow diagram 620 of FIG. 6b first derives an intermediate buffer size, Btot, from the selected streams, and then determines the better buffer size, Btot′ or Btot.

[0073] In the following discussion, the following values are used:

[0074] R(n)=average bit rate of stream n

[0075] S=speed multiplier

[0076] RS(n)=streaming bit rate of R(n) (i.e. R(n)×S)

[0077] Rtot=total average bit rate of selected streams

[0078] RStot=Rtot×S (actual transmission bit rate)

[0079] B(n)=required buffering time for stream n streamed at rate RS(n)

[0080] T=playing time of content

[0081] m=number of “leaky buckets” in file containing all streams

[0082] Rbucket(x), Bbucket(x)=bit rate and size of leaky bucket x containing all streams, x=1 to m

[0083] q=integer in the range of [1,m−1], where q is the smallest index such that RStot lies in the interval between Rbucket(q) and Rbucket(q+1).

[0084] In the following discussion, the actions discussed are performed by the buffer calculation module 260 unless specified otherwise.

[0085] The flow diagram 600 depicted in FIG. 6a utilizes the leaky bucket values (fourteen (14) pairs in the present example, but any other practicable number in one or more other implementations) to calculate a buffer size. The buffer size calculated in FIG. 6a is referred to herein as Btot′ and is, to some extent, an intermediate value that will be used in later calculations to determine a final buffer size.

[0086] At block 602, if RStot is greater than Rbucket(m) (“Yes” branch, block 602), then Btot′ equals Bbucket(m). In other words, if RStot is larger than the rate of the leaky bucket having the highest rate, then the buffer size is set to the buffer size associated with the leaky bucket having the highest rate at block 604.

[0087] Otherwise (“No” branch, block 602), if RStot is less than or equal to Rbucket(1) (“Yes” branch, block 606), then Btot′ is calculated to be Bbucket(1)+(Rbucket(1)−RStot)*T at block 608. If not (“No” branch, block 606), then Btot′ is determined by interpolation at block 610.

[0088] The calculation of Btot′ shown in block 610 is:

Btot′=Bbucket(q)+[(RStot−Rbucket(q))*(Bbucket(q+1)−Bbucket(q))]/[Rbucket(q+1)−Rbucket(q)]

[0089] At this point, the “intermediate” buffer size Btot′ has been determined. Btot′ is referred to as “intermediate” here only because it will be compared with another buffer size (Btot) derived below to determine the better buffer size to use. However, it is noted that Btot′ is a buffer size that could be used in and of itself.
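The three branches of FIG. 6a (blocks 602 through 610) can be sketched in Python as follows. This is an illustrative rendering under stated assumptions: the function name `btot_prime` is hypothetical, the leaky bucket pairs are assumed to be sorted by strictly increasing rate, and the sample values are invented.

```python
from typing import List, Tuple


def btot_prime(rs_tot: float, buckets: List[Tuple[float, float]],
               play_time: float) -> float:
    """Buffer size derived from (Rbucket, Bbucket) pairs sorted by rate."""
    r_first, b_first = buckets[0]
    r_last, b_last = buckets[-1]
    if rs_tot > r_last:
        # Blocks 602/604: faster than the highest-rate bucket.
        return b_last
    if rs_tot <= r_first:
        # Blocks 606/608: at or below the lowest-rate bucket.
        return b_first + (r_first - rs_tot) * play_time
    # Block 610: linear interpolation between buckets q and q+1.
    for (r_lo, b_lo), (r_hi, b_hi) in zip(buckets, buckets[1:]):
        if r_lo <= rs_tot <= r_hi:
            return b_lo + (rs_tot - r_lo) * (b_hi - b_lo) / (r_hi - r_lo)
    raise ValueError("bucket rates must be sorted in increasing order")


# Hypothetical buckets (rate, size) and a 4-second playing time.
buckets = [(100.0, 700.0), (200.0, 600.0), (400.0, 400.0)]
for rate in (50.0, 300.0, 500.0):
    print(rate, btot_prime(rate, buckets, play_time=4.0))
```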

[0090] The flow diagram 620 depicted in FIG. 6b utilizes the pair of rate values and corresponding buffer values associated with the streams selected to be streamed. Similar to Btot′, above, the buffer size calculated in FIG. 6b (Btot) is, to some extent, an intermediate value that is used at the end of the process of FIG. 6b to determine a final buffer size.

[0091] As shown in the flowchart 620, the buffer calculation module 260 cycles through each stream in the file and performs certain calculations using streams that have been selected for streaming. It is noted that in at least one other implementation, another technique may be used that simply determines beforehand which streams have been selected for streaming and bases calculations on those streams. The implementation depicted by FIG. 6b is exemplary only.

[0092] At block 622, the variable n (representing a stream number associated with a stream in the file) is initialized to 1 and the value Btot is initialized to zero. If n is not associated with a stream that has been selected for streaming (“No” branch, block 624), then the process skips down to block 640, where n is incremented by 1. If n represents a selected stream (“Yes” branch, block 624), then RS(n) is set to the average bit rate for stream n (Ravg(n)) multiplied by the speed multiplier (S) at block 626.

[0093] If RS(n) is greater than the peak bit rate for the stream, i.e. Rpeak(n) (“Yes” branch, block 628), then B(n) is set to Bpeak(n) at block 630. If not (“No” branch, block 628), then if RS(n) is less than or equal to the average bit rate of the stream (Ravg(n)) (“Yes” branch, block 632), then B(n) equals Bavg(n)+(Ravg(n)−RS(n))*T (block 634). Otherwise (“No” branch, block 632), B(n) is derived using linear interpolation at block 636.

[0094] Block 636 derives B(n) as:

B(n)=Bavg(n)+[(RS(n)−Ravg(n))*(Bpeak(n)−Bavg(n))]/[Rpeak(n)−Ravg(n)]

[0095] The Btot value is updated at block 638 by adding the newly derived B(n). In other words, Btot keeps a running total of the sum of all B(n) as B(n) for each selected stream is derived.

[0096] The stream number, n, is incremented at block 640. If there is another stream in the file that hasn't been processed yet (“Yes” branch, block 642), then the process returns to block 624 and repeats until all streams have been processed. After all the streams in the file have been processed (“No” branch, block 642), the minimum of Btot and Btot′ is determined at block 644, which provides the buffer size that will be used in the streaming process.

[0097] The content buffer 232 in the client 200 is set to that amount and the streaming process can begin.
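The per-stream loop of FIG. 6b and the final selection at block 644 can be sketched in Python as shown below. The function names `stream_buffer` and `final_buffer_size`, the tuple layout and the sample values are assumptions for illustration; Btot′ is taken as an input rather than recomputed.

```python
from typing import Iterable, Sequence, Tuple

# (Ravg, Bavg, Rpeak, Bpeak) for one stream; layout chosen for this sketch.
StreamInfo = Tuple[float, float, float, float]


def stream_buffer(r_avg: float, b_avg: float, r_peak: float, b_peak: float,
                  speed: float, play_time: float) -> float:
    """B(n) for one selected stream (blocks 626 through 636)."""
    rs = r_avg * speed                              # block 626
    if rs > r_peak:                                 # blocks 628/630
        return b_peak
    if rs <= r_avg:                                 # blocks 632/634
        return b_avg + (r_avg - rs) * play_time
    # Block 636: linear interpolation between the average and peak pairs.
    return b_avg + (rs - r_avg) * (b_peak - b_avg) / (r_peak - r_avg)


def final_buffer_size(streams: Sequence[StreamInfo], selected: Iterable[int],
                      speed: float, play_time: float,
                      btot_prime: float) -> float:
    """Sum B(n) over the selected streams, then take min(Btot, Btot')."""
    chosen = set(selected)
    btot = 0.0                                      # block 622
    for n, info in enumerate(streams, start=1):     # blocks 624, 640, 642
        if n in chosen:
            btot += stream_buffer(*info, speed, play_time)  # block 638
    return min(btot, btot_prime)                    # block 644


# Hypothetical file with three streams; streams 1 and 2 are selected.
streams = [(100.0, 50.0, 200.0, 20.0),
           (400.0, 300.0, 800.0, 100.0),
           (50.0, 10.0, 60.0, 5.0)]
print(final_buffer_size(streams, {1, 2}, speed=1.5, play_time=10.0,
                        btot_prime=250.0))
```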

[0098] Other Considerations

[0099] Although not specifically described above, some other considerations can be made when deriving an optimal buffer size to utilize in a streaming process. In the particular example shown above, a set of fourteen (14) leaky buckets for all streams in a file is used together with a set of two (2) leaky buckets for each stream in the file. In principle, any other subset of streams could also have its own set of leaky buckets.

[0100] For example, if there are four streams in a file, in addition to the usual sets of leaky buckets for the subsets {1,2,3,4}, {1}, {2}, {3}, {4}, there could be sets of leaky buckets for the subsets {1,2} and {3,4}, which would be of significantly greater interest if streams 1 and 2 were likely to be selected together and if streams 3 and 4 were likely to be selected together.

[0101] Then, given any subset of streams—say {1,2,3}—the minimum sufficient buffer size could be computed as the minimum of the Btot values computed from not only {1,2,3,4} and {1}+{2}+{3} but also {1,2}+{3}. The latter is likely to provide the tightest bound of the three.
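The selection among subset combinations can be sketched as follows. This is an illustrative sketch only: the function names and the buffer values are hypothetical, and each buffer size is assumed to have already been derived for its subset at the chosen streaming rate. The sketch considers every way of splitting the selected streams into disjoint subsets that have leaky buckets, plus any single subset that strictly contains the selection (such as the whole file), and keeps the smallest total.

```python
from typing import Dict, FrozenSet, Optional

Buffers = Dict[FrozenSet[int], float]


def min_partition_sum(selected: FrozenSet[int], buffers: Buffers) -> Optional[float]:
    """Smallest total over partitions of `selected` into bucketed subsets.

    Returns None when no such partition exists.
    """
    if not selected:
        return 0.0
    first = min(selected)  # every partition must cover this stream exactly once
    best: Optional[float] = None
    for subset, size in buffers.items():
        if first in subset and subset <= selected:
            rest = min_partition_sum(selected - subset, buffers)
            if rest is not None and (best is None or size + rest < best):
                best = size + rest
    return best


def tightest_buffer(selected: FrozenSet[int], buffers: Buffers) -> float:
    """Minimum of the partition-based totals and any strict-superset bound."""
    candidates = [size for subset, size in buffers.items() if subset > selected]
    part = min_partition_sum(selected, buffers)
    if part is not None:
        candidates.append(part)
    if not candidates:
        raise ValueError("no set of leaky buckets covers the selected streams")
    return min(candidates)


fs = frozenset
# Hypothetical buffer sizes (in megabits) for each subset with leaky buckets.
example: Buffers = {
    fs({1, 2, 3, 4}): 9.0,
    fs({1}): 3.0, fs({2}): 3.0, fs({3}): 2.0, fs({4}): 2.0,
    fs({1, 2}): 4.0, fs({3, 4}): 3.0,
}
best = tightest_buffer(fs({1, 2, 3}), example)
```

With these assumed values, {1,2,3,4} gives 9.0, {1}+{2}+{3} gives 8.0, and {1,2}+{3} gives 6.0, so the last combination provides the tightest bound, as the paragraph above anticipates.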

[0102] Furthermore, it is noted that any of the available buffer size computations can be omitted. For example, the buffer size calculation (Btot′) based on {1,2,3,4} does not have to be used. The buffer size computation (Btot) based on {1}+{2}+{3} may be sufficient.

[0103] Exemplary Computing Environment

[0104] The various components and functionality described herein are implemented with a computing system. FIG. 7 shows components of a typical example of such a computing system, i.e., a computer, referred to by reference numeral 700. The components shown in FIG. 7 are only examples, and are not intended to suggest any limitation as to the scope of the functionality of the invention; the invention is not necessarily dependent on the features shown in FIG. 7.

[0105] Generally, various different general purpose or special purpose computing system configurations can be used. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

[0106] The functionality of the computers is embodied in many cases by computer-executable instructions, such as program modules, that are executed by the computers. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Tasks might also be performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media.

[0107] The instructions and/or program modules are stored at different times in the various computer-readable media that are either part of the computer or that can be read by the computer. Programs are typically distributed, for example, on floppy disks, CD-ROMs, DVD, or some form of communication media such as a modulated signal. From there, they are installed or loaded into the secondary memory of a computer. At execution, they are loaded at least partially into the computer's primary electronic memory. The invention described herein includes these and other various types of computer-readable media when such media contain instructions, programs, and/or modules for implementing the steps described below in conjunction with a microprocessor or other data processors. The invention also includes the computer itself when programmed according to the methods and techniques described below.

[0108] For purposes of illustration, programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computer, and are executed by the data processor(s) of the computer.

[0109] With reference to FIG. 7, the components of computer 700 may include, but are not limited to, a processing unit 702, a system memory 704, and a system bus 706 that couples various system components including the system memory to the processing unit 702. The system bus 706 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as the Mezzanine bus.

[0110] Computer 700 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computer 700 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. “Computer storage media” includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 700. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.

[0111] The system memory 704 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 708 and random access memory (RAM) 710. A basic input/output system 712 (BIOS), containing the basic routines that help to transfer information between elements within computer 700, such as during start-up, is typically stored in ROM 708. RAM 710 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 702. By way of example, and not limitation, FIG. 7 illustrates operating system 714, application programs 716, other program modules 718, and program data 720.

[0112] The computer 700 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 7 illustrates a hard disk drive 722 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 724 that reads from or writes to a removable, nonvolatile magnetic disk 726, and an optical disk drive 728 that reads from or writes to a removable, nonvolatile optical disk 730 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 722 is typically connected to the system bus 706 through a non-removable memory interface such as data media interface 732, and magnetic disk drive 724 and optical disk drive 728 are typically connected to the system bus 706 by a removable memory interface such as interface 734.

[0113] The drives and their associated computer storage media discussed above and illustrated in FIG. 7 provide storage of computer-readable instructions, data structures, program modules, and other data for computer 700. In FIG. 7, for example, hard disk drive 722 is illustrated as storing operating system 715, application programs 717, other program modules 719, and program data 721. Note that these components can either be the same as or different from operating system 714, application programs 716, other program modules 718, and program data 720. Operating system 715, application programs 717, other program modules 719, and program data 721 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 700 through input devices such as a keyboard 736 and pointing device 738, commonly referred to as a mouse, trackball, or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 702 through an input/output (I/O) interface 740 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB). A monitor 742 or other type of display device is also connected to the system bus 706 via an interface, such as a video adapter 744. In addition to the monitor 742, computers may also include other peripheral output devices 746 (e.g., speakers) and one or more printers 748, which may be connected through the I/O interface 740.

[0114] The computer may operate in a networked environment using logical connections to one or more remote computers, such as a remote computing device 750. The remote computing device 750 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 700. The logical connections depicted in FIG. 7 include a local area network (LAN) 752 and a wide area network (WAN) 754. Although the WAN 754 shown in FIG. 7 is the Internet, the WAN 754 may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the like.

[0115] When used in a LAN networking environment, the computer 700 is connected to the LAN 752 through a network interface or adapter 756. When used in a WAN networking environment, the computer 700 typically includes a modem 758 or other means for establishing communications over the Internet 754. The modem 758, which may be internal or external, may be connected to the system bus 706 via the I/O interface 740, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 700, or portions thereof, may be stored in the remote computing device 750. By way of example, and not limitation, FIG. 7 illustrates remote application programs 760 as residing on remote computing device 750. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

[0116] Computer-Executable Instructions

[0117] An implementation of the exemplary systems and methods for improved streaming of variable bit rate multimedia content may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.

[0118] Conclusion

[0119] Although the subject matter has been described in language specific to structural features and/or methods, it is to be understood that the invention defined by the appended claims is not necessarily limited to the specific features or methods described herein. Rather, the specific features and methods are disclosed as exemplary forms of implementing the claimed systems and methods.

Claims

1. A method, comprising: encoding multimedia data for processing, said multimedia data including at least one data stream; storing the encoded multimedia data in a data file for streaming to one or more clients; selecting at least two possible bit rate values at which the data file may be streamed; calculating a client buffer size value for each possible bit rate value, each client buffer size value being a client buffer size for the possible bit rate value corresponding to the client buffer size value; for each data stream, calculating an average bit rate value, a peak bit rate value, an average buffer size value corresponding to the average bit rate value and a peak buffer size value corresponding to the peak bit rate value; storing the bit rate values and the buffer size values in the data file; and wherein the bit rate values and the buffer size values can be used to determine an optimal client buffer size for a client receiving the multimedia data at a particular constant bit rate.
2. The method as recited in claim 1, wherein the one or more data streams further comprise at least one audio stream and one video stream.
3. The method as recited in claim 1, further comprising setting a variable bit rate flag in the data file if the encoded multimedia data in the data file contains variable bit rate data.
4. The method as recited in claim 1, wherein the average bit rate value, the peak bit rate value, the average buffer size value and the peak buffer size value are approximations.
5. The method as recited in claim 1, wherein the data file is an Advanced Systems Format (ASF) file.
6. The method as recited in claim 1, wherein the calculating the bit rate values and the buffer size values further comprises using a leaky bucket method to calculate the bit rate values and the buffer size values.
7. The method as recited in claim 1, further comprising transmitting at least a portion of the data file to a client.
8. The method as recited in claim 7, wherein the transmitting at least a portion of the data file further comprises transmitting at least a portion of the data file that includes the buffer size values and the bit rate values.
9. A system, comprising: a data file; a compression module configured to compress multimedia data for streaming to one or more clients by encoding the multimedia data into a streaming format and storing the encoded multimedia data in the data file; a buffer size calculation module configured to: select one or more possible bit rates at which the data file may be streamed to the one or more clients and to calculate a client buffer size corresponding to each possible bit rate, calculate for each of the one or more streams included in the data file a first buffer size value and a second buffer size value that correspond, respectively, to a first bit rate value and a second bit rate value associated with the stream; store the bit rate values and the corresponding buffer size values in the data file; and wherein the one or more clients can utilize the bit rate values and the buffer size values to calculate an optimal client buffer size for receiving the data file at a particular bit rate.
10. The system as recited in claim 9, wherein the one or more streams further comprise at least one audio stream and at least one video stream.
11. The system as recited in claim 9, further comprising a variable bit rate module configured to set a variable bit rate flag in the data file if the data file contains variable bit rate data.
12. The system as recited in claim 9, wherein the data file is an Advanced Systems Format file.
13. The system as recited in claim 9, wherein the buffer calculation module is further configured to calculate the bit rate values and the buffer values using a leaky bucket interpolation method.
14. The system as recited in claim 9, further comprising a content buffer configured to store a portion of the data file as a previously stored portion of the data file is streamed to the one or more clients.
15. One or more computer-readable media containing computer-executable instructions that, when executed on a computer, perform the following steps: selecting one or more possible bit rate values that each denote a bit rate at which a multimedia data file can be streamed to at least a client; calculating a client buffer size for each possible bit rate value; determining a first buffer size for a first bit rate associated with each of one or more streams in the multimedia data file; determining a second buffer size for a second bit rate associated with each of one or more streams in the multimedia data file; storing the buffer values and the bit rate values in a location associated with the multimedia data file; and wherein the bit rate values and the buffer values can be used by a client to determine an optimal streaming rate at which to receive the multimedia data file, and to calculate an optimal buffer size associated with the streaming bit rate.
16. The one or more computer-readable media recited in claim 15, further comprising computer executable instructions that, when executed on a computer, perform the step of creating the multimedia data file by encoding multimedia data.
17. The one or more computer-readable media recited in claim 15, wherein the first bit rate is less than the second bit rate.
18. The one or more computer-readable media recited in claim 15, wherein the first bit rate is an approximate average bit rate and the second bit rate is an approximate peak bit rate.
19. The one or more computer-readable media recited in claim 15, wherein the multimedia data file includes at least one audio stream and at least one video stream.
20. The one or more computer-readable media recited in claim 15, further comprising computer executable instructions that, when executed on a computer, perform the step of providing an indication in the multimedia data file that the multimedia data file contains a variable bit rate data if the multimedia data file contains variable bit rate data.
21. The one or more computer-readable media recited in claim 15, wherein the multimedia data file further comprises a file in ASF format.
22. The one or more computer-readable media recited in claim 15, wherein the determination steps are accomplished using a leaky bucket method.
23. The one or more computer-readable media recited in claim 15, further comprising computer executable instructions that, when executed on a computer, perform the step of streaming client-selected streams of the multimedia data file to the client.
24. A data structure stored on one or more computer-readable media, comprising computer-readable data that represents the following: multiple data packets for streaming to at least one client, each data packet containing coded multimedia data, the multimedia data having one or more data streams; and a header that includes: multiple bit rate values that each denote a bit rate at which the data packets can be streamed to the client; and multiple buffer size values, each buffer size value corresponding to one of the multiple bit rate values and denoting a buffer size for the client to maintain when receiving the multiple data packets streamed at the corresponding bit rate.
25. The data structure as recited in claim 24, wherein: the multiple bit rate values further comprise an average bit rate value and a peak bit rate value; and the multiple buffer size values further comprise an average buffer size value that corresponds to the average bit rate value, and a peak buffer size value that corresponds to the peak bit rate value.
26. A method, comprising: determining a throughput for a communication channel with a server; obtaining multiple streaming bit rate values from the server that identify various bit rates at which a multimedia data file can be streamed; obtaining buffer size values from the server, each buffer size value corresponding to one of the streaming bit rate values and identifying a buffer size to maintain when receiving the multimedia data file at the streaming bit rate value associated with the buffer size value; selecting one or more streams available from the multimedia data file to be received from the server; negotiating with the server to determine a bit rate at which the multimedia data file can be received from the server; and calculating an optimal buffer size that corresponds to the negotiated bit rate.
27. The method as recited in claim 26, further comprising receiving the multimedia data file from the server at the negotiated bit rate.
28. The method as recited in claim 26, wherein the negotiated bit rate is a constant rate and the multimedia data file contains variable bit rate data.
29. The method as recited in claim 26, wherein the streaming bit rate values further comprise a first streaming bit rate value and a second streaming bit rate value associated with each selected stream; and the method further comprises calculating a first buffer size value and a second buffer size value wherein the first buffer size value is an optimal buffer size for a client to maintain when receiving the multimedia data file at the first streaming bit rate, and the second buffer size value is an optimal buffer size for a client to maintain when receiving the multimedia data file at the second streaming bit rate.
30. The method as recited in claim 29, wherein the first streaming bit rate value is lower than the second streaming bit rate value.
31. The method as recited in claim 29, wherein the first streaming bit rate value is an approximate average streaming bit rate and the second streaming bit rate value is an approximate peak streaming bit rate.
32. The method as recited in claim 29, wherein the calculating the buffer size values further comprises calculating the buffer size values according to a linear interpolation method.
33. One or more computer-readable media containing computer-executable instructions that, when executed by a computer, perform the following steps: negotiating a constant bit rate at which variable bit rate multimedia data is received from a server; determining an optimal content buffer size to store the multimedia data received from the server; receiving the multimedia data from the server at the negotiated constant bit rate; buffering the multimedia data received from the server; and rendering the multimedia data from the content buffer.
34. The one or more computer-readable media as recited in claim 33, wherein the step of determining an optimal content buffer size further comprises using sample buffer size values and rate values corresponding with the buffer size values obtained from the server to calculate the optimal content buffer size.
35. The one or more computer-readable media as recited in claim 33, wherein: the multimedia stream further comprises one or more streams selected from multiple streams in a data file associated with the multimedia stream; the step of determining an optimal content buffer size further comprises using an average bit rate and a peak bit rate associated with an average buffer size and a peak buffer size for each selected stream.
36. The one or more computer-readable media as recited in claim 33, further comprising computer-executable instructions that, when executed on a computer, perform the step of calculating an optimal buffer size to utilize when receiving the multimedia data at the negotiated bit rate.
37. A system, comprising: a throughput determination module configured to determine a throughput of a server channel; a file header acquisition module configured to obtain a multimedia file header from the server, the multimedia file header being associated with a multimedia file that can be received from the server; a stream selection module configured to select one or more streams from the multimedia file to be streamed from the server; and a buffer calculation module configured to calculate an optimal buffer size for receiving the multimedia file from the server at a bit rate negotiated with the server.
38. The system as recited in claim 37, wherein the buffer calculation module is further configured to calculate the optimal buffer size using multiple bit rate values and multiple buffer size values obtained from the file header.
39. The system as recited in claim 37, wherein the buffer calculation module is further configured to calculate the optimal buffer size utilizing a linear interpolation method.
40. The system as recited in claim 37, wherein the buffer calculation module is further configured to calculate the optimal buffer size from two or more bit rate values and buffer size values for all streams, and from two bit rate values and two buffer size values for each selected stream.
