© 2006-2007 Elemental Technologies, Inc. A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. 37 CFR §1.71(d).
This invention pertains to methods and apparatus for real-time digital video transmission over a network, and more specifically to adapting to variations in the network bandwidth while optimizing image quality.
For in-home consumer video applications (TV), quality of service is critical: image breakup, stalling, and other artifacts are not acceptable to the market or customer. This presents a serious problem for network-based distribution of video in the home, because the available bandwidth in these networks tends to be unstable and unpredictable. A drop in available bandwidth can cause client devices to fall behind and eventually run out of data for display, resulting in the aforementioned issues. Wireless and home-plug based networks are particularly sensitive to this problem because of interference from other devices inside (and outside) the home. However, even where Ethernet or fiber is available (very rare), changes in network demand driven by user activity can cause the same issues. For instance, transferring large files or printing while simultaneously streaming video often creates a network bottleneck from which the video cannot recover without visible artifacts.
Most current approaches use a constant (or near-constant) bit rate for network video transmission, which requires the smallest available bandwidth to be known before the video stream is initiated. Because bit rate (bandwidth) and image quality are highly correlated, with higher bit rates yielding better quality, selecting the lowest possible bandwidth results in the lowest image quality. Even where variable bit rate technology is used, with today's technology it must maintain a long-term average rate that matches worst-case network conditions.
The need remains therefore for a way to dynamically adapt to the available network bandwidth so as to optimize image quality.
Our solution, in a preferred embodiment, dynamically adapts to the available bandwidth by down-scaling the video image (frames/fields) prior to encoding and transmission. As less bandwidth becomes available, the video is down-scaled by increasing amounts; conversely, when more bandwidth is available, the scaling factor is reduced. Normally, changing the input resolution to an encoder would cause a loss of correlation with the already stored reference frames; however, because we also scale the reference frames prior to intra or inter prediction, correlation is maintained. Since the lower-resolution frames require fewer blocks, the overall data rate per frame is reduced.
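The bandwidth-to-scale-factor mapping described above can be sketched as follows. The function name, the candidate scale steps, and the assumption that bit rate grows roughly with the number of blocks (i.e., with the square of the linear scale) are illustrative choices, not details taken from the patent:

```python
def pick_scale_factor(available_kbps, full_res_kbps,
                      steps=(1.0, 0.75, 0.5, 0.375, 0.25)):
    """Return the largest linear scale factor whose estimated bit rate
    fits the currently available bandwidth.

    Assumes bit rate scales roughly with block count, i.e. with the
    square of the linear scale factor (an illustrative model only).
    """
    for s in steps:
        if full_res_kbps * s * s <= available_kbps:
            return s
    # Even the smallest step does not fit; use it anyway as the floor.
    return steps[-1]
```

As bandwidth shrinks, the function walks down the step table, mirroring the text's "down-scaled by increasing amounts"; as bandwidth recovers, larger factors fit again and the scaling is reduced.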
Additionally, high-quality polyphase-based image scaling preferably is used, which low-pass filters the image, thereby reducing aliasing and high-frequency content and yielding better compression at the expense of image sharpness. The scale factor for each image is encoded in the bit stream on a frame-by-frame basis, allowing the decoder to reconstruct the image and maintain a consistent output resolution.
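The anti-aliasing benefit of filtering before decimation can be illustrated in one dimension. A real polyphase scaler would use per-phase windowed-sinc filter banks; the box filter below is a deliberate simplification just to show why the low-pass step matters:

```python
import numpy as np

def lowpass_then_decimate(row, factor):
    """Downscale a 1-D row of samples by an integer factor.

    A box filter of width `factor` stands in for the anti-alias
    low-pass; a production polyphase scaler would use windowed-sinc
    filter banks selected per output phase instead.
    """
    kernel = np.ones(factor) / factor
    filtered = np.convolve(row, kernel, mode="same")
    return filtered[::factor]
```

With an alternating +1/-1 input (pure high frequency), naive decimation `row[::2]` aliases it to a constant full-amplitude signal, whereas filtering first attenuates it toward zero, which is exactly the "reducing aliasing and high frequency content" trade-off the text describes.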
It is to be understood that both the foregoing general description and the following more detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate an embodiment of the invention and, together with the general description, serve to explain rather than limit the principles of the invention.
Alternatively, the scaling factor can be derived from a look-up table indexed by P, which allows for non-linear changes in the image scale factor. Since the back-channel communication passes through the host processor of each device (i.e., encoder and decoder), even more complex software-based decision algorithms are possible which take into account the latency in the network. In fact, the calculation of P by the encoder's host processor must take into account the current data rate, the decoder's current buffer depth, the network latency, and the response time. The current data rate (from the decoder's point of view) and the decoder buffer depth are values sent from the decoder to the encoder in the back-channel IP message. The decoder's view of the data rate is required because buffering in the network infrastructure may mask the true data rate from the encoder. Additionally, the network latency can be determined by the encoder (for the back channel) and the decoder (for the video stream) by comparing timestamps of when data is sent versus received.
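A minimal sketch of the look-up-table scheme follows. The table contents, the thresholds, and the way the decoder-reported rate and buffer depth combine into P are all assumptions for illustration; the text specifies only which inputs the calculation must consider:

```python
# Non-linear scale steps, indexed by P (illustrative values).
SCALE_LUT = [1.0, 0.75, 0.5, 0.375, 0.25]

def compute_p(decoder_rate_kbps, target_rate_kbps, buffer_depth_ms,
              latency_ms=50, min_buffer_ms=500):
    """Derive LUT index P from decoder feedback sent on the back channel.

    Inputs mirror the text: the data rate as seen by the decoder, the
    decoder's buffer depth, and the measured network latency. The
    decision rules and thresholds here are hypothetical.
    """
    p = 0
    # Rate deficit: how far below target is the rate the decoder sees?
    if decoder_rate_kbps < target_rate_kbps:
        deficit = 1.0 - decoder_rate_kbps / target_rate_kbps
        p = min(int(deficit * len(SCALE_LUT)), len(SCALE_LUT) - 1)
    # A buffer draining below its floor (after latency) forces one more
    # step of downscaling so the decoder does not run out of data.
    if buffer_depth_ms - latency_ms < min_buffer_ms:
        p = min(p + 1, len(SCALE_LUT) - 1)
    return p
```

Because P merely indexes the table, the steps between scale factors need not be uniform, which is the non-linearity the paragraph highlights.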
As mentioned earlier, in addition to scaling the input frame (image) to the encoder, scaling of reference frames also occurs. This is required because, when the scale factor is changed, the reference frames used by the encoder are still in the original (or previous) format (i.e., size). They must therefore be scaled to match the input resolution prior to searching for motion (inter prediction). Once the encoder has switched to the new downscaled resolution, its reference frame buffer fills with the scaled frame data; this is done so that the encoder's reference data matches the decoder's data, thereby allowing the decoder to accurately reconstruct its output frames. This presents a problem of loss of reference frame correlation when the scale factor is to be reduced (i.e., less downscaling) or eliminated. The loss occurs because the high-frequency content of the image has been reduced in the scaling process, so when the reference frame is scaled up to match the new input resolution it will have less detail than the scaled original source. Fortunately, this case arises when more bandwidth has become available, which means a simple solution is to insert an intra-coded I-frame (or simply wait until the next point in time when this would be done, usually about four times per second), which does not depend on any of the reference frames. A more complex approach would be to attempt inter prediction and, if the correlation is good (i.e., high PSNR), immediately switch to the new resolution; otherwise, use the I-frame approach.
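The more complex switch-up decision can be sketched as a PSNR test between the upscaled reference and the source at the new resolution, using the usual convention that higher PSNR means better correlation. The function names and the 35 dB threshold are hypothetical:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped frames."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak * peak / mse)

def choose_switch_mode(ref_upscaled, source_at_new_res, threshold_db=35.0):
    """When reducing downscaling, decide whether the upscaled reference
    still correlates with the source well enough for inter prediction,
    or whether to fall back to (or wait for) an intra-coded I-frame.

    The 35 dB threshold is an illustrative assumption.
    """
    good = psnr(ref_upscaled, source_at_new_res) >= threshold_db
    return "inter" if good else "intra"
```

If the upscaled reference has lost too much detail, the test fails and the encoder takes the I-frame path, which is safe because the extra bits of an I-frame are affordable precisely when bandwidth has just increased.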
Some of the benefits of embodiments of the present invention include the following:
In addition to real-time encoding/decoding, the disclosed technology could be used for offline encoding, with the result of reduced file size (data size) and/or higher image quality during scenes that have fast motion or for some other reason do not compress well. This would be particularly true where a constant bit rate is desired.
It will be obvious to those having skill in the art that many changes may be made to the details of the above-described embodiments without departing from the underlying principles of the invention. The scope of the present invention should, therefore, be determined only by the following claims.
This application claims priority from U.S. Provisional Application No. 60/826,008 filed Sep. 18, 2006 and incorporated herein by this reference.
Number | Name | Date | Kind |
---|---|---|---|
5280349 | Wang | Jan 1994 | A |
5414468 | Lee | May 1995 | A |
5557332 | Koyanagi | Sep 1996 | A |
5565920 | Lee et al. | Oct 1996 | A |
5675331 | Watanabe | Oct 1997 | A |
5699460 | Kopet | Dec 1997 | A |
5963260 | Bakhmutsky | Oct 1999 | A |
6058143 | Golin | May 2000 | A |
6434196 | Sethuraman | Aug 2002 | B1 |
6504872 | Fimoff et al. | Jan 2003 | B1 |
6577767 | Lee | Jun 2003 | B2 |
6587590 | Pan | Jul 2003 | B1 |
6771704 | Hannah | Aug 2004 | B1 |
6870883 | Iwata | Mar 2005 | B2 |
6888477 | Lai | May 2005 | B2 |
6952211 | Cote | Oct 2005 | B1 |
7339993 | Brooks | Mar 2008 | B1 |
7376590 | Lee | May 2008 | B2 |
7634776 | Parameswaran | Dec 2009 | B2 |
7646810 | Tourapis | Jan 2010 | B2 |
20010047517 | Christopoulos et al. | Nov 2001 | A1 |
20020064314 | Comaniciu et al. | May 2002 | A1 |
20020136298 | Anantharamu et al. | Sep 2002 | A1 |
20020157112 | Kuhn | Oct 2002 | A1 |
20030028643 | Jabri | Feb 2003 | A1 |
20030123748 | Sebot | Jul 2003 | A1 |
20040076333 | Zhang et al. | Apr 2004 | A1 |
20040101056 | Wong | May 2004 | A1 |
20040161035 | Wedi | Aug 2004 | A1 |
20040213345 | Holcomb et al. | Oct 2004 | A1 |
20040218673 | Wang et al. | Nov 2004 | A1 |
20040252901 | Klein Gunnewiek et al. | Dec 2004 | A1 |
20050019000 | Lim et al. | Jan 2005 | A1 |
20050091696 | Wolfe et al. | Apr 2005 | A1 |
20050134735 | Swartz | Jun 2005 | A1 |
20050147033 | Chin et al. | Jul 2005 | A1 |
20050160471 | Cohen | Jul 2005 | A1 |
20050262510 | Parameswaran | Nov 2005 | A1 |
20060018378 | Piccinelli et al. | Jan 2006 | A1 |
20060056513 | Shen | Mar 2006 | A1 |
20060083308 | Schwarz et al. | Apr 2006 | A1 |
20060093042 | Kashima | May 2006 | A1 |
20060095944 | Demircin et al. | May 2006 | A1 |
20060114989 | Panda | Jun 2006 | A1 |
20060126667 | Smith et al. | Jun 2006 | A1 |
20060193388 | Woods | Aug 2006 | A1 |
20060268991 | Segall et al. | Nov 2006 | A1 |
20070053436 | Van Eggelen | Mar 2007 | A1 |
20070086528 | Mauchly | Apr 2007 | A1 |
20070091815 | Tinnakornsrisuphap et al. | Apr 2007 | A1 |
20070098070 | Saigo | May 2007 | A1 |
20070223580 | Ye | Sep 2007 | A1 |
20070285285 | Puri | Dec 2007 | A1 |
20080063082 | Watanabe | Mar 2008 | A1 |
20080123750 | Bronstein | May 2008 | A1 |
20080126278 | Bronstein | May 2008 | A1 |
20090034856 | Moriya | Feb 2009 | A1 |
20090092326 | Fukuhara | Apr 2009 | A1 |
20090290635 | Kim et al. | Nov 2009 | A1 |
Number | Date | Country |
---|---|---|
2004140473 | May 2004 | JP |
2007174569 | Jul 2007 | JP |
03036980 | May 2003 | WO |
WO 2004010670 | Jan 2004 | WO |
Number | Date | Country | |
---|---|---|---|
20080084927 A1 | Apr 2008 | US |
Number | Date | Country | |
---|---|---|---|
60826008 | Sep 2006 | US |