The present invention relates to a frame packing method, apparatus and system using a new 3D coding “frame compatible” format.
For transmission of 3D video streams, the so-called “frame compatible” formats are commonly used. Such formats allow the two images that make up the stereoscopic pair to be entered into a video frame, which is used as a container. In this way, the 3D video stream, consisting of two 2D video streams (one for the left eye and one for the right eye), becomes a single video stream and can therefore pass through the production and distribution infrastructures used for 2D TV and, most importantly, can be played by the 2D and 3D receivers currently available on the market, in particular for High Definition TV.
A third format, called “tile format”, has also been proposed, wherein two 720p images (1280×720 progressive-scan pixels) are entered into a 1080p container frame. According to such a format, one of the two images (L) is entered unchanged into the container, while the other one is divided into three parts (R1, R2, R3), which are in turn entered into the space left available by the first image (see FIG. 1c).
The “tile format” differs from the other formats in that the container frame has a different format from the two component images, which undergo no decimation. In the most typical application, in fact, the format of the container is 1080p, while the format of the component images is 720p. It is apparent that a 1080p container (i.e. a progressive video) may also contain two interlaced images, i.e. of the 1080i type, with a halved frame rate. However, this type of frame packing has not been sufficiently studied yet, although it appears to be attractive for those broadcasters who have chosen the 1080i format for high definition and want to keep using the same format also for the two images that make up the stereoscopic pair.
These frame packing formats using 1080p as a container can be defined as “second generation” formats. Their use is of interest in both the distribution and production environments. It must be pointed out that all frame packing formats suffer from the drawback that, at the compression stage, they do not allow the so-called “inter-view” redundancies to be exploited, i.e. the similarities between the two images of the stereoscopic pair. For this reason, it has been proposed to use, for distribution, the so-called MVC (Multiview Video Coding) compression system in its “Stereo High profile” version, wherein there are only two views, because such a system allows said redundancies to be exploited. According to others, the advantages deriving from MVC are limited, in that the bit-rate gain thus obtained is small, whereas the complexity of the encoder and of the decoder increases significantly.
In any case, even if MVC is used for distribution, in order to circulate a 3D signal in the same systems used for HDTV it is appropriate to use a frame packing solution for production. If MVC is used in the 2×720p version, the tile format is well suited to be used as a production format. Vice versa, if MVC is used in the 2×1080i version, it will be necessary to use the new format mentioned above.
It is therefore an object of the present invention to solve the above-mentioned problems of the prior art by providing a frame packing method, apparatus and system using a new 3D coding “frame compatible” format, wherein the composite signal obtained contains a signalling that defines the adopted packing type, and wherein the encoder, upon receiving said signalling, applies the coding algorithms typical of an interlaced signal to the composite signal, the signal to be coded being nevertheless of the progressive type.
These and other objects and advantages of the invention, which will become apparent from the following description, are achieved through a frame packing method as set out in claim 1.
In addition, these and other objects and advantages of the invention are achieved through a frame packing apparatus as set out in claim 19.
Finally, these and other objects and advantages of the invention are achieved through a frame packing system as set out in claim 30.
Preferred embodiments and non-obvious variants of the present invention are specified in dependent claims.
It is understood that all the appended claims are an integral part of the present description.
It will become immediately apparent that what is described herein may be subject to innumerable variations and modifications (e.g. in shape, dimensions, arrangements and parts having equivalent functionality) without departing from the protection scope of the invention as set out in the appended claims.
The present invention will be described in detail below through some preferred embodiments thereof, which are only provided by way of non-limiting example, with reference to the annexed drawings, wherein:
FIGS. 1a and 1b show two HD frames composed of 1920 columns by 1080 rows of pixels (referred to as 1080p), respectively belonging to the video stream for the left eye L and for the right eye R;
FIG. 1c illustrates the so-called “tile format”;
FIGS. 2a and 2b show a known interleaving process for entering two 1080i images into a 1080p frame;
FIGS. 4a and 4b illustrate the frame packing method of the present invention.
FIGS. 1a to 1c have already been described above in the paragraphs discussing the prior art.
In order to be able to enter two 1080i images into a 1080p container, it has been proposed in the art to use some sort of interleaving between two interlaced frames: in other words, the odd rows of the left image (L) and the even rows of the right image (R) are entered into a container frame C, and the even rows of the image L and the odd rows of the image R are entered into the next frame C+1 (see FIGS. 2a and 2b).
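It is therefore preferable to use a new form of top-bottom format, wherein, in a frame C, the odd active rows of the image L are copied into the upper half of the active part of said frame C by observing the same order in which they are arranged in said image L, and the odd active rows of the image R are copied into the lower half of the active part of said frame C by observing the same order in which they are arranged in said image R.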
In the next frame C+1, the even active rows of the image L are copied into the upper half of the active part of said frame C+1 by observing the same order in which they are arranged in said image L, and the even active rows of the image R are copied into the lower half of the active part of said frame C+1 by observing the same order in which they are arranged in said image R.
In a first alternative embodiment of the invention, in the frame C, the odd active rows of the image R are copied into the upper half of the active part of said frame C by observing the same order in which they are arranged in said image R, and the odd active rows of the image L are copied into the lower half of the active part of said frame C by observing the same order in which they are arranged in said image L. As a consequence, in the next frame C+1, the even active rows of the image R are copied into the upper half of the active part of said frame C+1 by observing the same order in which they are arranged in said image R, and the even active rows of the image L are copied into the lower half of the active part of said frame C+1 by observing the same order in which they are arranged in said image L.
In a second alternative embodiment of the invention, in the frame C, the even active rows of the image L are copied into the upper half of the active part of said frame C by observing the same order in which they are arranged in said image L, and the even active rows of the image R are copied into the lower half of the active part of said frame C by observing the same order in which they are arranged in said image R. As a consequence, in the next frame C+1, the odd active rows of the image L are copied into the upper half of the active part of said frame C+1 by observing the same order in which they are arranged in said image L, and the odd active rows of the image R are copied into the lower half of the active part of said frame C+1 by observing the same order in which they are arranged in said image R.
In a third alternative embodiment of the invention, in the frame C, the even active rows of the image R are copied into the upper half of the active part of said frame C by observing the same order in which they are arranged in said image R, and the even active rows of the image L are copied into the lower half of the active part of said frame C by observing the same order in which they are arranged in said image L. As a consequence, in the next frame C+1, the odd active rows of the image R are copied into the upper half of the active part of said frame C+1 by observing the same order in which they are arranged in said image R, and the odd active rows of the image L are copied into the lower half of the active part of said frame C+1 by observing the same order in which they are arranged in said image L.
Each one of the above arrangements advantageously preserves the vertical spatial correlation, and therefore causes no problems in the compression process.
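By way of non-limiting illustration, the four arrangements described above can be sketched in a few lines of Python. The function name `pack_top_bottom` and the flags `top_is_left` and `odd_first` are hypothetical names chosen for this sketch only; images are modelled simply as lists of active rows, with row 1 (an odd row) at list index 0.

```python
from typing import List, Sequence, Tuple

Row = Sequence[int]   # one active row of pixel values
Frame = List[Row]     # active rows listed from top to bottom


def pack_top_bottom(
    left: Sequence[Row],
    right: Sequence[Row],
    top_is_left: bool = True,   # hypothetical flag: image L goes in the upper half
    odd_first: bool = True,     # hypothetical flag: frame C carries the odd rows
) -> Tuple[Frame, Frame]:
    """Return the container frames (C, C+1) for one stereoscopic pair.

    Rows are numbered from 1, so the odd rows sit at list indices 0, 2, 4, ...
    """
    if len(left) != len(right) or len(left) % 2:
        raise ValueError("both images need the same, even number of active rows")

    def field(image: Sequence[Row], odd: bool) -> Frame:
        # Slicing keeps the rows in the order they have in the source image.
        return list(image[0::2] if odd else image[1::2])

    top, bottom = (left, right) if top_is_left else (right, left)
    frame_c = field(top, odd_first) + field(bottom, odd_first)
    frame_c1 = field(top, not odd_first) + field(bottom, not odd_first)
    return frame_c, frame_c1
```

With the default flags the sketch corresponds to the arrangement described first; `top_is_left=False` gives the first alternative embodiment, `odd_first=False` the second, and both together the third. Because each half of a container frame holds rows taken from a single half-frame in their original order, neighbouring container rows always come from the same image.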
Also advantageously, the process of compressing the signal obtained in accordance with the invention can be carried out by using the algorithms typically employed for an interlaced signal, even if the signal to be compressed is structured like a progressive signal. However, this cannot be done by using the current standards unless a number of variants are introduced, which will now be described.
Such variants should be introduced both into the SMPTE standard, at the processing stage that defines the frame packing systems to be used for production, and into the compression standards used for distribution: MPEG-2, AVC (Advanced Video Coding) and possibly also the new standard still under development, known as HEVC (High Efficiency Video Coding).
First of all, it is necessary to introduce a signalling that identifies the new frame packing type adopted, i.e. a signalling to indicate the transmission of a 1080p composite frame containing two 1080i images making up a stereoscopic pair, arranged in the frame according to the “top-bottom” format, i.e. one on top of the other. Of course, the type of signalling depends on the standard taken into consideration. For example, in the case of the AVC (ITU-T H.264) standard, one may use the “frame_packing_arrangement_type” parameter, which is included in the so-called SEI (Supplemental Enhancement Information) messages. This parameter defines the various allowable frame packing types and may have different values, most of which are not used. It will be sufficient to choose one of the unused values and then use it to define the new frame packing type according to the invention.
Still with reference to the H.264 standard, the two component images are defined as “0” and “1”, and there is another parameter that indicates which one of the two component images “0” and “1” is the image L for the left eye and which one is the image R. These very same parameters can be used in the SMPTE HD-SDI and 3G-SDI interfaces currently used in the production environment: in such a case, said parameters are entered into an “ancillary data packet” located in the horizontal blanking, just like any other signalling identifying the characteristics of the video signal being transported.
It is important to underline that the encoder, when it receives a signal according to the invention through the 3G-SDI interface defined by SMPTE, should preferably code it as if it were an interlaced signal, although the signal is structured like a progressive signal. For example, a typical way of coding interlaced signals is to carry out, for each macroblock, a “motion detection” operation, which makes it possible to distinguish the static areas of the image from those containing motion.
For static areas, it is possible to arrange together the pixels of the two even and odd half-frames (fields), and then treat such areas as if they were included in a progressive image. For moving areas, instead, the two half-frames are coded as if they were two different images. For example, in temporal prediction, the best-matching macroblock is searched for in one of the previous half-frames of the same type, and then the corresponding “motion vector” is calculated.
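Purely as an illustration of the per-macroblock decision just described, the following Python sketch compares the co-located samples of the odd and even half-frames and selects a coding mode. The mean-absolute-difference measure, the threshold value and the names `mean_abs_diff`, `choose_coding_mode` and `weave` are assumptions made for this sketch; real encoders use more elaborate decision criteria.

```python
from typing import List, Sequence

Block = Sequence[Sequence[int]]  # rows of luma samples belonging to one half-frame


def mean_abs_diff(a: Block, b: Block) -> float:
    """Average absolute difference between co-located samples of two blocks."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for sample_a, sample_b in zip(row_a, row_b):
            total += abs(sample_a - sample_b)
            count += 1
    if count == 0:
        raise ValueError("blocks must not be empty")
    return total / count


def choose_coding_mode(odd_field: Block, even_field: Block,
                       threshold: float = 4.0) -> str:
    """Crude motion detection: 'frame' for static areas, 'field' for moving ones.

    The threshold is an arbitrary, hypothetical value.
    """
    if mean_abs_diff(odd_field, even_field) <= threshold:
        return "frame"
    return "field"


def weave(odd_field: Block, even_field: Block) -> List[List[int]]:
    """Arrange the rows of the two half-frames together, as done for static areas."""
    rows: List[List[int]] = []
    for odd_row, even_row in zip(odd_field, even_field):
        rows.append(list(odd_row))
        rows.append(list(even_row))
    return rows
```

A block classified as "frame" would be passed through `weave` and coded as part of a progressive image, whereas a block classified as "field" would leave the two half-frames separate, each being predicted from earlier half-frames of the same type.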
Since the 3D video signal consists of a succession of frames alternately containing the odd and even half-frames of the two component signals, it is necessary to signal, for at least one of two consecutive frames C,C+1, which type of half-frame it contains. For this purpose a new parameter may be used, or one of the existing parameters may be recycled by giving a new meaning to it.
In both the AVC and SMPTE standards, one of the various frame packing types taken into consideration is the so-called “frame alternate”, i.e. a signal containing a succession of frames alternately belonging to the image L and to the image R.
This is a different case from the one of the present invention; in this case as well, however, there is a need for identifying which image is contained in each frame. In other words, it is necessary to define a parameter indicating, in a “frame alternate” system, which is the current frame (either the one containing the image L or the one containing the image R). This very same parameter may be used, with a different meaning, in the present invention; more precisely, when the “frame_packing_arrangement_type” parameter takes the value used for defining the new frame packing type of the invention, then the parameter in question may indicate whether the current frame contains the odd half-frames or the even half-frames.
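The reuse of an existing parameter with a packing-dependent meaning can be sketched as follows. Every name in this sketch is hypothetical: the string constants merely stand in for whichever “frame_packing_arrangement_type” values a standard would assign, the field names are not taken from any standard, and the mapping of a true flag to the odd half-frames is an arbitrary choice.

```python
from dataclasses import dataclass

# Hypothetical placeholders for the packing-type values; no actual
# numeric code of any standard is implied.
FRAME_ALTERNATE = "frame-alternate"
NEW_TOP_BOTTOM_INTERLACED = "top-bottom-2x1080i-in-1080p"


@dataclass(frozen=True)
class PackingSignalling:
    packing_type: str          # stands in for frame_packing_arrangement_type
    component0_is_left: bool   # which component image ("0" or "1") is the image L
    current_frame_flag: bool   # the reused "current frame" parameter


def left_eye_component(sig: PackingSignalling) -> int:
    """Index ("0" or "1") of the component image that carries the image L."""
    return 0 if sig.component0_is_left else 1


def describe_current_frame(sig: PackingSignalling) -> str:
    """Interpret the reused flag according to the signalled packing type."""
    if sig.packing_type == FRAME_ALTERNATE:
        # Original meaning: which component image the current frame carries.
        return "component image 0" if sig.current_frame_flag else "component image 1"
    if sig.packing_type == NEW_TOP_BOTTOM_INTERLACED:
        # New meaning (assumed mapping): which half-frames the current frame carries.
        return "odd half-frames" if sig.current_frame_flag else "even half-frames"
    raise ValueError(f"unsupported packing type: {sig.packing_type!r}")
```

The same one-bit flag thus carries two meanings, selected by the packing type, so no additional parameter needs to be defined.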
Of course, in order to define the parameters required for identifying the new format and handle it properly, many solutions are possible, which also depend on the various standards involved; nevertheless, the principle underlying the various possible solutions always remains the same, thus still falling within the protection scope of the patent.
In summary, an innovative frame packing method has been described, wherein the two images of a stereoscopic pair are of the 1080i type and are entered into a container frame of the 1080p type by using the “top-bottom” technique, wherein the active rows of the odd half-frames of said images are entered into a container frame, and the active rows of the even half-frames of said images are entered into the next container frame. The vertical blanking interval remains the one which is characteristic of a signal in the 1080p format.
The composite signal thus obtained contains a signalling that defines the type of packing adopted, so that an encoder, upon receiving said signalling, can apply the coding algorithms typical of an interlaced signal to said composite signal, the signal to be coded being nevertheless of the progressive type.
In particular, said method also includes a signalling that indicates, for each frame, the type of half-frame it contains, said half-frame being of the odd or even type.
More in particular, said method also includes a signalling indicating which one of the two images L,R is at the top and which one is at the bottom.
The invention also relates to a frame packing apparatus, wherein the two images of a stereoscopic pair are of the 1080i type, comprising means for entering said images into a container frame of the 1080p type by using the “top-bottom” technique, wherein the active rows of the odd half-frames of said images are entered into a container frame, and the active rows of the even half-frames of said images are entered into the next container frame.
Said apparatus also comprises means adapted to add to a composite signal thus obtained a signalling defining the adopted packing type, said signalling being adapted to cause an encoder, upon receiving said signalling, to apply the coding algorithms typical of an interlaced signal to said composite signal, the signal to be coded being nevertheless of the progressive type.
In particular, said apparatus also includes a signalling that indicates, for each frame, the type of half-frame it contains, said half-frame being of the odd or even type. Furthermore, said apparatus also includes a signalling indicating which one of the two images is at the top and which one is at the bottom.
The invention further relates to a system comprising a frame packing apparatus like the one illustrated above and an encoder which, after having identified the particular packing method adopted, applies the algorithms typical of an interlaced signal to the composite signal, the composite signal being nevertheless of the progressive type.
The stereoscopic video stream generated in accordance with the packing and coding method is transmitted via a communication channel and is received by a decoder adapted to generate a composite video signal and comprising means for receiving a stereoscopic video stream packed and coded in accordance with the above-described method, means for decoding the stereoscopic video stream, and means for outputting a composite video signal comprising the signalling entered during the step of packing and coding the composite video signal.
The video signal thus extracted is then sent to the input of an unpacking device adapted to generate a signal in a video format that can be used by a display device. Said unpacking device comprises means for receiving the composite video signal packed in accordance with the above-described frame packing method, and means for interpreting the signalling associated with said composite video signal. Said signalling contains the information necessary for the proper operation of the unpacker, which must execute operations which are the exact inverse of those executed by the packer. A complete reception system comprises said decoder, said unpacking device and the display device. In practical implementations, the decoder may be a set-top-box and the unpacker may be included in the display device; it may also be the case that the set-top-box contains, in addition to the decoder, also the unpacking device; finally, it may also happen that the decoder and the unpacking device together constitute a single apparatus.
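The inverse operation carried out by the unpacking device can be illustrated by the following Python sketch, which rebuilds the two interlaced images from a pair of consecutive container frames. The function name `unpack_top_bottom` and the flags `top_is_left` and `odd_first` are hypothetical; in practice their values would be derived from the signalling associated with the composite video signal.

```python
from typing import List, Sequence, Tuple

Row = Sequence[int]   # one active row of pixel values
Frame = List[Row]     # active rows listed from top to bottom


def unpack_top_bottom(
    frame_c: Sequence[Row],
    frame_c1: Sequence[Row],
    top_is_left: bool = True,   # hypothetical flag: upper half carries the image L
    odd_first: bool = True,     # hypothetical flag: frame C carries the odd rows
) -> Tuple[Frame, Frame]:
    """Rebuild the interlaced images (L, R) from container frames C and C+1."""
    if len(frame_c) != len(frame_c1) or len(frame_c) % 2:
        raise ValueError("container frames need the same, even number of rows")
    half = len(frame_c) // 2

    def weave(odd_rows: Sequence[Row], even_rows: Sequence[Row]) -> Frame:
        # Put the rows back in their original positions: odd, even, odd, even, ...
        image: Frame = []
        for odd_row, even_row in zip(odd_rows, even_rows):
            image.append(odd_row)
            image.append(even_row)
        return image

    first_top, first_bottom = frame_c[:half], frame_c[half:]
    second_top, second_bottom = frame_c1[:half], frame_c1[half:]
    if odd_first:
        top = weave(first_top, second_top)
        bottom = weave(first_bottom, second_bottom)
    else:
        top = weave(second_top, first_top)
        bottom = weave(second_bottom, first_bottom)
    return (top, bottom) if top_is_left else (bottom, top)
```

The sketch only restores the row structure of the two 1080i images; any further processing needed by the display device, such as de-interlacing, is outside its scope.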
Number | Date | Country | Kind |
---|---|---|---|
TO2012A000134 | Feb 2012 | IT | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB2013/051242 | 2/15/2013 | WO | 00 | 7/30/2014 |