The present invention relates to three-dimensional (3D) and multi-view video coding for depth data. In particular, the present invention relates to disparity vector derivation for inter-view motion prediction (IVMP) of depth data.
Three-dimensional (3D) television has been a technology trend in recent years that aims to bring viewers a more immersive viewing experience. Various technologies have been developed to enable 3D viewing. Among them, multi-view video is a key technology for 3DTV applications. Traditional video is a two-dimensional (2D) medium that only provides viewers a single view of a scene from the perspective of the camera. 3D video, on the other hand, is capable of offering arbitrary viewpoints of dynamic scenes and provides viewers the sensation of realism.
3D video is typically created by capturing a scene using a video camera with an associated device to capture depth information, or by using multiple cameras simultaneously, where the multiple cameras are properly located so that each camera captures the scene from one viewpoint. According to the draft three-dimensional video coding standard based on high efficiency video coding (3D-HEVC), inter-view motion prediction (IVMP) is applied to depth coding as well as texture coding. For texture coding, the neighboring block disparity vector (NBDV) method is adopted to derive the disparity vector (DV) between two views. For depth coding, however, a simplified method is disclosed by Park et al. (3D-CE2 related: Simplification of DV Derivation for Depth Coding, Joint Collaborative Team on 3D Video Coding Extension of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 7th Meeting: San Jose, US, 11-17 Jan. 2014, Document: JCT3V-G0074), in which a fixed depth value of 128 is converted to derive the disparity vector between the two views.
The use of the fixed depth value 128 apparently is based on the assumption that the depth data has 8-bit accuracy, corresponding to the range 0 to 255, so that 128 is the middle value of the depth range. Nevertheless, the depth data may use other bit depths, such as 10 or 12 bits. In such cases, the fixed value 128 may not be a proper depth estimate for deriving the disparity vector. Therefore, it is desirable to develop an IVMP coding technique for depth data that can work reliably for various bit depths of the depth data.
A method and apparatus for coding depth data using inter-view motion prediction (IVMP) in a three-dimensional or multi-view video coding system are disclosed. In one embodiment, the bit depth of the depth data associated with the current depth map is determined first and a converted disparity vector is derived from a selected depth value depending on the bit depth. A corresponding depth block in an inter-view reference depth map in a reference view is located using the converted disparity vector. The current depth block is then encoded or decoded using the corresponding depth block as an inter-view predictor.
The converted disparity vector can be determined from the selected depth value using a lookup table. The selected depth value, d, can be calculated according to d = 1<<(BitDepth−1), where BitDepth corresponds to the bit depth and “<<” corresponds to an arithmetic left-shift operation. The converted disparity vector may correspond to (DepthToDisparityB[d], 0), wherein DepthToDisparityB[ ] is a lookup function mapping an input depth value to an output disparity value. The bit depth of the depth data associated with the current depth map can be indicated at a sequence level of a bitstream associated with the depth data.
The IVMP coding process for the depth data can be performed only if the bit depth is 8. If the bit depth is not 8, the IVMP coding process for the depth data will not be performed. A bitstream associated with the depth data will be declared as invalid if the bitstream indicates that inter-view motion prediction is applied to the depth data and the bit depth is not 8.
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
As mentioned before, when the depth data uses a bit depth other than 8 bits, the fixed depth value 128 used for depth-to-disparity conversion may not be a good estimate of the depth value. Therefore, it is desirable to develop a depth value estimation that can fit an arbitrary bit depth.
In one embodiment, the disparity vector between two views used by inter-view motion prediction (IVMP) coding of depth data is derived depending on the bit depth of the depth data as indicated for the current sequence.
In another embodiment, the estimated depth value can be based on the bit depth of the current depth component. The estimated depth value is used by the depth-value-to-disparity conversion 130 to derive the converted disparity vector DV. The derived DV is then used by the IVMP coding to locate the inter-view reference block for the current depth block.
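As a rough illustration of how the derived DV may be used to locate the inter-view reference block, the following C sketch offsets the current block position by the converted disparity vector and clips the result to the reference depth map boundary. The quarter-sample DV precision and the clipping convention are assumptions made for illustration only and are not mandated by the description above.

```c
/* Illustrative only: locate the top-left sample position of the
 * corresponding depth block in the inter-view reference depth map.
 * The disparity vector (dv_x, dv_y) is assumed to be in quarter-sample
 * units, hence the rounding shift; the clipping keeps the block inside
 * the reference picture. */
static int clip3(int lo, int hi, int x)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

static void locate_reference_block(int cur_x, int cur_y,      /* current block position     */
                                   int block_w, int block_h,  /* block size in samples      */
                                   int dv_x, int dv_y,        /* converted disparity vector */
                                   int pic_w, int pic_h,      /* reference picture size     */
                                   int *ref_x, int *ref_y)    /* output reference position  */
{
    *ref_x = clip3(0, pic_w - block_w, cur_x + ((dv_x + 2) >> 2));
    *ref_y = clip3(0, pic_h - block_h, cur_y + ((dv_y + 2) >> 2));
}
```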
The following procedure illustrates an example of deriving a disparity vector (MvDisp) between two views for IVMP coding based on the bit depth. The disparity vector MvDisp between two views required by IVMP is calculated as:
MvDisp = (DepthToDisparityB[(1<<(BitDepth−1))], 0),   (1)
where DepthToDisparityB is a function converting a depth value to a horizontal component of the corresponding disparity vector, and BitDepth is the bit depth of the current depth component. In equation (1), the vertical disparity is assumed to be 0 since multi-view cameras are often configured horizontally. Nevertheless, a corresponding depth-to-disparity function can be used for other multi-view camera configurations. The conversion may also be efficiently implemented as a lookup table. In the above example, the estimated depth value corresponds to 1<<(BitDepth−1), where “<<” corresponds to the arithmetic left-shift operation. Therefore, if BitDepth is 10, the estimated depth value is 512, and if BitDepth is 12, the estimated depth value is 2048.
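Purely as an illustration of equation (1), the following C sketch derives MvDisp from the bit depth. The array DepthToDisparityB[ ] is assumed to have already been populated (e.g., from the camera parameters signaled for the reference view); only the estimated depth value computation follows directly from equation (1).

```c
#include <stdint.h>

/* Hypothetical lookup table standing in for DepthToDisparityB[ ]; in
 * practice it is derived from the camera parameters of the reference
 * view and has (1 << BitDepth) entries. */
extern const int32_t DepthToDisparityB[];

typedef struct {
    int32_t x;  /* horizontal disparity component */
    int32_t y;  /* vertical disparity component (0 for horizontal camera setups) */
} MvDisp;

/* Derive the disparity vector used by IVMP for depth data from the
 * bit depth of the current depth component, per equation (1). */
static MvDisp derive_mv_disp(int bit_depth)
{
    /* Estimated depth value at the middle of the depth range:
     * 128 for 8-bit, 512 for 10-bit, 2048 for 12-bit depth data. */
    int32_t estimated_depth = 1 << (bit_depth - 1);

    MvDisp mv;
    mv.x = DepthToDisparityB[estimated_depth];
    mv.y = 0;
    return mv;
}
```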
In another embodiment, inter-view motion prediction coding for the depth data is allowed only if the bit depth of the depth component is 8. If the bit depth of the depth component is not 8, inter-view motion prediction coding for the depth data is not allowed. A bitstream associated with the depth data is declared invalid if the bitstream indicates that inter-view motion prediction is applied to the depth data and the bit depth is not 8.
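A decoder may enforce this constraint with a simple conformance check, as sketched below. The flag name ivmp_depth_enabled_flag is a hypothetical placeholder for whatever sequence-level syntax element indicates that inter-view motion prediction is applied to the depth data.

```c
#include <stdbool.h>

/* Conformance check for this embodiment: IVMP for depth data is only
 * permitted when the depth component uses a bit depth of 8. */
static bool depth_ivmp_params_valid(bool ivmp_depth_enabled_flag,
                                    int depth_bit_depth)
{
    if (ivmp_depth_enabled_flag && depth_bit_depth != 8) {
        return false;  /* bitstream is declared invalid */
    }
    return true;
}
```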
The flowchart shown above is intended to illustrate an example of inter-view motion prediction for depth data according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirements. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced without such specific details.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
The present invention is a continuation-in-part application of and claims priority to PCT Patent Application, Serial No. PCT/CN2014/079155, filed on Jun. 4, 2014, entitled “Depth Coding Compatible with Arbitrary Bit-Depth”. The PCT Patent Application is hereby incorporated by reference in its entirety.
|  | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | PCT/CN2014/079155 | Jun 2014 | US |
| Child | 14728088 |  | US |