The present invention relates to three-dimensional and multi-view video coding. In particular, the present invention relates to a method for depth lookup table (DLT) signaling.
Three-dimensional (3D) television has been a technology trend in recent years that is targeted to bring viewers a sensational viewing experience. Multi-view video is a technique to capture and render 3D video. The multi-view video is typically created by capturing a scene using multiple cameras simultaneously, where the multiple cameras are properly located so that each camera captures the scene from one viewpoint. The multi-view video, with a large number of video sequences associated with the views, represents a massive amount of data. Accordingly, the multi-view video requires a large storage space to store and/or a high bandwidth to transmit. Therefore, multi-view video coding techniques have been developed in the field to reduce the required storage space and the transmission bandwidth. A straightforward approach may simply apply conventional video coding techniques to each single-view video sequence independently and disregard any correlation among different views. Such straightforward techniques would result in poor coding performance. In order to improve multi-view video coding efficiency, multi-view video coding always exploits inter-view redundancy. The disparity between two views is caused by the locations and angles of the two respective cameras.
Depth lookup table (DLT) has been adopted into 3D-HEVC. Very often, there are only limited values appearing in the depth component. Therefore, DLT is a compact representation of the valid values in a block. When a CU is coded in Intra simplified depth coding (SDC) mode or depth map modeling (DMM) mode, DLT is used to map the valid depth values to DLT indexes.
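As a non-limiting illustration, the mapping between valid depth values and DLT indexes may be sketched as follows. The function names build_dlt and build_index_map and the sample values are hypothetical and are not taken from the 3D-HEVC draft; the sketch only assumes that the DLT is an ordered list of the depth values that actually occur.

```python
def build_dlt(depth_samples):
    """Return the sorted list of distinct depth values that occur (the DLT)."""
    return sorted(set(depth_samples))


def build_index_map(dlt):
    """Return a mapping from each valid depth value to its DLT index."""
    return {value: index for index, value in enumerate(dlt)}


# Hypothetical depth samples; only three distinct values appear.
depth_samples = [50, 50, 120, 200, 120, 50]

dlt = build_dlt(depth_samples)                 # valid depth values
value_to_index = build_index_map(dlt)          # depth value -> DLT index
indexes = [value_to_index[v] for v in depth_samples]
restored = [dlt[i] for i in indexes]           # DLT index -> depth value
```

In this sketch, three index values are sufficient to represent samples that would otherwise span the full depth range, which is the compactness referred to above.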
In the current 3D-HEVC (Three-Dimensional Coding Based on High Efficiency Video Coding) draft standard, DLT is signaled as an extension to picture parameter set (PPS). The syntax elements related to DLT signaling are described in the following tables.
As shown in Table 1, the DLT parameter information, pps_dlt_parameters( ), is incorporated in the PPS if the PPS extension flag, pps_extension_type_flag[0], is asserted.
The syntax structure for the DLT parameter information, pps_dlt_parameters( ), is shown in Table 2. When the flag, dlt_present_flag, has a value of 1, information related to DLT is incorporated in pps_dlt_parameters( ) as shown in Table 2. The inter-view DLT prediction flag, inter_view_dlt_pred_enable_flag[i], equal to 1 indicates that the i-th depth lookup table is predicted from the 0-th depth lookup table. On the other hand, the flag, inter_view_dlt_pred_enable_flag[i], equal to 0 indicates that the i-th depth lookup table is not predicted from any other depth lookup table.
The DLT signaling according to the existing 3D-HEVC standard has some issues. First, inter-view prediction can be applied to DLT as indicated by a corresponding flag, inter_view_dlt_pred_enable_flag[i]. If this flag is set to 1, the i-th DLT is predicted from the 0-th DLT. On the other hand, dlt_flag[i] indicates whether the i-th DLT exists. Therefore, if dlt_flag[0] is 0 and inter_view_dlt_pred_enable_flag[i] with i>0 is 1, the i-th DLT is predicted from a non-existent DLT.
In the existing 3D-HEVC standard, pps_bit_depth_for_depth_views_minus8 is signaled to indicate the bit-depth for samples of the depth component in the picture. However, the bit-depth for samples in the depth component is also signaled in the sequence level as indicated by bit_depth_luma_minus8, which is signaled in the sequence parameter set (SPS). Therefore, there is a potential contradiction between these two syntax elements if they have different values.
In the existing 3D-HEVC standard, the DLT is signaled in all PPSs of the bit-stream including the texture video in all views since the flag dlt_present_flag can be set to 1 for the texture data. However, the DLT is only required by the depth component of each view. According to the existing 3D-HEVC standard, the DLT is signaled in four PPSs in total when there are 3 views. As shown in
It is desirable to develop methods to overcome these issues without causing noticeable impact on the performance.
A method and apparatus for depth lookup table (DLT) signaling in a three-dimensional and multi-view coding system are disclosed. The method identifies one or more pictures to be processed. If said one or more pictures contain depth data, then the method determines the DLT associated with said one or more pictures, applies predictive coding to the DLT based on a previous DLT, includes syntax related to the DLT in the PPS, and includes first bit-depth information related to first depth samples of the DLT in the PPS. The first bit-depth information is consistent with second bit-depth information signaled in a sequence level. The method further signals the PPS in a video bitstream for a sequence including said one or more pictures.
In another embodiment of the application, a circuit is also provided that embodies circuitry configured to carry out the operations specified above.
As mentioned before, there are various issues with the depth lookup table (DLT) signaling in the existing High Efficiency Video Coding (HEVC) based 3D video coding. Accordingly, embodiments of the present invention are disclosed to overcome these issues. The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
One aspect of the present invention addresses validity of DLT prediction. As shown in the PPS DLT parameter syntax of the existing 3D-HEVC, predictive DLT coding is allowed for all depth layers regardless of whether the DLT used as the predictor exists or not. In one embodiment, whether a corresponding DLT exists for predicting a current DLT is checked first. If the corresponding DLT exists, predictive DLT coding is allowed for the current DLT to use the corresponding DLT as a predictor. If the corresponding DLT does not exist, predictive DLT coding is not applied to the current DLT regardless of whether the inter-view DLT prediction is enabled or not as indicated by an inter-view DLT prediction enable flag, inter_view_dlt_pred_enable_flag[i]. Alternatively, if the corresponding DLT required for predicting the i-th DLT does not exist, the flag inter_view_dlt_pred_enable_flag[i] is forced to be 0. In another embodiment, the flag inter_view_dlt_pred_enable_flag[i] is inferred as 0 if the DLT for predicting the i-th DLT does not exist.
An exemplary syntax table incorporating an embodiment of the present invention is shown in Table 3. For the i-th DLT, the flag inter_view_dlt_pred_enable_flag[i] is incorporated only when the 0-th DLT exists, i.e., dlt_flag[0] being 1.
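As a non-limiting illustration, the conditional signaling described above may be sketched as a simplified parsing routine. The routine does not reproduce Table 3; the function name parse_dlt_flags, the representation of the bitstream as a list of one-bit values, the order of the syntax elements, and the restriction of the prediction flag to i>0 are all assumptions made for this sketch.

```python
def parse_dlt_flags(bits, num_layers):
    """Parse dlt_flag[i] and, only when allowed, inter_view_dlt_pred_enable_flag[i].

    `bits` is a hypothetical stand-in for the bitstream: a list of 0/1 values.
    The prediction flag is read only when the i-th DLT exists, i > 0, and the
    0-th DLT exists (dlt_flag[0] == 1); otherwise it is inferred to be 0.
    """
    reader = iter(bits)
    dlt_flag = []
    inter_view_dlt_pred_enable_flag = []
    for i in range(num_layers):
        dlt_flag.append(next(reader))
        if dlt_flag[i] and i > 0 and dlt_flag[0]:
            inter_view_dlt_pred_enable_flag.append(next(reader))
        else:
            # Flag is absent from the bitstream and inferred as 0.
            inter_view_dlt_pred_enable_flag.append(0)
    return dlt_flag, inter_view_dlt_pred_enable_flag


# The 0-th DLT is absent, so no prediction flag is ever read.
flags_without_base = parse_dlt_flags([0, 1, 1], num_layers=3)

# The 0-th DLT is present, so the prediction flag is read for layers 1 and 2.
flags_with_base = parse_dlt_flags([1, 1, 1, 1, 0], num_layers=3)
```

Under these assumptions, a bitstream in which dlt_flag[0] is 0 cannot express prediction from the 0-th DLT, since the prediction flag is never present.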
When the flag inter_view_dlt_pred_enable_flag[i] is 1 and the DLT that is originally used to predict the i-th DLT does not exist, the predictor for the i-th DLT can be changed to another DLT that exists. Instead of changing to an existing DLT, a predefined DLT can be used in this case as well. For example, the predefined DLT may include all possible values, such as 0, 1, . . . , 255, in the depth component. In another example, the predefined DLT contains no values.
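As a non-limiting illustration, the selection of an alternative predictor may be sketched as follows. The function name select_predictor, the fallback argument, the use of None to mark a DLT that does not exist, and the choice of the lowest-index existing DLT as the alternative are assumptions made for this sketch; the description above does not specify which existing DLT is chosen.

```python
def select_predictor(dlts, i, fallback="other", bit_depth=8):
    """Return the predictor DLT for the i-th DLT (intended for i > 0).

    `dlts` maps a DLT index to a list of depth values, or to None when that
    DLT does not exist. The 0-th DLT is used when it exists; otherwise the
    behavior is chosen by the hypothetical `fallback` argument.
    """
    if dlts.get(0) is not None:
        return dlts[0]
    if fallback == "other":
        # Assumption: use the lowest-index DLT that exists, other than the i-th.
        for j in sorted(dlts):
            if j != i and dlts[j] is not None:
                return dlts[j]
        return []
    if fallback == "full":
        # Predefined DLT including all possible values, e.g. 0..255 for 8 bits.
        return list(range(1 << bit_depth))
    if fallback == "empty":
        # Predefined DLT containing no values.
        return []
    raise ValueError("unknown fallback: " + fallback)


dlts = {0: None, 1: [10, 20], 2: [10, 30]}
from_other = select_predictor(dlts, 2)
from_full = select_predictor(dlts, 2, fallback="full")
from_empty = select_predictor(dlts, 2, fallback="empty")
```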
Another aspect of the present invention addresses the consistency of bit depth information in different layers of the syntax. For example, the consistency of the bit-depth indication can be checked for the sequence level. To be specific, all bit-depth indications for depth data signaled in a video sequence must be the same as the bit-depth indication signaled in the sequence parameter set (SPS). Also, the PPS level bit depth indication (i.e., pps_bit_depth_for_depth_views_minus8) is set to be the same as the SPS level bit depth indication (i.e., bit_depth_luma_minus8).
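As a non-limiting illustration, the consistency constraint may be sketched as a simple check over the PPSs of a sequence. The function name pps_bit_depths_consistent and the representation of the PPS-level indications as a list of integers are assumptions made for this sketch.

```python
def pps_bit_depths_consistent(bit_depth_luma_minus8, pps_values):
    """Return True when every PPS-level indication equals the SPS-level one.

    `pps_values` holds the pps_bit_depth_for_depth_views_minus8 value of each
    PPS that carries depth bit-depth information in the sequence.
    """
    return all(value == bit_depth_luma_minus8 for value in pps_values)


consistent = pps_bit_depths_consistent(0, [0, 0, 0])
inconsistent = pps_bit_depths_consistent(0, [0, 2])
```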
In another embodiment, the bit depth consistency is achieved by scaling when the bit depth indications are different in different levels. For example, depth values of the DLT signaled in the PPS can be scaled if the PPS level bit depth indication, pps_bit_depth_for_depth_views_minus8, and the SPS level bit depth indication, bit_depth_luma_minus8, are different. For example, a depth value D of the DLT can be scaled according to D′=(D+offset)>>(pps_bit_depth_for_depth_views_minus8−bit_depth_luma_minus8) if pps_bit_depth_for_depth_views_minus8 is greater than bit_depth_luma_minus8. In another example, the scaling can be done according to D′=D<<(bit_depth_luma_minus8−pps_bit_depth_for_depth_views_minus8) if pps_bit_depth_for_depth_views_minus8 is less than bit_depth_luma_minus8. The offset can be any integer, such as 0 or (1<<(pps_bit_depth_for_depth_views_minus8−bit_depth_luma_minus8−1)).
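As a non-limiting illustration, the two scaling expressions above may be sketched as follows. The function name scale_dlt_value and the rounding argument are hypothetical; the rounding argument merely selects between the two example offsets given above. No clipping is applied, since none is described above.

```python
def scale_dlt_value(d, pps_bit_depth_minus8, sps_bit_depth_minus8, rounding=True):
    """Scale a DLT depth value from the PPS-level bit depth to the SPS-level one.

    pps_bit_depth_minus8 stands for pps_bit_depth_for_depth_views_minus8 and
    sps_bit_depth_minus8 stands for bit_depth_luma_minus8.
    """
    if pps_bit_depth_minus8 > sps_bit_depth_minus8:
        shift = pps_bit_depth_minus8 - sps_bit_depth_minus8
        # Offset is either 0 or 1 << (shift - 1), the two examples given above.
        offset = (1 << (shift - 1)) if rounding else 0
        return (d + offset) >> shift
    if pps_bit_depth_minus8 < sps_bit_depth_minus8:
        return d << (sps_bit_depth_minus8 - pps_bit_depth_minus8)
    # Equal bit depths: no scaling is needed.
    return d


# 10-bit PPS indication (minus8 = 2) scaled to an 8-bit SPS indication (minus8 = 0).
down_rounded = scale_dlt_value(514, 2, 0)                    # (514 + 2) >> 2
down_truncated = scale_dlt_value(514, 2, 0, rounding=False)  # 514 >> 2
# 8-bit PPS indication scaled to a 10-bit SPS indication.
up = scale_dlt_value(129, 0, 2)                              # 129 << 2
```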
Another aspect of the present invention addresses redundancy in the DLT signaling. Since the DLT is not needed by the texture data, DLT is not signaled in the PPS for the texture component. In other words, DLT is not signaled in the PPS for the texture only layer. On the other hand, the DLT for the depth data of all views can be signaled in a single PPS that is shared by depth components of all views as shown in
In one embodiment, one PPS may signal the DLT for the depth component associated with one view only. In other words, one PPS only signals the DLT for one layer. In another embodiment, a slice may use the DLT signaled in the PPS that contains PPS identification, pps_pic_parameter_set_id, with the same value as the slice identification, slice_pic_parameter_set_id, in the slice header for this slice.
In one embodiment, the DLT signaled in one PPS, noted as P1, can be predicted by a DLT signaled in a different PPS, noted as P0. Furthermore, the pps_pic_parameter_set_id of P0 can be signaled in P1 to locate the PPS (i.e., P0) containing a DLT to be used as a predictor for the DLT in PPS P1.
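As a non-limiting illustration, locating the DLT used by a slice and the DLT used as a predictor may be sketched as follows. The class name Pps, the field name ref_pps_id (standing for the pps_pic_parameter_set_id of P0 as signaled in P1), and the dictionary used as a PPS table are assumptions made for this sketch; the form of the predictive coding itself is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pps:
    pps_pic_parameter_set_id: int
    dlt: Optional[List[int]] = None
    # Hypothetical field: identifier of the PPS whose DLT serves as predictor.
    ref_pps_id: Optional[int] = None


def dlt_for_slice(pps_table: Dict[int, Pps], slice_pic_parameter_set_id: int):
    """Return the DLT of the PPS whose identifier matches the slice header."""
    return pps_table[slice_pic_parameter_set_id].dlt


def predictor_dlt(pps_table: Dict[int, Pps], pps: Pps):
    """Return the DLT of the referenced PPS, or None when no predictor applies."""
    if pps.ref_pps_id is None:
        return None
    ref = pps_table.get(pps.ref_pps_id)
    return ref.dlt if ref is not None else None


p0 = Pps(pps_pic_parameter_set_id=0, dlt=[10, 20, 30])
p1 = Pps(pps_pic_parameter_set_id=1, dlt=[10, 20, 40], ref_pps_id=0)
pps_table = {p.pps_pic_parameter_set_id: p for p in (p0, p1)}

slice_dlt = dlt_for_slice(pps_table, 1)   # DLT used by a slice referring to P1
p1_predictor = predictor_dlt(pps_table, p1)  # DLT of P0, the predictor for P1
```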
The flowchart shown above is intended to illustrate an example of 3D/multi-view coding using DLT signaling according to an embodiment of the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in various hardware, software code, or a combination of both. For example, an embodiment of the present invention may correspond to one or more electronic circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software code and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
The present application is a continuation of U.S. patent application Ser. No. 15/123,882, filed Sep. 6, 2016, which is a national phase application of PCT patent application no. PCT/CN2015/074391, filed Mar. 17, 2015, which is a continuation-in-part of PCT Patent Application, Ser. No. PCT/CN2014/073611, filed on Mar. 18, 2014. The priority applications are all hereby incorporated by reference in their entireties.
Number | Name | Date | Kind |
---|---|---|---|
20090161989 | Sim | Jun 2009 | A1 |
20090262798 | Chiu | Oct 2009 | A1 |
20110090959 | Wiegand | Apr 2011 | A1 |
20110228855 | Gao | Sep 2011 | A1 |
20120229602 | Chen | Sep 2012 | A1 |
20130022111 | Chen | Jan 2013 | A1 |
20130156215 | Hickerson | Jun 2013 | A1 |
20130182761 | Chen | Jul 2013 | A1 |
20130235072 | Longhurst | Sep 2013 | A1 |
20130294509 | Song | Nov 2013 | A1 |
20130342644 | Rusanovsky et al. | Dec 2013 | A1 |
20140253682 | Zhang | Sep 2014 | A1 |
20150036745 | Hsu | Feb 2015 | A1 |
20150078441 | Han | Mar 2015 | A1 |
20150350623 | Zhang | Dec 2015 | A1 |
20150350677 | Lim | Dec 2015 | A1 |
20150358643 | Zhang | Dec 2015 | A1 |
20160007005 | Konieczny | Jan 2016 | A1 |
20160014434 | Liu et al. | Jan 2016 | A1 |
20160029036 | Jaeger | Jan 2016 | A1 |
20160057452 | Li | Feb 2016 | A1 |
20160073131 | Heo | Mar 2016 | A1 |
20160330479 | Liu | Nov 2016 | A1 |
20160330480 | Liu | Nov 2016 | A1 |
20170006309 | Liu | Jan 2017 | A1 |
20170013276 | Chen | Jan 2017 | A1 |
Number | Date | Country |
---|---|---|
101222639 | Jul 2008 | CN |
103491369 | Jan 2014 | CN |
WO 2014000664 | Jan 2014 | WO |
WO 2014166100 | Oct 2014 | WO |
Entry |
---|
International Search Report dated Jun. 23, 2015, issued in application No. PCT/CN2015/074391. |
Tech, G., et al.; “3D-HEVC Draft Text 3”; Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 7th Meeting: San Jose; JCT3V-G1001; Jan. 2014; pp. 1-94. |
Zhang, K., et al.; “AHG7: An efficient coding method for DLT in 3D-HEVC;” Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11; Jul.-Aug. 2013; pp. 1-5. |
Jager, F.; “CE6.H related: Results on Updating Mechanism for Coding of Depth Lookup Table (Delta-DLT);” Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11; Apr. 2013; pp. 1-13. |
Zhao, X., et al.; “AHG7: On signaling of DLT for depth coding;” Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11; Jul.-Aug. 2013; pp. 1-8. |
Number | Date | Country | |
---|---|---|---|
20180014029 A1 | Jan 2018 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 15123882 | US | |
Child | 15696260 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/CN2014/073611 | Mar 2014 | US |
Child | 15123882 | US |