Embodiments of the present invention generally relate to a method and apparatus for parallelizing context selection in video processing.
Description of the Related Art
In video coding standards, such as H.264/AVC, context modeling is a popular approach used in entropy coding to improve coding efficiency. Context modeling involves selecting a context, which determines the probability used to encode binary symbols. The context selection is sequential and time consuming. Since there are many factors that impact the context selection, such as values of other binary symbols that impact the context selection for the current binary symbol, context selection is difficult to parallelize, particularly during decoding. Parallelizing context selection would result in a more efficient process, cost reduction and potentially better performance.
Therefore, there is a need for a method and/or apparatus for parallelizing context selection.
Embodiments of the present invention relate to a method and apparatus for parallel processing of at least two bins of the transform coefficient information (e.g., significance map and coefficient levels) relating to at least one of a video and an image. The method includes determining a scan type of at least a portion of the at least one of a video and an image, analyzing a neighboring frequency position of a coefficient within a transform unit of a bin, removing dependencies of context selection based on the scan type and the position of the location being encoded in a transform, and performing parallel processing of the at least two bins.
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
It is beneficial to improve parallel processing capabilities during entropy coding of transform information while maintaining high coding efficiency. Parallel processing is important for high performance and for reducing power consumption, such as reducing the frequency requirement or operational voltage. As such, here the term “position” is intended to refer to the frequency position of a coefficient within a transform unit.
The significance map indicates the location of the non-zero transform coefficients. For improved coding efficiency, the context selection can depend on the value of the coefficients in neighboring positions, for example, the left, top-left and top directions in the transform.
Dependencies in context selection are a challenge for parallel decoding of two or more bins at a time. For example, when switching from one diagonal to another in zig-zag scanning, the context selection for the current bin depends on the value of the previously decoded bin, as shown in
There are additional dependencies for decoding 3 or 4 bins in parallel.
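By way of illustration only, the dependencies described above can be located with a short sketch. It assumes a three-neighbor template (left, top-left and top) and a forward zig-zag scan of a 4×4 block; the function names `zigzag_scan` and `conflicting_positions` are hypothetical, and the check is deliberately conservative in that it flags any position whose template contains one of the previous N-1 bins in scan order.

```python
# Assumed neighbor template: left, top-left and top, as (row, col) offsets.
NEIGHBOR_OFFSETS = ((0, -1), (-1, -1), (-1, 0))


def zigzag_scan(n):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for d in range(2 * n - 1):
        # All positions on anti-diagonal d, listed from top-right to bottom-left.
        diagonal = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Alternate the traversal direction on every anti-diagonal.
        order.extend(diagonal if d % 2 else reversed(diagonal))
    return order


def conflicting_positions(n, bins_per_cycle):
    """Map each position to the recently scanned bins found in its template.

    A position is reported when any of the previous (bins_per_cycle - 1)
    positions in scan order is one of its template neighbors, i.e. when the
    bins could not be decoded together without speculation.
    """
    order = zigzag_scan(n)
    conflicts = {}
    for i, (r, c) in enumerate(order):
        neighbors = {(r + dr, c + dc) for dr, dc in NEIGHBOR_OFFSETS}
        recent = order[max(0, i - (bins_per_cycle - 1)):i]
        hits = [p for p in recent if p in neighbors]
        if hits:
            conflicts[(r, c)] = hits
    return conflicts


if __name__ == "__main__":
    for n_bins in (2, 3, 4):
        print(n_bins, sorted(conflicting_positions(4, n_bins)))
```

Under these assumptions, the positions flagged for two bins per cycle all lie where the scan moves from one diagonal to the next along an edge of the block, and the set of flagged positions grows as more bins are processed per cycle.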
There are many ways to calculate the context index. In one embodiment, neighbors may be used to calculate the context index: when there are 5 neighbors, for each of the five neighbors, determine whether its value is greater than 1. The number of neighbors that are greater than 1 is used to calculate the context index. Hence, the sum can range between 0 and 5. This sum can then be mapped to a context index. When neighbors are removed so that only 4 remain, either the sum can range from 0 to 4, or a neighbor can be double counted so that the sum can range from 0 to 5.
Removed neighbors are not included in the calculation of the context index. So, with 4 neighbors, the number of neighbors that are greater than 1 may range between 0 and 4, and this sum is mapped to a context index. In one embodiment, a neighbor may be counted twice. In such a case, though only 4 neighbors are being considered, a sum ranging between 0 and 5 is expected. In other embodiments, the number of neighbors may be truncated to stay below a specific number of neighbors for consideration.
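The counting rule above may be sketched as follows. This is a minimal illustration, not a normative derivation: the names `count_above_one`, `context_index` and `CONTEXT_TABLE` are hypothetical, and the identity mapping from the sum to a context index is an assumption made only to keep the example small.

```python
# Hypothetical mapping from the neighbor count (0..5) to a context index.
# An identity table is assumed here; a real codec may use any mapping.
CONTEXT_TABLE = (0, 1, 2, 3, 4, 5)


def count_above_one(neighbor_values, double_count=None):
    """Count the neighbors whose value is greater than 1.

    neighbor_values holds only the neighbors that remain in the template;
    removed neighbors are simply not passed in. If double_count is given,
    the neighbor at that index contributes twice to the sum.
    """
    total = sum(1 for value in neighbor_values if value > 1)
    if double_count is not None and neighbor_values[double_count] > 1:
        total += 1
    return total


def context_index(neighbor_values, double_count=None):
    """Map the neighbor count to a context index through CONTEXT_TABLE."""
    return CONTEXT_TABLE[count_above_one(neighbor_values, double_count)]


if __name__ == "__main__":
    print(context_index([3, 0, 2, 1, 5]))              # five neighbors
    print(context_index([2, 2, 2, 2]))                 # four neighbors
    print(context_index([2, 2, 2, 2], double_count=0)) # nearest counted twice
```

With five neighbors the sum spans 0 to 5; with four neighbors it spans 0 to 4 unless one neighbor is double counted, in which case the upper end of the range is again 5.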
As such, reducing context dependencies for the significant coefficient flag in positions highlighted in grey of
For example, the dependency may be removed for the highlighted blue positions or for all the positions in the first row and first column. In one embodiment, the coding loss for reducing the dependency for context selection of the significant coefficient flag in all positions of the first row and first column is 0.1%, which was measured under common test conditions for HEVC with HM-2.0. In another embodiment, the coding loss for reduced dependency for only the grey positions is 0.1 to 0.3%. An additional benefit of reducing the dependency for all positions in the first row and first column is that fewer checks need to be done to identify the positions with reduced dependency.
Hence, removing context selection dependencies that cause dependencies between recently processed bins or bins that may be processed in parallel is beneficial to minimize complexity, facilitate parallel processing and improve performance. As such, dependencies may be removed or reduced from all positions in the rows or columns near the edges, or for a subset of positions. Another example, in addition to the ones given in
Accordingly, in one embodiment, the mask is modified to facilitate parallel decoding. The modified mask may exclude a pixel location that has been calculated. For example, when using zig-zag scans, the mask has a new shape with position X, of
Thus, in one embodiment, dependencies on neighbors for context selection of the significant coefficient flag are reduced at positions near the edge of the transform when switching from one diagonal to another. Specifically, dependency between N bins being processed in parallel is undesirable since it requires speculative computation. When those N bins occupy different diagonals, the dependency on the top and/or left neighbor should be removed. Such an approach may be applied to any type of scan, such as zig-zag (forward or reverse), diagonal, sub-block diagonal, vertical, or horizontal. For example, in the case of a vertical scan, if the N bins occupy different columns, or in the case of a horizontal scan, if the N bins occupy different rows, the dependency between bins is removed.
The reduction of neighboring dependency depending on position extends to other syntax elements with neighboring dependencies for context selection, such as syntax elements that describe coefficient levels (coeff_abs_level_greater1_flag, coeff_abs_level_greater2_flag). Furthermore, a neighbor may be weighted higher than another to account for the removed neighbors. For example, a nearest neighbor can be double counted to account for the effect of the removed neighbor. Similarly, extra neighbors may be truncated to maintain a specific number of context selections.
In one embodiment, a reduced dependency can be assigned to specific positions in each row/column, or it may be assigned to all positions within a given row or column in order to reduce the checks and logic required to identify the position. In one embodiment, dependencies are removed only at those positions that are affected by the wrapping of N bins across different diagonals. In other words, the template of the neighboring dependency may change depending on the position within the significance map.
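A position-dependent template of this kind may be sketched as below. The sketch assumes a three-neighbor template (left, top-left and top) and handles only the first row and first column; the function name `neighbor_template`, the parameter `reduced_positions`, and the choice to drop the left neighbor in the first row and the top neighbor in the first column are illustrative assumptions rather than a required implementation.

```python
LEFT, TOP_LEFT, TOP = (0, -1), (-1, -1), (-1, 0)
FULL_TEMPLATE = (LEFT, TOP_LEFT, TOP)  # assumed three-neighbor template


def neighbor_template(row, col, reduced_positions=None):
    """Return the neighbor offsets used for context selection at (row, col).

    With reduced_positions=None the reduction applies to every position in
    the first row and first column, which needs only one cheap check.
    Otherwise the reduction applies only to the listed (row, col) positions.
    Only first-row and first-column positions are handled by this sketch.
    """
    if reduced_positions is None:
        reduced = row == 0 or col == 0
    else:
        reduced = (row, col) in reduced_positions

    dropped = set()
    if reduced and row == 0:
        dropped.add(LEFT)  # assumed: drop the left neighbor along the top edge
    if reduced and col == 0:
        dropped.add(TOP)   # assumed: drop the top neighbor along the left edge

    # Keep offsets that were not dropped and that stay inside the block.
    return tuple(
        offset for offset in FULL_TEMPLATE
        if offset not in dropped and row + offset[0] >= 0 and col + offset[1] >= 0
    )


if __name__ == "__main__":
    print(neighbor_template(1, 1))                               # interior
    print(neighbor_template(0, 2))                               # whole-edge rule
    print(neighbor_template(0, 2, reduced_positions={(0, 3)}))   # selective rule
```

The whole-edge rule trades a small amount of context information for a simpler position check, while the selective rule keeps the full template wherever no dependency between parallel bins arises.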
In yet another embodiment, reduction of neighboring dependency may be based on position corresponding to other syntax elements where there are neighboring dependencies for context selection, such as coefficient levels. Certain neighbors can be weighted higher to account for the removed neighbors. For example, the nearest neighbor can be double counted to account for the effect of the removed neighbor.
Removing neighboring dependency can be applied to embodiments that utilize any scan type, such as forward zig-zag, reverse zig-zag, diagonal, sub-block diagonal, vertical, horizontal and the like. For example, when using a reverse diagonal scan, the neighboring dependencies are modified to be in the bottom right corner. While the diagonal scan addresses dependencies at some of the transform edges, there still remain dependencies near the corners of the significance map.
As such, since the number of positions in a 4×4 sub-block is a multiple of 4, when processing 4 bins per cycle, the positions do not shift. Thus, positions may be mapped directly to a bin order within a cycle; for example, position 4 is the 1st bin, position 5 is the 2nd bin, position 6 is the 3rd bin, and position 7 is the 4th bin; this would be consistent across all cycles. As a result, only dependencies within a group matter for enabling parallel processing. Note that dependencies across groups matter less. For example, in
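The fixed mapping from scan position to bin order within a cycle can be illustrated with a small sketch. It assumes 4 bins per cycle and 16 positions per 4×4 sub-block, with positions numbered from 0 in scan order; the helper names `bin_slot`, `cycle_of` and `same_group` are hypothetical.

```python
BINS_PER_CYCLE = 4        # assumed number of bins processed per cycle
POSITIONS_PER_BLOCK = 16  # a 4x4 sub-block holds 16 scan positions


def bin_slot(position):
    """Return the 0-based slot of a scan position within its cycle."""
    return position % BINS_PER_CYCLE


def cycle_of(position):
    """Return the index of the cycle that processes a scan position."""
    return position // BINS_PER_CYCLE


def same_group(position_a, position_b):
    """Return True when two scan positions are processed in the same cycle."""
    return cycle_of(position_a) == cycle_of(position_b)


if __name__ == "__main__":
    # Positions 4..7 occupy slots 0..3, i.e. the 1st to 4th bin of one cycle.
    print([bin_slot(p) for p in range(4, 8)])
    # Because 16 is a multiple of 4, the slot pattern repeats unchanged in
    # every sub-block: position p of any sub-block lands in slot p % 4.
    for block in range(3):
        offset = block * POSITIONS_PER_BLOCK
        print([bin_slot(offset + p) for p in range(POSITIONS_PER_BLOCK)])
```

Since the slot depends only on the position within the sub-block, a check such as `same_group` is sufficient to decide whether a neighbor dependency falls inside a group of bins processed together or crosses into an earlier group.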
The method and apparatus for parallel video processing may be utilized in systems such as broadcast systems, video on demand systems, home cinema, surveillance, real-time communications, video chat/conference, telepresence, mobile streaming, mobile broadcast, mobile communications, storage and playback video, camcorders, cameras, DVD players, internet streaming, internet download, internet play, remote video presentation, remote computer graphics display, and the like.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
This application is a continuation of prior U.S. patent application Ser. No. 16/155,342, filed on Oct. 9, 2018, which is a continuation of prior U.S. patent application Ser. No. 13/415,550, filed Mar. 8, 2012 (now U.S. Pat. No. 10,142,637), which claims benefit of U.S. provisional patent application Ser. No. 61/450,253 filed Mar. 8, 2011, U.S. provisional patent application Ser. No. 61/453,231 filed Mar. 16, 2011, U.S. provisional patent application Ser. No. 61/560,565 filed Nov. 16, 2011, U.S. provisional patent application Ser. No. 61/564,121 filed on Nov. 28, 2011, U.S. provisional patent application Ser. No. 61/583,351 filed on Jan. 5, 2012, U.S. provisional patent application Ser. No. 61/587,492 filed on Jan. 17, 2012, U.S. provisional patent application Ser. No. 61/588,476 filed on Jan. 19, 2012, and U.S. provisional patent application Ser. No. 61/595,277 filed on Feb. 6, 2012, which are herein incorporated by reference in their entireties.
Number | Date | Country |
---|---|---|
20210014510 A1 | Jan 2021 | US |

Number | Date | Country |
---|---|---|
61595277 | Feb 2012 | US |
61588476 | Jan 2012 | US |
61587492 | Jan 2012 | US |
61583351 | Jan 2012 | US |
61564121 | Nov 2011 | US |
61560565 | Nov 2011 | US |
61453231 | Mar 2011 | US |
61450253 | Mar 2011 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 16155342 | Oct 2018 | US |
Child | 17034806 | | US |
Parent | 13415550 | Mar 2012 | US |
Child | 16155342 | | US |