This application claims the benefit, under 35 U.S.C. §365 of International Application PCT/CN2007/001297, filed Apr. 20, 2007, which was published in accordance with PCT Article 21(2) on Oct. 30, 2008 in English.
The invention relates to a method and to an apparatus for selecting a scan path for the elements of a block in spatial domain picture encoding and decoding.
International image or video coding standards like JPEG, MPEG-1/2/4 and H.261/H.263/H.264 use hybrid coding, wherein a picture is separated into pixel blocks on which predictive coding, transform coding and entropy coding are employed.
Normally, the transform coding is effective because the prediction error samples are correlated and in the transformed or frequency domain the signal energies become concentrated in partial areas of the coefficient blocks. Therefore any redundancy can be easily removed in the frequency domain. However, as disclosed in M. Narroschke, “Extending H.264/AVC by an adaptive coding of the prediction error”, Proceedings of Picture Coding Symposium, April 2006 (PCS2006), as prediction quality improves, transform coding is no longer effective in many cases because the prediction error sample values are correlated only marginally and the signal energies do not concentrate in the frequency domain. M. Narroschke proposes a spatial domain or time domain video coding in which the prediction error samples (also called ‘residues’) are directly quantised and entropy-coded, without a prior transform into the frequency domain. He further proposes using a rate-distortion optimisation (RDO) strategy for adaptively selecting whether to use spatial domain residue coding or transform coding.
Encoding quantised samples in the time domain is also disclosed in H. Schiller, “Prediction signal controlled scans for improved motion compensated video coding”, ELECTRONICS LETTERS, 4 Mar. 1993, Vol. 29, No. 5.
In the above publications, the scan of the quantised samples in the spatial domain is carried out according to the magnitude of the gradient in the prediction image, i.e. in the reconstructed reference frame, at the same spatial position.
M. Narroschke, H. G. Musmann, “Adaptive prediction error coding in spatial and frequency domain with a fixed scan in the spatial domain”, ITU-T, Question 6/SG16, document VCEG-AD07, Hangzhou, China, October 2006, discloses a fixed scan in the spatial domain.
On one hand, the adaptive spatial domain scan is vulnerable to transmission errors. Because the scan order depends on the prediction image, if previous data are lost or corrupted, the resulting error will propagate to the current block to be decoded, and will be further diffused or enlarged in subsequent pictures or frames. This kind of error propagation is worse than other kinds of error propagation so that it is unacceptable in video coding.
On the other hand, the fixed scan disclosed in the Narroschke/Musmann article, which is in fact a line-by-line scan, does not depend on previous data and thus has no error propagation problem. But such a simple fixed scan in the spatial domain reduces to some extent the performance improvement over frequency domain processing.
An optimum scan path for entropy coding should statistically scan from the sample with the greatest absolute value via decreasing absolute value samples to the sample with the smallest absolute value, whereby more non-zero samples are clustered in the beginning of the scan path while more zeros are arranged in the tail of the scan path. This reduces the number of coding bits required for encoding the zeros, and also benefits the context-based entropy encoding.
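The effect of the scan order on run-level coding can be illustrated with a minimal sketch. The sample values and the helper name `run_level_pairs` are hypothetical and chosen only for illustration, and trailing zeros are assumed to be covered by an end-of-block signal. Ordering by decreasing absolute value is the statistical ideal described above, not a scan that a decoder could reproduce without side information.

```python
def run_level_pairs(samples):
    """Convert a scanned sample sequence into (run, level) pairs.

    'run' counts the zeros preceding each non-zero 'level'. Trailing zeros
    produce no pair (assumption: an end-of-block signal covers them).
    """
    pairs, run = [], 0
    for s in samples:
        if s == 0:
            run += 1
        else:
            pairs.append((run, s))
            run = 0
    return pairs


# Hypothetical quantised residues of one 4x4 block in line-by-line order.
line_scan = [0, 0, 0, 5, 0, 0, 0, -2, 0, 0, 0, 0, 0, 0, 0, 1]
# The same samples ordered by decreasing absolute value (the ideal case).
ideal_scan = sorted(line_scan, key=abs, reverse=True)

print(run_level_pairs(line_scan))   # long runs between the levels
print(run_level_pairs(ideal_scan))  # every run is zero
```

With the ideal ordering all non-zero samples sit at the head of the path, so every 'run' is zero and the zeros collapse into the tail.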
The best scan mode in the spatial domain varies from case to case. Line-by-line scan is one choice and column-by-column scan is another choice, and zigzag scan is a third choice. However, a specific one of these scan modes does not outperform the others from a picture statistics point of view.
A problem to be solved by the invention is to provide a scan processing that improves the coding efficiency but does not introduce an error propagation problem. This problem is solved by the methods disclosed in claims 1 and 3. Apparatuses that utilise these methods are disclosed in claims 2 and 4.
This invention is related to an improved scan processing for spatial domain residue image or video coding. A fixed scan path pattern is selected adaptively for each block, in order to obtain better entropy-encoding performance without reference to previous data so that error propagation is prevented. I.e., a context-based adaptive scan mode is used.
When, for a specific original picture content, spatial domain video coding is better than frequency domain video coding, in the spatial domain usually more non-zero quantised prediction error values (more quantised prediction error values having a greater absolute value) are distributed in the outer side (i.e. near the boundary) of a current block, and/or they are clustered in a corner of the current block. Based on such statistical properties of prediction error values in the spatial domain, the invention uses a first step of scanning and encoding the quantised prediction error in the corners of the current block, and uses a second step of selecting a suitable scan mode corresponding to the result of the first step for scanning and encoding the rest of the non-zero quantised prediction error values.
In general, within a given current block, the later scan path of the samples is based on an initially scanned and coded result. The cost of this context-based adaptive scanning is an increased complexity. However, a few most probable scan modes can be pre-calculated or pre-determined or pre-defined for selection, and this increased complexity is almost negligible when compared with other processing steps in video coding.
In principle, the inventive method is suited for selecting a scan path for the elements of a block in spatial domain picture encoding, said method including the steps:
In principle the inventive apparatus is suited for selecting a scan path for the elements of a block in a spatial domain picture encoder, said apparatus including
means being adapted for determining in a current block, starting from a pre-defined corner element in said block, how many zero amplitude values of the corner elements said block contains when proceeding in clockwise direction, or as an alternative in counter-clockwise direction, and upon determining the first corner element having a non-zero amplitude, for forming a run-level value pair wherein the ‘run’ value corresponds to the number of preceding zero-amplitude corner elements in said current block and the ‘level’ value corresponds to the amplitude of said first non-zero amplitude corner element,
said means being further adapted for selecting, based on said ‘run’ value, for said current block a specific one from a group of pre-defined different scan paths for the remaining elements in said current block.
In principle, the inventive method is suited for selecting a scan path for the elements of a block in spatial domain picture decoding, said method including the steps:
In principle the inventive apparatus is suited for selecting a scan path for the elements of a block in a spatial domain picture decoder, said apparatus including
means being adapted for determining for a current block the ‘run’ value of a first run-level value pair in received scan select information (SCSI), wherein at encoder side, starting from a pre-defined corner element in the corresponding block, it was determined how many zero amplitude values of the corner elements said block contains when proceeding in clockwise direction, or as an alternative in counter-clockwise direction, and upon determining the first corner element having a non-zero amplitude, a run-level value pair was formed wherein the ‘run’ value corresponds to the number of preceding zero-amplitude corner elements in said current block and the ‘level’ value corresponds to the amplitude of said first non-zero amplitude corner element, said means being further adapted for selecting, based on said ‘run’ value, for said current block a specific one from a group of pre-defined different scan paths for the remaining elements in said current block.
Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.
Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:
An encoder block diagram similar to that in the above cited publications is shown in
Either the output from inverse transformer IT or the output from inverse time domain quantiser IQTD passes through a switch SW1 and an adder A to a motion compensated predictor step or stage MCP that outputs a predicted pixel block to the subtracting input of subtractor S, to the second input of adder A, and to the second input of motion estimator ME. Motion estimator ME calculates motion information for the current pixel or coefficient block to be encoded and controls motion compensated predictor MCP with this motion information. Motion information encoder step or stage MIENC entropy encodes this motion information and provides an encoded motion encoder output signal EMOS.
Either the output from frequency domain quantiser QFD or the output from time domain quantiser QTD passes through a switch SW2 to a video signal entropy encoder step or stage VEENC that outputs a correspondingly encoded video encoder output signal EVOS. In encoder VEENC the below described inventive scanning processing is carried out. Corresponding scan select information SCSI is also provided.
In the spatial domain block encoding process, transform and inverse transform are omitted, and the quantised prediction error values are scanned and entropy-coded. The step of scanning can be regarded as a part of the entropy encoding process. Signals EVOS, SCSI and EMOS may be combined into a bit stream that is transmitted to a corresponding decoder, or that may be stored or recorded on a storage medium.
In
Before describing the inventive scanning processing, some statistical results and the rationale of the invention are explained. Experiments have shown that, when spatial domain video coding is selected rather than frequency domain video coding, the prediction error in the spatial domain usually has the following characteristics:
a) The prediction errors having greater absolute values are usually located near the boundaries of a block and are clustered in a corner of the block.
b) Only a few, e.g. less than four, prediction error values have a significantly greater absolute value than most of the other prediction error values of the block.
In other words, the prediction error value energies are concentrated not only in their positions but also in their values. In such specific cases, spatial domain video coding is better than the widely-used frequency domain video coding because the energy of the prediction error values is already concentrated in the spatial domain. If a transform into the frequency domain were used instead, the energy would be scattered across the frequency domain block, which is not suited to optimum entropy coding.
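The scattering of energy can be checked numerically with a minimal sketch. It assumes an orthonormal 4x4 DCT-II as a stand-in for the transform (the integer transform actually used in a given codec may differ) and a hypothetical residue block with a single non-zero sample in the upper-left corner. The coefficients are shown before quantisation.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(N^4) form)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for y in range(n):
                for x in range(n):
                    acc += (block[y][x]
                            * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


# Hypothetical residue block: all energy sits in one corner sample.
residue = [[8, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
coeffs = dct2(residue)

nonzero_spatial = sum(1 for row in residue for s in row if s != 0)
nonzero_freq = sum(1 for row in coeffs for v in row if abs(v) > 1e-9)
energy_freq = sum(v * v for row in coeffs for v in row)

print(nonzero_spatial, nonzero_freq, round(energy_freq, 6))
```

The total energy is unchanged by the orthonormal transform, but it is spread from one spatial sample over all sixteen coefficients, which is the situation the passage above describes.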
These specific cases usually occur at complex edges or at some complex moving object in the picture content where block-based intra or inter prediction cannot provide a perfect prediction for all pixels of a current block, but can provide a perfect prediction of the mean value of the whole block.
For example in the block depicted in
According to the invention, in a first step or stage, the matrix elements in the corners of a time domain block are scanned in order to determine in which one of the corners the prediction error values are clustered, and the related information is encoded. Based on the result of this first step, the spatial domain video coding adaptively selects a suitable scan order to entropy encode the rest of the prediction error values.
The following shows a detailed embodiment for encoding a block including at least one non-zero quantised prediction error value.
Step 1:
In the quantised prediction error values in the corners of the current block, as shown in
Only the first (run, level) pair is concerned in this step, that is, if more than one of the four positions ‘0’, ‘1’, ‘2’ and ‘3’ in
In case all four quantised samples in the four corners are zeros, then go directly to step 2.
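The first step can be sketched as follows, assuming a square block stored as a list of rows and the corner numbering used in this description (0 = upper left, 1 = upper right, 2 = bottom right, 3 = bottom left, visited clockwise). The function name `corner_run_level` and the sample block are hypothetical; returning `None` for the all-zero case is a convention of this sketch only.

```python
def corner_run_level(block):
    """Return the first (run, level) pair over the four corners, or None.

    Corners are visited clockwise starting at the upper-left corner:
    0 = upper left, 1 = upper right, 2 = bottom right, 3 = bottom left.
    'run' is the number of zero-amplitude corners visited before the first
    non-zero corner; 'level' is the amplitude of that corner.
    """
    last = len(block) - 1
    corners = [(0, 0), (0, last), (last, last), (last, 0)]  # (row, col)
    for run, (r, c) in enumerate(corners):
        level = block[r][c]
        if level != 0:
            return run, level
    return None  # all four corner samples are zero


# Hypothetical quantised residue block clustered in the bottom-right corner.
block = [[0, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 2, 3],
         [0, 1, 3, 6]]
print(corner_run_level(block))
```

For this block the upper-left and upper-right corners are zero and the bottom-right corner holds 6, so the first pair has run 2 and level 6.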
Step 2:
According to the result of step 1, adaptively select the scan mode for the remaining quantised samples. Only a small number of fixed scan tables, e.g. five, are defined for selection.
If ‘run’ equals zero in step 1, which means (corner) position 0 (upper left corner) in
If ‘run’ equals ‘1’ in step 1, which means corner position ‘1’ (upper right corner) in
The case in
If ‘run’ equals ‘2’ in step 1, which means corner position ‘2’ (bottom-right corner) in
The case in
If ‘run’ equals ‘3’ in step 1, which means corner position ‘3’ (bottom-left corner) in
The case in
The selected pre-defined scan path may contain a section representing zigzag scanning with respect to the corner represented by the ‘run’ value.
If all four corner positions ‘0’, ‘1’, ‘2’ and ‘3’ have zero amplitude values, then the scan path is defined like in
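The selection in step 2 amounts to a table lookup keyed by the corner 'run' value. The sketch below is illustrative only: the actual scan tables are defined in the drawings and are not reproduced here, so the tables are hypothetical stand-ins built by mirroring a standard zigzag path towards the selected corner and dropping the corner positions already covered by step 1. The fifth table (all corners zero) is likewise an assumption, as are the names `zigzag`, `build_scan_tables` and `select_scan_table`.

```python
def zigzag(n):
    """Standard zigzag order over an n x n block, starting at the upper left."""
    order = []
    for d in range(2 * n - 1):
        diag = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        order.extend(diag if d % 2 else reversed(diag))
    return order


def build_scan_tables(n=4):
    """Build five hypothetical fixed scan tables (not those of the drawings)."""
    last = n - 1
    corners = [(0, 0), (0, last), (last, last), (last, 0)]
    mirrors = [
        lambda r, c: (r, c),                # start at upper-left corner
        lambda r, c: (r, last - c),         # start at upper-right corner
        lambda r, c: (last - r, last - c),  # start at bottom-right corner
        lambda r, c: (last - r, c),         # start at bottom-left corner
    ]
    base = zigzag(n)
    tables = {}
    for run, mirror in enumerate(mirrors):
        done = set(corners[: run + 1])  # corners already handled in step 1
        tables[run] = [p for p in (mirror(r, c) for r, c in base)
                       if p not in done]
    # Assumed fifth table for the case that all four corners are zero.
    tables[None] = [p for p in base if p not in set(corners)]
    return tables


def select_scan_table(tables, first_pair):
    """Pick the scan table from the first (run, level) pair (None if absent)."""
    run = None if first_pair is None else first_pair[0]
    return tables[run]


tables = build_scan_tables(4)
path = select_scan_table(tables, (2, 6))  # first non-zero corner: bottom right
print(path[:4])
```

Because the tables are fixed and the key comes from the current block only, the choice does not depend on previously decoded data.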
When an inventive decoder receives a series of (run, level) pairs for a current block, it will first decode the first (run, level) pair and thereafter, according to the value of the first ‘run’ value, select a corresponding one of the group of encoder pre-defined scan path tables. After the decoder has finished decoding all the received (run, level) pairs for the current block, it assigns the level values to the correct block positions, based on the selected scan path table.
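The decoder-side procedure can be sketched under the same conventions. The function names `decode_block` and `row_scan_table`, the 3x3 block size and the row-by-row tables are hypothetical and serve only to keep the example small; they are not the tables of the drawings. The case in which all four corners are zero is omitted, since its signalling is not specified in this passage.

```python
def decode_block(pairs, tables, n):
    """Rebuild an n x n block from its received (run, level) pairs.

    pairs[0] is the corner pair of the first step; the remaining pairs
    follow the scan table selected by that pair's 'run' value.
    'tables' maps a corner run value (0..3) to a list of (row, col) positions.
    """
    last = n - 1
    corners = [(0, 0), (0, last), (last, last), (last, 0)]
    block = [[0] * n for _ in range(n)]

    first_run, first_level = pairs[0]
    r, c = corners[first_run]
    block[r][c] = first_level

    path = tables[first_run]  # scan table selected by the first 'run'
    pos = 0
    for run, level in pairs[1:]:
        pos += run            # skip the zeros preceding this level
        r, c = path[pos]
        block[r][c] = level
        pos += 1
    return block


def row_scan_table(n, first_run):
    """Hypothetical table: row-by-row, minus corners covered by step 1."""
    last = n - 1
    corners = [(0, 0), (0, last), (last, last), (last, 0)]
    done = set(corners[: first_run + 1])
    return [(r, c) for r in range(n) for c in range(n) if (r, c) not in done]


tables = {run: row_scan_table(3, run) for run in range(4)}
pairs = [(1, 7), (0, -3), (2, 1)]  # hypothetical received pairs
block = decode_block(pairs, tables, 3)
for row in block:
    print(row)
```

The first pair places 7 in the upper-right corner and selects the table for run 1; the remaining two pairs are then positioned along that table.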
The scan path tables in
In the first step, instead of assigning run=0 to the upper left corner of the block, run=0 can also be assigned to another corner in the block. In the first step, instead of increasing the ‘corner’ run value clockwise, the corner run value can increase counter-clockwise.
The adaptive process can be extended by using more than the two steps described in the embodiments, whereby the second step (and possibly further steps) again defines, based on run and/or level values in the beginning of the initially selected scan path, which one of a group of pre-defined scan paths to select for the following scan path section.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2007/001297 | 4/20/2007 | WO | 00 | 10/16/2009 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2008/128380 | 10/30/2008 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
5654706 | Jeong | Aug 1997 | A |
5767909 | Jung | Jun 1998 | A |
5790706 | Auyeung | Aug 1998 | A |
5990960 | Murakami et al. | Nov 1999 | A |
7215707 | Lee et al. | May 2007 | B2 |
7379608 | Marpe et al. | May 2008 | B2 |
7586924 | Wiegand | Sep 2009 | B2 |
7599435 | Marpe et al. | Oct 2009 | B2 |
8009069 | Chen et al. | Aug 2011 | B2 |
8179981 | Chen et al. | May 2012 | B2 |
20100183069 | Chen et al. | Jul 2010 | A1 |
20100194610 | Chen et al. | Aug 2010 | A1 |
Number | Date | Country |
---|---|---|
2309613 | Jul 1997 | GB |
6125278 | May 1994 | JP |
9275559 | Oct 1997 | JP |
10229340 | Aug 1998 | JP |
2000138939 | May 2000 | JP |
2000278680 | Oct 2000 | JP |
WO 9800807 | Jan 1998 | WO |
Entry |
---|
Copy of Search Report Dated Nov. 20, 2007. |
Narroschke, Matthias, “Adaptive Prediction Error Coding in the Spatial and Frequency Domain in the KTA Reference Model”, ISO/IEC JTC1/SC29/WG11, MPEG Video Subgroup, Montreux, Switzerland, Apr. 3, 2006. |
Wu, Chin Hsiung et al., “Run-Length Chain Coding and Shape's Moment Computations on Arrays with Reconfigurable Optical Buses”, IEEE International Conference on Parallel Processing, Piscataway, NJ, USA, Sep. 3, 2001, pp. 479-486. |
Narroschke, Matthias et. al., “Adaptive Prediction Error Coding in Spatial and Frequency Domain with a Fixed Scan in the Spatial Domain”, ITU—Telecommunications Standardization Sector—Video Coding Experts Group, 30th Meeting, Hangzhou, China, Oct. 23, 2007. |
Narroschke, Matthias et. al., “Adaptive Prediction Error Coding in Spatial and Frequency Domain for H.264/AVC”, ITU—Telecommunications Standardization Sector—Video Coding Experts Group, 29th Meeting, Bangkok, Thailand, Jan. 16, 2006. |
Number | Date | Country | |
---|---|---|---|
20100040298 A1 | Feb 2010 | US |