The present invention relates to a video encoding and decoding device. More particularly, the invention relates to a video encoding and decoding device that utilizes the symmetry of image data to enhance encoding efficiency.
Recently, the amount of data to be transmitted for video images has been increasing steadily. As one example, the amount of data for an analog television is explained. For a current Japanese standard television, the number of pixels in the horizontal direction is 720, and the number of pixels in the vertical direction is 480. Each pixel has 8-bit luminance data and two 8-bit color-difference components. A video image of one second consists of 30 frames. Currently, a scheme is used in which the color-difference components are reduced by half in each of the horizontal and vertical directions. Thus, the amount of data for one second is 720×480×(8+8×½×½+8×½×½)×30=124416000 bits, and a transmission rate of approximately 120 Mbps is needed.
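The arithmetic above can be checked with a short calculation. The following Python sketch only restates the figures given in the text; the variable names are illustrative and are not part of the specification.

```python
# Uncompressed data rate for the 720x480, 30 frames-per-second example.
from fractions import Fraction

WIDTH, HEIGHT = 720, 480
FRAMES_PER_SECOND = 30
LUMA_BITS = 8
CHROMA_BITS = 8

# Each color-difference component is halved horizontally and vertically,
# so it contributes one quarter of its bits per pixel.
CHROMA_SCALE = Fraction(1, 2) * Fraction(1, 2)

# 8 luminance bits + two color-difference components at 8 * 1/4 bits each.
bits_per_pixel = LUMA_BITS + 2 * CHROMA_BITS * CHROMA_SCALE

bits_per_second = int(WIDTH * HEIGHT * bits_per_pixel * FRAMES_PER_SECOND)

print(bits_per_pixel)    # 12
print(bits_per_second)   # 124416000, i.e. roughly 120 Mbps
```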
Even optical fiber, which is now widely used for household broadband, however, offers a transmission rate of only approximately 100 Mbps. Thus, it is realistically impossible to transmit a video image unless the video image is compressed. The amount of data for digital terrestrial broadcasting, to which the current broadcasting will be switched in 2011, is said to be 1.5 Gbps. A high-efficiency compression technique will therefore be necessary in the future.
At present, the technique that is expected to be widely used as a standard for high-efficiency compression is H.264/AVC (hereinafter referred to as H.264). H.264 is the latest international standard for video image encoding and was developed by the Joint Video Team (JVT). JVT was jointly established in December 2001 by the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC).
H.264 was approved by ITU-T as a recommendation in May 2003. In addition, H.264 was standardized by ISO/IEC Joint Technical Committee 1 (JTC 1) in 2003 as MPEG-4 Part 10 Advanced Video Coding (AVC). Further, extension work related to color spaces and pixel gradations was carried out, and the final draft of this extension, the Fidelity Range Extensions (FRExt), was completed in July 2004.
Main features of H.264 are as follows.
It should be noted that details of H.264 are described at http://ja.wikipedia.org/wiki/H.264, which is a URL of Wikipedia (a free encyclopedia).
Hereinafter, encoding, which is performed according to H.264 that is the latest video image encoding standard, is described as Background with reference to
According to H.264, an intra prediction 104 and an inter prediction 105 are stipulated. The intra prediction 104 generates a predicted image within a frame, and the inter prediction 105 generates a predicted image between frames. Here, differences between the generated predicted images and the original image are obtained. As illustrated in
The intra prediction is a process of generating a predicted image using a correlation between adjacent pixels. In the intra prediction, the predicted image is generated on the basis of a correlation between a pixel to be predicted and the pixels located around it; the pixels used are those adjacent to the block to be predicted, from its left side around to its upper right side. As illustrated in
In general, when a transmission rate is high, encoding is performed on data corresponding to the size of an input image in a video image encoding process. On the other hand, when the transmission rate is low, a pre-filter process is performed on a video format of the input image to change the size of the image so that the input image is encoded. This process is to change the sizes of all input images to a certain size.
As described above, when the transmission rate is low and the size of the image is to be changed, downsampling is generally performed in the horizontal direction or the vertical direction. When image data is downsampled, however, some degradation of image quality is inevitable. Moreover, in a general encoding method according to H.264, encoding is performed without utilizing the symmetry of the image. Thus, all parts of the image are encoded even when the image is bilaterally symmetric, and the encoding efficiency is not necessarily high.
The present invention was devised in order to solve the aforementioned problems, and an object of the present invention is to provide a video encoding and decoding device that encodes an image, compresses an information volume, utilizes the symmetry of the image and improves the encoding efficiency without degrading image quality.
According to the present invention, image folding determination processing is performed utilizing the symmetry of an input image, and a block of one area of the input image is set to be a folding area. By setting folding points describing the folding area, only information of the folding area and the folding points is encoded.
Thus, an area to be encoded can be reduced, and the size of the image to be encoded can be changed to an arbitrary size.
After decoding, the entire image is restored from the folding area, which was the encoded area. In areas that cannot be directly restored from the folding area, however, the image is restored by performing padding from peripheral blocks.
Hereinafter, an embodiment of the present invention is described with reference to
The embodiment of the present invention is described using a configuration in which H.264 that is the latest video image encoding standard is applied to the present invention.
First, an outline of a process that is performed by an encoder included in a video encoding and decoding device according to the embodiment of the present invention is described with reference to
As illustrated in
First, the configuration of the encoder that is included in the video encoding and decoding device according to the embodiment of the present invention is described with reference to
The encoder of the video encoding and decoding device according to the embodiment of the present invention is configured by adding a folding determination circuit 400 and a folding block information adding circuit 403 to a general H.264 encoder, as illustrated in
Next, details of the process that is performed by the encoder included in the video encoding and decoding device according to the embodiment of the present invention are described with reference to
Hereinafter, steps 1 to 5 of the process to be performed by the encoder are described.
(Step 1) First, as illustrated in
(Step 2) Next, as illustrated in
In this case, as illustrated in
Specifically, the folding line is set so that a folding error S of the following Equation (1) is minimized.
The symbol a_i denotes a pixel value in the area A, and the symbol b_i denotes a pixel value in the area B. The pixel with the pixel value a_i and the pixel with the pixel value b_i are located symmetrically about the folding line. The summation Σ is taken over the absolute differences of all such pixel pairs of the area A and the area B.
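Equation (1) itself is not reproduced in this text. Going by the definitions just given (pixel values of the area A and the area B paired symmetrically about the folding line, and a sum of absolute differences), it presumably has the following form. This is a reconstruction under that assumption; the upper limit N, the number of mirrored pixel pairs, is introduced here only for illustration.

```latex
S = \sum_{i=1}^{N} \left| a_i - b_i \right| \tag{1}
```

A perfectly symmetric pair of areas gives S = 0, and S grows as the two sides of the folding line differ more.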
An area that is among two areas divided by the folding line and has a larger area than the other area is treated as a folding area. In
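The selection of the folding line and the folding area can be illustrated with a minimal Python sketch. It rests on several simplifying assumptions that are not taken from the specification: the image is a grayscale list of rows, the folding line is vertical and lies between two pixel columns, and candidates are searched exhaustively per pixel column rather than per block. The function names and the `min_overlap` parameter are hypothetical.

```python
def folding_error(image, fold_col):
    """Sum of absolute differences between pixels mirrored about the line
    that lies between columns fold_col - 1 and fold_col."""
    width = len(image[0])
    overlap = min(fold_col, width - fold_col)  # columns that have a mirror partner
    return sum(
        abs(row[fold_col - 1 - k] - row[fold_col + k])
        for row in image
        for k in range(overlap)
    )


def best_folding_line(image, min_overlap=1):
    """Return the candidate column with the smallest folding error.
    min_overlap (an assumption) excludes lines too close to the image edge."""
    width = len(image[0])
    candidates = range(min_overlap, width - min_overlap + 1)
    return min(candidates, key=lambda col: folding_error(image, col))


def folding_area(image, fold_col):
    """Return the larger of the two areas divided by the folding line.
    When both areas are equally large, the left one is chosen (an assumption)."""
    width = len(image[0])
    if fold_col >= width - fold_col:
        return [row[:fold_col] for row in image]
    return [row[fold_col:] for row in image]


image = [
    [1, 2, 3, 3, 2, 1, 9],
    [4, 5, 6, 6, 5, 4, 9],
]
fold = best_folding_line(image, min_overlap=2)
print(fold)                        # 3
print(folding_error(image, fold))  # 0
print(folding_area(image, fold))   # [[3, 2, 1, 9], [6, 5, 4, 9]]
```

In this example the right-hand area (four columns) is larger than the left-hand area (three columns), so it becomes the folding area.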
(Step 3) When the folding error is equal to or smaller than an arbitrarily set threshold, only the folding areas (for example, area A (700), area C (702) and area D (703)) are extracted, and the subsequent processes, namely orthogonal transform 401 and quantization 402, are performed on the extracted folding areas. When the folding error is larger than the threshold, no folding area is extracted and the subsequent processes are performed on the entire input image.
The arbitrarily set threshold that is used to determine whether or not a folding area is extracted may be set on the basis of a statistical decision or a transmission rate.
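The decision in Step 3 can be sketched as follows. This is a minimal Python illustration under the same simplifying assumptions as a single vertical folding line between pixel columns; the function name `select_region` and its return convention are hypothetical, and how the threshold is derived from statistics or the transmission rate is left open, as in the text.

```python
def select_region(image, fold_col, fold_error, threshold):
    """Return (folding_flag, region), where region is what is passed on to
    the orthogonal transform and quantization."""
    if fold_error <= threshold:
        width = len(image[0])
        # Keep the larger side of the folding line (left side on a tie, by assumption).
        if fold_col >= width - fold_col:
            region = [row[:fold_col] for row in image]
        else:
            region = [row[fold_col:] for row in image]
        return True, region
    # Folding error too large: encode the whole input image as usual.
    return False, image


image = [
    [1, 2, 2, 1],
    [3, 4, 4, 3],
]
print(select_region(image, 2, fold_error=0, threshold=4))
# (True, [[1, 2], [3, 4]])
print(select_region(image, 2, fold_error=10, threshold=4)[0])
# False
```

A larger threshold extracts folding areas more often and so reduces the encoded area further, at the cost of a larger mismatch after restoration.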
(Step 4) A folding processing flag 800 and block information of the folding points (600/601) are added to the quantized data, the latter as folding point block information 801. The folding processing flag 800 indicates whether or not a folding area was extracted in Step 3.
(Step 5) Encoding such as variable-length coding is performed.
As illustrated in
The folding processing flag 800 and the folding point block information 801 are added to all the encoded macro block data pieces MBi in order to perform the process at a high speed.
In Step 4, by adding the necessary information to the quantized data, it is possible to prevent information of folding block numbers from being lost or having a rounding error due to a quantization error. In addition, since necessary information on the folding can be multiplexed into video image data to be transmitted, there is an advantage that an additional transmission path does not need to be provided.
Next, the configuration of a decoder that is included in the video encoding and decoding device according to the embodiment of the present invention is described with reference to
The decoder that is included in the video encoding and decoding device according to the embodiment of the present invention is configured by adding a decoded data separation circuit 901 and a folding processing circuit 906 to a general H.264 decoder. The decoder performs the inverse of the process performed by the encoder so that the input image is restored.
Next, details of the process to be performed by the decoder included in the video encoding and decoding device according to the embodiment of the present invention are described with reference to
Steps 5 to 7 of the process to be performed by the decoder are described.
(Step 5) First, decoding 900 is performed.
(Step 6) Next, the decoded data separation circuit 901 extracts the folding block information from the decoded data and transmits the extracted information to the folding processing circuit 906. The remaining data, that is, the quantization coefficient data, is subjected to inverse quantization 902 and the subsequent processes.
(Step 7) Next, on the basis of the folding block information transmitted from the decoded data separation circuit 901, the folding processing is performed on a reconfigured image 905, which is decoded from the inversely orthogonally transformed data and the predicted image data, so that the image is restored.
Next, details of the folding processing performed in Step 7 are described. As illustrated in
When the image data is folded in the folding processing, some mirrored areas may extend beyond the size of the input image. Data on such extending areas (parts indicated by diamond symbols) is not used to restore the image.
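The folding restoration, including the discarding of data that extends beyond the image, can be sketched as follows. As a simplifying assumption not taken from the specification, the folding line is vertical and lies between pixel columns `fold_col - 1` and `fold_col`; the function name `unfold` and the `kept_left` parameter are hypothetical. Pixels that could not be restored would remain `None` and would have to be handled separately (for example by padding), which this sketch does not attempt.

```python
def unfold(area, fold_col, width, kept_left):
    """Rebuild full-width rows from a folding area by mirroring it about
    the folding line. kept_left tells on which side the folding area lies."""
    restored = []
    start = 0 if kept_left else fold_col
    for area_row in area:
        row = [None] * width
        for offset, value in enumerate(area_row):
            x = start + offset
            row[x] = value                   # the folding area itself
            mirror_x = 2 * fold_col - 1 - x  # position reflected about the line
            if 0 <= mirror_x < width:
                row[mirror_x] = value        # restored by folding
            # Otherwise the mirrored position lies outside the image
            # and the data is simply not used.
        restored.append(row)
    return restored


area = [
    [3, 2, 1, 9],
    [6, 5, 4, 9],
]
print(unfold(area, fold_col=3, width=7, kept_left=False))
# [[1, 2, 3, 3, 2, 1, 9], [4, 5, 6, 6, 5, 4, 9]]
```

In this example the rightmost column of the folding area would be mirrored to column -1, which is outside the image, so it is dropped. With a single vertical line and the larger side kept, every pixel on the smaller side is restored; areas that cannot be restored arise only with other folding geometries.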
Thus, parts (parts indicated by circle symbols and triangle symbols, areas E (704) and F (705) in
Alternatively, the parts (parts indicated by circle symbols and triangle symbols) that cannot be restored by the folding processing are encoded in accordance with the determination processes in a normal manner and transmitted from the encoder.
The above determination processes are as follows:
In this case, the padding can be controlled by adding, to the block information that is added to the encoded data, a flag indicating whether or not the parts subjected to the padding are transmitted.
In the encoding of Step 5, a reconfigured image holds only the folding areas. In this case, when an area other than the folding areas is referenced in the intra prediction or the inter prediction, the folding processing is performed so that the referenced area is restored.
According to the present embodiment, since the size of the input image can be changed to an arbitrary size, information can be fed back for control of the amount of encoded data that is generated, and the quality of the image can be improved. In particular, the video encoding and decoding device according to the present embodiment can significantly improve the efficiency of encoding an input image that has symmetry. When an image has symmetry and is complex, the amount of image data to be transmitted can be reduced. It is, therefore, possible to prevent the quality of the image from being degraded due to a quantization error.
Whether or not the process is performed by the video encoding and decoding device according to the present invention can be determined by analyzing the stream, since the size of the encoded image differs from that in a conventional technique.
According to the present invention, the video encoding and decoding device that encodes an image and thereby compresses the amount of information can be provided, which uses the symmetry of the image and improves the encoding efficiency without degrading image quality.
101 . . . Input image, 102 . . . Orthogonal transform, 103 . . . Quantization, 104 . . . Intra prediction, 105 . . . Inter prediction, 106 . . . Reconfigured image, 107 . . . Filter, 108 . . . Inverse orthogonal transform, 109 . . . Inverse quantization, 110 . . . Encoding,
400 . . . Folding determination circuit, 401 . . . Orthogonal transform, 402 . . . Quantization, 403 . . . Folding block information adding circuit, 404 . . . Encoding, 405 . . . Intra prediction, 406 . . . Inter prediction, 407 . . . Reconfigured image, 408 . . . Filter, 409 . . . Inverse orthogonal transform, 410 . . . Inverse quantization,
900 . . . Decoding, 901 . . . Decoded data separation circuit, 902 . . . Inverse quantization, 903 . . . Inverse orthogonal transform, 904 . . . Filter, 905 . . . Reconfigured image, 906 . . . Folding processing circuit
Number | Date | Country | Kind |
---|---|---|---|
2009-264090 | Nov 2009 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2010/067677 | 10/7/2010 | WO | 00 | 6/22/2012 |