This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2008-267862, filed on Oct. 16, 2008, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are directed to a transcoding device and a transcoding method for re-encoding an image encoded in one encoding format into another encoding format.
Video data generally contains a large amount of information, and storing such data in a medium or transmitting it over a network without modification would be very costly. Therefore, many technical developments and standardization efforts have been made to compression-code video data using lossless or lossy compression techniques. Typical examples include MPEG-1, MPEG-2, MPEG-4, and MPEG-4 AVC/H.264, all of which are standardized by the Moving Picture Experts Group (MPEG).
In these standards, inter-frame motion prediction is adopted in coding. In coding with inter-frame motion prediction, highly correlated portions of frames are searched for, and the positional differences (motion vectors) and pixel value differences (prediction errors) of those portions are coded.
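As a rough illustration of this principle, the following Python sketch (hypothetical code, not taken from any of the above standards) performs a full block-matching search for one 16x16 block and represents the block by a motion vector and a prediction error:

```python
import numpy as np

def motion_predict(cur_block, ref_frame, bx, by, search=8):
    """Toy full search for one 16x16 block located at (bx, by) of the current frame."""
    h, w = ref_frame.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + 16 > w or y + 16 > h:
                continue
            cand = ref_frame[y:y + 16, x:x + 16].astype(int)
            sad = np.abs(cur_block.astype(int) - cand).sum()  # sum of absolute differences
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    dx, dy = best_mv
    pred = ref_frame[by + dy:by + dy + 16, bx + dx:bx + dx + 16].astype(int)
    # What is actually coded: the positional difference (motion vector)
    # and the pixel value difference (prediction error).
    return best_mv, cur_block.astype(int) - pred
```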
In recent years, as various encoding methods for video data have been developed and the devices that perform recording, transmission, and display have diversified, a transcoding function for converting video data encoded in one encoding format into video data of another encoding format, for example, converting MPEG-2 format video data into MPEG-4 AVC/H.264 format video data, has come to be highly required.
For example, when video data of the MPEG-2 format is converted into video data of the MPEG-4 AVC/H.264 format, a transcoding device having such a transcoding function decodes the video data of the MPEG-2 format and then encodes the decoded video data into MPEG-4 AVC/H.264 video data, which can be compressed further.
In other words, the transcoding device has a tandem connection structure, in which a decoder and an encoder operate independently from each other. For this reason, the transcoding device has two frame memories, one for the decoder and one for the encoder, each storing therein reference images (see
Specifically, a transcoding device 200 includes a format-1 (for example, MPEG-2 format) decoder 30 and a format-2 (for example, MPEG-4 AVC/H.264 format) encoder 40 as illustrated in
The format-1 decoder 30 of the transcoding device reads a small-area image, which is indicated by the motion vector MV1, from a frame memory 30d for decoder, and performs motion compensation with a motion compensating unit 30e by using the read small-area image as a reference image. (Generally, each macroblock can have two motion vectors, one for forward motion compensation and one for backward motion compensation, and each motion vector can refer to a different reference image.) Then, the format-1 decoder 30 of the transcoding device adds the prediction error obtained after the inverse quantization and inverse DCT operation to the reference image having been subjected to the motion compensation, and generates a decoded image. The format-1 decoder 30 outputs the decoded image to the format-2 encoder 40 of the transcoding device and also stores the decoded image in the frame memory 30d for decoder.
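Assuming, for simplicity, a single forward motion vector per 16x16 macroblock and integer-pel accuracy, the decoder-side reconstruction described above might be sketched as follows; the function and variable names are hypothetical:

```python
import numpy as np

def decode_macroblock(residual, mv1, frame_memory_dec, bx, by):
    """Reconstruct one macroblock in the format-1 decoder 30 (simplified sketch).

    residual         : prediction error after inverse quantization and inverse DCT
    mv1              : decoded motion vector MV1 = (dx, dy), integer-pel here
    frame_memory_dec : reference frame held in the frame memory 30d for decoder
    (bx, by)         : top-left corner of the macroblock in the frame
    """
    dx, dy = mv1
    # Motion compensating unit 30e: read the small-area image indicated by MV1.
    reference = frame_memory_dec[by + dy:by + dy + 16, bx + dx:bx + dx + 16].astype(int)
    # Adder: residual plus motion-compensated reference gives the decoded image.
    decoded = np.clip(reference + residual, 0, 255).astype(np.uint8)
    return decoded  # output to the format-2 encoder 40 and stored back in the frame memory
```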
Next, the format-2 encoder 40 of the transcoding device searches for a motion vector with a motion searching unit 40a and determines a motion vector MV2, by using the received decoded image of the first encoding format as an input image and using a local decoded image of the second encoding format stored in a frame memory 40h for encoder as a reference image.
Next, a motion compensating unit 40c performs motion compensation on the reference image stored in the frame memory 40h for encoder by using the motion vector MV2 found by the motion searching unit 40a. Then, a subtractor 40b subtracts the predictive image output from the motion compensating unit 40c from the decoded image of the first encoding format to obtain a predictive error signal.
Next, a DCT/quantization unit 40d performs a DCT operation and quantization on the predictive error signal output from the subtractor 40b to obtain a quantized DCT coefficient. The quantized DCT coefficient output from the DCT/quantization unit 40d is variable-length encoded by a variable length encoding unit 40e and output as encoded data of the second format.
Moreover, the format-2 encoder 40 of the transcoding device performs inverse quantization and inverse DCT operation on the quantized DCT coefficient output from the DCT/quantization unit 40d by using an inverse quantization/inverse DCT unit 40f, and obtains a re-predictive error signal. Next, an adder 40g adds the re-predictive error signal to the predictive image to obtain a local decoded image of the second encoding format, and stores the local decoded image in the frame memory 40h for encoder.
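A minimal sketch of this local decoding loop in the format-2 encoder 40 is given below, assuming the motion vector MV2 has already been determined by the motion searching unit 40a, and using a flat quantization step and a whole-macroblock DCT purely for brevity (a real encoder transforms 8x8 or smaller blocks and entropy-codes the result):

```python
import numpy as np
from scipy.fft import dctn, idctn

Q = 16  # hypothetical flat quantization step

def encode_macroblock(decoded_mb, mv2, frame_memory_enc, bx, by):
    """Simplified format-2 encoding of one 16x16 macroblock with MV2 already given.

    decoded_mb       : decoded image of the first format (input image), 16x16
    frame_memory_enc : local decoded image of the second format (frame memory 40h)
    """
    dx, dy = mv2
    # Motion compensating unit 40c: predictive image indicated by MV2.
    pred = frame_memory_enc[by + dy:by + dy + 16, bx + dx:bx + dx + 16].astype(int)
    error = decoded_mb.astype(int) - pred                   # subtractor 40b
    coeff = np.round(dctn(error, norm='ortho') / Q)         # DCT/quantization unit 40d
    recon_error = idctn(coeff * Q, norm='ortho')            # inverse quantization/inverse DCT 40f
    local_decoded = np.clip(pred + recon_error, 0, 255).astype(np.uint8)  # adder 40g
    # coeff and mv2 would go to the variable length encoding unit 40e; local_decoded
    # is stored back into the frame memory 40h for encoder as a future reference.
    return coeff, local_decoded
```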
In order to reduce the amount of memory used for the encoder in the transcoding device 200 as described above, a method of recalculating a motion vector based on a motion vector included in the encoded video data before transcoding has been known, as disclosed in, for example, Japanese Laid-open Patent Publication No. 2003-9158. The structure of that transcoding device is similar to that of
Specifically, the motion searching unit of the format-2 encoder of that transcoding device receives, from the format-1 decoder, the motion vector included in the encoded video data before transcoding, narrows the search range to a region centered around the point indicated by the motion vector, and re-searches for a motion vector by using the reference image stored in the frame memory for encoder.
However, in the technique in which the decoder and the encoder operate independently from each other as described above, the decoder and the encoder have their respective frame memories, and motion compensation in the decoder and motion search and compensation in the encoder are performed independently. Therefore, this conventional technique has a problem in that the circuit scale, the amount of memory used, and the memory bandwidth increase, and thus the product cost becomes high.
Similarly, in the technique of recalculating a motion vector based on a motion vector included in the encoded video data before transcoding, the decoder and the encoder have their respective frame memories, and a motion vector is re-searched by using the reference image stored in the frame memory for encoder. Therefore, this conventional technique also has a problem in that the circuit scale, the amount of memory used, and the memory bandwidth increase, and thus the product cost becomes high.
Moreover, in a technique for reducing the number of the above-described frame memories, the encoder performs the motion compensation process by using motion vector information included in the encoded video data before transcoding. Accordingly, for example, when video data in the MPEG-2 format is converted into video data in the MPEG-4 AVC/H.264 format, motion compensation is performed not with a high-accuracy MPEG-4 AVC/H.264 motion vector but with the MPEG-2 motion vector used without conversion. As a result, there is a problem in that the image quality of the transcoded moving image degrades.
According to an aspect of the invention, a transcoding device includes a decoding unit that decodes both motion vectors of macroblocks and images from encoded images in a first encoding format; a first decoded image storing unit that stores therein the decoded motion vectors of macroblocks and the decoded images generated by the decoding unit; a vector searching unit that searches for motion vectors of macroblocks in a second encoding format by using the decoded images stored in the first decoded image storing unit as reference images and by using the decoded motion vectors of macroblocks stored in the first decoded image storing unit; and a motion compensating unit that reads, from the first decoded image storing unit, areas in the decoded images, which are indicated by the motion vectors of macroblocks for which the vector searching unit has searched, and performs motion compensation by using the areas in the decoded images and the motion vectors of macroblocks for which the vector searching unit has searched.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Preferred embodiments of a transcoding device and a transcoding method according to the present invention will be explained in detail below with reference to the accompanying drawings.
In the explanation below, a structure of a transcoding device according to a first embodiment of the present invention will be explained at first, and the process performed thereby will then follow. Finally, effects of the first embodiment will be described.
(Structure of Transcoding Device)
A structure of a transcoding device 100 will be explained with reference to
As illustrated in
The format-1 decoder 10 is a decoder that can singly perform a decoding operation. The format-1 decoder 10 decodes motion vectors of macroblocks in encoded images encoded in a first encoding format and also decodes the encoded images. Particularly, the format-1 decoder 10 includes a variable length decoding unit 10a, an inverse quantization/inverse DCT unit 10b, an adder 10c, a motion compensating unit 10d, a buffer 10e, and a frame memory 10f.
The variable length decoding unit 10a receives video data encoded in the first encoding format. Then, the variable length decoding unit 10a performs variable length decoding on a quantized DCT coefficient and a motion vector (MV1) for each macroblock of the received video data. In this description, each macroblock is assumed to have only a forward motion vector. After that, the variable length decoding unit 10a transmits the quantized DCT coefficient to the inverse quantization/inverse DCT unit 10b and transmits the motion vector (MV1) to the motion compensating unit 10d and a motion searching unit 20a to be described below.
The inverse quantization/inverse DCT unit 10b performs inverse quantization and an inverse DCT operation on the quantized DCT coefficient that was obtained through the DCT operation and quantization of the first encoding format. Then, the inverse quantization/inverse DCT unit 10b outputs the result thereof to the adder 10c.
The adder 10c adds the output of the inverse quantization/inverse DCT unit 10b to a result of a first motion compensation process output from the motion compensating unit 10d and generates a decoded image for each macroblock. Then, the adder 10c outputs the generated decoded images to the format-2 encoder 20.
The motion compensating unit 10d performs motion compensation by using a small-area image indicated by the motion vector (MV1) in a reference image output from the buffer 10e to be described below.
The buffer 10e has a higher reading speed than that of the frame memory 10f. The buffer 10e acquires, from the frame memory 10f, the small area indicated by the motion vector (MV1) output from the variable length decoding unit 10a and an area having a width of Δ pixels apart from the top, bottom, left, and right of the small area (see
Moreover, the motion searching unit 20a to be described below reads, from the buffer 10e, the small area indicated by the motion vector (MV1) and the area having a width of Δ pixels apart from the top, bottom, left, and right of the small area. In addition, a motion compensating unit 20c to be described below reads, from the buffer 10e, a small area indicated by a motion vector (MV2) for which the motion searching unit 20a searches.
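A sketch of how the buffer 10e might be filled is given below, with hypothetical names, an integer-pel MV1, and the margin Δ treated as a design parameter; the returned origin is used later to convert frame coordinates into buffer coordinates:

```python
import numpy as np

def fill_buffer(frame_memory, mv1, bx, by, delta):
    """Copy into the buffer 10e the small area indicated by MV1 plus a margin of
    delta pixels on the top, bottom, left, and right (clipped at the frame edge)."""
    h, w = frame_memory.shape
    dx, dy = mv1
    x0 = max(bx + dx - delta, 0)
    y0 = max(by + dy - delta, 0)
    x1 = min(bx + dx + 16 + delta, w)
    y1 = min(by + dy + 16 + delta, h)
    # The origin (x0, y0) lets later reads convert frame coordinates to buffer coordinates.
    return frame_memory[y0:y1, x0:x1].copy(), (x0, y0)
```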
The frame memory 10f stores therein the decoded motion vector and the generated decoded images. Specifically, the frame memory 10f stores therein a decoded result of video data encoded in the first encoding format for each macroblock.
In this case, the frame memory 10f is shared by the format-1 decoder 10 and the format-2 encoder 20. In other words, the format-2 encoder 20 re-searches for the motion vector of a macroblock by using the decoded images stored in the frame memory for decoder as reference images, and performs motion compensation (this will be explained in detail later with reference to
The format-2 encoder 20 is an encoder that can singly perform an encoding operation. Particularly, the format-2 encoder 20 includes the motion searching unit 20a, a subtractor 20b, the motion compensating unit 20c, a DCT/quantization unit 20d, and a variable length encoding unit 20e.
The motion searching unit 20a searches for a re-search motion vector of a macroblock by using the stored decoded image as a reference image and using the motion vector of the same macroblock stored in the frame memory 10f (or the buffer 10e).
Specifically, the motion searching unit 20a receives the decoded images of the first encoding format output from the adder 10c, reads, from the buffer 10e, the small area (macroblock) indicated by the motion vector (MV1) and the area having a width of Δ pixels apart from the top, bottom, left, and right of the small area, and receives the motion vector (MV1) from the variable length decoding unit 10a.
Then, the motion searching unit 20a performs a motion vector search by using the decoded image of the first encoding format as an input image, using the small area indicated by the motion vector and the area having a width of Δ pixels apart from the top, bottom, left, and right of the small area as a reference image, and using the motion vector output from the variable length decoding unit 10a as a search start point. After that, the motion searching unit 20a determines the searched-for motion vector (MV2) and outputs the motion vector (MV2) to the motion compensating unit 20c.
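The re-search can be sketched as a SAD-based search over candidate vectors within ±Δ pixels of MV1, using the decoded image of the first format as the block to be predicted and the extended small area from the buffer 10e (as returned by the hypothetical fill_buffer sketch above) as the reference image. Integer-pel candidates are used here for simplicity, whereas the actual encoder refines the vector to the sub-pel accuracy of the second format:

```python
import numpy as np

def re_search_mv2(decoded_mb, ext_area, ext_origin, mv1, bx, by, delta):
    """Re-search for MV2 around the search start point MV1 (motion searching unit 20a).

    decoded_mb : 16x16 decoded image of the first format (block to be predicted)
    ext_area   : extended small area read from the buffer 10e
    ext_origin : frame coordinates (x0, y0) of the top-left corner of ext_area
    """
    x0, y0 = ext_origin
    best_sad, mv2 = None, mv1
    for dy in range(mv1[1] - delta, mv1[1] + delta + 1):
        for dx in range(mv1[0] - delta, mv1[0] + delta + 1):
            # Candidate reference block position inside ext_area (buffer coordinates).
            rx, ry = bx + dx - x0, by + dy - y0
            if rx < 0 or ry < 0 or rx + 16 > ext_area.shape[1] or ry + 16 > ext_area.shape[0]:
                continue
            cand = ext_area[ry:ry + 16, rx:rx + 16].astype(int)
            sad = np.abs(decoded_mb.astype(int) - cand).sum()
            if best_sad is None or sad < best_sad:
                best_sad, mv2 = sad, (dx, dy)
    return mv2
```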
A specific example will now be explained. When MPEG-2 format video data is converted into MPEG-4 AVC/H.264 format video data, the motion searching unit 20a re-searches for a motion vector of quarter-pel precision in the encoder rather than using the half-pel precision motion vector of the MPEG-2 format as it is. For this reason, the transcoding device 100 can make maximum use of the extended motion compensation available with MPEG-4 AVC/H.264 motion vectors. Accordingly, the transcoding device 100 can reduce the circuit scale and the amount of memory used, and thus lower the product cost, without degrading the image quality of the transcoded video data.
The subtractor 20b subtracts the predictive image output from the motion compensating unit 20c to be described below from the output of the adder 10c, that is, the decoded image of the first encoding format, and obtains a predictive error signal. Then, the subtractor 20b transmits the predictive error signal to the DCT/quantization unit 20d.
The motion compensating unit 20c reads, from the buffer 10e, the area in the decoded image indicated by the re-search motion vector found by the motion searching unit 20a, and performs motion compensation by using the area in the decoded image and the re-search motion vector.
Specifically, the motion compensating unit 20c receives the motion vector (MV2) from the motion searching unit 20a and reads the small area indicated by the motion vector from the buffer 10e for each macroblock. Then, the motion compensating unit 20c performs motion compensation and generates a predictive image by using the read small area and the motion vector. After that, the motion compensating unit 20c outputs the generated predictive image to the subtractor 20b.
In other words, the transcoding device 100 acquires the decoded result of the video data encoded in the first encoding format from the format-1 decoder 10 and performs the motion vector search process and the motion compensation process, instead of providing, in the format-2 encoder 20, a frame memory for storing a decoded result (a reference image) of the video data encoded in the second encoding format. Specifically, the format-2 encoder 20 of the transcoding device 100 acquires from the format-1 decoder 10, as a reference image from among the decoded results of the video data encoded in the first encoding format, the small area indicated by the motion vector included in the encoded video data before transcoding and the area having a width of Δ pixels apart from the top, bottom, left, and right of the small area, and re-searches for a motion vector. Then, the format-2 encoder 20 of the transcoding device 100 performs motion compensation by using the small area indicated by the re-searched motion vector (MV2) output from the motion searching unit 20a.
In this manner, the format-2 encoder 20 of the transcoding device 100 uses the decoded result of the video data encoded in the first encoding format as a reference image, instead of the decoded result of the video data encoded in the second encoding format. Therefore, the transcoding device 100 may degrade the image quality due to the difference between the two decoded results. However, the transcoding device 100 can realize the motion vector re-search process and the motion compensation process by using only the frame memory for decoder, without providing a frame memory for encoder. Accordingly, the transcoding device 100 can reduce the circuit scale and the amount of memory used.
A small-area image to be acquired for motion compensation and motion search is now explained with reference to
The motion searching unit 20a can set the search start point to the motion vector of the original bit stream in the motion vector re-search process performed by the encoder. Therefore, when reading the small area in the reference image from the frame memory 10f, the motion searching unit 20a reads an extended small area, which is obtained by extending the small area to include the area having a width of Δ pixels apart from the top, bottom, left, and right of the small area.
In this way, a reference image extended to include the area having a width of Δ pixels apart from the top, bottom, left, and right of the small area can also be acquired at the same time and searched as the reference image for re-searching a motion vector. In the motion compensation process performed by the encoder, the small area indicated by the re-search motion vector determined in the re-search process is read. In this case, the small area indicated by the re-search motion vector is 17 pixels by 17 pixels.
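The reason a 17x17 area suffices for compensating a 16x16 block can be seen from a simple half-pel interpolation sketch: a half-pel sample needs at most one extra row and one extra column of reference pixels. (MPEG-4 AVC/H.264 quarter-pel interpolation uses longer filters and a correspondingly larger margin, so the following is only an illustration of the principle, with hypothetical names.)

```python
import numpy as np

def halfpel_predict(ref_17x17, half_x, half_y):
    """Bilinear half-pel prediction of a 16x16 block from a 17x17 reference area.

    half_x, half_y are 0 or 1: the fractional (half-pixel) part of the motion vector.
    """
    a = ref_17x17.astype(int)
    if half_x == 0 and half_y == 0:
        return a[:16, :16]
    if half_y == 0:                                   # horizontal half-pel only
        return (a[:16, :16] + a[:16, 1:17] + 1) // 2
    if half_x == 0:                                   # vertical half-pel only
        return (a[:16, :16] + a[1:17, :16] + 1) // 2
    return (a[:16, :16] + a[:16, 1:17] + a[1:17, :16] + a[1:17, 1:17] + 2) // 4
```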
In an example as illustrated in
The DCT/quantization unit 20d performs a DCT operation on the predictive error signal output from the subtractor 20b and obtains a DCT coefficient. Then, the DCT/quantization unit 20d performs a quantization operation on the DCT coefficient and obtains a quantized DCT coefficient. After that, the DCT/quantization unit 20d outputs the quantized DCT coefficient to the variable length encoding unit 20e.
The variable length encoding unit 20e performs variable length encoding on the quantized DCT coefficient transmitted from the DCT/quantization unit 20d and the motion vector output from the motion searching unit 20a.
(Process Performed by Transcoding Device)
Next, a process performed by the transcoding device 100 according to the first embodiment will be explained with reference to
As illustrated in
Next, the transcoding device 100 uses the decoded image stored in the frame memory 10f as a reference image and also uses the motion vector information, in order to re-search for a motion vector (Step S103). Then, the transcoding device 100 reads, from the buffer 10e, the small area of the decoded image indicated by the re-search motion vector. Then, the motion compensating unit 20c performs motion compensation by using the small-area image indicated by the re-searched motion vector (Step S104).
After that, the transcoding device 100 acquires a predictive error signal, performs DCT and quantization operations on the predictive error signal, and obtains a quantized DCT coefficient. Then, the transcoding device 100 performs variable length encoding on the quantized DCT coefficient and the re-search motion vector (Step S105).
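Putting these steps together, a simplified per-macroblock transcoding path might be organized as follows. This is a sketch under strong simplifying assumptions (integer-pel motion vectors, a single reference frame, a flat quantizer, no entropy coding), and all names are hypothetical. The correspondence to Steps S103 to S105 follows the description above; the decoding stage is assumed to correspond to the preceding steps:

```python
import numpy as np
from scipy.fft import dctn

Q, DELTA = 16, 4  # hypothetical flat quantization step and re-search margin

def transcode_macroblock(residual1, mv1, frame_memory, bx, by):
    """One macroblock through the shared-memory transcoding path (simplified sketch).

    residual1    : format-1 prediction error after inverse quantization/inverse DCT
    mv1          : decoded format-1 motion vector (integer-pel here)
    frame_memory : shared frame memory 10f holding a format-1 decoded reference frame
    """
    h, w = frame_memory.shape
    dx1, dy1 = mv1

    # Decoding stage: motion compensation with MV1 and addition of the residual.
    ref1 = frame_memory[by + dy1:by + dy1 + 16, bx + dx1:bx + dx1 + 16].astype(int)
    decoded_mb = np.clip(ref1 + residual1, 0, 255).astype(np.uint8)

    # Step S103: re-search a motion vector within +/-DELTA pixels of MV1,
    # using the same decoder-side frame memory as the reference.
    best, mv2 = None, mv1
    for dy in range(dy1 - DELTA, dy1 + DELTA + 1):
        for dx in range(dx1 - DELTA, dx1 + DELTA + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + 16 > w or y + 16 > h:
                continue
            sad = np.abs(decoded_mb.astype(int)
                         - frame_memory[y:y + 16, x:x + 16].astype(int)).sum()
            if best is None or sad < best:
                best, mv2 = sad, (dx, dy)

    # Step S104: motion compensation with the re-searched MV2; no frame memory
    # for encoder is needed because the reference comes from the same memory.
    dx2, dy2 = mv2
    pred = frame_memory[by + dy2:by + dy2 + 16, bx + dx2:bx + dx2 + 16].astype(int)

    # Step S105: predictive error signal, DCT and quantization
    # (variable length encoding of coeff and mv2 is omitted here).
    coeff = np.round(dctn(decoded_mb.astype(int) - pred, norm='ortho') / Q)
    return decoded_mb, mv2, coeff
```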
As described above, the transcoding device 100 decodes a motion vector obtained from encoded images encoded in the first encoding format, also decodes the encoded images to generate decoded images, and stores the decoded motion vector and the generated decoded images in the frame memory 10f. Then, the transcoding device 100 searches for a re-search motion vector by using the decoded image stored in the frame memory 10f as a reference image and using the motion vector stored in the frame memory 10f. After that, the transcoding device 100 reads, from the frame memory 10f, the area in the decoded image indicated by the re-search motion vector it has found, and performs motion compensation by using that area in the decoded image and the re-search motion vector. Accordingly, the transcoding device 100 can achieve the motion vector re-search process by using only the frame memory for decoder when transcoding a moving image. Therefore, the transcoding device 100 can reduce the circuit scale and the amount of memory used, and can lower the product cost, without degrading the image quality of the transcoded moving image.
In other words, for example, when a moving image of the MPEG-2 format is converted into a moving image of the MPEG-4 AVC/H.264 format, the transcoding device 100 performs the motion vector re-search process at quarter-pel precision in the encoder instead of using the half-pel precision motion vector of the MPEG-2 format as it is. The transcoding device 100 thereby makes maximum use of the extended motion compensation available with MPEG-4 AVC/H.264 motion vectors. Accordingly, the transcoding device 100 can reduce the circuit scale and the amount of memory used, and thus can lower the product cost, without degrading the image quality of the transcoded moving image.
According to the first embodiment, the transcoding device 100 generates a decoded image for each macroblock, stores the decoded image in the frame memory 10f for each macroblock, and searches for a re-search motion vector by using the motion vector stored in the frame memory 10f as a search start point. Then, the transcoding device 100 reads, from the frame memory 10f, the area in the decoded image indicated by the re-search motion vector for each macroblock, and performs motion compensation for each macroblock. Therefore, the transcoding device 100 can perform the motion compensation process on a macroblock-by-macroblock basis.
According to the first embodiment, the transcoding device 100 searches for a re-search motion vector by using, as a search range, a decoded image area obtained by extending the area in the decoded image indicated by the motion vector stored in the frame memory 10f. Therefore, the transcoding device 100 can search an extended reference image and can re-search for an appropriate motion vector.
According to the first embodiment, the transcoding device 100 includes a buffer that has a higher reading speed than that of the frame memory 10f and temporarily stores therein the motion vector and the decoded image used for the motion re-search process and the motion compensation process among the motion vectors and the decoded images stored in the frame memory 10f. Therefore, the transcoding device 100 can improve a processing speed compared with reading the motion vector and the decoded image from the frame memory.
Although the embodiment of the present invention has been explained as described above, the present invention may be realized by using various different structures in addition to the embodiment as described above. Hereinafter, another embodiment included in the present invention will be explained as the second embodiment.
System Structure, etc.
Each of the structural components of the units illustrated in the drawings is only conceptual in function, and is not necessarily physically configured as illustrated in the drawings. That is, the specific patterns of distribution and integration of the units are not limited to those illustrated in the drawings. All or a part of the components can be functionally or physically distributed or integrated in arbitrary units, in accordance with various loads and the state of use. For example, the variable length decoding unit 10a and the inverse quantization/inverse DCT unit 10b may be integrated. Furthermore, all or an arbitrary part of the processing functions performed by the units can be achieved by a Central Processing Unit (CPU) and a computer program analyzed and executed by the CPU, or can be achieved as hardware using wired logic.
Moreover, among the processes described in the present embodiments, all or a part of the processes explained as being automatically performed may also be manually performed; or all or a part of the processes explained as being manually performed may also be automatically performed through a known method. In addition, the processing procedure, control procedure, specific names, information including various types of data or parameters described in the specification and illustrated in the drawings can be arbitrarily modified unless otherwise specified.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present inventions have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Foreign application priority data: No. 2008-267862, Oct. 2008, JP (national).