The present disclosure relates to a method and a device for processing coding on a digital signal, and in particular, to a method and an electronic device for processing video coding.
The rapid development of the Internet has also driven the rise of video streaming. To reduce the amount of transmitted data, video coding must be performed on an input digital image. During video coding, the input image is divided into a plurality of different data blocks, and a corresponding coding tree is generated according to the coding units in each data block to predict the motion of other frames. Since an input image with higher resolution generates more data blocks and coding units, the generation load of the coding tree increases correspondingly.
In view of the above, the present disclosure provides a method for processing video coding, for performing coding prediction on a plurality of frames of an input image to generate output streaming data. The method for processing video coding can maintain the same coding quality and reduce computational complexity of a hardware coder.
The method for processing video coding of the present disclosure includes the following steps: acquiring a target block in each of the frames; splitting the target block into at least one coding unit; loading the target block to a coding tree generation module to output a first coding tree and a second coding tree; calculating, by an integer motion estimation (IME) unit of the coding tree generation module, a rate-distortion cost of each coding unit, and selecting a smallest one and a second smallest one from the rate-distortion costs, where the smallest rate-distortion cost is a first integer estimation result, and the second smallest rate-distortion cost is a second integer estimation result; generating an output decision tree according to the first coding tree and the second coding tree; and outputting streaming data according to the output decision tree and the frame. Different coding trees are processed by using corresponding rate-distortion costs, to reduce the computational load of the coding trees.
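The selection of the smallest and second smallest rate-distortion costs by the IME unit can be sketched as follows. This is an illustrative Python sketch; the function name `select_two_smallest` and the candidate cost list are assumptions for demonstration, not part of the disclosure.

```python
def select_two_smallest(rd_costs):
    """Return the indices of the smallest and second smallest rate-distortion
    costs, mirroring the IME unit's choice of the first and second integer
    estimation results."""
    if len(rd_costs) < 2:
        raise ValueError("need at least two candidate costs")
    order = sorted(range(len(rd_costs)), key=lambda k: rd_costs[k])
    # order[0] -> first integer estimation result (smallest cost)
    # order[1] -> second integer estimation result (second smallest cost)
    return order[0], order[1]
```

For example, for candidate costs `[9.5, 3.2, 7.1, 4.8]`, the sketch returns the indices of the 3.2 and 4.8 candidates.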
The step of selecting the smallest one and the second smallest one from the rate-distortion costs includes: loading the first integer estimation result to a fractional motion estimation (FME) unit of the coding tree generation module to obtain a first fractional estimation result; loading the first fractional estimation result to a coding mode decision unit of the coding tree generation module to obtain the first coding tree; loading the second integer estimation result to the FME unit to obtain a second fractional estimation result; and loading the second fractional estimation result to the coding mode decision unit to obtain the second coding tree.
The step of loading the target block to the coding tree generation module to output the first coding tree and the second coding tree includes: selecting a first reference frame; loading the first integer estimation result to an FME unit in a low delay P frame (LDP) mode according to the first reference frame, a first coding block, and a second coding block, to acquire the first coding tree; loading the second integer estimation result to the FME unit in the LDP mode according to the first reference frame, the first coding block, and the second coding block, to acquire the second coding tree; selecting a first node unit from the first coding tree and a second node unit from the second coding tree according to the first coding block, where a node position of the second node unit in the second coding tree corresponds to a node position of the first node unit in the first coding tree; selecting a third node unit from the first coding tree and a fourth node unit from the second coding tree according to the second coding block, where a node position of the fourth node unit in the second coding tree corresponds to a node position of the third node unit in the first coding tree; selecting one of the first node unit or the second node unit as an output unit according to the coding units in the first node unit and the second node unit; selecting one of the third node unit or the fourth node unit as another output unit according to the coding units in the third node unit and the fourth node unit; and traversing the first coding tree and the second coding tree to acquire the corresponding output units, and generating an output decision tree according to the selected output units.
The step of loading the target block to the coding tree generation module to output the first coding tree and the second coding tree includes: selecting a first reference frame and a second reference frame; loading the first integer estimation result to an FME unit in an LDP mode according to the first reference frame, the second reference frame, and a first coding block, to acquire the first coding tree; loading the second integer estimation result to the FME unit in a random frame access mode according to the first reference frame, the second reference frame, and a second coding block, to acquire the second coding tree; selecting a first node unit from the first coding tree and a second node unit from the second coding tree according to the first coding block, where a node position of the second node unit in the second coding tree corresponds to a node position of the first node unit in the first coding tree; selecting a third node unit from the first coding tree and a fourth node unit from the second coding tree according to the second coding block, where a node position of the fourth node unit in the second coding tree corresponds to a node position of the third node unit in the first coding tree; selecting one of the first node unit or the second node unit as an output unit according to the coding units in the first node unit and the second node unit; selecting one of the third node unit or the fourth node unit as another output unit according to the coding units in the third node unit and the fourth node unit; and traversing the first coding tree and the second coding tree to acquire the corresponding output units, and generating an output decision tree according to the selected output units.
An electronic device for processing video coding includes a storage unit, a coding tree generation module, and a decision tree module. The storage unit is configured to store an input image. The input image includes a plurality of frames. The coding tree generation module is configured to acquire a target block from any of the frames and generate a first coding tree and a second coding tree according to the target block. The decision tree module is configured to receive the first coding tree and the second coding tree and generate an output decision tree according to a plurality of rate-distortion costs of the first coding tree and the second coding tree.
The coding tree generation module further includes an IME unit, an FME unit, and a coding mode decision unit. The IME unit generates a first integer estimation result and a second integer estimation result according to the target block. The FME unit generates a first fractional estimation result according to the first integer estimation result, and generates a second fractional estimation result according to the second integer estimation result. The coding mode decision unit generates a first coding tree and a second coding tree according to the first fractional estimation result and the second fractional estimation result. The IME unit is configured to select a smallest one of the rate-distortion costs as the first integer estimation result, and select a second smallest one of the rate-distortion costs as the second integer estimation result.
According to the method and the electronic device for processing video coding in the present disclosure, coding prediction is performed on the plurality of frames of the input image to output streaming data. In the method for processing video coding, the coding trees are divided in advance, to generate two different sets of coding trees. Different coding trees are processed by using corresponding rate-distortion costs, to reduce computational loads of the coding trees. The output decision tree is generated according to nodes formed by the first coding tree and the second coding tree. The method for processing video coding can reduce the computational complexity of a hardware coder and can maintain the same coding quality.
Referring to
The storage unit 100 stores an input image 400 or temporary data during the image coding. The input image 400 includes a plurality of frames 410. Generally speaking, each frame 410 may be divided into at least one or more super blocks 420. The super block 420 may be selected from luminance samples of the input image 400 in a YUV mode. Referring to
A splitting method for the target block 430 may include “direct split”, “none split”, “horizontal split”, and “vertical split”. As described above, the target block 430 may have a maximum size of 64*64 pixels. As shown in
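The four splitting methods can be illustrated with a short sketch. The function name and the exact sub-block geometry assigned to each mode (in particular, treating “direct split” as a quad split into four equal sub-blocks) are assumptions for illustration, not definitions from the disclosure.

```python
def split_block(width, height, mode):
    """Return the sub-block sizes produced by applying one splitting method
    to a coding block of the given size (e.g. the 64*64-pixel target block)."""
    if mode == "none":
        return [(width, height)]                      # keep the block whole
    if mode == "horizontal":
        return [(width, height // 2)] * 2             # two stacked halves
    if mode == "vertical":
        return [(width // 2, height)] * 2             # two side-by-side halves
    if mode == "direct":
        return [(width // 2, height // 2)] * 4        # assumed quad split
    raise ValueError(f"unknown split mode: {mode}")
```

Applying `"direct"` to a 64*64 block under these assumptions yields four 32*32 sub-blocks, which can then be split again recursively.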
The coding tree generation module 200 reads the input image 400 from the storage unit 100. The coding tree generation module 200 selects any frame 410 from the input image 400, and then selects the target block 430 from the selected frame 410. The coding tree generation module 200 (as shown in
In order to further describe the generation process of the first coding tree 310 and the second coding tree 320, refer to
First, the coding tree generation module 200 reads the input image 400 of the storage unit 100, and selects the frame 410 and the target block 430 from the input image 400 (as shown in
where Source is the frame, Predictor is the frame predicted by the IME unit 210, and (i,j) are pixel positions in the foregoing two frames.
The IME unit 210 may select either a sum of absolute differences (SAD) or a sum of absolute transformed differences (SATD, based on the Hadamard transform) to perform the rate-distortion calculation. The IME unit 210 calculates a plurality of rate-distortion costs of the target block 430. The IME unit 210 selects a smallest one and a second smallest one from all of the rate-distortion costs. The smallest rate-distortion cost is referred to as a first integer estimation result 441 below. The second smallest rate-distortion cost is referred to as a second integer estimation result 442.
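A minimal SAD computation over a pair of equally sized pixel blocks may look as follows; representing Source and Predictor blocks as 2-D lists of integer pixel values is an assumption for illustration.

```python
def sad(source, predictor):
    """Sum of absolute differences between a source block and the block
    predicted by the IME unit, accumulated over all pixel positions (i, j)."""
    return sum(
        abs(s - p)
        for src_row, pred_row in zip(source, predictor)
        for s, p in zip(src_row, pred_row)
    )
```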
The IME unit 210 outputs the first integer estimation result 441 and the second integer estimation result 442 to the FME unit 220. In
Next, the FME unit 220 outputs the first fractional estimation result 451 and the second fractional estimation result 452 to the coding mode decision unit 230. The coding mode decision unit 230 may select either the SAD or the SATD for the rate-distortion calculation. The coding mode decision unit 230 obtains the first coding tree 310 according to the first fractional estimation result 451. The coding mode decision unit 230 obtains the second coding tree 320 according to the second fractional estimation result 452. The coding mode decision unit 230 outputs the first coding tree 310 and the second coding tree 320 to the decision tree module 300.
The decision tree module 300 calculates the rate-distortion cost, a frame pixel reconstruction value (Recon), and some related parameters according to the first coding tree 310 and the second coding tree 320. A sum of squared errors (SSE) may be selected as the distortion term of the rate-distortion cost; refer to the following Formula 2. The decision tree module 300 acquires an optimal method for splitting into the coding units 431 according to the rate-distortion costs, and the splitting of the coding units 431 leads to generation of the output decision tree 330.
RD cost=λR+D(SSE)
SSE=Σi,j Diff(i,j)^2, where Diff(i,j)=Recon(i,j)−Source(i,j). (Formula 2)
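Formula 2 and the rate-distortion cost above can be written out as a short sketch; the function names and the representation of Recon and Source as 2-D lists of pixel values are illustrative assumptions.

```python
def sse(recon, source):
    """Sum of squared errors (Formula 2): accumulate
    Diff(i,j)^2 with Diff(i,j) = Recon(i,j) - Source(i,j)."""
    return sum(
        (r - s) ** 2
        for rec_row, src_row in zip(recon, source)
        for r, s in zip(rec_row, src_row)
    )

def rd_cost(lam, rate, recon, source):
    """Rate-distortion cost RD = lambda * R + D, with SSE as distortion D."""
    return lam * rate + sse(recon, source)
```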
In an embodiment, in the generation process of the first coding tree 310 and the second coding tree 320, the coding trees may be divided according to a reference frame. Referring to
First, an electronic device 1 may select any of the frames 410 other than the target block 430 as the first reference frame. Generally speaking, the electronic device 1 may select, as the first reference frame, any frame 410 similar to the target block 430, such as a predicted frame (P frame) 410, an intra frame (I frame) 410, or a bi-directional frame (B frame) 410.
Next, the IME unit 210 processes the first reference frame, the first integer estimation result 441, and the second integer estimation result 442 based on the LDP mode by using the first coding block 351 and the second coding block 352 (as shown in
The first coding block 351 has a size of 16*16 pixels, and the second coding block 352 has a size of 32*32 pixels. The first coding block 351 may be formed by a plurality of coding units 431 having smaller sizes (as shown in
The IME unit 210 performs corresponding prediction processing on the target block 430, and outputs the first integer estimation result 441 and the second integer estimation result 442. For convenience of description, refer to
Finally, the coding tree generation module 200 outputs the first coding tree 310 and the second coding tree 320 to the decision tree module 300. The coding tree generation module 200 generates the output decision tree 330 according to the coding units 431 at different positions in the first coding tree 310 and the second coding tree 320. Referring to
A tree structure relationship among the coding units 431 of the first coding tree 310 and the second coding tree 320 may be obtained from
Next, the decision tree module 300 selects a second node unit 342 from the second coding tree 320. A node position of the second node unit 342 in the second coding tree 320 corresponds to a node position of the first node unit 341 in the first coding tree 310. In other words, the decision tree module 300 selects the coding units 431 at the corresponding positions from the first coding tree 310 and the second coding tree 320 according to the first coding block 351. For convenience of description, the first node unit 341 is used to represent the first coding block 351, and the second node unit 342 is used to represent the second coding block 352 below. The decision tree module 300 selects either the first node unit 341 or the second node unit 342 as an output node 345 according to the splitting of the target block 430 and the combination of the coding units 431 (refer to
Similarly, the decision tree module 300 further selects the third node unit 343 from the first coding tree 310 and the fourth node unit 344 from the second coding tree 320 according to the second coding block 352. In addition, the decision tree module 300 selects the third node unit 343 or the fourth node unit 344 as another output node 345 according to the composition structure of the coding units 431 of the third node unit 343 and the fourth node unit 344. The decision tree module 300 traverses the first coding tree 310 and the second coding tree 320 and obtains all of the output nodes 345. Generally speaking, the decision tree module 300 may traverse the coding trees in a zigzag manner, as indicated by an arrow in
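The traversal that picks one output node per corresponding node position can be sketched as follows. Representing each coding tree by a flat list of per-node rate-distortion costs in traversal (e.g. zigzag) order, and choosing the lower-cost node unit at each position, are simplifying assumptions for illustration.

```python
def build_output_nodes(tree_a_costs, tree_b_costs):
    """Visit corresponding node positions of the first and second coding
    trees and keep, at each position, the node unit with the lower
    rate-distortion cost as the output node."""
    if len(tree_a_costs) != len(tree_b_costs):
        raise ValueError("both trees must expose the same node positions")
    return [
        (pos, "first" if cost_a <= cost_b else "second")
        for pos, (cost_a, cost_b) in enumerate(zip(tree_a_costs, tree_b_costs))
    ]
```

The resulting list of chosen node units corresponds to the output nodes from which the output decision tree is assembled.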
In an embodiment, in the generation process of the first coding tree 310 and the second coding tree 320, the coding trees may be divided according to different quantities of reference frames. Referring to
The electronic device 1 selects any two from the frames 410 other than the target block 430, which are respectively the first reference frame and the second reference frame. The IME unit 210 applies the first integer estimation result 441 to the first reference frame and the second reference frame, and performs prediction processing of the first coding block 351, the second coding block 352, and the LDP mode. The IME unit 210 applies the second integer estimation result 442 to the first reference frame and the second reference frame, and performs prediction processing of the first coding block 351, the second coding block 352, and the random frame access mode (RA mode).
In some embodiments, the first coding block 351 has a size of 16*16 pixels, and the second coding block 352 has a size of 32*32 pixels. The IME unit 210 performs the prediction processing on the target block 430, and outputs the first integer estimation result 441 and the second integer estimation result 442. Referring to
The decision tree module 300 selects the coding units 431 from the first coding tree 310 by using the first coding block 351, and uses the selected coding units 431 (or a set of coding units 431) as a first node unit 341. The decision tree module 300 selects the second node unit 342 from the second coding tree 320. The node position of the second node unit 342 corresponds to the node position of the first node unit 341. The decision tree module 300 selects either the first node unit 341 or the second node unit 342 as the output node 345 according to the composition structure of the coding units 431 of the first node unit 341 and the second node unit 342.
Referring to
In an embodiment, after the IME unit 210 generates the first integer estimation result 441 and the second integer estimation result 442, the FME unit 220 further determines whether each node unit includes a leaf node. It is noted that the node unit may be composed of a single coding unit 431 or a plurality of coding units 431, and therefore the plurality of coding units 431 form a tree structure. Referring to
The FME unit 220 determines whether the first node unit 341 and the second node unit 342 each include the leaf node. Since the first node unit 341 (or the second node unit 342) may include more than two coding units 431, the first node unit 341 (or the second node unit 342) forms the tree structure. Taking
If one of the first node unit 341 or the second node unit 342 includes the leaf node, the FME unit 220 compares the rate-distortion cost of the first node unit 341 with the rate-distortion cost of the second node unit 342 and determines whether a difference between the two rate-distortion costs exceeds a threshold. If the difference between the two rate-distortion costs exceeds the threshold, the FME unit 220 selects the first node unit 341 as the output unit, which has the same structure as the coding block filled with vertical lines in
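The threshold test described above can be sketched as follows. The function name, the use of an absolute difference, and the fact that only the stated branch (keeping the first node unit when the gap exceeds the threshold) is modeled are all assumptions for illustration.

```python
def exceeds_threshold(cost_first, cost_second, threshold):
    """Return True when the gap between the rate-distortion costs of the
    two node units exceeds the threshold, in which case the FME unit
    selects the first node unit as the output unit."""
    return abs(cost_first - cost_second) > threshold
```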
The FME unit 220 determines the third node unit 343 and the fourth node unit 344 with the following processing flow:
The FME unit 220 determines whether the third node unit 343 and the fourth node unit 344 each include the leaf node. The FME unit 220 acquires the corresponding output node 345 according to the above processing, and builds the first coding tree 310 and the second coding tree 320. The decision tree module 300 performs motion prediction on the first reference frame according to the first coding tree 310 and the second coding tree 320. In
The method for processing video coding and the electronic device 1 perform coding prediction on a plurality of frames 410 of the input image 400, to output streaming data. In the method for processing video coding, the coding trees are divided in advance, to generate two different sets of coding trees. Different coding trees are processed by using corresponding rate-distortion costs, to reduce computational loads of the coding trees. The output decision tree 330 is generated according to the nodes formed by the first coding tree 310 and the second coding tree 320. The method for processing video coding can reduce the computational complexity of a hardware coder and can maintain the same coding quality.