This application claims the benefit of Taiwan application Serial No. 98102973, filed Jan. 23, 2009, the subject matter of which is incorporated herein by reference.
1. Technical Field
The disclosure relates in general to a video processing method, and more particularly to a depth calculating method for calculating depth data corresponding to input frame data.
2. Description of the Related Art
In the modern age, in which technology advances with each passing day, the digital content industry, including computer motion pictures, digital games, digital learning, and mobile applications and services, is developing rapidly. Three-dimensional (3D) image/video already exists in the prior art and is widely expected to enhance the service quality of the digital content industry.
Generally speaking, the existing 3D image/video generator utilizes depth image based rendering (DIBR) to generate 3D image data according to two-dimensional (2D) image data and depth data. The precision of the depth data ultimately determines the quality of the 3D image data. Therefore, designing a depth calculating method that generates precise depth data is an important subject in the industry.
According to a first aspect of the present disclosure, a depth calculating method for calculating corresponding depth data in response to frame data of input video data is provided. The frame data includes u×v macroblocks. Each of the u×v macroblocks includes X×Y pieces of pixel data, wherein u and v are natural numbers greater than 1. The depth calculating method includes the following steps. First, smooth macroblocks are found in the u×v macroblocks. Next, the motion vector data of the smooth macroblocks are set to a zero motion vector. Then, a plurality of neighboring macroblocks is found with respect to each of the u×v macroblocks. Next, the motion vector data of each of the u×v macroblocks is set to be equal to the mean motion vector data of its neighboring macroblocks. Then, u×v pieces of macroblock motion parallax data respectively corresponding to the u×v macroblocks are found according to the corrected motion vector data of the u×v macroblocks. Finally, the depth data corresponding to the frame data is calculated according to the u×v pieces of macroblock motion parallax data.
According to a second aspect of the present disclosure, a depth calculating apparatus for calculating corresponding depth data in response to frame data of input video data is provided. The frame data includes u×v macroblocks. Each of the u×v macroblocks includes X×Y pieces of pixel data, wherein u and v are natural numbers greater than 1. The depth calculating apparatus includes a motion parallax data module and a depth calculating module. The motion parallax data module generates u×v pieces of macroblock motion parallax data according to motion vector data corresponding to the u×v macroblocks. The motion parallax data module includes a region correcting module, a motion vector data correcting module and a motion parallax data calculating module. The region correcting module finds smooth macroblocks in the u×v macroblocks and sets the motion vector data of the smooth macroblocks to a zero motion vector. The motion vector data correcting module finds a plurality of neighboring macroblocks with respect to each of the u×v macroblocks, and sets the motion vector data of each of the u×v macroblocks to be equal to the mean motion vector data of its neighboring macroblocks. The motion parallax data calculating module generates the u×v pieces of macroblock motion parallax data according to the motion vector data of the u×v macroblocks as corrected by the region correcting module and the motion vector data correcting module. The depth calculating module calculates the depth data corresponding to the frame data according to the u×v pieces of macroblock motion parallax data.
The disclosure will become apparent from the following detailed description of the preferred but non-limiting embodiments. The following description is made with reference to the accompanying drawings.
The file of this patent contains at least one drawing executed in color. Copies of this patent with the color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
Exhibits 1 to 6 are illustrations of input video data Vdi.
The depth calculating methods of the exemplary embodiments adopt several motion vector correction techniques to adjust motion vectors corresponding to input frame data, and generate depth data according to motion parallax data corresponding to the motion vectors.
The depth calculating method of this embodiment adopts the region motion vector correction technique and the technique of correcting the motion vector of a target macroblock with reference to the motion vectors of its neighboring macroblocks in order to correct the motion vectors corresponding to the input frame data. It then estimates the depth data corresponding to the input frame data using the motion parallax data corresponding to the corrected motion vectors as a clue.
For example, the frame data Fd1 includes x×y pieces of pixel data, which are divided into u×v macroblocks BK(1,1), BK(1,2), . . . , BK(1,v), BK(2,1), . . . , BK(2,v), . . . , BK(u,v). Each of the macroblocks BK(1,1) to BK(u,v) includes x′×y′ pieces of pixel data, wherein x, y, u and v are natural numbers greater than 1, and x and y are respectively equal to the product of x′ and u and the product of y′ and v. For example, x′ and y′ are equal to 8. In this example, the depth data Dd1 includes u×v pieces of macroblock depth data.
The depth calculating apparatus 1 includes a motion parallax data module 120 and a depth calculating module 100. The motion parallax data module 120 generates u×v pieces of macroblock motion parallax data MP(1,1) to MP(u,v) according to the motion vector data corresponding to the u×v macroblocks BK(1,1) to BK(u,v). The motion parallax data module 120 includes a region correcting module 122 and a motion vector data correcting module 124 for correcting the motion vector data of the macroblocks BK(1,1) to BK(u,v).
The region correcting module 122 finds smooth macroblocks in the macroblocks BK(1,1) to BK(u,v). For example, the region correcting module 122 determines whether a to-be-detected macroblock DBK is a smooth macroblock according to whether the mean MADB of the M mean absolute differences MAD1 to MADM of the to-be-detected macroblock DBK is smaller than a MAD threshold value. The M mean absolute differences MAD1 to MADM are calculated between the to-be-detected macroblock DBK itself and its M reference macroblocks RBK1 to RBKM (RBK1 to RBKM are determined by the motion vector data of the macroblocks surrounding DBK). The region correcting module 122 generates the mean MADB from MAD1 to MADM according to operations that may be expressed by the following equations:
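A presumable form of these equations, based on the definitions that follow, is:

```latex
\mathrm{MAD}_i = \frac{1}{x' \times y'} \sum_{X=1}^{x'} \sum_{Y=1}^{y'}
\bigl| I_{\mathrm{DBK}}(X,Y) - I_{\mathrm{RBK}_i}(X,Y) \bigr|,
\qquad i = 1, \dots, M

\mathrm{MADB} = \frac{1}{M} \sum_{i=1}^{M} \mathrm{MAD}_i
```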
wherein X and Y are coordinate values of the x′×y′ pieces of pixel data in the macroblock, IDBK(X,Y) is the pixel data having the coordinate value (X,Y) in the to-be-detected macroblock DBK, and IRBKi(X,Y) is the pixel data corresponding to the same position (X,Y) of the pixel data IDBK(X,Y) in the reference macroblock RBKi.
For example, M is equal to 8, and the reference macroblocks RBK1 to RBK8 are the eight neighboring macroblocks respectively located on the upper left side, the upper side, the upper right side, the left side, the right side, the lower left side, the lower side and the lower right side around the to-be-detected macroblock, as shown in
When the MAD mean MADB corresponding to the to-be-detected macroblock DBK is smaller than the MAD threshold value, the region correcting module 122 judges the to-be-detected macroblock DBK to be a smooth macroblock and sets its motion vector data to a zero motion vector. When the MAD mean MADB is greater than or equal to the MAD threshold value, the region correcting module 122 judges the to-be-detected macroblock DBK not to be a smooth macroblock, and retains its motion vector data.
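As an illustration, the following Python sketch applies this region correction to one frame; the function name, the 8×8 block size, the array layout and the threshold value are assumptions for illustration only, not taken from this disclosure.

```python
import numpy as np

def region_correct(frame, ref_frame, mv, block=8, mad_threshold=3.0):
    """Zero the motion vectors of smooth macroblocks (illustrative sketch).

    frame, ref_frame: 2D grayscale arrays (current and reference frames).
    mv: (u, v, 2) array of per-macroblock motion vectors (dy, dx).
    A macroblock is judged smooth when the mean (MADB) of the MADs against
    the reference blocks pointed to by its neighbours' motion vectors is
    below mad_threshold.
    """
    u, v = mv.shape[:2]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]           # M = 8 neighbours
    corrected = mv.copy()
    for U in range(u):
        for V in range(v):
            y0, x0 = U * block, V * block
            cur = frame[y0:y0 + block, x0:x0 + block].astype(np.float64)
            mads = []
            for dU, dV in offsets:
                nU, nV = U + dU, V + dV
                if not (0 <= nU < u and 0 <= nV < v):
                    continue                               # neighbour outside frame
                dy, dx = (int(round(c)) for c in mv[nU, nV])
                ry, rx = y0 + dy, x0 + dx
                if 0 <= ry <= ref_frame.shape[0] - block and \
                   0 <= rx <= ref_frame.shape[1] - block:
                    ref = ref_frame[ry:ry + block, rx:rx + block].astype(np.float64)
                    mads.append(np.abs(cur - ref).mean())  # MAD_i
            if mads and np.mean(mads) < mad_threshold:     # MADB below threshold
                corrected[U, V] = 0                        # smooth: zero motion vector
    return corrected
```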
The motion vector data correcting module 124 finds N neighboring macroblocks NBK1 to NBKN with respect to each of the macroblocks BK(1,1) to BK(u,v), and sets the motion vector data of each of the macroblocks BK(1,1) to BK(u,v) to be equal to the mean motion vector data of the neighboring macroblocks NBK1 to NBKN. For example, N is equal to 8, and the neighboring macroblocks NBK1 to NBK8 are eight neighboring macroblocks respectively located on the upper left side, the upper side, the upper right side, the left side, the right side, the lower left side, the lower side and the lower right side around each of the macroblocks BK(1,1) to BK(u,v), as shown in
A motion parallax data calculating module 126 generates macroblock motion parallax data fM(1,1) to fM(u,v) according to the macroblock motion vector data corrected by the region correcting module 122 and the motion vector data correcting module 124. For example, the macroblock motion parallax data fM(U,V) satisfies the equation:
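Presumably the motion parallax is the magnitude of the corresponding macroblock motion vector, i.e.:

```latex
f_M(U,V) = \sqrt{\, MV(U,V)_h^{\,2} + MV(U,V)_v^{\,2} \,}
```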
wherein MV(U,V)h is a horizontal motion vector corresponding to the macroblock BK(U,V), and MV(U,V)v is a vertical motion vector corresponding to the macroblock BK(U,V).
In one example, the motion parallax data calculating module 126 further normalizes the macroblock motion parallax data fM(1,1) to fM(u,v) into the range of 0 to 255, and adopts a median filter and a Gaussian filter to smooth the macroblock motion parallax data fM(1,1) to fM(u,v).
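A minimal Python sketch of this normalization and smoothing, assuming SciPy's median and Gaussian filters and illustrative filter parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def normalize_and_smooth(fm, size=3, sigma=1.0):
    """Normalize a u x v parallax map to 0..255, then smooth it.

    fm: 2D array of macroblock motion parallax data fM.
    size and sigma are illustrative; the text states only that a median
    filter and a Gaussian filter are applied.
    """
    fm = fm.astype(np.float64)
    span = fm.max() - fm.min()
    norm = (fm - fm.min()) / span * 255.0 if span > 0 else np.zeros_like(fm)
    return gaussian_filter(median_filter(norm, size=size), sigma=sigma)
```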
For example, when the normalized macroblock motion parallax data fM(1,1) to fM(u,v) have low values (e.g., approaching 0), the corresponding motion parallax is small and the corresponding depth is large. When they have high values (e.g., approaching 255), the corresponding motion parallax is large and the corresponding depth is small. Accordingly, the depth calculating module 100 determines the depth data Dd1, including the u×v pieces of macroblock depth data corresponding to the macroblocks BK(1,1) to BK(u,v) in the frame data Fd1, according to the macroblock motion parallax data fM(1,1) to fM(u,v) provided by the motion parallax data calculating module 126. The value of each piece of macroblock depth data in the depth data Dd1 also ranges from, for example, 0 to 255, indicating depths from large to small, respectively.
In this exemplary embodiment, the motion parallax data calculating module 126 further has a shot change detecting module (not shown) for detecting the frame data of the input video data Vdi in which a shot change occurs, and for performing a shot change detection on the motion vector data corresponding to the frame data. The motion parallax data calculating module 126 may generate the macroblock motion parallax data fM(1,1) to fM(u,v) according to the motion vector data corrected by the shot change detecting module.
For example, the shot change detecting module performs a histogram operation on each piece of frame data of the input video data Vdi, such that 256 pixel amounts corresponding to the 256 gray levels (0 to 255) of each piece of frame data are obtained. The shot change detecting module further obtains 256 pixel data amount differences for each piece of frame data by comparing its histogram result with that of the previous piece of frame data in the input video data Vdi. The shot change detecting module then obtains a summed difference for each piece of frame data by summing the corresponding 256 pixel data amount differences, and judges whether the summed difference is greater than a threshold value. If not, it is judged that no shot change event occurs between the piece of frame data and its previous piece of frame data.
If the summed difference is greater than the threshold value, the shot change detecting module calculates another set of motion vector data to correct the corresponding motion vector data. For example, this set of motion vector data is calculated based on the frame data and the next N pieces of frame data following it, wherein N is a natural number.
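A minimal Python sketch of the histogram-based shot change test described above (the threshold value and function names are assumptions):

```python
import numpy as np

def is_shot_change(frame, prev_frame, threshold=30000):
    """Histogram-based shot change test (illustrative threshold).

    frame, prev_frame: 2D uint8 grayscale arrays. Computes the 256-bin
    gray-level histogram of each frame, sums the absolute bin differences
    and compares the summed difference against the threshold.
    """
    hist, _ = np.histogram(frame, bins=256, range=(0, 256))
    prev_hist, _ = np.histogram(prev_frame, bins=256, range=(0, 256))
    return np.abs(hist - prev_hist).sum() > threshold
```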
In this exemplary embodiment, the motion parallax data calculating module 126 further includes, for example, a motion refinement module (not shown) for performing motion refinement on the motion vector data corresponding to each piece of frame data of the input video data Vdi with reference to camera motion data. The motion parallax data calculating module 126 may generate the macroblock motion parallax data fM(1,1) to fM(u,v) according to the motion vector data corrected by the motion refinement module.
In one example, the input video data Vdi is video data obtained through MPEG-2 or MPEG-4 standard decompression, and the motion parallax data module 120 may further generate the motion vector data with reference to the corresponding motion vector information of the MPEG-2 or MPEG-4 standard to obtain the corresponding macroblock motion parallax data fM(1,1) to fM(u,v).
The depth calculating method of this exemplary embodiment adopts the region motion vector correction technique, corrects the motion vectors corresponding to the input frame data with reference to the neighboring macroblock motion vectors, and generates the depth data according to the macroblock motion parallax data corresponding to the corrected motion vectors. Thus, compared with the conventional depth data generating method, the depth calculating method of this exemplary embodiment has the advantage of enhancing the precision of the depth data.
The depth calculating method of this exemplary embodiment generates the depth data with reference to the motion parallax data and further to some or all of the parameter data associated with atmospheric perspective and texture gradient.
In one example, the other set of parameter data relates to atmospheric perspective. Generally speaking, suspended particles in the air make the shot video data corresponding to an object at a shorter distance have the frame characteristic of sharp edge information, and make the shot video data corresponding to an object at a longer distance have the frame characteristic of blurred edge information. Consequently, the parameter module 230 in this example analyzes the macroblock variance data of the frame data Fd1 corresponding to each of the macroblocks BK(1,1) to BK(u,v) and thus provides the frame edge sharpness information as a clue for depth estimation. The depth calculating module 200 generates the depth data Dd2 with reference to the macroblock variance data fV(1,1) to fV(u,v) and the macroblock motion parallax data fM(1,1) to fM(u,v).
For example, the parameter module 230 calculates the macroblock variance data fV(1,1) to fV(u,v) according to the steps of
Then, as shown in step (c3), the parameter module 230 finds the mean square difference of the x′×y′ pieces of pixel data differences with respect to each of the macroblocks BK(1,1) to BK(u,v). Thereafter, as shown in step (c4), the u×v pieces of macroblock variance data fV(1,1) to fV(u,v) are generated according to the mean square pixel differences corresponding to the macroblocks BK(1,1) to BK(u,v).
For example, the step operation may be represented by the following equation:
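A presumable form of this equation, matching the variance computation described in steps (c3) and (c4), is:

```latex
f_V(U,V) = \frac{1}{x' \times y'} \sum_{X=1}^{x'} \sum_{Y=1}^{y'}
\bigl( I_{BK(U,V)}(X,Y) - \bar{I}_{BK(U,V)} \bigr)^2
```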
wherein fV(U,V) is the macroblock variance data corresponding to the macroblock BK(U,V), IBK(U,V)(X,Y) is the pixel data with the coordinates (X,Y) in the macroblock BK(U,V), and ĪBK(U,V) is the mean of the x′×y′ pieces of pixel data in the macroblock BK(U,V).
The parameter module 230 further normalizes the macroblock variance data fV(1,1) to fV(u,v) into 0 to 255, and adopts the median filter and the Gaussian filter to smooth the macroblock variance data.
For example, when the normalized macroblock variance data fV(1,1) to fV(u,v) have low values (e.g., approaching 0), the corresponding image variance is small, the edge sharpness information is low, and the corresponding depth is large. When they have high values (e.g., approaching 255), the corresponding image variance is large, the edge sharpness information is high, and the corresponding depth is small.
The depth calculating module 200 generates the u×v pieces of macroblock depth data Dd2(1,1) to Dd2(u,v) in the depth data Dd2 from the macroblock motion parallax data fM(1,1) to fM(u,v), the macroblock variance data fV(1,1) to fV(u,v) and the weighting coefficients φ1 and φ2, according to the following equation:
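A presumable form of this weighted-sum equation is:

```latex
Dd2(U,V) = \varphi_1 \, f_M(U,V) + \varphi_2 \, f_V(U,V)
```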
In one example, the depth calculating apparatus 2 further includes a weighting coefficient module 240 for deriving the weighting coefficients φ1 and φ2 according to true depth values g(1,1) to g(u,v), the macroblock variance data fV(1,1) to fV(u,v) and the macroblock motion parallax data fM(1,1) to fM(u,v). For example, the true depth values g(1,1) to g(u,v) are true depth results shot by a depth camera. The weighting coefficient module 240 obtains least-squares solutions of the weighting coefficients φ1 and φ2 through a pseudo-inverse matrix. Thus, the depth calculating module 200 may generate an improved solution of the depth data Dd2 according to the macroblock variance data fV(1,1) to fV(u,v) and the macroblock motion parallax data fM(1,1) to fM(u,v).
For example, the weighting coefficient module 240 generates the weighting coefficients φ1 and φ2 according to the following equation:
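A presumable least-squares form, writing A for the (u·v)×2 matrix stacking the two cues and g for the vector of true depth values (this notation is assumed here), is:

```latex
\begin{bmatrix} \varphi_1 \\ \varphi_2 \end{bmatrix}
= \left( A^{\mathsf{T}} A \right)^{-1} A^{\mathsf{T}} \mathbf{g},
\qquad
A = \begin{bmatrix}
f_M(1,1) & f_V(1,1) \\
\vdots & \vdots \\
f_M(u,v) & f_V(u,v)
\end{bmatrix},
\quad
\mathbf{g} = \begin{bmatrix} g(1,1) \\ \vdots \\ g(u,v) \end{bmatrix}
```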
Although this exemplary embodiment is illustrated with the parameter module 230 obtaining the parameter data associated with atmospheric perspective by calculating the macroblock variance data fV(1,1) to fV(u,v), the parameter module 230 of this exemplary embodiment is not limited thereto. In another example, the parameter module 230 may also generate the parameter data associated with atmospheric perspective by calculating macroblock contrast data, as shown in
In this example, the parameter module 230′ generates the macroblock contrast data fC(1,1) to fC(u,v) corresponding to each of the macroblocks BK(1,1) to BK(u,v) through the following equation:
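A presumable form is a Michelson-type contrast (this particular ratio is an assumption; the text defines only IMAX and IMIN):

```latex
f_C(U,V) = \frac{I_{MAX}(U,V) - I_{MIN}(U,V)}{I_{MAX}(U,V) + I_{MIN}(U,V)}
```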
wherein IMAX(U,V) is the maximum pixel data value in the macroblock BK(U,V), and IMIN(U,V) is the minimum pixel data value in the macroblock BK(U,V). For example, the parameter module 230′ further normalizes the macroblock contrast data fC(1,1) to fC(u,v) into the range of 0 to 255, and smooths the macroblock contrast data using the median filter and the Gaussian filter.
For example, when the normalized macroblock contrast data fC(1,1) to fC(u,v) have low values (e.g., approaching 0), the corresponding image contrast is low, the edge sharpness information is low, and the corresponding depth is large. When they have high values (e.g., approaching 255), the corresponding image contrast is high, the edge sharpness information is high, and the corresponding depth is small.
The weighting coefficient module 240′ and the depth calculating module 200′ generate the weighting coefficients φ1 and φ2 as well as the depth data Dd2′ according to the operations substantially the same as the operations of the weighting coefficient module 240 and the depth calculating module 200.
In this exemplary embodiment, only the condition that the depth calculating apparatus 2 includes one parameter module 230 or 230′ for generating another set of parameter data is illustrated as an example. However, the depth calculating apparatus 2 of this exemplary embodiment is not limited thereto. In another example, the depth calculating apparatus 2″ simultaneously includes two parameter modules 230 and 230′ for correspondingly providing two sets of atmospheric perspective parameter data (macroblock contrast data fC(1,1) to fC(u,v) and macroblock variance data fV(1,1) to fV(u,v)), as shown in
In this exemplary embodiment, only the condition that the depth calculating apparatus 2 calculates the depth data according to another set of atmospheric perspective parameter data is described as an example. However, the depth calculating apparatus of this exemplary embodiment is not limited thereto. In another example, a depth calculating apparatus 3 adopts another parameter module 330 to generate another set of parameter data associated with the texture gradient of the frame data Fd1 according to the frame data Fd1, as shown in
Generally speaking, when the distance between an object and the camera increases, the texture of the object correspondingly changes from clear to blurred. Consequently, the parameter module 330 generates the macroblock texture gradient data fT(1,1) to fT(u,v) respectively corresponding to the macroblocks BK(1,1) to BK(u,v) by analyzing the intensity of the texture energy in the image. Therefore, the depth calculating module 300 may adopt the macroblock texture gradient data as a clue to perform depth estimation on the frame data Fd1.
For example, the parameter module 330 applies eight 3×3 Laws' masks L3E3, L3S3, E3L3, E3E3, E3S3, S3L3, S3E3 and S3S3 to each of the 64 pieces of pixel data in each of the macroblocks BK(1,1) to BK(u,v) to generate eight pieces of sub-texture gradient data with respect to each piece of pixel data.
The eight Laws' masks respectively satisfy the following matrices:
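(The mask matrices are reconstructed here from the standard Laws formulation, in which each 3×3 mask is the outer product of two of the 1-D vectors L3 (level), E3 (edge) and S3 (spot); this reconstruction is an assumption.)

```latex
L3 = \begin{bmatrix} 1 & 2 & 1 \end{bmatrix},\quad
E3 = \begin{bmatrix} -1 & 0 & 1 \end{bmatrix},\quad
S3 = \begin{bmatrix} -1 & 2 & -1 \end{bmatrix}

L3E3 = L3^{\mathsf{T}} E3 =
\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
S3S3 = S3^{\mathsf{T}} S3 =
\begin{bmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{bmatrix}
```

The remaining masks L3S3, E3L3, E3E3, E3S3, S3L3 and S3E3 are formed analogously as outer products of the corresponding vector pairs.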
The parameter module 330 further accumulates the eight pieces of sub-texture gradient data to obtain one piece of texture gradient data with respect to each piece of pixel data. The parameter module 330 further calculates the number of pieces of pixel data having the texture gradient data greater than a texture gradient data threshold value in each macroblock with respect to the macroblocks BK(1,1) to BK(u,v), respectively, to generate the corresponding macroblock texture gradient data fT(1,1) to fT(u,v). For example, the operation of the parameter module 330 may be expressed by the following equations:
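A presumable form of these equations, based on the definitions that follow, is:

```latex
f_{iT}(X,Y) = \left| \sum_{s=-1}^{1} \sum_{t=-1}^{1} w_i(s,t)\, I(X+s,\,Y+t) \right|,
\qquad i = 1, \dots, 8

f_T(U,V) = \sum_{(X,Y) \in BK(U,V)} U\!\left( \sum_{i=1}^{8} f_{iT}(X,Y) - T_{d1} \right)
```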
wherein fiT(X,Y) is the ith piece of sub-texture gradient data corresponding to the pixel data I(X,Y); wi(s,t) is the mask parameter at the position (s,t) in the ith Laws' mask; I(X+s,Y+t) is the pixel data on which the filter operation is performed through the Laws' mask when the sub-texture gradient data fiT(X,Y) corresponding to the pixel data I(X,Y) is being calculated; Td1 is the texture gradient data threshold value; and U(·) is a unit step function. When the texture gradient data (obtained by accumulating the corresponding eight pieces of sub-texture gradient data) of a piece of pixel data is greater than the texture gradient data threshold value, this piece of pixel data has high texture gradient energy and the unit step function has the value 1. When the texture gradient data of this piece of pixel data is smaller than or equal to the texture gradient data threshold value, this piece of pixel data has low texture gradient energy and the unit step function has the value 0. Thereafter, accumulating the unit step function values corresponding to the same macroblock gives the number of pieces of pixel data, among the 64 pieces of pixel data of the macroblock, whose texture gradient data are greater than the texture gradient data threshold value, so that the corresponding macroblock texture gradient data may be generated.
For example, the parameter module 330 further normalizes the u×v pieces of macroblock texture gradient data fT(1,1) to fT(u,v) into 0 to 255, and smoothes the u×v pieces of macroblock texture gradient data using the median filter and the Gaussian filter.
For example, when the normalized macroblock texture gradient data fT(1,1) to fT(u,v) have low values (e.g., approaching 0), the texture gradient energy of the corresponding image is low and the corresponding depth is large. When they have high values (e.g., approaching 255), the texture gradient energy of the corresponding image is high and the corresponding depth is small.
Thereafter, similar to the depth calculating module 200, the depth calculating module 300 generates the depth data Dd3 according to the weighting coefficients φ1 and φ2, the macroblock motion parallax data fM(1,1) to fM(u,v), and the macroblock texture gradient data fT(1,1) to fT(u,v) corresponding to the same macroblock.
In another example, as shown in
The depth calculating module 300′ may respectively determine the weighting coefficients of the macroblock motion parallax data fM(1,1) to fM(u,v), the macroblock variance data fV(1,1) to fV(u,v), the macroblock contrast data fC(1,1) to fC(u,v) and the macroblock texture gradient data fT(1,1) to fT(u,v) corresponding to the same macroblock according to the weighting coefficients φ1, φ2, φ3 and φ4, as listed in the following equation:
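A presumable form, writing Dd(U,V) for the resulting macroblock depth data (the output name is assumed), is:

```latex
Dd(U,V) = \varphi_1 f_M(U,V) + \varphi_2 f_V(U,V) + \varphi_3 f_C(U,V) + \varphi_4 f_T(U,V)
```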
Different from the depth calculating method of the first exemplary embodiment, the depth calculating method of this exemplary embodiment generates the depth data with reference to the motion parallax data and further to some or all of the parameter data associated with atmospheric perspective and texture gradient. Consequently, compared with the conventional depth data generating method, the depth calculating method of this embodiment has the advantage of enhancing the precision of the depth data.
The depth calculating method of this exemplary embodiment classifies the input video data into several video types with reference to the motion activity of all the frame data of the input video data and the complexity level of the frame background. The depth calculating method then adopts different calculating operations to obtain the depth data according to the video type to which the input video data pertains.
The video classifying module 450 includes a motion activity analyzing module 452, a background complexity analyzing module 454 and a depth repairing module 456. The motion activity analyzing module 452 calculates the summed motion activity data Smd of all the J pieces of frame data of the input video data Vdi. For example, the operation of the motion activity analyzing module 452 may be shown in the flow chart of
First, as shown in step (f1), the motion activity analyzing module 452 calculates the x×y pieces of pixel data differences between the x×y pieces of pixel data in the jth piece of frame data of the J pieces of frame data and the x×y pieces of pixel data at the same positions in the (j−1)th piece of frame data, wherein j is a natural number smaller than or equal to J, and the initial value of j is 1. Next, as shown in step (f2), the motion activity analyzing module 452 counts how many of the x×y pieces of pixel data differences are greater than a pixel data difference threshold value, and divides this count by the number of pieces of pixel data in one piece of frame data (i.e., the value x×y) to generate the jth piece of difference data Smd(j). For example, the operations of steps (f1) and (f2) may be expressed by the following equation (Td2 is the pixel data difference threshold value):
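A presumable form, using the unit step function U(·) defined elsewhere in this description, is:

```latex
Smd(j) = \frac{1}{x \times y} \sum_{X=1}^{x} \sum_{Y=1}^{y}
U\!\left( \bigl| I(X,Y,j) - I(X,Y,j-1) \bigr| - T_{d2} \right)
```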
Next, as shown in step (f3), the motion activity analyzing module 452 increments j and repeats the steps (f1) and (f2) J times to correspondingly obtain the J pieces of difference data Smd(1) to Smd(J). Thereafter, as shown in step (f4), the motion activity analyzing module 452 multiplies the sum of the J pieces of difference data Smd(1) to Smd(J) by the coefficient 1/J to obtain the summed motion activity data Smd. For example, the operations of steps (f3) and (f4) may be expressed by the following equation:
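A presumable form is:

```latex
Smd = \frac{1}{J} \sum_{j=1}^{J} Smd(j)
```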
The depth repairing module 456 judges whether the summed motion activity data Smd is greater than a summed motion activity data threshold value. If so, the depth repairing module 456 judges the input video data Vdi as pertaining to the video data with a high motion activity. If not, the depth repairing module 456 judges the input video data Vdi as pertaining to the video data with a low motion activity.
The background complexity analyzing module 454 calculates the background complexity data Bcd of the J pieces of frame data. In one example of the operation of calculating the background complexity data Bcd, the background complexity analyzing module 454 may selectively perform the calculation with reference to the entire video data in each piece of frame data or some regions (e.g., the upper half portion) of the video data in each of the J pieces of frame data.
For example, the operation performed by the background complexity analyzing module 454 is shown in
Next, as shown in step (h2), the background complexity analyzing module 454 increments j and performs the step (h1) J times to obtain u×v pieces of macroblock texture gradient data with respect to each of the J pieces of frame data. Thereafter, as shown in step (h3), the background complexity analyzing module 454 counts how many of the J×u×v pieces of macroblock texture gradient data are greater than a texture gradient data threshold value, and divides this count by the value J×u×v to calculate the background complexity data Bcd. For example, the operations of the steps (h1) to (h3) may be expressed by the following equation:
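A presumable form, consistent with the definitions that follow, is:

```latex
Bcd = \frac{1}{J \times u \times v} \sum_{j=1}^{J} \sum_{U=1}^{u} \sum_{V=1}^{v}
U\!\left( f_T(U,V,j) - T_t \right)
```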
wherein Tt is the macroblock texture gradient data threshold value, and U(fT(U,V,j)−Tt) is the unit step function. When the macroblock texture gradient data fT(U,V,j) is greater than the texture gradient data threshold value Tt, the value of the unit step function is 1. When the macroblock texture gradient data fT(U,V,j) is smaller than or equal to the texture gradient data threshold value Tt, the value of the unit step function is 0. Thereafter, the J×u×v unit step function values are accumulated and then divided by the value J×u×v, so that the background complexity data Bcd is obtained.
The depth repairing module 456 further judges whether the background complexity data is greater than a background complexity data threshold value. If so, the depth repairing module 456 judges the input video data Vdi as pertaining to the video data with the high background complexity. If not, the depth repairing module 456 judges the input video data Vdi as pertaining to the video data with the low background complexity.
In one example, the operations performed by the video classifying module 450 are listed in the flow chart of
When the input video data Vdi pertains to the video type I, the frame of the input video data Vdi has the characteristics of high motion activity and low background complexity. In this case, the depth repairing module 456 acquires and repairs the depth data corresponding to the foreground in the input video data Vdi to obtain improved foreground depth data.
In one example, when the input video data Vdi pertains to the video type I, the operation flow chart of the depth repairing module 456 is shown in
In the step (j), the operation of generating the foreground block data Dfd according to the depth data Dd4 may be achieved according to various data processing methods. For example, the depth repairing module 456 binarizes the depth data Dd4 according to a pixel data threshold value, so that the depth data with depth values greater than the pixel data threshold value in the depth data Dd4 are classified into the foreground block data Dfd.
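A minimal Python sketch of this binarization (the threshold value is an illustrative assumption):

```python
import numpy as np

def foreground_block_data(depth, threshold=128):
    """Binarize a depth map into foreground block data Dfd.

    depth: 2D array of depth values in 0..255 (a larger value means a
    smaller depth, i.e. closer to the camera). The threshold of 128 is an
    illustrative stand-in for the pixel data threshold value.
    """
    return (depth > threshold).astype(np.uint8)  # 1 marks the foreground block
```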
In the step (j), after the foreground block data Dfd is generated, the depth repairing module 456 may further perform several video processing techniques to correct the foreground block data Dfd. For example, the depth repairing module 456 may eliminate the noise influence using the mathematical morphology technique, the connected component labeling technique, the region removal method and the hole filling method so that the foreground profile corresponding to the foreground block data Dfd becomes smooth.
The depth repairing module 456 further adopts the object segmentation technique to correct the foreground block data Dfd with reference to the object information in the input video data Vdi so that the foreground profile of the foreground block data may correspond to the object profile in the actual input video data Vdi. For example, the object segmentation technique may be the Delaunay triangulation technique or the mean shift segmentation technique.
In one example, in the step (j), the depth repairing module 456 sequentially performs the steps (j1) to (j6) to sequentially adopt the binarization technique, the mathematical morphology technique, the connected component labeling, the region removal method, the hole filling method and the object segmentation method to correct the foreground block data Dfd, as shown in
In step (k), for example, the depth repairing module 456 assigns the macroblock depth values in the depth data Dd4 as the depth values corresponding to the foreground block. In one example, the depth repairing module 456 assigns to the foreground block data Dfd the depth values that are greater than or equal to a foreground depth threshold value (i.e., depth values corresponding to depths smaller than or equal to the depth corresponding to the threshold value). When the depth value corresponding to any macroblock in the foreground region is smaller than the foreground depth threshold value (i.e., the corresponding depth is greater than the depth corresponding to the threshold value), the depth repairing module 456 corrects the depth value of this macroblock according to the interpolation result of the peak values (higher depth values corresponding to smaller depths) of the depth data of the neighboring macroblocks around this macroblock. Thus, the error condition in which the assigned foreground region has a too-small depth data value (and a too-large depth) can be prevented.
For example, the input video data Vdi is shown in exhibit 1, and the corresponding depth data Dd4 is shown in exhibit 2.
When the input video data Vdi pertains to the video type II, the frame of the input video data Vdi has low motion activity (whether the background complexity is high or low). In this case, the depth repairing module 456 judges the motion activity condition of each of the x×y pieces of pixel data in the input video data Vdi with reference to a portion of k pieces of continuous frame data in the input video data Vdi. Thereafter, the depth data Dd4 is corrected according to the motion activity obtained with reference to the k pieces of continuous frame data.
In one example, when the input video data Vdi pertains to the video type II, the operation flow chart of the depth repairing module 456 is shown in
Next, as shown in step (i′), the depth repairing module 456 determines the foreground block data Dfd according to the x×y motion data Md(1,1) to Md(x,y). Thereafter, as shown in step (j′), the depth repairing module 456 generates the repaired depth data according to the depth data Dd4 and the foreground block data Dfd.
For example, in step (h′), the depth repairing module 456 determines the value k according to the value of the summed motion activity data Smd. In one example, the value k is a function of the summed motion activity data Smd; the depth repairing module 456 calculates the summed motion activity data Smd through, for example, the equation given above, and then derives the value k from it.
For example, in the step (h′), the depth repairing module 456 calculates the corresponding motion data Md(1,1) to Md(x,y) by accumulating k pieces of pixel data motion activities corresponding to each of the x×y pixel data positions in the k pieces of frame data, as shown in
Next, as shown in step (h3′), the depth repairing module 456 increments z and repeats the step (h2′) k times to obtain the x×y pieces of pixel data motion activities with respect to each of the k pieces of frame data. Then, as shown in step (h4′), the depth repairing module 456 accumulates the k pieces of pixel data motion activities corresponding to each of the x×y pixel data positions to obtain the corresponding accumulated pixel data motion activities Cd(1,1) to Cd(x,y) with respect to each of the x×y pixel data positions. For example, the operations of the steps (h2′) to (h4′) may be expressed by the following equation:
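A presumable form (assuming the per-frame motion activity is the absolute inter-frame pixel difference) is:

```latex
Cd(X,Y) = \sum_{t=2}^{k} \bigl| I(X,Y,t) - I(X,Y,t-1) \bigr|
```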
wherein I(X,Y,t) and I(X,Y,t−1) are respectively the pixel data at the pixel data position (X,Y) in the current piece of frame data and in the previous piece of frame data.
Thereafter, as shown in step (h5′), the depth repairing module 456 respectively determines the x×y pieces of motion data Md(1,1) to Md(x,y) corresponding to the x×y pixel data positions in the k pieces of frame data according to x×y pieces of accumulated pixel data motion activities Cd(1,1) to Cd(x,y). For example, the depth repairing module 456 judges whether each of the pieces of accumulated pixel data motion activities Cd(1,1) to Cd(x,y) is greater than a motion activity threshold value. If so, the corresponding motion data Md(1,1) to Md(x,y) are set to indicate that the k pieces of pixel data at the corresponding pixel positions have a specific motion activity. If not, the corresponding motion data Md(1,1) to Md(x,y) are set to indicate that the k pieces of pixel data at the corresponding pixel positions have the zero motion activity.
For example, in the step (h′), the depth repairing module 456 further adopts the hole filling technique, the mathematical morphology technique and the block removing technique to correct the x×y pieces of motion data Md(1,1) to Md(x,y), thereby eliminating the noise influence and making them smoother.
For example, in the step (h′), the depth repairing module 456 further detects whether any one piece of frame data in the k pieces of frame data encounters a temporary static state, and thus corrects the x×y pieces of pixel data motion activities corresponding to the frame data. For example, the depth repairing module 456 performs the steps (h6′) to (h8′) to perform the operation of detecting the temporary static state, as shown in
As shown in step (h6′), the depth repairing module 456 determines the motion pixel ratio data, which indicates the ratio of the motion pixels to the overall frame data in each piece of frame data, according to corresponding x×y pieces of pixel data motion activities with respect to each of the k pieces of frame data. Next, as shown in step (h7′), the depth repairing module 456 judges whether the mth piece of frame data is in the temporary static state according to the mth piece of motion pixel ratio data corresponding to the mth piece of frame data and the (m−1)th piece of motion pixel ratio data corresponding to the (m−1)th piece of frame data in the k pieces of motion pixel ratio data.
For example, the depth repairing module 456 judges whether the mth piece of frame data encounters the temporary static state by judging whether a ratio difference between the ratios indicated by the mth and (m−1)th pieces of motion pixel ratio data is greater than a ratio threshold value. When the ratio difference between the ratios indicated by the mth and (m−1)th pieces of motion pixel ratio data is smaller than or equal to the ratio threshold value, the depth repairing module 456 judges the mth piece of frame data as being in a dynamic state, and does not correct the x×y pieces of pixel data motion activities of the mth piece of frame data. Thereafter, the step (i′) is performed.
When the ratio difference between the ratios indicated by the mth and (m−1)th pieces of motion pixel ratio data is greater than the ratio threshold value, the depth repairing module 456 judges the mth piece of frame data as being in the temporary static state, and sets the x×y pieces of pixel data motion activities of the mth piece of frame data to be equal to the x×y pieces of pixel data motion activities of the (m−1)th piece of frame data. Then, the step (i′) is performed.
In one example, the depth repairing module 456 sequentially performs the steps (h1′), (h2′) to (h5′), (h9′) to (h11′) and (h6′) to (h8′) in the step (h′), as shown in
In one example, in the step (i′), the depth repairing module 456 further adopts the object segmentation technique to generate the corresponding object data according to the k pieces of frame data, and correct the foreground block data Dfd according to the object data. For example, the object segmentation technique may be the Delaunay triangulation technique or the mean shift segmentation technique. In another example, in the step (i′), the depth repairing module 456 further adopts the profile smoothing technique to correct the foreground block data Dfd.
For example, in the step (i′), the depth repairing module 456 sequentially performs the steps (i1′) and (i2′) to respectively adopt the object segmentation technique and the profile smoothing technique to correct the foreground block data Dfd, as shown in
For example, the input video data Vdi is shown in exhibit 3, and the corresponding depth data Dd4 is shown in exhibit 4.
In one example, when the input video data Vdi pertains to the video type III, it is very difficult to effectively acquire the foreground data. Therefore, when the input video data Vdi pertains to the video type III, the depth repairing module 456 does not correct the depth data Dd4 but directly outputs the depth data Dd4 as the repaired depth data Dd4′.
For example, the input video data Vdi is shown in exhibit 5, and the corresponding depth data Dd4 is shown in exhibit 6.
In one example, the depth repairing module 456 performs the foreground depth estimation operation described above, and further performs depth estimation on the background of each frame with reference to the vanishing point data in each frame of the input video data Vdi. Consequently, the depth repairing module 456 may further repair the depth data corresponding to the background frame with a smaller depth (e.g., the background frame corresponding to the floor) to obtain ideal depth data Dd4′ with respect to each frame.
For example, the background depth estimation operation performed by the depth repairing module 456 is shown in
In one example, the step (m) includes steps (m1) and (m2). As shown in the step (m1), the depth repairing module 456 finds the highest foreground block underline data, which indicates the highest foreground block underline position, according to the foreground block data Dfd. For example, the underline positions of the foreground blocks Fa1 and Fa2 are respectively the q1th pixel row position and the q2th pixel row position, wherein q1 and q2 are natural numbers greater than p, q1 is smaller than q2, and the highest underline position is the q1th pixel row position, for example.
Next, as shown in step (m2), the depth repairing module 456 finds the vanishing point position of the frame according to the highest foreground block underline data. For example, the highest underline position (the q1th pixel row position) has the height h relative to the lowest pixel row position (the xth pixel row position) of the frame, and the vanishing point position (the pth pixel row position) has the height equal to h/w relative to the highest underline position, wherein w is a natural number greater than 1. In one example, w is equal to 10, and the height of the vanishing point position relative to the highest underline position is equal to h/10.
Thereafter, as shown in step (n), the depth repairing module 456 corrects the background block data Dbd according to the vanishing point data. For example, in the background region from the vanishing point position (i.e., the pth pixel row position) to the lowest pixel row position (i.e., the xth pixel row position) of the frame (e.g., spanning 255 rows of pixel data), the depth repairing module 456 fills ascending depth values from the depth value corresponding to the maximum depth (i.e., the value 0) to the depth value corresponding to the minimum depth (i.e., the value 255), as shown in
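A minimal Python sketch of this background depth filling (function names and the array layout are assumptions):

```python
import numpy as np

def fill_background_depth(depth, background_mask, vanish_row):
    """Fill ascending background depth values below the vanishing point.

    depth: 2D depth map with values 0..255 (larger value = smaller depth).
    background_mask: boolean array, True where a pixel belongs to the
    background block data Dbd.
    vanish_row: index p of the pixel row containing the vanishing point.
    Rows at or above the vanishing point receive 0 (maximum depth); the
    values then ramp linearly to 255 (minimum depth) at the bottom row.
    """
    rows, cols = depth.shape
    ramp = np.zeros(rows)
    if rows - 1 > vanish_row:
        ramp[vanish_row:] = np.linspace(0.0, 255.0, rows - vanish_row)
    corrected = depth.astype(np.float64).copy()
    row_values = np.repeat(ramp[:, None], cols, axis=1)
    corrected[background_mask] = row_values[background_mask]
    return corrected
```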
In one example, the motion activity analyzing module 452 may also divide the J pieces of frame data of the input video data Vdi into several portions and perform the summed motion activity calculation on each portion, and the depth repairing module 456 correspondingly classifies each portion of the input video data Vdi according to its video properties.
Similarly, the background complexity analyzing module 454 of this exemplary embodiment may also perform the corresponding background complexity calculation on a portion of the frame data in the input video data Vdi, or divide the input video data Vdi into many portions and respectively calculate the background complexity data many times.
Different from the depth calculating methods of the first and second exemplary embodiments, the depth calculating method of this exemplary embodiment further analyzes the motion activity of the input video data and the complexity level of the frame background to classify the possible input video data into three video types. The depth calculating method of this exemplary embodiment further performs different foreground depth data repair operations on the input video data pertaining to different video types according to the classification result. Thus, compared with the conventional depth data generating method, the depth calculating method of this exemplary embodiment has the advantages of performing different adaptive foreground depth repair operations on input video data having different properties, and of enhancing the precision of the depth data.
In addition, the depth calculating method of this exemplary embodiment may further perform the background depth correction on the input video data according to the vanishing point data. Thus, compared with the conventional depth data generating method, the depth calculating method of this embodiment further has the advantages of performing the correction operation on the background depth of the input video data and of enhancing the precision of the depth data.
The depth calculating method of each exemplary embodiment of the disclosure is for generating the depth data of the corresponding frame data. The depth calculating method of the disclosure may further perform the application operation of generating a stereoscopic (two-view) three-dimensional (3D) video signal or a multi-view video signal, or the application operation of encoding and decoding the 3D video signal, according to the depth data.
While the disclosure has been described by way of examples and in terms of disclosed embodiments, it is to be understood that the disclosure is not limited thereto. On the contrary, it is intended to cover various modifications and similar arrangements and procedures, and the scope of the appended claims therefore should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements and procedures.
Number | Date | Country | Kind
---|---|---|---
98102973 | Jan 2009 | TW | national