This application claims priority of Taiwanese Application No. 103127056, filed on Aug. 7, 2014.
The invention relates to a method and an image processing device that are configured to efficiently generate image depth maps for image frames included in a video.
Recently, movies and television have developed toward presentation in a three-dimensional (3D) video format. Conventionally, a 3D image at a virtual point of view is created by using a two-dimensional (2D) image with an existing visual angle and a corresponding depth map. The depth map is a grey-scale image in which luminance intensity represents depth (i.e., distance from the camera).
Conventionally, technology for generating a depth map requires acquiring a plurality of images having parallax thereamong for analysis. Additionally, each of the images needs to be processed in its entirety, resulting in a relatively slow processing speed. Therefore, when filming a video in which a plurality of image frames need to be displayed per second, either a processor with better performance must be employed (resulting in higher hardware costs), or the percentage of image frames of the video that are processed may need to be reduced (adversely affecting the quality of the subsequently generated 3D images).
Therefore, an object of the present invention is to provide a method that enables efficient generation of image depth maps.
Accordingly, a method of this invention is for generating image depth maps for image frames included in a video, is to be implemented using an image processing device, and may include the following steps of:
(a) acquiring, by the image processing device, a full depth map of an nth image frame of the video, wherein n is a positive integer;
(b) finding, using a difference spotting module of the image processing device, a to-be-processed region of an (n+m)th image frame of the video, the to-be-processed region differing from a corresponding region of the nth image frame of the video, wherein m is a positive integer;
(c) processing, using a depth map generating module of the image processing device, the to-be-processed region of the (n+m)th image frame of the video for generating a partial depth map that corresponds to the to-be-processed region; and
(d) generating, using a depth map composing module of the image processing device, a full depth map of the (n+m)th image frame of the video by using the partial depth map to replace a part of the full depth map of the nth image frame of the video that corresponds in position to the partial depth map.
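By way of illustration only, steps (a) through (d) above may be sketched as follows, assuming grayscale image frames held as NumPy arrays. The function names, the threshold value, and the placeholder depth estimator are illustrative assumptions and are not taken from the specification.

```python
import numpy as np

def estimate_depth(region: np.ndarray) -> np.ndarray:
    # Placeholder for step (c): a real depth estimator would run here.
    # For illustration, each pixel receives the region's mean intensity.
    return np.full(region.shape, region.mean(), dtype=float)

def depth_map_for_next_frame(frame_n: np.ndarray,
                             full_depth_n: np.ndarray,
                             frame_nm: np.ndarray,
                             threshold: float = 10.0) -> np.ndarray:
    """Derive the (n+m)th full depth map from the nth frame's full depth map."""
    # Step (b): locate the to-be-processed region as the bounding box of
    # pixels whose values differ by more than the threshold.
    changed = np.abs(frame_nm.astype(float) - frame_n.astype(float)) > threshold
    if not changed.any():
        # Frames are substantially identical; reuse the previous depth map.
        return full_depth_n.copy()
    ys, xs = np.where(changed)
    top, bottom = ys.min(), ys.max() + 1
    left, right = xs.min(), xs.max() + 1
    # Step (c): generate a partial depth map for the region only.
    partial = estimate_depth(frame_nm[top:bottom, left:right])
    # Step (d): replace the corresponding part of the nth full depth map.
    full = full_depth_n.astype(float).copy()
    full[top:bottom, left:right] = partial
    return full
```

Note that only the changed region is handed to the depth estimator; the remainder of the new full depth map is copied verbatim from the previous one.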
Another object of the present invention is to provide an image processing device that is configured to execute steps of the method of this invention.
Accordingly, an image processing device of this invention may include a storage medium and an image processor coupled to the storage medium. The storage medium stores a video therein. The video includes a plurality of consecutive image frames. The image processor is configured to:
acquire a full depth map of an nth image frame of the video, wherein n is a positive integer;
find a to-be-processed region of an (n+m)th image frame of the video, the to-be-processed region differing from a corresponding region of the nth image frame of the video, wherein m is a positive integer;
process the to-be-processed region of the (n+m)th image frame of the video for generating a partial depth map that corresponds to the to-be-processed region; and
generate a full depth map of the (n+m)th image frame of the video by using the partial depth map to replace a part of the full depth map of the nth image frame of the video that corresponds in position to the partial depth map.
Other features and advantages of the present invention will become apparent in the following detailed description of the embodiment with reference to the accompanying drawings, of which:
Referring to
The image processing device 1 includes a storage medium 11 that stores therein the video received from the image capturing device 2 or the computer device, and an image processor 12 coupled electrically to the storage medium 11.
The image processor 12 includes a depth map generating module 121, a difference spotting module 122 and a depth map composing module 123. In this embodiment, the depth map generating module 121, the difference spotting module 122 and the depth map composing module 123 are realized using a software or firmware program that resides in the image processor 12 and that includes instructions for execution by the image processor 12 of the image processing device 1.
The depth map generating module 121 is configured to generate a full depth map (as illustrated in
It is noted that a mechanism employed by the depth map generating module 121 in generating the full depth map and the partial depth map may be one similar to that disclosed in Taiwanese Patent Application No. 103110698, the entire disclosure of which is incorporated herein by reference.
The difference spotting module 122 is configured to compare two image frames of the video, and to find a region of one of the image frames of the video that differs from a corresponding region of the other one of the image frames of the video.
The difference spotting module 122 may determine whether the two image frames of the video are different by examining, pixel by pixel, the color pixel values of corresponding pixels on each of the two image frames of the video.
For example, when it is determined that a difference between the color pixel value of a pixel on a first image frame and that of a pixel at the corresponding position on a second image frame is larger than a predetermined threshold, the difference spotting module 122 may deem the two pixels as “different”.
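The pixel-by-pixel comparison just described may be sketched as follows using NumPy. The function name and the threshold value are illustrative assumptions; the specification does not prescribe a particular color metric.

```python
import numpy as np

def spot_differences(frame_a: np.ndarray, frame_b: np.ndarray,
                     threshold: float = 30.0) -> np.ndarray:
    """Return a boolean mask marking pixels whose color values differ
    by more than `threshold` between two same-sized frames."""
    # Sum absolute per-channel differences so that a change in any
    # single color channel can push a pixel over the threshold.
    diff = np.abs(frame_a.astype(np.int32) - frame_b.astype(np.int32))
    if diff.ndim == 3:          # color frames: reduce over the channel axis
        diff = diff.sum(axis=-1)
    return diff > threshold
```

A pair of frames that differ in a single pixel would thus yield a mask with exactly one marked position.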
The depth map composing module 123 is configured to compose a full depth map of a particular image frame of the video by using a partial depth map of the particular image frame of the video to replace a part of the full depth map of another image frame of the video, the replaced part corresponding in position to the partial depth map.
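The composing operation described above amounts to pasting the partial depth map over the matching region of the earlier full depth map. A minimal sketch follows; the coordinate convention (top-left corner plus the patch's own shape) is an assumption for illustration.

```python
import numpy as np

def compose_full_depth_map(prev_full: np.ndarray,
                           partial: np.ndarray,
                           top: int, left: int) -> np.ndarray:
    """Replace the part of `prev_full` located at (top, left) with `partial`."""
    full = prev_full.copy()                     # keep the previous map intact
    h, w = partial.shape
    full[top:top + h, left:left + w] = partial  # region-wise replacement
    return full
```

Because the previous full depth map is copied rather than modified in place, it remains available for composing further frames.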
Referring to
In step S1, the image processor 12 of the image processing device 1 acquires an nth image frame of the video stored in the storage medium 11. In this embodiment, n is a positive integer. Afterward, the depth map generating module 121 acquires a full depth map Dn of the nth image frame of the video, as shown in
In this embodiment, the depth map generating module 121 is configured to generate the full depth map Dn of the nth image frame of the video by processing all regions of the nth image frame of the video.
Afterward, the image processor 12 attempts to generate full depth maps for the subsequent image frames of the video. That is, the image processor 12 attempts to generate full depth maps for the (n+m)th image frames of the video, where m is a positive integer. For example, for m=1, the (n+m)th image frame of the video is one that immediately succeeds the nth image frame of the video.
The difference spotting module 122 is configured to compare the nth image frame of the video and the (n+1)th image frame of the video to find differences therebetween.
When no significant difference is found (i.e., the nth image frame of the video is substantially identical to the immediately succeeding image frame), the full depth map generated for the nth image frame of the video may be taken directly as the full depth map of the (n+1)th image frame of the video. Subsequently, in generating the full depth map of the (n+2)th image frame of the video, the difference spotting module 122 compares the (n+1)th image frame of the video and the (n+2)th image frame of the video.
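The frame-chaining behavior just described may be sketched as a generator that reuses the previous full depth map whenever consecutive frames are substantially identical, while always comparing against the most recent frame. The helper `process_pair`, standing in for steps S2 through S4, is an assumed parameter for illustration.

```python
import numpy as np

def depth_maps_for_video(frames, initial_full_depth, process_pair):
    """Yield one full depth map per frame, reusing the previous map
    when consecutive frames do not differ."""
    depth = initial_full_depth
    yield depth                                   # map for the first frame
    prev = frames[0]
    for frame in frames[1:]:
        if np.array_equal(frame, prev):           # no significant difference
            yield depth                           # reuse previous depth map
        else:
            depth = process_pair(prev, depth, frame)
            yield depth
        prev = frame                              # always compare to latest
```

In this sketch, `process_pair` is invoked only for frames that actually differ from their predecessors, which is the source of the claimed processing savings.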
On the other hand, when some significant difference is found (e.g., between the nth image frame of the video and the (n+1)th image frame of the video), in step S2, the difference spotting module 122 finds a to-be-processed region of the (n+1)th image frame of the video.
Specifically, the to-be-processed region of the (n+1)th image frame of the video differs from a corresponding region of the nth image frame of the video.
In step S3, the depth map generating module 121 processes the to-be-processed region V of the (n+1)th image frame of the video, and generates a partial depth map P(n+1) that corresponds to the to-be-processed region, as best shown in
In particular, the depth map generating module 121 is configured to process the to-be-processed region of the (n+m)th image frame of the video, together with neighboring pixels of the to-be-processed region of the (n+m)th image frame of the video, for generating the partial depth map that corresponds to the to-be-processed region.
Then, in step S4, the depth map composing module 123 generates a full depth map D(n+1) of the (n+1)th image frame of the video by using the partial depth map P(n+1) generated in step S3 to replace a part of the full depth map Dn of the nth image frame of the video that corresponds in position to the partial depth map P(n+1), as best shown in
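The inclusion of neighboring pixels around the to-be-processed region can be sketched as growing the region's bounding box by a fixed margin, clamped to the frame boundaries. The margin size here is an illustrative choice; the specification does not fix one.

```python
def expand_region(top: int, bottom: int, left: int, right: int,
                  shape: tuple, margin: int = 2) -> tuple:
    """Grow a (top, bottom, left, right) box by `margin` pixels on each
    side, clamped to the frame boundaries given by `shape`."""
    h, w = shape[:2]
    return (max(top - margin, 0), min(bottom + margin, h),
            max(left - margin, 0), min(right + margin, w))
```

Processing this slightly enlarged region helps the depth values at the edges of the partial depth map blend consistently with the surrounding, reused depth values.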
In cases where n is larger than 1 (i.e., the nth image frame is not the very first image frame of the video), the method may be carried out using a similar procedure.
Specifically, the image processor 12 may first acquire a full depth map of an (n−1)th image frame of the video, and the nth image frame of the video. The difference spotting module 122 then finds a to-be-processed region of the nth image frame of the video, the to-be-processed region of the nth image frame of the video differing from a corresponding region of the (n−1)th image frame of the video.
Afterward, the depth map generating module 121 processes the to-be-processed region of the nth image frame of the video for generating a partial depth map that corresponds to the to-be-processed region of the nth image frame of the video. The depth map composing module 123 in turn generates the full depth map of the nth image frame of the video by using the partial depth map thus generated to replace a part of the full depth map of the (n−1)th image frame of the video that corresponds in position to the partial depth map thus generated.
One potential advantage of the method and the image processing device 1 according to the present invention stems from the observation that, in a video, successive image frames typically differ only slightly from each other, or only in a relatively small region. Therefore, it may not be cost-effective to process all regions of each of the image frames included in the video, since most regions of a particular image frame may have already been processed in previous image frames.
By using the full depth map of a previous one of the image frames of the video as a base, the depth map generating module 121 is required to process only a part of a succeeding one of the image frames of the video rather than processing all regions. As a result, the processing load of the depth map generating module 121 may be dramatically reduced. This may prove especially useful in processing videos with higher frame rates (i.e., more frames per second (FPS)), since the need to employ better hardware and/or to reduce the percentage of the image frames to be processed is eliminated.
While the present invention has been described in connection with what is considered the most practical embodiment, it is understood that this invention is not limited to the disclosed embodiment but is intended to cover various arrangements included within the spirit and scope of the broadest interpretation so as to encompass all such modifications and equivalent arrangements.