These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to the present embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
An aspect of the present invention is described hereinafter with reference to flowchart illustrations of user interfaces, methods, and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, and/or other programmable data processing apparatus to produce a machine or system of machines, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart block or blocks. However, the invention is not limited thereto.
These computer program instructions may also be stored in a computer usable or computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer usable or computer-readable memory produce an article of manufacture including instruction means that implement the function specified in the flowchart block or blocks.
The computer program instructions may also be loaded into a computer or other programmable data processing apparatus (or combination thereof) to cause a series of operational steps to be performed in the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute in the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
Moreover, each block of the flowchart illustrations may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in reverse order, depending upon the functionality involved.
The storage unit 210 stores encoded video, and stores data generated by each component of the video-retrieval apparatus 200. For example, the storage unit 210 stores the edge histogram on each I frame generated by the edge-histogram-generation unit 270. Such a storage unit 210 can be implemented by a nonvolatile memory element such as a cache, ROM, PROM, EPROM, EEPROM, or flash memory, by a volatile memory element such as RAM, or by a storage medium such as an HDD or optical medium, but is not limited thereto. The storage unit 210 can be detachable in addition to or instead of internal storage. However, it is understood that the storage unit 210 need not store the edge histogram for each stored video in all aspects of the invention.
The input unit 220 receives a sample video extracted from a predetermined video (i.e., a query video) which includes at least one I frame. The frame-detection unit 230 detects an I frame from frames included in the query video or the stored video stored in the storage unit 210. The detected I frame is provided to the entropy decoder 240. The entropy decoder 240 entropy-decodes the I frame provided from the frame-detection unit 230. The entropy-decoded I frame is provided to the inverse-quantization unit 250. While not required in all aspects, the input unit 220 can receive the sample video using a drive reading a medium (such as an optical storage medium or a magnetic medium), from a camera, or through a network from a remote medium.
The inverse-quantization unit 250 inverse-quantizes the entropy-decoded I frame. As shown in
Among DCT coefficients on a certain DCT block, AC0, 0 is a coefficient of the DC element, and refers to the average brightness of the DCT block. The remaining coefficients AC0, 1 to AC7, 7 are AC elements that have a certain direction and a certain rate of change, and reflect the change in the gray level value. F(i,j) represents a pixel value at location i,j of the DCT block. AC0, 1 depends on the difference in the horizontal direction between the left side and the right side of the DCT block in the space area. In comparison, AC1, 0 depends on the difference in the vertical direction between the upper side and the lower side of the DCT block in the space area. In other words, the coefficient AC0, 1 represents the edge element in the horizontal direction that is included in the DCT block, and the coefficient AC1, 0 represents the edge element in the vertical direction that is included in the DCT block.
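The relationship between the two lowest-order AC coefficients and the direction of the brightness change can be illustrated with a minimal sketch. The 8x8 block size, the orthonormal scaling, and the names `dct2` and `left_right` are assumptions made for illustration only; the index convention (first index vertical frequency, second index horizontal frequency) follows the description above.

```python
import math

N = 8  # assumed DCT block size (8x8, matching coefficients up to AC7,7)

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):              # vertical frequency index
        for v in range(N):          # horizontal frequency index
            total = 0.0
            for i in range(N):      # pixel row
                for j in range(N):  # pixel column
                    total += (block[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs

# Left half dark, right half bright: brightness changes only horizontally.
left_right = [[0] * 4 + [255] * 4 for _ in range(N)]
c = dct2(left_right)
# c[0][1] is non-zero (left/right difference); c[1][0] is numerically zero.
```

With this particular scaling the DC term equals eight times the mean pixel value of the block, so it is proportional to the average brightness rather than equal to it.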
Further, the determination unit 260 determines whether each DCT block is an edge area based on the DCT coefficients of each DCT block. Specifically, the determination unit 260 determines whether each DCT block includes an edge (i.e., an edge of an image within the DCT block). Here, the variance value of the pixel values of each DCT block can be used as a basis for determining the edge area. The variance value in the DCT area can be acquired from the sum total of the squares of the AC coefficients, excluding the DC element. In other words, in the case where the variance value of a predetermined DCT block is greater than a first critical value, the determination unit 260 determines that the DCT block includes an edge.
In contrast, in the case where the variance value is less than the first critical value, the determination unit 260 determines that the DCT block does not include the edge (i.e., the determination unit 260 determines that the DCT block is a smooth area). In the case where the DCT block is a smooth area, the determination unit 260 determines whether the next DCT block is an edge area.
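The edge-area decision described above can be sketched as follows. This is an illustrative sketch only: the function names and the numeric threshold used in the example are hypothetical, and the sum of squared AC coefficients is used directly as the variance value (for an orthonormal 8x8 DCT this sum equals 64 times the pixel variance, so any scaling is assumed to be absorbed into the first critical value).

```python
def ac_energy(coeffs):
    """Sum of squared AC coefficients, i.e. all coefficients except the DC term at [0][0]."""
    return sum(c * c
               for u, row in enumerate(coeffs)
               for v, c in enumerate(row)
               if (u, v) != (0, 0))

def is_edge_block(coeffs, first_critical_value):
    """True if the block is treated as an edge area, False if it is a smooth area."""
    return ac_energy(coeffs) > first_critical_value

# A block with only a DC term is uniform in brightness (smooth area).
flat = [[0.0] * 8 for _ in range(8)]
flat[0][0] = 1020.0

# Adding low-order AC terms models a block containing a brightness change.
textured = [row[:] for row in flat]
textured[0][1] = -300.0
textured[1][0] = 120.0
```

Because only already-decoded DCT coefficients are used, the decision requires no inverse DCT, which is consistent with the reduced amount of calculation the description aims at.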
As a result, in the case where the DCT block is an edge area (i.e., a portion of the frame having an edge of an image), the determination unit 260 determines the type of the edge that the DCT block includes. First, the determination unit 260 determines whether the edge included in the DCT block is non-directional or directional. Some examples of the directional edge are a horizontal edge, a 45°-direction edge, a vertical edge, and a 135°-direction edge. The determination unit 260 can determine whether each DCT block includes a non-directional edge based on the strength of the AC0, 1 and AC1, 0 coefficients. In other words, where the strength of the edge is less than a second critical value, the determination unit 260 determines that the type of the edge included in the DCT block is a non-directional edge.
Where the edge included in the DCT block is a directional edge, the determination unit 260 determines the type of the directional edge. Here, the type of the directional edge can be determined based on the ratio of AC0, 1 and AC1, 0 among the AC coefficients of each DCT block. R1 and R2, which represent the ratios of AC0, 1 and AC1, 0, can be defined by Equation 2 and Equation 3.
According to an aspect of the invention, each DCT block is partitioned into a first area 410, a second area 420, a third area 430, and a fourth area 440 depending on the values of the defined R1 and R2, as illustrated in
For example, in the case where the ratio of the two coefficients falls within the first area 410 (i.e., R1 is close to infinity and R2 is not close), the determination unit 260 determines that the DCT block includes the vertical edge as shown in
The edge-histogram-generation unit 270 generates an edge histogram that includes the edge distribution information on an I frame. Specifically, the edge-histogram-generation unit 270 generates a local edge histogram based on the result of the determination of the determination unit 260, and then generates a global edge histogram and a semi-global edge histogram, respectively, based on the local edge histogram. For this, the edge-histogram-generation unit 270 includes a local-edge-histogram-generation unit 271, a global-edge-histogram-generation unit 273, and a semi-global edge-histogram-generation unit 272.
The local-edge-histogram-generation unit 271 generates a local edge histogram based on the result of the determination of the determination unit 260. Here, the local edge histogram indicates the edge distribution information of a certain I frame by sub-areas. The local edge histogram is described in more detail with reference to
In the same manner, if the edge histogram of the first sub-area 310 is completed, the local-edge-histogram-generation unit 271 performs this process on each sub-area (such as second sub-area 320) of the I frame 300 in order to complete the local edge histogram of the I frame.
The semi-global edge-histogram-generation unit 272 generates a semi-global edge histogram of the I frame based on the local edge histogram. Here, the semi-global edge histogram represents the edge distribution information of the I frame by semi-global areas. The semi-global area can be formed by grouping at least two sub-areas among 16 sub-areas. For example, as illustrated in
Then, the total area is grouped in 2×2 type as shown in
While not required in all aspects, the semi-global edge histogram can be acquired by the sum total of values of bins that represent the same edge element among bins of sub-areas included in the same semi-global area in the local edge histogram. For example, the sum of bins that represent the vertical direction among 5 bins for the first, fifth, ninth and thirteenth sub-areas 310, 330, 340, 350 is recorded in the bin that represents the vertical direction among five bins on the first semi-global area 601. In the same manner, the sum of bins that represent the horizontal direction among 5 bins for the first, fifth, ninth and thirteenth sub-areas 310, 330, 340, 350 is recorded in the bin that represents the horizontal direction among five bins on the first semi-global area 601.
Further, the global-edge-histogram-generation unit 273 generates the global edge histogram that represents the edge distribution information on the total area of the I frame. The global edge histogram includes five bins that correspond to the vertical, horizontal, 45-degree, 135-degree, and non-directional edge elements. Such a global edge histogram can be generated based on the local edge histogram. Specifically, the sum of the bins that represent the vertical edge element in the local edge histogram is recorded in the bin that represents the vertical edge element in the global edge histogram. Likewise, the sum of the bins that represent the horizontal edge element in the local edge histogram is recorded in the bin that represents the horizontal edge element in the global edge histogram.
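The derivation of the semi-global and global edge histograms from the local edge histogram can be sketched as follows. This is a sketch under stated assumptions: the 16 sub-areas are assumed to be numbered row by row (so that the first, fifth, ninth, and thirteenth sub-areas form a column), and the thirteen semi-global areas are assumed to be four columns, four rows, and five 2×2 groups (four quadrants plus the center), which yields the 13 × 5 = 65 bins mentioned in this description. All function names are hypothetical.

```python
BINS = 5  # vertical, horizontal, 45-degree, 135-degree, non-directional

def sum_bins(local_hist, sub_areas):
    """Add up, bin by bin, the local histograms of the given sub-areas."""
    return [sum(local_hist[s][b] for s in sub_areas) for b in range(BINS)]

def semi_global_groups(grid=4):
    """Assumed grouping: 4 columns, 4 rows, and five 2x2 groups (13 in total)."""
    columns = [[r * grid + c for r in range(grid)] for c in range(grid)]
    rows = [[r * grid + c for c in range(grid)] for r in range(grid)]
    corners = [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1)]  # top-left of each 2x2 group
    squares = [[(r + dr) * grid + (c + dc) for dr in (0, 1) for dc in (0, 1)]
               for r, c in corners]
    return columns + rows + squares

def semi_global_histogram(local_hist):
    return [sum_bins(local_hist, group) for group in semi_global_groups()]

def global_histogram(local_hist):
    return sum_bins(local_hist, range(len(local_hist)))

# Example: every sub-area contains exactly one vertical-edge block.
example_local = [[1, 0, 0, 0, 0] for _ in range(16)]
example_semi = semi_global_histogram(example_local)
example_global = global_histogram(example_local)
```

Both derived histograms are plain sums over the local histogram, so they can be produced only for key frames without revisiting the DCT blocks.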
Of the aforementioned edge-histogram-generation processes, the local-edge-histogram-generation process is repeatedly performed on all I frames. While not required, it is preferable that the edge information on all I frames of the stored video is generated in advance (i.e., before the query video is input) and that the edge-histogram bins of each I frame are stored in the aforementioned storage unit 210 as shown in
Referring to
The video-retrieval unit 290 retrieves the video that matches the query video by measuring the similarity rate between a key frame (hereinafter, called a “first key frame”) extracted from the query video and a key frame (hereinafter, called a “second key frame”) extracted from the stored video. Here, the Hausdorff distance between the first key frame and the second key frame can be used as a basis for measuring the similarity rate. Using the Hausdorff distance between the first key frame and the second key frames for the stored videos in the storage unit 210, the one of the stored videos that has the smallest value can be specified as the video that matches the query video.
The Hausdorff distance can be acquired as the sum total of the differential values of the bins at the same position in each edge histogram of the first key frame and the second key frame. Preferably, while not required in all aspects, the differential values of the bins are produced from edge histograms of the same type. Specifically, the video-retrieval unit 290 first produces the differential values of the bins at the same position in the local edge histograms of the first key frame and the second key frame. Here, a total of 80 differential values are produced (16 sub-areas with five bins each), and the video-retrieval unit 290 produces a first result value that is the sum total of the 80 differential values. Then, the video-retrieval unit 290 produces the differential values of the bins at the same position in the global edge histograms of the first key frame and the second key frame. Here, a total of 5 differential values are produced, and the video-retrieval unit 290 produces a second result value that is the sum total of the 5 differential values. Then, the video-retrieval unit 290 produces the differential values of the bins at the same position in the semi-global edge histograms of the first key frame and the second key frame. Here, a total of 65 differential values are produced, and the video-retrieval unit 290 produces a third result value that is the sum total of the 65 differential values. Then, the video-retrieval unit 290 produces a Hausdorff distance that is the sum total of the first result value, the second result value, and the third result value. Further, because the global histogram includes fewer bins than the local histogram and the semi-global histogram, a predetermined weight can be applied to the second result value when summing the result values.
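The distance calculation described above can be sketched as follows. The sketch assumes that each "differential value" is the absolute difference between two bins, and that each key frame is represented by a dictionary holding its three histograms; the dictionary keys, the function names, and the default weight of 5.0 are hypothetical and are not specified in this description.

```python
def bin_difference(hist_a, hist_b):
    """Sum of absolute differences between bins at the same position."""
    return sum(abs(a - b)
               for row_a, row_b in zip(hist_a, hist_b)
               for a, b in zip(row_a, row_b))

def histogram_distance(first, second, global_weight=5.0):
    """Weighted sum of the first, second, and third result values."""
    first_result = bin_difference(first["local"], second["local"])            # 80 bins
    second_result = bin_difference([first["global"]], [second["global"]])     # 5 bins
    third_result = bin_difference(first["semi_global"], second["semi_global"])  # 65 bins
    return first_result + global_weight * second_result + third_result

# Two hypothetical key frames: one with only vertical edges, one with only horizontal edges.
frame_a = {"local": [[1, 0, 0, 0, 0]] * 16,
           "global": [16, 0, 0, 0, 0],
           "semi_global": [[4, 0, 0, 0, 0]] * 13}
frame_b = {"local": [[0, 1, 0, 0, 0]] * 16,
           "global": [0, 16, 0, 0, 0],
           "semi_global": [[0, 4, 0, 0, 0]] * 13}
```

In this example the three result values are 32, 32, and 104, so with the assumed weight the distance between `frame_a` and `frame_b` is 32 + 5 × 32 + 104 = 296, while the distance of a key frame to itself is zero.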
The video-retrieval unit 290 repeats the aforementioned process on a plurality of second key frames which the key frame extraction unit 280 obtains from the stored videos in the storage unit 210, and identifies the one of the stored videos that includes the second key frame having the lowest Hausdorff distance as the result of the retrieval. The display unit 295 displays the result of the command-handling in a visible form. For example, the display unit 295 displays the stored video retrieved by the video-retrieval unit 290.
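The selection of the matching stored video can be sketched as a minimum search over all second key frames. The function name `retrieve`, the mapping from video identifiers to key-frame lists, and the `distance` callable are hypothetical; any distance of the kind described above can be passed in.

```python
def retrieve(first_key_frame, stored_videos, distance):
    """Return (video_id, distance) for the stored video whose key frame is closest.

    stored_videos maps a video identifier to the list of its second key frames.
    """
    best_id, best_distance = None, float("inf")
    for video_id, second_key_frames in stored_videos.items():
        for second_key_frame in second_key_frames:
            d = distance(first_key_frame, second_key_frame)
            if d < best_distance:
                best_id, best_distance = video_id, d
    return best_id, best_distance

# Toy usage: key frames are plain numbers and the distance is their absolute difference.
toy_videos = {"video_a": [5, 9], "video_b": [2, 7]}
match = retrieve(3, toy_videos, lambda x, y: abs(x - y))
```

Here `match` identifies `video_b`, because its key frame 2 lies closest to the query key frame 3.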
A video-retrieval method according to an exemplary embodiment of the present invention is described with reference to
When the inverse-quantization process on the I frame is completed, the video-retrieval apparatus 200 generates, for each I frame, the edge histogram according to the type of the edge included in the plurality of DCT blocks S730. Here, the edge-histogram generation for each I frame is described in more detail with reference to
The determination unit 260 determines whether each DCT block of each sub-area is an edge area, and generates the local edge histogram of the I frame. The determination unit 260 first determines whether the first DCT block 311 of a first sub-area 310 is an edge area S733. Here, the determination unit 260 determines whether the DCT block is an edge area according to whether the variance value of the first DCT block 311 is less than the first critical value. In the case where the variance value of the first DCT block 311 is less than the first critical value (yes in S733), the determination unit 260 determines that the first DCT block 311 is a smooth area (i.e., it does not include an edge). In the case where the variance value of the first DCT block 311 is not less than the first critical value (no in S733), the determination unit 260 determines that the first DCT block 311 is not a smooth area (i.e., it includes an edge).
Where the first DCT block 311 is a smooth area, the determination unit 260 then determines whether the second DCT block 312 of the first sub-area 310 is an edge area S734, S732, and S733. In contrast, where the variance value of the first DCT block 311 is greater than the first critical value (no in S733), the determination unit 260 determines that the first DCT block 311 is an edge area (i.e., an area that includes an edge).
If it is determined that the first DCT block 311 is an edge area, the determination unit 260 determines the type of the edge included in the first DCT block 311 S735. Specifically, the determination unit 260 first determines whether the type of the edge included in the first DCT block 311 is a non-directional edge. Here, the determination unit 260 determines whether a non-directional edge is included based on the strength of the AC0, 1 and AC1, 0 coefficients of the first DCT block 311. In other words, in the case where the strength of the edge is less than the second critical value, it is determined that the type of the edge included in the first DCT block 311 is a non-directional edge.
If the strength of the edge is greater than the second critical value, the determination unit 260 determines that the first DCT block 311 includes a directional edge. Here, the determination unit 260 determines what type of directional edge the first DCT block 311 includes depending on the ratio of two AC coefficients, specifically AC0, 1 and AC1, 0, among the DCT coefficients of the first DCT block 311. For example, in the case where the ratio of the two AC coefficients is close to 1, and the signs of the two AC coefficients are the same, it is determined that the first DCT block 311 includes a 45°-direction edge. In the case where the ratio of the two AC coefficients is close to 1, and the signs of the two AC coefficients are different, it is determined that the first DCT block 311 includes a 135°-direction edge. In comparison, in the case where the ratio of the two AC coefficients is close to infinity, it is determined that the first DCT block 311 includes a vertical edge or a horizontal edge. In other words, in the case where R1 (Equation 2) is close to infinity, it is determined that the first DCT block 311 includes a horizontal edge, and in the case where R2 is close to infinity, it is determined that the first DCT block 311 includes a vertical edge.
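The edge-type decision of operation S735 can be sketched as follows. Several points are assumptions made for illustration, because Equations 2 and 3 are not reproduced here: R1 is taken as |AC0, 1| / |AC1, 0| and R2 as its reciprocal, the edge strength is taken as the Euclidean norm of the two coefficients, and "close to infinity" is approximated by a hypothetical threshold `ratio_limit`. The mapping of R1 to the horizontal edge and R2 to the vertical edge follows the statement in the preceding paragraph.

```python
import math

def classify_edge(ac01, ac10, second_critical_value, ratio_limit=2.0):
    """Classify the edge of a block already known to be an edge area."""
    strength = math.hypot(ac01, ac10)      # assumed definition of edge strength
    if strength < second_critical_value:
        return "non_directional"
    a, b = abs(ac01), abs(ac10)
    r1 = a / b if b else math.inf          # assumed form of R1
    r2 = b / a if a else math.inf          # assumed form of R2
    if r1 >= ratio_limit:
        return "horizontal"
    if r2 >= ratio_limit:
        return "vertical"
    # Ratio close to 1: the signs of the two coefficients select the diagonal.
    return "diagonal_45" if ac01 * ac10 > 0 else "diagonal_135"
```

For example, under these assumptions `classify_edge(50, 40, 10)` falls in the diagonal range with equal signs and is classified as a 45°-direction edge, whereas `classify_edge(50, -40, 10)` is classified as a 135°-direction edge.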
Likewise, if the type of the edge included in the first DCT block 311 is determined S735, the edge-histogram-generation unit 270 increases the value of the bin that corresponds to the edge, among five bins included in the first sub-area 310 in the local-edge histogram of the first I frame S736. For example, in the case where it is determined that the first DCT block 311 includes a vertical edge, the edge-histogram-generation unit 270 increases the value of the bin corresponding to the vertical edge by 1, among five bins included in the first sub-area 310. In the case where it is determined that the first DCT block 311 includes a horizontal edge, the edge-histogram-generation unit 270 increases the value of the bin corresponding to the horizontal edge by 1, among five bins included in the first sub-area 310.
In the case where the aforementioned process is performed on all DCT blocks that constitute the first sub-area 310 (yes in S737), the determination unit 260 and the edge-histogram-generation unit 270 repeat the aforementioned processes S731 to S737 on the second sub-area 320, and complete the local-edge histogram of the I frame.
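The accumulation of the local edge histogram over all sub-areas (operations S731 to S737) can be sketched as follows. The sketch assumes that the edge type of every DCT block has already been determined and is supplied as a two-dimensional list of labels, with `None` marking a smooth block; the label strings, the row-by-row numbering of the 16 sub-areas, and the function name are hypothetical.

```python
EDGE_TYPES = ["vertical", "horizontal", "diagonal_45", "diagonal_135", "non_directional"]

def local_edge_histogram(block_labels, grid=4):
    """Count edge types per sub-area; returns grid*grid histograms of five bins each."""
    rows, cols = len(block_labels), len(block_labels[0])
    hist = [[0] * len(EDGE_TYPES) for _ in range(grid * grid)]
    for r in range(rows):
        for c in range(cols):
            label = block_labels[r][c]
            if label is None:       # smooth area: no bin is increased
                continue
            # Map the block position to one of the grid x grid sub-areas.
            sub_area = (r * grid // rows) * grid + (c * grid // cols)
            hist[sub_area][EDGE_TYPES.index(label)] += 1
    return hist

# Toy frame of 8 x 8 DCT blocks, so each sub-area covers 2 x 2 blocks.
labels = [[None] * 8 for _ in range(8)]
labels[0][0] = "vertical"
labels[0][1] = "vertical"
labels[7][7] = "horizontal"
toy_hist = local_edge_histogram(labels)
```

In this toy frame the first sub-area receives two counts in its vertical bin and the sixteenth sub-area receives one count in its horizontal bin.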
Further, in the case where the local-edge histogram of an I frame is completed, the determination unit 260 and the edge-histogram-generation unit 270 repeat the aforementioned processes S731 to S737 on all I frames detected from the query video, and complete the local-edge histogram for each I frame.
Further, in the case where the local-edge histogram on each I frame is completed, the key-frame-selection unit 280 selects a key frame based on the local-edge histogram of each I frame S740. Here, the key-frame-selection unit 280 selects, as the key frame, an I frame whose edge histogram bin difference (EHBD) from the local edge histogram of the previous I frame is greater than the third critical value.
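The key-frame selection of operation S740 can be sketched as follows. The exact definition of the EHBD is not given in this description, so the sketch assumes it is the sum of absolute differences between bins at the same position in two consecutive local edge histograms; the handling of the very first I frame (which has no previous frame and is not selected here) is likewise an assumption, and the function names are hypothetical.

```python
def ehbd(hist_a, hist_b):
    """Assumed EHBD: sum of absolute bin differences between two local histograms."""
    return sum(abs(a - b)
               for row_a, row_b in zip(hist_a, hist_b)
               for a, b in zip(row_a, row_b))

def select_key_frames(local_hists, third_critical_value):
    """Return indices of I frames whose EHBD to the previous I frame exceeds the threshold."""
    key_frames = []
    for idx in range(1, len(local_hists)):
        if ehbd(local_hists[idx], local_hists[idx - 1]) > third_critical_value:
            key_frames.append(idx)
    return key_frames

# Four I frames: the edge content changes only between the second and third frame.
quiet = [[0, 0, 0, 0, 0]] * 16
busy = [[3, 0, 0, 0, 0]] * 16
selected = select_key_frames([quiet, quiet, busy, busy], third_critical_value=10)
```

Only the third I frame (index 2) is selected in this example, since it is the only frame whose histogram differs markedly from that of its predecessor.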
If a key frame is selected from the query video, the edge-histogram-generation unit 270 generates the global edge histogram and the semi-global edge histogram, respectively, based on the local edge histogram of each key frame. Then, the video-retrieval unit 290 retrieves the video that matches the query video by measuring the similarity rate between the first key frame and the key frame of the stored video (i.e., the second key frame) S750. Here, the video-retrieval process is described in more detail with reference to
While not required in all aspects, the video-retrieval unit 290 can apply a predetermined weight to the second result value when summing the result values because the global histogram includes fewer bins than the local histogram and the semi-global histogram.
The video-retrieval unit 290 produces the Hausdorff distance (i.e., the sum total of the first result value, the second result value, and the third result value) between the first key frame and each of the second key frames of the stored videos, and selects the stored video having the lowest distance as the video that matches the query video S754. When the video that matches the query video has been retrieved by measuring the similarity rate, the video-retrieval apparatus 200 displays the retrieved video through the display unit 295 S760.
The video-retrieval method according to an aspect of the present invention requires fewer calculations than the conventional technology. This is described in more detail with reference to Tables 1 and 2. Here, Table 1 compares the retrieval performance of the video-retrieval method according to an exemplary embodiment of the present invention with the retrieval performance of EMI and GofGop, which are conventional video-retrieval technologies.
Normalized Modified Retrieval Rank (NMRR) and Average Normalized Modified Retrieval Rank (ANMRR) can be used as an index of the retrieval performance. Here, NMRR is an evaluation criterion for evaluating the retrieval efficiency in MPEG-7. The NMRR takes a value between 0 and 1; the lower the value the better the efficiency. ANMRR represents the average NMRR.
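For reference, the NMRR and ANMRR indices can be computed as in the following sketch. The formula itself is not given in this description; the sketch follows one commonly cited MPEG-7 formulation, in which ground-truth items ranked beyond a cutoff K receive a penalty rank of 1.25 × K, and should be read as an assumption rather than as the evaluation procedure actually used for Tables 1 and 2. The function names are hypothetical.

```python
def nmrr(ranks, gtm):
    """NMRR for one query.

    ranks: retrieval rank (1-based) of each ground-truth item for this query.
    gtm:   largest ground-truth set size over all queries.
    """
    ng = len(ranks)
    k = min(4 * ng, 2 * gtm)                       # rank cutoff
    penalised = [r if r <= k else 1.25 * k for r in ranks]
    avr = sum(penalised) / ng                      # average rank
    mrr = avr - 0.5 - ng / 2                       # modified retrieval rank
    return mrr / (1.25 * k - 0.5 - ng / 2)         # normalised to the range 0..1

def anmrr(per_query_ranks):
    """Average of the NMRR values over all queries."""
    gtm = max(len(r) for r in per_query_ranks)
    return sum(nmrr(r, gtm) for r in per_query_ranks) / len(per_query_ranks)

perfect = nmrr([1, 2, 3], gtm=3)        # all ground-truth items retrieved first
missed = nmrr([100, 200, 300], gtm=3)   # no ground-truth item within the cutoff
```

Under this formulation a perfect retrieval yields an NMRR of 0 and a complete miss yields an NMRR of 1, which matches the range and the "lower is better" interpretation stated above.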
Referring to Table 1, the video-retrieval method according to the present embodiment of the present invention is similar to the conventional EMI and GofGop technologies in terms of retrieval performance. Further, referring to Table 2, in the case where a video is retrieved using the method of the present embodiment of the present invention, the amount of calculation is reduced by more than 90% compared to the methods of EMI and GofGop.
It should be understood by those of ordinary skill in the art that various replacements, modifications and changes may be made in the form and details without departing from the spirit and scope of the present invention as defined by the following claims. Therefore, it is to be appreciated that the above described embodiments are for purposes of illustration only and are not to be construed as limitations of the invention. For instance, while described in terms of using edge histograms, it is understood that other histograms (such as color histograms) can be further used in the comparison and/or key frame selection, and that each of the local, semi-global, and global edge histograms need not be used in all aspects of the invention.
According to the method and apparatus of the present invention, the amount of calculation needed for a video retrieval is reduced, and thus a video can be retrieved at high speed, which is advantageous.
Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2006-44416 | May 2006 | KR | national |