A method and an apparatus for detecting defects in digitized image sequences are presented. In particular, the present disclosure relates to a method and an apparatus for detecting defects in a frame of a sequence of digitized image frames, and to a corresponding computer readable storage medium.
Motion picture films are often affected by defects, i.e. undesirable objects such as scratches, dust, dirt, stains, abrasion and the like. They usually originate from the technical process of developing, handling, storing, and screening or scanning the material. In some rare cases static objects may already be introduced during capturing, for example fluff within a lens or dirt on a scanner glass. However, a very common defect is non-steady dirt, i.e. undesired objects that appear only for a single frame.
For archival and conservation purposes and for making use of the benefits of a digital representation, analogue motion picture films are scanned and digitally encoded. Restoration of the films can, therefore, be carried out in the digital domain after scanning. Compared with time-consuming manual restoration of the digitized films, in which each object is found and removed by hand, automatic restoration software with algorithms that detect and remove dirt objects is a cost-saving alternative.
One task with automatic dirt removal is to reliably discriminate dirt from moving objects.
The problem has been addressed by A. C. Kokaram in his Ph.D. thesis “Motion Picture Restoration”, Cambridge University, England, 1993. He proposed to perform motion compensation prior to dirt detection and developed a temporal spike detector (SDI). Since then, motion compensation, e.g. by hierarchical block matching, has been accepted as a preprocessing step before dirt detection. Research on dirt detection, therefore, concentrated on building new or improving existing detector designs.
In 1996, M. J. Nadenau and S. K. Mitra published an algorithm based on the rank order difference (ROD) in their paper “Blotch and Scratch Detection in Image Sequences based on Rank Ordered Differences”, Proc. of 5th Int. Workshop on Time-Varying Image Processing.
In 1999, P. M. B. Van Roosmalen proposed a simplification of the ROD named sROD in “Restoration of archived film and video”, Ph.D. thesis, Delft University of Technology. He also noticed that motion estimation often fails in case of dirt and proposed techniques for motion vector repair.
This was also noticed by J. Ren and T. Vlachos in “Segmentation-Assisted Detection of Dirt Impairments in Archived Film Sequences” (2007), IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics), who proposed a confidence measure to detect false alarms together with a segmentation-based approach to identify dirt structures. In 2010 they reviewed their results in “Detection of dirt impairments from archived film sequences: survey and evaluation”.
Another contribution in this field was provided in 2010 by A. Buades, J. D. Deon, S. Masnou in “Adaptive blotches detection for film restoration”, Proceedings of 2010 IEEE 17th International Conference on Image Processing. They described an adaptive spike detector (ASDI), which also requires motion compensated input.
Motion vector estimation prior to detection is also a main aspect of WO9937087A1 “Moving image restoration” by T. Vlachos, Van Roosmalen and Kokaram.
However, these approaches are either computationally intensive, for example because of the processing step of motion vector estimation and motion compensation, or computation effort is reduced at the cost of less correct detection.
Particularly, motion vector estimation can be considered notoriously erroneous at locations of temporal incoherence, as shown in “Statistical Analysis of Pathological Motion Areas”, A. Rares, Proceedings of the IEE Seminar on Digital Restoration of Film and Video Archives, Jan. 16, 2001, London, UK.
There remains a need for a method and an apparatus for automatically, efficiently and accurately detecting the size and location of non-steady undesired objects, referred to as “dirt” for short, in digitized film sequences.
A method and an apparatus for detecting defects in frames of a sequence of digitized image frames are suggested, as well as a computer readable storage medium.
According to an aspect of the present principles, a method for detecting defects in a frame of a sequence of digitized image frames comprises
Accordingly, an apparatus for detecting defects in a frame of a sequence of digitized image frames comprises
Units comprised in the apparatus, such as the motion determination unit, the temporal coherence detection unit and the defect detection unit may, for example, be provided as separate devices, jointly as at least one device or logic circuitry, or functionality carried out by a microprocessor, microcontroller or other processing device, computer or other programmable apparatus.
According to an aspect of the present principles, an apparatus for detecting defects in a frame of a sequence of digitized image frames comprises
Further, a computer readable storage medium has stored therein instructions enabling detection of defects in a frame of a sequence of digitized image frames, wherein the instructions, when executed by a computer, cause the computer to:
The computer readable storage medium tangibly embodies a program of instructions, which, when executed by a computer, cause the computer to perform the described method steps.
In an embodiment the terms “preceding frame” and “succeeding frame” refer to the neighboring frames of the frame receiving the defect detection processing. In another embodiment at least one of the “preceding frame” and the “succeeding frame” can be a frame of the sequence before or after the frame without directly neighboring said frame.
For the solution, it is expected that non-steady dirt objects of a certain size often appear only within a single frame, and, therefore, affect temporal coherence, i.e. it is assumed that motion flow in natural image sequences is usually smooth.
The solution according to the aspects of the present principles allows omitting estimation of motion vectors and motion compensation, as it only relies on a determination of absolute motion values. These could be derived as vector norms from calculated motion vector fields, but, as no motion direction information is used, can more efficiently be derived from faster, less complex computations. A motion adaptive workflow for detecting dirt comprising three main processing stages is as follows:
1. Estimate absolute motion values of each pixel within a frame relative to the frames before and after.
2. Check for temporal coherence violations by detecting the pixels that cannot be found, in the frames before and after, within a displacement radius derived from the determined absolute motion values.
3. Define pixels violating forward and also backward temporal coherence as defective or as showing dirt.
The provided solution at least has the effect that information on location, size and shape of dirt within a frame can be provided. Since the present principles do not depend on traditional, computationally demanding motion vector estimation, which is notoriously erroneous at locations of temporal incoherence, the defect detection can be less complex while, at the same time, providing high detection ratios.
Although the defect detection solution allows non-steady dirt detection, where dirt is defined as a certain type of temporal incoherence, it is also applicable, e.g., to footage without dirt, for example directly captured by digital video cameras, for detection of temporal incoherences.
In an embodiment the detecting of temporal coherence violations comprises
In an example embodiment the corresponding displacement radii are determined depending on said absolute motion values and multiplied by a factor below 1. This may increase the defect detection ratio.
In an embodiment corresponding absolute motion values between the preceding frame and the succeeding frame are determined for the plurality of pixels; and at least one of the corresponding displacement radii is determined depending on a minimum of the absolute motion value for the pixel of the frame relative to the preceding frame or the succeeding frame and the corresponding absolute motion value between the preceding and the succeeding frame. Here, absolute motion values are determined not only between the frame and the preceding frame and between the frame and the succeeding frame, but also between the preceding frame and the succeeding frame. Assuming that motion in the image sequence is usually smooth, motion values between the frame and its preceding and succeeding frame are replaced by the overall motion detectable between the preceding and the succeeding frame in case of an incoherently high single frame motion peak. This increases reliability of the coherence violation detection result for the particular pixel.
In an embodiment the at least one pixel is considered not to be found displaced within the corresponding displacement radius in the preceding frame or the succeeding frame if, for none of the pixels of that frame within the corresponding displacement radius, a distance measure between the pixel value of said at least one pixel and the value of that pixel is below a threshold. In other words, a backward displacement radius determines a search area in the preceding frame and a forward displacement radius determines a search area in the succeeding frame. The threshold divides the search area into a region of acceptable coherence and another region of coherence violation. The distance measure may, for example, be a Euclidean distance. However, other distance measures may be used instead. The threshold can be selected, for example, depending on a demanded defect detection ratio. The lower the threshold is set, the higher the detection ratio at the cost of an increasing misdetection ratio.
In an embodiment the detecting of temporal coherence violations and the determining of at least one pixel of the plurality of pixels as defective are performed iteratively, wherein the corresponding displacement radii are set to smaller values with each iteration. This produces a set of potentially differing results of detected temporal coherence violations, wherein the reduction of applied displacement radii potentially results in increasing coherence violation detection ratios. In this context, a “smaller value” refers to a value smaller than the one used in the preceding iteration. As an example, with each iteration, the previously determined radius is multiplied by a predefined factor smaller than 1.
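As a minimal sketch of such a shrinking schedule, the following Python snippet lists the radius applied in each iteration. The function name radius_schedule and the example values are illustrative assumptions, not taken from the described embodiments.

```python
def radius_schedule(radius, factor, iterations):
    """Return the radius used in each iteration, shrinking by `factor` each time."""
    if not 0 < factor < 1:
        raise ValueError("factor must lie strictly between 0 and 1")
    radii = []
    for _ in range(iterations):
        radii.append(radius)   # radius applied in this iteration
        radius *= factor       # smaller radius for the next iteration
    return radii


# Hypothetical example: start at 8 pixels and halve the radius twice.
schedule = radius_schedule(8.0, 0.5, 3)
print(schedule)
```

Each entry would be used for one run of coherence violation detection and defect determination, so that later runs search a smaller area and tend to flag more pixels.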
In another embodiment the detecting of temporal coherence violations and the determining of at least one pixel of the plurality of pixels as defective are performed iteratively, wherein the threshold is increased with each iteration.
For example, the detecting of temporal coherence violations and the determining of at least one pixel of the plurality of pixels as defective can be performed twice. The idea is to carry out dirt detection twice with two different search window radii. In the second iteration the method is modified such that the backward and forward search radii are significantly reduced, resulting in a much higher detection ratio. In other words, for the first pass, i.e. iteration, the threshold is set conservatively, i.e. to yield a low detection rate, in order to prevent misdetections. However, defects may be detected only partly in this iteration. For the second iteration a more relaxed threshold is chosen, resulting in more detections and better shape recovery, but also an increased misdetection rate.
Afterwards, only regions of the second iteration are selected that are confirmed by the first one. For example, in an embodiment where the detecting of temporal coherence violations and the determining of at least one pixel of the plurality of pixels as defective are performed iteratively, the determining of at least one pixel of the plurality of pixels as defective for an iteration further comprises determining at least one subset of said at least one pixel and determining the one or more pixels of the subset as defective only if in the previous iteration at least one corresponding pixel has been determined as defective. In this context, a subset of a single pixel is the pixel itself, whereas otherwise a subset consists of pixels adjacent or in close proximity to each other. A pixel of the previous iteration that corresponds to the subset is a pixel that overlaps with the subset.
Besides finding dirt or defective objects at the right position, estimating the correct shape improves the detection result. In an example embodiment the determining of at least one subset comprises performing a morphological reconstruction of the at least one subset. In the embodiment the morphological reconstruction may, for example, comprise dilation, closing and filling or combinations thereof to better identify potentially connected defective regions, e.g. dirt. In other words, the subset is changed by morphological operations before the confirmation by a corresponding potentially defective pixel, i.e. a pixel determined during the previous iteration and overlapping with the subset, is checked. Only those regions or subsets of the last iteration are selected as identified defective that are confirmed by the preceding iteration, whereas others are removed.
While not explicitly described, the present embodiments may be employed in any combination or sub-combination.
For a better understanding, the present principles will now be explained in more detail in the following description with reference to the drawings. It is understood that the present principles are not limited to these exemplary embodiments and that specified features can also expediently be combined and/or modified without departing from the scope of the present principles as defined in the appended claims.
Referring to
In a first step 101 absolute motion values for a plurality of pixels of the frame are determined relative to a preceding frame and to a succeeding frame of the sequence. In other words, an array of forward absolute motion values and an array of backward absolute motion values are determined.
In a second step 102 temporal coherence violations for the plurality of pixels are detected between the frame and the preceding frame and between the frame and the succeeding frame depending on said absolute motion values.
In a third step 103 at least one pixel of the plurality of pixels is determined as defective if corresponding temporal coherence violations are detected between the frame and the preceding frame and the succeeding frame.
Referring to
Here, it is assumed that each frame with temporal index t has a single channel representation y(t), e.g. luma or luminance information.
In the shown embodiment a frame y(t) is compared to a motion compensated representation of a preceding frame y(t−1) and/or a succeeding frame y(t+1) under a certain threshold. As described in the background section, different methods like SDI, ROD, and sROD have been proposed for similar comparisons. However, they all depend on the quality of the preceding motion vector estimation.
Here, a binary mask d(t) (matte) is generated from those pixels of y(t) that cannot be found in the frame before or after within a certain displacement radius, i.e. search radius, which has been determined before for each pixel:
In a first processing stage 201, a backward and a forward displacement radius, i.e. search radius, are determined for each pixel.
In a first step 202 of stage 201, for the pixels of the current frame y(t), arrays of absolute motion values m− and m+ of the current frame to the preceding, i.e. backward, and the succeeding, i.e. forward, frames are determined:
$m^{-} = M(y(t), y(t-1)), \quad m^{+} = M(y(t), y(t+1))$
In a second step 203 absolute motion values between the backward and the forward frames are determined: $m^{-+} = M(y(t-1), y(t+1))$
In a third step 204 the backward and the forward displacement radii, i.e. search radii, are determined:
$r^{-} = \min(m^{-}, m^{-+}), \quad r^{+} = \min(m^{+}, m^{-+})$
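The minimum rule of step 204 can be sketched in Python as follows. The sketch assumes that the absolute motion arrays are already available as equally sized 2D lists. The estimator M itself is not specified here and is not modelled, and the function name displacement_radii and the sample values are hypothetical.

```python
def displacement_radii(m_back, m_fwd, m_span):
    """Element-wise r- = min(m-, m-+) and r+ = min(m+, m-+) for 2D lists."""
    r_back = [[min(a, c) for a, c in zip(row_a, row_c)]
              for row_a, row_c in zip(m_back, m_span)]
    r_fwd = [[min(b, c) for b, c in zip(row_b, row_c)]
             for row_b, row_c in zip(m_fwd, m_span)]
    return r_back, r_fwd


# Hypothetical 1x3 arrays: the middle pixel shows a single-frame motion peak.
m_back = [[1, 9, 2]]   # motion of y(t) relative to y(t-1)
m_fwd = [[1, 8, 2]]    # motion of y(t) relative to y(t+1)
m_span = [[2, 1, 3]]   # motion between y(t-1) and y(t+1)
r_back, r_fwd = displacement_radii(m_back, m_fwd, m_span)
print(r_back, r_fwd)
```

In this example the incoherently high values 9 and 8 of the middle pixel are replaced by the overall motion 1 between the neighboring frames, so the search area for that pixel stays small.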
In a next processing stage 205 forward and backward temporal coherence violations are detected for each pixel. In other words, in a fourth step 206 those pixels in y(t) are detected for which no pixel with a Euclidean distance in pixel value below a given threshold T can be found within the corresponding search radius in the frame before or in the frame after, respectively.
In the shown embodiment the fourth processing step 206 comprises building the backward detection mask d−(t) and forward detection mask d+(t) by evaluating
$d^{-}(h,v,t) = \sim\mathrm{any}\big((y(h,v,t) - y(h - r^{-}(h,v,t) : h + r^{-}(h,v,t),\; v - r^{-}(h,v,t) : v + r^{-}(h,v,t),\; t-1))^{2} < T\big)$, and
$d^{+}(h,v,t) = \sim\mathrm{any}\big((y(h,v,t) - y(h - r^{+}(h,v,t) : h + r^{+}(h,v,t),\; v - r^{+}(h,v,t) : v + r^{+}(h,v,t),\; t+1))^{2} < T\big)$
for each pixel of y(t) with horizontal index h and vertical index v. (Colon notation denotes ‘from:to’ and ‘∼’ denotes NOT.)
In a next processing stage 207 it is checked if temporal coherence violations have been detected in the forward and also in the backward direction for obtaining the defect- or dirt-detection mask d(t):
d(t)=d−(t) & d+(t)
At this point detection on pixel level is finished. d(t) contains indications of dirt detected at the corresponding pixel locations.
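The pixel-level detection of stages 205 and 207 can be sketched in Python as follows. This is an illustration under several assumptions not stated in the text: frames are 2D lists indexed [v][h], radii are non-negative integers, and search windows are clipped at the frame border. The function names coherence_violations and defect_mask are hypothetical.

```python
def coherence_violations(cur, ref, radius, threshold):
    """Mask of pixels in `cur` with no similar pixel in `ref` within their radius.

    cur, ref: 2D lists of grey values indexed [v][h]; radius: 2D list of
    non-negative integers; threshold: bound T on the squared difference.
    """
    height, width = len(cur), len(cur[0])
    mask = [[False] * width for _ in range(height)]
    for v in range(height):
        for h in range(width):
            r = radius[v][h]
            # Search the (2r+1) x (2r+1) window, clipped at the frame border.
            found = any(
                (cur[v][h] - ref[vv][hh]) ** 2 < threshold
                for vv in range(max(0, v - r), min(height, v + r + 1))
                for hh in range(max(0, h - r), min(width, h + r + 1))
            )
            mask[v][h] = not found
    return mask


def defect_mask(prev, cur, nxt, r_back, r_fwd, threshold):
    """d(t) = d-(t) & d+(t): defective only if both directions are violated."""
    d_back = coherence_violations(cur, prev, r_back, threshold)
    d_fwd = coherence_violations(cur, nxt, r_fwd, threshold)
    return [[b and f for b, f in zip(rb, rf)] for rb, rf in zip(d_back, d_fwd)]


# Hypothetical 3x3 frames: a bright spot appears in y(t) only.
flat = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
spot = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
d = defect_mask(flat, spot, flat, ones, ones, threshold=25)
print(d)
```

Only the centre pixel is flagged in this example, because its value has no counterpart within the search window in either neighboring frame.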
In an additional processing stage 208 connected-pixel defects indicating dirt objects are identified. Further, position, size and shape of the objects are determined. In an embodiment a rectangular bounding box is generated for each object, and its position, width and height are stored, for example together with a binary mask describing its shape, as generated metadata.
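A minimal Python sketch of such an object identification is given below. It assumes 8-connectivity and a 2D boolean mask indexed [v][h]. The function name find_objects and the metadata field names are hypothetical choices for illustration.

```python
from collections import deque


def find_objects(mask):
    """Return one dict per 8-connected region of True pixels in a 2D mask."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for v0 in range(height):
        for h0 in range(width):
            if not mask[v0][h0] or seen[v0][h0]:
                continue
            # Breadth-first flood fill starting from an unvisited defect pixel.
            seen[v0][h0] = True
            queue = deque([(v0, h0)])
            pixels = []
            while queue:
                v, h = queue.popleft()
                pixels.append((v, h))
                for dv in (-1, 0, 1):
                    for dh in (-1, 0, 1):
                        vv, hh = v + dv, h + dh
                        if (0 <= vv < height and 0 <= hh < width
                                and mask[vv][hh] and not seen[vv][hh]):
                            seen[vv][hh] = True
                            queue.append((vv, hh))
            vs = [p[0] for p in pixels]
            hs = [p[1] for p in pixels]
            objects.append({
                "left": min(hs), "top": min(vs),
                "width": max(hs) - min(hs) + 1,
                "height": max(vs) - min(vs) + 1,
                "pixels": sorted(pixels),   # shape as a list of (v, h) positions
            })
    return objects


# Hypothetical mask with one L-shaped object and one isolated pixel.
T, F = True, False
mask = [[T, T, F, F],
        [F, T, F, F],
        [F, F, F, T]]
for obj in find_objects(mask):
    print(obj["left"], obj["top"], obj["width"], obj["height"])
```

The bounding box and pixel list of each object could then be stored as metadata alongside the frame.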
Referring to
A higher detection ratio for the shape of dirt objects can be achieved if the search window radius r is deliberately set lower than necessary. But this will also result in an increased amount of misdetection, i.e. a higher false positive detection ratio. This observation is utilized and combined with the idea of using two different thresholds:
At first, dirt detection is performed using processing stages 301, 302, 303, which correspond to respective processing stages 201, 205, 207 as shown in
In a first processing stage 301, a backward and a forward displacement radius, i.e. search radius, are determined for each pixel. In a next processing stage 302 forward and backward temporal coherence violations are detected for each pixel. In a next processing stage 303 it is checked if temporal coherence violations have been detected in the forward and also in the backward direction for obtaining the defect- or dirt-detection mask d(t).
In a next processing stage 304 the search radii are reduced by multiplication with a factor below 1:
$r_{\mathrm{low}}^{-} = r^{-} \cdot T$ with $T < 1$
$r_{\mathrm{low}}^{+} = r^{+} \cdot T$ with $T < 1$
Afterwards, processing stages 305 and 306 corresponding to respective processing stages 302 and 303 are performed to generate defect- or dirt-detection mask dlow(t) instead of d(t).
In a next processing stage 307 defective objects in dlow are identified based on d(t). For example, a binary morphological reconstruction binrecon( ) of mask dlow is performed with marker d, i.e. contiguous areas or subsets of dlow that are not backed up by at least one element of d within the same area are removed. This results in a detection mask dcomplete where shapes of dirt are more completely resolved:
$d_{\mathrm{complete}} = \mathrm{binrecon}(d_{\mathrm{low}}, d)$
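One possible reading of binrecon( ) is sketched below in Python: regions of the mask are grown from marker pixels by flood fill, so that regions without any marker pixel vanish. The use of 8-connectivity, the 2D list representation and the example masks are assumptions made for this illustration.

```python
from collections import deque


def binrecon(mask, marker):
    """Keep the 8-connected regions of `mask` that contain a True `marker` pixel."""
    height, width = len(mask), len(mask[0])
    out = [[False] * width for _ in range(height)]
    # Seeds are the pixels set in both the mask and the marker.
    queue = deque(
        (v, h) for v in range(height) for h in range(width)
        if mask[v][h] and marker[v][h]
    )
    for v, h in queue:
        out[v][h] = True
    # Grow the seeds inside the mask until no further pixel can be added.
    while queue:
        v, h = queue.popleft()
        for dv in (-1, 0, 1):
            for dh in (-1, 0, 1):
                vv, hh = v + dv, h + dh
                if (0 <= vv < height and 0 <= hh < width
                        and mask[vv][hh] and not out[vv][hh]):
                    out[vv][hh] = True
                    queue.append((vv, hh))
    return out


# Hypothetical masks: d confirms the left region of d_low only.
T, F = True, False
d_low = [[T, T, F, F, T],
         [T, F, F, F, F]]
d = [[F, T, F, F, F],
     [F, F, F, F, F]]
d_complete = binrecon(d_low, d)
print(d_complete)
```

In this example the three-pixel region on the left is kept in full although d marks only one of its pixels, while the unconfirmed pixel on the right is removed.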
Referring to
In this embodiment, displacement vectors indicating where a pixel has actually been found are stored for both directions, i.e. forward and backward. In the shown embodiment detection of temporal coherence violations is run only once while the detection result remains similarly accurate. The displacement radii are compared to reduced search window radii resulting in a detection mask with more elements.
In a first processing stage 401 arrays of forward and backward motion vectors q+(h,v) and q−(h,v) are determined.
In a second processing stage 402 the motion vector arrays are transformed into absolute motion values, i.e. the length of each motion vector is computed by applying the Euclidean norm:
$q_{\mathrm{abs}}^{-}(h,v) = \lVert q^{-}(h,v) \rVert_{2}, \quad q_{\mathrm{abs}}^{+}(h,v) = \lVert q^{+}(h,v) \rVert_{2}$
In a third processing stage 403 forward and backward search radii r−, r+ are determined.
In a fourth processing stage 404 backward and forward coherence violation detection arrays are determined, i.e. masks dlow− and dlow+ where absolute forward and backward motion is greater than a certain threshold which is lower than the forward and backward search radii r− and r+:
$d_{\mathrm{low}}^{-}(h,v) = q_{\mathrm{abs}}^{-}(h,v) > r^{-}(h,v) \cdot T$ with $T < 1$
$d_{\mathrm{low}}^{+}(h,v) = q_{\mathrm{abs}}^{+}(h,v) > r^{+}(h,v) \cdot T$ with $T < 1$
In a fifth processing stage 405 a defect detection array dlow is determined by combining masks dlow− and dlow+:
$d_{\mathrm{low}} = d_{\mathrm{low}}^{-} \;\&\; d_{\mathrm{low}}^{+}$
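Stages 402, 404 and 405 can be sketched together in Python as follows. The sketch assumes that each motion vector is stored as a pair of components and that all arrays are 2D lists indexed [v][h]. The function name low_mask and the sample values are hypothetical.

```python
import math


def low_mask(q_back, q_fwd, r_back, r_fwd, factor):
    """d_low = (|q-| > r- * factor) & (|q+| > r+ * factor), element-wise."""
    height, width = len(q_back), len(q_back[0])
    d_low = [[False] * width for _ in range(height)]
    for v in range(height):
        for h in range(width):
            # Euclidean length of the backward and forward displacement vectors.
            q_abs_back = math.hypot(*q_back[v][h])
            q_abs_fwd = math.hypot(*q_fwd[v][h])
            d_low[v][h] = (q_abs_back > r_back[v][h] * factor
                           and q_abs_fwd > r_fwd[v][h] * factor)
    return d_low


# Hypothetical 1x2 example with search radius 4 and factor T = 0.5.
q_back = [[(3, 4), (0, 1)]]
q_fwd = [[(6, 8), (1, 0)]]
radii = [[4, 4]]
print(low_mask(q_back, q_fwd, radii, radii, 0.5))
```

The first pixel was found at distances 5 and 10, both beyond the reduced radius 2, and is therefore flagged. The second pixel was found at distance 1 in both directions and is not flagged.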
In a sixth processing stage 406 defective objects in dlow are identified. For example, a binary morphological reconstruction binrecon( ) of mask dlow is performed, for example using a marker d. This results in a detection mask dcomplete where shapes of dirt are more completely resolved.
Referring now to
The apparatus 500 shown in
The apparatus 500 further comprises a temporal coherence detection unit 503 configured to detect for the plurality of pixels temporal coherence violations between the frame and the preceding frame and between the frame and the succeeding frame depending on said absolute motion values.
Furthermore, the apparatus 500 comprises a defect detection unit 504 configured to determine at least one pixel of the plurality of pixels as defective if it detects corresponding temporal coherence violations between the frame and the preceding frame and the succeeding frame.
The shown apparatus 500 further comprises at least one memory 505 arranged to at least temporarily store or buffer, e.g., frames of the image sequence, as well as other values calculated during the subsequent processing, such as absolute motion values and temporal coherence violations. In another embodiment, the apparatus 500 does not contain the memory 505 but is connected or connectable to the memory by means of an interface.
In the embodiment shown in
In the shown embodiment the memory 505 is connected to the motion determination unit 502, the temporal coherence detection unit 503 and the defect detection unit 504. In other embodiments some or all of the units are indirectly connected to the memory or the memory is provided as a plurality of separate memory devices.
Referring now to
The input 601 is connected to provide a received frame, for example as shown frame y(t+1), to the first frame buffer 603, the first absolute motion estimation unit 606, the third absolute motion estimation unit 608 and the first temporal coherence violation detection unit 612.
The first frame buffer 603 is connected to provide a previously buffered frame of the sequence, y(t) in the shown example, to the second frame buffer 604, the first absolute motion estimation unit 606, the second absolute motion estimation unit 607, the first temporal coherence violation detection unit 612 and the second temporal coherence violation detection unit 613. The second frame buffer 604 is connected to provide a previously buffered frame of the sequence, y(t−1) in the shown example, to the second absolute motion estimation unit 607, the third absolute motion estimation unit 608 and the second temporal coherence violation detection unit 613.
The first absolute motion estimation unit 606 is configured to determine forward absolute motion values m+(t) by evaluating y(t) and y(t+1) and to provide m+(t) to the first minimum determination unit 610. The second absolute motion estimation unit 607 is configured to determine backward absolute motion values m−(t) by evaluating y(t) and y(t−1) and to provide m−(t) to the second minimum determination unit 611. The third absolute motion estimation unit 608 is configured to determine absolute motion values m−+(t) between the preceding and the succeeding frame of y(t) by evaluating y(t−1) and y(t+1) and to provide m−+(t) to the first minimum determination unit 610 and the second minimum determination unit 611.
The first minimum determination unit 610 is configured to determine the forward displacement radii or search radii r+(t) as the minimum of m+(t) and m−+(t) and to provide r+(t) to the first temporal coherence violation detection unit 612. The second minimum determination unit 611 is configured to determine the backward displacement radii or search radii r−(t) as the minimum of m−(t) and m−+(t) and to provide r−(t) to the second temporal coherence violation detection unit 613.
The first temporal coherence violation detection unit 612 is configured to determine a mask or array d+(t) of detected forward temporal coherence violations by evaluating y(t) and y(t+1) with respect to r+(t). The second temporal coherence violation detection unit 613 is configured to determine a mask or array d−(t) of detected backward temporal coherence violations by evaluating y(t) and y(t−1) with respect to r−(t).
The AND-determination unit 614 is configured to determine a mask or array d(t) of detected defects by evaluating d+(t) and d−(t) and to provide the defect detection mask d(t) to an output 615.
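The data flow through the two frame buffers and the downstream units can be mimicked in software by a sliding window of three frames. The Python sketch below is an illustration only. The motion estimator and the violation detector are passed in as callables because their internals are not fixed here, and the toy stand-ins used in the example are assumptions that do not correspond to any described unit.

```python
from collections import deque


def elementwise_min(a, b):
    """Element-wise minimum of two equally sized 2D lists."""
    return [[min(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def detect_stream(frames, motion, violations):
    """Yield (t, d(t)) for every frame that has a predecessor and a successor.

    motion(a, b) -> 2D list of absolute motion values (stand-in for M)
    violations(cur, ref, radius) -> 2D boolean mask of coherence violations
    """
    window = deque(maxlen=3)              # holds y(t-1), y(t), y(t+1)
    for index, frame in enumerate(frames):
        window.append(frame)
        if len(window) < 3:
            continue                      # buffers not yet filled
        prev, cur, nxt = window
        m_span = motion(prev, nxt)
        r_back = elementwise_min(motion(cur, prev), m_span)
        r_fwd = elementwise_min(motion(cur, nxt), m_span)
        d_back = violations(cur, prev, r_back)
        d_fwd = violations(cur, nxt, r_fwd)
        yield index - 1, [[b and f for b, f in zip(rb, rf)]
                          for rb, rf in zip(d_back, d_fwd)]


# Toy stand-ins, for illustration only.
def toy_motion(a, b):
    return [[1 for _ in row] for row in a]


def toy_violations(cur, ref, radius):
    return [[abs(c - x) > 10 for c, x in zip(rc, rr)] for rc, rr in zip(cur, ref)]


frames = [[[0]], [[100]], [[0]], [[0]]]   # four 1x1 frames, spike in frame 1
results = list(detect_stream(frames, toy_motion, toy_violations))
print(results)
```

A mask is produced for a frame only once its successor has arrived, which mirrors the one-frame latency introduced by buffering.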
Units comprised in the embodiment of the apparatus shown in
As shown in
For example, the processing device can be a processor adapted to perform the steps according to one of the described methods. In an embodiment said adaptation comprises that the processor is configured, i.e. for example programmed, to perform steps according to one of the described methods.
In an embodiment, the apparatus 500, 600 or 700 is a device being part of another apparatus or system, such as, for example, a video processing framework.
As will be appreciated by one skilled in the art, aspects of the present principles can be embodied as an apparatus, a system, method or computer readable medium. Accordingly, aspects of the present principles can take the form of a hardware embodiment, a software embodiment or an embodiment combining software and hardware aspects. Furthermore, aspects of the present principles can take the form of a computer readable storage medium. Any combination of one or more computer readable storage medium(s) may be utilized.
Aspects of the present principles may, for example, at least partly be implemented in a computer program comprising code portions for performing steps of the method according to the present principles when run on a programmable apparatus or enabling a programmable apparatus to perform functions of an apparatus or system according to the present principles.
Further, any shown connection may be a direct or an indirect connection. Furthermore, those skilled in the art will recognize that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or impose an alternate decomposition of functionality upon various logic blocks.
Number | Date | Country | Kind |
---|---|---|---|
14306576.1 | Oct 2014 | EP | regional |