The invention relates to a method and an apparatus for filtering an array of pixels. In particular, the invention relates to a method and an apparatus for filtering an array of pixels with a Domain Transform filter using a normalized convolution or a recursive filter.
In [1] the so-called Domain Transform is described, which facilitates fast, edge-preserving smoothing of input images. It has proven effective in a wide range of applications, including stylization, recoloring, colorization, detail enhancement, tone mapping, and others. The Domain Transform has several key properties and advantages, of which the main ones are the preservation of fine details in the filter input and its efficiency. Its complexity is O(n), where n denotes the number of pixels in the image. This means that the computational complexity is independent of the chosen kernel size.
The Domain Transform builds upon the observation that a color image can be regarded as a 2D manifold in a 5D space, with two spatial coordinates and three colorimetric coordinates, and that edge-preserving smoothing filtering of the 2D image can therefore be carried out with a 5D spatially-invariant kernel, which has a response that decreases with increasing distance among the pixels in 5D. As the advantage of the spatially-invariant filtering would be lost by the high dimensionality of the necessary kernel, the key idea consists in transforming the problem to a lower-dimensional space while preserving the distances between pixels.
It is known that such a distance-preserving transformation, i.e. an isometry, only exists for very special 2D manifolds in a 5D or any other space. Even then it requires slow optimization methods to find mere approximations. Of course, a very complex transformation would again undo the whole gain of the simpler spatially-invariant filtering. Therefore, the authors propose to approximate the two-dimensional filtering of an image by iteratively applying one-dimensional filters to its rows and columns. As the necessary spatially-invariant 1D smoothing filters can be implemented very efficiently, several such iterations may easily be performed while still being orders of magnitude faster than the original edge-preserving smoothing filtering.
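The alternating row and column passes described above can be sketched as follows. This is a minimal illustration, not the implementation from [1]: the function name `filter_rows_then_columns`, the `iterations` parameter, and the idea of passing the 1D filter as a callable that sees only the row to be filtered are assumptions made here for brevity. In the actual Domain Transform the 1D filter would additionally depend on the transformed coordinates of each row or column.

```python
def filter_rows_then_columns(image, filter_1d, iterations=3):
    """Approximate a 2D filter by alternating 1D passes (illustrative sketch).

    image      -- list of rows, each row a list of numbers
    filter_1d  -- hypothetical callable mapping a list of numbers to a
                  filtered list of the same length
    iterations -- number of row/column pass pairs (assumed parameter)
    """
    for _ in range(iterations):
        # Horizontal pass: filter every row independently.
        image = [filter_1d(list(row)) for row in image]
        # Vertical pass: transpose, filter every column, transpose back.
        columns = [filter_1d(list(col)) for col in zip(*image)]
        image = [list(row) for row in zip(*columns)]
    return image
```

Each pass touches every pixel once, so the cost per iteration stays linear in the number of pixels as long as the 1D filter itself is linear in the row length.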
For any 1D signal it is very simple to define an isometry if only the distance between neighboring samples needs to be preserved. This simple transformation can also be implemented very efficiently. The 1D signal is merely warped in such a way that the spacing between adjacent samples is proportional to their colorimetric and spatial distance. Therefore, a color edge in the input signal translates to a larger sample distance in the transformed domain. Later, during filtering of the signal, a larger sample distance either translates to a smaller weight or even pushes a sample out of the filter window. In the latter case its influence on the current sample is eliminated altogether. In effect, this preserves edges during the filtering process, thereby achieving the key goal of the filter.
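The warping of a 1D signal can be illustrated with a short sketch. It assumes the accumulation used in [1], where each step adds one unit of spatial distance plus a scaled sum of absolute color differences; the function name `domain_transform_1d` and the parameters `sigma_s` and `sigma_r` are illustrative names, not taken from the present text.

```python
def domain_transform_1d(signal, sigma_s=1.0, sigma_r=1.0):
    """Return warped coordinates for a 1D signal (illustrative sketch).

    signal  -- list of color tuples, e.g. [(r, g, b), ...]
    sigma_s -- assumed spatial scale parameter
    sigma_r -- assumed range (color) scale parameter
    """
    t = [0.0]
    for prev, cur in zip(signal, signal[1:]):
        # Colorimetric distance between neighboring samples (L1 norm).
        color_dist = sum(abs(a - b) for a, b in zip(prev, cur))
        # Spatial step of 1 plus the scaled color step.
        t.append(t[-1] + 1.0 + (sigma_s / sigma_r) * color_dist)
    return t
```

For a signal with a step edge, such as `[(0.0,), (0.0,), (1.0,), (1.0,)]`, the spacing between the second and third coordinate becomes larger than the spacing elsewhere, which is what later keeps the two sides of the edge from being mixed.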
Currently there are two alternatives available to the Domain Transform. The first is the classical (Joint) Bilateral Filter as described in [2]. Besides potential gradient reversal artifacts, it suffers from high computational complexity, especially for larger kernel sizes. Its efficient implementation is the subject of ongoing research.
The second alternative to the Domain Transform is the Guided Image Filter as described in [3]. Its main advantages over the (Joint) Bilateral Filter are that it does not suffer from gradient reversal artifacts and that it can be implemented very efficiently. Its complexity is O(n), where n denotes the number of pixels in the image. While the Domain Transform and the Guided Image Filter share this property, their actual filter behavior differs.
One difference is that the Domain Transform does not assume any linear relationship between the colors of some Guiding Image and the desired filter output. In this sense, it is more general. On the other hand, this linearity assumption leads to the ability of the Guided Image Filter to transfer structures from the Guiding Image to the filter output. This ability currently makes it the best choice for alpha matting. The Domain Transform lacks this capability.
Furthermore, the Domain Transform is reportedly even faster than the Guided Image Filter. However, it iteratively uses 1-dimensional filters horizontally and vertically in an alternating fashion to approximate the desired 2-dimensional filtering. Although this may be sufficient in most situations, it may not be good enough for more complex image structures.
In addition, the Domain Transform offers the choice of three different 1-dimensional filters with different filter responses. This offers flexibility, e.g. for filtering with or without edge sharpening.
Overall, the Domain Transform and the Guided Image Filter each have their advantages and disadvantages and the choice of the appropriate filter depends on the application.
In many applications a confidence map accompanies the data to be filtered. A confidence value provides a measure of how reliable an element in the input data is deemed to be. A method that improves the quality of the (Joint) Bilateral Filter's output by taking confidence information associated with the filter input into account is disclosed in [4]. In European Patent Application EP13306804.9 a similar method is described for the Guided Image Filter.
It is an object to propose an improved solution for filtering an array of pixels with a Domain Transform filter.
According to one embodiment, a method for filtering an array of pixels comprises:
Accordingly, a computer readable storage medium has stored therein instructions enabling filtering an array of pixels, which, when executed by a computer, cause the computer to:
Also, in one embodiment an apparatus configured to filter an array of pixels comprises:
In another embodiment, an apparatus configured to filter an array of pixels comprises a processing device and a memory device having stored therein instructions, which, when executed by the processing device, cause the apparatus to:
The proposed approach extends the Domain Transform by taking into account additional weights. This Weighted Domain Transform can be used in a broad range of applications. In general, it is useful whenever noisy data that is accompanied by information that allows determining weights is to be smoothed, for example, but not limited to, in alpha matting, disparity estimation, and optical flow estimation. Because of its low computational complexity and its small memory requirements, the described method is well suited for mobile devices and for processing of high-resolution content.
In one embodiment, the weights associated with samples of the Domain Transform signal are determined from confidence values. In many applications a confidence map accompanies the data to be filtered and can be used without additional computationally intensive processing.
In one embodiment, the Domain Transform signal is filtered using a Weighted Normalized Convolution. In this case the confidence values may be directly used as weights. This approach already delivers improved results, while the necessary computational overhead is rather small. Furthermore, the extension maintains the overall complexity of O(n) of the filter.
In one embodiment, the Domain Transform signal is filtered using a Weighted Recursive Filtering. In this case, the weights associated with samples of the Domain Transform signal are preferably determined by comparing a confidence of a current sample with an average confidence value accumulated so far. This approach delivers further improved results, though with a somewhat higher computational overhead. Nonetheless, the overall complexity of O(n) of the filter is again maintained.
In one embodiment, a filter feedback is increased when the confidence of the current sample is smaller than the average confidence accumulated so far, whereas the filter feedback is reduced when the confidence of the current sample is larger than the average confidence accumulated so far. In other words, when the confidence of the current sample is smaller than the average confidence accumulated so far, the smoothing effect of the filter is strengthened. Vice versa, in case it is larger, the smoothing effect of the filter is attenuated.
For a better understanding the proposed solution shall now be explained in more detail in the following description with reference to the figures. It is understood that the proposed solution is not limited to these exemplary embodiments and that specified features can also expediently be combined and/or modified without departing from the scope of the present invention.
One embodiment of an apparatus 20 configured to perform the method of
Another embodiment of an apparatus 30 configured to perform the method of
For example, the processing device 31 can be a processor adapted to perform the steps according to one of the described methods. In an embodiment said adaptation comprises that the processor is configured, e.g. programmed, to perform steps according to one of the described methods.
In the original paper on the Domain Transform [1] three alternative implementations are proposed for filtering the non-uniformly sampled, 1-dimensional signal obtained by the transformation, namely Normalized Convolution, Recursive Filtering, and Interpolated Convolution. According to the present principles, the filtering of the non-uniformly sampled, 1-dimensional signal obtained by the transformation is extended by taking into account additional weights. Details for the implementations using Normalized Convolution and Recursive Filtering shall be described in the following.
The Interpolated Convolution filters the transformed signal by integrating over the area obtained by linearly connecting the non-uniformly spaced samples. As a consequence, every sample affects its two adjacent areas so that it would be rather intricate to introduce a weighting for individual samples. On the other hand, adjacent samples with a large spacing in-between them result in a large area, at least as long as they are within the integration interval, so that they have a big influence on the filter output. This behavior somewhat contradicts the principal idea behind the Domain Transform that a large distance between samples indicates a loose coupling. Therefore, the use of additional weights has not been implemented for this filter.
In one embodiment, additional weights are taken into account during Normalized Convolution of the Domain Transform signal.
In the original version of the filter, the filter output is computed as
$$J(p) = \frac{1}{K_p} \sum_{q \in \Omega} I(q)\, H\big(t(\hat{p}), t(\hat{q})\big).$$
Here, $p$ denotes the pixel to be filtered, $q$ denotes a pixel in the neighborhood $\Omega$ of $p$, $\hat{p}$ and $\hat{q}$ denote their coordinates in the high dimensional space (one spatial and three colorimetric coordinates in this case), respectively, $t(\hat{x})$ denotes the Domain Transform of $\hat{x}$, and $H$ denotes a simple box filter kernel whose coefficients are one for neighbors within the filter radius, and zero otherwise. $I$ denotes the input signal to be filtered, $J$ denotes the filtered signal, and finally $K_p = \sum_{q \in \Omega} H\big(t(\hat{p}), t(\hat{q})\big)$ is a normalization factor.
According to the proposed extension, the filter output is computed by taking additional weights associated with the filter input into account as follows:
$$J(p) = \frac{1}{\tilde{K}_p} \sum_{q \in \Omega} w(q)\, I(q)\, H\big(t(\hat{p}), t(\hat{q})\big).$$
In the above equation $w(q)$ denotes the weight associated with pixel $q$ and $\tilde{K}_p = \sum_{q \in \Omega} w(q)\, H\big(t(\hat{p}), t(\hat{q})\big)$ is a modified normalization factor. In this embodiment confidence values are used as the additional weights $w(q)$, which are, for example, received as additional input data.
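A minimal sketch of the Weighted Normalized Convolution for a single 1D signal is given below. It assumes that the transformed coordinates `t` are non-decreasing, which allows the box kernel window to be tracked with two moving indices and the sums to be taken from prefix sums, so that the cost stays linear in the number of samples. The function name, the argument names, and the fallback to the unfiltered value when all weights in a window are zero are assumptions made for this illustration.

```python
def weighted_normalized_convolution(values, t, weights, radius):
    """Weighted box filter in the transformed domain (illustrative sketch).

    values  -- input samples I(q)
    t       -- non-decreasing transformed coordinates of the samples
    weights -- per-sample weights w(q), e.g. confidence values
    radius  -- box filter radius in the transformed domain
    """
    n = len(values)
    # Prefix sums of w(q) and w(q) * I(q).
    sum_w = [0.0]
    sum_wv = [0.0]
    for v, w in zip(values, weights):
        sum_w.append(sum_w[-1] + w)
        sum_wv.append(sum_wv[-1] + w * v)

    out = []
    lo = 0  # first index inside the window
    hi = 0  # one past the last index inside the window
    for p in range(n):
        while t[lo] < t[p] - radius:
            lo += 1
        while hi < n and t[hi] <= t[p] + radius:
            hi += 1
        norm = sum_w[hi] - sum_w[lo]  # modified normalization factor
        if norm > 0.0:
            out.append((sum_wv[hi] - sum_wv[lo]) / norm)
        else:
            # Assumed fallback: no reliable neighbor, keep the input value.
            out.append(values[p])
    return out
```

With all weights equal to one, the sketch reduces to the unweighted Normalized Convolution. A sample with weight zero, for instance an outlier with no confidence, contributes to neither the numerator nor the normalization factor.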
In another embodiment, additional weights are taken into account during Recursive Filtering of the Domain Transformed signal. In the standard version of the filter, the filter output is computed as
$$J_n = v_n J_{n-1} + (1 - v_n) I_n.$$
Here, $I$ denotes the input signal subject to the filtering, $J$ denotes the filtered signal, $n$ denotes the sample index, and $v_n = a^d$
According to the proposed extension, the filter output is computed as:
In this equation Kn=
$$\bar{c}_n = v_n \bar{c}_{n-1} + (1 - v_n) c_n.$$
Consequently, the proposed filter adjusts the influence of the filter feedback based on the confidence of the current sample and the average confidence accumulated so far. In case the confidence of the current sample is smaller than the average confidence, the influence of the filter feedback is increased, thus the smoothing effect of the filter is strengthened. Vice versa, in case it is larger, the filter feedback is reduced, thus the smoothing effect of the filter is attenuated.
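The behavior described above can be illustrated with a short sketch of a single left-to-right pass. The standard recursion uses a feedback $v_n = a^d$, where $d$ is taken here to be the spacing between neighboring samples in the transformed domain, as in [1]. The exact weighted update equation is not reproduced in this text, so the weighted variant below is only one assumed form that exhibits the described behavior: the feedback term is scaled by the running average confidence and the input term by the confidence of the current sample, followed by normalization. All function and variable names are hypothetical.

```python
def recursive_filter(signal, t, a):
    """Standard recursion J_n = v_n * J_{n-1} + (1 - v_n) * I_n."""
    out = [signal[0]]
    for n in range(1, len(signal)):
        v = a ** (t[n] - t[n - 1])  # feedback, assuming v_n = a^d
        out.append(v * out[-1] + (1.0 - v) * signal[n])
    return out


def weighted_recursive_filter(signal, t, conf, a):
    """Confidence-aware recursion (ASSUMED form, not the source equation)."""
    out = [signal[0]]
    avg_conf = conf[0]  # average confidence accumulated so far
    for n in range(1, len(signal)):
        v = a ** (t[n] - t[n - 1])
        fb = v * avg_conf          # weight of the filter feedback
        ff = (1.0 - v) * conf[n]   # weight of the current sample
        norm = fb + ff
        if norm > 0.0:
            out.append((fb * out[-1] + ff * signal[n]) / norm)
        else:
            # Assumed fallback: no confidence at all, hold the last output.
            out.append(out[-1])
        # Running average of the confidence, same recursion as the filter.
        avg_conf = v * avg_conf + (1.0 - v) * conf[n]
    return out
```

In this assumed form the effective feedback is `fb / norm`. It exceeds `v` when the confidence of the current sample is below the running average and falls below `v` when it is above, and for constant confidence the weighted variant coincides with the standard recursion.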
In case cn=const=
In the context of alpha matting, where edge-preserving smoothing filters are often used as a post-processing step for the refinement of an initial (noisy) alpha matte, the preferred filter currently is the Guided Image Filter. Nonetheless, in the following the effectiveness of the proposed extensions for this field of application shall be demonstrated by comparing the results from the standard Domain Transform and the extended version.
For the performance evaluation the challenging alpha matting benchmark training data set from www.alphamatting.com was used, for which ground-truth alpha mattes are available. A Global Matting approach as described in [5] was used to generate the initial alpha mattes. The Global Matting algorithm also computes confidences associated with the alpha values. These were used as the weights for the proposed approaches.
Regarding the performance of the Weighted Normalized Convolution,
Regarding the performance of the Weighted Recursive Filter,
[1] E. S. L. Gastal et al.: “Domain Transform for Edge-Aware Image and Video Processing”, ACM Transactions on Graphics—Proceedings of ACM SIGGRAPH 2011, Vol. 30 (2011), pp. 69:1-69:12.
[2] G. Petschnigg et al.: “Digital photography with flash and no-flash image pairs”, ACM Transactions on Graphics (TOG)—Proceedings of ACM SIGGRAPH 2004, Vol. 23 (2004), pp. 664-672.
[3] K. He et al.: “Guided Image Filtering”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35 (2013), pp. 1397-1409.
[4] J. Jachalsky et al.: “Confidence evaluation for robust, fast-converging disparity map refinement”, 2010 IEEE International Conference on Multimedia and Expo (ICME), pp. 1399-1404.
[5] K. He et al.: “A Global Sampling Method for Alpha Matting”, Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2049-2056.
Number | Date | Country | Kind |
---|---|---|---|
14306869.0 | Nov 2014 | EP | regional |