Matting techniques originated in the film industry as early as the late nineteenth century as a way to replace the background behind a subject. The traditional approach captured the subject directly against a single uniformly illuminated and uniformly colored background [See section titled “REFERENCES” R1,R2] and then replaced the background during post-processing. However, those steps required a special hardware setup, and the subject could be captured only in a controlled environment. Digital matting, a process of extracting and compositing foreground and background objects directly from media, has therefore become a significant technology for general image/film editing and production.
In digital matting, a matte is represented by the variable α, which defines the opacity of the subject/foreground at each pixel. The observed image is represented as a convex combination of foreground and background layers. Most matting processes restrict α to the interval [0,1] at each pixel. The matting equation can be written as
$$I = \alpha F + (1 - \alpha) B$$ Equation (1)
where I is the observed image, and F and B are the foreground and background colors, respectively. This compositional model is the well-known matting equation. However, this matting equation is severely under-constrained. For a single-channel image, one must estimate three unknown values α, F, B at each pixel where only one value I is known. Setting constraints on the α values of certain pixels simplifies the problem. The constrained matting problem can be viewed as supervised image segmentation where certain α values are known. Training/ground-truth α values are represented using a trimap or a set of scribbles. A trimap, as the name indicates, specifies foreground and background pixels in addition to unknown pixels for which α, F, B need to be estimated. Scribbles are a sparse representation of a trimap, where the user provides samples of foreground and background pixels in the form of scribbles, using a brush of fixed width. Even though trimaps are harder to produce manually than scribbles, trimaps can be produced automatically from an initial bounding box supplied by object detection processes, such as [See section titled “REFERENCES” R3].
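Equation (1) and its ambiguity can be illustrated with a short Python sketch. The function name `composite` and the toy pixel values are assumptions chosen for this example only; they are not part of the disclosure.

```python
import numpy as np

def composite(alpha, foreground, background):
    """Blend two layers per Equation (1): I = alpha*F + (1 - alpha)*B."""
    alpha = np.clip(np.asarray(alpha, dtype=float), 0.0, 1.0)
    foreground = np.asarray(foreground, dtype=float)
    background = np.asarray(background, dtype=float)
    if foreground.ndim == 3:
        # Broadcast a per-pixel matte over the color channels.
        alpha = alpha[..., None]
    return alpha * foreground + (1.0 - alpha) * background

# Toy 1x3 single-channel image: a background, a mixed, and a foreground pixel.
alpha = np.array([[0.0, 0.5, 1.0]])
F = np.full((1, 3), 200.0)
B = np.full((1, 3), 100.0)
I = composite(alpha, F, B)

# A different (alpha, F, B) triple reproduces the mixed pixel's value,
# which is why the problem is under-constrained without user input.
same = composite(np.array([[1.0]]), np.array([[150.0]]), np.array([[0.0]]))
```

The middle pixel of `I` and the single pixel of `same` hold the same observed value even though their α, F, and B differ, so the observation alone cannot identify the matte.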
There are various approaches to digital matting. Poisson matting [See section titled “REFERENCES” R7] assumes that the foreground and background colors are locally smooth, so that the gradient of the matte is locally proportional to the gradient of the image. The matte is estimated by solving a Poisson equation with Dirichlet boundary conditions extracted from the trimap. Random walk matting [See section titled “REFERENCES” R8] defines the affinity between neighboring pixels as a Gaussian of the norm of their distance in color space. Neighboring pixels with similar colors have higher affinities than those with dissimilar colors. The matte for a pixel can be viewed as the probability that a random walker starting from this pixel will reach a foreground pixel without passing through a background pixel. Closed-form weights [See section titled “REFERENCES” R4, R5] assume a color line model, where local foreground colors lie on a straight line in RGB space and background colors also form a line, though not necessarily the same line as the foreground. Under this assumption, the matte is shown to be a linear combination of the color components of neighboring pixels. Unlike in random walk matting, the affinities depend on the mean and variance of the local color channels. Robust matting [See section titled “REFERENCES” R9] improves the robustness of the process to the trimap by sampling only a few representative foreground and background pixels.
Rhemann et al. [See section titled “REFERENCES” R10] show that Levin et al.'s Closed Form Matting approach [See section titled “REFERENCES” R4, R5] outperforms other processes in most cases. Therefore, the closed-form formula [See section titled “REFERENCES” R4, R5] was commonly extended to other applications where a compositional model was applicable as well. For example, Hsu et al. [See section titled “REFERENCES” R11] formulate white balance as a matting problem in chromaticity space. The relative contributions of two light sources are estimated at each pixel using the matting Laplacian. This is later used to neutralize and relight the scene by controlling the light contribution from each source. The haze removal approach in [See section titled “REFERENCES” R12] estimates the medium transmission map using a matting model. The transmission map is an exponential function of scene depth, and hence a depth map can also be derived. Despite its wide usage, Levin et al.'s approach [See section titled “REFERENCES” R4] assumes a simple color line model. However, natural images do not always satisfy the color line model. Singaraju et al. [See section titled “REFERENCES” R13] analyzed the severity of ill-posedness for different scenarios of the color line model and presented a new point-color model to fix those limitations. As should be apparent, an unfulfilled need exists for embodiments that do not rely on such a strong assumption, that can be easily extended to incorporate multiple features in addition to color, and that therefore yield more accurate matting results.
All references listed below are hereby incorporated by reference in their entirety herein.
The following presents a simplified summary of the disclosure in order to provide a basic understanding to the reader. This summary is not an extensive overview of the disclosure and it does not identify key or critical elements of the embodiments disclosed nor delineate the scope of the disclosed embodiments. Its sole purpose is to present some concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.
Systems for manifold learning for matting are disclosed, with methods and processes for making and using the same. The embodiments disclosed herein provide a closed form solution for solving the matting problem by a manifold learning technique, Local Linear Embedding. The transition from foreground to background is characterized by color and texture variations, which should be captured in the alpha map. This intuition implies that neighborhood relationship in the feature space should be preserved in the alpha map. By applying Local Linear Embedding using the disclosed embodiments, the local image variations can be preserved in the embedded manifold, which is the resulting alpha map. Without any strong assumption, such as color line model, the disclosed embodiments can be easily extended to incorporate other features beyond RGB color features, such as gradient and texture information.
The accompanying drawings, which are included as part of the present specification, illustrate the presently preferred embodiments and together with the general description given above and the detailed description of the preferred embodiments given below serve to explain and teach the principles of the disclosed embodiments.
It should be noted that the figures are not drawn to scale and that elements of similar structures or functions are generally represented by like reference numerals for illustrative purposes throughout the figures. It also should be noted that the figures are only intended to facilitate the description of the preferred embodiments of the present disclosure. The figures do not illustrate every aspect of the disclosed embodiments and do not limit the scope of the disclosure.
Systems for manifold learning for matting are disclosed, with methods and processes for making and using the same.
In the following description, for purposes of explanation, specific nomenclature is set forth to provide a thorough understanding of the various inventive concepts disclosed herein. However it will be apparent to one skilled in the art that these specific details are not required in order to practice the various inventive concepts disclosed herein.
Some portions of the detailed description that follow are presented in terms of processes and symbolic representations of operations on data bits within a computer memory. These process descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A process is here, and generally, conceived to be a self-consistent sequence of sub-processes leading to a desired result. These sub-processes are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or “locating” or “finding” or the like, may refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other such information storage, transmission, or display devices.
The disclosed embodiments also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk, including floppy disks, optical disks, CD-ROMS, and magnetic-optical disks, read-only memories (“ROMs”), random access memories (“RAMs”), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
The processes and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method sub-processes. The required structure for a variety of these systems will appear from the description below. In addition, the disclosed embodiments are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the disclosed embodiments.
In some embodiments an image is a bitmapped or pixmapped image. As used herein, a bitmap or pixmap is a type of memory organization or image file format used to store digital images. A bitmap is a map of bits, a spatially mapped array of bits. Bitmaps and pixmaps refer to the similar concept of a spatially mapped array of pixels. Raster images in general may be referred to as bitmaps or pixmaps. In some embodiments, the term bitmap implies one bit per pixel, while a pixmap is used for images with multiple bits per pixel. One example of a bitmap is a specific format used in Windows that is usually named with the file extension of .BMP (or .DIB for device-independent bitmap). Besides BMP, other file formats that store literal bitmaps include InterLeaved Bitmap (ILBM), Portable Bitmap (PBM), X Bitmap (XBM), and Wireless Application Protocol Bitmap (WBMP). In addition to such uncompressed formats, as used herein, the terms bitmap and pixmap also refer to compressed formats. Examples of such bitmap formats include, but are not limited to, JPEG, TIFF, PNG, and GIF, to name just a few, in which the bitmap image (as opposed to vector images) is stored in a compressed format. JPEG usually uses lossy compression. TIFF is usually either uncompressed or losslessly Lempel-Ziv-Welch compressed like GIF. PNG uses deflate lossless compression, another Lempel-Ziv variant. More disclosure on bitmap images is found in Foley, 1995, Computer Graphics Principles and Practice, Addison-Wesley Professional, p. 13, ISBN 0201848406 as well as Pachghare, 2005, Comprehensive Computer Graphics: Including C++, Laxmi Publications, p. 93, ISBN 8170081858, each of which is hereby incorporated by reference herein in its entirety.
In typical uncompressed bitmaps, image pixels are generally stored with a color depth of 1, 4, 8, 16, 24, 32, 48, or 64 bits per pixel. Pixels of 8 bits and fewer can represent either grayscale or indexed color. An alpha channel, for transparency, may be stored in a separate bitmap, where it is similar to a grayscale bitmap, or in a fourth channel that, for example, converts 24-bit images to 32 bits per pixel. The bits representing the bitmap pixels may be packed or unpacked (spaced out to byte or word boundaries), depending on the format. Depending on the color depth, a pixel in the picture will occupy at least n/8 bytes, where n is the bit depth, since 1 byte equals 8 bits. For an uncompressed bitmap packed within rows, such as is stored in the Microsoft DIB or BMP file format or in uncompressed TIFF format, the approximate size in bytes of an n-bit-per-pixel (2^n colors) bitmap can be calculated as size ≈ width × height × n/8, where height and width are given in pixels. This formula does not include the header size or the color palette size, if any. Because of row padding to align the start of each row to a storage unit boundary such as a word, additional bytes may be needed.
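The difference between the approximate size and the padded size can be made concrete with a small Python sketch. The function name `bitmap_bytes` and the default 4-byte row alignment (the convention used by BMP/DIB files) are assumptions for this example; other formats may align rows differently or not at all.

```python
def bitmap_bytes(width, height, bits_per_pixel, row_align_bytes=4):
    """Pixel-data size of an uncompressed, row-packed bitmap (no header or palette)."""
    row_bits = width * bits_per_pixel
    row_bytes = (row_bits + 7) // 8  # pack the row's bits into whole bytes
    # Round each row up to the next multiple of the alignment unit.
    padded_row = -(-row_bytes // row_align_bytes) * row_align_bytes
    return padded_row * height

# 101 x 10 pixels at 24 bits per pixel.
approx = 101 * 10 * 24 // 8        # width * height * n / 8
exact = bitmap_bytes(101, 10, 24)  # each 303-byte row is padded to 304 bytes
```

Here the approximation gives 3030 bytes, while the padded layout needs 3040 bytes, ten more than the formula suggests.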
In computer vision, segmentation refers to the process of partitioning a digital image into multiple regions (sets of pixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images.
The result of image segmentation is a set of regions that collectively cover the entire image, or a set of contours extracted from the image. Each of the pixels in a region share a similar characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristic(s).
Several general-purpose algorithms and techniques have been developed for image segmentation. Exemplary segmentation techniques are disclosed in The Image Processing Handbook, Fourth Edition, 2002, CRC Press LLC, Boca Raton, Fla., Chapter 6, which is hereby incorporated by reference herein for such purpose. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to effectively solve an image segmentation problem for a problem domain.
Throughout the present description, all steps or tasks will be described using one or more embodiments. However, it will be apparent to one skilled in the art that the order of the steps described could change in certain areas, and that the embodiments are used for illustrative purposes and for the purpose of providing an understanding of the inventive properties of the disclosed embodiments.
The embodiments disclosed herein provide a closed-form solution to the matting problem by a manifold learning technique, Local Linear Embedding (LLE) [See section titled “REFERENCES” R14]. The main idea behind applying LLE to the matting problem is that the transition from foreground to background is characterized by color and texture variations, which should be captured in the alpha map. This intuition implies that neighborhood relationships in the feature space should be preserved in the alpha map. By applying LLE using the disclosed embodiments, the local image variations can be preserved in the embedded manifold, which is the resulting alpha map. Without any strong assumption, such as the color line model [See section titled “REFERENCES” R4, R5], the disclosed embodiments can be easily extended to incorporate other features beyond RGB color features, such as gradient and texture information.
A closed-form solution may be derived for the alpha matting problem by manifold learning. The assumption is that all the features of the foreground and background layers extracted from an input image $I$ with $n$ pixels are locally smooth and can therefore be inherently embedded in some smooth manifold. The solution $\alpha_i$ for each image pixel $I_i$ is the one-dimensional embedding approximating the inherent foreground and background manifolds. Let $f_i = [f_{i1}, \ldots, f_{im}]^T$ be the feature vector associated with each image pixel $i$, where $m$ equals the number of features. Define the matrix $F = [f_1, \ldots, f_n]$ and the vector $\alpha = [\alpha_1, \ldots, \alpha_n]^T$. Assuming that the inherent manifolds of the foreground and background layers are well sampled, we can expect the feature vector $f_i$ of each image pixel and its neighbors to lie close to a locally linear patch of the inherent manifold. Consider an embedding that takes a smooth manifold $F$ to $\alpha$. One goal is to find a closed-form solution of this nonlinear inherent embedding.
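As an illustration of how the feature matrix F might be assembled, the following Python sketch stacks RGB values and, optionally, the gradient of luminance into one column per pixel. The function name `feature_matrix`, the Rec. 601 luma weights, and the use of `numpy.gradient` are assumptions made for this example; the disclosure does not prescribe how the features are computed.

```python
import numpy as np

def feature_matrix(image, use_gradient=True):
    """Return F with shape (m, n): one column of m features for each of n pixels."""
    img = np.asarray(image, dtype=float)
    feats = [img[..., c] for c in range(3)]  # R, G, B channels
    if use_gradient:
        # Luminance from Rec. 601 weights (an assumption for this sketch).
        lum = img @ np.array([0.299, 0.587, 0.114])
        gy, gx = np.gradient(lum)            # derivatives along rows, then columns
        feats += [gx, gy]
    # Flatten each feature map in row-major order and stack as rows of F.
    return np.stack([f.ravel() for f in feats], axis=0)

# Toy 4x5 RGB image with a vertical edge between columns 1 and 2.
img = np.zeros((4, 5, 3))
img[:, 2:, :] = 1.0
F_rgb = feature_matrix(img, use_gradient=False)  # m = 3
F_ext = feature_matrix(img)                      # m = 5
```

Pixel i in row-major order corresponds to column i of the returned matrix, so additional features can be appended as extra rows without changing the rest of the pipeline.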
By introducing the approach of Local Linear Embedding (LLE) [See section titled “REFERENCES” R14], we can characterize the local geometry of each feature vector $f_i$ by a set of linear coefficients that reconstruct the feature vector from its neighbors. Denote $w_i = [w_{i1}, \ldots, w_{in}]^T$, where $w_{ij}$ represents the reconstruction weight of neighboring pixel $j$ to pixel $i$, i.e., $\hat{f}_i = \sum_j w_{ij} f_j = F w_i$. Define $w_{ij} = 0$ when pixel $j$ is not in the neighborhood of pixel $i$. Consider the reconstruction error $\varepsilon_i$ for each individual feature vector $f_i$ as a vector in the following form:
$$\varepsilon_i = \hat{f}_i - f_i = F w_i - f_i$$ Equation (2)
Under the constraint $\sum_j w_{ij} = \mathbf{1}^T w_i = 1$, we can rewrite this equation as
$$\varepsilon_i = F w_i - f_i (\mathbf{1}^T w_i) = (F - f_i \mathbf{1}^T) w_i$$ Equation (3)
Then the cost function of the reconstruction error for each pixel $i$ can be written as:
$$\|\varepsilon_i\|^2 = \varepsilon_i^T \varepsilon_i = w_i^T C_i w_i$$ Equation (4)
where $C_i = (F - f_i \mathbf{1}^T)^T (F - f_i \mathbf{1}^T)$ defines the local covariance of $f_i$.
The solution for each $w_i$ can be calculated by minimizing the reconstruction error of Equation 3 in the following closed form [See section titled “REFERENCES” R14]:
$$w_i = \frac{C_i^{-1} \mathbf{1}}{\mathbf{1}^T C_i^{-1} \mathbf{1}}$$ Equation (5)
$w_i$ can also be computed more efficiently by solving the linear system $C_i w_i = \mathbf{1}$ and then normalizing the weights $w_i$ to sum to one.
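The solve-then-normalize route can be sketched in Python as follows. The function name `lle_weights`, the explicit neighbor list, and the small trace-scaled regularization term (commonly added in LLE implementations when the local covariance is singular) are assumptions for this example and are not stated in the disclosure.

```python
import numpy as np

def lle_weights(F, i, neighbors, reg=1e-3):
    """Reconstruction weights of pixel i over `neighbors`, from C_i w = 1."""
    Z = F[:, neighbors] - F[:, [i]]      # columns f_j - f_i, shape (m, k)
    C = Z.T @ Z                          # local covariance restricted to neighbors
    trace = np.trace(C)
    # Regularize so the solve is well posed (an assumption of this sketch).
    C = C + np.eye(len(neighbors)) * reg * (trace if trace > 0 else 1.0)
    w = np.linalg.solve(C, np.ones(len(neighbors)))
    return w / w.sum()                   # normalize the weights to sum to one

# One feature per pixel; pixel 1 sits midway between its two neighbors.
F = np.array([[0.0, 1.0, 2.0]])
w = lle_weights(F, i=1, neighbors=[0, 2])
reconstruction = F[:, [0, 2]] @ w
```

In this toy case the two neighbors receive equal weight and their weighted combination reproduces the feature of pixel 1.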
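It can be shown that the embedding vector $\alpha$ can be determined by minimizing the summation of the cost function of Equation 4 over all $\alpha_i$, with the solved $W = [w_1, \ldots, w_n]$ and the constraint $\mathbf{1}^T w_i = 1$, as follows:
$$\alpha = \arg\min_{\alpha} \sum_i \Big(\alpha_i - \sum_j w_{ij} \alpha_j\Big)^2 = \arg\min_{\alpha} \alpha^T (I_{(n)} - W)(I_{(n)} - W)^T \alpha$$ Equation (6)
where $I_{(n)}$ represents the $n \times n$ identity matrix.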
Note that Equation 6 is equivalent to the quadratic cost function of the matting Laplacian commonly used in many previous approaches [See section titled “REFERENCES” R4, R5, R8, R15]. Since $(I_{(n)} - W)(I_{(n)} - W)^T$ is also a sparse symmetric matrix, we can define a new matting Laplacian matrix by assigning $L = (I_{(n)} - W)(I_{(n)} - W)^T$. There are two ways to solve Equation 6. The first approach, presented by Grady [See section titled “REFERENCES” R8], partitions $\alpha$ into two disjoint sets, $\alpha_k$ and $\alpha_u$, where $k$ and $u$ represent the indices of known and unknown pixels from user input, such as a trimap or scribbles. $L$ can then be rewritten as
$$L = \begin{bmatrix} L_k & B \\ B^T & L_u \end{bmatrix}$$
Then the global minimum can be found by taking the first derivative of Equation 6 with respect to $\alpha_u$ and setting it to zero. This step yields the following sparse linear system:
$$L_u \alpha_u = -B^T \alpha_k$$ Equation (7)
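A minimal Python sketch of this first approach is given below. It uses dense NumPy arrays for brevity, whereas a practical implementation would presumably use a sparse solver; the function names and the three-pixel weight matrix are hypothetical.

```python
import numpy as np

def laplacian_from_weights(W):
    """L = (I - W)(I - W)^T, with column i of W holding the weights w_i."""
    M = np.eye(W.shape[0]) - W
    return M @ M.T

def solve_partitioned(L, known_idx, alpha_known):
    """Solve L_u alpha_u = -B^T alpha_k for the unknown pixels (Equation 7)."""
    n = L.shape[0]
    known_idx = np.asarray(known_idx)
    alpha_known = np.asarray(alpha_known, dtype=float)
    unknown_idx = np.setdiff1d(np.arange(n), known_idx)
    L_u = L[np.ix_(unknown_idx, unknown_idx)]
    B = L[np.ix_(known_idx, unknown_idx)]  # block coupling known rows to unknown columns
    alpha = np.empty(n)
    alpha[known_idx] = alpha_known
    alpha[unknown_idx] = np.linalg.solve(L_u, -B.T @ alpha_known)
    return alpha

# Three-pixel chain: the end pixels are reconstructed from the middle pixel,
# and the middle pixel from the two ends with equal weights.
W = np.array([[0.0, 0.5, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 0.5, 0.0]])
L = laplacian_from_weights(W)
alpha = solve_partitioned(L, known_idx=[0, 2], alpha_known=[0.0, 1.0])
```

With pixel 0 marked as background and pixel 2 as foreground, the unknown middle pixel receives an alpha of 0.5, and only the 1×1 block for the unknown pixel has to be solved.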
The second approach [See section titled “REFERENCES” R4, R15] explicitly includes the constraint of user input by adding a regularization term to the cost function:
$$\alpha = \arg\min_{\alpha} \alpha^T (I_{(n)} - W)(I_{(n)} - W)^T \alpha + (\alpha - \alpha^*)^T D (\alpha - \alpha^*)$$ Equation (8)
where $D$ is a diagonal matrix whose diagonal elements are some constant value $d$ for constrained pixels and zero for all others. $\alpha^*$ contains the alpha values specified by the trimap or scribbles and zeros for all other pixels. The global minimum can again be found by taking the first derivative of Equation 8 with respect to $\alpha$ and setting it to zero. This step yields a larger sparse linear system:
$$\alpha = \big((I_{(n)} - W)(I_{(n)} - W)^T + D\big)^{-1} D \alpha^*$$ Equation (9)
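Equation (9) can likewise be sketched in Python. The function name `solve_regularized`, the dense solve, the default value of `d`, and the three-pixel weight matrix are assumptions for illustration only.

```python
import numpy as np

def solve_regularized(W, constrained_idx, constrained_alpha, d=1000.0):
    """alpha = ((I - W)(I - W)^T + D)^{-1} D alpha*  (Equation 9)."""
    n = W.shape[0]
    M = np.eye(n) - W
    L = M @ M.T
    diag = np.zeros(n)
    diag[constrained_idx] = d                 # diagonal of D: d on constrained pixels
    alpha_star = np.zeros(n)
    alpha_star[constrained_idx] = constrained_alpha
    # D is diagonal, so D @ alpha_star is an elementwise product.
    return np.linalg.solve(L + np.diag(diag), diag * alpha_star)

# Hypothetical three-pixel chain; column i of W holds the weights w_i.
W = np.array([[0.0, 0.5, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 0.5, 0.0]])
alpha = solve_regularized(W, constrained_idx=[0, 2], constrained_alpha=[0.0, 1.0])
```

Unlike the partitioned solve, the constrained pixels are only pulled toward their specified values, so they match the user input more closely as `d` grows.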
An advantage of the first approach is that only a small portion of L needs to be computed and the linear system to be solved is smaller. However, the second approach utilizes the full Laplacian matrix and might yield more accurate matting results.
The following addresses the differences between the proposed closed-form solution and those in previous work [See section titled “REFERENCES” R8, R4, R5, R15]. The main difference among those approaches is how the Laplacian matrix is constructed. Random Walk [See section titled “REFERENCES” R8] and Normalized Cuts [See section titled “REFERENCES” R16] apply Gaussian weights to construct the affinity matrix by:
$$\tilde{w}_{ij} = e^{-\|I_i - I_j\|^2 / \sigma^2}$$
and then the matting Laplacian matrix is defined by $L = D - \tilde{W}$, where $D$ is a diagonal matrix and $d_{ii} = \sum_j \tilde{w}_{ij}$. The major problem with this affinity is that $\sigma$ is a global constant, which does not depend on the local variance of each image and therefore needs to be handpicked carefully for different images in order to produce good-quality results.
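The sensitivity to the global constant can be seen in a short Python sketch. It assumes the common Gaussian form exp(−‖I_i − I_j‖²/s²), with the global constant passed as the hypothetical parameter `scale`; the function name `gaussian_laplacian` and the explicit edge list are likewise illustrative.

```python
import numpy as np

def gaussian_laplacian(colors, edges, scale):
    """Graph Laplacian D - W~ with Gaussian color affinities on the given edges."""
    colors = np.asarray(colors, dtype=float)
    n = colors.shape[0]
    W = np.zeros((n, n))
    for i, j in edges:
        # Affinity decays with squared color distance; `scale` is the global constant.
        w = np.exp(-np.sum((colors[i] - colors[j]) ** 2) / scale ** 2)
        W[i, j] = W[j, i] = w
    D = np.diag(W.sum(axis=1))               # d_ii = sum_j w~_ij
    return D - W

# Three single-channel pixels: 0 and 1 are similar, 2 is very different.
colors = [[0.0], [0.1], [1.0]]
edges = [(0, 1), (1, 2)]
L_wide = gaussian_laplacian(colors, edges, scale=0.5)
L_narrow = gaussian_laplacian(colors, edges, scale=0.05)
```

With the wider scale the similar pair keeps an affinity close to one, while with the narrower scale even that pair is nearly disconnected, which illustrates why a single global value has to be tuned per image.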
Levin et al.'s approaches [See section titled “REFERENCES” R4, R5] assume that for each small patch in the input image, the foreground and background layers lie on lines in RGB color space. This strong assumption has some shortcomings. First, natural images do not always satisfy the color line model. Second, it is not clear whether extending the line model to other image features, such as gradient or texture features, will work. Learning based matting [See section titled “REFERENCES” R15] uses a semi-supervised learning approach to construct a different Laplacian matrix. However, this approach still assumes a linear relationship between α and RGB color for each pixel. Therefore, learning based matting cannot generalize well when the input image does not satisfy the assumption.
The disclosed embodiments do not assume any linear relationship between α and RGB color for each pixel. α is simply the embedding result of manifold learning from the feature space. Therefore, the disclosed embodiments do not suffer when an input image violates the assumption of linearity. In addition, the disclosed embodiments can be easily extended to incorporate multiple features, such as gradient and texture information, and therefore yield more accurate matting results. The following shows comparisons on synthetic data in order to illustrate the aforementioned aspects of those well-known approaches. Levin et al.'s approach [See section titled “REFERENCES” R4] may be referred to as Closed Form Matting, and the disclosed embodiments may be referred to as Manifold Learning Matting (MLM).
Note that bleeding is isotropic for Random Walk and for MLM using only RGB color features. Closed Form Matting results in anisotropic bleeding biased along the off-diagonal. Bleeding is smallest, and anisotropically localized along the main diagonal, for MLM using RGB and gradient of luminance as features.
To analyze proximity bias, fix γB=10 pixels and vary γF.
To analyze the effect of gap, fix γF=γB=10 pixels and vary the gap.
As desired, the method of manifold learning for matting may be executable on a conventional general-purpose computer (or microprocessor) system. Additionally, or alternatively, the method of manifold learning for matting may be stored on a conventional storage medium for subsequent execution via the general-purpose computer.
A data storage device 1027, such as a conventional magnetic disk or optical disk and its corresponding drive, is coupled to computer system 1000 for storing information and instructions. The data storage device 1027, for example, can comprise the storage medium for storing the method of manifold learning for matting for subsequent execution by the processor 1010. Although the data storage device 1027 is described as being a magnetic disk or optical disk for purposes of illustration only, the method of manifold learning for matting can be stored on any conventional type of storage media without limitation.
Architecture 1000 is coupled to a second I/O bus 1050 via an I/O interface 1030. A plurality of I/O devices may be coupled to I/O bus 1050, including a display device 1043 and an input device (e.g., an alphanumeric input device 1042 and/or a cursor control device 1041).
The communication device 1040 is for accessing other computers (servers or clients) via a network. The communication device 1040 may comprise a modem, a network interface card, a wireless network interface, or another well-known interface device, such as those used for coupling to Ethernet, token ring, or other types of networks.
The foregoing embodiments of the invention are provided as illustrations and descriptions. They are not intended to limit the invention to the precise form described. In particular, it is contemplated that the functional implementation of the invention described herein may be implemented equivalently in hardware, software, firmware, and/or other available functional components or building blocks, and that networks may be wired, wireless, or a combination of wired and wireless. Other variations and embodiments are possible in light of the above teachings, and it is thus intended that the scope of the invention not be limited by this detailed description, but rather by the claims following.
This application claims priority to U.S. provisional patent Application No. 61/395,078, filed May 10, 2010, which is hereby incorporated by reference herein in its entirety.
Number | Name | Date | Kind |
---|---|---|---|
5757380 | Bardon et al. | May 1998 | A |
5832055 | Dewaele | Nov 1998 | A |
5881171 | Kinjo | Mar 1999 | A |
5930391 | Kinjo | Jul 1999 | A |
6021221 | Takaha | Feb 2000 | A |
6205260 | Crinon et al. | Mar 2001 | B1 |
6222637 | Ito et al. | Apr 2001 | B1 |
6421459 | Rowe | Jul 2002 | B1 |
6429590 | Holzer | Aug 2002 | B2 |
6549646 | Yeh et al. | Apr 2003 | B1 |
6650778 | Matsugu et al. | Nov 2003 | B1 |
6687397 | DeYong et al. | Feb 2004 | B2 |
6697502 | Luo | Feb 2004 | B2 |
6757442 | Avinash | Jun 2004 | B1 |
6834127 | Yamamoto | Dec 2004 | B1 |
6940526 | Noda et al. | Sep 2005 | B2 |
6965693 | Kondo et al. | Nov 2005 | B1 |
7162082 | Edwards et al. | Jan 2007 | B2 |
7315639 | Kuhnigk | Jan 2008 | B2 |
7352900 | Yamaguchi et al. | Apr 2008 | B2 |
7391906 | Blake et al. | Jun 2008 | B2 |
7436981 | Pace | Oct 2008 | B2 |
7502527 | Momose et al. | Mar 2009 | B2 |
7526131 | Weber | Apr 2009 | B2 |
7613355 | Hirano | Nov 2009 | B2 |
7676081 | Blake et al. | Mar 2010 | B2 |
7711146 | Tu et al. | May 2010 | B2 |
7738725 | Raskar et al. | Jun 2010 | B2 |
7822272 | Lei | Oct 2010 | B2 |
7995841 | Lin et al. | Aug 2011 | B2 |
8041081 | Hu | Oct 2011 | B2 |
8094928 | Graepel et al. | Jan 2012 | B2 |
8126269 | Eggert et al. | Feb 2012 | B2 |
8149236 | Nakao et al. | Apr 2012 | B2 |
8165407 | Khosla et al. | Apr 2012 | B1 |
8175369 | Hong et al. | May 2012 | B2 |
8194974 | Skirko | Jun 2012 | B1 |
8462224 | Zhang et al. | Jun 2013 | B2 |
20020071131 | Nishida | Jun 2002 | A1 |
20020076100 | Luo | Jun 2002 | A1 |
20020154828 | Kobilansky et al. | Oct 2002 | A1 |
20020164074 | Matsugu et al. | Nov 2002 | A1 |
20030095701 | Shum et al. | May 2003 | A1 |
20030103682 | Blake et al. | Jun 2003 | A1 |
20030144585 | Kaufman et al. | Jul 2003 | A1 |
20040004626 | Ida et al. | Jan 2004 | A1 |
20040131236 | Chen et al. | Jul 2004 | A1 |
20040179233 | Vallomy | Sep 2004 | A1 |
20040212725 | Raskar | Oct 2004 | A1 |
20040264767 | Pettigrew | Dec 2004 | A1 |
20050008248 | Wang | Jan 2005 | A1 |
20050027896 | Mairs et al. | Feb 2005 | A1 |
20050196024 | Kuhnigk | Sep 2005 | A1 |
20050259280 | Rozzi | Nov 2005 | A1 |
20050271279 | Fujimora et al. | Dec 2005 | A1 |
20060029275 | Li et al. | Feb 2006 | A1 |
20060233448 | Pace et al. | Oct 2006 | A1 |
20060245007 | Izawa et al. | Nov 2006 | A1 |
20060251322 | Palum et al. | Nov 2006 | A1 |
20060269111 | Stoecker et al. | Nov 2006 | A1 |
20070013813 | Sun et al. | Jan 2007 | A1 |
20070247679 | Pettigrew et al. | Oct 2007 | A1 |
20070297645 | Pace et al. | Dec 2007 | A1 |
20080089609 | Perlmutter et al. | Apr 2008 | A1 |
20080123945 | Andrew et al. | May 2008 | A1 |
20080137979 | Perlmutter et al. | Jun 2008 | A1 |
20080292194 | Schmidt et al. | Nov 2008 | A1 |
20090010539 | Guarnera et al. | Jan 2009 | A1 |
20090060297 | Penn et al. | Mar 2009 | A1 |
20090087089 | Hu | Apr 2009 | A1 |
20090129675 | Eggert et al. | May 2009 | A1 |
20090190815 | Dam et al. | Jul 2009 | A1 |
20090251594 | Hua et al. | Oct 2009 | A1 |
20090315910 | Kambhamettu et al. | Dec 2009 | A1 |
20100061628 | Yamada | Mar 2010 | A1 |
20100150423 | Hong et al. | Jun 2010 | A1 |
20100171740 | Andersen et al. | Jul 2010 | A1 |
20100232685 | Yokokawa et al. | Sep 2010 | A1 |
20100316288 | Ip et al. | Dec 2010 | A1 |
20110096228 | Deigmoeller et al. | Apr 2011 | A1 |
Number | Date | Country |
---|---|---|
2005339522 | Dec 2005 | JP |
Entry |
---|
Vezhnevets V. et al., “A Comparative Assessment of Pixel-Based Skin Detection Methods”, Technical Report, Graphics and Media Laboratory, 2005. |
Lienhart R. et al., “Empirical analysis of detection cascades of boosted classifiers for rapid object detection”, MRL Technical Report, Intel Labs, May 2002. |
Wang et al. “Soft scissors: an interactive tool for real time high quality matting”, ACM Transactions on Graphics, Proceedings of the SIGGRAPH Conference, vol. 26, issue 3, Jul. 2007. |
Foley, Computer Graphics: Principles and Practice, Addison-Wesley Professional, p. 13, 1995. |
Russ, Chapter 6—Segmentation and Thresholding, The Image Processing Handbook, 4th Ed., pp. 333-381, 2002. |
Grady et al., Random Walks for Interactive AlphaMatting, Visualization, Imaging, and Image Processing: Fifth IASTED International Conference Proceedings, 2005. |
Shi J. et al., Normalized Cuts and Image Segmentation, IEEE Trans. On Pattern Analysis and Machine Intelligence, vol. 22, No. 8, pp. 888-905, Aug. 2000. |
Viola P. et al., Rapid object detection using a boosted cascade of simple features, Proceedings of Computer Vision and Pattern recognition, vol. 1, pp. 511-518, 2001. |
Levin A. et al., A Closed-Form Solution to Natural Image Matting, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 30, No. 2, Feb. 2008. |
Beyer W., Traveling matte photography and the blue screen system. American Cinematographer. The second of a four-part series, 1964. |
He K. et al., Single image haze removal using dark channel prior. Proc. IEEE Conf. on Comp. Vision and Patt. Recog., 2009. |
Hsu E. et al., Light mixture estimation for spatially varying white balance. Proceedings of SIGGRAPH, 2008. |
Levin A. et al., Spectral matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(10), 2008. |
R. Fielding. The technique of special effects cinematography, third edition. Focal/Hastings House, pp. 220-243, 1972. |
Rhemann C. et al., A perceptually motivated online benchmark for image matting. Proc. IEEE Conf. on Comp. Vision and Patt. Recog., 2009. |
Roweis S. T. et al., Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2000. |
Singaraju D. et al., New appearance models for natural image matting. Proc. IEEE Conf. on Comp. Vision and Patt. Recog., 2009. |
Sun J. et al., Flash matting. Proceedings of SIGGRAPH, 25(3): pp. 772-778, 2006. |
Sun J. et al., Poisson matting. Proceedings of SIGGRAPH, 23(3): pp. 315-321, 2004. |
Tenenbaum J. et al., A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500): pp. 2319-2323, 2000. |
Wang J. et al., Optimized color sampling for robust matting. Proc. IEEE Conf. on Comp. Vision and Patt. Recog., 2007. |
Zheng Y. et al., Learning based matting. Int. Conf. on Computer Vision, 2009. |
Pachghare, Comprehensive Computer Graphics: Including C++, Laxmi Publications, p. 93, 2005. |
Bin Abdul Rahman, KC Wei, RGB-H-CbCr Skin-Colour-Model—for—Human Face Detection, 2006. |
Brand J. et al., A Comparative Assessment of Three Approaches to Pixel-level skin-detection, 2000. |
Koschan A. et al., Digital Color Image Processing, ISBN 978-0-470-14708-5, 2007. |
Lipowezky et al: “Using integrated color and texture features for automatic hair detection”, IEEE, 2008. |
Boykov et al., Graph Cuts and Efficient N-D Image Segmentation, International Journal of Computer Vision 70(2), 109-131, 2006. |
Chuang et al., A Bayesian Approach to Digital Matting, Proc. of IEEE CVPR, pp. 264-271, 2001. |
Number | Date | Country | |
---|---|---|---|
20110274344 A1 | Nov 2011 | US |
Number | Date | Country | |
---|---|---|---|
61395078 | May 2010 | US |