1. Field of the Invention
This invention relates generally to a method, system, and device for a chromakey function used in motion picture processing or composition. Such a chromakey function can provide a clean key signal and clean clipped-out foreground pictures without background colors. More specifically, the present invention pertains to a method, system and device for automatic determination of parameters which are used in a chromakey function as well as implementation of such method based on hardware means and software means. The present invention is related to technologies referenced in the List of References section of the present specification, and in brackets throughout the present specification, the entire disclosures of all of which are hereby incorporated by reference herein.
2. Discussion of the Background
Chromakey is a process used for special effects in the motion picture and video industries, in which objects of interest are shot against a backing color into an electronic image, and this image is then used to generate a clipping signal called “Alpha”, “Key” or “Matte” in different application fields. The clipping signal ranges from zero to unity and indicates the regions of the foreground objects, the background colors, and the transition boundaries between the objects and the background color. With such a clipping signal, a new picture can replace the backing color as a new background. The entire replacement process is technically called composition. The history and applications of chromakey and composition are well known and are described in the prior art referenced as [1], [2], [3], [4], [5], [6] and [7] in the List of References.
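As a minimal illustration of the composition step described above, the following sketch blends one foreground pixel with one pixel of the new background. It assumes the convention used later in this specification that alpha equal to unity represents 100% background, and it assumes a plain linear blend; practical keyers typically add backing-color suppression, which is omitted here. The function name `composite_pixel` is hypothetical.

```python
def composite_pixel(foreground, new_background, alpha):
    """Linearly blend one pixel.

    alpha = 1.0 is assumed to mean pure background, alpha = 0.0 pure foreground.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple((1.0 - alpha) * f + alpha * b
                 for f, b in zip(foreground, new_background))


# A half-transparent boundary pixel mixes both pictures equally.
print(composite_pixel((100, 0, 0), (0, 100, 0), 0.5))
```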
Solutions for a chromakey function in the prior art focus on two issues:
Issue 1: pre-defining two-dimensional (2D) areas or three-dimensional (3D) volumes in a color space for background colors and foreground objects. Between the foreground and background areas in the color space are transition color areas contributed by foreground objects' boundaries, semi-transparent objects, and shadows. Usually, the two geometrical shapes do not overlap.
Issue 2: determining the parameters for the two geometrical shapes, such as the centroid of convex closed curves or volumes, sizes of shapes, alpha values between surfaces of two volumes, and so on. In this invention, the terms nominal backing color and reference background color refer to the centroid of a geometrical shape for the background colors.
Many patents and much literature address Issue 1, namely how to construct better geometric shapes to classify and identify the color space; see the Appendix. A recent trend shows that, instead of using geometrical shapes, more complex statistical models are being used to classify and identify color clusters. These mathematical models include Bayesian inference or MAP (maximum a posteriori) estimation and PCA (principal component analysis), as referenced in [8], [9], [10], [11], [12]. Although these methods have so far been published in the academic field and applied to still pictures or non-linear editing, we believe that these complex models will be applied to the motion picture industry as semiconductor technology and computing devices progress.
On the other hand, we find little literature and few patents on how to determine the parameters for those geometric shapes, such as the nominal backing color and its range. Most commercial products rely on user interface (UI) devices to implement Issue 2 as shown in
1) Sampling at different areas for the nominal backing color causes different results if the background colors are not uniform. This is because the alpha value for each pixel is calculated in terms of this nominal backing color. If the nominal backing color changes, the alpha values of pixels also change. Usually, samples from bright areas with high saturation give better results.
2) Once the nominal backing color is determined, it can hardly be modified on the fly, because moving foreground objects could occupy any spot on the background and do not give users enough time to pick proper background samples. When background colors change due to lighting, the predetermined nominal backing color causes changes of alpha in the background area. For example, if a camera aims at a bright part of a large-area backdrop and takes samples for the nominal backing color but later moves to a dark part of the backdrop, the entire background would become dark. This is because the dark background area appears like shadows relative to the bright reference color.
U.S. Pat. No. 5,774,191 discloses a method for determination of a chromakey range based on histograms of one or more color components. It has the following disadvantages.
1) Only one-dimensional histograms are used. Theoretically, a chromakey function is based on the 2D chroma vector or 3D color space, and hence color ranges must be determined by 2D or 3D histograms. However, 2D or 3D histograms require large memory devices to collect statistical data, which dramatically increases the cost of a chromakey function.
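The memory argument above can be made concrete with a short calculation. The sketch below counts histogram bins only; it assumes 8-bit color components (the specification does not state a bit depth), and the function name `histogram_bins` is hypothetical.

```python
def histogram_bins(bits_per_component, dimensions):
    """Number of bins in a joint histogram over `dimensions` color components."""
    return (2 ** bits_per_component) ** dimensions


# Assuming 8-bit components: three separate 1D histograms versus joint 2D and 3D ones.
print("three 1D histograms:", 3 * histogram_bins(8, 1))
print("one 2D histogram:   ", histogram_bins(8, 2))
print("one 3D histogram:   ", histogram_bins(8, 3))
```

Under this assumption a full 3D histogram needs 16,777,216 counters, compared with 768 counters for three one-dimensional histograms, which is the cost gap the paragraph refers to.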
2) Because one-dimensional histograms are used, only a rectangular or cubical shape can be used to define the backing color region, which does not correctly separate foreground colors and background colors in many real cases.
U.S. Pat. No. 5,838,310 also discloses a method for determination of a key color range based on three histograms from the color components R, G, and B respectively. Since this patent uses one-dimensional histograms, as U.S. Pat. No. 5,774,191 does, it has the same disadvantages as above. Unlike the prior art, U.S. Pat. No. 5,838,310 calculates histograms only on those pixels which are identified as background colors. However, this method requires users to define background regions on a picture during an initialization stage, and the optimum range for the key color depends on how the background regions are chosen. Moreover, it requires extra storage, called plane memory, to identify background pixels.
Therefore, there is a need for a method, system, and device that addresses the above and other problems with conventional systems and methods. Accordingly, in exemplary aspects of the present invention, a method, system, and device are provided for generating a clean clipping signal α for a chromakey function or a video composition, including identifying background colors formed by a solid color background, shadows cast by still and moving subjects, a non-uniform reflection caused by spot lighting and a non-flat backdrop or flawed wall, and translucent foreground objects, with a 3D volume in a 3D color space; determining parameters defining the same by using a dirty alpha α; generating a clean clipping signal αshd for background colors, and a clean clipping signal αtsl for translucency colors; identifying foreground colors formed by the still and moving subjects with a 3D volume in a 3D color space; classifying colors into transition colors; and generating an area map for mapping each pixel into background, shadow, translucent, foreground, and transition areas. Advantageously, the exemplary process for identification and classification produces a side effect called an area map, which labels each pixel with one of the preceding areas and becomes useful for a user interface.
Still other aspects, features, and advantages of the present invention are readily apparent from the following detailed description, simply by illustrating a number of exemplary embodiments and implementations, including the best mode contemplated for carrying out the present invention. The present invention also is capable of other and different embodiments, and its several details can be modified in various respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and descriptions are to be regarded as illustrative in nature, and not as restrictive.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
a is a 3D graph that shows a 3D shape defined by the present invention.
b is a 2D graph that shows a projection of the 3D shape in
a is a 3D graph that shows an alternative 3D volume to confine translucency colors.
b is a 2D graph that shows a projection of the 3D shape in
a-b are block diagrams for implementing functions for forming shadow alpha signals and for forming translucency alpha signals.
a-d are graphs that show the results produced by the present invention.
a-d are graphs that show the results for the picture in
a-d are graphs that show the results for the 3rd picture in
A method, system, and device for automatically determining a nominal backing color and a range thereof are described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It is apparent to one skilled in the art, however, that the present invention can be practiced without these specific details or with equivalent arrangements. In some instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
The present invention describes a novel method, system and device to solve the problems posed in the prior art. The exemplary method automatically determines the nominal backing color and its range. It need not collect samples through user interface devices from time to time; instead, users give a rough range of background colors once, by choosing one color from 3 principal colors (red, green, blue) and 3 complementary colors (yellow, cyan, magenta) as the initial color. Since these six colors are very distinctive, users can easily decide which color is a good match to the real background. Once a rough range of background color is determined, the exemplary method can automatically find the optimal nominal backing color and its range, and it can also monitor changes of background colors and recalculate these parameters. In other words, the exemplary method is capable of recalculating parameters on the fly because it collects image data from an entire picture or a large area of a picture, instead of from fixed spots on a picture as in the prior art. From the collected data, parameters are derived and used to determine the nominal backing color and its range on the fly. The exemplary data collection is implemented by accumulating various weighted color components and averaging the accumulated data. This invention also describes means for implementing the exemplary method.
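To illustrate the one-time initial choice among six colors, the sketch below converts each candidate to a luminance value and a chroma vector that could serve as a rough starting point. The conversion uses BT.601-style analog YUV scaling, which is an assumption; the specification does not state which YUV scaling it uses. The names `INITIAL_COLORS`, `rgb_to_yuv`, and `initial_key` are hypothetical.

```python
import math

# The six candidate initial colors as normalized RGB triples.
INITIAL_COLORS = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
    "yellow": (1.0, 1.0, 0.0),
    "cyan": (0.0, 1.0, 1.0),
    "magenta": (1.0, 0.0, 1.0),
}


def rgb_to_yuv(r, g, b):
    """BT.601-style analog YUV (assumed scaling, not taken from the specification)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def initial_key(name):
    """Return (y, u, v) and the chroma hue angle in degrees for a chosen color."""
    y, u, v = rgb_to_yuv(*INITIAL_COLORS[name])
    return (y, u, v), math.degrees(math.atan2(v, u))


for name in INITIAL_COLORS:
    (y, u, v), hue = initial_key(name)
    print(f"{name:8s} y={y:.3f} u={u:+.3f} v={v:+.3f} hue={hue:+.1f}")
```

In this scaling each complementary color has the opposite chroma vector of its counterpart (for example, yellow is the negation of blue), so the six candidates divide the chroma plane into well-separated directions.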
The basic idea behind this invention is motivated by the observation of color distribution in 2D and 3D color space from many chromakey pictures.
The present invention introduces the term shadow axis, which is a half-line starting from the origin (black color 503) and pointing to the nominal backing color in a 3D color space such as the YUV domain, as shown in
Similar to the shadow axis, another term translucency axis is introduced to represent translucent areas' colors. The translucency axis 504 shown in
In the real world, a single translucency axis may not be enough to represent entire transparent/semi-transparent objects in a foreground. Each transparent/semi-transparent object may have different colors rather than non-color material. Therefore, the white light color 505 in
Different from the prior art where background colors only include the solid background color, the present invention considers background colors contributed by shadows, non-uniform light reflection, and transparent/semi-transparent objects. This is because 1) the background color cluster shown in region 1 of
Based on the facts described above, the present invention implements a simple and effective method to classify background and foreground in the following steps.
1) A two-semi-ellipse cylindrical segment is constructed to define the background region, as shown in
In an exemplary implementation, the translucent area can be further confined. Moreover, an alternative way to identify translucency colors is to confine the translucent area to other shapes, such as a wedged cylinder instead of the cylinder 702. An example of such alternative is shown in
2) After identifying and cutting out background colors by using a two-semi-ellipse cylindrical segment, foreground and transition areas are separated by an arbitrary cylinder, and easily processed in 2D chromatic plane, as shown in
The exemplary steps are expanded as follows.
In Equation 1a, the denominator is constant once the nominal backing color has been found.
Since ykey, ukey, and vkey are constant after the optimal nominal backing color is found, (kshd·ykey), (kshd·ukey), and (kshd·vkey) become constants and are easily pre-calculated.
Given an observed color $\vec{C}_{bg} = (y_{bg}, u_{bg}, v_{bg})$ within the translucent area, the alpha αtsl generated for the observed color is given by:
where Ktsl is a scale.
Similar to the shadow colors, Equation 1c is simplified as:
$\alpha_{tsl} = k_{tsl}\,y_{tsl}\,(y_{bg} - y_{wht}) + k_{tsl}\,u_{tsl}\,(u_{bg} - u_{wht}) + k_{tsl}\,v_{tsl}\,(v_{bg} - v_{wht})$ (1d)
where
is called translucency factor which can emphasize or de-emphasize translucency effects.
Since ytsl, utsl, and vtsl are constant after the optimal nominal backing color is found, (ktsl·ytsl), (ktsl·utsl), and (ktsl·vtsl) become constants and are easily pre-calculated.
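The pre-calculation described for Equation 1d can be sketched as follows. The three products are computed once, and each pixel then costs three multiplications and a few additions. The numeric values in the usage lines are purely illustrative, the meaning assigned to (ytsl, utsl, vtsl) as components of the translucency axis is an assumption, and the name `make_translucency_alpha` is hypothetical. No clamping of the result is shown because the specification text available here does not describe one.

```python
def make_translucency_alpha(k_tsl, axis_tsl, white):
    """Return a per-pixel function for the linear form of Equation 1d.

    k_tsl    -- translucency factor
    axis_tsl -- (y_tsl, u_tsl, v_tsl), assumed constant once the key color is known
    white    -- (y_wht, u_wht, v_wht), the white light color
    """
    # Pre-calculated constants (k_tsl*y_tsl), (k_tsl*u_tsl), (k_tsl*v_tsl).
    ky, ku, kv = (k_tsl * c for c in axis_tsl)
    y_w, u_w, v_w = white

    def alpha_tsl(color_bg):
        y, u, v = color_bg
        return ky * (y - y_w) + ku * (u - u_w) + kv * (v - v_w)

    return alpha_tsl


# Illustrative values only.
alpha_tsl = make_translucency_alpha(2.0, (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(alpha_tsl((0.5, 0.0, 0.0)))
```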
The next method used in this invention automatically determines the parameters for the geometrical shapes defined by the preceding method. The present invention introduces another term, bounding rectangle or bounding box, which confines the background color to a rectangular area. By finding the bounding rectangle, we can easily find two semi-ellipses as shown in
The present method for determination of the range rests on the following facts and observations.
1) The nominal backing color is most likely a background color with high saturation. In other words, a color on the shadow axis and far away from the origin has a high likelihood of being the nominal backing color. Given an initial estimate of alpha, where alpha equal to unity represents 100% background, a color with a high value of alpha has a high likelihood of being a background color.
2) Given an initial estimate of alpha, a color with a high value of alpha that lies far away from the shadow axis has a high likelihood of being a background color on an edge of the bounding box.
Fact 1 directs us to the backing color, and Fact 2 helps us to find the bounding box.
Situation 1: occurs as recursive calculation during initialization. In theory, when initial parameters are sent to the closed loop, the recursive calculation is repeated until the results converge. In an exemplary implementation, the calculation can be repeated 3 times, and the results are quite satisfactory.
Situation 2: occurs as dynamic data collection after initialization and during chromakey processing on the fly. Parameters are updated every frame or every several frames.
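The closed-loop recalculation in Situation 1 can be pictured as a simple fixed-point iteration. The sketch below is generic: `update` stands for one pass of parameter estimation over a picture and is a hypothetical placeholder, as are the names `refine`, `passes`, and `tol`. It runs a fixed number of passes (3 by default, matching the exemplary implementation) and can optionally stop early when the parameters stop changing.

```python
def refine(params, update, passes=3, tol=None):
    """Repeat `update` on a tuple of parameters.

    Stops after `passes` iterations, or earlier if `tol` is given and no
    parameter changed by more than `tol` in the last pass.
    """
    for _ in range(passes):
        new_params = update(params)
        if tol is not None and max(
            abs(a - b) for a, b in zip(new_params, params)
        ) <= tol:
            return new_params
        params = new_params
    return params


# Toy update with fixed point 2.0, standing in for one estimation pass.
def toy_update(p):
    return tuple(0.5 * x + 1.0 for x in p)


print(refine((0.0,), toy_update))            # three passes
print(refine((0.0,), toy_update, 100, 1e-9))  # run until converged
```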
Initialization 1501 in
One example for implementation of the low-cost dirty alpha estimator 1401 is described as follows. Given the nominal backing color with chroma vector (ukey,vkey), any input color with chroma vector (u,v) in the UV plane is transformed into (x,z) in the XZ plane by rotation through the angle θkey.
And the norm of the nominal backing color is:
$N_{key} = \lVert \vec{C}_{key} \rVert = \sqrt{u_{key}^2 + v_{key}^2}$. (4)
And then:
Elements 1402, 1403 and 1404 implement the following functions.
1) Calculation for the Nominal Backing Color $\vec{C}_{key}$
We first define a saturation-related measure: the norm difference Δk between an input chromatic vector $\vec{C}$ and the initial nominal backing color vector $\vec{C}_{key0}$.
$\Delta_k = \lVert \vec{C} \rVert - \lVert \vec{C}_{key0} \rVert$ (6)
The weight generator 1402 forms a weight wkey when Δk>0,
wkey=α×Δk, (7)
Then the accumulator 1403 collects the weighted data and the weights pixel by pixel. Finally, the updater 1404 finds the new nominal backing color by:
If Σwkey=0, $\vec{C}_{key}$ is $\vec{C}_{avg}$ shown in Equation 18.
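The weight generator, accumulator, and updater for the nominal backing color can be sketched as below. Equations 6 and 7 are taken from the text. The update formula itself is not reproduced in this text, so the sketch assumes it is the weighted mean of the accumulated chroma vectors, which is consistent with the description of accumulating weighted data and averaging. The function name `update_key_color` and the `fallback` argument (standing in for the average chroma vector of Equation 18) are hypothetical.

```python
import math


def update_key_color(pixels, alphas, key0, fallback):
    """Estimate a new nominal backing chroma vector.

    pixels   -- iterable of (u, v) chroma vectors
    alphas   -- dirty alpha per pixel (1.0 = background)
    key0     -- initial nominal backing chroma vector (u, v)
    fallback -- value returned when the weight sum is zero
    """
    norm0 = math.hypot(*key0)
    sum_w = sum_u = sum_v = 0.0
    for (u, v), alpha in zip(pixels, alphas):
        delta_k = math.hypot(u, v) - norm0   # Equation 6
        if delta_k > 0:
            w = alpha * delta_k              # Equation 7
            sum_w += w                       # accumulate weights
            sum_u += w * u                   # accumulate weighted data
            sum_v += w * v
    if sum_w == 0:
        return fallback
    # Assumed update: weighted mean of the accumulated chroma vectors.
    return (sum_u / sum_w, sum_v / sum_w)


print(update_key_color([(2, 0), (4, 0), (0.5, 0)], [1, 1, 1], (1, 0), (0, 0)))
```

Because the weight grows with both alpha and saturation excess, highly saturated pixels that already look like background dominate the estimate, which matches Fact 1 above.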
2) Calculation for the Maximum Departure from the Nominal Backing Color Along the Shadow Axis
When Δk<0 in Equation 6, we define a weight wleft
wleft=αleft×Δk (9)
The maximum departure $\vec{C}_{left}$ due to shadows is calculated by:
3) Calculation for the Maximum Departure from the Shadow Axis
To measure the departure from the shadow axis, we define δ as the distance from an input chroma vector $\vec{C}$ to the nominal shadow axis $\vec{C}_{key}$.
An easy way to calculate δ is to use the chromatic vector in XZ plane as shown in 1). From Equation 3, we immediately have:
δ=z. (12)
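A minimal sketch of the rotation into the XZ plane, and of reading δ directly from z, is given below. Equation 3 is not reproduced in this text, so the sign convention is an assumption: the rotation is chosen so that the nominal backing color lands on the positive X axis, which makes z the signed distance from the shadow axis. The function name `rotate_to_xz` is hypothetical.

```python
import math


def rotate_to_xz(u, v, u_key, v_key):
    """Rotate chroma (u, v) by the key hue angle so the key lies on the +X axis.

    Returns (x, z); under the assumed convention, delta = z.
    """
    theta_key = math.atan2(v_key, u_key)
    cos_t, sin_t = math.cos(theta_key), math.sin(theta_key)
    x = u * cos_t + v * sin_t
    z = -u * sin_t + v * cos_t
    return x, z


# The key color itself maps to (N_key, 0), so its departure delta is zero.
x, z = rotate_to_xz(3.0, 4.0, 3.0, 4.0)
print(round(x, 9), round(z, 9))
```

With this convention, colors on one side of the shadow axis have δ>0 and colors on the other side have δ<0, which is the distinction used by the two weights that follow.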
When δ>0, we define a weight for maximum departure above the shadow axis as:
wup=α×δ (13)
The maximum departure is found by using 1402, 1403 and 1404:
When δ<0, we define a weight for maximum departure below the shadow axis as:
wdn=α×δ, (15)
4) Calculation for Average of Chroma Components
Another set of parameters is the statistical means of chroma vectors.
5) Calculation for the Bounds of the Bounding Box
The bounding box is defined by four parameters as shown in
6) Calculation for the Offset of Tilted Plane
Referenced to
When Δy>0, the weight is defined as:
wy=α×Δy. (24)
The offset is set as:
7) Calculation for the Two-Semi-Ellipse Cylindrical Segment
There lookup tables are generated for the wedged cylinder by using El, Eb, Et, Eb, S, and $\vec{C}_{key}$.
According to the exemplary embodiments:
We show herein three of our experimental results from the exemplary method. The first experiment works on a picture shown in
In the initial state, a user determines one of six principal colors. Due to a blue background in
c shows the result of the second calculation, where the nominal backing color is found and the range of background colors is confined by two semi-ellipses with the same center.
d shows the result of the third calculation, where the closed boundary curve restricts the background cluster tightly. The second experiment works on a picture in
The third experiment works on a picture shown in
Further exemplary embodiments can be used to produce an area map which displays the different areas classified by the 3D volumes. With the aid of the area map, a chromakey operator can quickly tune the parameters which are automatically determined by using Equations 2-25 when the automatic determination cannot achieve perfect results. An area map uses multiple colors to label different areas on a foreground chromakey picture. One embodiment of the present invention uses five colors to label 1) the solid backing color area, 2) foreground object areas, 3) the shadow area, 4) the translucent area, and 5) the transition area outside the four preceding areas.
The exemplary embodiment of implementing the area map is to label each pixel with one of the five areas during the process of identification and classification with 3D volumes.
Although the exemplary area map uses five colors for five areas, in further exemplary embodiments an area map need not be restricted to the 5 areas. For example, we could uniquely identify different transparency regions on multiple translucency axes, and so on.
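The five-area labeling can be sketched as below. The membership tests against the 3D volumes are not shown; the sketch takes their results as boolean inputs. The precedence order among the tests, the display colors in `PALETTE`, and the names `Area`, `label_pixel`, and `render_area_map` are all hypothetical choices made for illustration.

```python
from enum import Enum


class Area(Enum):
    BACKGROUND = 0   # solid backing color area
    FOREGROUND = 1   # foreground object areas
    SHADOW = 2
    TRANSLUCENT = 3
    TRANSITION = 4   # everything outside the four preceding areas


# Hypothetical display colors (R, G, B) for the operator's area map.
PALETTE = {
    Area.BACKGROUND: (0, 0, 255),
    Area.FOREGROUND: (255, 255, 255),
    Area.SHADOW: (64, 64, 64),
    Area.TRANSLUCENT: (0, 255, 255),
    Area.TRANSITION: (255, 0, 0),
}


def label_pixel(is_background, is_shadow, is_translucent, is_foreground):
    """Map volume-membership flags to one area label (assumed precedence)."""
    if is_background:
        return Area.BACKGROUND
    if is_shadow:
        return Area.SHADOW
    if is_translucent:
        return Area.TRANSLUCENT
    if is_foreground:
        return Area.FOREGROUND
    return Area.TRANSITION


def render_area_map(labels):
    """Convert a 2D grid of Area labels into a 2D grid of display colors."""
    return [[PALETTE[area] for area in row] for row in labels]


print(render_area_map([[label_pixel(True, False, False, False),
                        label_pixel(False, False, False, False)]]))
```

Extending the map beyond five areas, as the preceding paragraph suggests, would amount to adding members to the enumeration and entries to the palette.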
Although the exemplary embodiments described with respect to
The devices and subsystems of the exemplary embodiments described with respect to
As noted above, it is to be understood that the exemplary embodiments, for example, as described with respect to
The exemplary embodiments described with respect to
All or a portion of the exemplary embodiments described with respect to
While the present invention has been described in connection with a number of exemplary embodiments and implementations, the present invention is not so limited but rather covers various modifications and equivalent arrangements, which fall within the purview of the appended claims.
The most important issue is what geometrical shapes can be used to define foreground and background area.
With respect to 2D geometrical shapes in a chromatic plane, U.S. Pat. No. 4,533,937 discloses a conventional method of using a rhombus but suggests a group of quadratic curves, including a polar quadratic curve, ellipse, hyperbola, and parabola; and U.S. Pat. No. 5,812,214 uses a circle to define the background color area and a polygon to separate foreground and transition areas.
With respect to 3D geometrical volumes in a color space, U.S. Pat. No. 5,355,174 uses a smaller polyhedron to define the background and a larger polyhedron wrapping the smaller one to separate foreground and transition areas; U.S. Pat. No. 5,774,191 equivalently uses a box-shaped volume; U.S. Pat. No. 5,903,318 and No. 5,923,381 use a cone-shaped volume and a conical frustum; U.S. Pat. No. 5,719,640 separates a color space into two sub-spaces with a boundary surface or a boundary curve (if the boundary surface or curve is linear, this method can be thought of as a particular case of the polyhedron method in U.S. Pat. No. 5,355,174, except that the background polyhedron is shrunk to a point); and U.S. Pat. No. 6,445,816 equivalently uses an ellipsoid/spheroid.
A color space includes either a chromaticity domain, such as color differences Cr-Cb and its variant U-V, or luminance-chrominance and the three primary colors RGB. Most inventions in chroma key techniques started from one color space and then extended the ideas into other color spaces, with or without proof. Among the various geometrical shapes, the polyhedron method is the most powerful, provided that enough faces are used. This is because the polyhedron has no regular shape and can easily be reshaped to fit various boundaries of color distributions. The other regular solid geometric shapes can hardly separate the complicated real color distributions of a picture into background and foreground areas because of their regularity. However, the complexity of implementing the polyhedron also poses a difficulty for those applications that require high-speed calculation in real time. Even for the regular solid geometrical shapes, the computing cost is high due to the requirement of calculation in a 3D space.
U.S. Pat. No. 4,630,101 discloses a chromakey signal producing apparatus.
U.S. Pat. No. 4,344,085 discloses a comprehensive electronic compositing system in which a background video signal is combined with a foreground video signal.
U.S. Pat. Nos. 5,032,901, 5,424,781, and 5,515,109 disclose backing color and luminance non-uniformity compensation for linear image compositing.
U.S. Pat. No. 5,343,255 discloses a method and apparatus for compositing video image (Ultimatte).
U.S. Pat. No. 5,202,762 discloses a method and apparatus for applying correction to a signal used to modulate a chromakey method and apparatus disclosed in U.S. Pat. No. 5,249,039.
U.S. Pat. No. 5,249,039 discloses a chromakey method and apparatus.
U.S. Pat. No. 5,400,081 discloses a chroma keyer with correction for background defects.
U.S. Pat. No. 5,539,475 discloses a method of and apparatus for deriving a key signal from a digital video signal.
U.S. Pat. No. 5,708,479 discloses a method of inserting a background picture signal into parts of a foreground picture signal, and arrangement for performing the method.
U.S. Pat. No. 6,141,063 discloses a chromakey method and arrangement.
U.S. Pat. No. 5,500,684 discloses a chromakey live-video compositing circuit.
U.S. Pat. No. 5,838,310 discloses a chromakey signal generator.
U.S. Pat. No. 6,011,595 discloses a method for segmenting a digital image into a foreground region and a key color region.
U.S. Pat. Nos. 6,134,345, and 6,134,346 disclose a comprehensive method for removing from an image the background surrounding a selected subject.
U.S. Pat. No. 6,348,953 discloses a method for producing a composite image from a foreground image and a background image.
The present invention claims benefit of priority to U.S. Provisional Patent Application Ser. No. 60/556,504 of LIU et al., entitled “Method, System, and Device for Automatic Determination of Nominal Backing Color and A Range Thereof,” filed Mar. 26, 2004, the entire disclosure of which is hereby incorporated by reference herein.
Number | Date | Country
---|---|---
60556504 | Mar 2004 | US