The present invention is related to commonly assigned U.S. patent application Ser. No. 11/072,502 of LIU et al., entitled “Method, System, and Device for Automatic Determination of Nominal Backing Color and A Range Thereof,” filed Mar. 7, 2005, which claims benefit of priority to U.S. Provisional Patent Application Ser. No. 60/556,504 of LIU et al., entitled “Method, System, and Device for Automatic Determination of Nominal Backing Color and A Range Thereof,” filed Mar. 26, 2004, the entire disclosures of all of which are hereby incorporated by reference herein.
1. Field of the Invention
The present invention relates generally to video processing and live video composition, and more particularly to a method, system and device for video data correction, video data compensation, and the like, for example, for use in live video composition applications, and the like. The present invention is related to technologies referenced in the List of References section of the present specification, and in brackets throughout the present specification, the entire disclosures of all of which are hereby incorporated by reference herein.
2. Discussion of the Background
In recent years, the technique of video data correction has been widely used in video display devices, such as plasma displays, LCDs, other projector devices, and the like. The present invention, however, applies such a technique to a different application: a picture shot in a live video studio, where a character is positioned in front of a solid-color wall, such as a green or blue wall, and a video composition superposes this picture on a new background by cutting off the solid backing color area and replacing it with the new picture content.
Generally, the process of cutting the backing color off a foreground picture is called a chromakey process or chromakey function in a live video composition. This process normally generates a signal called an alpha signal or alpha channel, which identifies various areas on the foreground picture with values ranging from 0 to 1. For example, a value of 1 indicates a foreground object, a value of 0 indicates a backing area, and a value of 0.5 indicates a semitransparent object on the foreground. This alpha signal is also used to represent the transparency or opacity of the foreground picture when overlaid on the new background picture.
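As a minimal illustration of how the alpha signal drives a composition, the blend of foreground and background can be sketched as follows (an editor's sketch in Python with NumPy; the function name and array layout are illustrative assumptions, not part of the specification):

```python
import numpy as np

def composite(foreground, background, alpha):
    """Overlay a foreground frame on a new background using an alpha channel.

    alpha ranges from 0 to 1 per pixel: 1 keeps the foreground object,
    0 shows the new background (former backing area), and intermediate
    values blend the two for semitransparent objects.
    """
    a = alpha[..., np.newaxis]          # broadcast over the color channels
    return a * foreground + (1.0 - a) * background

# A 1x2-pixel example: alpha 1 keeps the foreground, alpha 0 the background.
fg = np.array([[[255.0, 0.0, 0.0], [255.0, 0.0, 0.0]]])   # red foreground
bg = np.array([[[0.0, 0.0, 255.0], [0.0, 0.0, 255.0]]])   # blue background
out = composite(fg, bg, np.array([[1.0, 0.0]]))
```

With an alpha of 0.5, the same formula yields an even blend of the two pictures, corresponding to a semitransparent foreground object.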
Various problems can arise when removing the solid backing color and generating the alpha channel. Accordingly, various data correction techniques have been employed, including methods of correcting various types of flaws, such as seams, brim, wrinkles, and dirty marks on back walls or backdrops. However, such techniques can include various disadvantages and problems that still need to be addressed.
Therefore, there is a need for a method, system, and device that addresses the above and other problems with conventional systems and methods. Accordingly, in exemplary aspects of the present invention, a method, system, and device are provided for dynamic data correction of non-uniformity in live video composition where still and moving subjects shot against a backing area in a solid color such as blue or green by a video camera are clipped out and overlaid on a new background.
According to an aspect of the invention, there is provided a method comprising extracting feature data from input video data, the input video data comprising a subject shot against a backing area in a solid color, generating a curve based on the extracted feature data, forming correction factors based on the generated curve, and correcting at least one of the input video data and alpha data associated with the input video data based on the correction factors.
The feature data may describe with certainty characteristics of non-uniformity in the input video data.
The input video data may comprise one of the following color spaces and respective video signal components:
The operation of extracting may include horizontal accumulation for each row, where at row i:
average data at row i is given by:
vertical accumulation for each column, where at column j:
average data at column j is ƒavgv(j) given by:
The operation of generating may involve modelling the feature data using an L-order polynomial
wherein a sum of weighted square errors is given by:
minimizing a sum of weighted square errors to determine the (L+1) coefficients of the L-order polynomial as:
fitting the horizontal average video data to an M-order polynomial gh(j), wherein an error Eh represents a difference between a fitting model and the average data from the vertical accumulation, Eh being given by:
minimizing Eh to determine a vector of (M+1) coefficients as:
where:
fitting the vertical average video data to an N-order polynomial gv(i), wherein an error Ev represents a difference between a fitting model and the average data from the horizontal accumulation, Ev being given by:
minimizing Ev to determine a vector of (N+1) coefficients as:
where:
The operation of forming may involve determining horizontal multiplicative correction factors as:
determining vertical multiplicative correction factors as:
where dh and dv are constants, determined according to a method selected from the following:
wherein ƒref is a constant represented in a color component selected from the following, corresponding to the type of video data to be accumulated:
wherein ƒref is obtained from one of 1) a reference backing color chosen by a user, from pictures to be processed, by means of a GUI (Graphical User Interface), and 2) a reference backing color automatically generated by an algorithm.
The operation of forming may involve one of:
determining additive correction factors as a difference between a standard reference and the fitting model, a horizontal additive correction factor being given by:
Caddh(j)=dh−gh(j), and
a vertical additive correction factor being given by:
Caddv(i)=dv−gv(i),
wherein only one of the additive correction factors is used in the correcting, and
determining a combined additive correction factor as:
where dhv is a constant selected from the following:
In some embodiments, correcting involves performing a multiplicative correction, an additive correction, or a combined additive and multiplicative correction, the multiplicative correction being given by:
ƒ′(i,j)=ƒ(i,j)*Cmultv(i)*Cmulth(j),
the additive correction being defined by:
ƒ′(i,j)=ƒ(i,j)+Caddhv(i,j), and
the combined correction being defined by combining Cmultv(i), Cmulth(j), Caddv(i), Caddh(j), including a combination selected from the following:
The alpha data may be resultant data which are produced by feeding the input video data into a chromakey function to generate a signal of magnitude ranging from 0 to 1, in which case the alpha data represent the opacity or transparency of the input video data in a video composition process.
The operation of extracting may involve generating a weight for a pixel of the input video data, accumulating the weight from each pixel, accumulating a resultant value of the weight multiplied with the input video data at each pixel to be processed, storing the accumulated weight, and storing the accumulated resultant value.
Such a method may be embodied, for example, in instructions stored on a computer-readable medium.
An apparatus is also provided, and includes a feature extraction unit operable to extract feature data from input video data, the input video data comprising a subject shot against a backing area in a solid color, a data fitting unit operatively coupled to the feature extraction unit and operable to generate a curve based on the feature data extracted by the feature extraction unit and to form correction factors based on the generated curve, and a data correction unit operatively coupled to the data fitting unit and operable to correct at least one of the input video data and alpha data associated with the input video data based on the correction factors.
The feature data may describe with certainty characteristics of non-uniformity in the input video data.
The input video data may include one of the following color spaces and respective video signal components:
The feature extraction unit may be operable to extract the feature data by performing horizontal accumulation for each row, where at row i:
average data at row i is given by:
and by performing vertical accumulation for each column, where at column j:
average data at column j is ƒavgv(j) given by:
The data fitting unit may be operable to generate a curve by: modelling the feature data using an L-order polynomial
wherein a sum of weighted square errors is given by:
minimizing a sum of weighted square errors to determine the (L+1) coefficients of the L-order polynomial as:
fitting the horizontal average video data to an M-order polynomial gh(j), wherein an error Eh represents a difference between a fitting model and the average data from the vertical accumulation, Eh being given by:
minimizing Eh to determine a vector of (M+1) coefficients as:
where:
fitting the vertical average video data to an N-order polynomial gv(i), wherein an error Ev represents a difference between a fitting model and the average data from the horizontal accumulation, Ev being given by:
minimizing Ev to determine a vector of (N+1) coefficients as:
where:
The data fitting unit may be operable to form correction factors by:
determining horizontal multiplicative correction factors as:
determining vertical multiplicative correction factors as:
where dh and dv are constants, determined according to a method selected from the following:
wherein ƒref is a constant represented in a color component selected from the following, corresponding to the type of video data to be accumulated:
wherein ƒref is obtained from one of 1) a reference backing color chosen by a user, from pictures to be processed, by means of a GUI (Graphical User Interface), and 2) a reference backing color automatically generated by an algorithm.
In some embodiments, the data fitting unit is operable to form correction factors by one of:
determining additive correction factors as a difference between a standard reference and the fitting model, a horizontal additive correction factor being given by:
Caddh(j)=dh−gh(j), and
a vertical additive correction factor being given by:
Caddv(i)=dv−gv(i),
wherein only one of the additive correction factors is used in the correcting; and
determining a combined additive correction factor as:
where dhv is a constant selected from the following:
The data correction unit may be operable to correct the at least one of the input video data and the alpha data associated with the input video data by performing a multiplicative correction, an additive correction, or a combined additive and multiplicative correction,
the multiplicative correction being given by:
ƒ′(i,j)=ƒ(i,j)*Cmultv(i)*Cmulth(j),
the additive correction being defined by:
ƒ′(i,j)=ƒ(i,j)+Caddhv(i,j), and
the combined correction being defined by combining Cmultv(i), Cmulth(j), Caddv(i), Caddh(j), including a combination selected from the following:
The alpha data may be resultant data which are produced by feeding the input video data into a chromakey function to generate a signal of magnitude ranging from 0 to 1, in which case the alpha data represent the opacity or transparency of the input video data in a video composition process.
The apparatus may also include a memory operatively coupled to the feature extraction unit. The feature extraction unit may then be operable to extract the feature data by generating a weight for a pixel of the input video data; accumulating the weight from each pixel; accumulating a resultant value of the weight multiplied with the input video data at each pixel to be processed; storing the accumulated weight in the memory; and storing the accumulated resultant value.
Still other aspects, features, and advantages of the present invention will be readily apparent from the following detailed description, simply by illustrating a number of exemplary embodiments and implementations, including the best mode contemplated for carrying out the present invention. The present invention also is capable of other and different embodiments, and its several details can be modified in various respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and descriptions are to be regarded as illustrative in nature, and not as restrictive.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
Referring now to the drawings,
While the term and technique of video data correction are widely mentioned and used in video display devices, such as plasma, LCD, other projector devices, and the like, for example, as in references [1][2][3], the exemplary embodiments employ this term and general procedure to represent a technique and method which can improve the video quality in a live video composition. In order to easily understand the applications of the exemplary embodiments,
As previously described, the process of cutting the backing color off a foreground picture is called a chromakey process or a chromakey function in a live video composition. This process normally generates a signal called an alpha signal or an alpha channel, which identifies various areas on the foreground picture with values ranging from 0 to 1. For example, a value of 1 can be used to indicate a foreground object, a value of 0 can be used to indicate a backing area, and a value of 0.5 can be used to indicate semitransparent objects on the foreground. This alpha signal is also used to represent the transparency or opacity of the foreground picture when overlaid on the new background picture.
Embodiments of the present invention may include recognition that one of the problems that arises when removing the solid backing color and generating the alpha channel is that a non-uniform backing color makes it more difficult for a chromakey function or device not only to clean up the backing color, but also to preserve the foreground object details. A non-uniform backing color may arise due to reflection from a non-uniform backdrop or bumpy wall, for example. A real lighting system also casts non-uniform light. Although most resultant pictures look flat and uniform in the backing area, actual waveforms of the video signals still show obvious curves along the horizontal and vertical scan directions, which can cause general chromakey devices to malfunction.
In theory, data correction is a general technique that can be applied, for example, when the acquisition of a physical property with a physical sensor produces flawed data, which deviate from an ideal or theoretic model due to imperfections of the sensor. Such a technique is most commonly used in the industry of imaging, referred to as shading correction [4]. Some recent techniques are found in references [5][6]. In these applications, shading patterns are completely static, so that they can be pre-determined.
For example,
As an example of data correction applied to image composition, as used in reference with [7],
(1) Only dealing with static camera shots. Movement of a camera, such as a pan/zoom/shift, a change in the lighting system, or even a vibration, can cause failures with such methods.
(2) Lack of immunity to noise. This is because the correction pattern storage normally captures individual and independent pixel data from a single field or frame, rather than a statistical average such as a value accumulated over multiple frames.
(3) Loss of precision due to changes. The correction pattern is generated by removing foreground objects from a scene and comparing the bare backing area with a reference constant. When the foreground objects move into the scene, the brightness from the backing reflection normally changes because the foreground objects disturb the environmental brightness. Therefore, the stored pattern loses its precision.
(4) In theory, the correction factor αinit−1 not only corrects the backing area to generate a uniform α, but also corrects any transition area where α is not zero.
The following description includes an exemplary method of data correction for non-uniformity in live video data, presented in a strict mathematical style for certainty and elaborated by exemplary embodiments of the implementation. Accordingly, the exemplary embodiments include an innovative method, system and device to solve at least some of the above-noted problems, and possibly other problems, with conventional approaches for data correction for non-uniformity in live video composition.
For example,
Feature Extraction from Mass Data
Referring back to
For example, let us assume that a coordinate (i, j) is a pair of row and column numbers, that a pixel located at (i, j) has data ƒ(i, j), which can be any of the color components shown in Table 1, and that there exists a weight function w(i, j) in terms of pixel location (i, j), which is used to qualify the contribution of each pixel to the non-uniformity feature of the video data. In an exemplary embodiment, the video feature is equivalent to the statistical mean of the data, obtained by accumulating the weighted video data ƒ(i, j) horizontally and vertically, respectively.
First, horizontal accumulation is implemented for each row, for example, at row i, as given by:
Then, the average data at row i is given by:
Similarly, vertical accumulation is implemented for each column, for example, at column j, as given by:
Then, average data at column j is ƒavgv(j), as given by:
The pair {ƒavgh(i), wacch(i)} from horizontal accumulation is used for vertical data fitting, and the pair {ƒavgv(j), waccv(j)} from vertical accumulation is used for horizontal data fitting.
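The horizontal and vertical accumulations described above can be sketched as follows (an editor's sketch in Python with NumPy; the function name and the small guard against a zero accumulated weight are assumptions, not part of the specification):

```python
import numpy as np

def extract_features(f, w):
    """Weighted horizontal and vertical accumulation of video data.

    f: 2-D array of one color component, f[i, j] at row i, column j.
    w: 2-D array of weights w[i, j] qualifying each pixel's contribution
       to the non-uniformity feature.
    Returns the per-row averages with their accumulated weights, and the
    per-column averages with their accumulated weights.
    """
    w_acc_h = w.sum(axis=1)                                       # accumulated weight at row i
    f_avg_h = (w * f).sum(axis=1) / np.maximum(w_acc_h, 1e-12)    # average data at row i
    w_acc_v = w.sum(axis=0)                                       # accumulated weight at column j
    f_avg_v = (w * f).sum(axis=0) / np.maximum(w_acc_v, 1e-12)    # average data at column j
    return f_avg_h, w_acc_h, f_avg_v, w_acc_v

# A flat 3x4 backing with uniform weights averages to the same value everywhere.
f = np.full((3, 4), 5.0)
w = np.ones((3, 4))
f_avg_h, w_acc_h, f_avg_v, w_acc_v = extract_features(f, w)
```

The pair of per-row averages and weights feeds the vertical fitting, and the pair of per-column averages and weights feeds the horizontal fitting, as stated above.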
Model Fitting into Extracted Feature
Generally, many types of curves can be used to model the feature data. For example, a data fitting step could use either a single high-order polynomial or multiple high-order polynomials which are smoothly connected to form splines. However, in exemplary embodiments, an L-order polynomial is employed, which is defined by:
A sum of weighted square errors can be defined as:
Then, by minimizing the sum of weighted square errors, the (L+1) coefficients of L-order polynomial can be found as:
For example, if the horizontal average video data is fitted into M-order polynomial gh(j), error Eh can be defined by a difference between the fitting model and the real average data from the vertical accumulation, as given by:
By minimizing Eh, a vector of (M+1) coefficients is found as:
where:
The above derivation is further described in detail below.
Similar to the horizontal fitting, if the vertical average video data is fitted into an N-order polynomial gv(i), an error Ev represents a difference between the fitting model and the real average data from the horizontal accumulation, and Ev can be given by:
By minimizing Ev, a vector of (N+1) coefficients is found as:
where:
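The horizontal and vertical fits described above are instances of a standard weighted least-squares polynomial fit, which can be sketched as follows (an editor's sketch in Python with NumPy; the names are assumptions, and the exact coefficient formulas are those derived below for equations (11) and (15)):

```python
import numpy as np

def fit_polynomial(x, f_avg, w_acc, order):
    """Weighted least-squares fit of an L-order polynomial.

    Minimizes sum_x w(x) * (f_avg(x) - g(x))**2 for
    g(x) = c0 + c1*x + ... + cL*x**L, returning the (L+1)
    coefficients in ascending order of power.
    """
    V = np.vander(x, order + 1, increasing=True)   # V[k, l] = x_k ** l
    W = np.diag(w_acc)
    A = V.T @ W @ V                                # normal-equation matrix
    b = V.T @ W @ f_avg
    return np.linalg.solve(A, b)                   # polynomial coefficients

# Data generated from a known quadratic are recovered exactly.
x = np.arange(8, dtype=float)
true_c = np.array([2.0, -1.0, 0.5])                # g(x) = 2 - x + 0.5 x^2
coeffs = fit_polynomial(x, true_c[0] + true_c[1] * x + true_c[2] * x**2,
                        np.ones_like(x), order=2)
```

In the method above, the per-column averages would be fitted this way with order M to obtain gh(j), and the per-row averages with order N to obtain gv(i).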
Construction of Correction Factors
Multiplicative correction factors can be defined by reciprocating fitting models. For example, a possible horizontal multiplicative correction factor is given by:
and a vertical multiplicative correction factor is given by:
where dh and dv are constants, depending on a real implementation. Several possible formulas for dh or dv are listed in Table 2. A reference backing color ƒref is a constant represented in one of the color components listed in Table 1, corresponding to the type of video data to be accumulated, in some embodiments. This reference ƒref can be obtained, for example, from one of the following methods: 1) a reference backing color chosen by a user, from pictures to be processed, by means of a GUI (Graphical User Interface), and 2) a reference backing color automatically generated by a method, such as previously described. In one possible implementation, a frame grabber stores and displays input video data, and a user can use a GUI to select a reference backing color in the frame grabber.
An additive correction factor can be defined by a difference between a standard reference and the fitting model. For example, a horizontal additive correction factor can be defined as:
Caddh(j)=dh−gh(j), (20)
and a vertical additive correction factor can be defined as:
Caddv(i)=dv−gv(i). (21)
The above additive correction factors generally would not be used together, in contrast with the multiplicative correction factors. Alternatively, a combined additive correction factor can be defined as:
where dhv is a constant that may be determined using one of the formulas listed in Table 3, for example.
Correction of Non-Uniform Luminance & Chrominance
The correction is implemented as either multiplicative correction, additive correction, or a combination of additive and multiplicative corrections. In accordance with the above examples, a multiplicative correction is defined by:
ƒ′(i,j)=ƒ(i,j)*Cmultv(i)*Cmulth(j), (23)
an additive correction is defined by:
ƒ′(i,j)=ƒ(i,j)+Caddhv(i,j), and (24)
a combined correction is defined by combining Cmultv(i), Cmulth(j), Caddv(i), Caddh(j). Some exemplary combinations are listed in Table 4.
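The construction and application of the correction factors described above can be sketched as follows (an editor's sketch in Python with NumPy; the helper names and the toy fitted models are assumptions for demonstration, with dh = dv = 2 as the reference constants):

```python
import numpy as np

def multiplicative_factors(g_h, g_v, d_h, d_v):
    """Multiplicative factors from the reciprocated fitting models:
    Cmulth(j) = d_h / gh(j) and Cmultv(i) = d_v / gv(i)."""
    return d_h / g_h, d_v / g_v

def additive_factors(g_h, g_v, d_h, d_v):
    """Additive factors: Caddh(j) = d_h - gh(j), Caddv(i) = d_v - gv(i)."""
    return d_h - g_h, d_v - g_v

def correct_multiplicative(f, c_mult_v, c_mult_h):
    """Multiplicative correction: f'(i,j) = f(i,j) * Cmultv(i) * Cmulth(j)."""
    return f * c_mult_v[:, np.newaxis] * c_mult_h[np.newaxis, :]

# A separable non-uniform backing f(i,j) = gv(i) * gh(j) flattens to d_v * d_h.
g_h = np.array([1.0, 2.0])                  # toy horizontal model gh(j)
g_v = np.array([1.0, 4.0])                  # toy vertical model gv(i)
f = np.outer(g_v, g_h)
c_mult_h, c_mult_v = multiplicative_factors(g_h, g_v, d_h=2.0, d_v=2.0)
corrected = correct_multiplicative(f, c_mult_v, c_mult_h)
c_add_h, c_add_v = additive_factors(g_h, g_v, d_h=2.0, d_v=2.0)
```

Because the fitted models are smooth curves, the resulting factors vary smoothly across rows and columns, avoiding any hard switching between neighboring pixels.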
One of the exemplary embodiments is to use a rough alpha as a weight, as previously described and then used through the exemplary methods. The concept of rough alpha or “dirty alpha” was previously described, and first introduced in the above-referenced co-pending U.S. patent application Ser. No. 11/072,502. For example, the rough alpha can be generated by any one of the following methods:
1) a low-cost, low-precision and low-quality, but full-level alpha generator; normally an 8-bit color component (levels 0 to 255) is sufficient for the human visual system, while high-quality systems at present may offer a 10-bit color component (levels 0 to 1023), or even more, including a 12-bit color component (levels 0 to 4095);
2) an intermediate result from a high-cost, high-precision, high-quality alpha generator;
3) a hard-switching between foreground and backing area so that only 0 and 1 are used as an alpha output and without transition details across them; or
4) an alpha value with a very limited number of levels, such as 4 or 16 levels.
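As one hypothetical illustration of option 4) above, a rough alpha with only a few discrete levels might be generated as follows (an editor's sketch in Python with NumPy; the color-distance heuristic, threshold, level count, and function name are assumptions, not a method disclosed in the referenced application):

```python
import numpy as np

def rough_alpha(frame, backing_color, threshold=60.0, levels=4):
    """Low-cost 'rough' (dirty) alpha with a very limited number of levels.

    Pixels close to the reference backing color receive alpha near 0
    (backing area), distant pixels alpha near 1 (foreground), and the
    result is quantized to `levels` discrete values between 0 and 1.
    """
    dist = np.linalg.norm(frame - backing_color, axis=-1)
    alpha = np.clip(dist / threshold, 0.0, 1.0)
    return np.round(alpha * (levels - 1)) / (levels - 1)

# A pure backing pixel maps to 0; a distant foreground pixel maps to 1.
backing = np.array([0.0, 255.0, 0.0])                 # green backing reference
frame = np.array([[[0.0, 255.0, 0.0],                 # backing pixel
                   [255.0, 0.0, 0.0]]])               # red foreground pixel
alpha = rough_alpha(frame, backing)
```

Even such a coarse alpha suffices as the weight w(i, j) in the feature extraction, since its role is only to emphasize backing-area pixels in the accumulations.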
The exemplary embodiment shown in
In order to understand and prove how well embodiments of the present invention work,
Other embodiments than those specifically described above are also possible. For example, if a video chrominance component is used as video input data, a feature extraction unit may determine the correction curve for the chrominance. By using the same correction method applied to video luminance, the non-uniformity in chrominance can also or instead be corrected.
Another exemplary embodiment is to elaborate variations of implementation of
Further exemplary embodiments also can include different techniques of combining data correction with a chromakey function in a system or a device. For example,
Considering now the derivation of equations (11) and (15) in more detail, suppose the feature data ƒavg(x), the weight wavg(x), and the fitting model, as given by:
The error E is given by:
In order to minimize the error E, constrain the partial derivative of E with respect to each coefficient to zero, as given by:
Expand the above equation so as to have:
The above is a linear system of (L+1) equations. To more simply represent the formula, define:
Express the above equations in a matrix, as given by:
By finding an inverse matrix of [A], (L+1) coefficients of the polynomial are given by:
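The derivation above follows the standard weighted least-squares normal equations, which can be restated compactly as follows (an editor's reconstruction of the standard form, using the notation of the surrounding text):

```latex
% Model and weighted square error
g(x) = \sum_{l=0}^{L} c_l x^{l}, \qquad
E = \sum_{x} w_{\mathrm{avg}}(x)\,\bigl(f_{\mathrm{avg}}(x) - g(x)\bigr)^{2}

% Constrain each partial derivative to zero, k = 0, 1, \dots, L
\frac{\partial E}{\partial c_k}
  = -2 \sum_{x} w_{\mathrm{avg}}(x)\, x^{k}
      \Bigl( f_{\mathrm{avg}}(x) - \sum_{l=0}^{L} c_l x^{l} \Bigr) = 0

% Linear system [A]\,\mathbf{c} = \mathbf{b}, whence \mathbf{c} = [A]^{-1}\mathbf{b}
A_{kl} = \sum_{x} w_{\mathrm{avg}}(x)\, x^{k+l}, \qquad
b_{k}  = \sum_{x} w_{\mathrm{avg}}(x)\, x^{k}\, f_{\mathrm{avg}}(x)
```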
Thus, the exemplary embodiments provide an innovative method, system and device related to the dynamic data correction of non-uniformity in video data or streams for applications of live video composition. The exemplary method includes extracting features from live video dynamically, and hence there is no need to remove any foreground objects from a scene. In other words, the extracted features describe with certainty the dynamic characteristics of non-uniformity in live video data due to such causes as camera movement and movement of light(s). Then, a data-fitting technique is employed to construct a correction factor with a continuous curve from the extracted features. By such construction, the correction is strongly immune to noise, not only because the extracted features come from statistical mass data, but also because the data fitting tends to minimize the sum of square errors between an ideal model and the statistically deviated real data. Furthermore, the correction factors are continuous in their high-order derivatives, and hence no hard-switched correction occurs among neighboring pixels. Finally, the derived correction factor can be applied to either the alpha data or the video data, or both.
The exemplary embodiments also provide a good platform for future expansion, wherein a better algorithm can be used to construct a better correction factor based on extracted video features. For example, data fitting models can be easily altered to match real scenes, such as a scene with seams on back walls and floors, and the like. Moreover, the exemplary embodiments describe various implementations of an exemplary theoretic method, and include devices for such implementations.
The devices and subsystems of the exemplary embodiments described with respect to
As noted above, it is to be understood that the exemplary embodiments, for example, as described with respect to
The exemplary embodiments described with respect to
All or a portion of the exemplary embodiments described with respect to
Thus, more generally, embodiments of the invention may be implemented using further, fewer, or different elements, interconnected in a similarly or different manner, than explicitly shown in the drawings and described above.
It should also be noted that the various plots shown in the drawings are solely for the purposes of illustration. Similar or different results may be observed under different test or actual usage conditions.
While the present invention has been described in connection with a number of exemplary embodiments and implementations, the present invention is not so limited, but rather covers various modifications and equivalent arrangements, which fall within the purview of the appended claims.
For example, although described above primarily in the context of methods, systems, and devices, embodiments of the invention may be implemented in other forms, such as in instructions stored on computer-readable media.
Number | Name | Date | Kind |
---|---|---|---|
5032901 | Vlahos | Jul 1991 | A |
5327247 | Osborne et al. | Jul 1994 | A |
5424781 | Vlahos | Jun 1995 | A |
5907315 | Vlahos et al. | May 1999 | A |
6346994 | Inoue | Feb 2002 | B1 |
6844883 | Bakhmutsky | Jan 2005 | B2 |
7075574 | Niko | Jul 2006 | B2 |
7102787 | Tamamura | Sep 2006 | B2 |
20030086018 | Berman et al. | May 2003 | A1 |
Number | Date | Country | |
---|---|---|---|
20090021528 A1 | Jan 2009 | US |