The present invention relates to a digital image processing method for automatic image content analysis. More specifically, the present invention relates to the use of infrared cameras and a facial recognition algorithm in a movie theater for image content analysis.
Content providers in the movie theater industry are responsible for selling ad space as part of the pre-feature “entertainment” shown in the theater. Ad sponsors desire accurate feedback on the success of, or improvement opportunities for, their ads. In this regard, facial recognition has been used to determine the number of viewers in an audience. For example, Publication WO2006060889A1 discloses using facial recognition for detecting the faces and gazes of the audience.
Even though the presently known and utilized methods and systems are satisfactory, they include drawbacks. Movies are typically displayed in low lighting conditions, which makes facial recognition difficult and inaccurate. Consequently, a need exists to overcome this drawback.
The present invention is directed to overcoming one or more of the problems set forth above. Briefly summarized, according to one aspect of the present invention, the invention resides in a method for automatically collecting viewer statistics from one or more persons in a movie theater, the method including the steps of capturing an image of the one or more persons in the movie theater; using a facial-recognition algorithm to determine the persons present in the movie theater; and determining one or more categories from characteristics of the persons present to compute the viewer statistics.
These and other aspects, objects, features and advantages of the present invention will be more clearly understood and appreciated from a review of the following detailed description of the preferred embodiments and appended claims, and by reference to the accompanying drawings.
The present invention has the advantage of improving detection of a movie theater audience, particularly in the low-lighting conditions typical of theaters.
FIG. 4A′ is an illustration of a static background image of the present invention;
FIG. 4B′ is an illustration of a foreground plus background image of the present invention;
FIG. 5′ is an illustration of a foreground image of the present invention;
In describing the present invention, it should be apparent that the computer program of the present invention can be utilized by any well-known computer system, such as a personal computer of the type shown in FIG. 1.
It is to be understood that the present invention may make use of image manipulation algorithms and processes that are well known. Accordingly, the present description will be directed in particular to those algorithms and processes forming part of, or cooperating more directly with, the method of the present invention. Thus, it will be understood that the computer program of the present invention may embody algorithms and processes not specifically shown or described herein that are useful for implementation. Such algorithms and processes are conventional and within the ordinary skill in such arts.
Other aspects of such algorithms and systems, and hardware and/or software for producing and otherwise processing the images involved or co-operating with the computer program product of the present invention, are not specifically shown or described herein and may be selected from such algorithms, systems, hardware, components, and elements known in the art.
The computer program for performing the method of the present invention may be stored in a computer readable storage medium. This medium may comprise, for example: magnetic storage media such as a magnetic disk (such as a hard drive or a floppy disk) or magnetic tape; optical storage media such as an optical disc, optical tape, or machine readable bar code; solid state electronic storage devices such as random access memory (RAM) or read only memory (ROM); or any other physical device or medium employed to store a computer program. The computer program for performing the method of the present invention may also be stored on a computer readable storage medium that is connected to the image processor by way of the Internet or other communication medium. Those skilled in the art will readily recognize that the equivalent of such a computer program product may also be constructed in hardware.
Now referring to FIG. 2, after a start step 200, the first step in acquiring movie viewer demographic data for advertisement space management is identifying possible face regions 202, which is followed by detecting faces 204 and gathering demographic statistics 206. A query step 208 then checks whether the advertisement time has ended. If not, the acquisition of movie viewer demographic data repeats; otherwise the program ends 210.
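In outline, and purely for illustration, this flow can be expressed in C-like code; the function names below are hypothetical stand-ins for the flowchart steps:

    /* Sketch of the acquisition flow: repeat region identification,
       face detection, and statistics gathering until the advertisement
       time ends. All function names are hypothetical stand-ins. */
    extern int  advertisement_time_over(void);        /* query step 208 */
    extern void identify_face_regions(void);          /* step 202 */
    extern void detect_faces_in_regions(void);        /* step 204 */
    extern void gather_demographic_statistics(void);  /* step 206 */

    void acquire_viewer_demographics(void)
    {
        while (!advertisement_time_over()) {
            identify_face_regions();
            detect_faces_in_regions();
            gather_demographic_statistics();
        }
    }                                                  /* end step 210 */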
Referring to FIG. 3, in order to determine characteristics, a training and calibration step 312 is performed to obtain calibration statistics 316, which are used to train the algorithm to determine the characteristics obtained in step 306.
Referring to FIGS. 4A′ and 4B′, exemplary images captured by camera 100 are illustrated. In FIG. 4A′, a static background image of the theater, captured without viewers present, is shown. In FIG. 4B′, a foreground plus background image is shown, in which viewers appear as foreground objects in front of the static background.
Referring to FIG. 5, camera 100 captures a static background image of the theater, denoted Ib, and thereafter captures a sequence of foreground plus static background images, denoted In, beginning in step 504.
In step 505, the background image Ib is subtracted from the foreground plus static background image In. A sequence of foreground images, denoted InF, is thereby obtained in step 505. An exemplary foreground image 500 is shown in FIG. 5′. The foreground images contain foreground objects composed of non-zero valued pixels 522; areas in the foreground images other than the foreground object regions are filled with zero valued pixels 524.
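For illustration, the subtraction of step 505 might be realized as in the following C sketch; the use of an absolute-difference threshold to decide which pixels survive into the foreground image is an assumption:

    /* Sketch of step 505: subtract the static background image Ib from a
       foreground plus background image In to produce a foreground image
       InF. Pixels differing from the background by more than an assumed
       threshold keep their value (non-zero valued pixels 522); all other
       pixels are set to zero (zero valued pixels 524). */
    #include <stdlib.h>

    void subtract_background(const unsigned char *in,  /* image In  */
                             const unsigned char *bg,  /* image Ib  */
                             unsigned char *fg,        /* image InF */
                             int npixels, int threshold)
    {
        for (int p = 0; p < npixels; p++)
            fg[p] = (abs((int)in[p] - (int)bg[p]) > threshold) ? in[p] : 0;
    }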
The foreground image InF is used in step 506 to detect faces. In step 507, the detected faces are used to obtain movie viewer demographic statistics.
A program residing in the image processor 102 waits for a time period T1 and increases the index n by 1 in step 508. In a query step 509, the status of the theater operation is checked. If the advertisement play has not ended, camera 100 takes another foreground plus background image In in step 504, and steps 504, 505, 506, 507 and 508 repeat. If the advertisement play has ended, the image capturing operation stops in step 510. In step 510, the total number of images, n−1, is recorded in a variable N. Thus, the index n for the foreground plus background images In varies from 1 to N, and the index n for the foreground images InF likewise varies from 1 to N.
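The loop of steps 504 through 510 might be arranged as in the following sketch; capture_image, process_image, wait_seconds, and advertisement_playing are hypothetical names for the operations described above:

    /* Sketch of the capture loop: take an image, process it, wait T1,
       and repeat until the advertisement ends; then record N. */
    extern void capture_image(int n);          /* step 504: capture In    */
    extern void process_image(int n);          /* steps 505, 506 and 507  */
    extern void wait_seconds(double t);        /* timing used in step 508 */
    extern int  advertisement_playing(void);   /* query step 509          */

    void capture_loop(double T1, int *N)
    {
        int n = 1;
        for (;;) {
            capture_image(n);                  /* step 504 */
            process_image(n);                  /* steps 505-507 */
            wait_seconds(T1);                  /* step 508 */
            n = n + 1;                         /* step 508: increase n */
            if (!advertisement_playing())      /* query step 509 */
                break;
        }
        *N = n - 1;                            /* step 510: record N */
    }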
In fact, before steps 506 and 507 (equivalently, step 306) can be carried out, a step of training and calibration 312 needs to be performed. The input to the step of training and calibration 312 is a calibration foreground image 318 (as shown back in FIG. 3), by which the foreground image area is calibrated into cells, each cell corresponding to a seat in the theater.
To explain the operation of step 306 and the associated operations, C-like code of the following form is used; the listing below is a representative sketch in which the cell geometry and the per-cell occupancy thresholds are assumed to be supplied by the training and calibration step 312:
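    /* Sketch of step 306: mark each calibrated cell of a foreground
       image as a face candidate region. The Cell geometry and the
       per-cell occupancy thresholds are assumed outputs of the
       training and calibration step 312. */
    typedef struct { int left, top, right, bottom; } Cell;

    void classify_cells(const unsigned char *fg, int width,
                        const Cell *cell, const int *threshold,
                        int num_cells, int *C)
    {
        for (int i = 0; i < num_cells; i++) {
            int count = 0;
            for (int y = cell[i].top; y < cell[i].bottom; y++)
                for (int x = cell[i].left; x < cell[i].right; x++)
                    if (fg[y * width + x] != 0)   /* non-zero valued pixel 522 */
                        count++;
            C[i] = (count > threshold[i]) ? 1 : 0;   /* Cni in the text */
        }
    }

The function above is called once for each foreground image InF, so that the resulting array corresponds to Cni for image n.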
In the above code, the operation of setting Cni to 1 indicates that there is a viewer sitting at the seat corresponding to cell i in foreground image n.
The operations of background subtraction and calibrating foreground images into cells make the face detection simpler. In step 506 of detecting faces, a face detector does not need to search the entire foreground image; instead, the face detector operates on a cell only if the cell is indicated as a face candidate region with Cni=1 in the previous steps, as in the sketch below. A preferred face detection algorithm can be found in “Method for locating faces in digital color images,” U.S. Pat. No. 7,110,575, by Shoupu Chen et al. This algorithm includes the steps of generating a mean grid pattern element (MGPe) image from a plurality of sample face images; generating an integral image from the digital color image; and locating faces in the color digital image by using the integral image to perform a correlation between the MGPe image and the digital color image at a plurality of effective resolutions, by reducing the digital color image to grid pattern element images (GPes) at different effective resolutions and correlating the MGPe with the GPes.
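A minimal C sketch of this cell-gated detection follows; detect_face_in_cell stands in for whichever face detection algorithm is employed, and Cell is the type from the previous sketch:

    /* Sketch of step 506: run the face detector only on cells flagged
       as face candidate regions (Cni = 1) by step 306. */
    typedef struct { int x, y, size; } Face;   /* hypothetical detection result */

    extern int detect_face_in_cell(const unsigned char *fg, int width,
                                   const Cell *cell, Face *face);

    int detect_faces(const unsigned char *fg, int width,
                     const Cell *cell, const int *C, int num_cells,
                     Face *faces)
    {
        int found = 0;
        for (int i = 0; i < num_cells; i++)
            if (C[i] == 1 &&
                detect_face_in_cell(fg, width, &cell[i], &faces[found]))
                found++;                 /* face located in candidate cell i */
        return found;                    /* faces found in this image */
    }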
Those skilled in the art will recognize that other face detection algorithms can readily be employed to accomplish the task of step 506.
The face detector 506 outputs the locations and sizes of the faces found in the image(s). Each detected face is preferably classified as baby, child, adult, or senior in step 507. A method for assigning a face to an age category is described in U.S. Pat. No. 5,781,650 by Lobo, issued on Jul. 14, 1998. The adult faces are further classified as male or female.
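By way of illustration only, the demographic record produced for each detected face might take the following form in C; the enumerations and classifier entry points are hypothetical:

    /* Sketch of the step 507 demographic record. classify_age() and
       classify_gender() are hypothetical entry points for the age
       classifier and the FIG. 7 gender classifier, respectively. */
    typedef enum { AGE_BABY, AGE_CHILD, AGE_ADULT, AGE_SENIOR } AgeCategory;
    typedef enum { GENDER_UNKNOWN, GENDER_MALE, GENDER_FEMALE } Gender;

    typedef struct {
        AgeCategory age;   /* baby, child, adult or senior */
        Gender gender;     /* determined only for adult faces */
    } DemographicProfile;

    extern AgeCategory classify_age(const Face *face);
    extern Gender classify_gender(const Face *face);

    DemographicProfile profile_face(const Face *face)
    {
        DemographicProfile p;
        p.age = classify_age(face);
        p.gender = (p.age == AGE_ADULT) ? classify_gender(face)
                                        : GENDER_UNKNOWN;
        return p;
    }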
In a preferred embodiment, gender classification involves the steps shown in FIG. 7.
Some facial measurements that are known to be statistically different between men and women (ref. “Anthropometry of the Head and Face,” Farkas (Ed.), 2nd edition, Raven Press, New York, 1994, and “What's the difference between men and women? Evidence from facial measurement,” Burton, Bruce and Dench, Perception, vol. 22, pp. 153-176, 1993) are computed 722. The features are normalized by the inter-ocular distance to eliminate the effect of differences in the raw size of the face. For symmetrical features, measurements from the left and right sides of the face are averaged to produce more robust measurements.
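For illustration, the normalization and averaging of step 722 might be sketched in C as follows; the eye coordinates and feature array are assumptions:

    /* Sketch of step 722: normalize raw facial measurements by the
       inter-ocular distance and average left/right measurements of
       symmetric features. */
    #include <math.h>

    double interocular_distance(double lx, double ly, double rx, double ry)
    {
        return sqrt((lx - rx) * (lx - rx) + (ly - ry) * (ly - ry));
    }

    void normalize_features(double *features, int n, double iod)
    {
        for (int i = 0; i < n; i++)
            features[i] /= iod;   /* removes the effect of raw face size */
    }

    double symmetric_feature(double left_val, double right_val)
    {
        return 0.5 * (left_val + right_val);   /* more robust measurement */
    }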
The presence or absence of hair at specific locations on and around the face is also a cue used by humans for gender determination. These features are incorporated 724 as a difference between the gray-scale histogram of a patch where hair may be present and that of a reference patch on the cheek that is typically hairless.
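This feature might be computed as in the following sketch; the choice of a sum of absolute bin differences as the histogram distance is an assumption:

    /* Sketch of step 724: compare the gray-scale histogram of a
       candidate hair patch against that of a typically hairless cheek
       patch; a large distance suggests the presence of hair. */
    #include <math.h>
    #define BINS 256

    void gray_histogram(const unsigned char *img, int width,
                        int left, int top, int right, int bottom,
                        double hist[BINS])
    {
        int npix = (right - left) * (bottom - top);
        for (int b = 0; b < BINS; b++)
            hist[b] = 0.0;
        for (int y = top; y < bottom; y++)
            for (int x = left; x < right; x++)
                hist[img[y * width + x]] += 1.0 / npix;
    }

    double hair_feature(const double hair_hist[BINS],
                        const double cheek_hist[BINS])
    {
        double d = 0.0;
        for (int b = 0; b < BINS; b++)
            d += fabs(hair_hist[b] - cheek_hist[b]);
        return d;   /* larger value: hair likely present */
    }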
Binary classifiers are constructed 726 using each of the possible single features separately. Simple Bayesian classifiers described in the standard literature (“Pattern Classification,” R. O. Duda, P. E. Hart and D. G. Stork, John Wiley and Sons, 2001) are trained on large sets of example male and female faces to produce the single-feature binary classifiers. The classification accuracy of each of these binary classifiers ranged from 55% to 75%.
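For a single feature, such a classifier can be realized with one Gaussian class-conditional density per gender, as in this sketch; the Gaussian model and parameter names are assumptions, with the parameters estimated from the training faces:

    /* Sketch of step 726: a single-feature Bayesian binary classifier
       with assumed Gaussian class-conditional densities. */
    #include <math.h>

    typedef struct {
        double mean_m, var_m;   /* male-class Gaussian parameters   */
        double mean_f, var_f;   /* female-class Gaussian parameters */
        double prior_m;         /* prior probability of male class  */
    } FeatureClassifier;

    static double gauss_density(double x, double mean, double var)
    {
        double d = x - mean;
        return exp(-0.5 * d * d / var) / sqrt(2.0 * 3.14159265358979 * var);
    }

    int classify_single_feature(const FeatureClassifier *c, double x)
    {
        double pm = c->prior_m * gauss_density(x, c->mean_m, c->var_m);
        double pf = (1.0 - c->prior_m) * gauss_density(x, c->mean_f, c->var_f);
        return (pm >= pf) ? +1 : -1;   /* +1 male, -1 female */
    }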
The binary classifiers are combined using the AdaBoost algorithm to produce an improved final classifier 728. AdaBoost is a well-known algorithm for boosting classifier accuracy by combining the outputs of weak classifiers (such as the single-feature binary classifiers described above). The weighted sum of the outputs of the weak classifiers is compared with a threshold computed automatically from the training examples. A description and application of this method is available in “Rapid Object Detection Using a Boosted Cascade of Simple Features” by P. Viola and M. Jones, in International Conference on Computer Vision and Pattern Recognition, 2001. The classification accuracy of the final classifier obtained using AdaBoost was 90% on un-aligned faces.
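The final decision rule, a weighted sum of the weak classifier outputs compared with a learned threshold, might look like the following sketch; the weights alpha[] and the threshold are assumed outputs of AdaBoost training:

    /* Sketch of step 728: AdaBoost final classifier as a weighted vote
       of the single-feature classifiers from the previous sketch. */
    int boosted_gender_classifier(const FeatureClassifier *weak,
                                  const double *alpha, int T,
                                  const double *features, double threshold)
    {
        double score = 0.0;
        for (int t = 0; t < T; t++)
            score += alpha[t] * classify_single_feature(&weak[t], features[t]);
        return (score >= threshold) ? +1 : -1;   /* +1 male, -1 female */
    }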
Based on the information computed above, each face is assigned a demographic profile, which includes the age category of the viewer and, for adult faces, the gender.
The invention has been described with reference to one or more embodiments. However, it will be appreciated that variations and modifications can be effected by a person of ordinary skill in the art without departing from the scope of the invention.