Detecting facial expressions in digital images

Description

FIELD

Embodiments of the invention relate generally to the field of detecting facial expressions in digital images and applications thereof.

BACKGROUND

It has proven problematic to accurately and automatically identify facial expressions in digital images. Approximately 30% of facial images are images, such as snapshots, representing faces which have various facial expressions. When a conventional face classification apparatus is used to detect faces in general images, the accuracy in detection is lower compared with images which have substantially the same facial expressions. Therefore, there is a problem that the face classification apparatus of prior art schemes cannot accurately detect facial expressions, including specific facial expressions such as smiles, frowns, etc.

SUMMARY OF THE INVENTION

A technique is provided for in-camera processing of a still image including one or more faces as part of an acquisition process. The technique involves identifying a group of pixels that correspond to a face within at least one digitally-acquired image on a portable camera. A collection of relatively lower resolution images including a face are generated in-camera, captured or otherwise obtained in-camera, and the face is tracked within the collection. Cropped versions of multiple images of the collection are acquired. Smile state information of the face is accumulated over the multiple images. A statistical smile state of the face is classified based on the accumulated smile state information. One or more smile state-dependent operations are selected based upon results of the analyzing.

Face recognition may be applied to one or more of the multiple images. A relatively short classifier cascade of images may be trained that each include a specifically-recognized person's face. The relatively short classifier cascade may include different poses and illuminations of the specifically-recognized person's face. A pose and/or illumination condition is/are determined, and the relatively short classifier cascade is adjusted based on the determined pose and/or illumination. Image acquisition may be initiated or delayed when the face is or is not recognized as one of one or more specific persons and/or when the face is classified as having a smile or not having a smile.

The technique may further include determining a pose and/or illumination condition for the face, and training a specific set of face classifiers adjusted based on the determined pose and/or illumination condition.

The classifying of the statistical smile state may include assigning a chain of Haar and/or census features.

The identifying of the group of pixels that correspond to a face may include applying approximately the same Haar and/or census features as the classifying.

The cropped versions may each include substantially only a region of the image that includes the face or that only includes a mouth region of the face.

The classifying may include thresholding, such that a classifying result may be one of smile, no smile or inconclusive. The thresholding may include comparing the statistical smile state to a first threshold between 60%-90% likely to be a smile, or to a second threshold of 10%-40% likely to be a smile, or both, with the 60%-90% or more corresponding to a smile result, and with the 10%-40% or less corresponding to a no smile result, and with between the 10%-40% and the 60%-90% corresponding to an inconclusive result. The first threshold may be approximately 80% and the second threshold may be approximately 20%.
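The three-way thresholding described above can be illustrated with a minimal sketch. The function name `classify_smile` and the representation of the statistical smile state as a probability in [0, 1] are assumptions made for illustration only; the default thresholds are the approximate 80% and 20% values mentioned above.

```python
def classify_smile(smile_probability, upper=0.8, lower=0.2):
    """Map a statistical smile probability in [0, 1] to a three-way result.

    `upper` and `lower` are the first and second thresholds; the defaults
    are the approximate 80% / 20% values, not fixed requirements.
    """
    if not 0.0 <= smile_probability <= 1.0:
        raise ValueError("smile_probability must lie in [0, 1]")
    if smile_probability >= upper:
        return "smile"
    if smile_probability <= lower:
        return "no smile"
    return "inconclusive"
```

Any first threshold in the 60%-90% range and any second threshold in the 10%-40% range could be passed in place of the defaults.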

The classifying may include calculating a statistical smile difference vector between frames of the collection of relatively lower resolution images, and determining that a certain threshold or more of difference corresponds to a sudden change in pose, illumination, or other image parameter, or to a changing smile state. A particular cause of the certain threshold or more of difference may be confirmed.

Multiple faces may be identified and tracked. Smile state information for each of the multiple faces may be classified. A smile-dependent group shot operation may be initiated if more than a first threshold number of faces is classified as no smile and/or if less than a second threshold number of faces is classified as smile. The smile-dependent group shot operation may include triggering a warning signal to a user or delaying acquisition of a group shot until determining that less than the first threshold number of faces is classified as no smile and/or that more than the second threshold number of faces is classified as smile.

A best smile image may be composited by combining one or more face regions of the at least one digitally-acquired image with a best smile region of one or more of the images of the collection of relatively lower resolution images. The best smile region may include a mouth region with a highest probability of being classified as a smile.

A portable digital image acquisition device is also provided, including a lens and image sensor for acquiring digital images, a processor, and one or more processor-readable media having code embedded therein for programming the processor to perform any of the techniques as described above or below herein.

One or more processor-readable media are provided with code embedded therein for programming a processor to perform any of the techniques as described above or below herein.

A method is also provided for in-camera processing of a still image including one or more faces as part of an acquisition process. A group of pixels is identified that corresponds to a face within at least one digitally-acquired image on a portable camera. The method also includes generating in-camera, capturing or otherwise obtaining in-camera a collection of relatively lower resolution images including a face, and tracking said face within said collection of relatively lower resolution images. Cropped versions of multiple images of the collection are acquired including the face. The method also includes accumulating smile state information of the face over the multiple images. A statistical smile state of the face is classified based on the accumulated smile state information. One or more smile state-dependent operations is/are selected and/or initiated based upon results of the analyzing.

The method may include applying face recognition to one or more of the multiple images.

A pose or illumination condition, or both, may be determined for the face. A specific set of face classifiers may be adjusted based on the determined pose or illumination or both.

The classifying of the statistical smile state may include assigning a chain of Haar and/or census features.

The cropped versions may each include substantially only a region of the image that includes the face or only a region of the image that includes a mouth region of the face.

The classifying may include thresholding, such that a classifying result includes smile, no smile or inconclusive.

The classifying may include calculating a statistical smile difference vector between frames of the collection of relatively lower resolution images, and determining that a certain threshold or more of difference corresponds to a sudden change in pose, illumination, or other image parameter, or to a changing smile state. The classifying may include confirming a particular cause of the certain threshold or more of difference.

Multiple faces may be identified and tracked. Smile state information for each of the multiple faces may be classified. The method may include initiating a smile-dependent group shot operation if more than a first threshold number of faces is classified as no smile or if less than a second threshold number of faces is classified as smile, or both.

The method may further include compositing a best smile image including combining one or more face regions of the at least one digitally-acquired image with a best smile region of one or more of the images of the collection of relatively lower resolution images.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention may be best understood by referring to accompanying drawings briefly described as follows to illustrate the embodiments:

FIG. 1 illustrates a technique of processing a still image including a face.

FIG. 2 illustrates a further technique of processing a still image including a face.

FIG. 3 illustrates specific classifying and identifying processes for use with the technique of FIG. 1.

FIG. 4 illustrates an alternative embodiment for training smile and non-smile facial expression classifiers.

FIG. 5 illustrates an alternative embodiment for testing with trained classifiers whether an image includes a face with a smile.

FIG. 6 illustrates a face looking straight ahead which is classified as non-smile.

FIG. 7 illustrates a face looking down which is also classified as non-smile.

DETAILED DESCRIPTIONS OF SEVERAL EMBODIMENTS

Systems and methods for detecting facial expressions (e.g., smiles), as well as applications for such systems and methods, are described. In this description, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known structures and techniques have not been shown in detail in order not to obscure the understanding of this description.

Reference throughout the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearance of the phrases “in one embodiment” or “in an embodiment” in various places throughout the specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Moreover, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, any claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention.

Embodiments of the invention are applicable in a variety of settings in which it is desired to detect facial expressions in digital images.

For certain embodiments, a binary classifier is created and used for each face to be recognized. That is, samples of a target face are acquired through operation of a face detection algorithm and these samples are used as positive samples for the binary classifier.

FIGS. 1-3 illustrate a smile detector in accordance with an exemplary embodiment. Referring first to FIG. 1, a digital still image is acquired that includes a face at block 102. At block 104, a group of pixels is identified that corresponds to the face in the digital still image. At block 106, a collection of low resolution images is generated in-camera, captured or otherwise obtained in-camera including multiple instances of the face. The face is tracked at block 108 within the collection of low resolution images. At block 110, cropped versions are acquired of images of the collection including the face. Smile state information of the face is accumulated based on the cropped versions at block 112. A statistical smile state of the face is classified at block 114. One or more smile state-dependent operations is/are initiated at block 116.

FIG. 2 illustrates a technique including applying face recognition at block 202. At block 204, a relatively short classifier cascade of images is trained that includes a specifically recognized person's face. At block 206, different poses and/or illuminations of the specifically recognized person's face are selected for the relatively short classifier cascade.

FIG. 3 illustrates specific operations that may be used advantageously in the method of FIG. 1. At block 302, in the classifying at block 114, a chain of Haar and/or census features is assigned. At block 304, in the identifying at block 104, approximately the same Haar and/or census features are applied as in the classifying at block 114.

Smile Detector Based on Face Detector Cascades

Embodiments of the invention employ in-camera training of new classifiers (i.e., instead of reusing the exact detection classifiers), that are used for separating one face from another. In certain embodiments, a binary classifier is built for faces that are and/or should be recognized. This training means that upon user request samples of the target face are acquired by employing a face detection algorithm. These samples are then used as positive samples for a binary classifier. Negative samples are either used from a small collection of generic faces and/or from other previously trained faces, which are stored locally. A relatively short classifier cascade is then trained.

In certain embodiments, the process may be repeated for faces that the user selects for future recognition. In a typical live view mode, the camera will run the tracking algorithm. A new detected face will be compared against the classifiers in the relatively short cascade in the recognition database. Depending on classifier responses and confidence accumulation, over several frames, a voting algorithm will choose one of the database faces or decide that the face does not belong to the recognition set.

In certain embodiments, information from the detection process is used to adjust the recognition process. For one such embodiment, the adjustment of the recognition process is effected dynamically based on the detector/tracker.

In accordance with various embodiments a particular face may have a number of recognition profiles, since the illumination conditions can change the classifier responses quite significantly. When a previously trained face is not correctly recognized under a certain condition, a new recognition profile can be added to that face either automatically or upon user input.

In general, certain embodiments allow the use of detection classifiers to perform recognition based on detection probability. That is, the face detector probability output is used to re-scale the classifiers for the recognizer. For one such embodiment, the detector indicates if a face is a “strong” or “weak” face and then the result is boosted or suppressed in accordance with the indication.

For certain embodiments, smile detection works as an add-on feature to the face tracking algorithm. It will receive as input the face region in the form of a polygon such as a rectangle, or alternatively a square, rhombus, triangle, circle, or otherwise, as well as the already computed integral images and other available maps.

The smile detection algorithm will run a binary classifier on each of the tracked face regions and will decide with a certain degree of confidence whether each of the faces is smiling or not smiling. If the required confidence level to provide an answer is not reached, the smiling-state of the face will be declared as uncertain or unknown. In certain embodiments, the prerequisites for the face may be that it should be frontal, with in-plane orientation close to 0, 90 or −90 degrees. However, as described below with reference to FIGS. 6 and 7, different poses can be identified and smiling states can be determined from them.

The smile classifier is the same type of chain with Haar and census features as the face detector. During the training part, it learns to differentiate between positive smiling samples and negative non-smiling samples. The samples are face crops which are obtained by running the face detector and by automatic cropping based on manual or automatic markings on images with faces. The samples may have the same upright orientation, with slight variations.

In an alternative embodiment of the system the samples could be mouth region crops, which hold most of the useful information for smile classification. Such alternative system involves an additional identification of the mouth region prior to the actual classification. This can be done by running a feature based mouth detector, or identifying the mouth by a maximum color saturation region in the bottom half of the face or another alternative method. This general approach adds an extra level of uncertainty, but may be advantageous in utilizing less data.

The training process may provide a binary classifier chain that can decide the smiling state for a whole face region as it is delivered by the face detector. Smile detection/classification may be executed on individual frames, but the logic spans over several frames as confidence is being accumulated in order to provide a consistent response for a certain face. On a particular frame, the smile classifier runs only on face rectangles (or other polygons) coming directly from the detector, because these are best centered and fitted over the face, before the tracking algorithm re-evaluates the rectangle position. The smile classifier is also evaluated at several slightly shifted positions around the face region.

A confidence based on these neighboring classifications is summed up and thresholded. A smiling decision can be positive, negative or inconclusive. The classifier evaluation is done by the same engine as the one running the face detector, but the smile classifiers are provided instead of the face ones. During a sequence of frames, a smiling confidence parameter assigned to each tracked face is either incremented or decremented for each positive or, respectively, negative smile response. This confidence parameter may be an integer, and may be bound by upper and lower limits such that the smiling decision is responsive enough and will not lock in a certain state. The confidence parameter is updated after each smile classification (which occurs each frame or at an interval). The final smile state output for a face may be inquired at each frame (may be continuously output), and may be based on the sign and the absolute value of the integer confidence parameter.
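The bounded confidence accumulation described above might be sketched as follows. The class name `SmileConfidence`, the limits of ±4 and the decision margin of 2 are illustrative assumptions; the description does not specify particular values.

```python
class SmileConfidence:
    """Bounded integer confidence for one tracked face (all limits are illustrative)."""

    def __init__(self, lower=-4, upper=4, margin=2):
        self.lower, self.upper, self.margin = lower, upper, margin
        self.value = 0

    def update(self, response):
        """response: +1 for a positive smile response, -1 for a negative one, 0 if inconclusive."""
        # Clamping keeps the decision responsive so it cannot lock into one state.
        self.value = max(self.lower, min(self.upper, self.value + response))

    def state(self):
        """Decide from the sign and the absolute value of the confidence."""
        if abs(self.value) < self.margin:
            return "inconclusive"
        return "smile" if self.value > 0 else "no smile"
```

Because the value is clamped, a face that has smiled for many frames still needs only a handful of negative responses before the reported state changes.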

In accordance with certain embodiments, an algorithm is capable of detecting smiling frontal faces, as in-camera applications. The algorithm could be viewed as a standalone feature of digital cameras for facial expression detection (e.g., smile or frown detection). Certain embodiments may also be employed in apparatuses or methods involving decisions or further actions based on the presence of a smiling person and may include this algorithm as a decision algorithm.

In an alternative embodiment, Discrete Cosine Transforms (DCTs) are used.

The Training Part of the Algorithm

In certain embodiments, the facial expression to be detected is a smile. There may be two databases, one with smiles, and the other with non-smile, greyscale images. A training algorithm is applied to each database. For one embodiment, the steps of the training algorithm may be identical or substantially the same for both databases. Crops may be used including entire faces or just mouth regions or another subset at least including mouth regions, as outputted from a face detector. In alternative embodiments where blinks are being detected, then just eye region crops may be used or another subset at least including one or both eyes.

Images are read from the database (e.g., as squared crops delivered by the face detection algorithm). Then, for each image, the following steps may be performed:

- 1. Re-dimension the image to 25×25 pixels. This can be effected using bilinear interpolation, or alternatively bicubic splines.
- 2. Apply the 2DCT transform:

$F(u, v) = C(u)\, C(v) \left[ \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N} \right]$

- 3. Set the pixels in the upper left corner of the transformed matrix (20% of the number of pixels on Ox times 20% of the number of pixels on Oy) to 0.

This corresponds to removing the low frequency coefficients, which are related to person features.

- 4. Apply the 2IDCT transform:

$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, F(u, v) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N}$

where $C(u) = \frac{1}{\sqrt{N}}$, $C(v) = \frac{1}{\sqrt{N}}$ for $u, v = 0$, and $C(u) = \frac{\sqrt{2}}{\sqrt{N}}$, $C(v) = \frac{\sqrt{2}}{\sqrt{N}}$ for $u, v = 1$ through $N - 1$.

- 5. Set all the negative values to 0.

This has the effect of ignoring the values outside of the value range (0 . . . 255 for gray255; 0 . . . 1 for normalized values).
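Steps 2 through 5 above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the image is assumed to be a square list of lists of grey values, and 20% of the side length is rounded to the nearest integer (5 for a 25×25 image). An optimized DCT routine would normally replace the matrix products shown here.

```python
import math

def dct_basis(n):
    """Rows are the basis vectors C(u) * cos((2x + 1) * u * pi / (2n))."""
    basis = []
    for u in range(n):
        c = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        basis.append([c * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      for x in range(n)])
    return basis

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(img):
    """2DCT: F = B f B^T, term by term the double sum given above."""
    b = dct_basis(len(img))
    return matmul(matmul(b, img), transpose(b))

def idct2(coeffs):
    """2IDCT: f = B^T F B."""
    b = dct_basis(len(coeffs))
    return matmul(matmul(transpose(b), coeffs), b)

def high_pass(img, fraction=0.2):
    """Zero the low-frequency corner, invert, and set negative values to 0."""
    k = int(round(fraction * len(img)))
    coeffs = dct2(img)
    for u in range(k):
        for v in range(k):
            coeffs[u][v] = 0.0  # upper left corner holds the low frequencies
    return [[max(0.0, p) for p in row] for row in idct2(coeffs)]
```

A uniformly grey image contains only the lowest-frequency coefficient, so `high_pass` maps it to an all-zero image.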

- 6. Apply an improved histogram equalization:
  - a. For each pixel, compute the mean of its horizontal, vertical and diagonal neighbours;
  - b. Sort the pixels by their grey level, then by the computed mean;
  - c. Assign new levels of grey to each pixel;
  - d. Return the pixels to their original positions.

The process will also work with conventional histogram equalization, though the quality of the results may be reduced.

- 7. Reshape the image to a vector (e.g. using vectorization).
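Steps 6 and 7 above could be sketched as shown below. The description does not state how the new grey levels are assigned in sub-step c; spreading them uniformly over the output range by rank is an assumption of this sketch, as are the function names.

```python
def neighbour_mean(img, r, c):
    """Mean of the horizontal, vertical and diagonal neighbours inside the image."""
    rows, cols = len(img), len(img[0])
    values = [img[i][j]
              for i in range(max(0, r - 1), min(rows, r + 2))
              for j in range(max(0, c - 1), min(cols, c + 2))
              if (i, j) != (r, c)]
    return sum(values) / len(values) if values else img[r][c]

def equalize_by_rank(img, levels=256):
    """Rank pixels by (grey level, neighbour mean) and assign new levels by rank."""
    rows, cols = len(img), len(img[0])
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    # Sort by grey level first; the neighbour mean breaks ties between equal greys.
    ordered = sorted(positions,
                     key=lambda p: (img[p[0]][p[1]], neighbour_mean(img, *p)))
    out = [[0] * cols for _ in range(rows)]
    total = len(ordered)
    for rank, (r, c) in enumerate(ordered):
        # Assumption: new levels are spread uniformly over 0..levels-1 by rank.
        out[r][c] = rank * levels // total
    return out  # each pixel is written back at its original position

def to_vector(img):
    """Step 7: flatten the image row by row."""
    return [p for row in img for p in row]
```

Breaking ties with the neighbourhood mean gives pixels of equal grey level distinct ranks, which is what distinguishes this procedure from conventional histogram equalization.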

For the whole database, after all images have been reshaped to vectors, perform the following steps:

- 8. Sort the vectors in 8 clusters using k-means. This is an arbitrary clustering that has been determined empirically to be sufficient to effect an advantageous concept. In general, the clustering may be different as will be appreciated by those skilled in the art.
- 9. Retain the cluster centroids.

The training algorithm may be performed offline (i.e., the cluster centroids can be computed a priori and stored in a memory unit).
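Steps 8 and 9 can be illustrated with a basic k-means loop. This is a sketch under stated assumptions: the initial centroids are drawn at random from the data with a fixed seed, a fixed number of iterations is used, and the names `kmeans` and `sq_dist` are hypothetical. It would be run once on the smile vectors and once on the non-smile vectors.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k=8, iterations=20, seed=0):
    """Return k cluster centroids for the given list of vectors."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct samples drawn from the data.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids
```

Only the returned centroids need to be stored for detection, which is why the training can be done offline.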

FIG. 4 illustrates an example of a training component of a facial expression detection technique. At block 402, parameters are initialized for smile and non-smile databases, and the number of clusters is set to 8, and the DCT parameter is set to 20%.

For the smile database, an image is read at block 404. Dimensions are changed at block 406. A 2DCT algorithm is applied at block 408 as set forth above. The high frequencies are kept at block 410, and the upper left corner is turned to zero. A 2IDCT algorithm is applied at block 412 as set forth above. Negative values are made zero at block 414. Histogram equalization is performed at block 416, e.g., as described above. It is determined at block 418 whether the smile database is finished. If not, then a next image is read at block 404. If so, then K-means is used to sort clusters at block 420, and means of clusters for smile are calculated at block 422.

For the non-smile database, an image is read at block 424. Dimensions are changed at block 426. A 2DCT algorithm is applied at block 428 as set forth above. The high frequencies are kept at block 430, and the upper left corner is turned to zero. A 2IDCT algorithm is applied at block 432 as set forth above. Negative values are made zero at block 434. Histogram equalization is performed at block 436, e.g., as described above. It is determined at block 438 whether the non-smile database is finished. If not, then a next image is read at block 424. If so, then K-means is used to sort clusters at block 440, and means of clusters for non-smile are calculated at block 442.

The Detection/Test Part of the Algorithm

The following sequence may be applied for performing detection of smile or non-smiles (or blinks, etc.).

- 1. Load the 16 cluster centroids.
- 2. Read the image to be classified.
- 3. If necessary, convert it to a grayscale image.
- 4. Re-dimension the image to 25×25 pixels.
- 5. Apply the 2DCT transform.
- 6. Set the pixels in the upper left corner of the transformed matrix (20% of the number of pixels on Ox times 20% of the number of pixels on Oy) to 0.
- 7. Apply the 2IDCT transform.
- 8. Set the negative values to 0.
- 9. Apply the improved histogram equalization.
- 10. Reshape the image to a vector.
- 11. Compute the Euclidean distances between the vector and all the cluster centroids.
- 12. Find the minimum distance.
- 13. Assign to the test image the same label (Smile or NonSmile) as the images within the closest cluster.

For certain embodiments, the number of clusters (e.g., 8 clusters for each database) may be varied. Additionally, or alternatively, the number of pixels made 0 after 2DCT (in this case 5×5 pixels) may be varied.
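Steps 11 through 13 of the sequence above amount to a nearest-centroid decision, which can be sketched as follows. The function name `classify_vector` and the way the centroids are passed in (two separate lists, one per database) are assumptions for illustration.

```python
import math

def classify_vector(vector, smile_centroids, non_smile_centroids):
    """Return the label of the centroid closest to `vector` (Euclidean distance)."""
    labelled = ([(c, "Smile") for c in smile_centroids] +
                [(c, "NonSmile") for c in non_smile_centroids])
    # Steps 11 and 12: compute all distances and keep the minimum.
    _, label = min(labelled, key=lambda item: math.dist(vector, item[0]))
    # Step 13: the test image takes the label of the closest cluster.
    return label
```

With 8 centroids per database this evaluates 16 distances per test image.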

FIG. 5 illustrates an example of a detection component of a facial expression detection technique. At block 502, data is input including means of the clusters from the example of FIG. 4 and a test image. The test image is read at block 504. Dimensions are changed at block 506. A 2DCT algorithm is applied at block 508 as set forth above. The high frequencies are kept at block 510, and the upper left corner is turned to zero. A 2IDCT algorithm is applied at block 512 as set forth above. Negative values are made zero at block 514. Histogram equalization is performed at block 516, e.g., as described above. Distances to the center classes are computed at block 518. It is determined at block 520 whether minimum distances exist for a smile cluster. If not, then the image is classified as a non-smile image at block 522. If so, then the image is classified as a smile image at block 524.

Alternative Implementations

As will be appreciated by those skilled in the art, many alternative embodiments of the invention are possible. For example, the principal embodiment describes a technique that determines the smile/no-smile state of a face region within a digital image. It is implicit that a face tracking/face detector has been run on the image and that knowledge of the location of face region(s) within the analysed image is made available to the “smile detector”. This technique can be applied within a digital camera given sufficient computing resources, or may be implemented partly within the camera (e.g. face detection) and partly outside the camera (e.g. smile detection using derived and saved face detection information), or in certain embodiments both the face detection process and the smile detection are used to post-process previously acquired images.

Where the invention is implemented entirely within the camera various improvements to the operation of the invention can be achieved. In particular, the digital camera may acquire a constant stream of preview and/or postview images, and where a face tracking algorithm is embodied within the camera, then information about the determined face regions within each frame of the preview stream is available on a real-time basis. Where the present algorithm is sufficiently optimized, it can be applied in real-time either in parallel with, or sequentially following the application of the face tracker algorithm. Such an embodiment enables (i) improvements in the smile detection process itself and (ii) additional operational features to be provided to a user of the camera.

With respect to item (i) and referring to the computing of Euclidean distances between the vector and cluster centroids, and to the finding of minimum distance per steps 11 & 12 of the above-described exemplary embodiment, where such a real-time smile detection algorithm is implemented, it is possible to compute the smile/no-smile state of a tracked face region and to accumulate this state information over multiple pre-acquisition frames. This enables statistical analysis of the smile/no-smile state of a face and is useful to avoid confounding factors such as sudden changes in illumination and/or face pose which may degrade the accuracy of the smile detection algorithm. Thus, sudden inter-frame fluctuations in the smile feature vector can be ignored until the feature vector stabilizes.

In one embodiment, in addition to calculating the smile feature vector for each frame and determining its smile/no-smile state, the algorithm calculates a difference vector between subsequent frames of the preview/postview image stream. Where this is greater than a certain threshold it may either be interpreted as indicating a sudden change in external illumination or pose (which may be confirmed by the exposure determining subsystem of the camera for the case of illumination, or by the face-lock characterization of the face tracking algorithm) or it may be interpreted as a transition between smile and no-smile states (which may be confirmed by analysis of subsequent preview/postview frames).
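A minimal sketch of the inter-frame check follows. Measuring the size of the difference vector by its Euclidean norm is an assumption of this sketch, as is the function name; the description only requires that the difference be compared with a certain threshold.

```python
import math

def inter_frame_change(previous, current, threshold):
    """Return the difference vector and whether its magnitude exceeds the threshold."""
    diff = [c - p for p, c in zip(previous, current)]
    magnitude = math.sqrt(sum(d * d for d in diff))  # assumed measure: Euclidean norm
    return diff, magnitude > threshold
```

When the flag is set, the cause (illumination, pose, or a genuine smile transition) would still have to be confirmed by the other subsystems mentioned above.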

In alternative embodiments, a running average of the smile feature vector may be calculated and this averaged feature vector is used to determine the smile/no-smile state of a face region over multiple preview frames.
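One way to maintain such a running average is a fixed-length window over the most recent frames, sketched below. The window length of 5 and the class name `RunningAverage` are illustrative assumptions; an exponentially weighted average would be an equally plausible reading.

```python
from collections import deque

class RunningAverage:
    """Element-wise mean of the most recent smile feature vectors."""

    def __init__(self, window=5):
        self.frames = deque(maxlen=window)  # the oldest frame drops out automatically

    def add(self, vector):
        """Add the vector of the current frame and return the averaged vector."""
        self.frames.append(list(vector))
        return [sum(col) / len(self.frames) for col in zip(*self.frames)]
```

The averaged vector can then be passed to the same nearest-centroid decision as a single-frame vector.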

In yet a further embodiment, the distances between the current smile feature vector and both the nearest smile centroid and the nearest no-smile centroid are calculated for each preview frame. The ratio between these two distances is analyzed statistically over several frames and used to determine a smile/no-smile probability measure rather than a simple smile/no-smile state measure. Thus, where a smile feature vector is a normalized distance of 0.2 from the nearest smile centroid and a distance of 0.8 from the nearest no-smile centroid, it is 80% likely to be a smile or 20% likely to be not a smile. In a variation on this embodiment, the log of the normalized distance is used to calculate a probability rather than the normalized distance itself.
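The worked figures above (0.2 and 0.8 giving 80%) are consistent with normalizing the two distances so that they sum to 1 and taking the complement of the smile distance. The sketch below assumes that reading; the function name and the handling of the degenerate case where both distances are zero are illustrative choices.

```python
def smile_probability(dist_to_smile, dist_to_no_smile):
    """Probability of a smile from the distances to the two nearest centroids."""
    total = dist_to_smile + dist_to_no_smile
    if total == 0:
        return 0.5  # assumption: equidistant at zero distance is treated as undecided
    # After normalization the distances sum to 1, so this equals 1 - normalized smile distance.
    return dist_to_no_smile / total
```

For example, `smile_probability(0.2, 0.8)` evaluates to 0.8, matching the 80% figure in the text.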

With respect to item (ii) above, where the smile detection process is operable on a preview/postview stream, it is possible to monitor state transitions of tracked face regions. This enables, for example, a camera to implement an improved “group shot” feature, where an image is captured when everyone in a preview frame is determined to be smiling.

In other embodiments, the camera could issue a warning beep if one or more people are not smiling (the “smile guarantee” feature); or acquisition could be delayed until everyone (or a plurality or certain percentage or certain number) are determined to be smiling.
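The group shot decision can be sketched as a simple count over the per-face results. Treating an inconclusive face as not smiling, the parameter `max_not_smiling` and the returned action names are assumptions made for illustration.

```python
def group_shot_action(states, max_not_smiling=0):
    """Decide what to do given per-face results.

    states: iterable of "smile", "no smile" or "inconclusive", one per tracked face.
    """
    # Assumption: any face not positively classified as smiling counts against capture.
    not_smiling = sum(1 for s in states if s != "smile")
    return "capture" if not_smiling <= max_not_smiling else "warn_or_delay"
```

Setting `max_not_smiling` above zero corresponds to requiring only a certain number or percentage of faces to be smiling rather than everyone.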

In embodiments where additional image reconstruction and/or compositing and/or super-resolution algorithms are available within the camera then face regions, or portions thereof, from one or more preview frames may be combined with the main acquired image to ensure that a final, composited image presents the “best smile” for each detected face. The judging of the quality of a smile may be achieved using a smile/no-smile probability as described above.

Metadata relating to the smile/no-smile state or smile probability may be stored/saved with other information relating to the relevant tracked face region.

FIGS. 6 and 7 illustrate a further embodiment. In both of the photos illustrated at FIGS. 6 and 7, the subject is not smiling and not blinking. In FIG. 6, the no smile, no blink state of the subject may be detected using a variety of geometrical and/or learning techniques. However, inferior techniques can tend to falsely detect the subject as smiling and blinking in FIG. 7, even though the subject is not smiling and not blinking. Because the subject is looking down in FIG. 7, it can appear that the subject's lips are curved upward on the outsides just like a smiling mouth would appear on a face in a frontal, non-tilted pose. The subject can also appear to be blinking or sleeping or otherwise have her eyes closed in FIG. 7, because no part of her eye balls is showing.

Based on the eyes-mouth triangle (smoothed by the face tracking algorithm over several frames), it is determined in this embodiment whether the face is rotated in the plane (RIP) and/or out of the plane (ROP). Based on this information, smile acceptance/rejection thresholds are adjusted dynamically in this embodiment.

The smile detection threshold may be relaxed on different rotation-in-plane (RIP) angles, or a smile detection may be applied on a precise angle (by rotating the crop image or the classifiers) and having stronger smile classifiers on 0 (+/−5) degrees. [Note: Now they are more relaxed in the training process => 0 (+/−20) degrees.]

A stronger smile detection threshold may be applied when faces are tilted up or down (pitch rotation). Note: up-down faces can otherwise tend to lead to a large percentage of false smile detections.

This same idea can also be applied to adjust dynamic blink acceptance/rejection thresholds.
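
The dynamic smile threshold adjustment described above can be sketched as follows. This is only an illustration under stated assumptions: the in-plane angle is taken from the line joining the eyes, an up-down (pitch) pose is inferred from how far the height-to-base ratio of the eyes-mouth triangle departs from a nominal frontal value, "relaxed" is read as a lower probability threshold, and every numeric default (base threshold, tolerances, adjustments) is a hypothetical value rather than one given in this description.

```python
import math


def roll_angle_degrees(left_eye, right_eye):
    """In-plane rotation (RIP) estimated from the line joining the two eyes."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def triangle_aspect(left_eye, right_eye, mouth):
    """Height-to-base ratio of the eyes-mouth triangle (assumed pitch cue)."""
    eye_distance = math.dist(left_eye, right_eye)
    eye_midpoint = ((left_eye[0] + right_eye[0]) / 2.0,
                    (left_eye[1] + right_eye[1]) / 2.0)
    return math.dist(eye_midpoint, mouth) / eye_distance


def smile_threshold(left_eye, right_eye, mouth,
                    base=0.5,             # assumed frontal acceptance threshold
                    nominal_aspect=1.0,   # assumed frontal triangle ratio
                    roll_tolerance=20.0,  # degrees, echoing the +/-20 note
                    roll_relaxation=0.1,  # assumed amount to relax for RIP
                    pitch_tolerance=0.2,  # assumed allowed ratio deviation
                    pitch_penalty=0.3):   # assumed amount to tighten for pitch
    """Return a smile acceptance threshold adjusted for the estimated pose."""
    threshold = base
    if abs(roll_angle_degrees(left_eye, right_eye)) > roll_tolerance:
        threshold -= roll_relaxation      # relax for in-plane rotation
    aspect = triangle_aspect(left_eye, right_eye, mouth)
    if abs(aspect - nominal_aspect) > pitch_tolerance:
        threshold += pitch_penalty        # be stricter for up-down faces
    return min(max(threshold, 0.0), 1.0)
```

A smile would then be accepted only when the classifier's smile probability exceeds the returned threshold; an analogous function could supply blink thresholds.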

Applications

As noted above, there are many applications for embodiments of the invention that detect smiles in digital images. Further applications are possible where two or more sensors are implemented within a digital image acquisition device. In accordance with one embodiment of the invention where at least one additional sensor is implemented in the device and that sensor faces the user (e.g., photographer), an image of the photographer may be acquired as the photographer is in the process of acquiring an image. Such an embodiment allows the production of a diptych which includes the photographer as well as the image acquired by the user.

When employed with facial expression detection, such an embodiment may allow the image acquisition device to acquire an image upon recognition or detection of a given facial expression (e.g., smile) of the user (e.g., photographer). This allows the motion associated with typical press-button image acquisition schemes to be reduced.
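
A smile-triggered acquisition of this kind might be sketched as below. The sketch assumes a per-frame smile decision is already available for the user-facing sensor and accumulates it over a short window of frames before triggering; the `SmileTrigger` class, the window length, and the minimum fraction are hypothetical choices made for illustration.

```python
from collections import deque


class SmileTrigger:
    """Fire once smiles dominate a sliding window of recent frames (illustrative)."""

    def __init__(self, window=5, min_fraction=0.6):
        self.window = window              # number of recent frames considered
        self.min_fraction = min_fraction  # share of smiling frames required
        self.recent = deque(maxlen=window)

    def update(self, smile_detected):
        """Record one per-frame decision; return True when acquisition should start."""
        self.recent.append(bool(smile_detected))
        if len(self.recent) < self.window:
            return False                  # not enough frames accumulated yet
        return sum(self.recent) / self.window >= self.min_fraction
```

Accumulating over several frames, rather than acting on a single frame, is intended to keep one spurious detection from triggering a capture.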

Similarly, embodiments of the invention can be employed to review and categorize acquired images or images as they are being acquired based upon the facial expressions of the user or a subsequent reviewer. For example, the facial expressions (indicating emotions) of the person(s) reviewing photos are detected. If the reviewing person(s) smile, then the image is auto tagged as a keeper or a preferred image. If the image gets multiple “smile” reviews over time, then its preferred score goes up. The list of preferred images can be used for playback on the camera where preferred images are presented first over lesser preferred images as a playback mode.
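
The review-based tagging and playback ordering can be illustrated with a small sketch. It assumes the reviewer's smile has already been detected for each viewing; the `ReviewTagger` class and its method names are hypothetical, and the score is simply a count of "smile" reviews.

```python
class ReviewTagger:
    """Track a preferred score per image from reviewers' smiles (illustrative)."""

    def __init__(self):
        self.scores = {}  # image identifier -> number of "smile" reviews

    def record_review(self, image_id, reviewer_smiled):
        """Raise the image's preferred score when the reviewer smiled."""
        if reviewer_smiled:
            self.scores[image_id] = self.scores.get(image_id, 0) + 1

    def is_keeper(self, image_id):
        """An image with at least one "smile" review is tagged as a keeper."""
        return self.scores.get(image_id, 0) > 0

    def playback_order(self, image_ids):
        """Preferred images first; ties keep their original order."""
        return sorted(image_ids, key=lambda i: -self.scores.get(i, 0))
```

Because Python's sort is stable, images with equal scores are played back in the order in which they were supplied.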

For certain embodiments, this concept of emotion determination based upon facial expression detection is broadened as follows. Smiling and other facial expressions are used for tagging on personal computers, for example tagging documents and videos, or establishing entry points or tags of interest in video. Such PC applications could be implemented using cameras mounted in the displays of personal computers, for example.

In accordance with certain embodiments, data processing uses a digital processing system (DPS). The DPS may be configured to store, process, and communicate various types of digital information, including digital images and video.

As discussed above, embodiments of the invention may employ a DPS or devices having digital processing capabilities. Exemplary components of such a system include a central processing unit (CPU), and a signal processor coupled to a main memory, static memory, and mass storage device. The main memory may store various applications to effect operations of the invention, while the mass storage device may store various digital content.

The DPS may also be coupled to input/output (I/O) devices and audio/visual devices. The CPU may be used to process information and/or signals for the processing system. The main memory may be a random access memory (RAM) or some other dynamic storage device, for storing information or instructions (program code), which are used by the CPU. The static memory may be a read only memory (ROM) and/or other static storage devices, for storing information or instructions, which may also be used by the CPU. The mass storage device may be, for example, a hard disk drive, optical disk drive, or firmware for storing information or instructions for the processing system.

General Matters

Embodiments of the invention provide methods and apparatuses for detecting and determining facial expressions in digital images.

Embodiments of the invention have been described as including various operations. Many of the processes are described in their most basic form, but operations can be added to or deleted from any of the processes without departing from the scope of the invention.

The operations of the invention may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the operations. Alternatively, the steps may be performed by a combination of hardware and software. The invention may be provided as a computer program product that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process according to the invention. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of media/machine-readable medium suitable for storing electronic instructions. Moreover, the invention may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection). All operations may be performed at the same central site or, alternatively, one or more operations may be performed elsewhere.

While the invention has been described in terms of several embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.

Claims

1. An image processing method, comprising: identifying a group of pixels that correspond to a face within at least one digitally-acquired image using a portable image acquisition device; performing the following steps in real time as a stream of images is acquired by the portable image acquisition device: identifying face regions within said stream of images; for each interval of a plurality of intervals of the stream of images, determining a facial expression state for the interval by accumulating facial expression state information over multiple images of the interval and classifying a facial expression state for the interval based on the accumulated facial expression state information and logic that spans over several images of the interval, wherein each interval of the plurality of intervals includes one or more previously acquired images from the stream of images; determining a facial expression state for a first interval of the plurality of intervals; determining a facial expression state for a second interval of the plurality of intervals; detecting a facial expression state transition between the first interval and the second interval based upon the determined facial expression states for the first interval and the second interval, wherein the determined facial expression state for the first interval is different from the determined facial expression state for the second interval; in response to the detected transition in the facial expression states, performing one or more operations to post-process at least one previously acquired image from the stream of images; and causing display of the at least one post-processed image via a display coupled to the portable image acquisition device.
2. The method of claim 1, wherein each interval of the plurality of intervals includes multiple images from the stream of images.
3. The method of claim 2, wherein the stream of images comprises a collection of images having a lower resolution than the resolution of an image sensor of the portable image acquisition device.
4. The method of claim 3, wherein identifying the group of pixels that correspond to the face comprises applying at least one of Haar and census features.
5. The method of claim 4, wherein: identifying face regions within said stream of images comprises tracking pose of the face within the stream of images; and determining a facial expression state for the interval comprises determining a facial expression state based upon the tracked pose of the face.
6. The method of claim 3, wherein: identifying a group of pixels that correspond to a face within at least one digitally-acquired image on the portable image acquisition device comprises using a face detector to identify the group of pixels that correspond to the face with the at least one digitally-acquired image; identifying face regions within said stream of images comprises tracking pose of the face within the stream of images; and determining a facial expression state for the interval comprises determining a facial expression state based upon the tracked pose of the face.
7. The method of claim 2, wherein: identifying a group of pixels that correspond to a face within at least one digitally-acquired image on the portable image acquisition device comprises using a face detector to identify the group of pixels that correspond to the face with the at least one digitally-acquired image; identifying face regions within said stream of images comprises tracking pose of the face within the stream of images; and determining a facial expression state for the interval comprises determining a facial expression state based upon the tracked pose of the face.
8. The method of claim 2, wherein identifying face regions within said stream of images comprises tracking pose of multiple faces within the stream of images including said face.
9. The method of claim 2, wherein performing one or more operations to post-process the stream of images in response to the detected transition in the facial expression state comprises performing a compositing operation.
10. The method of claim 2, wherein performing one or more operations to post-process the stream of images in response to the detected transition in the facial expression state comprises compositing a best smile image by combining at least one face region from multiple images in the stream of images.
11. The method of claim 2, further comprising performing in-device training of new classifiers used to separate one face from another.
12. The method of claim 2, wherein images in the stream of images are greyscale images.
13. The method of claim 2, wherein detecting a facial expression state transition comprises detecting at least one transition in a facial expression state selected from the group consisting of: a transition from a non-smile to a smile; a transition from a smile to a non-smile; a blink of one eye; and a blink of both eyes.
14. An image processing method, comprising: identifying a group of pixels that correspond to a face within at least one digitally-acquired image on a portable image acquisition device using a face detector; acquiring a stream of images using the portable image acquisition device, where the stream of images comprises a collection of images having a lower resolution than the resolution of an image sensor of the portable image acquisition device; and performing in real time as the stream of images is acquired by the portable image acquisition device steps including: tracking pose of the face within the stream of images; for each interval of a plurality of intervals of the stream of images, determining a facial expression state based upon the tracked pose of the face by accumulating facial expression state information over multiple images of the interval and classifying a facial expression state for the interval based on the accumulated facial expression state information and logic that spans over several images of the interval, wherein each interval of the plurality of intervals includes one or more previously acquired images from the stream of images; determining a facial expression state for a first interval of the plurality of intervals; determining a facial expression state for a second interval of the plurality of intervals; detecting a facial expression state transition between the first interval and the second interval based upon the determined facial expression states for the first interval and the second interval, wherein the determined facial expression state for the first interval is different from the determined facial expression state for the second interval; in response to the detected transition between facial expression states, performing at least one image post-processing operation to at least one previously acquired image from the stream of images acquired by the portable image acquisition device; and causing display of the post-processed image on a display coupled to the portable image acquisition device.
15. The method of claim 14, wherein the image on which the post-processing operation is performed is an image from the collection of images.
16. The method of claim 14, wherein: images in the collection of images are greyscale images; and the post-processing operation is performed on a color image.
17. The method of claim 16, wherein the color image has a higher resolution than the resolution of images in the collection of images.
18. The method of claim 14, wherein performing one or more operations to post-process the stream of images in response to the detected transition in the facial expression state comprises performing a compositing operation.
19. The method of claim 14, wherein performing one or more operations to post-process the stream of images in response to the detected transition in the facial expression state comprises compositing a best smile image by combining at least one face region from multiple images in the stream of images.
20. The method of claim 14, further comprising performing in-device training of new classifiers used to separate one face from another.
21. The method of claim 14, wherein detecting a facial expression state transition comprises detecting at least one transition in a facial expression state selected from the group consisting of: a transition from a non-smile to a smile; a transition from a smile to a non-smile; a blink of one eye; and a blink of both eyes.
22. A portable device, comprising: a camera; a display; a set of one or more processors capable of performing steps comprising: acquiring at least one image using the camera; identifying a group of pixels that correspond to a face within the at least one acquired image using a face detector; acquiring a stream of images using the camera, where the stream of images comprises a collection of images having a lower resolution than the resolution of an image sensor of the camera; and performing in real time as the stream of images is acquired steps comprising: tracking pose of the face within the stream of images; for each interval of a plurality of intervals of the stream of images, determining a facial expression state based upon the tracked pose of the face by accumulating facial expression state information over multiple images of the interval and classifying a facial expression state for the interval based on the accumulated facial expression state information and logic that spans over several images of the interval, wherein each interval of the plurality of intervals includes one or more previously acquired images from the streams of images; determining a facial expression state for a first interval of the plurality of intervals; determining a facial expression state for a second interval of the plurality of intervals; detecting a facial expression state transition between the first interval and the second interval based upon the determined facial expression states for the first interval and the second interval, wherein the determined facial expression state for the first interval is different from the determined facial expression state for the second interval; in response to the detected transition between facial expression states, performing at least one image post-processing operation to at least one previously acquired image from the stream of images acquired by the camera; and displaying the post-processed image via the display.
23. The portable device of claim 22, wherein the image on which the post-processing operation is performed is an image from the collection of images.
24. The portable device of claim 22, wherein: the images in the collection of images are greyscale images; and the post-processing operation is performed on a color image.
25. The portable device of claim 24, wherein the color image has a higher resolution than the resolution of images in the collection of images.
26. The portable device of claim 22, wherein performing one or more operations to post-process the stream of images in response to the detected transition in the facial expression state comprises performing a compositing operation.
27. The portable device of claim 22, wherein performing one or more operations to post-process the stream of images in response to the detected transition in the facial expression state comprises compositing a best smile image by combining at least one face region from multiple images in the stream of images.
28. The portable device of claim 22, further comprising performing in-device training of new classifiers used to separate one face from another.
29. The portable device of claim 22, wherein detecting a facial expression state transition comprises detecting at least one transition in a facial expression state selected from the group consisting of: a transition from a non-smile to a smile; a transition from a smile to a non-smile; a blink of one eye; and a blink of both eyes.

PRIORITY CLAIM AND RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 17/020,535, filed Sep. 14, 2020, which is a continuation of U.S. patent application Ser. No. 15/948,848, filed Apr. 9, 2018 and issued on Sep. 15, 2020 as U.S. Pat. No. 10,778,885, which is a continuation of U.S. patent application Ser. No. 15/284,280, filed Oct. 3, 2016 and issued on Apr. 10, 2018 as U.S. Pat. No. 9,942,470, which is a continuation of U.S. patent application Ser. No. 14/300,150, filed Jun. 9, 2014 and issued on Oct. 4, 2016 as U.S. Pat. No. 9,462,180, which is a continuation of application Ser. No. 12/354,707, filed on Jan. 15, 2009 and issued on Jun. 10, 2014 as U.S. Pat. No. 8,750,578, which claims the benefit under 35 U.S.C. § 119(e) of provisional applications 61/024,508, filed Jan. 29, 2008, entitled “Methods and Apparatuses For Detecting Facial Expressions in Digital Images and Applications Thereof”, and 61/023,855, filed Jan. 27, 2008, entitled “Blink Detection Method”. This application is also related to U.S. patent application Ser. No. 11/752,925, filed on May 24, 2007, entitled “Image Processing Method and Apparatus”. The entire contents of the above applications are hereby incorporated by reference for all purposes as if fully set forth herein. The applicant(s) hereby rescind any disclaimer of claim scope in the parent application(s) or the prosecution history thereof and advise the USPTO that the claims in this application may be broader than any claim in the parent application(s).

US Referenced Citations (210)

Number	Name	Date	Kind
4047187	Mashimo et al.	Sep 1977	A
4299464	Cushman	Nov 1981	A
4317991	Stauffer	Mar 1982	A
4367027	Stauffer	Jan 1983	A
RE31370	Mashimo et al.	Sep 1983	E
4638364	Hiramatsu	Jan 1987	A
5018017	Sasaki et al.	May 1991	A
RE33682	Hiramatsu	Sep 1991	E
5063603	Burt	Nov 1991	A
5164831	Kuchta et al.	Nov 1992	A
5164992	Turk et al.	Nov 1992	A
5227837	Terashita	Jul 1993	A
5280530	Trew et al.	Jan 1994	A
5291234	Shindo et al.	Mar 1994	A
5311240	Wheeler	May 1994	A
5384912	Ogrinc et al.	Jan 1995	A
5430809	Tomitaka	Jul 1995	A
5432863	Benati et al.	Jul 1995	A
5488429	Kojima et al.	Jan 1996	A
5496106	Anderson	Mar 1996	A
5572596	Wildes et al.	Nov 1996	A
5576759	Kawamura et al.	Nov 1996	A
5633678	Parulski et al.	May 1997	A
5638136	Kojima et al.	Jun 1997	A
5680481	Prasad et al.	Oct 1997	A
5684509	Hatanaka et al.	Nov 1997	A
5692065	Prakash et al.	Nov 1997	A
5706362	Yabe	Jan 1998	A
5710833	Moghaddam et al.	Jan 1998	A
5724456	Boyack et al.	Mar 1998	A
5774591	Black et al.	Jun 1998	A
5774747	Ishihara et al.	Jun 1998	A
5774754	Ootsuka	Jun 1998	A
5781650	Lobo et al.	Jul 1998	A
5802208	Podilchuk et al.	Sep 1998	A
5802220	Black et al.	Sep 1998	A
5802361	Wang et al.	Sep 1998	A
5805720	Suenaga et al.	Sep 1998	A
5812193	Tomitaka et al.	Sep 1998	A
5818975	Goodwin et al.	Oct 1998	A
5835616	Lobo et al.	Nov 1998	A
5842194	Arbuckle	Nov 1998	A
5870138	Smith et al.	Feb 1999	A
5963656	Bolle et al.	Oct 1999	A
5978519	Bollman et al.	Nov 1999	A
5991456	Rahman et al.	Nov 1999	A
6053268	Yamada	Apr 2000	A
6072903	Maki et al.	Jun 2000	A
6097470	Buhr et al.	Aug 2000	A
6101271	Yamashita et al.	Aug 2000	A
6115509	Yeskel	Sep 2000	A
6128397	Baluja et al.	Oct 2000	A
6148092	Qian	Nov 2000	A
6151073	Steinberg et al.	Nov 2000	A
6188777	Darrell et al.	Feb 2001	B1
6192149	Eschbach et al.	Feb 2001	B1
6249315	Holm	Jun 2001	B1
6263113	Abdel-mottaleb et al.	Jul 2001	B1
6268939	Klassen et al.	Jul 2001	B1
6282317	Luo et al.	Aug 2001	B1
6301370	Steffens et al.	Oct 2001	B1
6301440	Bolle et al.	Oct 2001	B1
6332033	Qian	Dec 2001	B1
6360021	Mccarthy et al.	Mar 2002	B1
6393148	Bhaskar	May 2002	B1
6400830	Christian et al.	Jun 2002	B1
6404900	Qian et al.	Jun 2002	B1
6407777	Deluca	Jun 2002	B1
6421468	Ratnakar et al.	Jul 2002	B1
6438264	Gallagher et al.	Aug 2002	B1
6456732	Kimbell et al.	Sep 2002	B1
6456737	Woodfill et al.	Sep 2002	B1
6459436	Kumada et al.	Oct 2002	B1
6473199	Gilman et al.	Oct 2002	B1
6501857	Gotsman et al.	Dec 2002	B1
6501942	Weissman et al.	Dec 2002	B1
6504942	Hong et al.	Jan 2003	B1
6504951	Luo et al.	Jan 2003	B1
6516154	Parulski et al.	Feb 2003	B1
6526161	Yan	Feb 2003	B1
6556708	Christian et al.	Apr 2003	B1
6606397	Yamamoto	Aug 2003	B1
6606398	Cooper	Aug 2003	B2
6633655	Hong et al.	Oct 2003	B1
6661907	Ho et al.	Dec 2003	B2
6697503	Matsuo et al.	Feb 2004	B2
6697504	Tsai	Feb 2004	B2
6754389	Dimitrova et al.	Jun 2004	B1
6760465	Mcveigh et al.	Jul 2004	B2
6765612	Anderson et al.	Jul 2004	B1
6801250	Miyashita	Oct 2004	B1
6850274	Silverbrook et al.	Feb 2005	B1
6876755	Taylor et al.	Apr 2005	B1
6879705	Tao et al.	Apr 2005	B1
6940545	Ray et al.	Sep 2005	B1
6965684	Chen et al.	Nov 2005	B2
6993157	Oue et al.	Jan 2006	B1
6996340	Yamaguchi et al.	Feb 2006	B2
7003135	Hsieh et al.	Feb 2006	B2
7020337	Viola et al.	Mar 2006	B2
7027619	Pavlidis et al.	Apr 2006	B2
7035440	Kaku	Apr 2006	B2
7035456	Lestideau	Apr 2006	B2
7035467	Nicponski	Apr 2006	B2
7038709	Verghese	May 2006	B1
7038715	Flinchbaugh	May 2006	B1
7050607	Li et al.	May 2006	B2
7064776	Sumi et al.	Jun 2006	B2
7082212	Liu et al.	Jul 2006	B2
7099510	Jones et al.	Aug 2006	B2
7110575	Chen et al.	Sep 2006	B2
7113641	Eckes et al.	Sep 2006	B1
7119838	Zanzucchi et al.	Oct 2006	B2
7120279	Chen et al.	Oct 2006	B2
7151843	Rui et al.	Dec 2006	B2
7158680	Pace	Jan 2007	B2
7162076	Liu	Jan 2007	B2
7162101	Itokawa et al.	Jan 2007	B2
7171023	Kim et al.	Jan 2007	B2
7171025	Rui et al.	Jan 2007	B2
7174033	Yukhin et al.	Feb 2007	B2
7190829	Zhang et al.	Mar 2007	B2
7200249	Okubo et al.	Apr 2007	B2
7218759	Ho et al.	May 2007	B1
7227976	Jung et al.	Jun 2007	B1
7233684	Fedorovskaya et al.	Jun 2007	B2
7248300	Ono	Jul 2007	B1
7254257	Kim et al.	Aug 2007	B2
7274822	Zhang et al.	Sep 2007	B2
7274832	Nicponski	Sep 2007	B2
7315631	Corcoran et al.	Jan 2008	B1
7317815	Steinberg et al.	Jan 2008	B2
7408581	Gohda	Aug 2008	B2
7440593	Steinberg et al.	Oct 2008	B1
7551755	Steinberg et al.	Jun 2009	B1
7715597	Costache et al.	May 2010	B2
7738015	Steinberg et al.	Jun 2010	B2
7764311	Bill	Jul 2010	B2
7787664	Luo et al.	Aug 2010	B2
7804983	Steinberg et al.	Sep 2010	B2
7916971	Bigioi et al.	Mar 2011	B2
8005268	Steinberg et al.	Aug 2011	B2
8238618	Ogawa	Aug 2012	B2
8750578	Neghina et al.	Jun 2014	B2
9462180	Neghina et al.	Oct 2016	B2
9942470	Neghina et al.	Apr 2018	B2
10778885	Neghina et al.	Sep 2020	B2
20010028731	Covell et al.	Oct 2001	A1
20010031142	Whiteside	Oct 2001	A1
20010040987	Bjorn et al.	Nov 2001	A1
20020090116	Miichi et al.	Jul 2002	A1
20020105482	Lemelson et al.	Aug 2002	A1
20020105662	Patton et al.	Aug 2002	A1
20020114535	Luo	Aug 2002	A1
20020172419	Lin et al.	Nov 2002	A1
20030025812	Slatter	Feb 2003	A1
20030052991	Stavely et al.	Mar 2003	A1
20030068100	Covell et al.	Apr 2003	A1
20030071908	Ojima et al.	Apr 2003	A1
20030160879	Robins et al.	Aug 2003	A1
20030169906	Gokturk et al.	Sep 2003	A1
20030190090	Beeman et al.	Oct 2003	A1
20040001616	Gutta et al.	Jan 2004	A1
20040088272	Jojic et al.	May 2004	A1
20040170397	Ono	Sep 2004	A1
20040175020	Bradski et al.	Sep 2004	A1
20040197013	Kamei	Oct 2004	A1
20040213482	Silverbrook	Oct 2004	A1
20040218916	Yamaguchi et al.	Nov 2004	A1
20040223629	Chang	Nov 2004	A1
20040223649	Zacks et al.	Nov 2004	A1
20040258304	Shiota et al.	Dec 2004	A1
20050013479	Xiao et al.	Jan 2005	A1
20050018925	Bhagavatula et al.	Jan 2005	A1
20050069208	Morisada	Mar 2005	A1
20050102246	Movellan et al.	May 2005	A1
20050169536	Accomazzi et al.	Aug 2005	A1
20050286802	Clark et al.	Dec 2005	A1
20060110014	Philomin et al.	May 2006	A1
20060177100	Zhu et al.	Aug 2006	A1
20060177131	Porikli	Aug 2006	A1
20060204106	Yamaguchi	Sep 2006	A1
20070025722	Matsugu	Feb 2007	A1
20070091203	Peker et al.	Apr 2007	A1
20070098303	Gallagher et al.	May 2007	A1
20070133901	Aiso	Jun 2007	A1
20070154095	Cao et al.	Jul 2007	A1
20070154096	Cao et al.	Jul 2007	A1
20070201725	Steinberg et al.	Aug 2007	A1
20070201750	Ito et al.	Aug 2007	A1
20070237421	Luo et al.	Oct 2007	A1
20080025576	Li et al.	Jan 2008	A1
20080144966	Steinberg et al.	Jun 2008	A1
20080192129	Walker et al.	Aug 2008	A1
20080310759	Liu et al.	Dec 2008	A1
20090109400	Yoshinaga et al.	Apr 2009	A1
20090135269	Nozaki et al.	May 2009	A1
20090190803	Neghina et al.	Jul 2009	A1
20100125799	Roberts et al.	May 2010	A1
20110069277	Blixt et al.	Mar 2011	A1
20110102553	Corcoran et al.	May 2011	A1
20110216943	Ogawa	Sep 2011	A1
20110234847	Bigioi et al.	Sep 2011	A1
20110235912	Bigioi et al.	Sep 2011	A1
20120218398	Mehra	Aug 2012	A1
20120219180	Mehra	Aug 2012	A1
20140347514	Neghina et al.	Nov 2014	A1
20170026567	Neghina et al.	Jan 2017	A1
20180295279	Neghina et al.	Oct 2018	A1
20210067686	Neghina et al.	Mar 2021	A1

Foreign Referenced Citations (24)

Number	Date	Country
1487473	Apr 2004	CN
1723467	Jan 2006	CN
102007499	Apr 2011	CN
102007499	Mar 2015	CN
1748378	Jan 2007	EP
2370438	Jun 2002	GB
1993260360	Oct 1993	JP
H08272948	Oct 1996	JP
H08272973	Oct 1996	JP
2000347277	Dec 2000	JP
2001067459	Mar 2001	JP
2002199202	Jul 2002	JP
2004294498	Oct 2004	JP
2005003852	Jan 2005	JP
2007088644	Apr 2007	JP
2007249132	Sep 2007	JP
2007306418	Nov 2007	JP
2007329602	Dec 2007	JP
2011511977	Apr 2011	JP
5639478	Oct 2014	JP
20100116178	Oct 2010	KR
101615254	Apr 2016	KR
2007060980	May 2007	WO
2009095168	Aug 2009	WO

Non-Patent Literature Citations (50)

Entry
International Preliminary Report on Patentability for International Application No. PCT/EP2009/000315, Report dated Aug. 3, 2010, dated Aug. 12, 2010, 7 Pgs.
Non-final Office Action dated Feb. 15, 2022, for U.S. Appl. No. 17/643,162, filed Dec. 7, 2021, 13 pgs.
Kotsia et al., “Facial Expression Recognition in Image Sequences Using Geometric Deformation Features and Support Vector Machines”, IEEE Transactions on Image Processing, Jan. 2007, vol. 16, No. 1, pp. 172-187.
Saatci et al., “Cascaded Classification of Gender and Facial Expression using Active Appearance Models”, IEEE, FGR'06, 2006, 6 pgs.
Claims in Korean Application No. 10-2010-7016964, Dated Jun. 2015, 4 Pgs.
Communication Pursuant to Article 94(3) EPC, for European Patent Application No. 06789329.7, dated Jul. 31, 2009, 5 Pgs.
Communication Pursuant to Article 94(3) EPC, for European Patent Application No. 06789329.7, dated May 23, 2011, 5 Pgs.
Current Claims in China application No. 200980102866.0, dated May 2014, 7 Pgs.
Current Claims in Japan Application No. 2013-17764, dated Jun. 2014, 3 Pgs.
Extended European Search Report for European Application No. 06789329.7, Search completed Jan. 13, 2009 , dated Jan. 22, 2009, 7 Pgs.
Extended European Search Report for European Application No. 06800683.2, Search Completed May 18, 2011 , dated Jun. 29, 2011, 7 Pgs.
Final Office Action dated Mar. 24, 2010, for U.S. Appl. No. 11/460,225, filed Jul. 26, 2006.
Final Rejection, dated Jun. 3, 2013, for U.S. Appl. No. 13/035,907, filed Aug. 26, 2011.
International Search Report and Written Opinion for International Application No. PCT/EP2009/000315, Search completed Apr. 7, 2009, dated Apr. 29, 2009, 12 Pgs.
International Search Report and Written Opinion for International Application No. PCT/US2006/30173, Search completed Sep. 24, 2007, dated Nov. 1, 2007, 6 Pgs.
International Search Report and Written Opinion for International Application No. PCT/US2006/30315, Search completed Feb. 19, 2007, dated May 2, 2007, 8 Pgs.
Japan Patent Office, “Notification of Reasons for Refusal” in Application No. 2013-17764, dated Jun. 17, 2014, 4 Pgs.
Korean Claims in Application No. 10-2010-7016964, Dated Nov. 2014, 3 Pgs.
Korean Intellectual Property Office, “Notice of Non-Final Rejection” in Application No. 10-2010-7016964, dated Nov. 24, 2014, 11 Pgs.
Korean Intellectual Property Office, “Search Report” in Application No. 10-2010-7016964, dated Jun. 30, 2015, 7 Pgs.
Non-final Office Action dated Jul. 22, 2013, for U.S. Appl. No. 13/219,569, filed Aug. 26, 2011.
Non-final Office Action dated Dec. 4, 2012, for U.S. Appl. No. 13/035,907, filed Feb. 25, 2011.
Non-final Office Action dated Mar. 25, 2010, for U.S. Appl. No. 11/460,218, filed Jul. 26, 2006.
Non-final Office Action dated Sep. 21, 2009, for U.S. Appl. No. 11/460,218, filed Jul. 26, 2006.
Non-final Office Action dated Sep. 22, 2009, for U.S. Appl. No. 11/460,225, filed Jul. 26, 2006.
Notice of Allowance dated Aug. 20, 2010, for U.S. Appl. No. 11/460,225, filed Jul. 26, 2006.
Notice of Allowance dated Jul. 13, 2010, for U.S. Appl. No. 11/460,225, filed Jul. 26, 2006.
Notice of Allowance dated Jun. 8, 2012, for U.S. Appl. No. 13/191,239, filed Jul. 26, 2011.
Notice of Allowance dated Jun. 29, 2010, for U.S. Appl. No. 11/460,218, filed Jul. 26, 2006.
Office Action for Japanese Patent Application No. 2010-544617 (JP2011-511977 A), dated Dec. 18, 2012.
PCT Notification Concerning Transmittal of International Preliminary Report on Patentability, PCT Appin. No. PCT/US2006/30173, dated Sep. 4, 2008, 7 pgs.
The State Intellectual Property Office of the People's Republic of China, “Notification of the 4th Office Action” in Application No. 200980102866.0, dated May 21, 2014, 10 Pgs.
Bradski et al., “Learning-based Computer Vision with Intel's Open Source Computer Vision Library”, Intel Technology, vol. 9, Issue 2, pp. 119-130. Published online May 2005.
Cootes et al., “Active Appearance Models”, Proceedings of European Conference on Computer Vision, 1998, (H. Burkhardt and B. Neumann Eds.), Springer, vol. 2, pp. 484-498. Published online 1998.
Corcoran et al., “Automatic Indexing of Consumer Image Collections Using Person Recognition Techniques”, Digest of Technical Papers, International Conference on Consumer Electronics, pp. 127-128. Published online Jan. 8, 2005.
Costache et al., “In-Camera Person-Indexing of Digital Images”, Digest of Technical Papers International Conference on Consumer Electronics, 2006, pp. 339-340.
Demirkir et al., “Face Detection Using Boosted Tree Classifier Stages”, Proceedings of the IEEE 12th Signal Processing and Communications Applications Conference, 2004, pp. 575-578.
Dornaika et al., “Fast and Reliable Active Appearance Model Search for 3-D Face Tracking”, Proceedings of Mirage 2003, INRIA Rocquencourt, France, pp. 113-122. Published online Mar. 10-11, 2003.
Drimbarean et al., “Image Processing Techniques to Detect and Filter Objectionable Images Based on Skin Tone and Shape Recognition”, International Conference on Consumer Electronics, 2004, pp. 278-279.
Huang et al., “Eye Tracking with Statistical Learning and Sequential Monte Carlo Sampling”, Proceedings of the Fourth International Conference on Information, Communications & Signal Processing and Fourth IEEE Pacific-Rim Conference on Multimedia (ICICS-PCM2003), 2003, vol. 3, pp. 1873-1878.
Kotsia et al., “Facial Expression Recognition in Image Sequences Using Geometric Deformation Features and Support Vector Machines”, IEEE Transactions On Image Processing, vol. 16, No. 1, pp. 172-187. Published online Jan. 2007.
Rowley et al., “Neural Network-based Face Detection”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, Issue 1, pp. 23-38. Published online Jan. 1998.
Saatci et al., “Cascaded Classification of Gender and Facial Expression Using Active Appearance Models”, 7th International Conference on Automatic Face and Gesture Recognition (FGR06), IEEE. Published online 2006.
Viola et al., “Rapid Object Detection Using a Boosted Cascade of Simple Features”, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001, vol. 1, pp. I-511-I-518.
Viola et al., “Robust Real-time Face Detection”, International Journal of Computer Vision, Kluwer Academic Publishers, 2004, vol. 57, Issue 2, pp. 137-154, published online Jan. 10, 2004.
Xin et al., “Real-time Human Face Detection in Color Image”, International Conference on Machine Learning and Cybernetics, 2003, vol. 5, pp. 2915-2920.
Yang et al., “Detecting Faces in Images: A Survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Jan. 2002, vol. 24, Issue 1, pp. 34-58.
Yao et al., “Tracking a Detected Face With Dynamic Programming”, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'04), 8 pgs. Published online 2004.
Zhao et al., “Face Recognition: A Literature Survey”, ACM Computing Surveys, 2003, vol. 35, No. 4, pp. 399-458. Published online Dec. 2003.
Zhu et al., “Fast Human Detection Using a Cascade of Histograms of Oriented Gradients”, Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, 2006, pp. 1491-1498.

Related Publications (1)

	Number	Date	Country
	20220103749 A1	Mar 2022	US

Provisional Applications (2)

	Number	Date	Country
	61024508	Jan 2008	US
	61023855	Jan 2008	US

Continuations (5)

	Number	Date	Country
Parent	17020535	Sep 2020	US
Child	17643162		US
Parent	15948848	Apr 2018	US
Child	17020535		US
Parent	15284280	Oct 2016	US
Child	15948848		US
Parent	14300150	Jun 2014	US
Child	15284280		US
Parent	12354707	Jan 2009	US
Child	14300150		US

Abstract