1. Field of the Invention
This invention relates to a method of and a system for image processing for detecting the widths of faces in face photographs, and to a computer program for the image processing.
2. Description of the Related Art
For example, an application for a passport or a license, or the preparation of a personal history, often requires a photograph of the applicant's face output to a predetermined standard (will be referred to as “a certification photograph”, hereinbelow). The output standard of certification photographs generally defines the length of the face (or of a part of the face) together with the length of the finished photograph in the vertical direction, whereas in the lateral direction it generally defines only the width of the finished photograph and does not define the width of the face.
In order to obtain such a certification photograph, various methods have been proposed. For example, Japanese Unexamined Patent Publication No. 11(1999)-341272 discloses a method in which, with an image of a face for use in a certification photograph displayed on a display device such as a monitor, the positions of the vertex and the tip of the chin (will be referred to as “the position of the chin”, hereinbelow) of the displayed face are designated, and a computer obtains the position and the size of the face on the basis of the two designated positions, enlarges or contracts the image of the face according to the output standard of certification photographs, and trims the enlarged or contracted image so that the face is positioned in a predetermined position in the certification photograph. With this method, the user can request a certification photograph of a DPE shop or the like, and can have the certification photograph made from a photograph which he or she favors.
Further, as disclosed in Japanese Unexamined Patent Publication No. 2004-005384 and U.S. Patent Application Publication No. 20050013599, methods have been proposed in which, instead of the operator's manual designation, parts such as the eyes and the mouth are detected from the image of the face, the positions of the vertex and the chin are estimated on the basis of the detected positions of those parts, and the trimming is carried out on the basis of the estimated positions of the vertex and the chin to form a certification photograph.
Recently, however, on the basis of increasing security requirements, there is a tendency for the standard of certification photographs to define the width of the face together with its length, and accordingly it becomes necessary to grasp the width of the face in face photographs and to trim the photographs accordingly.
In fields other than certification photographs, the width of the face in face photographs is also sometimes necessary. For example, when graduation albums are prepared, it is desired that the faces in the photographs in each finished album be of substantially the same size. In order to unify the sizes of the faces, it is necessary to obtain not only the length of the face but also its width, and to make the faces substantially the same in area.
In order to make a photograph in which the width of the face in the finished photograph meets the standard, it is thus necessary to grasp the width of the face in the original photographic image. However, there has conventionally been no way to detect the width of the face in photographic images of faces.
In view of the foregoing observations and description, the primary object of the present invention is to provide a method of and a system for image processing for detecting the widths of faces in face photographs, for trimming which meets the strict standard of certification photographs or for image processing which unifies the sizes of the faces in a plurality of photographic images, and a computer program for the image processing.
In accordance with the present invention, there is provided an image processing method for detecting the lateral widths of faces in face photographs comprising the steps of
detecting a skin-colored area in a face,
obtaining a lateral width of the detected skin-colored area at each of a plurality of positions along a direction from the vertex to the chin of the face, and
determining the lateral width in a predetermined position in a range from a first position to a second position as the lateral width of the face, the first position being a position in which the lateral width discontinuously increases, and the second position being a position which is nearer to the chin than the first position and is one position remoter from the chin than the position in which the lateral width discontinuously decreases.
In the image processing method of the present invention, it is preferred that the largest lateral width in the range from the first position to the second position be determined as the lateral width of the face.
In the image processing method of the present invention, the larger of the lateral width in the first position and that in the second position may be determined as the lateral width of the face.
In the image processing method of the present invention, it is preferred that the skin-colored area in the face be detected by setting an area which is estimated to be of a skin color in the face as a reference area, detecting from the face pixels whose color approximates the color of the reference area, and taking the area formed by the detected pixels as the skin-colored area.
It is preferred that the reference area be an area between the eyes and the tip of the nose in the face.
In accordance with the present invention, there is further provided an image processing system for detecting the lateral widths of faces in face photographs comprising
a skin-colored area detecting means for detecting a skin-colored area in a face;
a width obtaining means for obtaining a lateral width of the detected skin-colored area at each of a plurality of positions along a direction from the vertex to the chin of the face; and
a face width determining means for determining the lateral width in a predetermined position in a range from a first position to a second position as the lateral width of the face, the first position being a position in which the lateral width discontinuously increases, and the second position being a position which is nearer to the chin than the first position and is one position remoter from the chin than the position in which the lateral width discontinuously decreases.
In the image processing system of the present invention, it is preferred that the face width determining means determine the largest lateral width in the range from the first position to the second position as the lateral width of the face.
In the image processing system of the present invention, the face width determining means may determine the larger of the lateral width in the first position and that in the second position as the lateral width of the face.
In the image processing system of the present invention, it is preferred that the skin-colored area detecting means comprise a reference area setting means which sets, as a reference area, an area which is estimated to be of a skin color in the face, and a skin-colored pixel detecting means which detects from the face pixels whose color approximates the color of the reference area and detects, as the skin-colored area, the area formed by the detected pixels.
It is preferred that the reference area setting means set an area between the eyes and the tip of the nose in the face as the reference area.
Further, a computer program for causing a computer to execute the image processing method of the present invention may be recorded in computer readable media. A skilled artisan would know that the computer readable media are not limited to any specific type of storage devices and include any kind of device, including but not limited to CDs, floppy disks, RAMs, ROMs, hard disks, magnetic tapes and internet downloads, in which computer instructions can be stored and/or transmitted. Transmission of the computer code through a network or through wireless transmission means is also within the scope of this invention. Additionally, computer code/instructions include, but are not limited to, source, object and executable code and can be in any language including higher level languages, assembly language and machine language.
In accordance with the image processing method and system of the present invention, a skin-colored area is detected in an image of the face. On the basis of the fact that the lateral width of the human face abruptly increases at the upper root of the ears and abruptly decreases at the lower root of the ears, the position where the lateral width of the skin-colored area discontinuously increases is obtained as a first position (i.e., the upper root of the ears), while the position which is nearer to the chin than the first position and is one position remoter from the chin than the position in which the lateral width discontinuously decreases is obtained as a second position (i.e., the lower root of the ears). Then, on the basis of the fact that the lateral width of the human face hardly changes between the upper root and the lower root of the ears, the lateral width in a predetermined position in this range is obtained as the lateral width of the face. In this way, the lateral width of the face can be reliably obtained.
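The width-profile analysis described above can be illustrated with a short sketch in Python, assuming the skin-colored area is given as a binary mask whose rows run from the vertex toward the chin; the jump_ratio threshold used to decide that a change is discontinuous is an illustrative assumption, not a value given in this description.

```python
import numpy as np

def face_width_from_mask(mask, jump_ratio=1.2):
    """Estimate the lateral face width from a binary skin-area mask.

    mask: 2D bool array, rows ordered from the vertex (top) toward the chin.
    jump_ratio: factor by which a row-to-row width change counts as a
    discontinuous increase/decrease (an assumed, illustrative threshold).
    """
    # Lateral width of the skin-colored area at each vertical position:
    # distance between the leftmost and rightmost skin pixels in the row.
    widths = []
    for row in mask:
        cols = np.flatnonzero(row)
        widths.append(cols[-1] - cols[0] + 1 if cols.size else 0)

    first = second = None
    for y in range(1, len(widths)):
        prev, cur = widths[y - 1], widths[y]
        if prev and cur > prev * jump_ratio and first is None:
            first = y            # discontinuous increase: upper root of the ears
        elif first is not None and cur < prev / jump_ratio:
            second = y - 1       # one position above the decrease: lower root
            break

    if first is None or second is None:
        return max(widths) if widths else 0   # fall back when no jumps found
    return max(widths[first:second + 1])      # largest width in the range
```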
Though the lateral width in any position in the range may be taken as the lateral width of the face, the lateral width of the face can be more accurately obtained if the largest lateral width in the range is determined as the lateral width of the face.
Further, since, statistically, the lateral width of the human face is most often maximized at the position of the upper root or the lower root of the ears, the lateral width of the face can be rapidly obtained when the larger of the lateral width at the upper root of the ears and that at the lower root of the ears is determined as the lateral width of the face.
In this invention, it is necessary to detect a skin-colored area in the face in order to detect the lateral width of the face in each of the positions. However, the color of human skin varies greatly depending on race, degree of suntan, and the like. Accordingly, when the skin-colored area in the face is detected by setting an area which is estimated to be of a skin color in the face as a reference area and detecting from the face pixels whose color approximates the color of the reference area, the skin color can be reliably detected without being affected by differences among individuals, which leads to an accurate detection of the lateral width of the face.
As shown in the figure, the image processing system of this embodiment comprises an image input portion 10, a face detecting portion 20, an eye detecting portion 30, a database 40, a smoothening portion 50, a reference area setting portion 60, a skin-colored area extracting portion 70, a face area mask generating portion 80, and a face lateral width obtaining portion 90.
The image input portion 10 is for inputting the photographic images S0 to be processed by the image processing system of this embodiment and may comprise, for instance, a receiving portion which receives the photographic images S0 sent by way of a network, a read-out portion which reads out the photographic images S0 from a recording medium such as a CD-ROM, a scanner which photoelectrically reads an image printed on a printing medium such as paper or printing paper, and the like.
The first characteristic value calculating portion 22 of the face detecting portion 20 calculates the characteristic value C0 for use in distinguishment of a face from the photographic images S0. For example, the first characteristic value calculating portion 22 calculates gradient vectors as the characteristic value C0. Calculation of the gradient vectors will be described hereinbelow. The first characteristic value calculating portion 22 first detects horizontal edges by carrying out filtering on the images S0 by the use of a horizontal edge detecting filter shown in
The gradient vectors K thus calculated, in the case of a face of a person as shown in
The direction and the size of the gradient vector K are taken as the characteristic value C0. The direction of the gradient vector K takes a value of 0 to 359° measured with respect to a predetermined direction (e.g., the x direction in
Then the size of the gradient vector K is normalized. The normalization is effected by obtaining a histogram of the sizes of the gradient vectors K of all the pixels in the images S0, smoothening the histogram so that the distribution of the sizes of the gradient vectors K is uniform over the range of values which the pixels in the images S0 can take (0 to 255 in the case of 8-bit signals), and correcting the sizes of the gradient vectors K on the basis of the smoothening. For example, when the sizes of the gradient vectors K are small and the histogram thereof leans toward the smaller side as shown in
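As an illustration, the gradient-vector calculation and the histogram-based normalization described above might be sketched as follows; the 1×3 and 3×1 kernels merely stand in for the horizontal and vertical edge detecting filters of the figures, which are not reproduced here.

```python
import numpy as np
from scipy.ndimage import convolve

def gradient_vectors(gray):
    """Compute per-pixel gradient vectors and normalize their magnitudes.

    gray: 2D uint8 array. Simple 1x3 / 3x1 edge kernels stand in for the
    horizontal/vertical edge detecting filters (illustrative assumption).
    """
    h = convolve(gray.astype(float), np.array([[-1, 0, 1]]))      # horizontal edges
    v = convolve(gray.astype(float), np.array([[-1], [0], [1]]))  # vertical edges
    direction = np.degrees(np.arctan2(v, h)) % 360                # 0..359 degrees
    size = np.hypot(h, v)

    # Normalize magnitudes by flattening (equalizing) their histogram
    # over the range of values the pixels can take (0..255 for 8 bits).
    flat = np.clip(size, 0, 255).astype(np.uint8).ravel()
    hist = np.bincount(flat, minlength=256).astype(float)
    cdf = hist.cumsum() / hist.sum()
    normalized = (cdf[flat] * 255).reshape(size.shape)
    return direction, normalized
```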
The reference data E1 stored in the database 40 defines, for each of a plurality of pixel groups each comprising a combination of a plurality of pixels selected from sample images to be described later, distinguishing conditions for the combination of the characteristic values C0 in each of the pixels forming the pixel group.
The distinguishing conditions for the combination of the characteristic values C0 in each of the pixels forming each pixel group in the reference data E1 have been determined in advance by learning a plurality of sample images which are known to be images of faces and a plurality of sample images which are known not to be images of faces.
When the reference data E1 is generated in this embodiment, as the sample images which are known to be images of faces, there are employed sample images each of which has a size of 30×30 pixels, in which the center-to-center distance between the eyes is 9 pixels, 10 pixels, or 11 pixels, and which are obtained by rotating stepwise by 3° within ±15° in the plane (i.e., −15°, −12°, −9°, −6°, −3°, 0°, 3°, 6°, 9°, 12°, 15°) the image of a face whose vertical direction is perpendicular to the straight line joining the centers of the eyes, as shown in
As the sample images which are known not to be images of faces, arbitrary sample images each of which has a size of 30×30 pixels are employed.
In the case where only sample images in which the center-to-center distance between the eyes is 10 pixels and the rotational angle is 0° (that is, images where the face is vertical) are learned as the images of faces, only faces which are 10 pixels in the center-to-center distance between the eyes and are not rotated at all will be distinguished as a face image when the reference data E1 is referred to. The sizes of the faces which can be included in the photographic images S0 are not constant. Accordingly, the photographic images S0 are enlarged or contracted so that a face which conforms in size to the sample images can be distinguished when determining whether a face image is included in the photographic images S0, as will be described later. However, in order to enlarge or contract an image so that the center-to-center distance between the eyes thereof is accurately 10 pixels, it would be necessary to effect the distinguishment while the photographic images S0 are enlarged or contracted stepwise by a small factor of, for instance, 1.1, which results in a vast amount of calculation.
The face images which can be included in the photographic images S0 can include not only the images where the face rotational angle is 0° as shown in
Accordingly, in this embodiment, the sample images in which the center-to-center distances between the eyes are 9 pixels, 10 pixels, and 11 pixels and which are obtained by rotating the image of the face stepwise by 3° within ±15° in the plane as shown in
An example of learning the sample image group will be described with reference to the flow chart shown in
In making these histograms, the value of the combination of the quaternarized direction and the ternarized size of the gradient vector is calculated as follows:
the value of the combination = 0 (when the size of the gradient vector = 0)
the value of the combination = (the direction of the gradient vector + 1) × the size of the gradient vector (when the size of the gradient vector > 0)
Since the number of distinct combination values becomes 9 with this arrangement, the number of pieces of data on the characteristic value can be reduced.
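In code, the combination value might be formed as below; the 90° quantization of the direction and the ternarization of the nonzero size into {1, 2, 3} are assumptions made for illustration.

```python
def combination_value(direction_deg, magnitude):
    """Combine a quaternarized direction and a ternarized magnitude into
    one value, per the formulas above. The 90-degree direction bins and
    the {1, 2, 3} nonzero magnitude levels are illustrative assumptions."""
    if magnitude == 0:
        return 0
    d = int(direction_deg // 90) % 4   # four-valued direction: 0..3
    return (d + 1) * magnitude         # magnitude assumed in {1, 2, 3}
```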
Similarly, a histogram is made for the sample images which are known not to be images of faces. For these sample images, pixels at the positions corresponding to the pixels P1 to P4 on the sample images which are known to be images of faces are used. The histogram representing the logarithmic values of the ratios of the frequencies shown by the two histograms is the histogram shown on the rightmost side of
Then, out of the distinguishers made in step S2, the distinguisher which is the most effective in distinguishing whether an image is of a face is selected. This selection is effected taking into account the weights of the sample images. In this example, the weighted ratios of correct answers of the distinguishers are compared, and the distinguisher exhibiting the highest weighted ratio of correct answers is selected. (step S3) That is, since the sample images are initially equally weighted by 1, in the first step S3 the distinguisher which correctly distinguishes the largest number of sample images as images of faces is selected as the most effective distinguisher. In the second and later steps S3, after the weight of each sample image has been updated in step S5 as will be described later, sample images whose weight is 1, larger than 1, and smaller than 1 mingle with each other, and a sample image whose weight is larger than 1 counts for more than a sample image whose weight is 1 in the evaluation of the ratio of correct answers. By this, in the second and later steps S3, more importance is put on the heavily weighted sample images than on the lightly weighted ones.
Then whether the ratio of correct answers of the combination of the distinguishers selected up to that time, that is, the ratio at which the result of distinguishing whether the sample images are images of faces by the use of the combined distinguishers conforms to whether the sample images are actually images of faces, exceeds a predetermined threshold value is checked. (step S4) The sample images used here in the evaluation of the ratio of correct answers may be the sample images with their current weights or the equally-weighted sample images. When the ratio exceeds the predetermined threshold value, the learning is ended, since whether an image is of a face can be distinguished at a sufficiently high probability by the use of the distinguishers selected up to that time. When the ratio does not exceed the predetermined threshold value, the processing proceeds to step S6 in order to select one or more additional distinguishers to be combined with the distinguishers selected up to that time.
In step S6, in order for the distinguisher(s) selected in the preceding step S3 not to be selected again, the once-selected distinguisher(s) is omitted.
Then, the weight on each sample image which was not correctly distinguished in the preceding step S3 is increased, and the weight on each sample image which was correctly distinguished is reduced. (step S5) The reason why the weights are increased or reduced is that importance is put on the images which were not correctly distinguished by the already-selected distinguishers, so that a distinguisher which can correctly distinguish those images is selected, thereby enhancing the effect of the combination of the distinguishers.
Thereafter, the processing returns to step S3, where the next most effective distinguisher is selected on the basis of the weighted ratio of correct answers as described above.
After distinguishers corresponding to the combinations of characteristic values C0 in each of the pixels forming particular pixel groups have been selected as distinguishers suitable for distinguishing whether an image includes a face by repeating steps S3 to S6, the kinds of the distinguishers and the distinguishing conditions used in the distinguishment are decided. (step S7) Then the learning of the reference data E1 is ended.
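The selection-and-reweighting loop of steps S3 to S6 resembles a boosting procedure, and a minimal sketch might look as follows; the reweighting factor of 1.5 and the target accuracy are illustrative assumptions, and preds is assumed to hold precomputed ±1 outputs of every candidate distinguisher on every sample image.

```python
import numpy as np

def learn_reference_data(preds, labels, target_accuracy=0.98):
    """Select distinguishers (weak classifiers) by the weighted-accuracy loop.

    preds: (n_distinguishers, n_samples) array of +-1 predictions, one row
           per candidate distinguisher evaluated on every sample image.
    labels: (n_samples,) array of +-1 ground truth (face / non-face).
    """
    n_dist, n_samp = preds.shape
    weights = np.ones(n_samp)            # all sample images start with weight 1
    available = set(range(n_dist))
    selected = []

    while True:
        # Step S3: pick the distinguisher with the highest weighted accuracy.
        best, best_acc = None, -1.0
        for i in available:
            acc = weights[preds[i] == labels].sum() / weights.sum()
            if acc > best_acc:
                best, best_acc = i, acc
        selected.append(best)

        # Step S4: stop once the combined vote is accurate enough.
        vote = np.sign(preds[selected].sum(axis=0))
        if (vote == labels).mean() >= target_accuracy or len(available) == 1:
            return selected

        available.remove(best)           # step S6: don't pick it again
        correct = preds[best] == labels  # step S5: reweight the samples
        weights[~correct] *= 1.5         # emphasize misclassified samples
        weights[correct] /= 1.5
```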
When the learning procedure described above is employed, the distinguishers need not be limited to those in the form of a histogram but may be of any form so long as they provide data on the basis of which whether an image is of a face can be distinguished by the use of the combination of characteristic values C0 in each of the pixels forming a particular pixel group; e.g., a distinguisher may be two-valued data, a threshold value or a function. Further, in the same histogram form, a histogram representing the distribution of the difference between the two histograms shown at the middle of
Further, the learning procedure need not be limited to that described above; other machine learning procedures such as a neural network may be employed.
The face detection performing portion 24 refers to the distinguishing conditions which the reference data E1 has learned for all the combinations of characteristic values C0 in each of the pixels forming each of the plurality of pixel groups, obtains the distinguishing point for the combination of characteristic values C0 in each of the pixels forming each pixel group, and detects a face on the basis of all the distinguishing points. At this time, the direction and the size of the gradient vector, which are the characteristic value C0, are four-valued and three-valued, respectively. In this embodiment, all the distinguishing points are summed, and a face is detected on the basis of whether the sum is positive or negative and of the magnitude of the sum. For example, when the sum of the distinguishing points is positive, it is determined that the image is of a face, whereas when the sum is negative, it is determined that the image is not of a face.
The photographic images S0 can differ from the sample images of 30×30 pixels and can be of various sizes. Further, when an image includes a face, the face is sometimes rotated by an angle other than 0°. Accordingly, the face detection performing portion 24, while enlarging or contracting the photographic image S0 stepwise until the vertical side or the horizontal side thereof becomes 30 pixels and stepwise rotating it through 360° in the plane, sets a mask of 30×30 pixels on the photographic image S0 at each stage of the deformation, and distinguishes whether the image in the mask is an image of a face while moving the mask one pixel by one pixel on the image.
Further, since the center-to-center distances between the eyes are 9, 10, or 11 pixels in the sample images employed in the learning to generate the reference data E1, the ratio of enlargement or contraction of the photographic image S0 may be as coarse as 11/9. Further, since in the sample images used in the learning, faces are rotated within ±15° in the plane, the photographic images S0 have only to be rotated through 360° in 30° increments.
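A sketch of this coarse scale-and-rotation sweep follows; the score_fn callback, which is assumed to return the summed distinguishing points for a 30×30 patch, stands in for the lookups against the reference data E1.

```python
import numpy as np
from PIL import Image

def scan_for_faces(image, score_fn, scale_step=11/9, angle_step=30, win=30):
    """Sweep scales and in-plane rotations, sliding a win x win mask one
    pixel at a time. score_fn returns the summed distinguishing points for
    a patch (a stand-in assumption); a positive sum is taken as a face."""
    hits, scale = [], 1.0
    scaled = image
    while min(scaled.size) >= win:
        for angle in range(0, 360, angle_step):   # 30-degree steps suffice
            arr = np.asarray(scaled.rotate(angle).convert("L"))
            for y in range(arr.shape[0] - win + 1):
                for x in range(arr.shape[1] - win + 1):
                    score = score_fn(arr[y:y + win, x:x + win])
                    if score > 0:
                        hits.append((scale, angle, x, y, score))
        scale /= scale_step                       # contract by 11/9 each pass
        scaled = image.resize((max(1, int(image.width * scale)),
                               max(1, int(image.height * scale))))
    return hits
```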
The first characteristic value calculating portion 22 calculates the characteristic value C0 on each stage of deformation of the photographic images S0, e.g., enlargement/contraction or rotation.
The face detecting portion 20 thus detects the approximate positions and sizes of the faces from the photographic images S0 and obtains the face images S1.
The eye detecting portion 30 detects the positions of the eyes from the face images S1 obtained by the face detecting portion 20 and
The position of the eye to be distinguished by the eye detection performing portion 34 is the center between the outer corner and the inner corner of the eye, indicated at × in
Since being the same as the first characteristic value calculating portion 22 in the face detecting portion 20 shown in
The second reference data E2 stored in the database 40 defines, for each of a plurality of pixel groups each comprising a combination of a plurality of pixels selected from sample images to be described later, distinguishing conditions for the combination of the characteristic values C0 in each of the pixels forming the pixel group, as does the first reference data E1.
For the learning of the second reference data E2, there are used sample images which are 9.7 pixels, 10 pixels, and 10.3 pixels in the center-to-center distance between the eyes and which are obtained by rotating the image of the face stepwise by 1° within ±3° in the plane as shown in
The eye detection performing portion 34, referring to the distinguishing conditions which the second reference data E2 has learned for all the combinations of the characteristic values C0 in each of the pixels forming each of the pixel groups, obtains the distinguishing point for the combination in each of the pixel groups and distinguishes the position of the eyes included in the face on the basis of all the distinguishing points. At this time, the direction and the size of the gradient vector K, which are the characteristic values C0, are respectively four-valued and three-valued.
The eye detection performing portion 34, while stepwise enlarging or contracting the face image S1 obtained by the face detecting portion 20 and stepwise rotating it through 360° in the plane, sets a mask M of 30×30 pixels on the face image enlarged or contracted in each step, and detects the position of the eyes while moving the mask M one pixel by one pixel on the enlarged or contracted face image.
Further, since the center-to-center distances between the eyes are 9.7, 10, or 10.3 pixels in the sample images employed in the learning to generate the second reference data E2, the ratio of enlargement or contraction of the face image S1 may be 10.3/9.7. Further, since in the sample images used in the learning of the reference data E2, faces are rotated within ±3° in the plane, the face images S1 have only to be rotated through 360° in 6° increments.
The second characteristic value calculating portion 32 calculates the characteristic value C0 on each stage of deformation, e.g., enlargement/contraction or the rotation of the face images S1.
Then, in this embodiment, all the distinguishing points are summed on each of the stages of deformation of the face image S1, and in the image within the 30×30-pixel mask M on the stage of deformation where the sum is the largest, a coordinate system having its origin at the upper left corner is set. Then the positions corresponding to the coordinates (x1, y1) and (x2, y2) of the positions of the eyes in the sample images are obtained, and the positions corresponding to these positions in the face image S1 before deformation are detected as the positions of the eyes.
The eye detecting portion 30 thus detects positions of the eyes from the face image S1 obtained by the face detecting portion 20.
The smoothening portion 50 carries out smoothening processing on the face image S1 in order to facilitate the later extraction of a skin-colored area; in this particular embodiment, it obtains a smoothened image S2 by applying a Gaussian filter as the smoothening filter to the face image S1. The smoothening portion 50 carries out the smoothening processing separately for each of the R, G, and B channels of the face image S1.
The reference area setting portion 60 sets, as a reference area, an area which is certainly of a skin color in the face image S1; in this particular embodiment, it sets as the reference area an area from below the lower edge of the eyes (a position near the eyes) to above the tip of the nose (a position near the tip of the nose). Specifically, the reference area setting portion 60 first calculates the eye-to-eye distance D in the face image S1 from the positions of the eyes (points A1 and A2 shown in
The reference area setting portion 60 estimates the position which is D/10 below the centers of the eyes (broken line L1 in
The reference area setting portion 60 sets the reference area within the area between the lines L1 and L4 thus obtained. Since the line L1 is below the lower edges of the eyes and the line L4 is above the tip of the nose, the eyelashes, the pupils and the moustache are excluded from the area between the lines L1 and L4. Accordingly, any part within this area may be considered to be of a skin color. However, in this embodiment, in order to avoid the influence of a moustache on the outer sides of the cheeks, the part which has a width equal to the eye-to-eye distance D in the face image S1 and is laterally in the middle of the area between the lines L1 and L4 (the hatched portion in
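The geometry of the reference area might be computed as in the following sketch; since the derivation of the line L4 is not fully reproduced above, the nose_offset factor placing it below the eyes is a hypothetical stand-in.

```python
import numpy as np

def set_reference_area(a1, a2, nose_offset=0.6):
    """Compute the skin-color reference area from the two eye positions.

    a1, a2: (x, y) eye centers. The area spans from D/10 below the eye
    level (line L1 in the text) down to just above the nose tip (line L4);
    the nose_offset factor placing L4 at eye level + nose_offset * D is a
    hypothetical stand-in for the truncated derivation in the text.
    The area is D wide, centered laterally between the eyes.
    """
    (x1, y1), (x2, y2) = a1, a2
    d = float(np.hypot(x2 - x1, y2 - y1))     # eye-to-eye distance D
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2     # midpoint between the eyes
    top = cy + d / 10                         # line L1: D/10 below the eyes
    bottom = cy + nose_offset * d             # line L4: above the nose tip
    left, right = cx - d / 2, cx + d / 2      # width D, centered on the face
    return int(left), int(top), int(right), int(bottom)
```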
The reference area setting portion 60 outputs information representing the reference area thus set to the skin-colored area extracting portion 70.
The skin-colored area extracting portion 70 extracts a skin-colored area from the smoothened face image S2 and has a structure shown in
The reference area characteristic value calculating portion 72 calculates a mean angle α of hue of the image in the reference area of the smoothened face image S2 as the characteristic value of the reference area.
The skin-colored pixel extracting portion 74 extracts all the pixels in the smoothened face image S2 which are of a color approximate to the color of the reference area. For example, the skin-colored pixel extracting portion 74 extracts the pixels which meet both of the following conditions (sketched in code after the list).
1. R≧G≧K×B (R, G and B respectively represent the R, G and B values of the pixel, and K represents a coefficient in the range of 0.9 to 1.0; 0.95 here)
2. The difference between the angle of hue of the pixel and the mean angle α of hue of the reference area is smaller than a predetermined hue-range threshold value (e.g., 20°).
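The two conditions can be tested per pixel as in this sketch; the use of colorsys for the hue angle is an illustrative choice.

```python
import colorsys

def is_skin_pixel(r, g, b, mean_hue_deg, k=0.95, hue_range=20):
    """Test the two conditions above for one pixel (8-bit R, G, B values).

    mean_hue_deg is the mean hue angle of the reference area; computing
    the hue with colorsys is an illustrative choice."""
    # Condition 1: R >= G >= K * B
    if not (r >= g >= k * b):
        return False
    # Condition 2: hue within hue_range degrees of the reference mean.
    h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    diff = abs(h * 360 - mean_hue_deg) % 360
    return min(diff, 360 - diff) < hue_range
```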
The skin-colored area extracting portion 70 takes the area formed by the pixels extracted by the skin-colored pixel extracting portion 74 as the skin-colored area and outputs information representing the position of the skin-colored area to the face area mask generating portion 80.
The face area mask generating portion 80 generates a face area mask image S5 from the smoothened face image S2 in order to facilitate detection of the lateral width of a face and
The two-valued image generating portion 82 carries out two-value transformation on the smoothened face image S2, in which the pixels in the skin-colored area are transformed into white pixels (that is, the value of the pixels is set to, for instance, 255, the maximum value of the dynamic range) and the pixels outside the skin-colored area are transformed into black pixels (that is, the value of the pixels is set to 0), on the basis of the information representing the position of the skin-colored area extracted by the skin-colored area extracting portion 70, and obtains a two-valued image S3 such as shown in
The noise removing portion 84 carries out removal of noise on the two-valued image S3 in order to facilitate detection of the lateral width of a face and obtains a noise-removed image S4. The noise to be removed by the noise removing portion 84 may include, in addition to noise in the usual sense, noise which can make it difficult to detect the lateral width of a face or can give an inaccurate result of detection. In the image processing system of this embodiment, the noise removing portion 84 carries out the removal of noise in the following manner.
1. Removal of an Isolated Small Area
An “isolated small area” as used here means an area which is surrounded by the skin-colored area, is isolated from other non-skin-colored areas, and is smaller than a predetermined threshold value; it may be, for instance, an eye (pupil) or a nostril in the face. Further, the black dot-like noise on the forehead in the example shown in
The noise removing portion 84 removes such an isolated area from the two-valued image S3 by transforming the pixels thereof into white pixels.
2. Removal of an Elongated Area
An “elongated area” as used here means a laterally extending elongated black area. The noise removing portion 84 carries out scanning on the two-valued image S3, with the main scanning direction and the sub-scanning direction respectively extending in the vertical and lateral directions of the face, to detect such an elongated area, and removes the detected elongated area from the two-valued image S3 by transforming the pixels thereof into white pixels.
By this, a frame of glasses or hair hanging over the eyebrow or the face can be removed.
The laterally discontinuous area removing portion 86 carries out processing, in which a skin-colored area which is discontinuous in the lateral direction is removed, on the noise-removed image S4 obtained by the noise removing portion 84, and obtains the face area mask image S5. Specifically, the laterally discontinuous area removing portion 86 carries out scanning on the noise-removed image S4, with the main scanning direction and the sub-scanning direction respectively extending in the vertical and lateral directions of the face, to detect a discontinuous position where the skin-colored area (represented by white dots) is laterally discontinuous, and removes the part of the skin-colored area on the side of the detected position remoter from the center of the skin-colored area by transforming the pixels thereof into black pixels.
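A sketch of this row-by-row removal follows; it keeps, in each row, the white run nearest the lateral center of the image, the image center being used here as a stand-in for the center of the skin-colored area.

```python
import numpy as np

def remove_lateral_discontinuities(mask):
    """Keep, in each row, only the white run closest to the lateral center,
    clearing runs separated from it by black gaps (a sketch of the
    row-by-row scan described above)."""
    h, w = mask.shape
    center = w // 2
    out = np.zeros_like(mask)
    for y in range(h):
        x, best = 0, None
        while x < w:                          # find contiguous white runs
            if mask[y, x]:
                start = x
                while x < w and mask[y, x]:
                    x += 1
                mid = (start + x) / 2         # midpoint of run [start, x)
                if best is None or abs(mid - center) < abs(sum(best) / 2 - center):
                    best = (start, x)
            else:
                x += 1
        if best:
            out[y, best[0]:best[1]] = True    # keep only the centermost run
    return out
```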
The face lateral width obtaining portion 90 obtains the lateral width W of the face by the use of the face area mask image S5, and
As can be understood from the description above, in the image processing system of this embodiment, on the basis of the fact that the lateral widths at the respective vertical positions in the range from the upper root of the ear to the lower root of the ear of a human face are larger than those of other parts, this range is detected and the largest lateral width in this range is obtained as the lateral width of the face. With this arrangement, the lateral width of the face can be reliably and accurately obtained, and processing requiring the lateral width of the face, such as trimming for a certification photograph whose standard defines the lateral width of the face, or for a graduation album where the sizes of the faces must be unified, is permitted.
Further, in the image processing system of this embodiment, since an area which is surely of a skin color in the face is set as the reference area and the area formed by pixels having a color approximate to that of the reference area is detected as the skin-colored area, the skin-colored area can be reliably detected without being affected by differences among individuals, which leads to an accurate detection of the lateral width of the face.
Though a preferred embodiment of the present invention has been described above, the image processing method and system and the computer program therefor need not be limited to the embodiment described above, but may be variously modified within the scope of the spirit of the present invention.
For example, the skin-colored area may be detected by a method other than that of the skin-colored area extracting portion 70 in the embodiment described above. Specifically, pixels having a color included in a skin color range set on the basis of the values of the pixels in the reference area may be detected as skin-colored pixels in a two-dimensional plane having, as its two coordinate axes, r and g respectively represented by r=R/(R+G+B) and g=G/(R+G+B), wherein R, G and B respectively represent the R, G and B values. The skin color range may be set, for instance, by obtaining the average values of r and g in the reference area and setting, as the skin color range, the range where a predetermined range centered on the average value of r intersects with a predetermined range centered on the average value of g.
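This alternative might be sketched as follows; the margin half-width of the rectangular skin color range is an illustrative assumption.

```python
import numpy as np

def skin_mask_rg(image, ref_box, margin=0.03):
    """Detect skin pixels in normalized r-g chromaticity space, as
    described above. margin (the half-width of the rectangular skin color
    range around the reference means) is an illustrative assumption.

    image: HxWx3 array of R, G, B values; ref_box: (left, top, right,
    bottom) reference area."""
    rgb = image.astype(float)
    total = rgb.sum(axis=2) + 1e-9            # avoid division by zero
    r = rgb[..., 0] / total                   # r = R / (R + G + B)
    g = rgb[..., 1] / total                   # g = G / (R + G + B)
    left, top, right, bottom = ref_box
    r0 = r[top:bottom, left:right].mean()     # reference means of r and g
    g0 = g[top:bottom, left:right].mean()
    return (np.abs(r - r0) < margin) & (np.abs(g - g0) < margin)
```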
Further, the lateral width of the face may be determined as the larger of the lateral width in the position of the upper root of the ear and that in the position of the lower root of the ear.
Further, though, in the image processing system shown in
Further, the reference area may be set in a manner other than that used in the image processing system shown in
Number | Date | Country | Kind |
---|---|---|---|
358012/2004 | Dec 2004 | JP | national |