The present invention relates to an image-processing device and an image-processing program.
In a method known in the related art, the position taken by a human body, centered on a person's face and skin color, is determined and the attitude of the human body is then estimated by using a human body model (see patent literature 1).
Patent literature 1: Japanese patent No. 4295799
However, the method in the related art described above has a drawback in that the human body position detection capability is greatly compromised whenever skin color cannot be detected.
(1) An image-processing device according to a first aspect of the present invention comprises: a face detection unit that detects a face of an animal in an image; a candidate area setting unit that sets an animal body candidate area for a body of the animal in the image based upon face detection results provided by the face detection unit; a reference image acquisition unit that obtains a reference image; a similarity calculation unit that divides the animal body candidate area having been set by the candidate area setting unit into a plurality of small areas and calculates a level of similarity between an image in each of the plurality of small areas and the reference image; and a body area estimating unit that estimates an animal body area corresponding to the body of the animal from the animal body candidate area based upon levels of similarity having been calculated for the plurality of small areas by the similarity calculation unit.
(2) According to a second aspect of the present invention, in the image-processing device according to the first aspect, it is preferable that the candidate area setting unit sets the animal body candidate area in the image in correspondence to a size and a tilt of the face of the animal having been detected by the face detection unit.
(3) According to a third aspect of the present invention, in the image-processing device according to the first or second aspect, it is preferable that the face detection unit sets a rectangular frame depending on a size and a tilt of the face of the animal at a position of the face of the animal in the image; and the candidate area setting unit sets the animal body candidate area by placing a specific number of rectangular frames, each identical to the rectangular frame having been set by the face detection unit, next to one another.
(4) According to a fourth aspect of the present invention, in the image-processing device according to the third aspect, it is preferable that the similarity calculation unit defines the plurality of small areas by dividing each of the plurality of rectangular frames that forms the animal body candidate area into a plurality of areas.
(5) According to a fifth aspect of the present invention, in the image-processing device according to the fourth aspect, it is preferable that the reference image acquisition unit further sets second small areas each contained within one of the rectangular frames and having a size matching a size of the plurality of small areas, and obtains images in a plurality of second small areas so as to use each image as the reference image; and the similarity calculation unit calculates levels of similarity between images in the individual small areas and the image in each of the plurality of second small areas.
(6) According to a sixth aspect of the present invention, in the image-processing device according to the fifth aspect, it is preferable that the reference image acquisition unit sets each of the second small areas at a center of one of the rectangular frames.
(7) According to a seventh aspect of the present invention, in the image-processing device according to any one of the first through sixth aspects, it is preferable that the similarity calculation unit applies a greater weight to a level of similarity calculated for a small area, among the plurality of small areas set within the animal body candidate area, which is closer to the face of the animal having been detected by the face detection unit.
(8) According to an eighth aspect of the present invention, in the image-processing device according to any one of the first through seventh aspects, it is preferable that the similarity calculation unit calculates levels of similarity by comparing one of, or a plurality of parameters among luminance, frequency, edge component, chrominance and hue between the images in the small areas and the reference image.
(9) According to a ninth aspect of the present invention, in the image-processing device according to any one of the first through eighth aspects, it is preferable that the reference image acquisition unit uses an image stored in advance as the reference image.
(10) According to a tenth aspect of the present invention, in the image-processing device according to any one of the first through ninth aspects, it is preferable that the face detection unit detects a face of a person in an image as the face of the animal; the candidate area setting unit sets a human body candidate area for a body of the person in the image as the animal body candidate area based upon the face detection results provided by the face detection unit; the similarity calculation unit divides the human body candidate area having been set by the candidate area setting unit into a plurality of small areas and calculates levels of similarity between images in the plurality of small areas and the reference image; and the body area estimating unit estimates a body area corresponding to the body of the person, which is included in the human body candidate area, as the animal body area based upon the levels of similarity having been calculated for the plurality of small areas by the similarity calculation unit.
(11) According to an eleventh aspect of the present invention, in the image-processing device according to the tenth aspect, it is preferable that an upper body area corresponding to an upper half of the body of the person is estimated and then a lower body area corresponding to a lower half of the body of the person is estimated based upon estimation results obtained by estimating the upper body area.
(12) An image-processing device according to a twelfth aspect of the present invention comprises: a face detection unit that detects a face of an animal in an image; a candidate area setting unit that sets a candidate area for a body of the animal in the image based upon face detection results provided by the face detection unit; a similarity calculation unit that sets a plurality of reference areas within the candidate area for the body having been set by the candidate area setting unit and calculates levels of similarity between images within small areas defined within the candidate area and a reference image contained in each of the reference areas; and a body area estimating unit that estimates an animal body area corresponding to a body of the animal, which is included in the candidate area for the body, based upon the levels of similarity calculated for the small areas by the similarity calculation unit.
(13) An image-processing program, according to a thirteenth aspect of the present invention, enables a computer to execute: face detection processing for detecting a face of an animal in an image; candidate area setting processing for setting an animal body candidate area for a body of the animal in the image based upon face detection results obtained through the face detection processing; reference image acquisition processing for obtaining a reference image; similarity calculation processing for dividing the animal body candidate area, having been set through the candidate area setting processing, into a plurality of small areas and calculating levels of similarity between images in the plurality of small areas and the reference image; and body area estimation processing for estimating an animal body area corresponding to a body of the animal, which is included in the animal body candidate area, based upon the levels of similarity having been calculated through the similarity calculation processing for the plurality of small areas.
According to the present invention, the area taken up by an animal body can be estimated with great accuracy.
An image-processing device 100 achieved in the first embodiment comprises a storage device 10 and a CPU 20. The CPU (control unit, control device) 20 includes a face detection unit 21, a human body candidate area generation unit 22, a template creation unit 23, a template-matching unit 24, a similarity calculation unit 25, a human body area estimating unit 26, and the like, all achieved in software. The CPU 20 detects an estimated human body area 50 by executing various types of processing on an image stored in the storage device 10.
Images input via an input device (not shown) are stored in the storage device 10. These images include images input via the Internet as well as images directly input from an image-capturing device such as a camera.
In step S1 in
It is to be noted that the face detection unit 21 detects the inclination of each face based upon the face recognition algorithm and sets a rectangular block at an angle in correspondence to the inclination of the face. In the examples presented in
Next, in step S2 in
It is to be noted that the human body candidate area generation unit 22 generates a human body candidate area by setting a specific number of rectangular blocks, identical to the face rectangular block, next to one another along the longitudinal direction and the lateral direction in the example described above. As explained earlier, the probability of the body area taking up a position corresponding to the face size and orientation is high. In other words, the human body candidate area generation method described above is highly likely to set the body area accurately. However, the present invention is not limited to this example, and the size, the shape and the quantity of the rectangular blocks set in the human body candidate area may be different from those set in the method described above.
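The tiling scheme described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the 3×3 block count, the function name, and the upright, untilted layout (x growing rightward, y downward) are all assumptions made here for clarity, since the text leaves the block count open and also rotates the blocks with the face tilt.

```python
def make_body_candidate_blocks(face_x, face_y, face_w, face_h,
                               n_rows=3, n_cols=3):
    """Tile face-sized rectangular blocks beneath a detected face to form
    a human body candidate area B. The 3x3 grid and the untilted layout
    are illustrative assumptions only."""
    blocks = []
    for i in range(n_rows):                # rows, moving down from the face
        for j in range(n_cols):            # columns, centered under the face
            x = face_x + (j - n_cols // 2) * face_w
            y = face_y + (i + 1) * face_h  # first row starts below the face
            blocks.append((x, y, face_w, face_h))
    return blocks
```

Each returned tuple plays the role of one rectangular block Bs (i, j) in the candidate area B.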
Bs (i, j) in expression (1) indicates the address (row, column) of a rectangular block Bs present in the human body candidate area B whereas pix (a, b) in expression (1) indicates the address (row, column) of a pixel within each rectangular block Bs.
Next, the human body candidate area generation unit 22 in the CPU 20 divides each of the rectangular blocks Bs forming the human body candidate area B into four parts, as shown in
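The four-way division of each rectangular block can be sketched as follows; the function name and the (x, y, w, h) tuple convention are assumptions carried over from the tiling illustration above.

```python
def divide_block(x, y, w, h):
    """Split one rectangular block Bs of the candidate area into four
    equal sub blocks BsDiv (a 2 x 2 arrangement), each returned as an
    (x, y, w, h) tuple."""
    hw, hh = w // 2, h // 2
    return [(x,      y,      hw, hh), (x + hw, y,      hw, hh),
            (x,      y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
```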
In step S3, in
The template can be expressed with matrices, as in (2) below. . . . (2)
T in expression (2) is a matrix of all the templates generated for the human body candidate area B and Tp (i, j) in expression (2) is a template matrix corresponding to each rectangular block Bs.
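Extracting a template from the center of a rectangular block, sized to match a sub block, can be sketched as below. The function name is hypothetical, and a 2-D luminance array standing in for the image is an assumption of this illustration.

```python
import numpy as np

def center_template(image, x, y, w, h):
    """Cut a sub-block-sized patch out of the center of the rectangular
    block at (x, y, w, h) to serve as the template Tp for that block.
    `image` is assumed to be a 2-D luminance array."""
    tw, th = w // 2, h // 2        # a template matches a sub block in size
    cx = x + (w - tw) // 2         # centered horizontally within the block
    cy = y + (h - th) // 2         # centered vertically within the block
    return image[cy:cy + th, cx:cx + tw]
```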
In step S4 in
For instance, the template-matching unit 24 first executes the template-matching processing for all the sub blocks BsDiv in all the rectangular blocks Bs, in reference to the template Tp (0, 0) set at the rectangular block Bs (0, 0) at the upper left corner, as shown in
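A per-sub-block luminance comparison of the kind used in the first embodiment can be sketched as follows. The negated mean squared difference is an assumed metric chosen for illustration; the text specifies only that luminance values are compared, not the particular distance measure.

```python
import numpy as np

def match_score(sub_block, template):
    """Similarity of one sub block BsDiv to a template Tp: the negated
    mean squared difference of pixel luminance, so that larger values
    mean a closer match. The specific metric is an assumption here."""
    d = sub_block.astype(np.float64) - template.astype(np.float64)
    return -float(np.mean(d * d))
```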
In step S5 in
In expression (3), M represents the total number of sub blocks present along the row direction, N represents the total number of sub blocks present along the column direction and K represents the number of templates.
Among the plurality of rectangular blocks Bs forming the human body candidate area B, a rectangular block Bs closer to the face rectangular block has a higher probability of belonging to the human body area. Accordingly, the similarity calculation unit 25 applies a greater weight to the template-matching processing results for a rectangular block Bs located closer to the face rectangular block, compared to the weight applied to a rectangular block Bs located further away from the face rectangular block. This enables the CPU 20 to identify the human body area with better accuracy. More specifically, the similarity calculation unit 25 calculates similarity factors S (m, n) and a similarity factor average value Save as expressed in (4) below. . . . (4)
W (i, j) in expression (4) represents a weight matrix.
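The weighting step can be sketched as below. Since expression (4) itself is not reproduced in this text, the element-wise application of W and the plain arithmetic mean used for Save are assumptions of this illustration.

```python
import numpy as np

def weighted_similarity(scores, weights):
    """Apply the weight matrix W element-wise to the raw similarity
    factors S(m, n) and return the weighted factors together with their
    average value Save. Element-wise weighting and a plain mean are
    assumptions, as expression (4) is not reproduced here."""
    s = np.asarray(scores, dtype=float) * np.asarray(weights, dtype=float)
    return s, float(s.mean())
```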
In step S6 in
The human body area estimating unit 26 may estimate an area to be classified as a human body area by using the similarity factor average value Save as a threshold value through a probability density function or through a learning threshold discrimination method adopted in conjunction with, for instance, an SVM (support vector machine).
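The classification step can be sketched with a simple threshold, using Save as the cutoff. This stands in for the probability-density-function or SVM-based discrimination the text mentions; the function name and the list-of-indices output are assumptions.

```python
def estimate_body_blocks(scores, save):
    """Classify the sub blocks whose weighted similarity factor reaches
    the average value Save as belonging to the estimated body area.
    A plain threshold stands in for the probability-density-function or
    SVM discrimination described in the text."""
    return [(m, n) for m, row in enumerate(scores)
                   for n, v in enumerate(row) if v >= save]
```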
In the first embodiment described above, template-matching processing is executed by comparing the value representing the luminance at each pixel in the template with the value representing the luminance at the corresponding pixel in the matching target sub block. In the second embodiment, template-matching processing is executed by comparing, in addition to the luminance values, the frequency spectrum, the edge component, the chrominance (color difference), the hue and the like, either individually or in combination, between the template and the matching target sub block.
In the first embodiment described above, an area to be classified as a human body area is estimated. In the third embodiment, the gravitational center of a human body is estimated in addition to the area taken up by the human body.
In the first embodiment described earlier, a template is created by setting a template area at a central location among the sub blocks and the template thus generated is used in the template-matching processing. In the fourth embodiment, a template to be used to identify a human body area is stored in advance as training data so as to execute template-matching processing by using the training data.
In the embodiments described earlier, a template is created by using part of the image, and thus the information available for template-based human body area estimation is limited to information contained in that image. This means that the accuracy and the detail of an estimation achieved based upon such limited information are also bound to be limited. In contrast, the image-processing device 103 in the fourth embodiment, which is able to incorporate diverse information as training data, will improve the human body area estimation accuracy and expand the estimation range. Namely, the image-processing device 103 achieved in the fourth embodiment, which is allowed to incorporate diverse information, will be able to accurately estimate a human body area belonging to a person wearing clothing of any color or style.
Furthermore, the range of application for the image-processing device 103 achieved in the fourth embodiment is not limited to human body area estimation. Namely, the image-processing device 103 is capable of estimating an area to be classified as an object area, e.g., an area taken up by an animal such as a dog or a cat, an automobile, a building or the like. The image-processing device 103 achieved in the fourth embodiment is thus able to estimate an area taken up by any object with high accuracy.
In the fifth embodiment, an upper body area is estimated based upon face detection results and then a lower body area is estimated based upon the estimated upper body area indicated in the estimation results.
In the fifth embodiment described above, a human body area is estimated by using the upper body area estimation results for purposes of lower body area estimation, so as to assure a high level of accuracy in the estimation of the overall human body area.
It is to be noted that if a human body area cannot be detected through the processing executed based upon the image-processing program achieved in any of the embodiments described above, the CPU may execute the processing again by modifying or expanding the human body candidate area.
While an explanation has been given in reference to the embodiments on an example in which the face detection unit 21 detects a human face in an image and an area taken up by the body in the image is estimated based upon the face detection results, the application range for the image-processing device according to the present invention is not limited to human body area estimation. Rather, the image-processing device according to the present invention may be adopted for purposes of estimating an object area such as an area taken up by an animal, e.g., a dog or a cat, an area taken up by an automobile, an area taken up by a building structure, or the like. An animal with its body parts connected via joints, in particular, moves with complex patterns and, for this reason, detection of its body area or its attitude has been considered difficult in the related art. However, the image-processing device according to the present invention detects the face of an animal in an image and estimates the animal body area in the image with a high level of accuracy based upon the face detection results. Namely, the image-processing device according to the present invention can accurately estimate the human body area taken up by the body of a person, i.e., an animal belonging to the primate hominid group, with its ability to make particularly complex movements through articulation of the joints in its limbs, and is further capable of detecting the attitude of the body and the gravitational center of the body based upon the human body area estimation results as well.
While the present invention is realized in the form of an image-processing device in the embodiments and variations thereof described above, the image processing explained earlier may be executed on a typical personal computer by installing and executing an image-processing program enabling the image processing according to the present invention in the personal computer. It is to be noted that the image-processing program according to the present invention may be recorded in a recording medium such as a CD-ROM and provided via the recording medium, or it may be downloaded via the Internet. As an alternative, the image-processing device or the image-processing program according to the present invention may be mounted or installed in a digital camera or a video camera so as to execute the image processing described earlier on a captured image.
It is to be noted that the embodiments described above and the variations thereof may be adopted in any conceivable combination, including a combination of different embodiments and the combination of an embodiment and a variation.
The following advantages are achieved through the embodiments and variations thereof described above. Namely, the face of an animal in an image is first detected by the face detection unit 21 and then, based upon the face detection results, the human body candidate area generation unit 22 sets a body candidate area (rectangular blocks) likely to be taken up by the body of the animal (human) in the image. The template-matching units 24 and 27 obtain a reference image (template) respectively via the template creation unit 23 and the training data storage device 33. The human body candidate area generation unit 22 divides each rectangular block in the animal body candidate area into a plurality of sub areas (sub blocks). The template-matching units 24 and 27, working together with the similarity calculation unit 25, calculate the level of similarity between the image in each of the plurality of sub areas and the reference image. Then, based upon the similarity factors thus calculated, each in correspondence to one of the plurality of sub areas, the human body area estimating unit 26 estimates an area contained in the animal body candidate area, which should correspond to the animal's body. Through these measures, the image-processing device is able to accurately detect the area taken up by the body of the animal.
In addition, in the embodiments and variations thereof described above, the human body candidate area generation unit 22 sets a candidate area for an animal's body in an image in correspondence to the size of the animal's face and the tilt of the animal's face, as shown in
In the embodiments and the variations thereof described above, the face detection unit 21 sets a rectangular block depending on the size of the face of an animal and the tilt of the face, at the position taken up by the animal's face in the image. Then, the human body candidate area generation unit 22 sets an animal body candidate area by setting a specific number of rectangular blocks each identical to the face rectangular block, next to one another, as shown in
In the embodiments and the variations thereof described above, the human body candidate area generation unit 22 defines sub areas (sub blocks) by dividing each of the plurality of rectangular blocks forming the animal body candidate area into a plurality of small areas. As a result, the image-processing device is able to determine levels of similarity, based upon which the body area is estimated, with high accuracy.
In the embodiments and the variations thereof described above, the template creation unit 23 sets a template area, assuming a size matching that of a sub block, at the center of each rectangular block and creates a template by using the image in the template area. As a result, the image-processing device is able to determine levels of similarity, based upon which the body area is estimated, with high accuracy.
In the embodiments and variations thereof described above, the similarity calculation unit 25 applies a greater weight to the similarity factor calculated for a sub block within the candidate area located closer to the animal's face. This allows the image-processing device to estimate the animal body area with high accuracy.
In the embodiments and variations thereof described above, the CPU calculates a similarity factor by comparing values indicated in the target sub block image and in the template, in correspondence to a single parameter among the luminance, the frequency, the edge component, the chrominance and the hue or corresponding to a plurality of such parameters. As a result, the image-processing device is able to determine levels of similarity, based upon which the body area is estimated, with high accuracy.
In the fourth embodiment and the variation thereof described above, the template-matching unit 27 uses an image stored in advance in the training data storage device 33 as a template, instead of images extracted from the sub blocks. This means that the image-processing device is able to estimate the body area by incorporating diverse information without being restricted to information contained in the image. As a result, the image-processing device is able to assure better accuracy for human body area estimation and, furthermore, is able to expand the range of estimation.
In the fifth embodiment and the variation thereof, the upper body-estimating unit 41 estimates an area corresponding to the upper half of a person's body. Then, the lower body-estimating unit 42 estimates an area corresponding to the lower half of the person's body based upon the upper body area estimation results. As a result, the image-processing device is able to estimate the area corresponding to the entire body with high accuracy.
In the embodiments and variations thereof, the template-matching unit 24 or 27 executes template-matching processing by using a template constituted with the image in a template area or training data. However, the present invention is not limited to these examples and the image-processing device may designate the image in each sub block set by the human body candidate area generation unit 22 as a template or may designate an image contained in an area in each rectangular block, which assumes a size matching the size of a sub block, as a template.
It is to be noted that the embodiments and variations thereof described above simply represent examples and the present invention is in no way limited to the particulars of these examples. Any other mode conceivable within the range of the technical teachings of the present invention should, therefore, be considered to be within the scope of the present invention.
The disclosure of the following priority application is herein incorporated by reference:
Japanese Patent Application No. 2011-047525 filed Mar. 4, 2011
Number | Date | Country | Kind
---|---|---|---
2011-047525 | Mar 2011 | JP | national

Filing Document | Filing Date | Country | Kind | 371(c) Date
---|---|---|---|---
PCT/JP2012/055351 | 3/2/2012 | WO | 00 | 8/23/2013