1. Field of the Invention
The present invention generally relates to an image processing device, image processing method, and program, and in particular to an image processing device, image processing method, and program that enable easy change of clothing in an image.
2. Description of the Related Art
In recent years, digital cameras and other imaging devices have come into widespread use, and anyone can easily capture still images and moving images.
Meanwhile, decorating photographic images for enjoyment, as typified by the photograph stickers created by photograph sticker production machines known as “Print Club,” has become popular.
Decorating photographic images as photograph stickers, however, is subject to various restrictions. For example, a photograph sticker production machine specialized for photograph stickers must be used, and the user's face or body must be placed at a predetermined position when the image is captured.
A personal color coordinating system has been proposed which enables clothing in an image input by the user to be changed to any seasonal color specified by the user (see, for example, Mizuki Oka, Junpei Yoshino, Kazuhiko Kato, “Personal Color Coordinating System,” Nov. 29, 2006, Information Processing Society of Japan, [retrieved on Jan. 18, 2010], Internet <URL: http://www.ipsj.or.jp/sig/os/index.php?plugin=attach&refer=ComSys2006%2Fposter&openfile=06-P10.pdf>).
In the personal color coordinating system described above, to extract a clothing image region from an input image, the user must specify a region corresponding to the trunk portion in the input image.
There is therefore a demand for automatic extraction of a clothing image region from an image and for easy change of the clothing in the image.
It is desirable that clothing in an image can be changed with ease.
An image processing device according to an embodiment of the present invention includes a clothing extractor that extracts from an input image a face or head portion, which is a region estimated to be a face or head image, and then extracts a clothing region, which is a region estimated to be a clothing image, from the region immediately below the face or head portion, and a clothing converter that changes the clothing in the input image by performing predetermined image processing on the image in the clothing region of the input image.
An image processing method and program according to other embodiments of the present invention correspond to the image processing device according to the embodiment of the present invention described above.
According to the embodiments of the present invention, a face or head portion, which is a region estimated to be a face or head image, is extracted from an input image; a clothing region, which is a region estimated to be a clothing image, is then extracted from the region immediately below the face or head portion; and predetermined image processing is performed on the image in the clothing region of the input image, so that the clothing in the input image is changed.
According to the embodiments of the present invention, clothing in an image can be easily changed.
The imaging device 10 includes an image capturer 11 and an image processor 12.
The image capturer 11 captures an image of the subject and supplies the captured photographic image to the image processor 12.
The image processor 12 includes a person extractor 21, a clothing extractor 22, and a clothing converter 23.
The person extractor 21 performs a person extracting process to extract a person region, which is a region estimated to be a person image, from the photographic image (input image) supplied from the image capturer 11. Techniques available for extracting a person region include a recognition technique that applies a Boosting process to feature values given by the responses of steerable filters, as described in Japanese Patent Application No. 2009-055062 filed by the applicant of the present invention, and a recognition technique that uses a support vector machine (SVM) with histograms of oriented gradients (HOG) feature values, proposed by N. Dalal et al. of INRIA, France (see “Histograms of Oriented Gradients for Human Detection”, CVPR, 2005). The person extractor 21 supplies the photographic image and person region information specifying the person region extracted by the person extracting process to the clothing extractor 22.
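By way of illustration only, the person extracting process could be realized with OpenCV's prebuilt HOG-plus-linear-SVM pedestrian detector. The following is a sketch under that assumption, not the technique of the embodiment; the window stride, padding, scale, and confidence threshold are likewise illustrative.

    import cv2
    import numpy as np

    def extract_person_regions(image):
        # Scan the image at multiple scales with the prebuilt HOG + SVM
        # pedestrian detector; each rectangle is a candidate person region.
        hog = cv2.HOGDescriptor()
        hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
        rects, weights = hog.detectMultiScale(
            image, winStride=(8, 8), padding=(8, 8), scale=1.05)
        # Keep only confident detections (threshold chosen for illustration).
        return [tuple(r) for r, w in zip(rects, np.ravel(weights)) if w > 0.5]

    photo = cv2.imread("photo.jpg")                  # the photographic image
    person_regions = extract_person_regions(photo)   # (x, y, w, h) rectangles

The rectangles returned here play the role of the person region information passed to the clothing extractor 22.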
The clothing extractor 22 performs a clothing extracting process to extract a clothing region, which is a region estimated to be a clothing image, from the person region in the photographic image corresponding to the person region information supplied from the person extractor 21. The clothing extractor 22 supplies to the clothing converter 23 the photographic image and clothing region information specifying the clothing region extracted by the clothing extracting process.
Information indicating the pixel positions in the person or clothing region in the photographic image, for example, is used as the information for specifying the person or clothing region.
On the basis of the clothing region information supplied from the clothing extractor 22, the clothing converter 23 changes the clothing in the photographic image by performing various types of image processing on the clothing region in the photographic image.
For example, the clothing converter 23 changes the color of the image in the clothing region in the photographic image. The clothing converter 23 also creates a mask for merging an image into only the clothing region in the person region and performs noise reduction and shaping of the mask using, for example, a morphological process such as dilation (expansion) or erosion (reduction), or a small-region removal process by labeling. The clothing converter 23 uses the resultant mask to merge a pre-stored image for decoration (referred to hereinafter as a decorative image) into the image in the clothing region in the photographic image. After this image processing, the clothing converter 23 outputs the resultant photographic image to the outside, such as a display device.
One or more decorative images may be pre-stored. If a plurality of decorative images are pre-stored, the decorative image to be merged may be selected by the user, for example. Plain or patterned textures, or images of marks or characters, for example, can be used as the decorative images.
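As a concrete illustration of the mask processing described above, the following sketch shapes a binary clothing mask by morphological opening and closing, removes small labeled regions, and merges a tiled decorative texture through the mask. It assumes OpenCV and NumPy; the kernel size and area threshold are illustrative values, not ones given in the embodiment.

    import cv2
    import numpy as np

    def decorate_clothing(photo, clothing_mask, decorative):
        # clothing_mask: uint8 image, 255 inside the clothing region.
        kernel = np.ones((5, 5), np.uint8)
        # Noise reduction and shaping by erosion/dilation (opening, closing).
        mask = cv2.morphologyEx(clothing_mask, cv2.MORPH_OPEN, kernel)
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
        # Small-region removal by labeling: drop tiny connected components.
        n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
        for i in range(1, n):
            if stats[i, cv2.CC_STAT_AREA] < 100:
                mask[labels == i] = 0
        # Tile the decorative texture over the photo and merge through the mask.
        reps = (photo.shape[0] // decorative.shape[0] + 1,
                photo.shape[1] // decorative.shape[1] + 1, 1)
        tiled = np.tile(decorative, reps)[:photo.shape[0], :photo.shape[1]]
        out = photo.copy()
        out[mask == 255] = tiled[mask == 255]
        return out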
Person Extracting Process
When a photographic image 30 including generally the whole bodies of a father, a mother, and a child is supplied from the image capturer 11, the person extractor 21 extracts a person region for each of the three by the person extracting process. In the example described below, a region 32 containing the father is one of the extracted person regions.
Mask
When clothing region information specifying as the clothing region a region 32A of an outerwear image in the region 32 is supplied from the clothing extractor 22, the clothing converter 23 creates a mask 41 for merging an image into only the clothing region 32A in the region 32.
Merging of Decorative Image into Clothing Region
The clothing converter 23 uses the mask 41 to merge a decorative image into the image in the clothing region 32A in the photographic image 30, thereby decorating the father's clothing.
The clothing of the mother and/or the child can also be decorated similarly to the decoration of the father's clothing described above.
Process by Imaging Device
In step S11, the person extractor 21 performs a person extracting process. In step S12, the person extractor 21 determines whether a person region has been extracted by the person extracting process.
If it is determined in step S12 that the person region has been extracted, the clothing extractor 22 performs in step S13 a clothing extracting process to extract a clothing region from the person region corresponding to the person region information supplied from the person extractor 21. This clothing extracting process will be described below in more detail.
In step S14, the clothing converter 23 changes the clothing in the photographic image by performing various types of image processing on the clothing region in the photographic image on the basis of the clothing region information supplied from the clothing extractor 22. After the image processing, the clothing converter 23 outputs the resultant photographic image to the outside, such as a display device. Then, the process ends.
On the other hand, if it is determined in step S12 that the person region has not been extracted, steps S13 and S14 are skipped and the photographic image is output as it is to the outside. Then, the process ends.
In step S21, the clothing extractor 22 performs a face portion extracting process to extract a face portion, which is a region estimated to be a face image, from the person region in the photographic image corresponding to the person region information supplied from the person extractor 21.
In step S22, the clothing extractor 22 determines whether the face portion has been extracted in step S21. If it is determined in step S22 that the face portion has been extracted, the clothing extractor 22 crops (cuts out) in step S23 a region immediately below the face portion from the photographic image.
In step S24, the clothing extractor 22 generates a color histogram of the photographic image in the cropped region. In step S25, the clothing extractor 22 identifies as the clothing region a region in the cropped region having the highest color frequency in the histogram generated in step S24. In step S26, the clothing extractor 22 generates clothing region information indicating the positions in the photographic image of the pixels included in the identified clothing region. Then, the process returns to step S13 described above.
On the other hand, if it is determined in step S22 that the face portion has not been extracted, the process ends.
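By way of illustration, the following sketch traces steps S21 through S25 for a single person region. OpenCV's bundled Haar cascade stands in for the unspecified face detector of step S21, and the extent of the cropped region and the number of histogram bins are assumptions.

    import cv2
    import numpy as np

    CASCADE = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def extract_clothing_by_histogram(photo, bins=16):
        gray = cv2.cvtColor(photo, cv2.COLOR_BGR2GRAY)
        faces = CASCADE.detectMultiScale(gray, 1.1, 5)       # step S21
        if len(faces) == 0:                                  # step S22
            return None
        x, y, w, h = faces[0]
        # Step S23: crop a region immediately below the face portion.
        crop = photo[y + h : min(y + 3 * h, photo.shape[0]), x : x + w]
        # Step S24: quantize colors coarsely and count their frequencies.
        q = crop // (256 // bins)
        colors, counts = np.unique(q.reshape(-1, 3), axis=0, return_counts=True)
        dominant = colors[counts.argmax()]                   # highest frequency
        # Step S25: the clothing region is the set of pixels in that color bin.
        mask = np.all(q == dominant, axis=2).astype(np.uint8) * 255
        return crop, mask

The pixel positions where the mask is 255, expressed in the coordinates of the photographic image, correspond to the clothing region information generated in step S26.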
In the clothing extracting process described above, the clothing region is identified using a color histogram. Alternatively, the clothing region may be identified using Gaussian mixture models of color, as in the clothing extracting process described next.
Steps S31 to S33 of this process are the same as steps S21 to S23 described above, and a description of them is therefore omitted.
In step S34, which follows step S33, the clothing extractor 22 learns a Gaussian mixture model (GMM) of the color of the photographic image in the cropped region as GMM(a). In step S35, the clothing extractor 22 learns a GMM of the color of the photographic image in the face portion as GMM(b).
In step S36, the clothing extractor 22 identifies as the clothing region an area in the cropped region having a low GMM(b) likelihood and a high GMM(a) likelihood, that is, an area whose colors are typical of the region below the face but unlike those of the face itself. In step S37, the clothing extractor 22 generates clothing region information indicating the positions in the photographic image of the pixels included in the identified clothing region. The process then returns to step S13 described above.
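A minimal sketch of steps S34 to S36, assuming scikit-learn's GaussianMixture as the GMM learner; the number of mixture components and the median-based likelihood thresholds are illustrative, as the embodiment does not specify them.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def extract_clothing_by_gmm(crop, face, n_components=5):
        # Step S34: learn GMM(a) from the colors of the cropped region.
        gmm_a = GaussianMixture(n_components).fit(crop.reshape(-1, 3))
        # Step S35: learn GMM(b) from the colors of the face portion.
        gmm_b = GaussianMixture(n_components).fit(face.reshape(-1, 3))
        pixels = crop.reshape(-1, 3)
        log_a = gmm_a.score_samples(pixels)   # log-likelihood under GMM(a)
        log_b = gmm_b.score_samples(pixels)   # log-likelihood under GMM(b)
        # Step S36: keep pixels likely under GMM(a) and unlikely under GMM(b).
        clothing = (log_a > np.median(log_a)) & (log_b < np.median(log_b))
        return clothing.reshape(crop.shape[:2])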
In the clothing extracting processes described above, a face portion is extracted and the region immediately below it is cropped. Alternatively, a head portion, which is a region estimated to be a head image, may be extracted instead of the face portion, and the region immediately below the head portion may be cropped.
Alternatively, the clothing extractor 22 may estimate the pose of the person using, for example, the technique developed by D. Ramanan at the Toyota Technological Institute at Chicago, as described in “Learning to Parse Images of Articulated Bodies”, NIPS, 2006, and crop the upper body portion of that person.
Alternatively, the clothing extractor 22 may crop a predetermined region (for example, the upper region) of the person region. If the person's upper body region is extracted as the person region in the person extracting process, the clothing extractor 22 may crop the person region itself.
The clothing extracting process is not limited to the processing described above. For example, the clothing region may be determined using a segmentation process such as a region growing process or a graph cut process, as described next.
Exemplary Segmentation Processes
When the region growing process is used to determine a clothing region, the clothing extractor 22 first labels, as the seed point, a pixel 71 located in a small region adjacent to the center of a cropped region 70.
Next, the clothing extractor 22 examines the pixels adjacent to each labeled pixel and also labels those whose pixel values are sufficiently close to that of the seed point. This labeling is repeated until no further pixels can be labeled.
The clothing extractor 22 then identifies as the clothing region the region made up of the labeled pixels.
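A minimal sketch of this region growing process; the 4-neighborhood, the color-distance test against the seed color, and the threshold value are illustrative assumptions.

    from collections import deque
    import numpy as np

    def region_grow(crop, threshold=30.0):
        h, w = crop.shape[:2]
        seed = (h // 2, w // 2)                  # the seed point (pixel 71)
        seed_color = crop[seed].astype(np.float64)
        labeled = np.zeros((h, w), dtype=bool)
        labeled[seed] = True
        queue = deque([seed])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and not labeled[ny, nx]:
                    # Label the neighbor if its color is close to the seed's.
                    if np.linalg.norm(crop[ny, nx] - seed_color) < threshold:
                        labeled[ny, nx] = True
                        queue.append((ny, nx))
        return labeled   # True for the pixels of the grown clothing region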
If the graph cut process is used to determine a clothing region, the clothing extractor 22 first generates a graph from the image of the cropped region. In this graph, each pixel is represented by a node; each pixel node is connected to an S node corresponding to the clothing and to a T node corresponding to the background, and the edges are assigned costs reflecting how likely each pixel is to belong to the clothing or to the background.
The clothing extractor 22 then performs segmentation by cutting the graph according to a min-cut/max-flow algorithm so that the total cost of the cut edges separating the S node from the T node is minimized. In the graph resulting from the cut, the region made up of the pixels whose nodes remain connected to the S node is determined to be the clothing region.
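A minimal sketch of the graph cut segmentation, assuming the PyMaxflow library for the min-cut/max-flow computation; the terminal capacities, derived here from distances to rough clothing and background colors, and the uniform smoothness cost are illustrative stand-ins for whatever cost model the embodiment would use.

    import numpy as np
    import maxflow  # PyMaxflow: pip install PyMaxflow

    def graph_cut_clothing(crop, fg_color, bg_color, smoothness=50.0):
        g = maxflow.Graph[float]()
        nodeids = g.add_grid_nodes(crop.shape[:2])   # one node per pixel
        # Edges between neighboring pixels (uniform cost for brevity).
        g.add_grid_edges(nodeids, smoothness)
        # Terminal edges to the S (clothing) and T (background) nodes:
        # cutting a pixel away from a terminal costs its dissimilarity.
        d_fg = np.linalg.norm(crop - np.asarray(fg_color, float), axis=2)
        d_bg = np.linalg.norm(crop - np.asarray(bg_color, float), axis=2)
        g.add_grid_tedges(nodeids, d_bg, d_fg)       # source caps, sink caps
        g.maxflow()                                  # min-cut/max-flow
        # Pixels whose nodes stay connected to the S node form the clothing.
        return ~g.get_grid_segments(nodeids)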
As described above, the imaging device 10 can automatically and easily extract the clothing region by extracting the face or head portion from the photographic image and extracting the clothing region from the region immediately below the face or head portion. The imaging device 10 can therefore easily change the clothing in the photographic image by performing predetermined image processing on the clothing region in the photographic image.
The image capturer and the image processor may be implemented as different devices or as a single device. When they are implemented as a single device, the clothing in the photographic image can be easily changed in an image capturing device (for example, a personal camera) without the use of a specialized device.
Computer According to an Embodiment of the Present Invention
A series of processing steps performed by the above image processor 12 can be implemented by hardware or software. When the processing steps are implemented by software, the program forming this software is installed into a general-purpose computer or the like.
The program can be pre-stored in a storage medium built in the computer, such as a storage unit 208 or read only memory (ROM) 202.
Alternatively, the program can be stored (recorded) in a removable medium 211. Such a removable medium 211 can be provided as so-called packaged software. Examples of the removable medium 211 include a flexible disk, a compact disc read only memory (CD-ROM), a magneto-optical (MO) disk, a digital versatile disc (DVD), a magnetic disk, and a semiconductor memory.
Instead of being installed into the computer via a drive 210 from the removable medium 211 as described above, the program can also be downloaded via a communication or broadcasting network to the computer and installed into a built-in storage unit 208. More specifically, the program can be transmitted to the computer either wirelessly, for example, from a download site via a satellite for digital satellite broadcasting, or in a wired manner through a network such as a local area network (LAN), Internet, or the like.
The computer has a built-in central processing unit (CPU) 201 to which an I/O interface 205 is connected via a bus 204.
When a command is input through the I/O interface 205 from an input unit 206 operated by a user, for example, the CPU 201 executes a program stored in the ROM 202 according to this command. Alternatively, the CPU 201 loads a program from the storage unit 208 into the random access memory (RAM) 203 and executes this program.
With this, the CPU 201 performs the processing steps according to the flowchart described above or the processing steps defined by the structure in the block diagram described above. The CPU 201 then outputs the processing results from an output unit 207, transmits the processing results from a communication unit 209, or stores the processing results into a storage unit 208, for example, as necessary, through the I/O interface 205.
The input unit 206 includes a keyboard, mouse, microphone, and/or the like. The output unit 207 includes a liquid crystal display (LCD), loudspeaker, and/or the like.
The order in which the processing steps are performed by the computer according to the program is not limited to the chronological order described in the flowcharts in this specification. In other words, the processing steps to be performed by the computer according to the program may include processing steps performed in parallel or separately (for example, parallel processing or processing by an object).
The program may be processed by a single computer (processor) or by a plurality of distributed computers. The program may be transmitted to and executed by one or more remote computers.
The present application contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2010-107147 filed in the Japan Patent Office on May 7, 2010, the entire contents of which are hereby incorporated by reference.
The present invention is not limited to the embodiments described above and various modifications and changes may be made to the present invention without departing from the scope and spirit of the present invention.
Other Publications

Mizuki Oka, Junpei Yoshino, Kazuhiko Kato, “Personal Color Coordinating System,” Information Processing Society of Japan, Nov. 29, 2006, http://www.ipsj.or.jp/sig/os/index.php?plugin=attach&refer=ComSys2006%2Fposter&openfile=06-P10.pdf, 2 pages.

Navneet Dalal et al., “Histograms of Oriented Gradients for Human Detection,” CVPR, 2005, 8 pages.

Deva Ramanan, “Learning to Parse Images of Articulated Bodies,” NIPS, 2006, 8 pages.