This application is based on and claims a priority to Chinese Patent Application Serial No. CN 201510847294.6, filed with the State Intellectual Property Office of P. R. China on Nov. 26, 2015, the entire content of which is incorporated herein by reference.
The present disclosure generally relates to the field of image processing technology, and more particularly, to methods and apparatus for processing images containing human faces.
An electronic photo album (herein referred to as an electronic album program, an album program, an electronic album, or simply an album) is a common application in a mobile terminal, such as a smart phone, a tablet computer, or a laptop computer. The electronic album may be used for managing, cataloging, and displaying images in the mobile terminal.
In related art, the album program in the terminal may cluster all human faces appearing in a collection of images into a set of unique human faces, so as to organize the collection of images into photo sets each corresponding to one of the faces within the set of unique faces.
Embodiments of the present disclosure provide an image processing method and an image processing apparatus. This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In one embodiment, a method for image processing management is disclosed. The method includes recognizing at least one human face contained in an image; acquiring a set of contextual characteristic information for each of the at least one recognized human face; classifying each of the at least one recognized human face as a face of interest or irrelevance according to the set of contextual characteristic information compared to a predetermined set of corresponding contextual criteria; and associating each face classified as a face of interest with an electronic photo album among a set of at least one photo album each associated with a unique human face.
In another embodiment, an image processing and management apparatus is disclosed. The apparatus includes a processor; and a memory for storing instructions executable by the processor, wherein the processor is configured to cause the apparatus to: identify at least one human face contained in an image; acquire a set of contextual characteristic information for each of the at least one recognized human face; classify each of the at least one recognized human face as a face of interest or irrelevance according to the set of contextual characteristic information compared to a predetermined set of corresponding contextual criteria; and associate each face classified as a face of interest with an electronic photo album among a set of at least one photo album each associated with a unique human face.
In yet another embodiment, a non-transitory computer-readable storage medium having stored therein instructions is disclosed. The instructions, when executed by a processor of a terminal, cause the terminal to identify at least one human face contained in an image; acquire a set of contextual characteristic information for each of the at least one recognized human face; classify each of the at least one recognized human face as a face of interest or irrelevance according to the set of contextual characteristic information compared to a predetermined set of corresponding contextual criteria; and associate each face classified as a face of interest with an electronic photo album among a set of at least one photo album each associated with a unique human face.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
Reference will now be made in detail to embodiments of the present disclosure. Unless specified or limited otherwise, the same or similar elements and the elements having the same or similar functions are denoted by like reference numerals throughout the descriptions. The explanatory embodiments of the present disclosure and the illustrations thereof are not to be construed as representing all the implementations consistent with the present disclosure. Instead, they are examples of apparatus and methods consistent with some aspects of the present disclosure, as recited in the appended claims.
Terms used in the disclosure are only for purpose of describing particular embodiments, and are not intended to be limiting. The terms “a”, “said” and “the” used in singular form in the disclosure and appended claims are intended to include a plural form, unless the context explicitly indicates otherwise. It should be understood that the term “and/or” used in the description means and includes any or all combinations of one or more associated and listed terms.
It should be understood that, although the disclosure may use terms such as “first”, “second” and “third” to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other. For example, first information may also be referred to as second information, and the second information may also be referred to as the first information, without departing from the scope of the disclosure. Based on context, the word “if” used herein may be interpreted as “when”, or “while”, or “in response to a determination”. Further, the terms “image” and “photo” are used interchangeably in this disclosure.
Embodiments of the present disclosure provide an image processing method, and the method may be applied in various electronic devices such as a mobile terminal. The mobile terminal may be equipped with one or more cameras and capable of taking photos and storing the photos locally in the mobile terminal device. An application may be installed in the mobile terminal for providing an interface for a user to organize and view the photos. The application may organize the photos based on face clustering. In particular, the photos may be organized in albums each associated with a particular person and a subset of photos in which that particular person appears. Photos with multiple individuals thus may be associated with multiple corresponding albums. Those of ordinary skill in the art understand that the association between a person-based album and the photos may be implemented as pointers, and thus the mobile terminal only needs to maintain a single copy of each photo in the local storage of the mobile terminal. The clustering of the photos into the albums may be automatically performed by the application via face recognition. Specifically, the application may detect unique faces in the collection of photos and build albums corresponding to the unique faces. However, not all the human faces appearing in the collection of photos in the mobile terminal are of interest to the user. For example, a photo may be taken in a crowded place and there may be many other bystanders in the photo. A usual clustering application based on face recognition would nevertheless recognize the faces of these bystanders and automatically establish corresponding photo albums. This may not be what the user desires.
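The pointer-based association between person albums and a single photo store can be sketched as follows. This is a minimal illustrative sketch, not the disclosure's implementation; the function and label names are assumptions.

```python
from collections import defaultdict

def build_albums(photo_faces):
    """Build person-based albums as lists of photo IDs.

    photo_faces maps each photo ID to the person labels recognized in it.
    Each photo is stored once; albums only hold references (IDs), so a
    photo with several people is referenced by several albums.
    """
    albums = defaultdict(list)
    for photo_id, persons in photo_faces.items():
        for person in set(persons):   # count each person once per photo
            albums[person].append(photo_id)
    return dict(albums)

# A photo containing two people joins both corresponding albums.
albums = build_albums({
    "img_001.jpg": ["alice", "bob"],
    "img_002.jpg": ["alice"],
})
```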
The embodiments of the present disclosure provide methods and apparatus that classify the recognized faces in a photo collection into faces of interest or irrelevance (such as faces of bystanders) based on detecting some contextual characteristic information of the recognized faces in the photos, and only organize the photos into albums corresponding to faces of interest. While the disclosure below uses a mobile terminal device as an example, the principles disclosed may be applied in other scenarios. For example, the same face classification may be used in a cloud server maintaining electronic photo albums for users. This disclosure does not intend to limit the context in which the methods and apparatus disclosed herein apply.
For example, when a user takes a photo in a crowded scene, besides the target human face that the user wants to photograph (such as that of one of her friends), the photo may also include the face of a bystander whom the user does not intend to photograph. Thus, the face of the bystander is unrelated and of irrelevance to the user. In the present disclosure, whether a human face recognized from a photo is of interest or irrelevance may be determined by the terminal device based on the contextual characteristic information obtained from image processing of the photo. As will be explained further below, the face of a bystander may have contextual characteristic information that the terminal device would reasonably conclude as indicating irrelevance. Thus, according to the method of
The contextual characteristic information of a face may include at least one of: a position of the face in the image, an orientation angle of the face in the image, depth information of the face in the image, a size of the face in the image relative to the size of the image, and a number of times the face has appeared in all the images. Any one or any combination of these and other items of contextual characteristic information may be used to determine whether the face should be classified as being of interest or irrelevance, as will be described in detail hereinafter.
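For illustration, these contextual items and one possible way of combining them can be sketched in Python. The field names and the concrete threshold values below are assumptions made for the sketch; the disclosure leaves the actual criteria open.

```python
from dataclasses import dataclass

@dataclass
class FaceContext:
    position: tuple          # (x, y) center of the face in the image
    orientation_deg: float   # angle between the face and the camera axis
    depth: float             # distance estimate from depth information
    relative_size: float     # face area divided by image area
    appearances: int         # times the face appears across the collection

# Illustrative thresholds (assumed values, not given by the disclosure).
MAX_ANGLE = 30.0
MIN_RELATIVE_SIZE = 0.01
MIN_APPEARANCES = 2

def is_of_interest(ctx: FaceContext) -> bool:
    """One example policy: require all three illustrative criteria to hold.

    Any single item or other combination could be used instead.
    """
    return (ctx.orientation_deg < MAX_ANGLE
            and ctx.relative_size >= MIN_RELATIVE_SIZE
            and ctx.appearances >= MIN_APPEARANCES)
```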
As shown in
As shown in
In this embodiment, when the image contains at least two human faces, the target photographed area may be determined according to the position of each human face in the image and the human face distribution. For example, if the target photographed area is the center area of the image, a human face A in the center area may be determined as a face of interest. A distance from the human face A to another human face B in the image may then be calculated. If the distance is less than the preset distance, the human face B is also determined as a face of interest, giving rise to a set of faces of interest: [A, B]. If the image further contains a human face C, a distance from the human face C to each face in the set [A, B] is further calculated. If the distance from the human face C to any face in the set [A, B] is less than the preset distance, the human face C is determined as a face of interest. Whether other faces contained in the image are classified as faces of interest or faces of irrelevance may be determined in a similar progressive way.
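The progressive expansion described above can be sketched as a fixed-point iteration: seed the set with faces inside the target photographed area, then repeatedly add any face closer than the preset distance to a face already in the set. The function and parameter names are illustrative assumptions.

```python
import math

def classify_by_position(faces, in_target_area, max_dist):
    """Grow the set of faces of interest progressively.

    faces: mapping of face label -> (x, y) position in the image.
    in_target_area: predicate deciding whether a position lies in the
        target photographed area (e.g. the center of the image).
    max_dist: the preset distance threshold.
    """
    # Seed: faces inside the target photographed area are of interest.
    interest = {name for name, pos in faces.items() if in_target_area(pos)}
    changed = True
    while changed:                 # expand until no new face qualifies
        changed = False
        for name, pos in faces.items():
            if name in interest:
                continue
            # A face near any face of interest becomes a face of interest.
            if any(math.dist(pos, faces[m]) < max_dist for m in interest):
                interest.add(name)
                changed = True
    return interest
```

Note that face C joins the set even if it is far from the center, as long as a chain of faces each within the preset distance connects it to the target area, matching the progressive determination above.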
In another implementation, as shown in
Thus, in the implementation of
In another implementation, as shown in
In another implementation, as shown in
The classification of a face into either a face of interest or a face of irrelevance may be based on any two or more items of the contextual characteristic information discussed above. For example, if the contextual characteristic information of a face includes the position of the face in the image and the orientation angle of the face in the image, the methods of determining whether the face is of interest corresponding to these two items of contextual characteristic information may be used additively. For example, the target photographed area may be determined according to the position of each human face in the image and the human face distribution. A human face within the target photographed area is determined as a face of interest. For a human face outside the target photographed area, the orientation angle may be used to determine whether that face is of interest. A face outside the target photographed area with an orientation angle smaller than a preset angle may be determined as a face of interest. On the contrary, a human face outside the target photographed area and with an orientation angle greater than or equal to the preset angle may be determined as a face of irrelevance.
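This additive use of two contextual items can be sketched as follows; the function name and the default angle threshold are assumptions for the sketch.

```python
def classify_combined(in_target_area, orientation_deg, max_angle=30.0):
    """Combine position and orientation angle additively.

    A face inside the target photographed area is of interest regardless
    of its orientation angle; outside the area, the orientation angle
    must be below the preset threshold for the face to be of interest.
    """
    if in_target_area:
        return "interest"
    return "interest" if orientation_deg < max_angle else "irrelevance"
```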
As shown in
Various apparatus are further disclosed below for implementing the methods described above.
In one implementation, the contextual characteristic information includes at least one of: a position of the human face in the image, an orientation angle of the human face in the image, depth information of the human face in the image, a size of the face relative to the size of the image, and a number of times the face has appeared in the collection of images. Whether a face is of interest or irrelevance is determined according to one or more pieces of the aforementioned contextual information.
As shown in
In another implementation shown in
In another implementation shown in
In another implementation shown in
In another implementation as shown in
The above apparatus may further include a clustering module 141, as illustrated in
According to another aspect of the present disclosure, an image processing apparatus is provided, including a processor, and a memory for storing instructions executable by the processor, in which the processor is configured to cause the apparatus to perform the methods described above.
The device 1500 may include one or more of the following components: a processing component 1502, a memory 1504, a power component 1506, a multimedia component 1508, an audio component 1510, an input/output (I/O) interface 1512, a sensor component 1514, and a communication component 1516.
The processing component 1502 controls overall operations of the device 1500, such as the operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1502 may include one or more processors 1520 to execute instructions to perform all or part of the steps in the above described methods. Moreover, the processing component 1502 may include one or more modules which facilitate the interaction between the processing component 1502 and other components. For instance, the processing component 1502 may include a multimedia module to facilitate the interaction between the multimedia component 1508 and the processing component 1502.
The memory 1504 is configured to store various types of data to support the operation of the device 1500. Examples of such data include instructions for any applications or methods operated on the device 1500, contact data, phonebook data, messages, pictures, video, etc. The memory 1504 may be implemented using any type of volatile or non-volatile memory devices, or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic or optical disk.
The power component 1506 provides power to various components of the device 1500. The power component 1506 may include a power management system, one or more power sources, and any other components associated with the generation, management, and distribution of power in the device 1500.
The multimedia component 1508 includes a display screen providing an output interface between the device 1500 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes the touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense a boundary of a touch or swipe action, but also sense a period of time and a pressure associated with the touch or swipe action. In some embodiments, the multimedia component 1508 includes a front camera and/or a rear camera. The front camera and the rear camera may receive an external multimedia datum while the device 1500 is in an operation mode, such as a photographing mode or a video mode. Each of the front camera and the rear camera may be a fixed optical lens system or have focus and optical zoom capability.
The audio component 1510 is configured to output and/or input audio signals. For example, the audio component 1510 includes a microphone (“MIC”) configured to receive an external audio signal when the device 1500 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may be further stored in the memory 1504 or transmitted via the communication component 1516. In some embodiments, the audio component 1510 further includes a speaker to output audio signals.
The I/O interface 1512 provides an interface between the processing component 1502 and peripheral interface modules, such as a keyboard, a click wheel, buttons, and the like. The buttons may include, but are not limited to, a home button, a volume button, a starting button, and a locking button.
The sensor component 1514 includes one or more sensors to provide status assessments of various aspects of the device 1500. For instance, the sensor component 1514 may detect an open/closed status of the device 1500, relative positioning of components, e.g., the display and the keypad, of the device 1500, a change in position of the device 1500 or a component of the device 1500, a presence or absence of user contact with the device 1500, an orientation or an acceleration/deceleration of the device 1500, and a change in temperature of the device 1500. The sensor component 1514 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 1514 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1514 may also include an accelerometer sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor or thermometer.
The communication component 1516 is configured to facilitate wired or wireless communication between the device 1500 and other devices. The device 1500 can access a wireless network based on a communication standard, such as WiFi, 2G, 3G, LTE, or 4G cellular technologies, or a combination thereof. In one exemplary embodiment, the communication component 1516 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 1516 further includes a near field communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on a radio frequency identification (RFID) technology, an infrared data association (IrDA) technology, an ultra-wideband (UWB) technology, a Bluetooth (BT) technology, and other technologies.
In exemplary embodiments, the device 1500 may be implemented with one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components, for performing the above described methods.
In illustrative embodiments, there is also provided a non-transitory computer-readable storage medium including instructions, such as a memory 1504 including instructions, the instructions may be executable by the processor 1520 in the device 1500, for performing the above-described methods. For example, the non-transitory computer-readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disc, an optical data storage device, and the like.
A non-transitory computer-readable storage medium is further disclosed. The storage medium has stored therein instructions that, when executed by a processor of the device 1500, cause the device 1500 to perform the above image processing method.
Each module or unit discussed above for
The illustrations of the embodiments described herein are intended to provide a general understanding of the structure of the various embodiments. The illustrations are not intended to serve as a complete description of all of the elements and features of apparatus and systems that utilize the structures or methods described herein. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following the general principles thereof and including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples are considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims in addition to the disclosure.
It will be appreciated that the present invention is not limited to the exact construction that has been described above and illustrated in the accompanying drawings, and that various modifications and changes can be made without departing from the scope thereof. It is intended that the scope of the invention only be limited by the appended claims.
Number | Date | Country | Kind
---|---|---|---
201510847294.6 | Nov 2015 | CN | national