This application claims priority and benefits to Chinese Application No. 202010537221.8, filed on Jun. 12, 2020, the entire content of which is incorporated herein by reference.
Embodiments of the disclosure relate to image processing technologies, particularly to a field of artificial intelligence, and to a method for processing an image, a device for processing an image, an electronic device, and a storage medium.
With the development of image processing technologies, users are no longer satisfied with the originally captured images. Personalization needs are increasing, and thus users pursue filters of different styles.
Embodiments of the disclosure provide a method for processing an image. The method includes: acquiring an image to be processed, the image to be processed containing a face figure; extracting facial feature information matching the face figure; and obtaining a style transferred image matching the image to be processed by converting a style of the image to be processed to a preset drawing style based on the facial feature information.
Embodiments of the disclosure provide an electronic device. The electronic device includes at least one processor, and a memory communicatively coupled with the at least one processor. The memory is configured to store instructions executable by the at least one processor. When the instructions are executed by the at least one processor, the at least one processor is configured to execute a method for processing an image. The method includes acquiring an image to be processed, the image to be processed containing a face figure; extracting facial feature information matching the face figure; and obtaining a style transferred image matching the image to be processed by converting a style of the image to be processed to a preset drawing style based on the facial feature information.
Embodiments of the disclosure provide a non-transitory computer-readable storage medium, having computer instructions stored thereon. The computer instructions are configured to cause a computer to execute a method for processing an image. The method includes acquiring an image to be processed, the image to be processed containing a face figure; extracting facial feature information matching the face figure; and obtaining a style transferred image matching the image to be processed by converting a style of the image to be processed to a preset drawing style based on the facial feature information.
It should be understood that this section is not intended to identify key or important features of embodiments of the disclosure, nor to limit the scope of the disclosure. Other features of the disclosure will be easily understood from the following description.
The accompanying drawings are used to better understand the technical solution and do not constitute a limitation to the disclosure.
The following describes exemplary embodiments of the disclosure in combination with accompanying drawings, including various details of the embodiments of the disclosure to facilitate understanding, which should be regarded as merely exemplary. Therefore, those of ordinary skill in the art should realize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted in the following description.
In the image processing technologies, filters of some drawing styles, such as filters of Oil Painting Style and filters of Ink Painting Style, have unique styles and vivid effects. Adding the filter of a certain drawing style to an image is generally achieved by stylized transferring of the image, preserving the content of the image while incorporating the drawing style. In the process of implementing the stylized transferring of the image, the inventor found that in the prior art, the facial features of the portrait may be deformed, the integration effect of the drawing style is poor, and the user experience is poor when applying a filter of a certain drawing style to a portrait image.
Therefore, embodiments of the disclosure provide a method for processing an image, a device for processing an image, an electronic device, and a storage medium. The method may solve a problem existing in the prior art that the face figure may be deformed, the effect of integrating the preset drawing style into the image is poor, and the user experience is poor when the style of the image including the face figure is converted to the preset drawing style. The method may improve the integrity, the consistency, and the aesthetics of the face image during the image conversion process, thereby improving the user experience.
As illustrated in
At block 110, an image to be processed is acquired. The image to be processed includes a face figure.
The image to be processed includes a human face figure and needs to be converted into a preset drawing style. For example, the image to be processed may be a user's self-portrait image.
In some embodiments of the disclosure, the object of the conversion into the preset drawing style is the image to be processed including the human face figure.
At block 120, facial feature information matching the face figure is extracted.
The facial feature information may include not only facial key points included in the image to be processed, such as facial features, face contour, and hairstyle contour, but also individual features of the face figure, such as nevus, glasses, and earrings. The facial feature information matches the face figure contained in the image to be processed.
In some examples, the facial feature information may include at least one of facial feature positions, facial feature sizes, face contour, and hairstyle contour.
The facial feature positions refer to positions of the facial features of the human face on the image to be processed. The facial feature sizes refer to areas occupied by the facial features on the human face. The face contour refers to the contour line of the human face of the portrait. The hairstyle contour refers to a region defined by the hairstyle of the portrait.
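The four kinds of facial feature information defined above can be pictured as a small data structure. The following is a minimal sketch, not the representation used by the disclosure: the class name `FacialFeatureInfo`, the bounding-box encoding of each facial feature, and the choice of the box center as the "position" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels; hypothetical encoding


@dataclass
class FacialFeatureInfo:
    """Hypothetical container for the facial feature information described above."""
    feature_boxes: Dict[str, Box] = field(default_factory=dict)
    face_contour: List[Point] = field(default_factory=list)
    hairstyle_contour: List[Point] = field(default_factory=list)

    def feature_positions(self) -> Dict[str, Tuple[float, float]]:
        """Position of each facial feature, taken here as the center of its box."""
        return {
            name: ((left + right) / 2, (top + bottom) / 2)
            for name, (left, top, right, bottom) in self.feature_boxes.items()
        }

    def feature_sizes(self) -> Dict[str, int]:
        """Area in pixels occupied by each facial feature."""
        return {
            name: (right - left) * (bottom - top)
            for name, (left, top, right, bottom) in self.feature_boxes.items()
        }


# Example with made-up coordinates.
info = FacialFeatureInfo(
    feature_boxes={"left_eye": (30, 40, 50, 50), "mouth": (35, 80, 65, 95)}
)
positions = info.feature_positions()
sizes = info.feature_sizes()
```

Contours are kept as ordered point lists here; a mask image would serve equally well.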
In some examples, before converting the style of the image to be processed, the method may include extracting the facial feature information of the image to be processed. By extracting the facial feature information in advance, the integrity of the face figure may be kept after the image conversion, thereby improving the beauty of the face figure, and improving the user experience.
At block 130, the style of the image to be processed is converted into the preset drawing style based on the facial feature information to obtain a style transferred image matching the image to be processed.
The preset drawing style may be selected by the user to convert the style of the image to be processed. For example, the preset drawing style may include Oil Painting Style, Chinese Ink Painting Style, and Traditional Chinese Realistic Painting Style. The preset drawing style may also include different schools of the same drawing style. For example, the preset drawing style may include different schools of the Oil Painting Style, such as Realism Oil Painting Style, Abstractionism Oil Painting Style, Impressionism Oil Painting Style, etc. The preset drawing style is not limited in the disclosure.
The style transferred image refers to an image formed by incorporating the preset drawing style into the image to be processed on the premise of preserving the content of the image to be processed.
Converting the style of the image to be processed to the preset drawing style based on the facial feature information may include inputting the facial feature information and the image to be processed into a trained style transfer model, or reducing a difference between the image to be processed and an image of the preset drawing style while keeping the facial feature information. The method and implementation process of obtaining the style transferred image are not limited in the disclosure.
In some examples, obtaining the style transferred image matching the image to be processed by converting the style of the image to be processed to the preset drawing style based on the facial feature information may include obtaining a style transfer model matching the preset drawing style, and inputting the image to be processed and the facial feature information into the style transfer model to obtain the style transferred image. The style transfer model is obtained by training a cyclic generative adversarial network in advance using a real portrait image set, facial feature information matching each real portrait image contained in the real portrait image set, a styled portrait image set, and facial feature information matching each styled portrait image contained in the styled portrait image set. The styled portrait image set matches the preset drawing style.
The style transfer model is a model used to convert the image to be processed into the style transferred image having the preset drawing style. The style transfer model is obtained by training the cyclic generative adversarial network based on the real portrait image set, the facial feature information matching each real portrait image contained in the real portrait image set, the styled portrait image set, and the facial feature information matching each styled portrait image contained in the styled portrait image set. The real portrait image set is a collection of multiple real portrait images.
In some examples, the facial feature information of the image to be processed is extracted, and the facial feature information is input to the style transfer model as prior information, such that the input image to be processed can be processed by the style transfer model. While the style is changed, the facial features are fixed to ensure the integrity and consistency of the facial features and enhance the aesthetics of the style conversion.
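The disclosure does not specify how the facial feature information is encoded when it is supplied to the style transfer model as prior information. One common option, shown here purely as an assumption, is to rasterize the facial feature regions into a binary mask and append that mask to the image as an extra channel. The function names `feature_mask` and `with_prior_channel` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[List[float]]]  # image[row][col] is a list of channel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def feature_mask(height: int, width: int, boxes: List[Box]) -> List[List[float]]:
    """Rasterize facial feature boxes into a binary mask (1.0 inside a box)."""
    mask = [[0.0] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                mask[y][x] = 1.0
    return mask


def with_prior_channel(image: Image, boxes: List[Box]) -> Image:
    """Return a copy of the image with the feature mask appended as a channel."""
    height, width = len(image), len(image[0])
    mask = feature_mask(height, width, boxes)
    return [
        [image[y][x] + [mask[y][x]] for x in range(width)]
        for y in range(height)
    ]


# A 2x3 RGB toy image with one feature box covering the two rightmost
# pixels of the first row.
toy_image = [[[0.5, 0.5, 0.5] for _ in range(3)] for _ in range(2)]
model_input = with_prior_channel(toy_image, [(1, 0, 3, 1)])
```

The resulting four-channel array would then be fed to the generator in place of the three-channel image; the original image is left unmodified.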
With the technical solution according to embodiments of the disclosure, by extracting the facial feature information of the image to be processed, the style of the image to be processed is converted to the preset drawing style based on the facial feature information to obtain the style transferred image. This technical solution solves a problem existing in the prior art that the face figure may be deformed, the effect of integrating the preset drawing style into the image is poor, and the user experience is poor when the style of the image including the face figure is converted to the preset drawing style. The technical solution may improve the integrity, the consistency, and the aesthetics of the face image during the image conversion process, thereby improving the user experience.
Correspondingly, as illustrated in
At block 210, a real portrait image set is acquired and a styled portrait image set matching the preset drawing style is acquired.
The block 210 may further include the following.
At block 211, multiple real portrait images are acquired from the standard real portrait image database to form the real portrait image set.
The standard real portrait image database refers to a database containing standard real portrait images. For example, the standard real portrait image database can be the FFHQ (Flickr-Faces-HQ, a high-definition facial data set) data set. The FFHQ data set includes 70,000 high-definition facial images. The real portrait image set may include the real portrait images selected from the standard real portrait image database. Neither the type of the standard real portrait image database nor the number of selected real portrait images is limited in the disclosure.
At block 212, at least one image preprocessing is performed on each real portrait image contained in the real portrait image set to obtain preprocessed real portrait images, and the preprocessed real portrait images are added to the real portrait image set.
The image preprocessing is a processing operation of the real portrait images. The image preprocessing may include cropping, rotating, skin smoothing, brightness adjusting, and contrast adjusting.
Performing the image preprocessing on the real portrait images is to preprocess the standard real portrait images, which can simulate actual shooting effects, enhance the diversity of the real portrait images, and improve the robustness of the style transfer model.
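The augmentation step at block 212 can be sketched as follows. This is a minimal illustration on grayscale images stored as nested lists; the function names, the 0 to 255 value range, and the mid-gray pivot of 128 used for contrast are assumptions, and a real pipeline would typically rely on an image library instead.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image with values in 0..255


def _clamp(value: float) -> int:
    """Round to the nearest integer and clip to the valid pixel range."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(image: Image, delta: int) -> Image:
    """Add a constant offset to every pixel."""
    return [[_clamp(p + delta) for p in row] for row in image]


def adjust_contrast(image: Image, factor: float) -> Image:
    """Scale each pixel's distance from mid-gray (128) by the given factor."""
    return [[_clamp(128 + (p - 128) * factor) for p in row] for row in image]


def rotate_90(image: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_set(images: List[Image], ops: List[Callable[[Image], Image]]) -> List[Image]:
    """Apply every operation to every image and add the results to the set."""
    augmented = [op(image) for image in images for op in ops]
    return images + augmented


real_set = [[[1, 2], [3, 4]]]
enlarged_set = augment_set(
    real_set, [rotate_90, lambda im: adjust_brightness(im, 10)]
)
```

The originals stay in the returned set, matching the description that the preprocessed images are added to the real portrait image set rather than replacing it.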
At block 213, a standard styled image matching the preset drawing style is acquired from a standard styled image database.
The standard styled image database refers to a database that stores images of different drawing styles. For example, when the drawing style is the Oil Painting Style, the standard styled image database can be the WikiArt database or the Painter by Numbers database. The standard styled image is an image matching the drawing style.
At block 214, resultant images including the face figure are obtained by filtering the standard styled images.
The resultant images are selected from multiple standard styled images. Each resultant image is a standard styled image including a human face. The standard styled images including the face figure may be obtained by filtering the standard styled images through a face detection model or a preset image recognition algorithm. The method and the implementation process of obtaining the standard styled images including the face figure are not limited in embodiments of the disclosure.
At block 215, the resultant images are cropped to obtain cropped results having a face region, and the styled portrait image set is generated from the cropped results.
Face region cropping may be performed on the resultant images to obtain the cropped results, and the styled portrait image set may be generated based on the cropped results. The face region cropping is performed to facilitate the extraction of facial feature information.
In some examples, the number of real portrait images contained in the real portrait image set may be the same as or different from the number of styled portrait images contained in the styled portrait image set.
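Blocks 213 to 215 amount to a filter-then-crop pass over the standard styled images. The sketch below assumes a face detector that returns a bounding box, or `None` when no face is found; the disclosure leaves the detector open, so `toy_detector` is only a stand-in that treats non-zero pixels as the face, and all names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive
FaceDetector = Callable[[Image], Optional[Box]]  # hypothetical detector interface


def crop(image: Image, box: Box) -> Image:
    """Cut the face region out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def build_styled_portrait_set(
    styled_images: List[Image], detect_face: FaceDetector
) -> List[Image]:
    """Keep only images containing a face and crop each one to its face region."""
    portraits = []
    for image in styled_images:
        box = detect_face(image)
        if box is None:  # no face figure: filtered out
            continue
        portraits.append(crop(image, box))
    return portraits


def toy_detector(image: Image) -> Optional[Box]:
    """Stand-in detector: the bounding box of all non-zero pixels."""
    coords = [
        (x, y) for y, row in enumerate(image) for x, p in enumerate(row) if p > 0
    ]
    if not coords:
        return None
    xs, ys = zip(*coords)
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


styled_images = [
    [[0, 0, 0], [0, 9, 8], [0, 0, 0]],  # contains a "face"
    [[0, 0], [0, 0]],                   # contains none
]
styled_portrait_set = build_styled_portrait_set(styled_images, toy_detector)
```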
At block 220, real-styled portrait image pairs are generated based on the real portrait image set and the styled portrait image set.
A real-styled portrait image pair is an image pair including a real portrait image and a styled portrait image. In some examples, the real portrait image set and the styled portrait image set are training sets of the cyclic generative adversarial network. The real portrait image and the styled portrait image do not need to match each other.
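Because the two images of a pair do not need to match, pair generation can be as simple as independent random sampling from each set. The following sketch assumes that sampling strategy; the function name `make_unpaired_pairs` and the `seed` parameter are hypothetical and only make the example reproducible.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

R = TypeVar("R")
S = TypeVar("S")


def make_unpaired_pairs(
    real_set: Sequence[R], styled_set: Sequence[S], num_pairs: int, seed: int = 0
) -> List[Tuple[R, S]]:
    """Draw one real and one styled portrait independently for each pair."""
    rng = random.Random(seed)
    return [
        (rng.choice(real_set), rng.choice(styled_set)) for _ in range(num_pairs)
    ]


# File names stand in for images; the two sets have different sizes.
pairs = make_unpaired_pairs(["r1", "r2", "r3"], ["s1", "s2"], num_pairs=5)
```

Independent sampling also works when the two sets differ in size, which the disclosure explicitly allows.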
At block 230, the facial feature information corresponding to the real portrait image of each real-styled portrait image pair is acquired, and the facial feature information corresponding to the styled portrait image of each real-styled portrait image pair is acquired.
At block 240, the cyclic generative adversarial network is trained using the real portrait image and the styled portrait image of each real-styled portrait image pair, the facial feature information of the real portrait image and the facial feature information of the styled portrait image.
The cyclic generative adversarial network includes a real-to-styled generator for transferring a real portrait to a styled portrait, a styled-to-real generator for transferring a styled portrait to a real portrait, a real portrait discriminator, and a styled portrait discriminator.
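The disclosure does not spell out the training losses. In the standard cyclic generative adversarial network formulation, the two generators are tied together by a cycle-consistency term: translating an image to the other domain and back should reproduce the original. The sketch below illustrates only that term, using an L1 distance and toy generators on flattened lists; it is an assumption about a typical setup, not the disclosed training objective, and it omits the adversarial terms contributed by the two discriminators.

```python
from typing import Callable, List

Image = List[float]  # flattened toy image
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    real: Image,
    styled: Image,
    real_to_styled: Generator,
    styled_to_real: Generator,
) -> float:
    """Sum of the reconstruction errors after a round trip in each direction."""
    real_cycle = styled_to_real(real_to_styled(real))
    styled_cycle = real_to_styled(styled_to_real(styled))
    return l1_distance(real_cycle, real) + l1_distance(styled_cycle, styled)


# Toy generators: doubling and halving are exact inverses, so the loss is 0.
def double(image: Image) -> Image:
    return [2 * p for p in image]


def halve(image: Image) -> Image:
    return [p / 2 for p in image]


def halve_with_bias(image: Image) -> Image:
    return [p / 2 + 0.1 for p in image]


perfect_loss = cycle_consistency_loss([1.0, 2.0], [4.0, 6.0], double, halve)
biased_loss = cycle_consistency_loss([1.0, 2.0], [4.0, 6.0], double, halve_with_bias)
```

With the biased inverse, the real-to-styled-to-real round trip is off by 0.1 per pixel and the styled-to-real-to-styled round trip by 0.2, so the loss rises to roughly 0.3.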
At block 250, it is determined whether a training ending condition is met. In cases where the training ending condition is met, the block 260 is executed. In cases where the training ending condition is not met, the block 240 is executed.
Detecting that the training ending condition is met can be detecting an instruction issued by the user for stopping the training, or detecting that the number of training cycles of the cyclic generative adversarial network equals a preset number. The training ending condition is not limited in the disclosure.
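The loop formed by blocks 240 and 250 can be sketched as below, covering the two example ending conditions: a user stop request and a preset number of training cycles. The names `train_until_done`, `train_one_cycle`, and `stop_requested` are hypothetical, and the training step itself is passed in as a callable so the sketch stays independent of any particular framework.

```python
from typing import Callable


def train_until_done(
    train_one_cycle: Callable[[int], None],
    max_cycles: int,
    stop_requested: Callable[[], bool] = lambda: False,
) -> int:
    """Repeat the training step until either ending condition is met.

    Returns the number of training cycles actually completed.
    """
    completed = 0
    while completed < max_cycles and not stop_requested():
        train_one_cycle(completed)  # block 240
        completed += 1              # block 250 is re-checked by the loop condition
    return completed


# Ends after the preset number of cycles.
log = []
cycles_run = train_until_done(log.append, max_cycles=3)

# Ends early once the (simulated) user asks to stop.
early_log = []
early_cycles = train_until_done(
    early_log.append, max_cycles=10, stop_requested=lambda: len(early_log) >= 2
)
```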
At block 260, the real-to-styled generator of the cyclic generative adversarial network is used as the style transfer model matching the preset drawing style.
When the training is over, the real-to-styled generator of the current cyclic generative adversarial network may be used as the style transfer model to convert the style of the image to be processed to the preset drawing style.
At block 270, it is determined whether the style of the image to be processed needs to be converted to the preset drawing style. In cases where the style needs to be converted, the block 280 is executed. In cases where the style does not need to be converted, the block 2110 is executed.
At block 280, the image to be processed is obtained. The image to be processed includes a face figure.
At block 290, facial feature information matching the face figure is extracted.
At block 2100, a style transfer model matching the preset drawing style is obtained. The image to be processed and the facial feature information are input into the style transfer model to obtain a style transferred image.
At block 2110, the method ends.
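The control flow of blocks 270 to 2110 can be summarized in a short function. This is a sketch under several assumptions: the function and parameter names are hypothetical, trained models are looked up in a dictionary keyed by style name, and the unchanged image is returned when no conversion is requested (the disclosure only states that the method ends in that case).

```python
from typing import Any, Callable, Dict, Optional

FeatureExtractor = Callable[[Any], Any]
StyleTransferModel = Callable[[Any, Any], Any]  # (image, facial features) -> styled image


def stylize_if_requested(
    image: Any,
    preset_style: Optional[str],
    extract_features: FeatureExtractor,
    models: Dict[str, StyleTransferModel],
) -> Any:
    """Convert the image to the preset drawing style when a style is requested."""
    if preset_style is None:            # block 270: no conversion needed
        return image                    # block 2110: the method ends
    features = extract_features(image)  # blocks 280 and 290
    model = models[preset_style]        # block 2100: model matching the style
    return model(image, features)


# Stub extractor and model, used only to show the data flow.
stub_models = {"oil": lambda img, feats: ("oil", img, feats)}
unchanged = stylize_if_requested("img", None, len, stub_models)
styled = stylize_if_requested("img", "oil", len, stub_models)
```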
With the technical solution according to embodiments, the style transfer model is obtained through the training using the real portrait image set, the styled portrait image set, the facial feature information of each real portrait image contained in the real portrait image set, and the facial feature information of each styled portrait image contained in the styled portrait image set. The facial feature information of the image to be processed is extracted. The image to be processed and the facial feature information are input to the style transfer model to obtain the style transferred image. This technical solution solves a problem existing in the prior art that the face figure may be deformed, the effect of integrating the preset drawing style into the image is poor, and the user experience is poor when the style of the image including the face figure is converted to the preset drawing style. The technical solution may improve the integrity, the consistency, and the aesthetics of the face image during the image conversion process, thereby improving the user experience.
The first image acquiring module 310 is configured to acquire an image to be processed. The image to be processed includes a face figure.
The information extracting module 320 is configured to extract facial feature information matching the face figure.
The second image acquiring module 330 is configured to obtain a style transferred image matching the image to be processed by converting a style of the image to be processed into a preset drawing style based on the facial feature information.
With the technical solution according to embodiments of the disclosure, by extracting the facial feature information of the image to be processed, the style of the image to be processed is converted to the preset drawing style based on the facial feature information, to obtain the style transferred image. This technical solution solves a problem existing in the prior art that the face figure may be deformed, the effect of integrating the preset drawing style into the image is poor, and the user experience is poor when the style of the image including the face figure is converted to the preset drawing style. The technical solution may improve the integrity, the consistency, and the aesthetics of the face image during the image conversion process, thereby improving the user experience.
In some examples, the facial feature information includes at least one of facial feature positions, facial feature sizes, face contour, and hairstyle contour.
In some embodiments, the second image acquiring module includes a first image acquiring unit.
The first image acquiring unit is configured to obtain the style transferred image by acquiring a style transfer model matching the preset drawing style, and inputting the image to be processed and the facial feature information into the style transfer model.
The style transfer model is obtained by training a cyclic generative adversarial network in advance using a real portrait image set, facial feature information matching each real portrait image contained in the real portrait image set, a styled portrait image set, and facial feature information matching each styled portrait image in the styled portrait image set. The styled portrait image set matches the preset drawing style.
In some examples, the device further includes an image set acquiring module, an image pair generating module, an information acquiring module, a training module, and a model generating module.
The image set acquiring module is configured to acquire the real portrait image set and acquire the styled portrait image set matching the preset drawing style.
The image pair generating module is configured to generate real-styled portrait image pairs based on the real portrait image set and the styled portrait image set.
The information acquiring module is configured to acquire the facial feature information corresponding to the real portrait image of each real-styled portrait image pair and the facial feature information corresponding to the styled portrait image of each real-styled portrait image pair.
The training module is configured to train the cyclic generative adversarial network using the real portrait image and the styled portrait image in each real-styled portrait image pair, as well as the facial feature information corresponding to the real portrait image and the facial feature information corresponding to the styled portrait image.
The cyclic generative adversarial network includes a real-to-styled generator for transferring a real portrait to a styled portrait, a styled-to-real generator for transferring a styled portrait to a real portrait, a real portrait discriminator and a styled portrait discriminator.
The model generating module is configured to determine the real-to-styled generator in the cyclic generative adversarial network as the style transfer model matching the preset drawing style, in response to detecting that a training ending condition is met.
In some embodiments, the image set acquiring module includes a first image set acquiring unit.
The first image set acquiring unit is configured to generate the real portrait image set from real portrait images acquired from a standard real portrait image database.
In some examples, the device further includes an image preprocessing module.
The image preprocessing module is configured to perform at least one image preprocessing on the real portrait images in the real portrait image set to obtain preprocessed real portrait images, and add the preprocessed real portrait images to the real portrait image set.
In some examples, the image set acquiring module further includes a second image acquiring unit, a third image acquiring unit and a second image set acquiring unit.
The second image acquiring unit is configured to acquire standard styled images matching the preset drawing style from a standard styled image database.
The third image acquiring unit is configured to acquire resultant images containing the face figure by filtering the standard styled images.
The second image set acquiring unit is configured to crop the resultant images to obtain cropped results having face regions, and generate the styled portrait image set from the cropped results.
The device for processing an image according to embodiments of the disclosure may be configured to execute the method for processing an image according to any of embodiments of the disclosure, and has the corresponding functional modules and beneficial effects of the method.
Embodiments of the disclosure further provide an electronic device and a computer-readable storage medium.
As illustrated in
The memory 402 is a non-transitory computer-readable storage medium according to embodiments of the disclosure. The memory is configured to store instructions executable by at least one processor, to cause the at least one processor to execute a method for processing an image according to embodiments of the disclosure. The non-transitory computer-readable storage medium according to embodiments of the disclosure is configured to store computer instructions. The computer instructions are configured to enable a computer to execute a method for processing an image according to embodiments of the disclosure.
As the non-transitory computer-readable storage medium, the memory 402 may be configured to store non-transitory software programs, non-transitory computer executable programs and modules, such as program instructions/modules (such as, a first image acquiring module 310, an information extracting module 320, and a second image acquiring module 330) corresponding to a method for processing an image according to embodiments of the disclosure. The processor 401 executes various functional applications and data processing of the server by operating non-transitory software programs, instructions and modules stored in the memory 402, that is, implements a method for processing an image according to embodiments of the disclosure.
The memory 402 may include a storage program region and a storage data region. The storage program region may store an operating system and an application required by at least one function. The storage data region may store data created through use of the electronic device implementing the method for processing an image. In addition, the memory 402 may include a high-speed random-access memory and may also include a non-transitory memory, such as at least one disk memory device, a flash memory device, or other non-transitory solid-state memory device. In some embodiments, the memory 402 may optionally include memories remotely located relative to the processor 401, which may be connected via a network to the electronic device configured to implement a method for processing an image. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network and combinations thereof.
The electronic device configured to implement a method for processing an image may also include: an input device 403 and an output device 404. The processor 401, the memory 402, the input device 403, and the output device 404 may be connected through a bus or in other means. In
The input device 403 may be configured to receive input digit or character information, and generate key signal input related to user settings and function control of the electronic device configured to implement a method for processing an image. The input device 403 may be, for example, a touch screen, a keypad, a mouse, a track pad, a touch pad, an indicator stick, one or more mouse buttons, a trackball, a joystick, or another input device. The output device 404 may include a display device, an auxiliary lighting device (e.g., LED), a haptic feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some embodiments, the display device may be a touch screen.
The various implementations of the system and technologies described herein may be implemented in a digital electronic circuit system, an integrated circuit system, an application specific integrated circuit (ASIC), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs. The one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special purpose or general-purpose programmable processor, may receive data and instructions from a storage system, at least one input device and at least one output device, and may transmit the data and the instructions to the storage system, the at least one input device and the at least one output device.
These computing programs (also called programs, software, software applications, or codes) include machine instructions of programmable processors, and may be implemented by utilizing high-level procedures and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms “machine readable medium” and “computer readable medium” refer to any computer program product, device, and/or apparatus (such as, a magnetic disk, an optical disk, a memory, a programmable logic device (PLD)) for providing machine instructions and/or data to a programmable processor, including machine readable medium that receives machine instructions as machine readable signals. The term “machine readable signal” refers to any signal for providing the machine instructions and/or data to the programmable processor.
To provide interaction with a user, the system and technologies described herein may be implemented on a computer. The computer has a display device (such as, a CRT (cathode ray tube) or a LCD (liquid crystal display) monitor) for displaying information to the user, a keyboard and a pointing device (such as, a mouse or a trackball), through which the user may provide the input to the computer. Other types of devices may also be configured to provide interaction with the user. For example, the feedback provided to the user may be any form of sensory feedback (such as, visual feedback, auditory feedback, or tactile feedback), and the input from the user may be received in any form (including acoustic input, voice input or tactile input).
The system and technologies described herein may be implemented in a computing system including a background component (such as, a data server), a computing system including a middleware component (such as, an application server), or a computing system including a front-end component (such as, a user computer having a graphical user interface or a web browser through which the user may interact with embodiments of the system and technologies described herein), or a computing system including any combination of such background component, middleware component, or front-end component. Components of the system may be connected to each other through digital data communication in any form or medium (such as, a communication network). Examples of the communication network include a local area network (LAN), a wide area network (WAN), and the Internet.
The computer system may include a client and a server. The client and the server are generally remote from each other and usually interact via the communication network. A relationship between the client and the server is generated by computer programs operated on a corresponding computer and having a client-server relationship with each other.
It should be understood that steps may be reordered, added or deleted by utilizing flows in the various forms illustrated above. For example, the steps described in the disclosure may be executed in parallel, sequentially or in different orders, so long as desired results of the technical solution disclosed by the disclosure may be achieved, which is not limited herein.
The above detailed implementations do not limit the protection scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made based on design requirements and other factors. Any modification, equivalent substitution and improvement made within the spirit and the principle of the disclosure shall be included in the protection scope of the disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202010537221.8 | Jun 2020 | CN | national |