This application claims priority to the Chinese patent application filed with the Chinese Patent Office on Oct. 25, 2021, with application No. 202111241440.2, the entirety of which is incorporated herein by reference.
The embodiments of the present disclosure relate to the field of image processing technology, for example, to a method, apparatus, device, and storage medium for generating a character style profile image.
With the development of technology, more and more applications (APPs) have entered users' lives and gradually enriched their leisure time, such as short video APPs and photo editing APPs like Ulike and Xingtu. Among these, transforming character profile images into images of various styles has become increasingly popular among users.
The embodiments of the present disclosure provide a method, apparatus, device, and storage medium for generating a character style profile image, which can generate a character profile image with a set style, thereby improving the diversity of images.
The embodiments of the present disclosure provide a method for generating a character style profile image, including:
The embodiments of the present disclosure also provide an apparatus for generating a character style profile image, including:
The embodiments of the present disclosure also provide an electronic device, the electronic device including:
The embodiments of the present disclosure also provide a computer-readable medium having a computer program stored thereon, which, when executed by a processing apparatus, implements the method for generating a character style profile image as described in the embodiments of the present disclosure.
The embodiments of the present disclosure will be described below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and is not limited to the embodiments set forth herein. It should be understood that the drawings and embodiments of the present disclosure are provided for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
The term “including” and its variations as used herein denote non-exclusive inclusion, i.e., “including but not limited to”. The term “based on” means “at least partially based on”. The term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; and the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the following description.
It should be noted that the concepts of “first” and “second” mentioned in this disclosure are only used to distinguish different apparatuses, modules, or units, but are not used to limit the order or interdependence of the functions performed by these apparatuses, modules, or units.
It should be noted that the modifiers “one” and “a plurality of” mentioned in this disclosure are illustrative rather than limiting. Those skilled in the art should understand that, unless otherwise indicated in the context, they should be understood as “one or more”.
The names of the messages or information exchanged between a plurality of apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
Step 110, inputting an original character profile image into a first feature encoder to obtain a first character profile feature code;
The original character profile image may be an image containing a character profile, and may be captured by a camera of a terminal device or obtained from a database. The first feature encoder may encode the input character profile image to obtain the first character profile feature code. The first character profile feature code may be represented by a multidimensional matrix.
In this embodiment, the first feature encoder is composed of a set neural network, and is obtained by training based on the character profile sample images.
A way for training the first feature encoder may include: obtaining a character profile sample image; inputting the character profile sample image into a first feature encoder to be trained to obtain a first sample character profile feature code; inputting the first sample character profile feature code into a character profile generative model to obtain a first reconstructed character profile image; and training the first feature encoder to be trained based on a loss function between the first reconstructed character profile image and the character profile sample image to obtain the first feature encoder.
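By way of illustration only, the following is a minimal PyTorch-style sketch of this training procedure. The encoder architecture, the frozen generator interface, and the pixel-wise L2 reconstruction loss are assumptions for the sketch; the disclosure only specifies a loss between the first reconstructed character profile image and the character profile sample image.

```python
import torch
import torch.nn as nn

# Hypothetical encoder; the disclosure only says "a set neural network".
class FirstFeatureEncoder(nn.Module):
    def __init__(self, code_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, code_dim),
        )

    def forward(self, x):
        return self.net(x)

def train_first_encoder(encoder, generator, loader, epochs=10, lr=1e-4):
    """Train the encoder so that generator(encoder(x)) reconstructs x.

    `generator` is the pre-trained character profile generative model,
    kept frozen; it is assumed to map a feature code back to an image
    of the same size as the sample images.
    """
    generator.eval()
    for p in generator.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    recon_loss = nn.MSELoss()  # assumed pixel-wise L2 loss
    for _ in range(epochs):
        for sample in loader:                 # character profile sample images
            code = encoder(sample)            # first sample character profile feature code
            recon = generator(code)           # first reconstructed character profile image
            loss = recon_loss(recon, sample)  # loss between reconstruction and sample
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder
```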
The character profile generative model may be a model obtained by training a generative adversarial network, e.g., by cross-iteratively training a generative model and a discriminative model until an accuracy of a discrimination result output by the discriminative model meets a set condition.
The process of cross-iteratively training includes: inputting first random noise data into the generative model to obtain a first character profile image; inputting the first character profile image and a first character profile sample image into the discriminative model to obtain a first discrimination result; adjusting parameters in the generative model based on the first discrimination result; inputting second random noise data into the adjusted generative model to obtain a second character profile image; inputting the second character profile image and a second character profile sample image into the discriminative model to obtain a second discrimination result, and determining a real discrimination result between the second character profile image and the second character profile sample image; and adjusting parameters in the discriminative model according to a loss function between the second discrimination result and the real discrimination result.
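For illustration, one round of this alternating scheme may be sketched as follows in the same PyTorch style; the binary cross-entropy losses, the logit-valued discriminator outputs, and the noise dimension are assumptions, since the disclosure does not name specific loss functions for the generator update.

```python
import torch
import torch.nn.functional as F

def cross_iterative_round(G, D, opt_G, opt_D, sample1, sample2, noise_dim=512):
    """One round: adjust the generative model, then the discriminative model."""
    # Generator step: first random noise data -> first character profile image.
    z1 = torch.randn(sample1.size(0), noise_dim)
    fake1 = G(z1)
    d_fake1 = D(fake1)  # first discrimination result (logits, assumed)
    # Adjust G so its outputs are judged real (assumed non-saturating GAN loss).
    g_loss = F.binary_cross_entropy_with_logits(d_fake1, torch.ones_like(d_fake1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()

    # Discriminator step: second random noise data -> second character profile image.
    z2 = torch.randn(sample2.size(0), noise_dim)
    fake2 = G(z2).detach()
    d_fake2, d_real2 = D(fake2), D(sample2)  # second discrimination result
    # "Real discrimination result": generated images labeled 0, sample images labeled 1.
    d_loss = (F.binary_cross_entropy_with_logits(d_fake2, torch.zeros_like(d_fake2))
              + F.binary_cross_entropy_with_logits(d_real2, torch.ones_like(d_real2)))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()
```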
The first character profile sample image and the second character profile sample image are sample images in the obtained character profile sample images.
In this embodiment, the first feature encoder is trained based on the character profile generative model that has been trained.
Step 120, determining an attribute increment between the original character profile image and a template image.
Attributes may include age, hair color, gender, a deflection angle of the profile, whether the eyes of the profile are open, etc. The template image may be an image that matches the character style. For example, if the character style is a “Halloween” style, the template image is an image that matches the “Halloween” style.
In this embodiment, a way for determining the attribute increment between the original character profile image and the template image may include: inputting the original character profile image into an attribute recognizer to obtain character attribute information; inputting the template image into the attribute recognizer to obtain template attribute information; and calculating a difference between the character attribute information and the template attribute information to obtain the attribute increment. The attribute recognizer can be constructed based on a set neural network.
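As a concrete illustration, if the attribute recognizer is assumed to output a fixed-length numeric vector per image, the attribute increment reduces to an element-wise difference; the vector layout below is hypothetical.

```python
import numpy as np

def attribute_increment(recognizer, original_image, template_image):
    """Attribute increment between an original profile image and a template image.

    `recognizer` is assumed to map an image to a vector such as
    [age, hair_color_id, gender, deflection_angle, eyes_open].
    The sign convention (template minus original) is an assumption; the
    disclosure only specifies "a difference" between the two.
    """
    char_attrs = np.asarray(recognizer(original_image), dtype=np.float32)
    tmpl_attrs = np.asarray(recognizer(template_image), dtype=np.float32)
    return tmpl_attrs - char_attrs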
Step 130, inputting the attribute increment and the first character profile feature code into a second feature encoder to obtain a second character profile feature code.
The second character profile feature code can be understood as a character profile feature code to which the attribute increment information has been added. The second feature encoder may encode the input attribute increment and first character profile feature code to obtain the second character profile feature code. The second character profile feature code may be represented by a multidimensional matrix.
In this embodiment, the second feature encoder may be obtained by training based on the character profile generative model and the first feature encoder that have been trained. For the training processes of the character profile generative model and the first feature encoder, reference may be made to the above embodiments, which will not be repeated here.
A way for training the second feature encoder includes: obtaining a character profile sample image; inputting the character profile sample image into the first feature encoder to obtain a second sample character profile feature code; inputting the second sample character profile feature code into a character profile generative model to obtain a second reconstructed character profile image; inputting the second sample character profile feature code and a real attribute increment into a second feature encoder to be trained to obtain a third sample character profile feature code; inputting the third sample character profile feature code into the character profile generative model to obtain an edited character profile image; determining a predictive attribute increment between the second reconstructed character profile image and the edited character profile image; and training the second feature encoder to be trained based on a loss function between the predictive attribute increment and the real attribute increment to obtain the second feature encoder.
The character profile sample images may be a large number of character profile images captured at different angles or under different lighting. A way for determining the predictive attribute increment between the second reconstructed character profile image and the edited character profile image may include: inputting the second reconstructed character profile image and the edited character profile image respectively into the attribute recognizer to obtain their attribute information, and then calculating the difference between the attribute information of the second reconstructed character profile image and the attribute information of the edited character profile image to obtain the predictive attribute increment.
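Under the same assumptions as the earlier sketches, the second-encoder training loop might look as follows; in particular, sampling the "real attribute increment" at random and using a differentiable attribute recognizer are illustrative choices not specified by the disclosure.

```python
import torch
import torch.nn.functional as F

def train_second_encoder(enc2, enc1, generator, recognizer, loader,
                         epochs=10, lr=1e-4, attr_dim=5):
    """Train enc2 so the increment it induces matches the requested one.

    enc1, generator, and recognizer are pre-trained and frozen; the
    recognizer is assumed differentiable so the increment loss can
    backpropagate into enc2.
    """
    for m in (enc1, generator, recognizer):
        m.eval()
        for p in m.parameters():
            p.requires_grad_(False)
    opt = torch.optim.Adam(enc2.parameters(), lr=lr)
    for _ in range(epochs):
        for sample in loader:
            code2 = enc1(sample)                              # second sample feature code
            recon2 = generator(code2)                         # second reconstructed image
            real_inc = torch.randn(sample.size(0), attr_dim)  # assumed source of real increments
            code3 = enc2(code2, real_inc)                     # third sample feature code
            edited = generator(code3)                         # edited character profile image
            pred_inc = recognizer(edited) - recognizer(recon2)  # predictive attribute increment
            loss = F.mse_loss(pred_inc, real_inc)             # assumed L2 loss between increments
            opt.zero_grad()
            loss.backward()
            opt.step()
    return enc2
```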
Step 140, inputting the second character profile feature code into a style profile generative model to obtain an initial character style profile image.
The style profile generative model may transform the character profile into a character profile with a set style. In this embodiment, the set style can be a “Halloween” style.
In this embodiment, the style profile generative model may be obtained by training based on the trained character profile generative model, i.e., by cross-iteratively training the character profile generative model and a character profile discriminative model until an accuracy of a discrimination result output by the character profile discriminative model meets a set condition, and determining the trained character profile generative model as the style profile generative model. The training process of the character profile generative model can refer to the above embodiments, and is not repeated here.
The process of cross-iteratively training includes: obtaining a set style character profile sample image; inputting first random noise data into the character profile generative model to obtain a first style character profile image; inputting the first style character profile image and the set style character profile sample image into the character profile discriminative model to obtain a first discrimination result; adjusting parameters in the character profile generative model based on the first discrimination result; inputting second random noise data into the adjusted character profile generative model to obtain a second style character profile image; inputting the second style character profile image and the set style character profile sample image into the character profile discriminative model to obtain a second discrimination result, and determining a real discrimination result between the second style character profile image and the set style character profile sample image; and adjusting parameters in the character profile discriminative model according to a loss function between the second discrimination result and the real discrimination result.
The set style character profile sample image can be a character profile image with a “Halloween” style, which can be obtained through virtual character rendering or retouching.
Step 150, merging the initial character style profile image into the template image to obtain a target character style profile image.
The template image can be an image that matches the set style. For example, if the set style is a “Halloween” style, the template image is an image that matches the “Halloween” style.
In this embodiment, in order to ensure that the size and position of the character style profile image match those of the template image, it is necessary to adjust the initial character style profile image.
For example, the process of merging the initial character style profile image into the template image to obtain a target character style profile image includes: translating a position of a character style profile in the initial character style profile image; and merging the initial character style profile image after translation into the template image to obtain the target character style profile image.
For example, the character style profile may be translated to a center of the initial character style profile image.
Optionally, a way for translating the character style profile in the initial character style profile image to the center of the initial character style profile image may include aligning the central key point of the character style profile with the central point of the initial character style profile image.
A distance difference between a horizontal coordinate of a central key point of the character style profile and a horizontal coordinate of a central point of the initial character style profile image is calculated, and the distance difference between the horizontal coordinate of the central key point of the character style profile and the horizontal coordinate of the central point of the initial character style profile image is determined as a horizontal distance difference. A distance difference between a vertical coordinate of the central key point of the character style profile and a vertical coordinate of the central point of the initial character style profile image is calculated, and the distance difference between the vertical coordinate of the central key point of the character style profile and the vertical coordinate of the central point of the initial character style profile image is determined as a vertical distance difference. The character style profile is translated along the horizontal direction according to the horizontal distance difference, and the character style profile is translated along the vertical direction according to the vertical distance difference, until the central key point of the character style profile is aligned with the central point of the initial character style profile image.
Optionally, the way for translating the character style profile in the initial character style profile image to the center of the initial character style profile image may include: obtaining a vertical standard line and a horizontal standard line of the initial character style profile image; extracting a central key point and a mouth corner key point of the character style profile in the initial character style profile image; determining a distance difference between a vertical coordinate of the central key point and the vertical standard line, and determining the distance difference between the vertical coordinate of the central key point and the vertical standard line as a first distance difference; determining a distance difference between a horizontal coordinate of the mouth corner key point and the horizontal standard line, and determining the distance difference between the horizontal coordinate of the mouth corner key point and the horizontal standard line as a second distance difference; and translating the character style profile along a vertical direction according to the first distance difference, and translating the character style profile along a horizontal direction according to the second distance difference, to translate the character style profile to the center of the initial character style profile image.
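The following NumPy/OpenCV sketch illustrates this translation, assuming the key points are available as (x, y) pixel coordinates and defaulting the standard lines to the image's own center lines; the code mirrors the text's naming convention, in which the "vertical standard line" is the reference for vertical (y) coordinates and the "horizontal standard line" the reference for horizontal (x) coordinates.

```python
import cv2
import numpy as np

def center_profile(image, center_kp, mouth_kp, v_std=None, h_std=None):
    """Translate the character style profile toward the image center.

    center_kp, mouth_kp: (x, y) key points of the stylized profile.
    v_std / h_std: standard-line positions; defaulting them to the
    image's mid-height and mid-width is an assumption.
    """
    h, w = image.shape[:2]
    v_std = h / 2 if v_std is None else v_std  # reference for y coordinates
    h_std = w / 2 if h_std is None else h_std  # reference for x coordinates
    dy = v_std - center_kp[1]  # first distance difference -> vertical translation
    dx = h_std - mouth_kp[0]   # second distance difference -> horizontal translation
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(image, M, (w, h))
```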
The vertical standard line and the horizontal standard line can be set according to the size of the initial character style profile image and the user's needs.
The process of merging the initial character style profile image into the template image to obtain a target character style profile image may include: recognizing a template character profile in the template image to obtain a recognition rectangle box; cropping the initial character style profile image into an image of a set size according to the recognition rectangle box; pasting the image of the set size into the recognition rectangle box; obtaining a character profile mask image of the template image; and merging the image of the set size pasted into the recognition rectangle box into the template image based on the character profile mask image, to obtain the target character style profile image.
The set size may be determined by the size of the recognition rectangle box, so that the size of the cropped initial character style profile image is the same as that of the recognition rectangle box. The character profile mask image can be understood as a binary image composed of the areas enclosed by the template character profile in the template image.
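A short sketch of the crop-and-paste step, assuming a recognition rectangle box (x, y, w, h) has already been produced by some profile detector and that the stylized image is at least as large as the box; detection itself is out of scope here.

```python
import numpy as np

def crop_and_paste(styled, template, box):
    """Center-crop the stylized image to the box size and paste it into the template.

    box: (x, y, w, h) recognition rectangle in template coordinates.
    The "set size" is taken to be the box size, per the text above.
    """
    x, y, w, h = box
    H, W = styled.shape[:2]                 # assumed H >= h and W >= w
    x0, y0 = (W - w) // 2, (H - h) // 2
    cropped = styled[y0:y0 + h, x0:x0 + w]  # image of the set size
    out = template.copy()
    out[y:y + h, x:x + w] = cropped         # paste into the recognition box
    return out
```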
In this embodiment, the merging of the image of the set size into the template image based on the character profile mask image may include calculating according to the following formula: R = mask * output + (1 − mask) * template, where R is a pixel matrix of the target character style profile image, mask is a pixel matrix of the character profile mask image, output is the pixel matrix of the image of the set size, and template is the pixel matrix of the template image.
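The formula translates directly into NumPy; this sketch assumes the pasted image and the template have the same resolution and that the mask is normalized to [0, 1].

```python
import numpy as np

def merge_with_mask(output, template, mask):
    """R = mask * output + (1 - mask) * template, applied per pixel.

    output:   stylized image already pasted at the recognition box, HxWx3
    template: template image, HxWx3
    mask:     character profile mask, HxW with 1 inside the profile area
    """
    m = mask.astype(np.float32)[..., None]  # broadcast the mask over channels
    blended = m * output.astype(np.float32) + (1.0 - m) * template.astype(np.float32)
    return np.clip(blended, 0, 255).astype(np.uint8)
```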
The technical solution of the embodiment of the present disclosure inputs an original character profile image into a first feature encoder to obtain a first character profile feature code; determines an attribute increment between the original character profile image and a template image; inputs the attribute increment and the first character profile feature code into a second feature encoder to obtain a second character profile feature code; inputs the second character profile feature code into a style profile generative model to obtain an initial character style profile image; and merges the initial character style profile image into the template image to obtain a target character style profile image. The method for generating a character style profile image provided in the embodiments of the present disclosure can generate a character profile image with a set style, thereby improving the diversity of images.
Optionally, the target character style profile image obtaining module 250 is configured to translate a character style profile in the initial character style profile image to a center of the image; and merge the initial character style profile image after translation into the template image to obtain the target character style profile image.
Optionally, the target character style profile image obtaining module 250 is configured to translate the character style profile in the initial character style profile image to the center of the image by: obtaining a vertical standard line and a horizontal standard line of the initial character style profile image; extracting a central key point and a mouth corner key point of the character style profile in the initial character style profile image; determining a distance difference between a vertical coordinate of the central key point and the vertical standard line, and determining the distance difference between the vertical coordinate of the central key point and the vertical standard line as a first distance difference; determining a distance difference between a horizontal coordinate of the mouth corner key point and the horizontal standard line, and determining the distance difference between the horizontal coordinate of the mouth corner key point and the horizontal standard line as a second distance difference; and translating the character style profile along a vertical direction according to the first distance difference, and translating the character style profile along a horizontal direction according to the second distance difference, to translate the character style profile to the center of the image.
Optionally, the target character style profile image obtaining module 250 is configured to recognize a template character profile in the template image to obtain a recognition rectangle box; crop the initial character style profile image into an image of a set size according to the recognition rectangle box; paste the image of the set size into the recognition rectangle box; obtain a character profile mask image of the template image; and merge the image of the set size pasted into the recognition rectangle box into the template image based on the character profile mask image, to obtain the target character style profile image.
Optionally, the apparatus for generating a character style profile image further includes a first feature encoder training module configured to: obtain a character profile sample image; input the character profile sample image into a first feature encoder to be trained to obtain a first sample character profile feature code; input the first sample character profile feature code into a character profile generative model to obtain a first reconstructed character profile image; and train the first feature encoder to be trained based on a loss function between the first reconstructed character profile image and the character profile sample image to obtain the first feature encoder.
Optionally, the apparatus for generating a character style profile image further includes a second feature encoder training module configured to: obtain a character profile sample image; input the character profile sample image into the first feature encoder to obtain a second sample character profile feature code; input the second sample character profile feature code into a character profile generative model to obtain a second reconstructed character profile image; input the second sample character profile feature code and a real attribute increment into a second feature encoder to be trained to obtain a third sample character profile feature code; input the third sample character profile feature code into the character profile generative model to obtain an edited character profile image; determine a predictive attribute increment between the second reconstructed character profile image and the edited character profile image; and train the second feature encoder to be trained based on a loss function between the predictive attribute increment and the real attribute increment to obtain the second feature encoder.
Optionally, the apparatus for generating a character style profile image further includes a style profile generative model training module configured to: cross-iteratively train a character profile generative model and a character profile discriminative model until an accuracy of a discrimination result output by the character profile discriminative model meets a set condition, and determine the trained character profile generative model as the style profile generative model; a process of cross-iteratively training includes: obtain a set style character profile sample image; input first random noise data into the character profile generative model to obtain a first style character profile image; input the first style character profile image and the set style character profile sample image into the character profile discriminative model to obtain a first discrimination result; adjust parameters in the character profile generative model based on the first discrimination result; input second random noise data into the adjusted character profile generative model to obtain a second style character profile image; input the second style character profile image and the set style character profile sample image into the character profile discriminative model to obtain a second discrimination result, and determine a real discrimination result between the second style character profile image and the set style character profile sample image; and adjust parameters in the character profile discriminative model according to a loss function between the second discrimination result and the real discrimination result.
The above apparatus can execute the methods provided in all the previous embodiments of the present disclosure, and has the corresponding functional modules and effects for executing the above methods. Technical details not described in this embodiment can be found in the description of the methods provided in all the previous embodiments of the present disclosure.
Referring now to the drawings, a schematic structural diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure is shown.
As shown in the drawings, the electronic device 300 may include a processing apparatus 301, a read-only memory (ROM) 302, a storage apparatus 308, a communication apparatus 309, and an input/output (I/O) interface 305.
Typically, the following apparatuses can be connected to the I/O interface 305: input apparatuses 306 including, for example, touch screens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output apparatuses 307 including liquid crystal displays (LCDs), speakers, vibrators, etc.; storage apparatuses 308 including magnetic tapes, hard disks, etc.; and a communication apparatus 309. The communication apparatus 309 may allow the electronic device 300 to communicate with other apparatuses by wire or wirelessly to exchange data.
According to embodiments of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, and the computer program product includes a computer program carried on a computer-readable medium, where the computer program includes program code for performing the method for generating a character style profile image. In such an embodiment, the computer program can be downloaded and installed from a network through the communication apparatus 309, or installed through the storage apparatus 308, or installed through the ROM 302. When the computer program is executed by the processing apparatus 301, the above functions defined in the method of the embodiment of the present disclosure are performed.
It should be noted that the computer-readable medium described above can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. The computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. The computer-readable storage medium may include but is not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any combination thereof. In the present disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by an instruction execution system, apparatus, or device, or can be used in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit programs for use by or in conjunction with instruction execution systems, apparatuses, or devices. The program code contained on the computer-readable medium may be transmitted using any suitable medium, including but not limited to: wires, optical cables, RF (radio frequency), etc., or any combination thereof.
In some embodiments, clients and servers can communicate using any currently known or future developed network protocol such as Hyper Text Transfer Protocol (HTTP), and can be interconnected with any form or medium of digital data communication (such as communication networks). Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (such as the Internet), and peer-to-peer networks (such as ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer-readable medium may be included in the electronic device or may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device performs the following steps: inputting an original character profile image into a first feature encoder to obtain first character profile feature code; determining an attribute increment between the original character profile image and a template image; inputting the attribute increment and the first character profile feature code into a second feature encoder to obtain a second character profile feature code; inputting the second character profile feature code into a style profile generative model to obtain an initial character style profile image; and merging the initial character style profile image into the template image to obtain a target character style profile image.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages, such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the “C” language or similar programming languages. The program code may be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (e.g., via the Internet through an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of the system, method, and computer program product of various embodiments of the present disclosure. Each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in a different order than those marked in the drawings. For example, two consecutive blocks may actually be executed in parallel, or they may sometimes be executed in reverse order, depending on the function involved. It should also be noted that each block in the block diagrams and/or flowcharts, as well as combinations of blocks in the block diagrams and/or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or operations, or may be implemented using a combination of dedicated hardware and computer instructions.
The units in the embodiments of the present disclosure may be implemented by means of software or hardware, and the name of the unit does not constitute a limitation on the unit itself.
The functions described herein above can be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.
In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store programs for use by or in conjunction with instruction execution systems, apparatuses, or devices. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination thereof. Specific examples of the machine-readable storage medium may include electrical connections based on one or more wires, portable computer disks, hard disks, RAM, ROM, EPROM or flash memory, optical fibers, CD-ROM, optical storage devices, magnetic storage devices, or any combination thereof.
According to one or more embodiments of the present disclosure, the present disclosure discloses a method for generating a character style profile image, including: inputting an original character profile image into a first feature encoder to obtain first character profile feature code; determining an attribute increment between the original character profile image and a template image; inputting the attribute increment and the first character profile feature code into a second feature encoder to obtain a second character profile feature code; inputting the second character profile feature code into a style profile generative model to obtain an initial character style profile image; and merging the initial character style profile image into the template image to obtain a target character style profile image.
The merging the initial character style profile image into the template image to obtain a target character style profile image, includes: translating a position of a character style profile in the initial character style profile image; and merging the initial character style profile image after translation into the template image to obtain the target character style profile image.
The translating a position of a character style profile in the initial character style profile image, includes: obtaining a vertical standard line and a horizontal standard line of the initial character style profile image; extracting a central key point and a mouth corner key point of the character style profile in the initial character style profile image; determining a distance difference between a vertical coordinate of the central key point and the vertical standard line, and determining the distance difference between the vertical coordinate of the central key point and the vertical standard line as a first distance difference; determining a distance difference between a horizontal coordinate of the mouth corner key point and the horizontal standard line, and determining the distance difference between the horizontal coordinate of the mouth corner key point and the horizontal standard line as a second distance difference; and translating the character style profile along a vertical direction according to the first distance difference, and translating the character style profile along a horizontal direction according to the second distance difference.
The merging the initial character style profile image into the template image to obtain a target character style profile image, includes: recognizing a template character profile in the template image to obtain a recognition rectangle box; cropping the initial character style profile image into an image of a set size according to the recognition rectangle box; pasting the image of the set size into the recognition rectangle box; obtaining a character profile mask image of the template image; and merging the image of the set size pasted into the recognition rectangle box into the template image based on the character profile mask image, to obtain the target character style profile image.
The way for training the first feature encoder includes: obtaining a character profile sample image; inputting the character profile sample image into a first feature encoder to be trained to obtain a first sample character profile feature code; inputting the first sample character profile feature code into a character profile generative model to obtain a first reconstructed character profile image; and training the first feature encoder to be trained based on a loss function between the first reconstructed character profile image and the character profile sample image to obtain the first feature encoder.
The way for training the second feature encoder includes: obtaining a character profile sample image; inputting the character profile sample image into the first feature encoder to obtain a second sample character profile feature code; inputting the second sample character profile feature code into a character profile generative model to obtain a second reconstructed character profile image; inputting the second sample character profile feature code and a real attribute increment into a second feature encoder to be trained to obtain a third sample character profile feature code; inputting the third sample character profile feature code into the character profile generative model to obtain an edited character profile image; determining a predictive attribute increment between the second reconstructed character profile image and the edited character profile image; and training the second feature encoder to be trained based on a loss function between the predictive attribute increment and the real attribute increment to obtain the second feature encoder.
The way for training the style profile generative model includes: cross-iteratively training a character profile generative model and a character profile discriminative model until an accuracy of a discrimination result output by the character profile discriminative model meets a set condition, and determining the trained character profile generative model as the style profile generative model; a process of cross-iteratively training includes: obtaining a set style character profile sample image; inputting first random noise data into the character profile generative model to obtain a first style character profile image; inputting the first style character profile image and the set style character profile sample image into the character profile discriminative model to obtain a first discrimination result; adjusting parameters in the character profile generative model based on the first discrimination result; inputting second random noise data into the adjusted character profile generative model to obtain a second style character profile image; inputting the second style character profile image and the set style character profile sample image into the character profile discriminative model to obtain a second discrimination result, and determining a real discrimination result between the second style character profile image and the set style character profile sample image; and adjusting parameters in the character profile discriminative model according to a loss function between the second discrimination result and the real discrimination result.
Number | Date | Country | Kind
---|---|---|---
202111241440.2 | Oct. 25, 2021 | CN | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/CN2022/127195 | 10/25/2022 | WO |