This application claims the benefit of Taiwan application Serial No. 109129193, filed Aug. 26, 2020, the disclosure of which is incorporated by reference herein in its entirety.
The disclosure relates in general to an image correction method and system, and more particularly to an image correction method and system based on deep learning.
In the field of image recognition, particularly the recognition of characters in an image, a local image containing the target characters is first located in the image and then corrected into a front-view image for the subsequent recognition model to perform character recognition. An image correction procedure converts images captured at different view angles and distances into front-view images with the same angle and distance, which speeds up the learning of the recognition model and increases the recognition accuracy.
However, in the current technology, the image correction procedure still depends on conventional image processing methods, in which the rotation parameters are found manually and repeatedly adjusted to increase the accuracy of the image correction procedure. Although the image correction procedure can be performed using artificial intelligence (AI) technology, such a procedure can only find clockwise or anticlockwise rotation angles and cannot handle more complicated image processing, such as scaling, shifting or tilting the image.
Therefore, it has become a prominent task for the industry to efficiently and correctly convert various images into front-view images.
The disclosure is directed to an image correction method and system based on deep learning. The perspective transformation parameters for the image correction procedure are found by a deep learning model and used to efficiently correct various images into front-view images; the deep learning model is further updated using a loss value to increase the recognition accuracy.
According to one embodiment, an image correction method based on deep learning is provided. The image correction method includes the following steps. An image containing at least one character is received by a deep learning model, and a perspective transformation matrix is generated according to the image. A perspective transformation on the image is performed according to the perspective transformation matrix, and a corrected image containing a front view of the at least one character is obtained. An optimized corrected image containing the front view of the at least one character is generated according to the image. An optimized perspective transformation matrix corresponding to the image and the optimized corrected image is obtained. A loss value between the optimized perspective transformation matrix and the perspective transformation matrix is calculated. The deep learning model is updated using the loss value.
According to another embodiment, an image correction system based on deep learning is provided. The image correction system includes a deep learning model, a processing unit and a model adjustment unit. The deep learning model is configured to receive an image containing at least one character and generate a perspective transformation matrix according to the image. The processing unit is configured to receive the image and the perspective transformation matrix and perform a perspective transformation on the image according to the perspective transformation matrix to obtain a corrected image containing a front view of the at least one character. The model adjustment unit is configured to receive the image, generate an optimized corrected image containing the front view of the at least one character according to the image, obtain an optimized perspective transformation matrix corresponding to the image and the optimized corrected image, calculate a loss value between the optimized perspective transformation matrix and the perspective transformation matrix, and update the deep learning model using the loss value.
The above and other aspects of the disclosure will become better understood with regard to the following detailed description of the preferred but non-limiting embodiment(s). The following description is made with reference to the accompanying drawings.
In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, that one or more embodiments may be practiced without these specific details. In other instances, well-known structures and devices are schematically shown in order to simplify the drawing.
Referring to the accompanying drawings, an image correction system 100 based on deep learning according to an embodiment includes a deep learning model 110, a processing unit 120 and a model adjustment unit 130. Refer also to the flowchart of the image correction method based on deep learning, which includes steps S110 to S130 described below.
In step S110, an image IMG1 containing at least one character is received by the deep learning model 110, and a perspective transformation matrix T is generated according to the image IMG1. The image IMG1 can be any image containing at least one character, such as an image of a vehicle plate, a road sign, a serial number or a signboard. The at least one character is, for example, a number, an English letter, a hyphen, a punctuation mark or a combination thereof.
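By way of illustration only, such a deep learning model could be a small convolutional network that regresses the eight free parameters of the matrix T. The following PyTorch sketch, including the name PerspectiveNet and all layer sizes, is an assumption for illustration, not the architecture of the disclosed embodiment:

```python
import torch
import torch.nn as nn

class PerspectiveNet(nn.Module):
    """Hypothetical sketch: image in, 3x3 perspective transformation matrix out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 8)  # predicts T11..T32; T33 is fixed to 1

    def forward(self, x):                                    # x: (batch, 3, H, W)
        p = self.head(self.features(x).flatten(1))           # (batch, 8)
        ones = torch.ones(p.size(0), 1, device=p.device)
        return torch.cat([p, ones], dim=1).view(-1, 3, 3)    # (batch, 3, 3)
```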
In step S120, a perspective transformation is performed on the image IMG1 by the processing unit 120 according to the perspective transformation matrix T to obtain a corrected image IMG2 containing a front view of the at least one character. That is, the processing unit 120 applies the perspective transformation matrix T to convert the image IMG1 into the corrected image IMG2.
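For example, assuming the matrix T is available as a NumPy array, the perspective transformation could be carried out with OpenCV's cv2.warpPerspective; the output size below is an assumed plate-like value, not one specified by the disclosure:

```python
import cv2
import numpy as np

def correct_image(image: np.ndarray, T: np.ndarray,
                  out_size=(200, 60)) -> np.ndarray:
    """Warp `image` with the 3x3 matrix T into the front-view corrected
    image. `out_size` is (width, height) of the assumed target."""
    return cv2.warpPerspective(image, T, out_size)
```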
In step S130, the deep learning model 110 is updated by the model adjustment unit 130 using a loss value L. Step S130 includes steps S131 to S135, described below.
In step S131, the image IMG1 is marked by the model adjustment unit 130, wherein the mark contains a mark range covering the at least one character.
In step S132, an optimized corrected image containing the front view of the at least one character is generated by the model adjustment unit 130 according to the image IMG1 and the mark obtained in step S131.
In step S133, an optimized perspective transformation matrix corresponding to the image IMG1 and the optimized corrected image is obtained by the model adjustment unit 130. Because the image IMG1 and the optimized corrected image are related by a perspective transformation, the model adjustment unit 130 can calculate the perspective transformation matrix between the two images and use it as the optimized perspective transformation matrix.
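As one hedged illustration, when the mark range of step S131 is a quadrilateral given by four corner points, such a matrix could be computed with OpenCV's cv2.getPerspectiveTransform; the corner ordering and target size below are assumptions:

```python
import cv2
import numpy as np

def optimized_matrix(corners, out_size=(200, 60)):
    """corners: the 4 corner points of the mark range in IMG1, assumed
    ordered top-left, top-right, bottom-right, bottom-left."""
    w, h = out_size
    front_view = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    # The returned matrix maps the marked quadrilateral onto the front-view
    # rectangle, i.e. the perspective relation between IMG1 and the
    # optimized corrected image.
    return cv2.getPerspectiveTransform(np.float32(corners), front_view)
```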
In step S134, a loss value L between the optimized perspective transformation matrix and the perspective transformation matrix T is calculated by the model adjustment unit 130. In step S135, the deep learning model 110 is updated by the model adjustment unit 130 using the loss value L.
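A minimal sketch of steps S134 and S135 follows, assuming the loss value L is the mean squared error over the eight free matrix parameters (one common choice, not necessarily the one used in the embodiment) and that both matrices are PyTorch tensors:

```python
import torch

def update_step(optimizer, T_pred, T_opt):
    """T_pred: matrix from the deep learning model (requires grad);
    T_opt: optimized matrix derived from the marked image."""
    # Compare only the 8 free parameters; the ninth entry is fixed to 1.
    diff = T_pred.reshape(-1)[:8] - T_opt.reshape(-1)[:8]
    loss = torch.mean(diff ** 2)      # loss value L between the two matrices
    optimizer.zero_grad()
    loss.backward()                   # backpropagate L through the model
    optimizer.step()                  # update the deep learning model
    return loss.item()
```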
According to the image correction system 100 and method based on deep learning of the present disclosure, the perspective transformation parameters for the image correction procedure are found by a deep learning model and used to efficiently correct various images into front-view images; the deep learning model is further updated using the loss value to increase the recognition accuracy.
Referring to the accompanying drawings, an image correction system 1100 based on deep learning according to another embodiment includes a deep learning model 1110, a processing unit 1120 and an image capture unit 1140. The image correction method of the present embodiment includes steps S1110 to S1150, described below.
In step S1110, an image IMG5 containing at least one character is captured by the image capture unit 1140.
In step S1120, the image IMG5 is received by the deep learning model 1110, and a perspective transformation matrix T′ is generated according to the image IMG5. Step S1120 is similar to step S110 described above.
In step S1130, shooting information SI is received by the deep learning model 1110, and several perspective transformation parameters of the perspective transformation matrix T′ are limited according to the shooting information SI. The shooting information SI includes a shooting location, a shooting direction and a shooting angle, which can be represented by 3 parameters, 2 parameters and 1 parameter, respectively. The perspective transformation matrix T′ contains the perspective transformation parameters T′11, T′12, T′13, T′21, T′22, T′23, T′31, T′32 and 1, as indicated in formula 2:

T′ = [T′11 T′12 T′13]
     [T′21 T′22 T′23]
     [T′31 T′32   1 ]  (formula 2)

The perspective transformation parameters T′11, T′12, T′13, T′21, T′22, T′23, T′31 and T′32 can be determined according to the 6 parameters of the shooting location, the shooting direction and the shooting angle.
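For illustration, a minimal sketch of formula 2 follows, assembling the eight perspective transformation parameters and the fixed entry 1 into the 3x3 matrix; the row-major parameter ordering and the use of NumPy are assumptions:

```python
import numpy as np

def matrix_from_params(p):
    """Formula 2: p is a sequence of the eight parameters
    T'11, T'12, T'13, T'21, T'22, T'23, T'31, T'32 (assumed row-major)."""
    return np.array([[p[0], p[1], p[2]],
                     [p[3], p[4], p[5]],
                     [p[6], p[7], 1.0]])
```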
Firstly, the deep learning model 1110 assigns a reasonable range to each of the 6 parameters of the shooting location, the shooting direction and the shooting angle, and uses a grid search algorithm to calculate the perspective transformation parameters, thereby obtaining a largest value Lmn and a smallest value Smn of each perspective transformation parameter T′mn. Then, the deep learning model 1110 calculates each perspective transformation parameter T′mn according to formula 3:
T′mn = Smn + (Lmn − Smn)σ(Zmn)  (formula 3)
Wherein Zmn is an unconstrained value, and σ is a logistic (sigmoid) function whose range is 0 to 1. Thus, the deep learning model 1110 can ensure that each of the perspective transformation parameters T′11, T′12, T′13, T′21, T′22, T′23, T′31 and T′32 falls within a reasonable range.
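The following sketch illustrates this two-stage idea under stated assumptions: matrix_from_shooting is a hypothetical caller-supplied helper mapping the 6 shooting parameters to a 3x3 matrix, and the grid resolution is arbitrary:

```python
import itertools
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def parameter_bounds(ranges, matrix_from_shooting, steps=5):
    """Grid-search the 6 shooting parameters (3 for location, 2 for
    direction, 1 for angle) over their assumed reasonable `ranges` and
    record, entry by entry, the largest value L_mn and smallest value
    S_mn that each perspective transformation parameter can take."""
    grids = [np.linspace(lo, hi, steps) for lo, hi in ranges]
    mats = np.stack([matrix_from_shooting(p)
                     for p in itertools.product(*grids)])
    return mats.max(axis=0), mats.min(axis=0)    # L_mn, S_mn

def constrain(Z, L, S):
    """Formula 3: T'_mn = S_mn + (L_mn - S_mn) * sigmoid(Z_mn), so every
    unconstrained output Z_mn is squashed into [S_mn, L_mn]."""
    return S + (L - S) * sigmoid(Z)
```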
In step S1140, a perspective transformation is performed on the image IMG5 by the processing unit 1120 according to the perspective transformation matrix T′ to obtain a corrected image IMG6 containing a front view of the at least one character. Step S1140 is similar to step S120 described above.
In step S1150, the deep learning model 1110 is updated using a loss value L′. Step S1150 is similar to step S130 described above.
Thus, the image correction system 1100 and method based on deep learning of the present disclosure can limit the ranges of the perspective transformation parameters according to the shooting information SI, which increases the accuracy of the deep learning model 1110 and makes its training easier.
It will be apparent to those skilled in the art that various modifications and variations can be made to the disclosed embodiments. It is intended that the specification and examples be considered as exemplary only, with a true scope of the disclosure being indicated by the following claims and their equivalents.