This application claims priority to Korean Patent Application No. 10-2021-0100346 filed on Jul. 30, 2021, the disclosure of which is hereby incorporated by reference in its entirety.
The present invention relates to a method and apparatus for generating learning data input to a neural network to train the neural network.
In order to recognize text from an image using a neural network, the neural network must be trained, and a large amount of learning data is required for this training. However, when recognizing the texts (numbers and letters) of a vehicle's license plate using the neural network, images (or photos) of the license plate are required as learning data, and there are many restrictions on collecting a large amount of such images. For example, as the climate and the time of day change, the illuminance and sharpness around the vehicle also vary, which restricts the collection of license plate images; there are also legal restrictions because a vehicle's number is personal information. In addition, since learning data collected at various points in time and in various environments is required for accurate training of the neural network, collecting the license plate images consumes considerable time and labor.
The technical task to be achieved by the present invention is to provide a method and apparatus for generating learning data for a neural network. The technical task to be achieved in the present embodiments is not limited to the aforementioned technical task, and other technical tasks can be derived from the following embodiments.
As technical means for achieving the aforementioned technical task, a method for generating learning data for a neural network according to one embodiment may comprise generating a license plate image by combining a background image, a frame image and a text image; generating a transformed image by performing at least one of a geometry transformation and a filter transformation on the license plate image; setting a text corresponding to the text image as target data for the transformed image; and generating the learning data including the transformed image and the target data.
An apparatus for generating learning data for a neural network according to another embodiment may comprise a memory in which at least one program is stored; and a processor for executing said at least one program to generate learning data to train the neural network, wherein the processor generates a license plate image by combining a background image, a frame image and a text image, generates a transformed image by performing at least one of a geometry transformation and a filter transformation on the license plate image, sets a text corresponding to the text image as target data for the transformed image, and generates the learning data including the transformed image and the target data.
A storage medium according to yet another embodiment may comprise a computer-readable storage medium recording a program for executing, in a computer, the method according to one embodiment.
According to the present invention, the apparatus for generating learning data can effectively generate learning data by generating, instead of collecting, a license plate image that is utilized for training a neural network, and transforming the generated license plate image in various manners.
The effects of the present invention are not limited to the aforementioned effect, and effects not mentioned herein will be clearly understood by a person having ordinary skill in the art to which the present invention pertains from the present specification and the drawings attached herewith.
Hereinafter, preferred embodiments of the present invention will be explained in detail with reference to the drawings attached herewith. The advantages and characteristics of the present invention, and the methods for achieving them, will become apparent with reference to the embodiments explained below in detail together with the drawings attached herewith. However, the present invention is not limited to the embodiments disclosed below and can be implemented in various other manners; these embodiments are provided merely to make the disclosure of the present invention complete and to fully convey the scope of the invention to a person having ordinary skill in the art to which the present invention pertains, and the present invention is defined by the scope of the claims. The same reference numerals refer to the same elements throughout the entire specification.
Unless otherwise defined, all terms (including technical and scientific terms) used in the present specification have the meanings commonly understood by a person having ordinary skill in the technical field to which the present invention pertains. In addition, terms defined in generally used dictionaries will not be interpreted ideally or excessively unless particularly defined.
The terms used in the present specification are merely to explain the embodiments, and are not intended to limit the present invention. In the present specification, singular terms include plural terms unless particularly mentioned. The terms “comprise” and/or “comprising” used in the present specification do not exclude the presence or addition of one or more elements other than the elements mentioned.
Unless otherwise mentioned in the present specification, the term “in contact with” or “connected” may mean that one element/feature is directly in contact with or connected to another element/feature, or that one element/feature is indirectly in contact with or connected to another element/feature with other elements/features interposed therebetween. Thus, the various diagrams illustrated in the drawings show exemplary arrangements of elements and components, but in actual embodiments there may be additional intermediary elements, devices, features or components (assuming that the functions of the illustrated elements are not adversely affected).
In addition, unless otherwise mentioned in the present specification, the expressions “first,” “second” and “third” are used merely to distinguish the objects they refer to for the explanation of the invention. Even if objects are referred to using “first,” “second,” etc., they may in some cases be identical.
Hereinafter, embodiments of the present invention will be explained with reference to the drawings.
The neural network 100 illustrated in
The neural network 100 may be a DNN including an input layer Layer1, two hidden layers Layer2 and Layer3, and an output layer Layer4. For example, when the neural network 100 is a convolutional neural network (CNN), Layer1 to Layer4 may be some of the layers in the convolutional neural network, and these may be a convolutional layer, a pooling layer, a fully connected layer, and the like.
Each of the layers included in the neural network 100 may comprise a plurality of artificial nodes, also known as “neurons”, “Processing elements (PEs)”, or “units.” For example, as shown in
The neural network 100 may perform an operation based on received input data (e.g., I1 and I2), and may generate output data (e.g., O1 and O2) based on the result of performing the operation.
The nodes included in each of the layers of the neural network 100 may be connected to each other to exchange data. For example, one node may receive data from other nodes and perform an operation, and may output the operation result to other nodes.
The input and output of each node may be referred to as input activation and output activation, respectively. That is, an activation may be both the output of one node and a parameter corresponding to an input of nodes included in the next layer.
Each of the nodes may determine its own activation based on activations that are received from nodes included in the previous layer, a weight, and a bias. The weight is a parameter used to calculate output activation in each node, and may be a value assigned to a connection relationship between nodes.
Each of the nodes may be processed by a computational unit or processing element that receives an input and outputs an output activation, and the input and output of each of the nodes may be mapped. For example, when $\sigma$ is an activation function, $w_{jk}^{i}$ is the weight from the k-th node included in the (i−1)-th layer to the j-th node included in the i-th layer, $b_{j}^{i}$ is the bias of the j-th node included in the i-th layer, and $a_{j}^{i}$ is the activation of the j-th node included in the i-th layer, the activation $a_{j}^{i}$ may be calculated using Equation 1 as follows.

$$a_{j}^{i} = \sigma\!\left(\sum_{k}\left(w_{jk}^{i}\times a_{k}^{i-1}\right)+b_{j}^{i}\right) \qquad \text{[Equation 1]}$$
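As a worked illustration of Equation 1 (not part of the original disclosure), the following NumPy sketch computes the activations of one layer from those of the previous layer; the sigmoid activation function and the layer sizes are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed sizes: 2 nodes in layer (i-1), 3 nodes in layer i.
a_prev = np.array([0.5, -1.2])   # activations a_k^(i-1) from the previous layer
W = np.random.randn(3, 2)        # weights w_jk^i (row j, column k)
b = np.zeros(3)                  # biases b_j^i

# Equation 1: a_j^i = sigma(sum_k(w_jk^i * a_k^(i-1)) + b_j^i)
a = sigmoid(W @ a_prev + b)
print(a)  # activations a_j^i of the i-th layer
```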
As shown in
Meanwhile, when the neural network 100 is implemented in a DNN architecture, it may include more layers capable of processing valid information. Thus, the neural network 100 may process data sets of higher complexity than a neural network having a single layer.
In the neural network, a convolution operation is performed between input data and a weight map, and as a result feature maps are output. A weight map is a parameter for finding the features of the input data, and is also called a kernel or a filter. The generated output feature maps then serve as input feature maps for a convolution operation with another weight map, thereby outputting new feature maps. As a result of repeatedly performing this convolution operation, the result of operating on the input data through the neural network may finally be output.
For example, when data having a 24×24 size is input to the neural network of
Thereafter, the 10×10 feature maps are reduced in size through repeated convolution and sub-sampling operations with weight maps, so that global features are finally output. The neural network may output feature maps from the input data by repeatedly performing the convolution operation and the sub-sampling (or pooling) operation in several layers, and the operation result for the input data may finally be obtained as the output feature maps are input to the fully connected layer.
Referring to
A convolution operation may be performed on the first feature map FM1 and a weight map WM, and as a result the second feature map FM2 may be generated. The weight map filters the features of the first feature map FM1 by performing a convolution operation with the first feature map FM1 using the weight parameter defined in each element. The weight map performs the convolution operation with windows (or tiles) of the first feature map FM1 while shifting over the first feature map FM1 by a stride in a sliding-window manner. During each shift, each of the weight parameters included in the weight map may be multiplied by the corresponding pixel value of the overlapped window in the first feature map FM1, and the products may be summed. As the first feature map FM1 and the weight map are convolved, one channel of the second feature map FM2 may be generated. Although only one weight map is illustrated in
The second feature map FM2 may correspond to an input feature map of a next layer. For example, the second feature map FM2 may be an input feature map of a pooling (or sub-sampling) layer.
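For illustration (not part of the original disclosure), the following NumPy sketch implements the sliding-window convolution and sub-sampling described above; the 5×5 weight map and 2×2 pooling window are assumptions chosen so that a 24×24 input yields a 20×20 feature map and then a 10×10 map, consistent with the sizes mentioned above.

```python
import numpy as np

def conv2d(image, weight_map, stride=1):
    """Valid convolution: slide the weight map over the image by the stride."""
    kh, kw = weight_map.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(window * weight_map)  # multiply, then accumulate
    return out

def pool2d(fmap, size=2):
    """Sub-sampling (max pooling) with a size x size window."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(24, 24)       # 24x24 input, as in the example above
weight_map = np.random.rand(5, 5)    # assumed 5x5 weight map (kernel/filter)
fm = conv2d(image, weight_map)       # first feature map
print(fm.shape, pool2d(fm).shape)    # (20, 20) (10, 10)
```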
Referring to
The apparatus 200 for generating learning data illustrated in
For recognizing texts (numbers and letters) of a vehicle's license plate using the neural network 100, images (or photos) showing the license plate are required as learning data (or training data), but there are many restrictions on collecting a large amount of these images.
The apparatus 200 for generating learning data can effectively generate learning data (or training data) by generating various license plate images by combining individual images stored in the memory 220, instead of collecting images of actual vehicles' license plates, and by transforming the generated images.
As the learning data generated by the apparatus 200 for generating learning data is used for training the neural network 100, the text of a license plate can be recognized more accurately from an external image (or photo) obtained outside of the apparatus 200 for generating learning data. The external image is an image showing a license plate, obtained (or photographed) outside of the apparatus 200 for generating learning data; it is data input to the trained neural network 100 to recognize a text, not an image utilized for the generation of learning data.
Here, the vehicle can be any transportation means, without limitation, as long as a unique license plate is attached to it. In addition, the license plate can be any number plate, without limitation, as long as it is attached to a vehicle, regardless of whether it is attached to the front or rear of the vehicle.
The processor 210 may perform overall functions for controlling the apparatus 200 for generating learning data. The processor 210 may control the overall operation of the apparatus 200 for generating learning data by executing programs stored in the memory 220.
The processor 210 may generate a license plate image by combining individual images stored in the memory 220. The processor 210 may generate a license plate image by loading a background image, a frame image and a text image, which are the elements of the license plate image, from the memory 220 one by one, and combining the loaded images. The detailed method of generating the license plate image by combining the images will be described with reference to
The processor 210 may generate learning data in which the license plate image is variously transformed, such that even when external images obtained at various points in time and in various environments are input to the neural network 100, an accurate result can be generated from the neural network 100. For example, the processor 210 may perform at least one of geometry transformation and filter transformation on the license plate image. The order in which the processor 210 performs the geometry transformation and the filter transformation on the license plate image is not limited, but may be changed according to the setting.
The processor 210 may generate a transformed image by performing geometry transformation on a generated license plate image. The geometry transformation may be a transformation method that transforms all or part of the geometric structure of the license plate image. For example, the geometry transformation may include at least one of length transformation, tilt transformation, and motion blur transformation. The geometry transformation will be described below with reference to
The processor 210 may generate a transformed image by performing filter transformation on a generated license plate image. The filter transformation may be a transformation method that transforms the attributes of all or part of the license plate image. For example, the filter transformation may include at least one of sharpness transformation, brightness transformation, chroma transformation, contrast transformation, color transformation, noise transformation, transparency transformation, and climate application transformation. The filter transformation will be described below with reference to
Meanwhile, the processor 210 may perform a transformation on a license plate image such that the frame image and the text image correspond to each other in the license plate image. For example, the processor 210 may perform geometry transformation on the frame image and the text image by the same length or the same angle, or may perform filter transformation of transforming image attributes by the same value.
In addition, the processor 210 may perform a transformation only on the frame image and the text image while maintaining the background image in the license plate image, or may perform a transformation on all of the background image, the frame image and the text image, such that the background image, the frame image and the text image correspond to each other.
Further, the processor 210 may perform filter transformation on a geometry transformed image, or geometry transformation on a filter transformed image.
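A rough sketch of such a configurable pipeline follows (assuming Pillow for image handling; the function bodies, parameter values, and RGB image mode are illustrative assumptions, not the disclosed implementation).

```python
from PIL import Image, ImageEnhance

def geometry_transform(img: Image.Image) -> Image.Image:
    # Example geometry transformation: a slight tilt (rotation); assumes RGB mode.
    return img.rotate(3, expand=True, fillcolor=(128, 128, 128))

def filter_transform(img: Image.Image) -> Image.Image:
    # Example filter transformation: a brightness change.
    return ImageEnhance.Brightness(img).enhance(0.7)

def transform_plate(plate: Image.Image, geometry_first: bool = True) -> Image.Image:
    """Apply geometry and filter transformations in a configurable order."""
    steps = ([geometry_transform, filter_transform] if geometry_first
             else [filter_transform, geometry_transform])
    for step in steps:
        plate = step(plate)
    return plate
```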
Also, the processor 210 may set the text corresponding to the text image as target data (or label data) for the transformed image. The target data is data included in the learning data; for the training of the neural network 100, the neural network 100 may be trained such that the target data (text) is output through a neural network operation on the input data (the transformed image).
The text corresponding to a text image may correspond to the vehicle's number indicated on the license plate. The processor 210 may set the corresponding text as target data for a transformed image, such that when the transformed image is input to the neural network 100, the text corresponding to the transformed image (or the text image included in the transformed image) is output from the neural network 100. In other words, the processor 210 may set target data and train the neural network 100 such that, when an external image is input to the neural network 100, the text corresponding to the external image can be output from the neural network 100.
The processor 210 may generate learning data including the transformed image and the target data (text). The processor 210 may generate the learning data by pairing the transformed image, which is subject to the neural network operation, with the target data, so as to generate data suitable to be input to the neural network 100.
The processor 210 may input learning data to the neural network 100 so as to train the neural network 100. The processor 210 may train the neural network 100, such that the target data (text) can be derived from a transformed image.
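A minimal sketch of pairing each transformed image with its target text to form learning data (the tuple-based layout is an assumption for illustration):

```python
from typing import List, Tuple
from PIL import Image

def build_learning_data(transformed_images: List[Image.Image],
                        target_texts: List[str]) -> List[Tuple[Image.Image, str]]:
    """Pair each transformed image with its target data (the license plate text)."""
    assert len(transformed_images) == len(target_texts)
    return list(zip(transformed_images, target_texts))
```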
The processor 210 may be a processor comprised in various kinds of computing devices such as a PC (personal computer), a server device, a mobile device, an embedded device, an IoT (Internet of Things) device, etc. For example, the processor 210 may be a CPU (central processing unit), a GPU (graphics processing unit), an AP (application processor), or an NPU (neural processing unit), but is not limited thereto.
The processor 210 may be implemented as an array of logic gates, or as a combination of a general-purpose microprocessor and a memory 220 storing a program executable by that microprocessor. In addition, a person having ordinary skill in the technical field to which the present embodiment pertains can understand that it may also be implemented as other forms of hardware.
The memory 220 is hardware storing various kinds of data processed by a processor 210, which may store, for example, background images, frame images, text images and various kinds of filters, etc. In addition, the memory 220 may store various programs or applications, etc. to be driven by the processor 210.
The memory 220 may include at least one of a volatile memory and a nonvolatile memory. The nonvolatile memory includes ROM (Read Only Memory), PROM (Programmable ROM), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable and Programmable ROM), flash memory, PRAM (Phase-change RAM), MRAM (Magnetic RAM), RRAM (Resistive RAM), FeRAM (Ferroelectric RAM), etc. The volatile memory includes DRAM (Dynamic RAM), SRAM (Static RAM), SDRAM (Synchronous DRAM), etc. In the embodiment, the memory 220 may be implemented by at least one of an HDD (Hard Disk Drive), an SSD (Solid State Drive), CF (compact flash), SD (secure digital), Micro-SD (micro secure digital), Mini-SD (mini secure digital), xD (extreme digital) and a Memory Stick.
The neural network 100 may be comprised in a separate, independent device implemented outside of the apparatus 200 for generating learning data. Here, the neural network 100 may receive data from the external apparatus 200 for generating learning data and output the result of performing an operation. However, the neural network 100 may also be comprised in the apparatus 200 for generating learning data, unlike the one illustrated in
From
Referring to
The background image 610 may correspond to an image corresponding to the front or rear of a vehicle on which the license plate is attached, or an image having other various patterns and colors. The shape and color of the front or rear are different for each vehicle, and even in the case of the same vehicle model, the shape around the license plate may be different because of tuning, etc. of the vehicle, and thus various background images are required. Additional examples of the background image 610 will be described with reference to
The frame image 620 corresponds to a region where the text is excluded from the license plate, and may be formed with various standards, colors and shapes as illustrated in
The text image 630 is an image including the numbers and letters (i.e., a vehicle number) displayed on the license plate. The text image 630 may be formed with various font colors, font sizes, and arrangements of characters, as shown in the license plates according to
Meanwhile, the processor may generate a text image 630. The processor may generate a text image 630 for a set text based on information on a font, character arrangement, standard, etc., stored in the memory. As such, the processor may generate text images of various standards for the same text, or text images of the same standard for different texts. The processor may store the generated text images in the memory.
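A minimal sketch of rendering a text image with Pillow follows (for illustration only; the font file name `plate_font.ttf`, the canvas size, and the drawing offset are hypothetical).

```python
from PIL import Image, ImageDraw, ImageFont

def make_text_image(text: str, size=(520, 110)) -> Image.Image:
    """Render a text image (vehicle number) on a transparent canvas."""
    img = Image.new("RGBA", size, (0, 0, 0, 0))      # transparent background
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype("plate_font.ttf", 80)  # hypothetical font file
    draw.text((20, 10), text, font=font, fill=(20, 20, 20, 255))
    return img
```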
The memory may store a background image group, a frame image group, and a text image group. For generation of various license plate images, each image group may include various background images, various frame images, and various text images. To combine the frame image 620 and the text image 630, their respective standards (or versions) should correspond to each other, and thus, information on the standard of each image can also be stored in the memory. Thus, the frame image group and the text image group may include not only the images, but also information on their standards (or versions).
The processor may load an image from each of the background image group, the frame image group and the text image group. That is, the processor may load one background image 610 from the background image group stored in the memory, one frame image 620 from the frame image group, and one text image 630 from the text image group.
Meanwhile, since the standards of the frame image 620 and the text image 630 correspond to each other, the processor may load a frame image 620 and a text image 630 that correspond to each other in standard from the frame image group and the text image group based on the standard information of each image.
The processor may generate a license plate image 640 by combining loaded images. As illustrated in
The processor may generate various license plate images through various combinations. The license plate image 640 illustrated in
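The combining step might be sketched as follows (an illustration assuming Pillow and RGBA element images whose transparent regions let lower layers show through; the paste offset is an assumed value).

```python
from PIL import Image

def combine_plate(background: Image.Image, frame: Image.Image,
                  text: Image.Image, offset=(40, 60)) -> Image.Image:
    """Combine a background image, frame image and text image into one plate image."""
    plate = background.convert("RGBA").copy()
    plate.alpha_composite(frame.convert("RGBA"), dest=offset)  # frame over background
    plate.alpha_composite(text.convert("RGBA"), dest=offset)   # text over frame
    return plate.convert("RGB")
```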
Referring to
In Step S710, the processor may compare the number of the license plate images corresponding to the particular text with a pre-set value.
The pre-set value is the number of license plate images to be generated for one text. For example, when it is set to generate 1000 license plate images for the text “52GA 3018”, the pre-set value corresponds to 1000, and the processor may compare the number of license plate images corresponding to “52GA 3018” with 1000.
In Step S720, the processor may determine whether the number of the license plate images is less than the pre-set value.
If the number of generated license plate images is less than the pre-set value, the number of license plate images has not reached the target, so the processor may perform Step S730 in order to repeat the generation of license plate images. In Step S730, the processor may generate a license plate image by loading individual images and combining the loaded images. Thereafter, the processor may perform Step S710 to again compare the number of license plate images for the particular text with the pre-set value.
In Step S720, if it is determined that the number of generated license plate images is greater than or equal to the pre-set value, the number of license plate images has reached the target, so the processor may finish the generation of license plate images. By this method, the processor can generate the pre-set number of license plate images.
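The loop of Steps S710 to S730 might be sketched as follows; `generate_plate_image` is an assumed helper standing in for the loading-and-combining step.

```python
def generate_plates_for_text(text: str, preset_value: int, generate_plate_image):
    """Repeat generation until the number of plate images reaches the pre-set value."""
    plates = []
    while len(plates) < preset_value:              # Steps S710/S720: compare the count
        plates.append(generate_plate_image(text))  # Step S730: load and combine images
    return plates
```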
Referring to
The geometry transformation may include at least one of length transformation, tilt transformation, and motion blur transformation. When performing the geometry transformation, the processor may perform one of length transformation, tilt transformation, and motion blur transformation, or perform a plurality of transformations on the license plate image 640. However, the aforementioned types of geometry transformation are just examples, and as long as it transforms the geometry structure of an image, the transformation can be included in the geometry transformation, without limitation.
In the embodiment according to
For example, the processor may generate a first length-transformed image 841a by expanding the length of the license plate image 640 in the right direction of
In the embodiment according to
For example, the processor may generate a first tilt-transformed image 842a by adjusting the angle of the license plate image 640 in the clockwise direction of
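Minimal sketches of the length and tilt transformations with Pillow (the scale factor, angle, and fill color are illustrative assumptions; RGB mode is assumed for the fill color):

```python
from PIL import Image

def length_transform(img: Image.Image, x_scale: float = 1.3) -> Image.Image:
    """Expand or contract the horizontal length of the image."""
    w, h = img.size
    return img.resize((int(w * x_scale), h))

def tilt_transform(img: Image.Image, angle: float = 5.0) -> Image.Image:
    """Tilt the image clockwise; Pillow rotates counterclockwise for positive angles."""
    return img.rotate(-angle, expand=True, fillcolor=(128, 128, 128))
```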
In the embodiment according to
The processor may perform motion blur transformation on a frame image by displaying the moving trace of the frame image while moving the frame image on the background image in one direction. In addition, the processor may perform motion blur transformation on a text image as well by displaying the moving trace of the text image while moving the text image, on the frame image where the motion blur transformation was performed, in the same direction as the frame image was moved. That is, the processor may perform the motion blur transformation on the frame image and the text image such that the frame image and the text image correspond to each other on the background image. Alternatively, the processor may perform the motion blur transformation on the entire license plate image including the background image.
In addition, the processor may gradually change brightness, sharpness or transparency between the moving start point and the moving end point for the moving trace of the image. For example, the processor may set the sharpness at the moving start point of the image to 50%, and set the sharpness of the moving end point of the image to 90%.
For example, the processor may perform motion blur transformation on the license plate image 640 in the right direction of
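One way to approximate the motion blur transformation, sketched here for illustration, is to average horizontally shifted copies of the image; the shift length is an assumed value, and the wrap-around at the border introduced by `np.roll` is ignored for simplicity.

```python
import numpy as np
from PIL import Image

def motion_blur(img: Image.Image, shift: int = 9) -> Image.Image:
    """Blur in one direction by averaging copies of the image shifted to the right."""
    arr = np.asarray(img).astype(np.float32)
    acc = np.zeros_like(arr)
    for d in range(shift):
        acc += np.roll(arr, d, axis=1)  # shift along the horizontal axis
    return Image.fromarray((acc / shift).astype(np.uint8))
```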
Referring to
Since external images for the vehicle's license plate are obtained by photographing it at various angles, transformed images corresponding to the external images obtained at various angles are required. Thus, the processor may perform various types of geometry transformation on the license plate image 640 to generate various transformed images.
For example, by performing tilt transformation and length transformation on the license plate image 640, the processor may generate a first geometry-transformed image 941 corresponding to an external image photographed at a certain angle from the left of the front of the license plate, or a third geometry-transformed image 943 corresponding to an external image photographed at a certain angle from the right. In addition, the processor may generate a second geometry-transformed image 942 corresponding to an external image photographed from the front of the license plate by performing length transformation on the license plate image 640 without performing tilt transformation.
Since the processor generates transformed images at various angles and these transformed images are utilized as learning data as illustrated in
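Angled views such as the first and third geometry-transformed images might be approximated with Pillow's quadrilateral mapping (a sketch under assumed corner coordinates, not the disclosed method).

```python
from PIL import Image

def angled_view(img: Image.Image, squeeze: int = 20) -> Image.Image:
    """Vertically compress one side of the plate, approximating a view at an angle."""
    w, h = img.size
    # QUAD maps the output rectangle onto a source quadrilateral given as its
    # upper-left, lower-left, lower-right and upper-right corners.
    quad = (0, -squeeze, 0, h + squeeze, w, h, w, 0)
    return img.transform((w, h), Image.QUAD, quad)
```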
Referring to
The filter transformation may include at least one of sharpness transformation, brightness transformation, chroma transformation, contrast transformation, color transformation, noise transformation, transparency transformation and climate application transformation. However, the aforementioned types of the filter transformations are just examples, and any transformation that transforms the attributes of an image may be included in the filter transformation, without limitation. For example, the filter transformation may include blur transformation, granular transformation, film-effect transformation, sepia transformation or rain effect transformation, and the like.
When performing filter transformation, the processor may perform one of the filter transformations or perform a plurality of transformations on the license plate image 640.
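Several of the listed filter transformations map naturally onto Pillow's enhancement classes; the following sketch is illustrative, and the enhancement factors and Gaussian noise parameters are assumed values.

```python
import numpy as np
from PIL import Image, ImageEnhance

def filter_transforms(img: Image.Image) -> Image.Image:
    """Apply sharpness, brightness, contrast and color transformations, then noise."""
    img = ImageEnhance.Sharpness(img).enhance(0.6)   # sharpness transformation
    img = ImageEnhance.Brightness(img).enhance(0.8)  # brightness transformation
    img = ImageEnhance.Contrast(img).enhance(1.2)    # contrast transformation
    img = ImageEnhance.Color(img).enhance(0.9)       # chroma/color transformation
    arr = np.asarray(img).astype(np.int16)
    noise = np.random.normal(0, 8, arr.shape)        # noise transformation
    return Image.fromarray(np.clip(arr + noise, 0, 255).astype(np.uint8))
```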
External images of vehicles' license plates may be obtained in various environments, such as various climates, various time zones, and various imaging systems. That is, the same license plate may appear with different sharpness, color and noise in different external images. In this case as well, the processor may perform filter transformation to generate various transformed images such that accurate results can be output from the neural network 100. In the embodiment according to
Referring to
It can be seen that tilt transformation was performed on the transformed images 1141 to 1146, except for the seventh transformed image 1147. In addition, it can be seen that motion blur transformation was performed on the transformed images 1141 to 1143 and 1145 to 1147, except for the fourth transformed image 1144. Also, it can be seen that sharpness transformation or blur transformation was performed on the first to seventh transformed images 1141 to 1147. The first to seventh transformed images 1141 to 1147 may also have undergone various types of transformation other than those mentioned above.
As such, the processor may generate various transformed images by performing various types of transformation on various license plate images. As the neural network is trained by using these transformed images, accurate operation is possible even when various external images are input to the neural network.
In addition to generating the learning data 1211, the apparatus for generating learning data may input the learning data 1211 to the neural network 100 to train the neural network.
The processor may generate learning data 1211 including information on transformed images and texts by connecting a transformed image and the corresponding text (target data). For example, if the text image used in the license plate image displays the text “52GA 3018”, the text corresponding to the text image is “52GA 3018”, and the processor may set the text “52GA 3018” as the target data for the transformed image. Thus, in this case, the learning data 1211 may include the image transformed from the license plate image, and the text “52GA 3018” which is the target data.
The apparatus for generating learning data may input learning data 1211 to the neural network 100 to train the neural network. The apparatus for generating learning data may train the neural network 100 to output target data when the transformed image is input to the neural network 100. For example, the apparatus for generating learning data may train the neural network 100 such that a result to be output through operation of the neural network 100 on the transformed image becomes target data.
Specifically, the neural network 100 may be trained by optimizing its various parameters so that the difference between the operation result on a transformed image and the target data is reduced.
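A hedged sketch of one such training step with PyTorch follows; the model, the encoding of the target text as class indices, and the cross-entropy loss are assumptions for illustration, not the disclosed training procedure.

```python
import torch
import torch.nn as nn

def train_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               images: torch.Tensor,   # batch of transformed images, shape (N, C, H, W)
               targets: torch.Tensor   # target data encoded as class indices, shape (N,)
               ) -> float:
    """One optimization step reducing the difference between output and target."""
    criterion = nn.CrossEntropyLoss()
    optimizer.zero_grad()
    outputs = model(images)             # neural network operation on the images
    loss = criterion(outputs, targets)  # compare the operation result with target data
    loss.backward()                     # propagate the difference
    optimizer.step()                    # update (optimize) the parameters
    return loss.item()
```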
When an external image 1221 is input to the trained neural network 110 after the training of the neural network 100 is completed, a text 1222 corresponding to the external image 1221 may be output from the trained neural network 110. For example, when the external image 1221 in which “39GA 2764” is displayed on the license plate is input to the trained neural network 110, the text 1222 “39GA 2764” may be output.
Referring to
The apparatus for generating learning data may include a CPU 1310, a RAM 1320, a memory 1340, a sensor module 1350, and a communication module 1360, and the neural network device 1330 may include a neural network. The electronic system 1300 may include the apparatus for generating learning data and the neural network device. The CPU 1310 of
The apparatus for generating learning data may further include an input/output module, a security module, a power control device, and the like. Some of the hardware configurations of the electronic system 1300 may be mounted on at least one semiconductor chip. The neural network device 1330 may be a device including a hardware accelerator dedicated to the neural network.
The CPU 1310 controls the overall operation of the electronic system 1300. The CPU 1310 may include a single-core processor or a multi-core processor. The CPU 1310 may process or execute programs and/or data stored in the memory 1340. In one embodiment, the CPU 1310 may control the function of the neural network device 1330 by executing programs stored in the memory 1340.
The RAM 1320 may temporarily store programs, data, or instructions. For example, the programs and/or data stored in the memory 1340 may be temporarily stored in the RAM 1320 according to the control of the CPU 1310 or boot code. The RAM 1320 may be implemented as a memory such as DRAM or SRAM.
The neural network device 1330 may perform an operation of the neural network based on received input data and generate an information signal based on the result of the operation. The neural network device 1330 is hardware that performs processing using the neural network, and may correspond to a hardware accelerator dedicated to the neural network.
The information signal may include one of various types of recognition signals such as a text recognition signal, an object recognition signal, and an image recognition signal. For example, the neural network device 1330 may receive an external image as input data and generate a text recognition signal corresponding to the external image. However, the present disclosure is not limited thereto, and the neural network device 1330 may receive various types of input data according to the type or function of an electronic device on which the electronic system 1300 is mounted, and may generate a recognition signal according to the input data.
The memory 1340 is where data is stored, and may store an operating system (OS), various programs, and various types of data. In an embodiment, the memory 1340 may store intermediate results, for example, an output feature map, generated during the operation of the neural network device 1330 in the form of an output feature list or output feature matrix. In an embodiment, a compressed output feature map may be stored in the memory 1340. Also, the memory 1340 may store quantized neural network data, such as parameters, weight maps, or weight lists, which are used in the neural network device 1330. The memory 1340 may include at least one of a volatile memory and a nonvolatile memory.
The sensor module 1350 may collect information around the electronic device on which the electronic system 1300 is mounted. The sensor module 1350 may sense or receive a signal (e.g., an image signal, a text signal, an audio signal, etc.) from the outside of the electronic device and convert the sensed or received signal into data. To this end, the sensor module 1350 may include at least one of various types of sensing devices such as a microphone, an imaging device, an image sensor, a light detection and ranging (LIDAR) sensor, an ultrasonic sensor, and an infrared sensor. The sensor module 1350 may provide the converted data as input data to the neural network device 1330. The sensor module 1350 may provide various types of data to the neural network device 1330.
The communication module 1360 may include various wired or wireless interfaces for communicating with an external device. For example, the communication module 1360 may include a communication interface connectable to a wired local area network (LAN); a wireless local area network (WLAN) such as wireless fidelity (Wi-Fi); a wireless personal area network (WPAN) such as Bluetooth; wireless universal serial bus (USB); Zigbee; near field communication (NFC); radio-frequency identification (RFID); power line communication (PLC); or a mobile cellular network such as 3rd generation (3G), 4th generation (4G), or long term evolution (LTE).
Referring to
In Step S1410, the apparatus for generating learning data may generate a license plate image by combining a background image, a frame image, and a text image.
The apparatus for generating learning data may load one image from each of a background image group including a plurality of background images, a frame image group including a plurality of frame images, and a text image group including a plurality of text images.
The apparatus for generating learning data may load a frame image and a text image that correspond to each other in standard from a frame image group including frame images of various standards and a text image group including text images of various standards, respectively. The apparatus for generating learning data may generate a license plate image by combining the loaded images.
The apparatus for generating learning data may generate a plurality of license plate images by repeating the steps of loading images and combining the loaded images to generate a license plate image.
The apparatus for generating learning data may compare the number of generated license plate images corresponding to each text with a pre-set value. The apparatus for generating learning data may repeat the generation of license plate images when the number of license plate images corresponding to a text is less than the pre-set value, and stop generating license plate images when the number is greater than or equal to the pre-set value.
In step S1420, the apparatus for generating learning data may generate a transformed image by performing at least one of geometry transformation and filter transformation on a license plate image.
The apparatus for generating learning data may perform geometry transformation on a license plate image such that the frame image and the text image correspond to each other in the license plate image.
The apparatus for generating learning data may perform geometry transformation only on the frame image and the text image while maintaining the background image in the license plate image. Alternatively, the apparatus for generating learning data may perform geometry transformation on all of the background image, the frame image, and the text image.
The geometry transformation may include at least one of length transformation, tilt transformation, and motion blur transformation.
The apparatus for generating learning data may perform length transformation on a license plate image by adjusting the length in at least one direction.
The apparatus for generating learning data may perform motion blur transformation on the frame image by displaying the trace of the movement of the frame image while moving the frame image on the background image in one direction, and may perform motion blur transformation on the text image such that the text image corresponds to the frame image in the transformed frame image.
The filter transformation may include at least one of sharpness transformation, brightness transformation, chroma transformation, contrast transformation, color transformation, noise transformation, transparency transformation, and climate application transformation.
In Step S1430, the apparatus for generating learning data may set a text corresponding to the text image as target data for the transformed image.
In Step S1440, the apparatus for generating learning data may generate learning data including the transformed image and the target data.
After Step S1440, the apparatus for generating learning data may train the neural network such that the target data is output through an operation on the transformed image in the neural network.
The operations performed by the apparatus for generating learning data as described above may be implemented as a program recorded in a computer-readable storage medium.
As non-limiting examples, computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other storage medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc; disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable storage media.
The embodiments of the present invention have been explained with reference to the drawings attached herewith, and a person having ordinary skill in the art to which the present invention pertains will understand that the present invention can be carried out in other specific forms without changing its technical idea or essential features. Therefore, it should be noted that the embodiments described above are exemplary in all respects and not restrictive.
Number | Date | Country | Kind
---|---|---|---
10-2021-0100346 | Jul. 30, 2021 | KR | national
This work was supported by Seoul R&BD Program in Korea [CY201024, AI-based Smart City Safety Net Vehicle Search Service].