The present disclosure relates to a labeling technology and, specifically, to an image object labeling method and system.
The development and application of machine learning have gradually become an apparent trend. A large amount of data can be used to train machine learning models, and the trained models can be used to obtain certain prediction information.
To train a machine learning model that generalizes well, it is necessary to collect a large amount of data that is relatively evenly distributed across categories. Taking digital files containing a specific image category as an example, the data collection process involves selecting data, assigning labels, and confirming correctness.
Generally, to adapt to changes in the application context of machine learning, it is necessary to manually adjust the given types of data labels. However, because judgment factors, such as experience, cognition, and rules, differ from person to person, the data labels may contain labeling errors and the quality of the data set may vary, which affects the construction efficiency of the machine learning model. Although some labeling technologies have been proposed in the past, they still need to be improved.
In light of this, it is necessary to provide a technical solution different from the past to solve the problems of the prior art.
One objective of the present disclosure is to provide an image object labeling method that can provide labels on image objects and is favorably suitable for improving label construction efficiency of a machine learning model.
Another objective of the present disclosure is to provide an image object labeling system that can provide labels on image objects and is favorably suitable for improving label construction efficiency of a machine learning model.
Another objective of the present disclosure is to provide a tangible, non-transitory, computer readable medium that can provide labels on image objects and is favorably suitable for improving label construction efficiency of a machine learning model.
To achieve the above objective, one aspect of the present disclosure provides an image object labeling method, executed by a processor coupled to a memory, including: providing an image file; detecting at least one first object from the image file, and generating at least one graphic block and attribute thereof according to a first detection result; performing a binarization process on the image file, to present a first graphic feature on a region of the image file containing the graphic block and present a second graphic feature on the remaining region of the image file; combining the first detection result and a result of the binarization process, and filtering the image file through a plurality of masks whose sizes are gradually reduced, to show a plurality of separated graphic components on a portion of the first graphic feature, until a number of the separated graphic components is the same as a number of the at least one graphic block; and assigning a label to each of the separated graphic components according to the attribute of the at least one graphic block or, alternatively, outputting a message, receiving a command, and adjusting the at least one graphic block and attribute thereof as the label according to the command.
In one embodiment of the present disclosure, after generating the at least one graphic block and attribute thereof, the method further includes: detecting at least one second object from the image file, and generating at least one text block and attribute thereof according to a second detection result.
In one embodiment of the present disclosure, the image object labeling method further includes: processing the at least one graphic block and the at least one text block by a filling algorithm.
In one embodiment of the present disclosure, before performing the binarization process, the method further includes: performing a grayscaling process on the at least one graphic block and the at least one text block.
In one embodiment of the present disclosure, the image object labeling method further includes: processing the region presenting the second graphic feature by a dilation algorithm or an erosion algorithm.
In one embodiment of the present disclosure, the label includes a pattern, a text, or a combination thereof.
In one embodiment of the present disclosure, the at least one graphic block includes a partial feature diagram of an electronic component, and the attribute of the graphic block includes a directional attribute.
Another aspect of the present disclosure provides an image object labeling system including a processor coupled to a memory storing instructions configured to be executed by the processor to perform the above method.
Another aspect of the present disclosure provides a tangible, non-transitory, computer readable medium storing instructions that, when executed, perform the above method.
The image object labeling method, system, and tangible non-transitory computer readable medium of the present disclosure are provided for providing an image file; detecting at least one first object from the image file, and generating at least one graphic block and attribute thereof according to a first detection result; performing a binarization process on the image file, to present a first graphic feature on a region of the image file containing the graphic block and present a second graphic feature on the remaining region of the image file; combining the first detection result and a result of the binarization process, and filtering the image file through a plurality of masks whose sizes are gradually reduced, to show a plurality of separated graphic components on a portion of the first graphic feature, until a number of the separated graphic components is the same as a number of the at least one graphic block; and assigning a label to each of the separated graphic components according to the attribute of the at least one graphic block or, alternatively, outputting a message, receiving a command, and adjusting the at least one graphic block and attribute thereof as the label according to the command. Thus, after the label creation process mentioned above, the image objects are labeled. This can greatly shorten the time needed to create labels for objects and helps improve the label construction efficiency of the data set for the development and application of the machine learning model.
The following description of the various embodiments is provided to illustrate the specific embodiments of the present disclosure. Furthermore, directional terms mentioned in the present disclosure, such as upper, lower, top, bottom, front, rear, left, right, inner, outer, side, surrounding, central, horizontal, lateral, vertical, longitudinal, axial, radial, uppermost, and lowermost, only refer to directions in the drawings. Therefore, the directional terms are used for illustration and understanding of the present disclosure and are not intended to limit the present disclosure.
Please refer to
For example, the image object labeling system can be configured to include a processor and a memory coupled to the processor, wherein the memory stores at least one instruction executed by the processor to perform an image object labeling method provided by another aspect of the present disclosure, which is illustrated as the following but is not limited here.
Another aspect of the present disclosure provides an image object labeling method that is executed by a processor coupled to a memory and includes: providing an image file, e.g., one converted from a document file; detecting at least one first object from the image file, and generating at least one graphic block and attribute thereof according to a first detection result; performing a binarization process on the image file, to present a first graphic feature on a region of the image file containing the graphic block and present a second graphic feature on the remaining region of the image file; combining the first detection result and a result of the binarization process, and filtering the image file through a plurality of masks whose sizes are gradually reduced, to show a plurality of separated graphic components on a portion of the first graphic feature, until a number of the separated graphic components is the same as a number of the at least one graphic block; and assigning a label to each of the separated graphic components according to the attribute of the at least one graphic block or, alternatively, outputting a message, receiving a command, and adjusting the at least one graphic block and attribute thereof as the label according to the command. The following examples illustrate possible implementations to aid understanding of the relevant content, but are not limiting.
For example, as shown in
As shown in
As shown in
The external data can also be the document file, and the processor can convert the document file to provide the image file. The image file can be provided at least in the manners mentioned above and is not limited here. For example, the document file can be a digital document file containing various graphic examples, such as different electronic component views, and text examples, such as explanatory text. For example, the document file can be in the portable document format (“*.pdf”), and the content of the file can include technical documents (datasheets) related to electronic components, but is not limited here.
The digital document file can also be in other file formats that include graphics and text, such as “*.doc” or “*.odt.” Additionally, the external data can also include other data, such as tables. The image file can be in a compressed or uncompressed file format, such as “*.jpg,” but it is not limited here. The image file can also be in other file formats, such as “*.png” or “*.bmp.”
Hereinafter, an image file can refer to file data or to screen content in which the data of the file is displayed on a display device, and can also be called a picture. Subsequently, step S2 can be performed.
As shown in
For example, values, words, pointers, signs, or a symbolic combination thereof can be used to represent the first object. For example, in the process of object detection, object classification, and block recognition, a model can be trained on data chosen according to the application context of the data to be labeled, and a matching machine learning model and parameters can be selected.
For example, the object detection method can be a method based on a deep learning model, such as real-time object detection (You only look once Version 3: Unified, Real-Time Object Detection, or YoloV3, in short), efficient and accurate scene text detection (EAST), convolutional neural network (CNN), and region-based CNN (R-CNN), in which the setting of model features (such as architectures and parameters) can be understood by those skilled in the art and is not described here.
For example, the graphic block can be a partial feature diagram of an electronic component, e.g., a view of the electronic component. The attribute of the graphic block is configured as a view direction, such as top, bottom, long-side, short-side, pattern and/or section, but is not limited here.
For example, the above method, after generating a plurality of graphic blocks and their attributes according to the detection result, further includes detecting a plurality of second objects (such as text examples) from the image file, and generating a plurality of text blocks and their attributes according to a second detection result (i.e., a result of detecting the second objects). For example, values, words, pointers, signs, or a symbolic combination thereof can be used to represent the second object. For example, the second object detection method can be a method based on a deep learning model, such as YoloV3, EAST, CNN, and R-CNN. Subsequently, step S3 can be performed.
Optionally, as shown in
As shown in
For example, in the binarization process, the pictures can be binarized directly, but this is not limiting; e.g., the pictures can first be converted into grayscale, such as by an adaptive grayscale algorithm, and then be binarized. In addition, the pictures can be dilated and eroded by dilation and erosion algorithms, which can be understood by those skilled in the art and are not described here.
In this example, a scheme of grayscale and binarization can be adopted. After a picture is pre-processed by the binarization process, the erosion algorithm (e.g., using one 13×13 mask) can be applied to erode the part of the picture shown in white, such that the part of the picture showing non-background content (such as pixels of a partial feature of an electronic component) expands outward. Thus, the object extraction effect can be improved.
Subsequently, several gradually reduced masks (e.g., 13×13, 11×11, 9×9, 7×7, 5×5, and 3×3 masks, in sequence) are used to filter the image file, so that the contents of the image file are shown progressively more clearly.
For example, the picture carrying the above detection result (such as only the first detection result, or the combined first and second detection results) and the picture pre-processed by the binarization process can be combined to further narrow the range of object determination. Also, the range of objects can be defined by filling, such as with a region filling algorithm or a flood filling algorithm, which can be understood by those skilled in the art and is not described here.
Subsequently, the masks are gradually reduced for filtering. For example, if the ranges of the defined objects are connected to each other, the degree of erosion can be re-adjusted, e.g., by first changing the mask to 11×11 for an initial separation and count of a plurality of graphic components. To facilitate identification, the attributes of the graphic blocks generated according to the object detection results can be applied to the separated graphic components (such as by painting them in different colors) to avoid confusion. Then, the mask can be gradually reduced so that the part of the image file presenting the first graphic feature shows the separated graphic components, until the number of the separated graphic components and the number of the graphic blocks are the same. At that point, this step can stop, and these separated graphic components are used as the targets to be labeled.
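The grayscale-then-binarize option can be illustrated with a minimal, dependency-free sketch. It is only an illustration under stated assumptions: the picture is assumed to be a nested list of RGB tuples, the luminance weights are the common ITU-R BT.601 values, and a fixed threshold of 128 stands in for the adaptive algorithm mentioned above. The function names `to_grayscale` and `binarize` are hypothetical.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb
    ]


def binarize(gray, threshold=128):
    """Map each gray value to 255 (white) or 0 (black).

    The fixed threshold is an assumption made for this sketch; an adaptive
    scheme would choose the threshold per picture or per neighborhood.
    """
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


# A 2 x 2 picture: light pixels on the left, dark pixels on the right.
picture = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 200, 200), (30, 30, 30)],
]
binary = binarize(to_grayscale(picture))
```

In this sketch, light background pixels become white and dark drawing pixels become black, which matches the white-background convention used in the erosion example.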
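The effect of eroding the white part can be sketched as a minimum filter over a k×k window. This is a plain-Python illustration, not the disclosed implementation: the function name `erode_white` is hypothetical, the picture is assumed to be a nested list of 0 and 255 values, and pixels outside the picture are simply ignored at the borders.

```python
def erode_white(img, k):
    """Erode the white (255) region of a binary picture with a k x k mask.

    A pixel stays white only if every pixel under the mask is white, so
    dark (0) content grows outward by k // 2 pixels in each direction.
    """
    h, w, r = len(img), len(img[0]), k // 2
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = (
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
            out[y][x] = min(window)
    return out


# One dark pixel in the middle of a white 5 x 5 picture.
picture = [[255] * 5 for _ in range(5)]
picture[2][2] = 0
eroded = erode_white(picture, 3)
```

With a 3×3 mask, the single dark pixel grows into a 3×3 dark square; a 13×13 mask would grow it by six pixels on each side in the same way.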
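One possible way to combine the two results and then define an object range by filling is sketched below. The representation is an assumption made for illustration: the binarized picture is a nested list in which 1 marks non-background pixels, each detected block is a hypothetical `(x0, y0, x1, y1)` box with exclusive upper bounds, and the helper names `restrict_to_blocks` and `flood_fill` are not taken from the disclosure.

```python
def restrict_to_blocks(fg, boxes):
    """Keep foreground pixels only where they fall inside a detected box."""
    h, w = len(fg), len(fg[0])
    out = [[0] * w for _ in range(h)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(h, y1)):
            for x in range(max(0, x0), min(w, x1)):
                out[y][x] = fg[y][x]
    return out


def flood_fill(fg, seed):
    """Return the set of (y, x) pixels 4-connected to the seed pixel."""
    h, w = len(fg), len(fg[0])
    sy, sx = seed
    if not fg[sy][sx]:
        return set()
    region, stack = {(sy, sx)}, [(sy, sx)]
    while stack:
        y, x = stack.pop()
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= ny < h and 0 <= nx < w and fg[ny][nx] and (ny, nx) not in region:
                region.add((ny, nx))
                stack.append((ny, nx))
    return region


foreground = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
narrowed = restrict_to_blocks(foreground, [(0, 0, 2, 2)])
object_range = flood_fill(narrowed, (0, 0))
```

Here the right-hand column of foreground pixels lies outside the detected box and is discarded, so the filled range covers only the 2×2 object inside the box.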
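The stopping rule described above (reduce the mask until the number of separated components equals the number of graphic blocks) can be sketched as follows. This is a simplified, dependency-free illustration under stated assumptions: the picture is a nested list in which 1 marks non-background pixels, eroding the white background is modeled as the equivalent dilation of the foreground, components are counted with 4-connectivity, and the names `dilate`, `count_components`, and `find_separating_mask` are hypothetical.

```python
from collections import deque


def dilate(img, k):
    """Grow foreground (1) pixels with a k x k mask.

    This is equivalent to eroding the white background with the same mask.
    """
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - r), min(h, y + r + 1)):
                    for xx in range(max(0, x - r), min(w, x + r + 1)):
                        out[yy][xx] = 1
    return out


def count_components(img):
    """Count 4-connected foreground regions with a breadth-first search."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def find_separating_mask(img, n_blocks, sizes=(13, 11, 9, 7, 5, 3)):
    """Return the first mask size whose component count equals n_blocks."""
    for k in sizes:
        if count_components(dilate(img, k)) == n_blocks:
            return k
    return None  # no mask size matched; manual adjustment may be needed


# Two small objects separated by a six-pixel gap.
picture = [[0] * 20 for _ in range(3)]
picture[1][5] = 1
picture[1][12] = 1
mask_size = find_separating_mask(picture, n_blocks=2)
```

In this example the two objects stay merged for the 13×13 through 7×7 masks and first separate at 5×5, so the search stops there. Returning `None` when no size matches leaves room for the alternative path in which a message is output and a command is received.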
As shown in
For example, step S6 may be configured to give a label to each of the separated graphic components according to the attribute of the at least one graphic block. For example, the size of the mask is adjusted until the graphic components are completely separated. When the number of the separated graphic components is the same as the number of the at least one graphic block, the attribute of the at least one graphic block (such as a category result detected from one graphic block object) is correspondingly applied to the graphic component as the label. The label can be presented in the form of a digital file, but is not limited here. For example, different labels can be presented in different colors at a position corresponding to the respective graphic component to avoid visual confusion.
In another aspect, for example, the adjusting process may be configured to output a message, receive a command, and adjust the at least one graphic block and attribute thereof as the label according to the command. Examples are as follows.
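A minimal sketch of this label assignment is given below. The matching rule is an assumption made for illustration: each separated component is assumed to be a list of `(y, x)` pixels, each graphic block is a hypothetical pair of an `(x0, y0, x1, y1)` box and an attribute string, and a component takes the attribute of the block whose box contains its centroid. The function name `assign_labels` is not taken from the disclosure.

```python
def assign_labels(components, blocks):
    """Give each component the attribute of the block containing its centroid.

    Returns None when the counts differ, since the one-to-one
    correspondence described above does not hold in that case.
    """
    if len(components) != len(blocks):
        return None
    labels = []
    for pixels in components:
        cy = sum(y for y, _ in pixels) / len(pixels)
        cx = sum(x for _, x in pixels) / len(pixels)
        label = None
        for (x0, y0, x1, y1), attribute in blocks:
            if x0 <= cx < x1 and y0 <= cy < y1:
                label = attribute
                break
        labels.append(label)
    return labels


components = [[(1, 1), (1, 2)], [(1, 11), (2, 11)]]
blocks = [((10, 0, 14, 4), "bottom"), ((0, 0, 4, 4), "top")]
labels = assign_labels(components, blocks)
```

The result lists one label per component in component order, independent of the order in which the blocks were detected.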
For example, as shown in
As shown in
As shown in
Optionally, as shown in
Optionally, as shown in
Optionally, as shown in
Optionally, as shown in
Optionally, as shown in
Optionally, in one embodiment, after generating at least one graphic block and attribute thereof according to a first detection result, the method further includes: detecting at least one second object from the image file, and generating at least one text block and attribute thereof according to a second detection result. In this way, the text block and attribute thereof can be used to exclude the graphic block and attribute thereof to avoid misrecognition of the graphic block and attribute thereof, which can improve the labeling performance of image objects.
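One way such an exclusion could be realized is to drop any graphic block that largely coincides with a detected text block. The sketch below is an assumption, not the disclosed procedure: boxes are hypothetical `(x0, y0, x1, y1)` tuples, overlap is measured by intersection over union (IoU), and the 0.5 threshold is an arbitrary illustrative value.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def exclude_text(graphic_boxes, text_boxes, threshold=0.5):
    """Keep graphic boxes that do not substantially overlap any text box."""
    return [
        g for g in graphic_boxes
        if all(iou(g, t) < threshold for t in text_boxes)
    ]


graphic_boxes = [(0, 0, 10, 10), (20, 0, 30, 10)]
text_boxes = [(1, 0, 10, 10)]
kept = exclude_text(graphic_boxes, text_boxes)
```

In this example the first graphic box almost coincides with a text box and is treated as a misrecognition, while the second is kept.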
Optionally, in another application scenario, if the graphic block derived from the first object and the text block derived from the second object need to be combined, a relationship between the graphic block and the text block can be created. After the graphic components are cut, the graphic block, the attribute of the graphic block, the text block, and the attribute of the text block can be fetched in the graphic components to improve performance of labeling image objects.
Optionally, in one embodiment, the method further includes: processing the at least one graphic block and the at least one text block by a filling algorithm. In this way, results of the binarization pre-processing and the object detection can be effectively combined to improve performance of labeling image objects.
Optionally, in one embodiment, before performing the binarization process, the method further includes: performing a grayscaling process on the at least one graphic block and the at least one text block. In this way, the grayscale operation can be used to effectively increase identification of the content of the image file to avoid residual noise of the image file and to improve object matching degree of the binarization operation and performance of labeling image objects.
Optionally, in one embodiment, the method further includes: processing the region presenting the second graphic feature by a dilation algorithm or an erosion algorithm. In this way, a certain processing effect can be provided while making good use of computing resources, to improve performance of labeling image objects.
Optionally, in one embodiment, the label includes a pattern, a text, or a combination thereof. In this way, the breadth of the labels can be enriched to meet the needs of data processing or display applications, so that contents of the labels can be adjusted or confirmed, improving performance of labeling image objects, integrity of labeled image objects, and reusability of labeled image objects.
Optionally, in one embodiment, the at least one graphic block includes a partial feature diagram of an electronic component, and the attribute of the graphic block includes a directional attribute. In this way, automatic labeling operations can be performed for data that requires a large amount of processing (such as digital document files of electronic components) to improve the efficiency of labeling image objects.
To help a person understand the features of the embodiments of the present disclosure, the following example takes a technical file (datasheet) of an electronic component as the input document file and illustrates how the objects are processed in the above embodiments. It is not intended to be limiting.
For example, as shown in
Subsequently, as shown in
Subsequently, as shown in
Subsequently, as shown in
Subsequently, as shown in
Subsequently, as shown in
Subsequently, as shown in
Subsequently, as shown in
Furthermore, because the distributions of objects differ from document to document, the matching degrees between the labels and the objects also differ from document to document. The above content is taken as an example. Suppose there is an image file with similar content to the image file shown in
In another aspect, the present disclosure further provides an image object labeling system, which includes a processor coupled to a memory storing instructions configured to be executed by the processor to perform the above image object labeling method.
For example, the image object labeling system can be configured as an electronic device with data processing functions. The electronic device can be a cloud platform machine, a server, a desktop computer, a notebook computer, a tablet, or a smartphone, but is not limited here, and performs the above image object labeling method, which is described above and is not repeated here.
In another aspect, the present disclosure further provides a tangible, non-transitory, computer readable medium, storing instructions that cause a computer to execute operations including: providing an image file; detecting at least one first object from the image file, and generating at least one graphic block and attribute thereof according to a first detection result; performing a binarization process on the image file, to present a first graphic feature on a region of the image file containing the graphic block and present a second graphic feature on the remaining region of the image file; combining the first detection result and a result of the binarization process, and filtering the image file through a plurality of masks whose sizes are gradually reduced, to show a plurality of separated graphic components on a portion of the first graphic feature, until a number of the separated graphic components is the same as a number of the at least one graphic block; and assigning a label to each of the separated graphic components according to the attribute of the at least one graphic block or, alternatively, outputting a message, receiving a command, and adjusting the at least one graphic block and attribute thereof as the label according to the command.
After the instructions are loaded and executed by the computer, the computer can execute the above-mentioned image object labeling method. For example, the tangible, non-transitory, computer readable medium may contain several program instructions, which can be written in existing programming languages to implement the above-mentioned image object labeling method, such as Python combined with the NumPy, Matplotlib, and TENSORFLOW packages, but not limited here.
In another aspect, the present disclosure further provides a computer-readable medium, such as an optical disc, a flash drive, or a hard disk, but not limited here. It should be understood that the computer-readable medium can further be configured as other forms of computer data storage medium, e.g., cloud storage (such as ONEDRIVE, GOOGLE Drive, AZURE Blob, or a combination thereof), a data server, or a virtual machine. The computer can read the program instructions stored in the computer-readable medium. After the computer loads and executes the program instructions, the computer can complete the above-mentioned image object labeling method.
In summary, the image object labeling method, system, and tangible non-transitory computer readable medium of the present disclosure are provided for providing an image file; detecting at least one first object from the image file, and generating at least one graphic block and attribute thereof according to a first detection result; performing a binarization process on the image file, to present a first graphic feature on a region of the image file containing the graphic block and present a second graphic feature on the remaining region of the image file; combining the first detection result and a result of the binarization process, and filtering the image file through a plurality of masks whose sizes are gradually reduced, to show a plurality of separated graphic components on a portion of the first graphic feature, until a number of the separated graphic components is the same as a number of the at least one graphic block; and assigning a label to each of the separated graphic components according to the attribute of the at least one graphic block or, alternatively, outputting a message, receiving a command, and adjusting the at least one graphic block and attribute thereof as the label according to the command. Thus, after the label creation process mentioned above, the image objects are labeled. This can greatly shorten the time needed to create labels for objects and helps improve the label construction efficiency of the data set for the development and application of the machine learning model.
Although the present disclosure has been disclosed in preferred embodiments, which are not intended to limit the disclosure, those skilled in the art can make various changes and modifications without departing from the spirit and scope of the disclosure. Therefore, the scope of protection of the present disclosure is defined by the claims.