The present disclosure relates to an image-based classification system. More particularly, the disclosure relates to an image classification system that uses an artificial neural network to perform image segmentation on an image of an object in order to inspect any irregularity in an inner layer of the object.
Automated optical inspection equipment has wide application and is frequently used in the front-end and back-end manufacturing processes of display panels and semiconductor products to inspect product defects. For example, an automated optical inspection system associated with the manufacture of display panels may carry out glass inspection, array inspection (in the front-end array manufacturing process), color filter inspection, back-end liquid crystal module inspection, and so on.
When classifying images through machine vision, a conventional automated optical inspection system typically uses an edge detection algorithm to determine edge locations, and in some cases the image being analyzed must be marked manually to create proper masks (e.g., when a watershed algorithm is used). Such image classification methods are reasonably reliable but suffer from operational limitations and unsatisfactory inspection efficiency.
The primary objective of the present disclosure is to provide an image-based classification system comprising an image capturing device and a processing device connected to the image capturing device. The image capturing device is used for capturing an image of an object, wherein the object has a surface layer and an inner layer. The processing device is configured to use a deep learning model to perform image segmentation on the image of the object, define a surface-layer region and an inner-layer region of the image, and generate classification information. The surface-layer region and the inner-layer region correspond respectively to the surface layer and the inner layer of the object.
The present disclosure is so designed that, without having to resort to manually designed features, an artificial neural network can automatically extract the region corresponding to an irregularity in an inner layer of a display panel from an image of the display panel so as to improve the efficiency and the reliability of the inspection.
According to the present disclosure, image segmentation of, and defect inspection in, the region of an image that corresponds to an irregularity in an inner layer of the object can be completed in a single inspection procedure, thus providing better inspection efficiency than the conventional algorithms.
The details and technical solution of the present disclosure are hereunder described with reference to the accompanying drawings. For the sake of illustration, the accompanying drawings are not drawn to scale. The accompanying drawings and the scale thereof are not restrictive of the present disclosure.
The present disclosure is intended for use in an automated optical inspection (AOI) system and involves automatically generating a mask in an image of a display panel through an artificial neural network and extracting a region of interest (i.e., a region on which defect inspection will be performed) from the image according to the mask, in order to achieve higher inspection reliability and higher inspection efficiency than in the prior art.
Please refer to
In this embodiment, the image-based classification system 100 essentially includes an image capturing device 10 (e.g., a camera), and a processing device 20 connected to the image capturing device 10. To enable fully automated inspection and fully automated control, a carrier 30 is generally also included to carry an object P to an inspection area, where images of the object P are taken. In addition, for different types of objects P or defects, the image-based classification system 100 may be equipped with a variety of auxiliary light sources 40 for illuminating the object P. The auxiliary light source 40 may be, but is not limited to, a lamp emitting collimated light, a lamp emitting diffused light, or a dome lamp. Depending on the type of the object P, some special objects P may require two or more auxiliary light sources 40 at the same time.
The camera for use in automated optical inspection should be chosen according to the practical requirements. A high-precision camera is called for when stringent requirements are imposed on the precision and the reliability of a workpiece to be inspected. A low-end camera, on the other hand, may be used to reduce equipment cost. In short, the choice of the camera is at the user's discretion. Cameras for automated optical inspection can be generally categorized as area scan cameras or line scan cameras, either of which may be used to meet the practical requirements. A line scan camera is often used for dynamic inspection, in which the object P is photographed while moving, and which ensures the continuity of the inspection process.
The image capturing device 10 is connected to the back-end processing device 20. Images taken by the image capturing device 10 are analyzed by the processor 21 of the processing device 20 in order to find defects on the surface of, or inside, the object P. Preferably, the image capturing device 10 is provided with a microprocessor (generally a built-in feature of the image capturing device 10) for controlling the image capturing device 10 or preprocessing the images taken by the image capturing device 10. After the processor 21 of the processing device 20 obtains images from the image capturing device 10 (or its microprocessor), the processor 21 preprocesses the images (e.g., through image enhancement, noise removal, contrast enhancement, edge enhancement, feature extraction, image compression, and/or image conversion), and outputs the preprocessed images to be analyzed by a visual software tool and related algorithms so as to produce a determination result, which is either output or stored in a database. In this embodiment, the processor 21 is configured to load a deep learning model M1 (see
Please refer to
The present disclosure uses a mask region-based convolutional neural network (Mask R-CNN for short) as its main structure and modifies the network in order to achieve image segmentation and defect identification (inspection) at the same time. Image segmentation and defect inspection are performed by the processor 21 after the processor 21 loads the data in the storage unit 22. The collaboration between the processor 21 and the storage unit 22, however, is not an essential feature of the disclosure and therefore will not be detailed herein.
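For readers who wish to experiment with the general technique, the following is a minimal sketch using the publicly available Mask R-CNN implementation in torchvision (PyTorch); it is not the modified network of this disclosure, and the input size is arbitrary. Like the deep learning model M1 described below, it returns bounding boxes, class labels, and per-instance masks in a single forward pass.

```python
import torch
import torchvision

# Stock Mask R-CNN with a ResNet-50 + FPN backbone (torchvision >= 0.13).
# weights=None keeps the example offline; pass weights="DEFAULT" for
# pretrained COCO weights.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None)
model.eval()

# A dummy panel image: 3-channel float tensor in [0, 1], any H x W.
image = torch.rand(3, 512, 512)

with torch.no_grad():
    # The model accepts a list of images and returns one dict per image.
    outputs = model([image])

# Each dict holds boxes (regions of interest), class labels, confidence
# scores, and per-instance masks -- detection and instance segmentation
# are produced together in one pass.
print(outputs[0]["boxes"].shape, outputs[0]["labels"].shape,
      outputs[0]["masks"].shape)
```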
The processor 21 is configured to execute the deep learning model M1 after loading it from the storage unit 22, to define a surface-layer region P1 and an inner-layer region P2 of an image of the object P and thereby generate the corresponding classification information, to identify any defect P21 in the inner-layer region P2 according to the classification information, and to produce an inspection result.
A preferred embodiment of the present disclosure is described below with reference to
Referring to
Referring to
The feature extraction network N11 includes a plurality of first convolutional layers N111, N112, N113, N114, and N115, which are sequentially arranged in a bottom-to-top order, with the bottom convolutional layer (e.g., the first convolutional layer N111) configured to extract the low-level features of the image being analyzed, and the upper convolutional layers (e.g., the first convolutional layers N112 to N115) configured to extract the high-level features of the image being analyzed. The number of convolutional layers can be set as required by the samples and is not an essential feature of the present disclosure. The original image IP is normalized before being input into the bottom first convolutional layer N111 and has its features extracted by the first convolutional layers N111 to N115 to produce a plurality of feature maps. In one preferred embodiment, the feature extraction network N11 is a deep residual network (ResNet), whose relatively good convergence properties alleviate the degradation problem of deep networks.
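As an illustration of this kind of multi-level feature extraction, the sketch below runs a stock torchvision ResNet-50 stage by stage and collects one feature map per stage. The stage names (c2 to c5), the image size, and the normalization constants are assumptions for the example, not part of the disclosure.

```python
import torch
import torchvision

# Illustrative only: a stock ResNet-50 used as a feature extractor, with one
# feature map taken per stage (low-level at the bottom, high-level at the top).
# Requires torchvision >= 0.13 for the weights argument.
resnet = torchvision.models.resnet50(weights=None)
resnet.eval()

def extract_features(x):
    # Stem: roughly the role of the bottom layer.
    x = resnet.conv1(x)
    x = resnet.bn1(x)
    x = resnet.relu(x)
    x = resnet.maxpool(x)
    c2 = resnet.layer1(x)   # low-level features, largest spatial size
    c3 = resnet.layer2(c2)
    c4 = resnet.layer3(c3)
    c5 = resnet.layer4(c4)  # high-level features, smallest spatial size
    return [c2, c3, c4, c5]

# Normalize the input image before it enters the bottom layer, as described.
img = torch.rand(1, 3, 224, 224)
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
for f in extract_features((img - mean) / std):
    print(f.shape)  # channel depth grows while spatial size shrinks
```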
As far as target inspection is concerned, a relatively low-level feature map contains a relatively small amount of information but is relatively large in size and hence relatively accurate in terms of target position, which helps identify image details. A relatively high-level feature map, on the other hand, contains a relatively large amount of information but is relatively inaccurate in terms of target position, with relatively large strides that hinder the inspection of relatively small targets in the image being analyzed. To enhance the accuracy of inspection, the backbone network N1 further includes the feature pyramid network N12 to ensure the accuracy of target positions as well as the desired amount of information. More specifically, the feature pyramid network N12 upsamples pixels in the feature maps output from the upper first convolutional layers N112, N113, N114, and N115 to obtain a plurality of same-size feature maps N121, N122, N123, and N124, which correspond in number to the outputs of the feature extraction network N11 and also correspond in size respectively to the feature maps output from the first convolutional layers N112, N113, N114, and N115. The feature pyramid network N12 then merges each of the feature maps output from the first convolutional layers N112, N113, N114, and N115 with the corresponding same-size feature maps N121, N122, N123, or N124 to obtain a plurality of initially merged feature maps, on which convolution is subsequently performed by the plural second convolutional layers of the feature pyramid network N12 respectively to produce a plurality of merged feature maps Q1, Q2, Q3, and Q4 to be output by the feature pyramid network N12. Therefore, the bottom-layer output can be used to inspect small targets in the image being analyzed; the middle-layer outputs, to inspect medium-sized targets; and the top-layer output, to inspect large targets. The selected output features are dynamically determined according to target size.
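The top-down merging just described can be sketched as follows. This is a generic feature pyramid network in plain PyTorch, assuming (as in the standard FPN design) that "merging" is element-wise addition after a 1x1 lateral convolution; the channel counts follow the ResNet example above and are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFPN(nn.Module):
    """Sketch of the top-down merge: each backbone map is brought to a
    common channel width, the coarser map is upsampled to the finer map's
    size and added (the "initially merged" maps), and a 3x3 convolution
    (the second convolutional layers) then smooths each result."""

    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels)

    def forward(self, c2, c3, c4, c5):
        # Lateral 1x1 convolutions give every level the same channel depth.
        p5 = self.lateral[3](c5)
        p4 = self.lateral[2](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[1](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        p2 = self.lateral[0](c2) + F.interpolate(p3, scale_factor=2, mode="nearest")
        # 3x3 smoothing yields the merged maps (cf. Q1-Q4), finest first:
        # the finest serves small targets, the coarsest serves large ones.
        return [s(p) for s, p in zip(self.smooth, (p2, p3, p4, p5))]

fpn = TinyFPN()
c2, c3, c4, c5 = (torch.rand(1, c, s, s)
                  for c, s in [(256, 56), (512, 28), (1024, 14), (2048, 7)])
for q in fpn(c2, c3, c4, c5):
    print(q.shape)  # same 256-channel depth at four spatial resolutions
```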
Referring to
More specifically, referring to
Generally, a classifier is designed to deal with fixed input dimensions only and cannot properly handle variable input dimensions. Since the bounding-box fine-tuning step of the region proposal network N2 allows the at least one region of interest RO to have different sizes, image compression must be carried out by a pooling operation in order to normalize the input image. During the pooling operation, the ROI aligning module N3 uses bilinear interpolation to reduce errors resulting from quantization, or more particularly from the rounding of floating-point numbers. More specifically, referring to
Once the normalized image NM is input into the fully convolutional network N4, the fully convolutional network N4 classifies the normalized image NM pixel by pixel and thereby produces the instance segmentation masks SD.
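The pooling and mask-prediction stages described in the last two paragraphs can be illustrated with torchvision's roi_align operator, which performs the kind of bilinear-interpolation pooling mentioned above, followed by a toy fully convolutional head. The layer sizes and the example region coordinates are assumptions for illustration, not the disclosed network N4.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

features = torch.rand(1, 256, 56, 56)             # one merged feature map
rois = torch.tensor([[0, 4.3, 9.7, 30.2, 41.8]])  # (batch_idx, x1, y1, x2, y2)

# Fixed 14x14 output regardless of the ROI's original size; sub-pixel
# coordinates are kept as floating point instead of being rounded.
# (In a real pipeline, spatial_scale maps image coordinates onto the
# feature map; here the ROI is already in feature-map coordinates.)
pooled = roi_align(features, rois, output_size=(14, 14),
                   spatial_scale=1.0, sampling_ratio=2)

mask_head = nn.Sequential(                        # a toy stand-in for N4
    nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(256, 256, kernel_size=2, stride=2), nn.ReLU(),
    nn.Conv2d(256, 1, kernel_size=1), nn.Sigmoid())  # per-pixel foreground score

masks = mask_head(pooled)
print(pooled.shape, masks.shape)  # (1, 256, 14, 14) and (1, 1, 28, 28)
```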
Through the foregoing computations, the deep learning model M1 produces a total of three outputs, namely the merged feature maps Q1-Q4, the at least one region of interest RO and the instance segmentation masks SD, wherein the instance segmentation masks SD have been directly mapped onto the merged feature maps Q1-Q4 to avoid repeated feature extraction.
Thus, the processing device 20 performs inspection according to the aforesaid classification information, identifies any defect in the inner-layer region P2, and outputs the inspection result.
As stated above, the deep learning model M1 further includes the background removal module N5 and the fully connected layer N6. The background removal module N5 associates the merged feature maps Q1-Q4 with the at least one region of interest RO, segments the merged feature maps Q1-Q4 accordingly, and removes the background of the segmented merged feature maps Q1-Q4 according to the instance segmentation masks SD to obtain background-removed feature images of the object P. As the input of the fully connected layer N6 must be a normalized image, the background-removed areas of the background-removed feature images can be filled with a single image parameter so that the images input into the fully connected layer N6 meet this requirement. (It is worth mentioning that the training images used in the training process may be images showing both the inner-layer region and the surface-layer region or images showing only the inner-layer region.) Once the background-removed feature images of the object P are input into the trained fully connected layer N6, whose output end may be the softmax layer N22, the trained fully connected layer N6 weights and classifies the images and then outputs a classification result N7, such as "non-defective" or the type(s) of the defect(s).
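A sketch of this background-removal-and-classification step is given below: the instance mask zeroes out everything outside the object, a single constant fills the removed area so that every classifier input has the same normalized form, and a small fully connected head with a softmax output produces class probabilities. The fill value, tensor sizes, and four-class output are illustrative assumptions, not the disclosed modules N5 and N6.

```python
import torch
import torch.nn as nn

FILL_VALUE = 0.0                       # the single image parameter (assumed)

feature_img = torch.rand(1, 256, 28, 28)          # a segmented feature image
mask = (torch.rand(1, 1, 28, 28) > 0.5).float()   # binary instance mask

# Keep foreground pixels; replace the removed background with the constant.
background_removed = feature_img * mask + FILL_VALUE * (1.0 - mask)

classifier = nn.Sequential(            # a toy stand-in for N6
    nn.Flatten(),
    nn.Linear(256 * 28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 4))                 # 4 = "non-defective" + 3 defect types

logits = classifier(background_removed)
probs = torch.softmax(logits, dim=-1)  # the softmax output end
print(probs)
```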
According to the above, the present disclosure is so designed that an artificial neural network can automatically extract from an image of a display panel the region corresponding to an irregularity in an inner layer of the display panel. In addition, according to the present disclosure, image segmentation of, and defect inspection in, the region of an image that corresponds to the irregularity in the inner layer of the object can be completed in one inspection procedure, thus providing much higher inspection efficiency than is achievable by the conventional algorithms.
The above is a detailed description of the present disclosure. However, the above is merely a preferred embodiment of the present disclosure and is not intended to limit the scope of the disclosure; variations and modifications made in accordance with the present disclosure shall still fall within the scope of the disclosure.
Number | Date | Country | Kind
---|---|---|---
108127221 | Jul 2019 | TW | national