None.
The present invention relates generally to a hierarchical classification system and/or a defect detection system for an image.
Referring to
One type of alignment technique includes feature point based alignment. Feature point based alignment extracts discriminative interest points and features from the model image and the input images. Those features are then matched between the model image and the input images with a K-nearest neighbor search or some feature point classification technique. A homography transformation is then estimated from the matched feature points, which may be further refined.
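By way of a non-limiting illustration, feature point based alignment may be sketched as follows. The sketch assumes the OpenCV library; the particular detector (ORB), the ratio test, and the RANSAC threshold are illustrative choices rather than requirements of the technique.

```python
# Sketch of feature point based alignment (assumes OpenCV). The detector,
# ratio-test threshold, and RANSAC threshold are illustrative assumptions.
import cv2
import numpy as np

def align_by_feature_points(model_img, input_img):
    orb = cv2.ORB_create(nfeatures=2000)            # extract interest points + descriptors
    kp_m, des_m = orb.detectAndCompute(model_img, None)
    kp_i, des_i = orb.detectAndCompute(input_img, None)
    if des_m is None or des_i is None:
        return None                                  # too few interest points

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)        # K-nearest neighbor search (K=2)
    knn = matcher.knnMatch(des_m, des_i, k=2)
    good = [p[0] for p in knn
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]  # ratio test
    if len(good) < 4:
        return None                                  # not enough matches for a homography

    src = np.float32([kp_m[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_i[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)  # estimate and refine
    return H                                         # maps model coordinates to input coordinates
```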
Feature point based alignment works well when target objects contain a sufficient number of interest points. Feature point based alignment typically fails to produce a valid homography when the target object in the input or model image contains few or no interest points (e.g., corners), or the target object is very simple (e.g., the target object consists of only edges, like a paper clip) or symmetric, and/or the target object contains repetitive patterns (e.g., a machine screw). In these situations, too many ambiguous matches prevent generating a valid homography. To reduce the likelihood of such failure, global information of the object such as edges, contours, or shape may be utilized instead of merely relying on local features.
Another type of alignment technique is to search for the target object by sliding a window of a reference template in a point-by-point manner, and computing the degree of similarity between them, where the similarity metric is commonly given by correlation or normalized cross correlation. Pixel-based template matching is very time-consuming and computationally expensive. For an input image of size N×N and a model image of size W×W, the computational complexity is O(W²×N²), given that the object orientation in both the input and model image is coincident. When searching for an object with arbitrary orientation, one technique is to do template matching with the model image rotated in every possible orientation, which makes the matching scheme far more computationally expensive.
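By way of a non-limiting illustration, sliding-window matching with normalized cross correlation may be sketched as follows; the brute-force loop makes the O(W²×N²) cost explicit. The function and variable names are illustrative, and only a single orientation is searched.

```python
# Sketch of sliding-window template matching with normalized cross correlation
# (NCC). Pure NumPy, brute force, single orientation, so the cost is
# O(W^2 * N^2) as discussed above.
import numpy as np

def ncc_match(image, template):
    N1, N2 = image.shape
    W1, W2 = template.shape
    t = template.astype(np.float64) - template.mean()
    t_norm = np.sqrt((t ** 2).sum()) + 1e-12
    scores = np.full((N1 - W1 + 1, N2 - W2 + 1), -1.0)
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            w = image[y:y + W1, x:x + W2].astype(np.float64)
            w = w - w.mean()
            w_norm = np.sqrt((w ** 2).sum()) + 1e-12
            scores[y, x] = (w * t).sum() / (w_norm * t_norm)  # NCC in [-1, 1]
    best = np.unravel_index(np.argmax(scores), scores.shape)
    return best, scores[best]   # top-left corner of the best match and its score
```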
With regard to image classification, many techniques involve using a nearest neighbor classifier, a Naïve Bayes classifier, neural networks, decision trees, multi-variate regression models, or support vector machines. Often each of these techniques involves a classification scheme where category models are learned from initial labeled training data and then each testing example is assigned to a class out of a finite and small set of classes.
Defect detection based upon a supervised classification is one detection category. However, it is often difficult to gather a reasonably sized set of training samples with labeled defect masks, since this requires cumbersome manual annotation. Labeling by human operators consumes substantial resources, especially given that new datasets and defects periodically arise. Given the high intra-class and inter-class variance of potential defects, designing suitable features tends to be problematic.
Another category of defect detection views defect detection as saliency detection. Saliency detection typically estimates coarse and subjective saliency support on natural images, and often leads to severe over-detection while making a number of assumptions in the process.
Another category of defect detection views defect detection as anomaly detection. For example, analyzing the input image in the Fourier domain may only locate small defects on uniformly textured or periodic patterned images, such as a fabric surface. The anomaly detection process is not suitable for large sized defects.
Another category of visual defect detection is based on the use of a defect free “reference” or “model” image. The model image may contain an “ideal” view of the product or parts thereof. The input image may contain a view of the product under inspection and is compared with the model image to detect defects. In principle, deviations or differences from the model image present in the input image may indicate one or more defects.
What is desired therefore is a computationally efficient classification technique and/or a computationally efficient defect detection technique.
The foregoing and other objectives, features, and advantages of the invention may be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.
Referring to
Referring to
Referring to
Referring again to
Referring to
In general, an image classification problem can be defined as: given a set of training examples composed of pairs {xi, yi}, find a function f(x) that maps each xi to its associated class yi, i=1, 2, . . . , n, where n is the total number of training examples. After training, the predictive accuracy of the learned classification function is evaluated by using it to classify a set of unlabeled examples, unseen during training. This evaluation measures the generalization ability (i.e., predictive accuracy) of the learned classification function. Classification has many applications, including for example, text classification (e.g., document and web content), image classification, object classification, image annotation (e.g., classify image regions into different areas), face classification, face recognition, biological classification, biometric identification, handwriting recognition, medical image classification, drug discovery, speech recognition, and Internet search engines.
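By way of a non-limiting illustration, the classification setting may be sketched as follows, with a simple nearest-neighbor rule standing in for any learned classification function f(x); all names are illustrative.

```python
# Sketch of the classification setting: learn f(x) from labeled pairs
# {x_i, y_i}, then measure predictive accuracy on examples unseen during
# training. A 1-nearest-neighbor rule stands in for any classifier.
import numpy as np

def fit_1nn(train_x, train_y):
    def f(x):
        d = np.linalg.norm(train_x - x, axis=1)   # distance to every training example
        return train_y[np.argmin(d)]              # label of the nearest neighbor
    return f

def accuracy(f, test_x, test_y):
    pred = np.array([f(x) for x in test_x])
    return (pred == test_y).mean()                # generalization (predictive accuracy) estimate
```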
Many classification problems tend to have relatively complex hierarchical structures, such as for example, genes, protein functions, Internet documents, and images. Using a classification technique that is likewise hierarchical in nature tends to work well on such problems, especially when there are a large number of classes with a large number of features. Non-hierarchical techniques, such as those that treat each category or class separately, are not especially suitable for a large number of classes with a large number of features. By utilizing known hierarchical structures, a hierarchical classification technique permits the effective classification of a large number of classes with a large number of features by splitting the classes into a hierarchical structure. At the root level in the category hierarchy, a sample may be first classified into one or more sub-categories using a classification technique. The classification may be repeated on the sample in each of the sub-categories until the sample reaches leaf categories or is not suitable to be further classified into additional sub-categories. Each of such classification selections may be processed in an independent manner, and in an efficient manner using parallel processors.
Referring to
Referring to
The use of hierarchical decomposition of the classification permits efficiencies in both the learning and the prediction phases. Each of the individual classifications is smaller than the original problem, and often it is feasible to use a much smaller set of features for each of the classifications. The hierarchical classification technique may take into account the structure of the class hierarchy, permitting the exploitation of different sets of features with increased discrimination at different category levels, whereas a flat classification technique would ignore the information in the structure of the class hierarchy. The hierarchical classification framework is also flexible enough to adapt to changes in the category structure. In addition, the use of such a hierarchical structure often leads to more accurate specialized classifiers. Moreover, at each level or portion thereof, any classification technique, such as an SVM, Random Trees, neural networks, and/or a Bayesian classifier, may be used as the classification technique.
One technique for implementing a hierarchical classification technique is to transform the original hierarchical classification into a set of flat classifications, with one flat classification for each level of the class hierarchy, and then use a flat classification technique to solve each of these levels independently of the others. For example, in
In this framework, the two independent runs effectively correspond to two distinct flat classifications. In other words, the multiple runs of the classification technique are independent both in the training phase and the test phase. While beneficial, this approach does not guarantee that the classes predicted by the independent runs at the different class levels will be compatible with each other. For instance, it is possible to have a situation where the classifier at level 1 assigns a test example to class 1, while the classifier at level 2 assigns the example to class 2.1, which is inconsistent with the first level prediction.
A modified hierarchical classification technique may use multiple runs of independent training but further include top-down classification during testing. In such a modified hierarchical classification, in the training phase the class hierarchy is processed one level at a time (or otherwise independently), producing one or more classifiers for each internal class node. In the test phase, each sample may be classified in a top-down fashion. For example, the test sample may be assigned to one or more classes by the first-level classifier(s). Then the second level classifier(s) may assign this sample to one or more sub-classes of the class(es) predicted at the first level. This process may be continued until the sample's class(es) are predicted at the deepest available level.
To create a hierarchical set of classifiers using the top-down technique, the system may either train a single multi-class classifier for each internal class node, or train multiple binary classifiers for each internal class node. In the former case, the system may use a multi-class classification technique such as multi-class SVM, Random Trees, and/or Decision Trees. Thus, at each class level, the system may build a classifier that predicts the class(es) of a sample at that level. In the latter case, the system may train multiple binary classifiers at each internal class node. Therefore, for each test sample and for each class level, the system may present the sample to each of the binary classifiers at that level. As a result, the test example may be assigned to one or more classes at each level, and this information may be taken into account at the next level.
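By way of a non-limiting illustration, the top-down technique with one multi-class classifier per internal class node may be sketched as follows. A nearest-centroid rule stands in for any per-node classifier (e.g., SVM or Random Trees), and the hierarchy layout and all names are illustrative assumptions.

```python
# Sketch of top-down hierarchical classification: one multi-class classifier
# per internal node, trained independently, then a test sample is routed from
# the root down to a leaf. A nearest-centroid rule stands in for any per-node
# classifier; the hierarchy below is illustrative.
import numpy as np

# internal node -> list of child class labels ("" is the root)
hierarchy = {"": ["1", "2"], "1": ["1.1", "1.2"], "2": ["2.1", "2.2"]}

def train_node(samples, labels, children):
    # one centroid per child class; a sample belongs to a child if its full
    # (leaf-level) label starts with that child's label
    cents = {}
    for c in children:
        idx = [i for i, y in enumerate(labels) if y.startswith(c)]
        cents[c] = samples[idx].mean(axis=0)
    return cents

def train_hierarchy(samples, labels):
    return {node: train_node(samples, labels, kids)
            for node, kids in hierarchy.items()}

def predict(models, x):
    node = ""                                        # start at the root
    while node in models:                            # descend until a leaf class is reached
        cents = models[node]
        node = min(cents, key=lambda c: np.linalg.norm(x - cents[c]))
    return node
```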
The top down approach has the advantage that each classification model for a single class node is induced to solve a more modular, focused classification. The modular nature of the top-down approach may also be exploited during the test phase, where the classification of a sample at a given class level guides its classification at the next level. However, the top down approach has the disadvantage that, if a test example is misclassified at a certain level, it tends to be misclassified at all the deeper levels of the hierarchy.
During the training of the top-down hierarchical classification, a classifier may be learned, or multiple classifiers may be learned, for each internal (non-leaf) node of the tree hierarchy. At each internal node, the technique may use a set of features discriminating among all the classes associated with the child nodes of this internal node. For instance, at the root node, the technique may use features discriminating among the first-level classes 1, 2, . . . , k0, where k0 is the number of first level classes (child nodes of the root node). At the node corresponding to class 1, the technique may use features discriminating among the second level classes 1.1, 1.2, . . . , 1.k1, where k1 is the number of child classes of class 1, and so forth. The features used at each internal node may be automatically discovered by a feature selection technique (e.g., using the mutual information between a feature F and a category C), or they can be defined by an operator, where the operator selects the most discriminating features for differentiating the sub-classes.
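By way of a non-limiting illustration, feature selection by mutual information between a feature F and a category C may be sketched as follows; the binning of the feature values and all names are illustrative assumptions.

```python
# Sketch of feature selection by mutual information I(F; C) between a
# discretized feature F and the category C, used to pick the most
# discriminating features at one internal node.
import numpy as np

def mutual_information(feature, labels, bins=8):
    edges = np.histogram_bin_edges(feature, bins=bins)
    f = np.digitize(feature, edges[1:-1])                    # discretize the feature
    mi = 0.0
    for c in np.unique(labels):
        for v in np.unique(f):
            p_cv = np.mean((labels == c) & (f == v))         # joint probability
            p_c, p_v = np.mean(labels == c), np.mean(f == v)
            if p_cv > 0:
                mi += p_cv * np.log(p_cv / (p_c * p_v))      # contribution to I(F;C)
    return mi

def select_features(X, labels, k=10):
    scores = [mutual_information(X[:, j], labels) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:k]                      # indices of the top-k features
```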
This hierarchical approach produces a hierarchical set of features or rules, where each internal node of the hierarchy is associated with its corresponding set of features or rules. When classifying a new sample in the test set, the sample is first classified by the feature/rule set associated with the root node. Next, it is classified by the feature/rule set associated with the first-level node whose class was predicted by the feature/rule set at the root (“zero-th”) level, and so on, until the sample reaches a leaf node and is associated to the corresponding leaf-level class. For instance, suppose the sample was assigned to class 1 by the feature/rule set associated with the root node. Next, the sample may be classified by the feature/rule set associated with the class node 1, in order to have its second-level class predicted, and so on, until the sample is assigned to a leaf node class. In this manner, only a portion of a set of classes at a particular level may be used, if desired. This top down technique for classification of the test samples exploits the hierarchical nature of the discovered feature/rule set.
The class hierarchy may be used to select a specific set of positive and negative samples for each run of the technique containing only the samples directly relevant for that particular case. For instance, referring to
By way of example, the defect classification technique may be applied to liquid crystal displays (LCDs). During the production process of an LCD panel, various types of defects may occur in the LCD panel due to the production processes. Many of the defects may fall into one of four categories, namely, SANSO, UWANO, COAT, and GI. SANSO and UWANO are foreign substances that are deposited onto the LCD panel during the various production stages. SANSO and UWANO both have characteristics of a dark black inner core. The principal difference between the SANSO and UWANO is that the SANSO defect has a pinkish and/or greenish color fringe around the defect border, whereas UWANO does not have such a color fringe around the border. The COAT defect is a bright uniform coat region with a thin dark border and the color of the inner COAT region is similar to the color of the circuit on the LCD panel. The GI defect consists of a colorful rainbow pattern that is typically substantial in size.
Referring to
Referring to
Referring to
Referring to
In general, one or more candidate matches may be detected, while typically only one of the candidates is the "correct" match. The challenge is that multiple candidate matches may have quite similar appearance, and the differences between the image areas in the model image corresponding to the multiple candidate matches may be small. For example, different parts of the underlying circuit patterns in the model can be quite similar, with only small differences. Hence, these different parts in the image are difficult to discriminate, leaving ambiguity as to which candidate should be selected as the correct match. The alignment 404 may utilize a landmark label image 406, together with the input image 400 and model image 402.
The landmark label image 406 together with the input and model images 400, 402 may be provided to an extraction and modification of a warped model image region process 408. The warped model image region process 408 extracts a region from the model image that corresponds to the input image and warps it into a corresponding relationship for comparison. The warped model image region process 408 provides the input image 400, a warped model image 410, and a warped landmark label image 412. The warped model image 410 is the extraction of that portion of the image of interest. The warped landmark label image 412 is the landmark image map with labels, akin to a segmentation map having “semantic meaning”. The landmark image may identify portions of the landmarks that should have a greater or lesser impact on the alignment. For example, landmark regions of greater importance may be marked relative to landmark regions of lesser importance.
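By way of a non-limiting illustration, the extraction and warping of the corresponding model image region and its landmark label image into the input image frame may be sketched as follows, assuming the OpenCV library and a homography H from the alignment stage that maps model coordinates to input coordinates. Nearest-neighbor interpolation is used for the label image so that its discrete labels are preserved.

```python
# Sketch of producing the warped model image 410 and the warped landmark label
# image 412 from the model image and landmark label image, given a homography
# H that maps model coordinates to input coordinates (an assumption consistent
# with the alignment sketch earlier).
import cv2

def warp_model_region(model_img, landmark_label_img, H, input_shape):
    h, w = input_shape[:2]
    warped_model = cv2.warpPerspective(model_img, H, (w, h),
                                       flags=cv2.INTER_LINEAR)
    warped_labels = cv2.warpPerspective(landmark_label_img, H, (w, h),
                                        flags=cv2.INTER_NEAREST)  # keep labels discrete
    return warped_model, warped_labels
```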
The input image 400, the warped model image 410, and the warped landmark label image 412 may be provided to a defect detection process 414, that identifies the defects 416, such as in the form of a defect mask image. The defect detection includes many challenges, one or more of which may be addressed in a suitable system.
One of the challenges for defect detection that should be addressed is that the input images may have different illumination changes and color patterns compared to the model template image. Input images containing defects may have complicated backgrounds, such as circuit patterns, varying background color, varying levels of blur (focus), and noise. Accordingly, robust defect detection under such varying imaging conditions is challenging.
Another of the challenges for defect detection that should be addressed is that the input image may include many different types of defects, such as for example, SANSO, UWANO, GI, and COAT. Different classes of defects vary dramatically in their feature representations and therefore it is problematic to utilize a set of generalized features which can handle all types of defects. For example, GI images have a colorful rainbow pattern with pink and/or greenish color fringes. For example, COAT images have a color similar to the landmark inside and are not salient compared to SANSO and UWANO defects.
Another of the challenges for defect detection that should be addressed is that often some misalignment between the input image and the model image remains, even after alignment, due to the ambiguity in the landmark structures and image conditions. The alignment stage may identify the correct model image area, however, small differences in the shape and location of the landmark structures in these images will still be present. By way of example, these small differences may lead to false alarms in the detection stage.
In many cases it is desirable to provide a defect detection technique that does not require training. Training based defect detection techniques, such as the classification technique previously described, may be time consuming to implement, and may not be suitable to detect defects with varying characteristics. Moreover, a training based framework often requires a significant amount of training data, and the training process itself tends to be computationally intense.
Referring to
The weighted matching 450 may be implemented by performing a binary AND operation between a model edge orientation feature image and an input edge orientation feature image, and counting the number of matching pixels. The weighted matching may be implemented by performing the binary AND operation and count operation twice; once with the entire model edge orientation feature image and another time with a masked model edge orientation feature image, where non-discriminative edge features are set to zero.
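By way of a non-limiting illustration, the weighted matching may be sketched as follows, where the edge orientation feature images are assumed to be binary per-pixel maps; the manner in which the two counts are combined (here, a weighted sum) is an illustrative assumption.

```python
# Sketch of the weighted matching 450: AND the model and input edge
# orientation feature images and count matching pixels, once with the full
# model feature image and once with a masked version in which
# non-discriminative edge features are zeroed. The weighted-sum combination
# of the two counts is an illustrative assumption.
import numpy as np

def weighted_match_score(model_feat, input_feat, discriminative_mask, w=0.5):
    full_count = np.count_nonzero(np.logical_and(model_feat, input_feat))
    masked_model = np.logical_and(model_feat, discriminative_mask)
    masked_count = np.count_nonzero(np.logical_and(masked_model, input_feat))
    return full_count + w * masked_count   # emphasize discriminative edges
```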
Another aspect of the alignment process 404, 408 is extending the scoring of the potential candidate matches and ranking those candidates in an order from the greatest likelihood to the least likelihood. For example, a baseline matching score may be computed that is based on the number of edge pixels in the input image and corresponding model image region that have the same local orientation at the same location. The matching scoring function may incorporate a relative penalty for edges that are present in the candidate model image region but not present in the input image.
Such mismatched edge components between the model image and the input image may be isolated by image processing. Subsequently, the number of isolated pixels detected in this manner may be used to penalize the matching score for each candidate. Incorrect candidate matches are likely to be substantially penalized, whereas the correct candidates are not substantially penalized (or penalized by a negligible amount). This image processing may involve applying morphological dilation operators to a binarized input edge image. This binarized input edge image is then negated by a binary NOT operator and matched to a binarized model edge image using a binary AND operator. This process isolates edge pixels that are present in the candidate model image region but not present in the input image or near the input image edge pixels.
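By way of a non-limiting illustration, the penalty computation may be sketched as follows, using a morphological dilation from the scipy library; the structuring element size and the penalty weight are illustrative assumptions.

```python
# Sketch of the matching score penalty: dilate the binarized input edge image,
# negate it (binary NOT), AND it with the binarized model edge image, and
# count the surviving pixels, i.e., model edges with no nearby input edge.
import numpy as np
from scipy.ndimage import binary_dilation

def penalized_score(base_score, model_edges, input_edges, dilate_size=5, weight=1.0):
    dilated_input = binary_dilation(input_edges,
                                    structure=np.ones((dilate_size, dilate_size)))
    isolated = np.logical_and(model_edges, np.logical_not(dilated_input))
    penalty = np.count_nonzero(isolated)     # model edges absent from the input
    return base_score - weight * penalty
```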
Referring to
The blur estimation process 460 may be based upon an edge pixel selection 461, and estimating the edge width at selected edge points 462. For example, the selection criterion may include selecting edges along horizontal line and vertical line structures of landmarks. For example, the technique may model an edge peak profile as a Gaussian function to estimate the edge width. Based upon the edge width estimation, the local edge width may be computed at each selected edge pixel. The local edge width estimates may be combined into a global blur level estimation 464, e.g., by averaging. The system may decide to skip and/or stop the image processing based upon the global blur level estimation 464.
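By way of a non-limiting illustration, the blur level estimation may be sketched as follows, where the width of a Gaussian-shaped gradient profile taken across each selected edge pixel is estimated from the profile's second moment; the profile length and the moment-based fit are illustrative assumptions.

```python
# Sketch of the blur level estimation 460/464: estimate a local edge width at
# each selected edge pixel from a short gradient-magnitude profile taken
# across the edge, then average the local widths into a global blur level.
import numpy as np

def edge_width(profile):
    p = profile - profile.min()
    if p.sum() <= 0:
        return None
    x = np.arange(len(p))
    mu = (x * p).sum() / p.sum()                          # peak location
    return np.sqrt(((x - mu) ** 2 * p).sum() / p.sum())   # Gaussian width estimate

def global_blur(gradient_mag, edge_points, half_len=4, horizontal=True):
    widths = []
    for (y, x) in edge_points:                            # selected edge pixels 461
        if horizontal:
            prof = gradient_mag[y, max(0, x - half_len):x + half_len + 1]
        else:
            prof = gradient_mag[max(0, y - half_len):y + half_len + 1, x]
        w = edge_width(prof.astype(np.float64))
        if w is not None:
            widths.append(w)
    return float(np.mean(widths)) if widths else 0.0      # global blur level 464
```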
The warped model image 410 and/or the warped landmark label image 412 may be related based upon geometric relationships between the input image 400 and the matching model image 402 region, such as by translation, rotation, and other parameters, which may be provided to the defect detection 414. The defect detection 414 preferably includes techniques to reduce the effects of illumination and lighting changes in the input image while also enhancing the difference of the potential defect compared to the model image. Referring to
The input image 400 may be processed in a suitable manner to identify the dominant background color 500, such as using a three dimensional histogram. The background of the warped model image 410 based upon the warped landmark label image 412 (which may be, e.g., a binary image identifying the structure of the underlying features) may be replaced with the dominant background color 500 of the input image 400 using a replacement process 510. The landmarks of the warped model image 410 based upon the warped landmark label image 412 may be replaced with a dominant landmark color 505 of the input image 400 using a replacement process 515. The result of the background replacement process 510 and the landmark replacement process 515 is a modified model image 520. In this manner, the colors of the backgrounds and/or landmarks of the input image 400 and the warped model image 410 are similar, so that minor defects in the input image 400 are more readily identifiable. An absolute difference is computed between the input image 400 and the modified model image 520 by a difference process 530. The difference process 530 provides a resulting image 540 that is preferably the direct absolute difference between the input image 400 and the modified model image 520. The benefit of this technique, especially compared to computing the absolute difference between the input image and the warped model image, is (1) a reduction in the effects of differences in the background due to illumination and lighting changes in the input image; (2) a reduction in bright lines in the difference image that would have otherwise resulted since the model images are imperfectly blended using several sub-images; and (3) a reduction in large differences near landmark features due to imperfect alignment.
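By way of a non-limiting illustration, the background/landmark color replacement and differencing may be sketched as follows, where the dominant colors are taken from the fullest bin of a coarse three dimensional histogram of the input image; the bin count and names are illustrative assumptions.

```python
# Sketch of the replacement processes 510/515 and difference process 530:
# estimate dominant background and landmark colors from a coarse 3-D color
# histogram of the input image, paint the warped model's regions with them
# (using the warped landmark label image as the mask), and take the absolute
# difference.
import numpy as np

def dominant_color(pixels, bins=16):
    # pixels: (N, 3) array; return the center of the fullest 3-D histogram bin
    hist, edges = np.histogramdd(pixels, bins=(bins, bins, bins),
                                 range=[(0, 256)] * 3)
    idx = np.unravel_index(np.argmax(hist), hist.shape)
    return np.array([(edges[c][i] + edges[c][i + 1]) / 2.0
                     for c, i in enumerate(idx)])

def difference_image(input_img, warped_model, landmark_labels):
    bg = landmark_labels == 0                               # background mask
    lm = ~bg                                                # landmark mask
    modified = warped_model.astype(np.float64).copy()
    modified[bg] = dominant_color(input_img[bg])            # replacement process 510
    modified[lm] = dominant_color(input_img[lm])            # replacement process 515
    return np.abs(input_img.astype(np.float64) - modified)  # resulting image 540
```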
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
Referring also to
if R(i,j)>0.875*(I_max−I_min) then R(i,j)=0;
if 0.75*(I_max−I_min)<R(i,j)<0.875*(I_max−I_min) then R(i,j)=100;
if 0.5*(I_max−I_min)<R(i,j)<0.75*(I_max−I_min) then R(i,j)=50;
if R(i,j)<0.5*(I_max−I_min) then R(i,j)=0.
Then a morphological filtering 743 (e.g., an erosion process and then a dilation process) may be conducted in order to remove noisy artifacts. After the morphological filtering, the gradient magnitude 744 may be calculated on the quantized images so as to suppress the uniform background region. The gradient magnitude may be the square root of the sum of the squares of the horizontal gradient magnitude and the vertical gradient magnitude, where the horizontal gradient magnitude is obtained by convolving the input image with the 3 by 3 filtering mask [[−1, −1, −1], [0, 0, 0], [1, 1, 1]] and the vertical gradient magnitude may be obtained by convolution with the 3 by 3 filtering mask [[−1, 0, 1], [−1, 0, 1], [−1, 0, 1]]. Then the process may filter out false positives based upon geometric relations between the color fringes and the defects from the modified resulting image 745. The false positive filter may be a distance filter, namely, the color fringe should be sufficiently close to the defect. The false positive filter may be a geometric relation that the defect region should be surrounded by the color fringe. In the luminance space, the system may set to zero 746 the pixels which have a luminance smaller than a substantial value (e.g., 230) to obtain the color fringe detector in the R channel 747, since the color fringe usually has a high luminance.
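By way of a non-limiting illustration, the R channel quantization, morphological filtering, and gradient magnitude computation may be sketched as follows, using the scipy library for the morphology and convolution; treating I_max and I_min as the maximum and minimum of the R channel result itself is an illustrative assumption.

```python
# Sketch of the quantization 742, morphological filtering 743, and gradient
# magnitude 744 steps. I_max and I_min are assumed here to be the max/min of
# the R channel result itself (an illustrative assumption).
import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation, convolve

def r_channel_fringe(R):
    rng = R.max() - R.min()                          # I_max - I_min (assumption)
    q = np.zeros_like(R, dtype=np.float64)
    q[(R > 0.5 * rng) & (R < 0.75 * rng)] = 50       # quantization rules 742
    q[(R > 0.75 * rng) & (R < 0.875 * rng)] = 100    # (values outside the ranges stay 0)

    mask = binary_dilation(binary_erosion(q > 0))    # erosion then dilation 743
    q = q * mask

    kh = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=np.float64)
    kv = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float64)
    gx = convolve(q, kh)
    gy = convolve(q, kv)
    return np.sqrt(gx ** 2 + gy ** 2)                # gradient magnitude 744
```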
Referring also to
if a(i,j)>0.75*a_max then a(i,j)=0;
if 0.25*a_max<a(i,j)<0.75*a_max then a(i,j)=100;
if a(i,j)<0.25*a_max then a(i,j)=0;
if b(i,j)>0.75*b_max then b(i,j)=0;
if 0.25*b_max<b(i,j)<0.75*b_max then b(i,j)=100;
if b(i,j)<0.25*b_max then b(i,j)=0.
Then the color fringe detector may threshold the results by rejecting the non-zero pixels which are sufficiently far away from the defect in the modified resulting image to obtain the color fringe in the Lab space 756.
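By way of a non-limiting illustration, the Lab space color fringe detection may be sketched as follows, assuming the OpenCV library for the color conversion and the scipy library for the distance transform; the distance threshold used to reject fringe pixels far from the defect is an illustrative assumption.

```python
# Sketch of the Lab space color fringe detection 756: quantize the a and b
# channels per the rules above, then reject non-zero pixels that are too far
# from the defect support. The distance threshold is an illustrative
# assumption.
import cv2
import numpy as np
from scipy.ndimage import distance_transform_edt

def lab_fringe(bgr_img, defect_mask, max_dist=20):
    lab = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2LAB).astype(np.float64)
    a, b = lab[:, :, 1], lab[:, :, 2]
    a_max, b_max = a.max(), b.max()

    qa = np.where((a > 0.25 * a_max) & (a < 0.75 * a_max), 100.0, 0.0)  # a-channel rules
    qb = np.where((b > 0.25 * b_max) & (b < 0.75 * b_max), 100.0, 0.0)  # b-channel rules
    fringe = np.maximum(qa, qb)

    dist_to_defect = distance_transform_edt(defect_mask == 0)  # distance to nearest defect pixel
    fringe[dist_to_defect > max_dist] = 0       # reject fringe far from the defect
    return fringe
```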
Referring also to
Referring also to
Referring also to
The color fringe detector for GI images 710 may be any suitable function, such as an OR operation of any of the features.
The shapes of the COAT, SANSO, and/or UWANO defect images may be refined 621 (see
The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.