The invention relates to a method for associating a digital image with a class of a classification system.
Automating error recognition based on optical analysis methods has become increasingly important with the increasing automation of industrial processes. In the past, optical error recognition was performed by quality assurance personnel, who inspected the object to be tested, or an image representation of it, and identified possible errors. For example, x-ray images of weld seams are checked for error types such as tears, inadequate continuous welds, adhesion errors, slag, slag lines, pores, tubular pores, root notches, root errors, heavy-metal inclusions and edge offset. It is also known to inspect radioscopic images of cast parts to identify errors in the cast part, for example inclusion of impurities, inclusion of gases, bubbles, such as axial pores or spongy pores, fissures or chaplets. Because these errors are of similar type but may differ in appearance and shape, more recent approaches in industrial error evaluation associate errors with different classes, wherein each class contains errors of the same type. The industry standard EN 1435, for example, describes the classification system for weld seam errors. According to this standard, the errors occurring in weld seams and identified by x-ray images are divided into 30 different classes, for example classes for the error tear, such as longitudinal tear or transverse tear, inadequate continuous welds, adhesion errors, foreign inclusions, such as slag, slag lines, gas inclusions, such as pores or tubular pores, or heavy-metal inclusions, undercuts, root notches, root errors, and edge offset. With increasing automation of these processes, there is now a push to achieve optical recognition of errors and association of these errors with predetermined classes through image analysis based on images that are recorded and stored using digital image recording techniques.
Conventional automated error recognition methods based on digital images use a so-called “heuristic approach.” With this approach, reference images are saved in an image processing unit and an attempt is made, through image comparison, to associate the content of a digital image with one of these reference patterns.
In other technical fields, image content is associated with classes of a classification system, for example, for character recognition. In this case, for example, each letter forms its own class, so that for the capital letter alphabet there exist, for example, 26 classes, namely for the characters (A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z). The OCR technologies (Optical Character Recognition) analyze the digital image of a printed page generated by a scanner and associate the individual letter symbols with the predetermined classes. As a result, the OCR technology “recognizes” the text and can transfer the classified characters to a text processing program as an editable sequence of letters. The granted European patents 0 854 435 B1 and 0 649 113 B1 are directed, for example, to the technical field of character recognition (Optical Character Recognition).
The technique of image processing can increasingly be divided into areas with different sub-processes, whose technologies develop independently of each other. These areas are frequently organized into image preprocessing, image analysis, analysis of image sequences, image archiving and the so-called Imaging.
Image preprocessing is defined as the computer-aided improvement of the quality (processing: noise elimination, smoothing) of the corresponding digital image to facilitate visual recognition of the information content of this image by the viewer.
Image analysis is defined as the computer-aided evaluation of the information content of the corresponding digital image by automated and reproducible structuring, identification and comprehension of this image.
Analysis of image sequences is defined as the computer-aided evaluation of the information content of the respective sequence of digital images by automated and reproducible structuring, identification and comprehension of all individual images of this sequence and by automated and reproducible comprehension of the context of the sequence of individual images of this image sequence.
Image archiving is defined as the computer-aided compression and storage of the digital images together with indexed search descriptors from a controlled vocabulary.
Imaging is defined as the computer-aided generation of synthetic graphics and digital images for visualizing and describing the information content of complex processes on an image and symbol plane for the human observer.
The technique of associating the content of digital images with a class of the classification system is one method of image analysis, which can be divided into three subareas: segmentation, object recognition and image comprehension.
Segmentation is defined as the automated and reproducible structuring of the respective digital images by separating the objects that are relevant for the analysis of the image from each other and from the image background. Object recognition is defined as the automated and reproducible classification of the separated objects. Image comprehension can be interpreted as the automated and reproducible interpretation of the respective digital image by context evaluation of the classified, separated objects. The technique of associating digital images with a class of a classification system is a method of object recognition.
Object recognition can be viewed as a subarea of pattern recognition, namely as the subarea of the pattern recognition which recognizes as patterns only two-dimensional objects in images.
Images are typically displayed as an arrangement of pixels, whereby to display the image, the content of each pixel and its position in the image must be known. Depending on the content attribute, images can be divided into color images, grayscale images and binary images, wherein binary images have as content attribute, for example, only the values 0 and 1 for black and white, respectively.
One method frequently used in this technology for associating a digital image with a class of a classification system, which was used successfully for decades for distinguishing military aircraft (friend-foe identification), is known from M. K. Hu: “Visual Pattern Recognition by Moment Invariants”, IRE Trans. Info. Theory, vol. IT-8, 1962, pp. 179-187 and R. C. Gonzalez, R. E. Woods: “Digital Image Processing”, Addison-Wesley Publishing Company, 1992, pp. 514-518. Based on the so-called normalized centralized axial moments obtained through image analysis techniques from the image display, a finite sequence {φ1} of 7 dimensionless shape attributes can be generated by scaling for an arbitrary, separated, limited, two-dimensional object in a binary image. If the 7 sequential elements ΦI (0≦I≦I0=7) are viewed as the coordinates of an attribute vector Φ=(Φ1, Φ2, Φ3, Φ4, Φ5, Φ6, Φ7) which is an element of a 7-dimensional Euclidean attribute space M7, then this method induces an object recognition in this 7-dimensional attribute space M7. The method has the advantage, compared with object recognition by heuristic attributes, that classification occurs exclusively with attribute vectors Φ=(Φ1, Φ2, Φ3, Φ4, Φ5, Φ6, Φ7) whose coordinates are dimensionless shape attributes, so that in particular size differences between the objects to be recognized and the objects used for generating the comparison table become unimportant. In addition, a unique sequential order with respect to the relevance of the attributes for the object recognition and the digital image processing is defined within the set of the dimensionless shape attributes φ1 through the coordinate reference to the attribute vector Φ, so that it is immediately clear that the first attribute Φ1 is the most important.
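As a minimal illustration of this conventional approach, the following Python sketch computes only the first of Hu's seven invariants, φ1 = η20 + η02, from the normalized central moments of a binary image. The function name `hu_phi1` and the convention that non-zero pixels belong to the object are assumptions made for this sketch; it illustrates the prior-art attributes, not the shape attributes ψm of the invention.

```python
import numpy as np

def hu_phi1(binary):
    """First Hu invariant phi1 = eta20 + eta02 of a binary image.

    Assumption of this sketch: non-zero pixels belong to the object.
    """
    img = np.asarray(binary)
    ys, xs = np.nonzero(img)            # row (y) and column (x) indices of object pixels
    m00 = float(len(xs))                # zeroth moment = object area in pixels
    xc, yc = xs.mean(), ys.mean()       # centroid of the object
    mu20 = ((xs - xc) ** 2).sum()       # central moment of order (2, 0)
    mu02 = ((ys - yc) ** 2).sum()       # central moment of order (0, 2)
    # Normalized central moments: eta_pq = mu_pq / m00 ** (1 + (p + q) / 2),
    # so for p + q = 2 the divisor is m00 ** 2.
    return (mu20 + mu02) / m00 ** 2

# Example object: an L-shaped region in a 20 x 20 binary image.
image = np.zeros((20, 20), dtype=np.uint8)
image[2:18, 3:7] = 1
image[14:18, 7:15] = 1
phi1 = hu_phi1(image)
phi1_rotated = hu_phi1(np.rot90(image))  # same object rotated by 90 degrees
```

On a pixel grid, this value is preserved under translation and under rotation by 90° up to floating-point rounding, while enlarging the object changes it only slightly because of discretization.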
However, this method still has disadvantages because the number of the available dimensionless shape attributes is limited to 7 and a misclassification can therefore occur with complex objects, if two different classes have identical values for the 7 dimensionless shape attributes.
In view of this background information, it is an object of the invention to propose a method for associating the content of a digital image with a class of a classification system which makes it possible to reliably recognize also symbols having a more complex shape.
This object is solved with the method according to claim 1. Advantageous embodiments are recited in the dependent claims.
The invention is based on the concept of determining for the image to be analyzed a predetermined number of numerical shape attributes ψm wherein m is a running index having values from 1 to F, wherein ψm is a transformed expression of the dimensionless, scaled, normalized, centralized, polar moment
Unlike the conventional method which is limited to 7 shape attributes, the numerical shape attributes ψm proposed for image analysis in the present invention are independent of each other in such a way that a large number of shape attributes can be defined, without creating an interdependence of the shape attributes. In this way, an unambiguous association of the image contents to be recognized with a predetermined class can be achieved.
In particular, the method of the invention is independent of the relative position of the content to be recognized with respect to the acquisition device. Even objects rotated by, for example, 60° or 180° can be uniquely associated.
The method is based on computing a sequence of F functionally independent, dimensionless attributes of the separated, limited content in the presented image.
The image is conventionally represented by N pixels, wherein a pixel in a predetermined coordinate system is located at the position (xi, yj), the image extends from the coordinates (0, 0) to (ximax, yjmax), imax is the maximum number of pixels in the direction of the x-coordinate and jmax is the maximum number of pixels in the direction of the y-coordinate, and wherein a content attribute data [j, i] is associated with each pixel.
The content attribute for a binary image, where the corresponding image pixel content assumes, for example, the value 1 or 0 for black or white, is for example a single value saved in a table, and data [j,i] is representative of the value in this table at the position associated with the pixel. In color images, where the content attribute of each pixel is composed, for example, of three values for the three-color representation “red, green, blue” (RGB representation), the content attribute data [j,i] is, for example, representative of a vector which has these three values for the respective pixel. Data [j,i] can also be representative of other vectors, if other color representations are used, e.g., grayscale representations. Data [j,i] can also be representative of the magnitude of such a vector, when a multi-color representation, for example an RGB representation, is converted into a grayscale or even a binary representation before employing the classification method of the invention.
In a color representation, for example an RGB representation, data [j,i] can also represent the individual value of the red representation, or the green representation, or the blue representation in the pixel. The classification method is then performed, for example, exclusively based on one representation, for example the red representation, whereby the method is performed identically to the preceding method for binary representations. In this case, binary values 1 and 0 can also be used for data [j,i] at the pixel, wherein for example 1 indicates red and 0 empty. The classification method can also be performed in parallel for the different color representations, i.e., in parallel for a binary red representation, a binary green representation and a binary blue representation. This increases the accuracy of the classification.
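The alternatives described for the content attribute data [j,i] can be sketched in Python as follows. This is only an illustrative sketch: the function names `magnitude_plane`, `binarize` and `channel_planes`, the array layout (rows j, columns i, three color values per pixel) and the use of a fixed threshold to obtain binary values are assumptions, not features taken from the description.

```python
import numpy as np

def magnitude_plane(rgb):
    """data[j, i] as the magnitude of the (R, G, B) vector of each pixel."""
    rgb = np.asarray(rgb, dtype=float)
    return np.linalg.norm(rgb, axis=-1)

def binarize(plane, threshold):
    """Binary content attribute: 1 where the value exceeds the threshold, else 0.

    Which of the two values marks the object is an assumption of this sketch.
    """
    return (np.asarray(plane) > threshold).astype(np.uint8)

def channel_planes(rgb, threshold):
    """One binary representation per color, to be classified in parallel."""
    rgb = np.asarray(rgb)
    names = ("red", "green", "blue")
    return {name: binarize(rgb[..., k], threshold) for k, name in enumerate(names)}

# A 2 x 2 example image with purely illustrative pixel values.
example = np.array([[[255, 0, 0], [0, 0, 0]],
                    [[0, 255, 0], [3, 4, 0]]])
binary_from_magnitude = binarize(magnitude_plane(example), threshold=100)
binary_per_channel = channel_planes(example, threshold=100)
```

Each resulting binary array can then be processed in the same way as a binary image.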
The moment
Δa=width of the pixel in the x-coordinate direction,
Δb=width of the pixel in the y-coordinate direction,
data [j, i]=content attribute of the pixel at the position (yj, xi)
m=a sequential number from 1 to F.
In a particularly preferred embodiment, the predetermined coordinate system is a Cartesian coordinate system, because the majority of digital images define the pixels with reference to a Cartesian coordinate system. However, other coordinate systems, for example polar coordinate systems, can also be employed.
While presently digital images can be rendered typically with between 1 and 3 million image dots (pixels), it can be expected that the number N will increase with advances of image acquisition and image processing techniques, so that the afore-described sum functions will approach integral functions.
More particularly, an image content is defined by the arrangement of pixels having the same content attribute.
The F determined shape attributes of the image content define an attribute vector in a limited, F-dimensional attribute space (unit hypercube). The content classification occurs by problem-specific clustering of this F-dimensional attribute space.
The classification system can be, for example, a predetermined industry standard, for example EN 1435. For identification of persons, for example, each person can form an individual class. In this case, the F shape attributes ψm representative of the fingerprint or the iris image of the person to be identified are then saved in the comparison table. For identification of persons, the image of the iris acquired by the image acquisition unit, for example a camera, is analyzed with the method of the invention, whereby the F shape attributes ψm of the recorded iris are computed and compared with the shape attribute values saved in the table. If there is an (approximate) agreement with all values of the shape attributes ψm of a class, then the system has recognized the person characterized by this class. Preferably, a least-squares method, for example a method according to Gauss, can be used to establish the approximate agreement.
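The comparison with the table can be sketched in Python as a least-squares match. The function name `classify`, the dictionary layout of the comparison table and the acceptance tolerance are assumptions of this sketch; the description only states that a least-squares method can preferably be used to establish the approximate agreement. The example table uses F=1 and the approximate ψ1 values reported for the letters A, B and C in this description.

```python
import numpy as np

def classify(psi, table, tolerance):
    """Return the class whose stored attributes agree best with psi.

    Agreement is measured by the sum of squared differences. If even the
    best class exceeds the tolerance, no class is recognized (None).
    """
    psi = np.asarray(psi, dtype=float)
    best_class, best_error = None, float("inf")
    for name, stored in table.items():
        error = float(np.sum((psi - np.asarray(stored, dtype=float)) ** 2))
        if error < best_error:
            best_class, best_error = name, error
    return best_class if best_error <= tolerance else None

# Comparison table with one shape attribute per class (F = 1).
comparison_table = {"A": [0.57], "B": [0.60], "C": [0.44]}
recognized = classify([0.45], comparison_table, tolerance=0.001)
```

The tolerance decides when an image is rejected as belonging to none of the stored classes; its value would have to be chosen for the specific problem.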
If a digital image is recognized that is represented in a representation different from a binary representation, then the aforementioned method steps can be performed for several groups with F numerical shape attributes ψm, for example, for one group for values of a red representation, for one group for values of a green representation, and for one group for values of a blue representation. Alternatively, the aforementioned method steps can also be performed on content attributes data [j,i] which contain the individual values of the individual color representations as a vector. Computational division operations are then preferably performed on the magnitudes of the vectors.
In a preferred embodiment, the shape attribute ψm is determined by the transformation
However, other transformations as can also be used to transform ψm to
The shape attribute to be compared with the values stored in the table is preferably the shape attribute ψm obtained with the aforementioned transformation. Before the comparison with the table values or in the transformation from
For defining the number F of the numerical shape attributes ψm, the number F can be increased, starting with F=1, from several, in particular more than 29 samples per class of the classification system, until the values for the respective shape attribute ψm determined for the samples of a class are different in at least one numerical value for at least one shape attribute ψm from the numerical value of this shape attribute ψm of the other class. In a particularly preferred embodiment, the number F of the shape attributes is increased until the values of the shape attributes with the highest ordinal numbers m in all classes decrease with increasing ordinal number. The values of the corresponding shape attribute ψm determined for the at least 29 samples per class can be arithmetically averaged in order to determine a value to be inserted for this class for this shape attribute.
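One possible reading of this procedure is sketched below in Python. It is an assumption of this sketch that two classes count as "different" in a shape attribute when the value ranges of their samples do not overlap; the function names `class_means`, `separable` and `smallest_F` are hypothetical. The sample values are freely invented for illustration and use only three samples per class, far fewer than the more than 29 samples mentioned above.

```python
import numpy as np

def class_means(samples):
    """Arithmetic mean of each shape attribute over the samples of a class."""
    return {name: np.mean(np.asarray(s, dtype=float), axis=0)
            for name, s in samples.items()}

def separable(samples, F):
    """True if every pair of classes differs in at least one of the first F
    attributes, i.e. the sample value ranges do not overlap there."""
    names = list(samples)
    lo = {n: np.asarray(samples[n], dtype=float)[:, :F].min(axis=0) for n in names}
    hi = {n: np.asarray(samples[n], dtype=float)[:, :F].max(axis=0) for n in names}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            p, q = names[a], names[b]
            disjoint = (hi[p] < lo[q]) | (hi[q] < lo[p])
            if not disjoint.any():
                return False
    return True

def smallest_F(samples, F_max):
    """Increase F from 1 until all classes are separable; None if never."""
    for F in range(1, F_max + 1):
        if separable(samples, F):
            return F
    return None

# Invented sample attribute vectors (psi1, psi2) for three error classes.
samples = {
    "tear":         [[0.30, 0.10], [0.31, 0.11], [0.29, 0.12]],
    "pore":         [[0.60, 0.50], [0.61, 0.52], [0.59, 0.51]],
    "tubular pore": [[0.30, 0.55], [0.31, 0.54], [0.30, 0.56]],
}
F = smallest_F(samples, F_max=2)
table = class_means(samples)   # averaged values to store per class
```

With these invented values, ψ1 alone cannot separate "tear" from "tubular pore", so the search moves on to F=2.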
The table reproduced below, which is intended only for illustration and contains freely selected numerical values, shows that for determining the weld seam error in relation to the error classes “tear”, “pore”, “tubular pore”, a number F=1 of the numerical shape attributes ψm is not sufficiently precise, because ψ1 assumes almost identical values for the tear class and for the tubular pore class. The association only becomes unique by including the second numerical shape attribute ψ2. As can be seen, in spite of the similar numerical values for ψ2 in the classes “pore” and “tubular pore”, this system consisting of only two shape attributes ψ1, ψ2 is suitable to precisely classify the 3 errors.
The number F can also be determined by a method based on a rotational ellipse. Such “Cluster Methods” are described, for example, in H. Niemann, Klassifikation von Mustern (Pattern Classification), Springer Verlag, Berlin, 1983, page 200ff.
The method of the invention for associating the content of a digital image with a class of a classification system is employed preferably in the optical inspection of components, in particular in optical surface inspection. The method can also be used in quality assurance, texture, shape and contour analysis, photogrammetry, symbol and text recognition, personnel recognition, robotic vision or evaluation of radiographic or radioscopic images, ultrasound images and nuclear spin tomography.
It is thereby unimportant whether the images having objects to be recognized are “optical” images in the spectral range of visible light, radiographic or radioscopic images, or even synthetic images from the technical field of Imaging.
When a concrete problem of object recognition is approached in the context of this broad range of possible applications, then the degree of complexity of the problem is defined from the beginning:
It is known into how many different object classes K the objects to be recognized are to be sorted. Unlike with classification based on heuristic attributes, in the new algorithmic method the number of degrees of freedom of the shape can be experimentally determined with respect to each object class based on a representative random sampling of test objects. The classification is performed exclusively with attribute vectors ψ=(ψ1, ψ2, ψ3, ψ4, ψ5, . . . , ψF). The attribute vector of an arbitrary separated, limited, two-dimensional object in the image is located inside a limited, normalized F-dimensional subarea (“unit hypercube”) of an F-dimensional attribute space. The pattern classification is performed by a problem-specific clustering of the interior of this F-dimensional unit hypercube.
The invention will now be described with reference to a drawing depicting a single exemplary embodiment. It is shown in:
The following Table shows the values for ψ1, wherein ψ1 is computed from the relation
With the afore-described relationship and the respective data fields for the respective representations, where content attributes are saved at the positions (yj, xi), the values reproduced in the following table are obtained:
Table of the numerical values for the shape attribute ψ1
As can be seen, the value ψ1 for the letter A assumes values of about 0.57, for the letter B values of about 0.6, and for the letter C values of about 0.44. A previously defined symbol can therefore be uniquely recognized with the method of the invention independent of the actual position and size of the letter.
Number | Date | Country | Kind |
---|---|---|---|
102004043149.3 | Sep 2004 | DE | national |
102005001224.8 | Jan 2005 | DE | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2005/009427 | 9/1/2005 | WO | 00 | 11/6/2008 |