1. Field
The present invention relates to a method and apparatus for pattern matching processing which are advantageous, for example, when used in an optical character reader (OCR).
2. Description of the Related Art
An OCR (optical character reader) recognizes a character through a pattern matching process. In a pattern matching process, a value indicating the degree of pattern matching is calculated from pattern data constituted by a plurality of elements (plural items of pixel data) extracted from an input image and template data constituted by a plurality of elements (pixel data) for pattern determination stored in advance in storage means. The calculated value is compared with a predetermined threshold to obtain a determination output indicating whether there is a desired pattern corresponding to the input image or not.
As the value indicating the degree of pattern matching, similarity S as expressed by Expression 1 shown below is frequently used.
where P(i, j) represents pattern data obtained by extracting a partial area of an input image, and Q(i, j) represents template data for pattern determination. “i” and “j” represent non-negative integers.
One document disclosing a technique for a pattern matching process is Japanese Patent No. 3572203. According to the document, a common template is created by combining characteristic parts of plural items of template data. Matching calculations are carried out to obtain similarity between the common template and pattern data constituted by a plurality of elements extracted from an input image. This method of processing allows the efficiency of a pattern matching process to be improved.
When a pattern matching process is performed, an input pattern (a character or the like) must be determined to be the same as the pattern to be detected even if it differs slightly from that pattern in the thickness of lines, the size of points, and the like. For example, let us assume that there is a plurality of characters which are identical except for the thickness of their lines. Those characters must be determined to be identical characters. If a separate template is prepared for each line thickness as template data for comparison with input pattern data in such a case, a greater memory capacity will be required, and matching calculations will take a long time.
Under these circumstances, it is an object of an embodiment of the invention to provide a method and apparatus for pattern matching which have high flexibility and diversified recognition capabilities in recognizing input pattern data.
In the above-mentioned embodiment, pattern data constituted by a plurality of elements is extracted from input image data; template data constituted by a plurality of elements stored in advance in storage means is read; weight data constituted by a plurality of elements stored in advance in storage means in association with the template data is read; a calculation is performed on each element using the pattern data, the template data, and the weight data; a similarity value representing the degree of matching between the pattern data and the template data is calculated using the sum of calculation results obtained by the calculation on each element; and the similarity value is compared with a predetermined threshold to obtain a determination output indicating whether there is a match between the pattern data and the template data or not.
Additional objects and advantages of the embodiments will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention, and together with the general description given above and the detailed description of the embodiments given below, serve to explain the principles of the invention.
An embodiment of the invention will now be described with reference to the drawings.
11 represents a line sensor which utilizes, for example, a charge coupled device (CCD). A signal read by the line sensor 11 is converted at an analog-to-digital conversion circuit 12 into image data which is then fetched into a control unit 13. The control unit 13 temporarily fetches the image data into an image memory 132. The image data fetched into the image memory 132 is binarized at an area division unit 133.
Pattern data having a predetermined size (the same size as that of template data) is extracted from the binarized data at a pattern area extraction process unit 134. The pattern data is represented by P(i, j).
The above-described extraction process may be performed on the entire binary image by shifting an area having a predetermined size one pixel at a time. Alternatively, in order to allow subsequent processes to be performed efficiently, the size of an outline in which the binary pattern exists may be determined, and only an area having a predetermined size encompassing the outline size may be extracted.
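The two extraction strategies described above may be sketched as follows. This is only an illustrative sketch, not the implementation of the pattern area extraction process unit 134: the function names are hypothetical, the binary image is assumed to be a list of rows of 0/1 values, and the outline is assumed to be the smallest rectangle containing every "1" pixel, which the description does not confirm.

```python
def sliding_windows(image, height, width):
    """Yield (top, left, window) for every height x width area, shifted one pixel at a time."""
    rows, cols = len(image), len(image[0])
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            window = [row[left:left + width] for row in image[top:top + height]]
            yield top, left, window


def outline_box(image):
    """Return (top, left, bottom, right) of the smallest box holding all 1-pixels, or None.

    Assumption: the outline is taken to be this bounding rectangle.
    """
    rows_with_ink = [r for r, row in enumerate(image) if any(row)]
    cols_with_ink = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows_with_ink:
        return None
    return rows_with_ink[0], cols_with_ink[0], rows_with_ink[-1], cols_with_ink[-1]


def extract_around_outline(image, height, width):
    """Extract a single height x width area anchored at the outline's top-left corner.

    Pixels falling outside the image are treated as 0. Returns None when the
    image is blank or the outline is larger than the predetermined size.
    """
    box = outline_box(image)
    if box is None:
        return None
    top, left, bottom, right = box
    if bottom - top + 1 > height or right - left + 1 > width:
        return None
    rows, cols = len(image), len(image[0])
    return [[image[r][c] if r < rows and c < cols else 0
             for c in range(left, left + width)]
            for r in range(top, top + height)]
```

The first function visits every position and is therefore exhaustive but costly; the second produces a single candidate area per pattern, which is the efficiency gain the alternative is aiming at.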
Referring to the method of determining an outline size, as shown in
The pattern data having the predetermined size extracted as described above is compared with template data at a similarity calculation unit 135 to calculate the similarity between them. A plurality of items of template data are prepared. As will be described later, weight data is also prepared so that similarity to the template data can be determined with some allowance or flexibility.
The result of similarity determination made by the similarity calculation unit 135 is input to a determination result processing unit 136. When the result of similarity determination exceeds a threshold, the character represented by the pattern data is finally determined. A sequence controller 131 controls the sequence in which each of the above-described blocks executes the data processing. 137 represents a memory in which the template data and weight data are stored.
A process of extracting a pattern area from the binary data is performed. Specifically, pattern data having a predetermined size (which is the same as that of the template data) is extracted.
At the next step, step SA3, an initial value is given to a template No. k, and it is determined at step SA4 whether k has reached a maximum value K. When k has reached the maximum value K at step SA4, the pattern recognition process terminates.
When k has not reached the maximum value K, template data Q[k](i, j) associated with k is read from the storage means, that is, the memory 137 (step SA5). Weight data W[k](i, j) associated with the template data is read from the memory 137 (step SA6). The template data and weight data may be completely different types of patterns or identical patterns at different inclinations, and such patterns may be selectively used depending on purposes.
Weighted similarity Sw[k] is calculated from the pattern data P, template data Q[k], and weight data W[k] (step SA7). Referring to the method of calculating similarity, for example, Expression 2 is used.
Next, it is determined whether the similarity exceeds a predetermined value T[k] (step SA8). If the similarity does not exceed the predetermined value T[k], a determination result J[k] of 0 is asserted (step SA10), and similarity between the next template and the pattern data P is calculated (step SA11). If the similarity exceeds the predetermined value T[k], a determination result J[k] of 1 is asserted (step SA9).
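The loop over templates in steps SA3 to SA11 may be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the similarity calculation is passed in as a parameter, and `weighted_agreement` is only a stand-in so that the example runs. It is not Expression 2.

```python
def weighted_agreement(pattern, template, weight):
    """Stand-in similarity (NOT Expression 2): sum of weights where pixels agree."""
    return sum(w
               for p_row, q_row, w_row in zip(pattern, template, weight)
               for p, q, w in zip(p_row, q_row, w_row)
               if p == q)


def match_templates(pattern, templates, weights, thresholds, similarity):
    """Return the list J, where J[k] is 1 if Sw[k] exceeds T[k] and 0 otherwise."""
    results = []
    for k in range(len(templates)):             # SA3, SA4, SA11: k runs over all templates
        q_k = templates[k]                      # SA5: read template data Q[k]
        w_k = weights[k]                        # SA6: read weight data W[k]
        s_k = similarity(pattern, q_k, w_k)     # SA7: weighted similarity Sw[k]
        results.append(1 if s_k > thresholds[k] else 0)  # SA8, then SA9 or SA10
    return results


# Usage with two hypothetical 2 x 2 templates and uniform weights.
pattern = [[1, 0], [1, 0]]
templates = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
weights = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
thresholds = [3, 3]
determination = match_templates(pattern, templates, weights, thresholds, weighted_agreement)
```

Keeping a separate threshold T[k] per template, as the description does, lets each template tolerate a different amount of mismatch.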
When Expression 2 given above is used, multiplications and divisions must be carried out, which requires a tremendously large circuit scale when implemented as hardware.
In order to suppress such an increase in circuit scale, a simple method as represented by the flow chart shown in
Based on a comparison between the difference D(i, j) and a predetermined threshold Td(i, j), the sum of the selected values (the difference D(i, j)) may be used as similarity Sw[k].
That is, when a difference D(i, j) is equal to or smaller than the predetermined threshold Td(i, j), there is similarity at that pixel. When the difference exceeds the predetermined threshold Td(i, j), there is no similarity. When there is similarity, the weight data A(i, j) for that pixel is added to the similarity Sw[k]. All pixels within the predetermined size are compared with the corresponding pixels of the template data by varying j and i to obtain the similarity Sw[k].
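The simplified calculation described above needs only subtractions, comparisons, and additions, which may be sketched as follows. The function name is hypothetical, and D(i, j) is assumed here to be the absolute difference between the pattern pixel and the template pixel; the description of how D(i, j) is formed is not available in this text, so that definition is an assumption.

```python
def simplified_similarity(pattern, template, weight, diff_threshold):
    """Accumulate weight(i, j) wherever D(i, j) <= Td(i, j).

    Assumption: D(i, j) = |P(i, j) - Q(i, j)|.
    No multiplication or division is used.
    """
    total = 0
    for i in range(len(pattern)):
        for j in range(len(pattern[0])):
            d = abs(pattern[i][j] - template[i][j])
            if d <= diff_threshold[i][j]:
                total += weight[i][j]
    return total


# Usage with hypothetical 2 x 2 data: only pixel (1, 0) disagrees.
pattern = [[1, 0], [1, 1]]
template = [[1, 0], [0, 1]]
weight = [[2, 1], [5, 2]]
strict = [[0, 0], [0, 0]]
score = simplified_similarity(pattern, template, weight, strict)  # 2 + 1 + 2 = 5
```

With binary data and Td(i, j) = 0, the result is the sum of the weights of the agreeing pixels, so a mismatch at a heavily weighted pixel costs more than one at a lightly weighted pixel.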
The description is continued by referring to
As described above, a plurality of similar patterns having different outline sizes can be detected in a single matching process by combining the template data with weight data that applies a smaller weight to unstable parts near the edges of the patterns to be detected. The unstable parts are parts which are uncertain in that they may become either “1” or “0” as a result of binarization, and which leave the patterns unchanged in global views thereof regardless of the result of binarization.
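The effect of a smaller weight near pattern edges can be illustrated with a small constructed example. Everything in it is hypothetical: the rule used to derive the weights (weight 0 for any pixel that has a horizontally or vertically adjacent pixel of the opposite value, weight 1 elsewhere) is an assumption chosen for illustration, and the weighted agreement score is a stand-in rather than the expression used in the embodiment.

```python
def edge_tolerant_weights(template):
    """Weight 0 for pixels with a 4-neighbour of the opposite value, 1 elsewhere."""
    rows, cols = len(template), len(template[0])
    weights = [[1] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < rows and 0 <= nj < cols
                if inside and template[ni][nj] != template[i][j]:
                    weights[i][j] = 0
    return weights


def weighted_agreement(pattern, template, weights):
    """Sum of weights over pixels where pattern and template agree."""
    return sum(w
               for p_row, q_row, w_row in zip(pattern, template, weights)
               for p, q, w in zip(p_row, q_row, w_row)
               if p == q)


def bar(columns, width=9, height=3):
    """Build a vertical bar occupying the given columns."""
    return [[1 if c in columns else 0 for c in range(width)] for _ in range(height)]


template = bar({3, 4, 5})          # bar three pixels wide
weights = edge_tolerant_weights(template)

thin = bar({4})                    # same bar, one pixel wide
thick = bar({2, 3, 4, 5, 6})       # same bar, five pixels wide
elsewhere = bar({0, 1, 2})         # a bar in a different place

scores = {
    "thin": weighted_agreement(thin, template, weights),
    "thick": weighted_agreement(thick, template, weights),
    "elsewhere": weighted_agreement(elsewhere, template, weights),
}
```

In this construction the thin and thick bars both reach the maximum score of 15, because they differ from the template only at pixels whose weight is 0, while the displaced bar scores 6. A single template and threshold therefore accept both line thicknesses.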
Specifically, when template data 61 and weight data 62 associated with the same are prepared as shown in
As will be apparent from the above, in the example shown in
On the contrary, weight data 71 which applies a greater weight to the neighborhood of the edge of a pattern to be detected (and a part of the pattern to be detected) may be prepared as shown in
As will be apparent from the above, in the example shown in
The invention is not limited to the exact modes of the above-described embodiment and may be embodied by modifying the constituent elements without departing from the gist of the same when implemented. Various inventions may be conceived by appropriately combining a plurality of the constituent elements disclosed in the above-described embodiments. For example, some constituent elements may be deleted from among the entire constituent elements described in the embodiments. Further, constituent elements belonging to different embodiments may be combined as occasion demands.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.