The present invention relates, in general, to image analysis and, in particular, to classifying an image as handwritten, machine-printed, or unknown.
Different methodologies are used for performing optical character recognition (OCR) on handwritten text and on machine-printed text. To maximize OCR accuracy, it is advisable to separate handwritten text from machine-printed text before each is processed by an OCR system that accepts that type of text.
U. Pal and B. B. Chaudhuri, in an article entitled “Automatic Separation of Machine-Printed and Hand-Written Text Lines,” in Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999, pages 645–648, disclose a method of separating machine-printed and handwritten text in both Bangla (Bangla script) and Devnagari (Hindi script) based on the distinctive structural and statistical features of machine-printed and handwritten text lines. The present invention is not based on structural and statistical features of the entire lines of machine-printed and handwritten text.
Sean Violante et al., in an article entitled “A COMPUTATIONALLY EFFICIENT TECHNIQUE FOR DISCRIMINATING BETWEEN HAND-WRITTEN AND PRINTED TEXT,” in IEE Colloquium on Document Image Processing and Multimedia Environments, 1995, pages 17/1–17/7, disclose a method of distinguishing handwritten versus machine-printed addresses on mail by determining region count, edge straightness, horizontal profile, and the dimensions of the address box and then using a neural network to classify the letter as having either a handwritten or machine-printed address. The present invention does not use all of the features Violante et al. use to determine whether text is handwritten or machine-printed.
K. Kuhnke et al., in an article entitled “A System for Machine-Written and Hand-Written Character Distinction,” in Proceedings of the Third International Conference on Document Analysis and Recognition, 1995, pages 811–814, disclose a method of distinguishing handwritten text from machine-printed text by preprocessing the image using a bounding box and contour extraction, and then extracting features from the image (i.e., straightness of vertical lines, straightness of horizontal lines, and symmetry relative to the center of gravity of the character in question). The features extracted by Kuhnke et al. are not used in the present invention.
Kuo-Chin Fan et al., in an article entitled “CLASSIFICATION OF MACHINE-PRINTED AND HANDWRITTEN TEXTS USING CHARACTER BLOCK LAYOUT VARIANCE,” in Pattern Recognition, 1998, Vol. 31, No. 9, pages 1275–1284, disclose a method of distinguishing handwritten text from machine-printed text by dividing text blocks into horizontal or vertical directions, obtaining the base blocks from a text block image using a reduced X-Y cut algorithm, determining character block layout variance, and classifying the text according to the variance. The variance determined by Fan et al. is not used in the present invention.
U.S. Pat. No. 4,910,787, entitled “DISCRIMINATOR BETWEEN HANDWRITTEN AND MACHINE-PRINTED CHARACTERS,” discloses a device for and method of distinguishing between handwritten and machine-printed text by determining the total number of horizontal, vertical, and slanted strokes in the text, determining the ratio of slanted strokes to the determined total, and declaring the text handwritten if the ratio is above 0.2 and machine-printed if the ratio is below 0.2. The present invention does not distinguish between handwritten and machine-printed text based on a ratio of slanted strokes in the text to a total of horizontal, vertical, and slanted strokes in the text. U.S. Pat. No. 4,910,787 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. No. 5,442,715, entitled “METHOD AND APPARATUS FOR CURSIVE SCRIPT RECOGNITION,” discloses a device for and method of recognizing cursive script by segmenting words into individual characters, scanning the individual characters using a window, and determining whether or not a character within the window is in a cursive script using a neural network. The present invention does not use a scanning window or a neural network to distinguish between handwritten and machine-printed text. U.S. Pat. No. 5,442,715 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. No. 6,259,812, entitled “KEY CHARACTER EXTRACTION AND LEXICON REDUCTION CURSIVE TEXT RECOGNITION,” discloses a device for and method of recognizing cursive text by calculating character and geometric confidence levels to identify “key characters.” The present invention does not calculate character and geometric confidence levels to identify “key characters.” U.S. Pat. No. 6,259,812 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. No. 6,259,814, entitled “IMAGE RECOGNITION THROUGH LOCALIZED INTERPRETATION,” discloses a device for and method of recognizing machine-printed and handwritten character images by creating a look-up table with examples of machine-printed and handwritten characters and comparing an unknown character to the look-up table to determine its type. The present invention does not use a look-up table filled with examples of machine-printed and handwritten characters. U.S. Pat. No. 6,259,814 is hereby incorporated by reference into the specification of the present invention.
It is an object of the present invention to categorize an image as handwritten, machine-printed, or unknown.
The present invention is a method of categorizing an image as handwritten, machine-printed, or unknown.
The first step of the method is receiving an image.
The second step of the method is identifying connected components within the image.
The third step of the method is enclosing each connected component within a rectangular, or bounding, box.
The fourth step of the method is computing a height and a width of each bounding box.
The fifth step of the method is computing a sum and maximum horizontal run for each connected component, where the sum is the sum of all pixels in the corresponding connected component, and where the maximum horizontal run is the greatest number of consecutive horizontal pixels in the corresponding connected component.
The sixth step of the method is identifying connected components that are suspected of being characters.
If the number of suspected characters is less than or equal to a first user-definable number then the seventh step of the method is categorizing the image as unknown and stopping. Otherwise, proceed to the next step.
If the number of suspected characters is greater than the first user-definable number then the eighth step of the method is comparing the suspected characters to determine if matches exist.
The ninth step of the method is computing a score based on the suspected characters and the number of matches and categorizing the image into one of a group of categories consisting of handwritten, machine-printed, and unknown.
The present invention is a method of categorizing an image as handwritten, machine-printed, or unknown.
The first step 1 of the method is receiving an image.
The second step 2 of the method is identifying connected components within the image. A connected component is a grouping of black pixels, where each pixel touches at least one other pixel within the connected component. For example, the lower case “i” contains two connected components, the letter without the dot and the dot.
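The second step can be illustrated with a minimal sketch. It assumes the image is a list of rows of 0/1 values with 1 meaning black, and it assumes that pixels "touch" when they share an edge or a corner (8-connectivity); the specification does not state which connectivity is intended. The names `connected_components` and `LETTER_I` are hypothetical.

```python
from collections import deque


def connected_components(image):
    """Return a list of components, each a list of (row, col) black pixels.

    Assumption: 1 means black, and pixels touch under 8-connectivity.
    """
    rows = len(image)
    cols = len(image[0]) if image else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited black pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


# A crude lower-case "i": a dot, a blank row, then a three-pixel stem.
LETTER_I = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
```

Applied to `LETTER_I`, the sketch yields two components, the dot and the stem, in line with the lower case “i” example above.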
The third step 3 of the method is enclosing each connected component within a rectangular, or bounding, box.
The fourth step 4 of the method is computing a height and a width of each bounding box.
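The third and fourth steps can be sketched as follows, assuming each connected component is given as a non-empty list of (row, column) pixel coordinates. The function name `bounding_box` and the inclusive pixel-count convention for height and width are assumptions made for illustration.

```python
def bounding_box(pixels):
    """Return (top, left, height, width) of the smallest enclosing rectangle.

    Assumption: height and width are counted in pixels, inclusive of both
    edges, so a single pixel has height 1 and width 1.
    """
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    top, bottom = min(rows), max(rows)
    left, right = min(cols), max(cols)
    height = bottom - top + 1
    width = right - left + 1
    return top, left, height, width
```

For a vertical stem occupying rows 2 through 4 of column 1, `bounding_box([(2, 1), (3, 1), (4, 1)])` gives a height of 3 and a width of 1.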
The fifth step 5 of the method is computing a sum and maximum horizontal run for each connected component, where the sum is the sum of all pixels in the corresponding connected component, and where the maximum horizontal run is the greatest number of consecutive horizontal pixels in the corresponding connected component.
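A minimal sketch of the fifth step follows. It assumes a connected component is a list of distinct (row, column) black-pixel coordinates and interprets the sum as the count of those pixels, since each black pixel contributes one; the function name `pixel_sum_and_max_run` is hypothetical.

```python
from collections import defaultdict


def pixel_sum_and_max_run(pixels):
    """Return (sum, maximum horizontal run) for one connected component.

    Assumption: the sum is the number of black pixels in the component.
    The maximum horizontal run is the longest stretch of adjacent columns
    occupied within any single row.
    """
    columns_by_row = defaultdict(set)
    for r, c in pixels:
        columns_by_row[r].add(c)

    max_run = 0
    for cols in columns_by_row.values():
        run = 0
        prev = None
        for c in sorted(cols):
            # Extend the run if this column is adjacent to the previous one.
            run = run + 1 if prev is not None and c == prev + 1 else 1
            max_run = max(max_run, run)
            prev = c
    return len(set(pixels)), max_run
```

For a "T" shape with a three-pixel bar over a two-pixel stem, the sketch returns a sum of 5 and a maximum horizontal run of 3.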
The sixth step 6 of the method is identifying connected components that are suspected of being characters. The details of the sixth step 6 are listed in
If the number of suspected characters is less than or equal to a first user-definable number then the seventh step 7 of the method of
If the number of suspected characters is greater than the first user-definable number then the eighth step 8 of the method is comparing the suspected characters to determine if matches exist. Each suspected character is compared against every other suspected character. A match exists between a pair of suspected characters if they have the same height and width, if each suspected character in the pair has a height that is less than 4 times its width, and if each suspected character in the pair has a width that is less than 4 times its height. If there are a significant number of matches then the image likely contains machine-printed characters.
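The match test of the eighth step can be sketched as below, assuming each suspected character is represented by the (height, width) of its bounding box. Counting each unordered pair once is an assumption, as the text does not say how matches are tallied; `is_match` and `count_matches` are hypothetical names.

```python
from itertools import combinations


def well_proportioned(height, width):
    """True if height < 4 * width and width < 4 * height."""
    return height < 4 * width and width < 4 * height


def is_match(a, b):
    """Apply the three match conditions to two (height, width) pairs."""
    same_size = a == b
    return same_size and well_proportioned(*a) and well_proportioned(*b)


def count_matches(boxes):
    """Compare every suspected character against every other one.

    Assumption: each unordered pair is counted once.
    """
    return sum(1 for a, b in combinations(boxes, 2) if is_match(a, b))
```

With boxes `[(10, 6), (10, 6), (10, 6), (12, 6), (20, 2), (20, 2)]`, the three identical 10-by-6 boxes produce three matching pairs, while the two 20-by-2 boxes do not match because their height is not less than 4 times their width.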
The ninth step 9 of the method is computing a score based on the suspected characters and the number of matches and categorizing the image into one of a group of categories consisting of handwritten, machine-printed, and unknown. The details of the ninth step 9 are listed in
The second step 22 of the method of
The third step 23 of the method of
The fourth step 24 of the method of
If there are fewer than 30 suspected characters and the first user-definable range cannot be expanded then the fifth step 25 of the method of
If there are fewer than 30 suspected characters and the first user-definable range can be expanded then the sixth step 26 of the method of
If there are more than 30 suspected characters identified then the seventh, and last, step 27 in the method of
The first step 31 of the method of
The second step 32 of the method of
The third step 33 of the method of
The fourth step 34 of the method of
The fifth step 35 of the method of
If the match score is greater than 0.5 then the sixth step 36 of the method of
If the maximum run ratio is greater than 0.8 then the seventh step 37 of the method of
If the maximum run ratio is less than 0.5 then the eighth step 38 of the method of
If the maximum run ratio is greater than 0.7 and the match score is greater than 0.05 then the ninth step 39 of the method of
If the match score is less than 0.1 then the tenth step 40 of the method of
If the match score is greater than 0.4 then the eleventh step 41 of the method of
If the image has been rotated then the twelfth step 42 of the method of
Number | Name | Date | Kind |
---|---|---|---|
4910787 | Umeda et al. | Mar 1990 | A |
5181255 | Bloomberg | Jan 1993 | A |
5216725 | McCubbrey | Jun 1993 | A |
5410614 | Chou et al. | Apr 1995 | A |
5442715 | Gaborski et al. | Aug 1995 | A |
5544259 | McCubbrey | Aug 1996 | A |
5561720 | Lellmann et al. | Oct 1996 | A |
5570435 | Bloomberg et al. | Oct 1996 | A |
5862256 | Zetts et al. | Jan 1999 | A |
6259812 | Mao et al. | Jul 2001 | B1 |
6259814 | Krtolica et al. | Jul 2001 | B1 |
6636631 | Miyazaki et al. | Oct 2003 | B1 |
6909805 | Ma et al. | Jun 2005 | B1 |
6920246 | Kim et al. | Jul 2005 | B1 |
6940617 | Ma et al. | Sep 2005 | B1 |