1. Technical Field
The present invention relates generally to character recognition systems, and relates more specifically to a system and method for providing positional synchronization of characters in a multivoting character recognition system.
2. Related Art
Character recognition systems have become an integral part of many of today's business operations. Character recognition systems are utilized in applications wherein printed character data must be converted to electronic information. There exist a number of character recognition technologies, with some of the most popular being optical character recognition (OCR) and magnetic ink character recognition (MICR).
Unfortunately, character recognition systems are often prone to errors due to numerous factors, including inconsistencies in the print quality, inconsistencies in the paper quality, etc. One way to improve the efficacy of character recognition systems is to obtain multiple sets of read results for each character being read. Then a “voting” engine can be utilized to analyze corresponding sets of read results to provide higher recognition accuracy or increased read rate.
Each set of read results can be obtained from different transducer technologies (e.g., OCR and MICR) and/or from using variations within the same technology, e.g., using different gray-scale levels in an OCR read. For instance, in an OCR environment, a gray-scale image may be “thresholded” many times to generate a series of black and white images. The series of black and white images can form multiple instances of character data suitable for analysis by a voting engine. Results from a MICR read can likewise be added to further increase the read rate.
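As a rough illustration of the multiple-threshold idea, the following sketch binarizes one small gray-scale image at several levels to produce a series of black and white images. The function name `threshold_series`, the sample pixel values, and the threshold levels are assumptions chosen for illustration only and do not describe any particular OCR system.

```python
from typing import List

GrayImage = List[List[int]]    # pixel intensities, 0 (black) to 255 (white)
BinaryImage = List[List[int]]  # 1 = ink (black), 0 = background (white)


def threshold_series(image: GrayImage, thresholds: List[int]) -> List[BinaryImage]:
    """Return one black and white image per threshold level."""
    return [
        [[1 if pixel < level else 0 for pixel in row] for row in image]
        for level in thresholds
    ]


# Hypothetical 2 x 3 gray-scale patch and three hypothetical threshold levels.
gray = [
    [250, 120, 40],
    [245, 90, 35],
]
images = threshold_series(gray, [64, 128, 192])
```

Each resulting image could then be read separately, yielding one set of read results per threshold level.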
One of the problems that occur when read information is obtained from multiple transducer inputs is that the characters from the different inputs may not be positionally synchronized. There exist certain circumstances in which the read data for one or more inputs includes a missed character or an erroneously added character. This problem takes potentially serious shape when results from different sets of transduced information are combined by the voting engine. For example, consider the case where a first set of transduced information correctly includes a string of characters and a second set erroneously does not include the third character read from a string of characters. In this case, the second set of transduced information will provide character information that is not synchronized with the first set. Namely, it will report that the fourth character is the third character, the fifth character is the fourth character, and so on. The result is a situation where characters obtained from different sets of corresponding transduced information are not positionally synchronized with each other.
Accordingly, without a mechanism for addressing this problem, the use of multiple read data and voting engines will be subject to unintentional misreads.
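The misalignment described above can be made concrete with a minimal sketch. It assumes a naive voting scheme that compares characters purely by their index in each read; the function `vote_by_index` and the sample strings are hypothetical and are not part of any described system.

```python
from typing import List


def vote_by_index(reads: List[str]) -> str:
    """Naive voting: compare the i-th character of every read, ignoring position data."""
    length = max(len(r) for r in reads)
    out = []
    for i in range(length):
        # A read that is too short contributes None at this index.
        chars = {r[i] if i < len(r) else None for r in reads}
        # '?' marks an index where the reads do not all agree.
        out.append(chars.pop() if len(chars) == 1 else "?")
    return "".join(out)


first_read = "123456"   # hypothetical correct read
second_read = "12456"   # same string with the third character missed
voted = vote_by_index([first_read, second_read])
```

In this sketch every index from the missed character onward disagrees, even though the second read recognized each remaining character correctly.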
The present invention addresses the above-mentioned problems, as well as others, by providing a character recognition system and method that provides positional synchronization of characters in a plurality of corresponding sets of transduced character information. In a first aspect, the invention provides a character recognition system, comprising: at least one transducer system for scanning printed character data and generating a plurality of sets of transduced character information; a position collection system for collecting positional data for characters in each set of transduced character information; a character position synchronization system for positionally synchronizing corresponding characters from different sets of transduced character information; and a voting engine for receiving the positionally synchronized sets of transduced character information.
In a second aspect, the invention provides a character recognition system, comprising: a position collection system for collecting positional data for a plurality of corresponding sets of transduced character information; and a character position synchronization system for positionally synchronizing characters from the corresponding sets of transduced character information.
In a third aspect, the invention provides a method for providing character recognition in which multiple sets of corresponding transduced character information are analyzed by a voting engine, comprising: scanning printed character data to generate multiple sets of corresponding transduced character information; collecting positional data for characters in the sets of corresponding transduced character information; and positionally synchronizing characters from the sets of corresponding transduced character information.
In a fourth aspect, the invention provides a program product stored on a recordable medium for facilitating character recognition in a multi-voting character recognition engine, comprising: means for collecting positional data for a plurality of corresponding sets of transduced character information; and means for positionally synchronizing character information from the corresponding sets of transduced character information.
These and other features of this invention will be more readily understood from the following detailed description of the various aspects of the invention taken in conjunction with the accompanying drawings in which:
Referring now to the drawings,
In the example provided in
Character recognition system 10 further includes one or more position collection systems 11. Each position collection system 11 collects positional data associated with the character data collected by the transducer systems 12. An exemplary methodology for collecting positional data is described below with respect to
Positional data is stored along with each set of transduced character information 14. Two examples of transduced character information 14 that include positional data are described below with reference to
As noted above, and described in further detail below with regard to
In the case where different transducers are utilized, multiple instances of positional information for a printed character string are collected/derived from the different transducers. In such cases, as in the described embodiment of using a MICR and an optical system, it may be necessary to initially synchronize the positional information for each instance of the captured character string. Stated alternatively, a reference point for the two (or more) position transducers needs to be aligned, i.e., a positional bias must be established.
The positional bias may be calculated, e.g., by first establishing the character read synchronization for the first few characters and then calculating the average positional bias of the multiple instances of the string.
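One possible way to compute such a bias is sketched below. It assumes that the first few characters of the two instances have already been confirmed to correspond one-to-one and that both position lists are in the same unit of length. The function names and the sample positions are hypothetical.

```python
from typing import List, Sequence


def average_positional_bias(
    reference: Sequence[float], other: Sequence[float], n_chars: int = 3
) -> float:
    """Mean offset (other minus reference) over the first n_chars characters."""
    n = min(n_chars, len(reference), len(other))
    if n == 0:
        raise ValueError("need at least one character position in each string")
    return sum(o - r for r, o in zip(reference[:n], other[:n])) / n


def remove_bias(positions: Sequence[float], bias: float) -> List[float]:
    """Shift positions so they share the reference string's origin."""
    return [p - bias for p in positions]


# Hypothetical positions for the same characters from two transducers.
micr_positions = [1.50, 1.80, 2.10, 2.40]
optical_positions = [1.75, 2.05, 2.35, 2.65]

bias = average_positional_bias(micr_positions, optical_positions)
aligned_optical = remove_bias(optical_positions, bias)
```

Averaging over several characters, rather than using a single character, reduces the effect of measurement noise on any one position.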
In an exemplary embodiment, a stepper motor, which transports a document in relatively small steps, can be utilized to provide positional information. Specifically, stepper information may be mapped into a linear displacement of a document in order to provide positional data. Knowing when the MICR characters are read in relationship to the motor steps, a MICR “positional string” can be established. Similarly, the optical positional string can be established by knowing the character position in pixel counts and the pixels per unit length of the optical imaging system. Knowing the two positional strings in units of length, the bias of one with respect to the other(s) is established by then synchronizing the “read” of the first few, e.g., 3 or 4, characters.
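The two unit conversions described above can be sketched as follows. The step size, the imaging resolution, and the sample counts are assumed values chosen only so that both positional strings come out in the same unit of length; the function names are likewise hypothetical.

```python
from typing import List, Sequence


def steps_to_length(step_counts: Sequence[int], length_per_step: float) -> List[float]:
    """Map motor step counts at which characters were read to linear displacement."""
    return [steps * length_per_step for steps in step_counts]


def pixels_to_length(pixel_columns: Sequence[int], pixels_per_unit_length: float) -> List[float]:
    """Map character positions in pixel counts to linear displacement."""
    return [pixels / pixels_per_unit_length for pixels in pixel_columns]


# Assumed: 0.0025 length units per motor step, 200 pixels per length unit.
micr_string = steps_to_length([600, 720, 840], length_per_step=0.0025)
optical_string = pixels_to_length([300, 360, 420], pixels_per_unit_length=200.0)
```

Once both positional strings are expressed in the same unit of length, their offset can be compared directly over the first few characters.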
Referring now to
In the above example, positional data comprises a position measurement 36 for each character by position collection system 11. As shown in
As seen in
Note that in the exemplary document of
Instead, character recognition system 10 includes character position synchronization system 18 to positionally synchronize corresponding sets of transduced character information 14. Specifically, positional data is analyzed to ensure that voting engine 16 is voting on character information obtained from the same position on document 30. Thus, each character in a set is positionally synchronized with characters from other corresponding sets.
In one exemplary embodiment, synchronization is achieved as follows. First, one set of the character information (e.g., Set A) is established as the reference set. Next, each position measurement value from the reference set is compared to position measurement values from the secondary set(s) (e.g., Set B). If a position measurement value from one of the secondary sets matches the position measurement value from the reference set, within some tolerance, then it is concluded that the characters associated with the matching position measurement values correspond. For instance, Char A from the reference Set A has a position measurement value of 1.52. This value is then compared to each value in the secondary Set B to look for matches within a tolerance (e.g., +/−0.05). In this case, Char A from secondary Set B has a value of 1.50, which falls within the tolerance range. It is thus concluded that Char A from reference Set A corresponds to Char A from the secondary Set B (i.e., the two characters from sets A and B can be synchronized). Obviously, the selection of the tolerance range can vary from application to application.
This process is then repeated for each character in the reference Set A to identify corresponding characters in the secondary set(s). Thus, it can be seen that the second character “C” having a value of 1.80 in Set A corresponds to the second character “C” in Set B, which has a value of 1.84. Note however that in Set A, the third character “c” has a positional data value of 2.10, and no positional data value in Set B matches 2.10 (i.e., the closest value is 1.84, which is a 0.26 difference and falls outside the tolerance of +/−0.05). Accordingly, it is concluded that there is no corresponding character in Set B for the third character “c.” It can be readily seen that the remaining characters in the reference Set A, “O, U & N” having values of 2.41, 2.68 and 3.01 respectively, correspond to characters “O, U & N” in secondary Set B, which have values of 2.42, 2.70 and 3.04, respectively. That is, their respective values fall within the tolerance.
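The comparison walked through above can be sketched in a few lines, using the Set A and Set B values and the +/−0.05 tolerance from the example. The function name `synchronize`, the tuple layout, and the choice to take the nearest secondary value before applying the tolerance are assumptions of this sketch rather than a definitive implementation.

```python
from typing import List, Optional, Tuple

Read = Tuple[str, float]  # (recognized character, position measurement value)


def synchronize(
    reference: List[Read], secondary: List[Read], tolerance: float = 0.05
) -> List[Tuple[Read, Optional[Read]]]:
    """Pair each reference character with the secondary character at the same position.

    A reference character with no secondary value within the tolerance is paired with None.
    """
    pairs = []
    for ref in reference:
        # Assumption: consider only the nearest secondary value, then test the tolerance.
        nearest = min(secondary, key=lambda cand: abs(cand[1] - ref[1]), default=None)
        if nearest is not None and abs(nearest[1] - ref[1]) <= tolerance:
            pairs.append((ref, nearest))
        else:
            pairs.append((ref, None))
    return pairs


set_a = [("A", 1.52), ("C", 1.80), ("c", 2.10), ("O", 2.41), ("U", 2.68), ("N", 3.01)]
set_b = [("A", 1.50), ("C", 1.84), ("O", 2.42), ("U", 2.70), ("N", 3.04)]

pairs = synchronize(set_a, set_b)
matched_b = [match[0] if match else None for _, match in pairs]
```

With these values, the third reference character “c” is paired with no character from Set B, while every other reference character is paired with its counterpart, so a voting engine would compare only characters read from the same position.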
In this example, after positional synchronization of buffers 40 and 42, two positionally synchronized buffers could be created as follows:
It is understood that the systems, functions, mechanisms, methods, engines and modules described herein can be implemented in hardware, software, or a combination of hardware and software. They may be implemented by any type of computer system or other apparatus adapted for carrying out the methods described herein. A typical combination of hardware and software could be a general-purpose computer system with a computer program that, when loaded and executed, controls the computer system such that it carries out the methods described herein. Alternatively, a specific use computer, containing specialized hardware for carrying out one or more of the functional tasks of the invention could be utilized. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods and functions described herein, and which—when loaded in a computer system—is able to carry out these methods and functions. Computer program, software program, program, program product, or software, in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of this invention as defined by the accompanying claims.