The present invention relates to a business document processor and to, for example, a technique for removing a seal impression within a business document.
With respect to the enormous amounts of paper business documents archived within organizations, there has been growing interest in recent years in scanning such documents, applying character recognition (OCR), and managing the resulting document data with document management systems, so as to improve searchability, store the paper documents safely, and share knowledge.
While OCR in its current state has high character string recognition accuracy for documents free of noise, when, for example, a seal image such as that of a company seal overlaps with a character string, that portion is liable to be erroneously recognized. If it is erroneously recognized, not only is the character information of that portion unobtainable, but nonsensical character information remains as noise and impedes subsequent searches. Seal images found in business documents are characteristic in that they are often affixed so as to overlap with information regarding customers, such as the customer name, the name of the customer's representative, and the like. Such pieces of information are often vital in identifying those documents. Thus, if such information cannot be recognized, these documents will not be returned during searches, and one would have to check all registered document data. For this reason, when applying OCR, it is necessary that character strings that overlap with seal impressions also be recognized with a high degree of accuracy.
In order to improve the recognition accuracy of such OCR, there is proposed a method for separating a seal impression that overlaps with a character string. For example, in Patent Literature 1 and Patent Literature 2, there are proposed techniques for recognizing and removing a seal impression, discerning it from text using the difference between the color of the seal impression and the color of the text in the document. Thus, even if the text and the seal impression overlap with each other, it is possible to remove only the seal impression while keeping the overlapping text.
In addition, in Patent Literature 3, there is proposed a technique for recognizing and removing seal impressions taking advantage of the fact that the contours of seal impressions often take on the form of regular polygons. Thus, in cases where the text and the seal impression overlap with each other, it is possible to prevent erroneous recognition by OCR by removing the seal impression together with the character strings that overlap with it.
However, since business documents already archived electronically are sometimes stored in grayscale, the techniques of Patent Literature 1 and 2, which are techniques for recognizing seal impressions in color, are inapplicable.
In addition,
The present invention is made in view of such circumstances, and provides a technique for removing only a seal impression while keeping character string information when applying OCR to a business document stored in grayscale even in cases where character strings and seal impressions overlap with each other.
In order to solve the problems above, a business document processor according to the present invention comprises: a seal impression detection processing portion that detects a seal impression region in a business document inputted in grayscale and removes the seal impression region from the business document; a seal impression related information extraction processing portion that extracts as seal impression related information (for example, information relating to a customer) character information that exists near the removed seal impression region in the business document, from which the seal impression region has been removed, where a portion of the characters is unclear due to the seal impression region; an attribute classification processing portion that identifies attributes of the seal impression related information that has been extracted; and a character extrapolation processing portion that refers to a character string candidate database storing character string candidates (for example, a customer database storing customer information) and extrapolates, based on the seal impression related information that has been classified by the attributes, a character string that overlaps with the seal impression region and that is thus unclear.
In addition, the character extrapolation processing portion substitutes into the portion that is unclear due to the seal impression region the character string obtained by extrapolation, and registers the business document data, into which the character string has been substituted, in a document database in a pair with the business document inputted in grayscale.
The business document processor may further comprise a display processing portion that displays on a display portion the business document data into which the character string has been substituted. In this case, if there are a plurality of character string candidates that may be substituted, the display processing portion displays on the display portion a plurality of business document data into which the plurality of candidates have been substituted, and the character extrapolation processing portion registers in the document database, of the plurality of business document data, the business document data that is selected by a user.
In addition, the character extrapolation processing portion may calculate the degree of match between information stored in the character string candidate database and the seal impression related information that has been classified by attribute, and treat the information stored in the character string candidate database as a character string candidate for substitution when the degree of match exceeds a predetermined value. On the other hand, if the degree of match is at or below a predetermined value, processing may be terminated without substituting any characters into the seal impression region.
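The threshold test described above may be sketched as follows. This is only an illustrative sketch: the function names `match_degree` and `candidates_above_threshold` are hypothetical, and the similarity ratio of Python's standard `difflib.SequenceMatcher` is used here as a stand-in for the degree of match, which the present description does not tie to any particular formula.

```python
from difflib import SequenceMatcher


def match_degree(candidate: str, observed: str) -> float:
    """Similarity in [0, 1]; larger when more characters match.

    Assumption: difflib's ratio stands in for the degree of match.
    """
    return SequenceMatcher(None, candidate, observed).ratio()


def candidates_above_threshold(observed, database, threshold):
    """Keep only database entries whose degree of match exceeds the threshold.

    An empty result corresponds to terminating without any substitution.
    """
    return [entry for entry in database
            if match_degree(entry, observed) > threshold]


# Hypothetical data: part of the name was lost with the seal region.
database = ["Acme Trading Co., Ltd.", "Zenith Foods Inc."]
observed = "Acme Tra Co., Ltd."
substitution_candidates = candidates_above_threshold(observed, database, 0.8)
```

An entry whose degree of match is exactly equal to the predetermined value is not treated as a candidate, in line with the "at or below" condition stated above.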
Further features of the present invention will become apparent from the best mode for carrying out the invention provided below and the accompanying drawings.
According to the present invention, it becomes possible to recognize documents inputted in grayscale even if character strings found in the documents overlap with seal impressions such as those of company seals and the like. Thus, searchability for business documents improves, and the effectiveness of document management systems is further enhanced.
Best modes for carrying out a business document processor of the present invention are described in detail below with reference to the accompanying drawings.
The input/output devices 30 comprise: an output portion comprising a display device 32 for displaying data, a printer (not shown), and the like; and an input portion comprising a keyboard 31 for performing such operations as menu selection with respect to displayed data, a pointing device 33 such as a mouse, a scanner 34 for scanning documents, and the like.
The program memory 40 comprises: a seal impression detection processing portion 41 that detects a seal impression, such as that of a company seal and the like, that is present in a document; an OCR processing portion 42 that recognizes characters within a document; a seal impression related information region extraction processing portion 43 that cuts out a character string block present in the periphery of a seal impression; an attribute classification processing portion 44 that classifies an attribute of a character string within the character string block; and a character substitution processing portion 45. It is noted that each processing portion is stored in the program memory 40 as program code, and each processing portion is realized through execution of the respective program code by the central processing unit 10.
The data memory 20 comprises: grayscale image data 21 obtained by scanning a paper document in grayscale; OCR result data 22 that is generated by applying OCR with respect to the grayscale image data 21; and seal impression related data 23 in which is stored information on a character string block near a seal impression region within the OCR result data 22.
Next, processing performed at a business document processor having the configuration discussed above is described.
In
Details of the process in
First, the seal impression detection processing portion 41 reads the grayscale image data 21 obtained by scanning the business document in grayscale, and searches for the region of the seal impression within the grayscale image data 21. In so doing, the seal impression is searched for using such conventional techniques as those of Patent Literature 3 and the like. In addition, after the seal impression search, the seal impression detection processing portion 41 removes a polygonal region including the contour of that seal impression. Here, with the technique of Patent Literature 3, since the seal impression and character strings cannot be recognized separately, when the seal impression region is removed, the character strings are removed together as well. The character strings removed at this point are later substituted by being extrapolated by the character substitution processing portion 45 from the surrounding character strings as will be described later.
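The removal of the polygonal region may be pictured with the following minimal sketch. It assumes that the polygon enclosing the seal impression has already been found by some detection technique (the detection itself is not shown), that the grayscale image is held as a list of rows of 0 to 255 values, and that removed pixels are painted white. The names `point_in_polygon` and `remove_region` are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def remove_region(image, polygon, background=255):
    """Return a copy of the image with every pixel inside the polygon blanked.

    Characters overlapping the seal are removed together with it.
    """
    result = [row[:] for row in image]
    for y, row in enumerate(result):
        for x in range(len(row)):
            # Test the pixel centre rather than its corner.
            if point_in_polygon(x + 0.5, y + 0.5, polygon):
                row[x] = background
    return result


# Hypothetical 4x4 dark image with a square seal region in the middle.
image = [[0] * 4 for _ in range(4)]
cleaned = remove_region(image, [(1, 1), (3, 1), (3, 3), (1, 3)])
```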
Next, details of the process in
First, the seal impression related information region extraction processing portion 43 sets the seal impression region (the region at which the seal impression was detected through the seal impression detection process) as an initial value of a seal impression related information region, and enlarges the seal impression related information region so as to include the character strings present nearby. Specifically, the seal impression related information region extraction processing portion 43 searches for character strings surrounding the seal impression related information region. For example, since it is possible to identify, through an OCR process, the font size(s) of the character strings that are present in the periphery of the seal impression, strings of characters concatenated at widths (distances) narrower than such font sizes may each be deemed as one character string. Then, the seal impression related information region extraction processing portion 43 enlarges the seal impression related information region with a rectangular region including such character strings as part of the seal impression related information region, and stores it in the data memory as the seal impression related data 23.
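The grouping and enlargement described above may be sketched as follows, under simplifying assumptions: the characters lie on a single line, each character is given as a hypothetical `CharBox` with its left and right edges as reported by OCR, and regions are `(left, top, right, bottom)` tuples. None of these names or data layouts come from the embodiment itself.

```python
from dataclasses import dataclass


@dataclass
class CharBox:
    text: str
    left: int
    right: int


def group_into_strings(chars, font_size):
    """Join characters whose horizontal gap is narrower than the font size."""
    groups, current = [], []
    for box in sorted(chars, key=lambda b: b.left):
        # A gap at least as wide as the font size starts a new character string.
        if current and box.left - current[-1].right >= font_size:
            groups.append(current)
            current = []
        current.append(box)
    if current:
        groups.append(current)
    return ["".join(b.text for b in group) for group in groups]


def enlarge_region(region, string_rects):
    """Grow the region to the smallest rectangle that also covers the strings."""
    rects = [region, *string_rects]
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


# Hypothetical OCR output: two strings separated by a wide gap.
chars = [CharBox("A", 0, 10), CharBox("B", 12, 22),
         CharBox("C", 50, 60), CharBox("D", 62, 72)]
strings = group_into_strings(chars, font_size=12)
region = enlarge_region((40, 40, 60, 60), [(10, 45, 38, 55), (62, 42, 120, 58)])
```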
Details of the process in
First, the attribute classification processing portion 44 reads the seal impression related data 23, divides the character strings within the seal impression related data 23 line by line, and assigns the attribute of the character string on each line. Specifically, the attribute classification processing portion 44 performs a morphological analysis of the character string on each line using the attribute database 53, and determines an attribute that fits each character string.
In the present embodiment, a description is provided through an example where the attribute database 53 is written in the format “(character pattern):(attribute)”. For example, if “Txxx-xxxx:‘postal code’” is written in the attribute database 53 (where x is an arbitrary number from 0 to 9) and the character string of interest is “T100-0000”, it will be determined that this character string is a match with the format for postal code, and the attribute of postal code will be assigned to this character string. In addition, if “telephone:‘telephone number’” is written in the attribute database 53 and the character string of interest includes the character string “telephone” (or “Tel”) as in “Telephone (03)1234-5678”, the attribute of telephone number is assigned thereto. Further, there are cases where it is specified in the format “‘prefecture name’+‘ward/city/town/village name’:‘address’”. This represents the fact that when a character string with a prefecture name attribute is concatenated with a character string with a ward/city/town/village name attribute, an address attribute is assumed. Thus, attributes are assigned to the respective character strings. The various attribute definitions are mutually independent, and the definitions never collide. In addition, it is assumed that a plurality of patterns representing the same attribute are registered, and that variations in notation can thus be absorbed.
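By way of illustration only, the "(character pattern):(attribute)" entries above may be approximated with regular expressions as in the following sketch. The table `ATTRIBUTE_RULES` and the function `classify` are hypothetical, the regular expressions are one possible rendering of the two example patterns, and the composite rule that concatenates a prefecture name with a ward/city/town/village name into an address is not covered.

```python
import re

# Hypothetical stand-in for the attribute database 53: (pattern, attribute).
ATTRIBUTE_RULES = [
    # "Txxx-xxxx" where each x is a digit from 0 to 9.
    (re.compile(r"^T\d{3}-\d{4}$"), "postal code"),
    # The keyword "telephone" or "Tel" anywhere in the line.
    (re.compile(r"\btel(ephone)?\b", re.IGNORECASE), "telephone number"),
]


def classify(line):
    """Return the attribute of the first matching pattern, or None.

    The definitions are assumed never to collide, so rule order is irrelevant.
    """
    for pattern, attribute in ATTRIBUTE_RULES:
        if pattern.search(line):
            return attribute
    return None


postal = classify("T100-0000")
phone = classify("Telephone (03)1234-5678")
```

Registering several patterns for the same attribute, as with "telephone" and "Tel" here, is how variations in notation would be absorbed in such a sketch.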
Details of the process in
First, the seal impression related data 23 is read (step S901). Next, variables Mmax and n are initialized (step S902). In addition, variable length array max_id is emptied (step S903).
Then, through the process from step S904 to step S911, the customer that appears to be the best match with respect to the customer information included in the seal impression related data is selected. First, unprocessed customer data is read from the customer database 52 (step S904). Next, the layout of each character string within the seal impression related data 23 is configured (step S905). Specifically, as shown in
In addition, the customer data selected in step S904 is matched against the data in the seal impression related data 23 to calculate match degree Mn (step S906). Mn is so calculated as to be greater when there are a large number of matching characters and smaller when there are a large number of non-matching characters or when the number of characters is incongruent. Existing techniques such as alignment score, for example, may be used in the calculation of match degree. In the example of
Subsequently, it is determined whether or not Mn is equal to or greater than maximum value Mmax (step S907), and if so, Mmax is updated with Mn (step S908). In addition, the value of n at that point, i.e., the ID indicating the customer, is added to max_id (step S909). Here, if Mn is equal to Mmax in the comparison in step S907, n is added to max_id, whereas if Mn is greater than Mmax in the comparison in step S907, the content held by max_id is discarded, and max_id is made to hold n alone.
Thereafter, n is incremented (step S910). Then, it is determined whether or not matching has been performed with respect to all customer data (step S911), and the process from step S904 to step S910 is repeated if there is any unprocessed customer data. If there is no unprocessed customer data, proceeding to step S912, it is determined whether or not Mmax is greater than threshold value T (step S912). T is a predefined constant and is a threshold value for determining whether or not the matching result is sufficiently plausible.
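The selection loop from step S902 to step S912 may be sketched as follows. The sketch makes several assumptions that are not stated in the description: Mmax is initialized to 0, the match degree is supplied as an arbitrary function `score`, the layout configuration of step S905 is omitted, and customer data are plain strings. The function name `select_best_customers` is hypothetical.

```python
def select_best_customers(observed, customers, score, threshold):
    """Return the IDs of the best-matching customers, or [] if implausible."""
    m_max = 0          # S902 (assumed initial value)
    max_id = []        # S903
    for n, customer in enumerate(customers):   # S904, S910, S911
        m_n = score(customer, observed)        # S906
        if m_n > m_max:                        # S907: strictly greater
            m_max = m_n                        # S908
            max_id = [n]                       # S909: discard earlier IDs
        elif m_n == m_max:                     # S907: tie
            max_id.append(n)                   # S909: keep all tied IDs
    # S912: the best score must exceed T to be considered plausible.
    return max_id if m_max > threshold else []


# Toy scoring function for illustration only: number of shared characters.
def shared_characters(a, b):
    return len(set(a) & set(b))


best = select_best_customers("abq", ["abc", "xyz", "abd"],
                             shared_characters, threshold=1)
```

When several customers tie at Mmax, all of their IDs are returned, which corresponds to the case where a plurality of candidates for substitution exist.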
If Mmax is greater than T, the character string that is missing due to the removal of the seal impression is substituted with the customer data scoring Mmax, that is, the customer data corresponding to max_id (step S913). If Mmax is equal to or less than T, it signifies the fact that the match degree is insufficient. Thus, it is determined that there is no corresponding customer data, and all of the character strings within the seal impression related data 23 are removed (step S914). In this case, the central processing unit 10 may, for example, display on the GUI in
Finally, a confirmation screen such as that shown in
In addition, on the confirmation screen, of the customers that have been selected as candidates for substitution, the customer specified by the user is displayed in highlight (in the example in
Further, when some other customer displayed in the table on the upper portion of the screen is specified by the user, the specified customer is displayed in highlight, and the customer information displayed with the document image on the lower portion of the screen is simultaneously switched. From such display, the user is able to determine which candidate is suitable for substitution. If the user determines that a candidate suitable for substitution is displayed, the user may express agreement by pressing the “yes” button in the dialog. If user agreement is obtained, the processing result is reflected in the customer database. If user agreement is not obtained, processing is cancelled.
In an embodiment of the present invention, with respect to a business document scanned in grayscale such as that shown in
Next, as shown in
Through the execution of such processing, it becomes possible to automatically and accurately obtain customer information of a document, even in a case where a seal impression is present within that document in such a manner as to overlap with character strings that contain customer information, by using information surrounding those character strings.
In the present embodiment, a case was described where the character strings that overlapped with a seal impression were character strings that contained customer information. However, the present invention is by no means limited such that character strings that overlap with a seal impression have to be character strings that contain customer information, and processing may be executed with respect to all kinds of character strings. In other words, as long as missing character strings can be extrapolated through a process of matching against a database, the present invention is applicable to all kinds of documents.
In addition, the present invention may also be realized through program code of software that realizes the functions of the embodiment. In this case, a storage medium on which the program code is recorded is supplied to a system or apparatus, and the computer (or CPU or MPU) of that system or apparatus reads the program code stored on that storage medium. Thus, the program code itself that is read from the storage medium would realize the functions of the embodiment described above, and the program code itself, as well as the storage medium storing it, would constitute the present invention. As storage media for supplying such program code, for example, flexible disks, CD-ROMs, DVD-ROMs, hard disks, optical disks, magneto-optical disks, CD-Rs, magnetic tapes, non-volatile memory cards, ROMs and the like may be used.
In addition, based on instructions of program code, an OS (operating system) and the like running on a computer may perform part or all of the actual processing, and the functions of the embodiment described above may be realized through such processing. Further, after the program code read out from the storage medium is written in the memory on the computer, the CPU and the like of the computer may perform part or all of the actual processing based on instructions of that program code, and the functions of the embodiment described above may be realized through such processing.
In addition, program code of software that realizes the functions of the embodiment may be stored on storage means, such as a hard disk, memory or the like of a system or apparatus, or on a storage medium, such as a CD-RW, CD-R or the like, through distribution via a network. At the time of use, the computer (or CPU or MPU) of that system or apparatus may read out and execute the program code stored on the storage means or the storage medium.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2008-335216 | Dec 2008 | JP | national |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/JP2009/006889 | 12/15/2009 | WO | 00 | 2/2/2011 |