This application is based on and claims priority under 35 USC 119 from Japanese Patent Application No. 2020-055528 filed Mar. 26, 2020.
The present disclosure relates to an information processing device and a non-transitory computer readable medium.
Technology that extracts desired information from a document image illustrating a document is known. For example, Japanese Unexamined Patent Application Publication No. 2004-178044 describes performing morphological analysis on character fields of a document, comparing the morphological analysis result with document attribute part-of-speech patterns that express the structure of document attributes as morphological levels, and from among the matching character fields, extracting a character field whose appearance position in the document is in a predetermined range as a document attribute.
In some cases, it is desirable to extract information such as the names of the parties to a contract from a document such as a contract. However, in the case where a document contains names other than the parties to the contract, such as company names for example, the names other than the parties to the contract may be extracted with a method that analyzes text to distinguish target information.
Aspects of non-limiting embodiments of the present disclosure relate to extracting target information more accurately compared to the case of extracting target information from a target character string irrespectively of the position of a specific type of impression.
Aspects of certain non-limiting embodiments of the present disclosure address the above advantages and/or other advantages not described above. However, aspects of the non-limiting embodiments are not required to address the advantages described above, and aspects of the non-limiting embodiments of the present disclosure may not address advantages described above.
According to an aspect of the present disclosure, there is provided an information processing device provided with a processor configured to acquire a document image illustrating a document, and extract target information with respect to a target character string from a region set with reference to a position of a specific type of impression included in the document image.
An exemplary embodiment of the present disclosure will be described in detail based on the following figures, wherein:
The image processing device 10 is provided with a processor 11, memory 12, an image reading unit 13, a storage unit 14, an operation unit 15, and a display unit 16. These hardware elements are connected through a bus 17. By executing a program, the processor 11 controls each unit of the image processing device 10 and performs a process of extracting a value corresponding to a key from the document image. For the processor 11, a central processing unit (CPU) is used for example. The memory 12 stores a program for causing the processor 11 to execute the process described above. For the memory 12, read-only memory (ROM) and random access memory (RAM) are used for example. The image reading unit 13 reads a document to generate a document image. For the image reading unit 13, an image scanner is used for example. The storage unit 14 stores the document image generated by the image reading unit 13. For the storage unit 14, a hard disk drive or a solid-state drive (SSD) is used for example. The operation unit 15 is used by the user to operate the image processing device 10. For the operation unit 15, a touch panel and one or more buttons are used for example. The display unit 16 displays various screens that the user uses to perform operations. For the display unit 16, a liquid crystal display is used for example.
In a signature field at the end of the contract, a character string 31 stating “Company Name” is arranged beside a character string 32 stating “Company A” indicating the company name of the first party to the contract, while a character string 33 stating “Company Name” is arranged beside a character string 34 stating “Company B” indicating the company name of the second party to the contract. Generally, in the case where a party to a contract is a company, a company seal is impressed at a position overlapping the last character of the company name. A company seal refers to a seal used by a company. Generally, a company seal has a square shape, and the impression of a company seal is larger than the impressions of other types of seals. Also, generally, the signature field at the end of a contract where a company seal is impressed has more whitespace than other portions of the contract. In the example illustrated in
Also, in some cases, seals other than a company seal, such as a tally seal and a revision seal, may be impressed on a contract. A tally seal refers to a seal used in the case where the contract has two or more pages, with the seal being impressed so as to straddle two pages and thereby indicate that the multiple pages are related to each other. Generally, a tally seal is impressed at an interval from a predetermined position of the document. For example, on a spread layout containing two pages arranged side by side, the positions where the tally seals are impressed are in the central part at the boundary between the side-by-side pages. On the other hand, on single pages not in a spread layout, the positions where the tally seals are impressed are along the edge of the pages. Note that tally seals may be impressed at equally spaced intervals. In the example illustrated in
A revision seal refers to a seal that is impressed to clearly indicate that someone has made a revision when revising a portion of a document. Generally, in the case of correcting incorrect characters, strikethrough is applied to the corrected portion, and a revision seal is impressed at a position corresponding to the corrected portion. The strikethrough may be a single line or double lines. In the example illustrated in
In the following description, the case where the processor 11 is described as the agent of the processes means that the processes are performed by the processor 11 performing calculations or controlling the operations of other hardware elements through cooperation between the program stored in the memory 12 and the processor 11 executing the program.
In step S11, the processor 11 causes the image reading unit 13 to read a document according to an operation by the user. With this arrangement, a document image illustrating the document is acquired. In the example illustrated in
In step S12, the processor 11 detects impressions from the document image. The detection of impressions is performed using existing impression detection technology, for example. For example, a vermilion portion approximately the size of an impression in the document image may be detected as an impression. In the example illustrated in
In step S13, the processor 11 performs a company seal determination process. As illustrated in
In step S32, the processor 11 determines whether or not an impression is provided at an interval from an edge of the document image. In the example illustrated in
In step S33, the processor 11 determines whether or not an impression is provided at an interval apart from the middle of the document image. In the example illustrated in
Returning to
Returning to
In step S24, the processor 11 ranks the candidates of the impression of a company seal according to the layout. First, the amount of whitespace surrounding an impression is counted. For example, the number of pixels having the same color as the background is used as the amount of whitespace. Here, “surrounding” refers to a range inside a predetermined distance centered on the impression for example. The greater the amount of surrounding whitespace, the higher the rank assigned. In the case where impressions have the same amount of surrounding whitespace, the impression detected earlier is assigned a higher rank. Here, it is assumed that an impression farther in the −y axis direction in the document image is detected earlier. In the example illustrated in
In step S25, the processor 11 determines the impressions of company seals according to the ranking. For example, in the case where the number of values to extract is 2, the two highest-ranking impressions are determined to be impressions of company seals. In the example illustrated in
Returning to
In step S15, the processor 11 extracts a value with respect to a key from the surrounding range of each impression determined to be the impression of a company seal in step S25 in the document image. The surrounding range refers to a predetermined region set with reference to the position of the impression. In the example illustrated in
In this case, first, the surrounding range 28 is searched for the key “company name”. Here, in the example illustrated in
Next, the surrounding range 29 is searched for the key “company name”. In the example illustrated in
The values extracted in step S15 are paired with the corresponding keys and stored in the storage unit 14, for example. In the example illustrated in
According to the exemplary embodiment described above, information indicating the parties to a contract is extracted from the surrounding ranges of the impressions of company seals, and therefore the information indicating the parties to the contract may be extracted more accurately compared to the case of extracting information indicating the parties to a contract irrespectively of the positions of the impressions of company seals. Also, because a value is extracted from inside the surrounding range of the impression of a company seal, the processing load for extracting the value is decreased compared to the case of treating the entire document image as the range from which to extract a value. Furthermore, because the impression of a tally seal and the impression of a revision seal included in the document image are excluded from the candidates of the impression of a company seal, in the case where the document image contains an impression of a tally seal and the impression of a revision seal, it is possible to avoid a situation in which these impressions are used as the impressions of company seals, and incorrect information is extracted as a value. In other words, in the case where the document image contains an impression of a tally seal and the impression of a revision seal, it is possible to avoid a situation in which the accuracy of extracting information about the parties to a contract is lowered.
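As a non-limiting illustration, the extraction of a value from the surrounding range of an impression may be sketched as follows. The sketch assumes that character recognition has already produced character strings with bounding boxes. The names TextBox, surrounding_range, and extract_value, the margin values, and the rule that the value is the nearest character string to the right of the key on the same row are all assumptions made for illustration and are not taken from the exemplary embodiment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (left, top, right, bottom) in pixels; an assumed representation.
Box = Tuple[int, int, int, int]


@dataclass
class TextBox:
    """A recognized character string and its bounding box (hypothetical type)."""
    text: str
    box: Box


def surrounding_range(seal: Box, margin_x: int, margin_y: int) -> Box:
    """Region set with reference to the seal position (margins are assumptions)."""
    left, top, right, bottom = seal
    return (left - margin_x, top - margin_y, right + margin_x, bottom + margin_y)


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely inside outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def extract_value(strings: List[TextBox], seal: Box, key: str,
                  margin_x: int = 400, margin_y: int = 100) -> Optional[str]:
    """Search only the surrounding range for the key and return its value."""
    region = surrounding_range(seal, margin_x, margin_y)
    in_region = [s for s in strings if contains(region, s.box)]
    for key_box in in_region:
        if key_box.text.lower() != key.lower():
            continue
        _, k_top, k_right, k_bottom = key_box.box
        # Assumed pairing rule: strings to the right of the key whose
        # vertical center falls within the key's row.
        same_row = [
            s for s in in_region
            if s.box[0] >= k_right
            and k_top <= (s.box[1] + s.box[3]) / 2 <= k_bottom
        ]
        if same_row:
            return min(same_row, key=lambda s: s.box[0]).text
    return None


strings = [
    TextBox("Company Name", (100, 1000, 300, 1040)),
    TextBox("Company A", (320, 1000, 500, 1040)),
    TextBox("Company X", (100, 200, 300, 240)),  # a name in the body text
]
print(extract_value(strings, (470, 980, 560, 1070), "company name"))
```

In this sketch, the character string "Company X" in the body text lies outside the surrounding range and is therefore never considered as a value.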
Furthermore, because the impressions of tally seals are determined by the position and spacing of the impressions in the document image, these impressions may be distinguished more accurately compared to the case of distinguishing the impressions of tally seals from other types of impressions irrespectively of the position and spacing of the impressions in the document image. Furthermore, in the case where the document image illustrates a single page, impressions provided along an edge of the document image are determined to be impressions of tally seals, and therefore these impressions may be distinguished more accurately compared to the case of distinguishing the impressions of tally seals from other types of impressions irrespectively of the position at the edge of the document image in this case. Furthermore, in the case where the document image illustrates spread pages, impressions provided at the boundary between the pages are determined to be impressions of tally seals, and therefore these impressions may be distinguished more accurately compared to the case of distinguishing the impressions of tally seals from other types of impressions irrespectively of the position at the boundary between the pages in this case.
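As a non-limiting illustration, the position-based part of the tally seal determination may be sketched as follows. The function name is_tally_seal, the box representation, and the tolerance value are assumptions made for illustration. The check on the spacing between impressions is omitted from this sketch.

```python
from typing import Tuple

# (left, top, right, bottom) in pixels; an assumed representation.
Box = Tuple[int, int, int, int]


def is_tally_seal(seal: Box, page_width: int, page_height: int,
                  is_spread: bool, tolerance: int = 20) -> bool:
    """Position-only check; the tolerance in pixels is an assumed parameter."""
    left, top, right, bottom = seal
    if is_spread:
        # Spread pages: the impression straddles the boundary in the middle.
        middle = page_width / 2
        return left <= middle <= right
    # Single page: the impression lies along one of the four edges.
    return (left <= tolerance or top <= tolerance
            or right >= page_width - tolerance
            or bottom >= page_height - tolerance)


print(is_tally_seal((0, 300, 40, 380), 1000, 1400, is_spread=False))
print(is_tally_seal((480, 300, 520, 380), 1000, 1400, is_spread=True))
print(is_tally_seal((700, 1200, 800, 1300), 1000, 1400, is_spread=False))
```

The first two calls describe impressions at the page edge and at the page boundary, respectively, while the third describes an impression in a signature field away from both.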
Furthermore, because impressions at positions corresponding to characters with strikethrough applied in the document image are determined to be the impressions of revision seals, these impressions may be distinguished more accurately compared to the case of distinguishing the impressions of revision seals from other types of impressions irrespectively of the position corresponding to characters with strikethrough applied in the document image. Furthermore, because the impressions of company seals are determined in accordance with a ranking assigned according to the sizes of the impressions, these impressions may be distinguished more accurately compared to the case of distinguishing the impressions of company seals from other types of impressions irrespectively of the sizes of the impressions. Furthermore, because the impressions of company seals are determined in accordance with a ranking assigned according to the amounts of whitespace surrounding the impressions, these impressions may be distinguished more accurately compared to the case of distinguishing the impressions of company seals from other types of impressions irrespectively of the amounts of whitespace surrounding the impressions. Furthermore, in the case where a surrounding range of an impression of a company seal contains information indicating multiple parties to a contract, only the information indicating a party to the contract closest to the position of the impression of the company seal is extracted, and therefore the information indicating the parties to the contract may be extracted more accurately compared to the case of extracting information indicating the parties to the contract irrespectively of the position of the information with respect to the impression of a company seal.
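As a non-limiting illustration, the ranking according to the amount of surrounding whitespace may be sketched as follows. The sketch assumes a grayscale image held as a list of rows, a background pixel value of 255, and boxes whose right and bottom coordinates are exclusive. The names whitespace_amount and rank_by_layout are hypothetical. Ties are broken in favor of the impression detected earlier, which is represented here by the lower index.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive (an assumption).
Box = Tuple[int, int, int, int]


def whitespace_amount(image: List[List[int]], seal: Box, distance: int,
                      background: int = 255) -> int:
    """Count background-colored pixels within `distance` of the impression."""
    left, top, right, bottom = seal
    height, width = len(image), len(image[0])
    y0, y1 = max(0, top - distance), min(height, bottom + distance)
    x0, x1 = max(0, left - distance), min(width, right + distance)
    return sum(1 for y in range(y0, y1) for x in range(x0, x1)
               if image[y][x] == background)


def rank_by_layout(image: List[List[int]], seals: List[Box],
                   distance: int) -> List[int]:
    """Return impression indices ordered from the highest rank to the lowest."""
    amounts = [whitespace_amount(image, s, distance) for s in seals]
    # More whitespace ranks higher; on a tie, the earlier detection wins.
    return sorted(range(len(seals)), key=lambda i: (-amounts[i], i))


# A 10 x 10 toy image: 255 is background, 128 stands in for an impression,
# and 0 stands in for printed text.
image = [[255] * 10 for _ in range(10)]
seals = [(1, 1, 3, 3), (6, 6, 8, 8)]
for left, top, right, bottom in seals:
    for y in range(top, bottom):
        for x in range(left, right):
            image[y][x] = 128
image[0][0] = image[0][1] = 0  # text near the first impression
print(rank_by_layout(image, seals, distance=1))
```

In the toy image, the first impression has text nearby and therefore less surrounding whitespace, so the second impression is ranked higher.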
The exemplary embodiment described above is one example of the present disclosure. The present disclosure is not limited to the exemplary embodiment described above. In addition, the exemplary embodiment described may also be modified like the following examples. At this time, two or more of the following exemplary modifications may also be combined and used.
In the exemplary embodiment described above, after impressions are detected, the processor 11 may also remove the impressions from the document image and perform character recognition on the document image with the impressions removed. The removal of impressions may be performed using existing technology. For example, vermilion portions that are the color of an impression may be removed from the document image. In the example illustrated in
In the exemplary embodiment described above, the processor 11 may also change the surrounding range of an impression of a company seal. For example, the surrounding range of an impression of a company seal may be changed depending on the type of document. This is because in some cases, the positional relationship between an impression of a company seal, a key, and a value is different depending on the type of document. For example, in the case where the type of document is a first type, the length of the surrounding range in the vertical direction or the horizontal direction may be changed such that the surrounding range is a horizontally long shape. On the other hand, in the case where the type of document is a second type, the length of the surrounding range in the vertical direction or the horizontal direction may be changed such that the surrounding range is a vertically long shape. In another example, the surrounding range of an impression of a company seal may be changed depending on the type of value. This is because in some cases, the positional relationship between an impression of a company seal, a key, and a value is different depending on the type of value. For example, the size or shape of the surrounding range may be changed between the case where the value is a company name and the case where the value is a personal name. In another example, the surrounding range of an impression of a company seal may be changed depending on the position of the impression of the company seal. In the example illustrated in
In the exemplary embodiment described above, the method of determining whether or not a document image illustrates spread pages is not limited to a method using the aspect ratio. For example, in the case where the user performs an operation of designating whether or not the pages are spread pages, the determination of whether or not the document image illustrates spread pages may be made according to the operation. In another example, because a line or a dashed line is provided in the middle of a spread page layout in some cases, the determination of whether or not the document image illustrates spread pages may be made according to the presence or absence of such a line. In another example, depending on the document, the first page and the last page may be a single page that serves as the front and back covers while all other pages are spread pages in some cases, and therefore the determination of whether or not the document image illustrates spread pages may be made according to whether the page is the first page or the last page.
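As a non-limiting illustration, the alternatives described above for determining whether or not a document image illustrates spread pages may be combined as in the following sketch. The function name is_spread, its parameters, the order in which the alternatives are consulted, and the use of "wider than tall" as the aspect ratio criterion are all assumptions made for illustration.

```python
from typing import Optional


def is_spread(width: int, height: int, page_index: int, page_count: int,
              user_choice: Optional[bool] = None,
              has_center_line: Optional[bool] = None,
              covers_are_single: bool = False) -> bool:
    """Decide whether a document image illustrates spread pages.

    The priority order below is an assumption, not part of the embodiment.
    """
    # 1. A designation made by the user through an operation takes priority.
    if user_choice is not None:
        return user_choice
    # 2. Documents whose first and last pages are single cover pages.
    if covers_are_single:
        return page_index not in (0, page_count - 1)
    # 3. Presence or absence of a line or dashed line in the middle.
    if has_center_line is not None:
        return has_center_line
    # 4. Fall back to the aspect ratio (assumed: wider than tall is a spread).
    return width > height


print(is_spread(2000, 1400, page_index=2, page_count=5))
print(is_spread(2000, 1400, page_index=0, page_count=5, covers_are_single=True))
```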
In the exemplary embodiment described above, the tally seal determination process and the revision seal determination process may also be performed after assigning a ranking. In this case, impressions of tally seals and impressions of revision seals may also be ranked. Also, impressions having a size that is less than a threshold may be excluded from the candidates of an impression of a company seal, whereas impressions having a size that is the threshold or greater may be assigned a rank according to the layout.
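As a non-limiting illustration, the exclusion by a size threshold followed by ranking according to the layout may be sketched as follows. The Impression type, its fields, the function name rank_candidates, and the numeric values are hypothetical. The amount of surrounding whitespace is assumed to have been measured beforehand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Impression:
    """A detected impression (hypothetical type for illustration)."""
    name: str
    size: int        # for example, the bounding-box area in pixels
    whitespace: int  # amount of surrounding whitespace, measured beforehand


def rank_candidates(impressions: List[Impression],
                    size_threshold: int) -> List[Impression]:
    """Drop impressions smaller than the threshold, then rank by layout."""
    candidates = [imp for imp in impressions if imp.size >= size_threshold]
    # sorted() is stable, so impressions with equal whitespace keep their
    # detection order, which matches the tie-break of the embodiment.
    return sorted(candidates, key=lambda imp: -imp.whitespace)


detected = [
    Impression("tally", 900, 50),
    Impression("seal_a", 4000, 300),
    Impression("seal_b", 4200, 500),
    Impression("seal_c", 4100, 300),
]
print([imp.name for imp in rank_candidates(detected, size_threshold=2000)])
```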
In the exemplary embodiment described above, in the case where the rank according to size and the rank according to layout are different, the rank according to size may be changed by the rank according to layout. For example, in the case where the rank according to size is 2 and the rank according to layout is 3, the rank may be lowered to 3. Also, a score may be computed on the basis of the rank according to size and the rank according to layout, and a combined rank may be computed according to the score. For example, in the case where the rank according to size is 1 and the rank according to layout is 1, the score is 1+1=2. In the case where the rank according to size is 2 and the rank according to layout is 2, the score is 2+2=4. In the case where the rank according to size is 3 and the rank according to layout is 3, the score is 3+3=6. In this case, the combined rank is high in order of lowest score.
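As a non-limiting illustration, the computation of a combined rank from the score may be sketched as follows. The function name combined_ranks and the use of dictionaries keyed by an impression label are assumptions. The handling of equal scores is not specified above, and the sketch breaks such ties by the rank according to size as an assumption.

```python
from typing import Dict


def combined_ranks(size_rank: Dict[str, int],
                   layout_rank: Dict[str, int]) -> Dict[str, int]:
    """Score = size rank + layout rank; a lower score gives a higher rank."""
    scores = {name: size_rank[name] + layout_rank[name] for name in size_rank}
    # Assumed tie-break: on equal scores, the better size rank comes first.
    ordered = sorted(scores, key=lambda name: (scores[name], size_rank[name]))
    return {name: position for position, name in enumerate(ordered, start=1)}


# The example from the text: scores 1+1=2, 2+2=4, and 3+3=6.
size_rank = {"first": 1, "second": 2, "third": 3}
layout_rank = {"first": 1, "second": 2, "third": 3}
print(combined_ranks(size_rank, layout_rank))
```

With the ranks of the example above, the scores are 2, 4, and 6, and the combined ranks are therefore 1, 2, and 3 in the same order.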
In the exemplary embodiment described above, a rank does not necessarily have to be assigned. For example, in the case where only an impression of a company seal, an impression of a tally seal, and an impression of a revision seal are detected from the document image, ranks do not have to be assigned. In this case, the impression that is neither an impression of a tally seal nor the impression of a revision seal from among the impressions detected from the document image is the impression of a company seal.
In the exemplary embodiment described above, the tally seal determination process and the revision seal determination process do not necessarily have to be performed. Generally, an impression of a company seal is larger than other types of impressions, such as an impression of a tally seal or an impression of a revision seal, and therefore the impression of a company seal is assigned a higher rank according to size than other types of impressions. Consequently, the impression of a company seal may be determined by the rank according to size. Also, generally, the impression of a company seal has more surrounding whitespace than other types of impressions, and therefore even if another type of impression is as large as or larger than the impression of a company seal, the impression of a company seal is assigned a higher rank according to layout than other types of impressions. Consequently, the impression of a company seal may be determined by the rank according to layout in addition to the rank according to size.
In the exemplary embodiment described above, an impression of a company seal may be used as a key. In this case, a value may be extracted from the nearby range of the key. The nearby range refers to a predetermined region set with reference to the position of the impression of a company seal. In the example illustrated in
In the exemplary embodiment described above, character recognition does not necessarily have to be performed on the entire document image. For example, character recognition may be performed only in the surrounding range of the impression of a company seal.
In the exemplary embodiment described above, the specific type of impression is not limited to the impression of a company seal. For example, in the case where the parties to a contract are individuals, the personal seals of the individuals may be impressed beside the names of the contracting parties in some cases. In this case, the specific type of impression is the impression of a personal seal of an individual. The impression of a personal seal of an individual is assigned a lower rank according to size, but is assigned a higher rank according to layout, and therefore may be determined to be the specific type of impression.
In the exemplary embodiment described above, the document illustrated by the document image is not limited to a contract. The document may be any type of document insofar as the document contains a value stated near an impression, such as a receipt or an invoice, for example. Additionally, the value is not limited to information indicating a party to a contract. For example, the value may be any type of information that is stated near an impression, such as the issuer of a receipt or an invoice.
In the exemplary embodiment described above, some of the functions of the image processing device 10 may also be provided in an external device. For example, a server device connected to the image processing device 10 through a communication channel may acquire a document image from the image processing device 10 and perform the process of extracting a value from the document image. In this example, the server device is the information processing device according to an exemplary embodiment of the present disclosure.
In the exemplary embodiment above, the term “processor” refers to hardware in a broad sense. Examples of the processor include general processors (e.g., CPU: Central Processing Unit), and dedicated processors (e.g., GPU: Graphics Processing Unit, ASIC: Application Specific Integrated Circuit, FPGA: Field Programmable Gate Array, and programmable logic device).
In the embodiment above, the term “processor” is broad enough to encompass one processor or plural processors in collaboration which are located physically apart from each other but may work cooperatively. The order of operations of the processor is not limited to one described in the embodiment above, and may be changed.
An exemplary embodiment of the present disclosure may also be provided as a program executed in the image processing device 10. The image processing device 10 is an example of a computer according to an exemplary embodiment of the present disclosure. The program may be downloaded through a communication channel such as the Internet, or may be provided by being recorded onto a computer readable recording medium such as a magnetic recording medium (such as magnetic tape or a magnetic disk), an optical recording medium (such as an optical disc), a magneto-optical recording medium, or semiconductor memory.
The foregoing description of the exemplary embodiment of the present disclosure has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. The embodiment was chosen and described in order to best explain the principles of the disclosure and its practical applications, thereby enabling others skilled in the art to understand the disclosure for various embodiments and with the various modifications as are suited to the particular use contemplated. It is intended that the scope of the disclosure be defined by the following claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2020-055528 | Mar 2020 | JP | national |