The present application is based on and claims the benefit of priority under 35 U.S.C. §119 of Japanese Patent Application No. 2014-048663, filed Mar. 12, 2014, and Japanese Patent Application No. 2015-037577, filed Feb. 27, 2015, the entire contents of which are hereby incorporated herein by reference.
1. Field of the Invention
The present invention generally relates to a document processing system, a document processing apparatus, and a document processing method.
2. Description of the Related Art
There is a known document management apparatus which recognizes and searches for hand-written characters (letters) which are written on a document having a predetermined format such as, for example, a ledger paper, an interview paper, etc. Further, there is a known document management apparatus which separates document image data, where typed characters (letters) and hand-written characters are mixed, into image data of a typed area and image data of a hand-written area, performs a character recognition process on each of the areas, and generates index tables to be searched (see, for example, Japanese Laid-open Patent Publication No. 2007-011683).
According to an aspect of the present invention, a document processing system includes a document storage (accumulation) unit storing (accumulating) document images which include a predetermined one or more character strings and one or more fill-in ranges which correspond to the one or more character strings; an association information storage unit storing the character strings of the document images in association with the fill-in ranges corresponding to the character strings; a search unit searching for a character string, which includes a requested searched-for character string, from among the stored character strings; and a display control unit displaying a list of images of the fill-in ranges corresponding to the searched-for character string in the stored (accumulated) document images.
Other objects, features, and advantages of the present invention will become more apparent from the following description when read in conjunction with the accompanying drawings, in which:
Basically, it is possible to accurately recognize typed text (with no handwriting) because the character shapes are stable, but it is difficult to accurately recognize hand-written text because of individual differences and unstable character shapes. Further, in order to increase the recognition accuracy of hand-written text, there is a known method in which a person (writer) is requested to write one character per block (characters are written on a block basis). However, if a person is requested to write a long sentence into a ledger paper dedicated to handwriting where each character is to be written block by block, the person is reluctant to write that way because usability (convenience) is remarkably lowered. Therefore, such a writing method is not generally used.
On the other hand, in a ledger paper and an interview paper, it is possible to accurately recognize choices which are selected by a person by identifying the columns which are filled in with a writing material such as a pencil, by using an Optical Mark Recognition (OMR) technique for questions with choices. However, even in such a case, if there is a part where handwriting is necessary to, for example, write a text answer which is not included among the choices, it is still difficult to recognize such text; for example, it becomes necessary for an operator to read the text and type it (by using a keyboard) to digitize it.
As described, it is difficult to digitize and use a document which includes handwritten texts written on a predetermined format.
The present invention is made in light of the above problem, and may provide a document processing system that digitizes a document which includes handwritten texts written on a predetermined format so that the document can be used (handled) easily.
In the following, embodiments of the present invention are described with reference to the accompanying drawings.
The server apparatus 101 refers to an information processing apparatus having a configuration of a general computer, and is an example of a document management apparatus according to an embodiment. Various functions as the document management apparatus are realized by a program, etc., which runs on the server apparatus 101. The image forming apparatus 102 refers to an apparatus having an image reading function of, for example, a multifunctional peripheral having the functions of a printer, a scanner, a copier, a facsimile machine, etc., in a single chassis. The terminal device 103 refers to an information processing apparatus having a configuration of a general computer such as a Personal Computer (PC), a tablet terminal, a smartphone, etc.
In the configuration of
Note that the configuration of
The server apparatus 101 and the terminal device 103 have a configuration of, for example, a general computer.
The CPU 201 refers to an arithmetic device (processor) which realizes the functions of the server apparatus 101 or the terminal device 103 by reading (loading) a program and data, which are stored in the ROM 203 and the storage section 204, onto the RAM 202 and executing the program (processes). The RAM 202 refers to a volatile memory which is used as, for example, a work area of the CPU 201. The ROM 203 refers to a non-volatile memory which can hold the program and data stored therein even when the power thereto is turned off, and may be, for example, a flash ROM.
The storage section 204 refers to a storage device such as a Hard Disk Drive (HDD), a Solid State Drive (SSD), etc., and stores an application program, various data, etc.
The external I/F 205 refers to an interface with an external device. The external device may be, for example, a recording medium 210. The server apparatus 101 and the terminal device 103 can read and write data from and to the recording medium 210. The recording medium 210 may be, for example, an optical disk, a magnetic disk, a memory card, a Universal Serial Bus (USB) memory, etc.
Further, by storing a predetermined program in the recording medium 210 and installing the program into the server apparatus 101 or the terminal device 103 via the external I/F 205, it becomes possible to execute the predetermined program.
The input section 206 includes, for example, a pointing device such as a mouse, a keyboard, etc., and is used to input various operation signals to the server apparatus 101 or the terminal device 103. The display section 207 includes a display, etc., and displays a result of processing performed by the server apparatus 101 or the terminal device 103.
The communication I/F 208 refers to an interface for connecting the server apparatus 101 or the terminal device 103 to the network 104. By having the communication I/F 208, it becomes possible for the server apparatus 101 or the terminal device 103 to perform data communications with another apparatus (device) via the network 104. The bus 209 is connected to the elements described above, and transmits address signals, data signals, various control signals, etc.
Note that the configuration of
The controller board 300 includes a configuration of a general computer. Namely, the controller board 300 includes a CPU 301, a system memory 302, a North Bridge (NB) 303, a South Bridge (SB) 304, an Application Specific Integrated Circuit (ASIC) 306, a local memory 307, an HDD 308, a Network Interface Card (NIC) 313, a USB interface 314, an IEEE1394 interface 315, a Centronics interface 316, etc.
The operation panel 309 is connected to the ASIC 306 of the controller board 300. Further, the SB 304, the NIC 313, the USB interface 314, the IEEE1394 interface 315, and the Centronics interface 316 are connected to the NB 303 via a PCI bus. Further, the FCU 310, the printer 311, and the scanner 312 are connected to the ASIC 306 of the controller board 300 via a PCI bus.
Further, the ASIC 306 of the controller board 300 is connected to the local memory 307, the HDD 308, etc. Further, the CPU 301 is connected to the ASIC 306 via the NB 303, which is part of a CPU chipset. Further, for high-speed communications, the ASIC 306 is connected to the NB 303 not via a PCI bus but via an Accelerated Graphics Port (AGP) 305.
The CPU 301 is a processor that performs overall control of the image forming apparatus 102. The CPU 301 executes the operating system, applications, and programs providing various services stored in, for example, the HDD 308, etc., so as to realize the functions of the image forming apparatus 102.
The NB 303 is a bridge to connect the CPU 301, the system memory 302, the SB 304, and the ASIC 306 to each other. The system memory 302 is a memory to be used, for example, as a drawing memory of the image forming apparatus 102. The SB 304 is a bridge to connect the NB 303 to the PCI bus and peripheral devices. Further, the local memory 307 is a memory to be used, for example, as a copy image buffer and a code buffer. Hereinafter the system memory 302 and the local memory 307 may be referred to (simplified) as a “memory” or a “storage area”.
The ASIC 306 is an integrated circuit dedicated to image processing and includes hardware elements for image processing. The HDD 308 is a storage device to store (accumulate), for example, images, programs, font data, forms, etc.
Further, the operation panel 309 is hardware that receives input operations from a user (an operation section) and displays information for the user (a display section). The FCU 310 performs transmission and reception of FAX data in accordance with a standard such as, for example, Group 3 Facsimile (G3 FAX). The printer 311 performs printing, for example, under the control of the program running on the CPU 301.
The NIC 313 is a communication interface to connect the image forming apparatus 102 to the network 104 so as to perform data transmission and reception. The USB interface 314 is a serial bus interface to connect the image forming apparatus 102 to, for example, a recording medium such as a USB memory and various USB-based devices. The IEEE1394 interface 315 is an interface to connect the image forming apparatus 102 to, for example, a device in compliance with IEEE1394, which is a high-speed serial bus standard. The Centronics interface 316 is an interface to connect the image forming apparatus 102 to, for example, a device in compliance with Centronics specification, which is a parallel port specification.
Note that the configuration of
In
The communication means 401 is a means for connecting the server apparatus 101 to the network 104 so as to perform data transmission and reception with the image forming apparatus 102, the terminal device 103, etc., and corresponds to, for example, the communication I/F 208 of
The document storage (accumulation) means 402 stores (accumulates) a document image (an image of a document) which includes a predetermined one or more character strings, and fill-in ranges (entry ranges) corresponding to the character strings. The document storage means 402 stores (accumulates) the document image, which is acquired by the image forming apparatus 102, etc., and which is to be processed, in a storage means such as, for example, the storage section 204 of
Here, the document (document image) to be processed in the document processing system 100 is described.
Referring back to
Referring back to
The search-for means 405 searches for a character string which includes a searched-for character string, requested (input) to the terminal device 103, etc., from among the character strings stored as (in) the association information 409. In the association information 409, the character strings (typed texts) are stored in association with the corresponding fill-in ranges (entry columns) where handwritten characters, etc., are to be written. Due to this, it becomes possible to identify the fill-in range which corresponds to the question, etc., including the searched-for character string as a result of searching.
The display control means 406 displays a list of images of the fill-in ranges, for example, on the terminal device 103, etc. The fill-in ranges correspond to the character strings searched for by the search-for means 405 and are included in the document images stored (accumulated) as the document data 408. For example, in the document image of
The extraction means 407 extracts an image of an input range (fill-in range) corresponding to the handwritten characters selected from among the handwritten characters displayed in a list on the terminal device 103, etc. The display control means 406 displays the image extracted by the extraction means 407 in a list, for example, on the terminal device 103, etc.
Note that the document storage means 402, the association information storage means 403, the identification means 404, the search-for means 405, the display control means 406, the extraction means 407, etc., are realized by a program running, for example, in the server apparatus 101.
In
The reading means 410 reads a document to be processed, and converts the read document into electronic data such as a document image, etc. The reading means 410 includes, for example, the scanner 312 of
The character recognition means 411 performs an Optical Character Recognition (OCR) process that converts character images included in the document image, etc., read by the reading means 410 into text data, and acquires the character strings (typed texts) included in the document image and the coordinates information of the character strings. The character recognition means 411 is realized by, for example, a program running on the CPU 301 of
The input display means 412 displays various information and receives user's input operations, and includes, for example, the operation panel 309 of
The communication means 413 connects the image forming apparatus 102 to the network 104 so that the image forming apparatus 102 can perform data transmission and reception with the server apparatus 101, the terminal device 103, etc. The communication means 413 corresponds to, for example, the NIC 313 of
In
The input means 414 receives a user's input operation, and corresponds to, for example, the input section 206 of
The display means 415 displays various information of processing screens, etc., of the terminal device 103, and corresponds to, for example, the display section 207 of
The communication means 416 connects the terminal device 103 to the network 104, so that the terminal device 103 can perform data transmission and reception with the server apparatus 101, the image forming apparatus 102, etc., and corresponds to, for example, the communication I/F 208 of
Note that the above functional configurations are one example only, and the present invention is not limited to the above functional configurations. For example, those means of the server apparatus 101 may be included in the image forming apparatus 102, the terminal device 103, etc. Further, the character recognition means 411, the input means 414, the display means 415, etc., may be included in the server apparatus 101. Further, for example, the image forming apparatus 102 may be connected to the server apparatus 101, the terminal device 103, etc., via a USB interface, etc., without using the communication means 413.
Here, the process is described of identifying the one or more character strings (typed texts, etc.) included in a document image and the fill-in ranges, where handwritten characters are written, corresponding to the character strings.
The document data read by the reading means 410, the positions of the typed texts identified by the character recognition means 411, and the text data are transmitted to the server apparatus 101 by the communication means 413 of the image forming apparatus 102.
For example, in the interview paper 701 of
Further, for example, it is assumed that a shape of the handwritten fill-in range corresponding to the typed texts is rectangular and the handwritten fill-in range does not overlap with the typed texts and other handwritten fill-in ranges.
For example, under such conditions, it becomes possible to identify the range A 802 which is hatched with slash lines as the handwritten fill-in range corresponding to the typed texts “Name”. For example, the range A 802 can be defined by the coordinates (Xa,Ya), which is designated by the position of the typed texts 801, and the coordinates (Xb,Yb) which is designated by the positions of the typed texts 803 and 805.
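The coordinate-based derivation above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name and all coordinate values are assumptions, not values taken from the drawings.

```python
# Hedged sketch: derive a rectangular handwritten fill-in range from the
# positions of surrounding typed-text labels, as in the "Name" example.
# All coordinates here are made-up illustrations.

def fill_in_range(label_pos, right_label_pos, below_label_pos):
    """Return (x1, y1, x2, y2): the rectangle whose top-left corner is at
    the label and whose far corner is bounded by the next label to the
    right and the next label below."""
    xa, ya = label_pos
    xb = right_label_pos[0]   # right edge: x of the next label to the right
    yb = below_label_pos[1]   # bottom edge: y of the next label below
    return (xa, ya, xb, yb)

# "Name" label at (10, 20), next label to the right at (200, 20),
# next label below at (10, 60):
print(fill_in_range((10, 20), (200, 20), (10, 60)))  # (10, 20, 200, 60)
```

Under the stated assumptions (rectangular, non-overlapping ranges), the two corner coordinates fully determine the fill-in range.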
Further, in the case of a range C 806 and a range D 808 of
The interview paper ID 901 is identification information to identify the interview paper. The information of the typed texts 902, the handwritten fill-in range 903, etc., may differ depending on the interview sheet. Therefore, the type of the interview paper is managed based on the interview paper ID 901.
Further, the template 900 stores (includes) the typed texts 902 in association with the handwritten fill-in range 903, which is identified by the identification means 404. The template 900 is stored in a storage means such as the storage section 204 of
In step S1001, it is determined whether there exists an interview paper, etc., where no data (characters, etc.) are written, so that a new template 900 is registered (generated) based on the interview paper, etc. When it is determined that there exists such an interview paper, etc. (YES in step S1001), the process goes to step S1002. On the other hand, when there is no interview paper, etc., to be registered as a new template 900 (NO in step S1001), the process ends. Here, note that the interview paper is an example of a document to be processed. For example, the document may be a ledger paper or the like.
In step S1002, the reading means 410 reads the interview paper where no data (characters, etc.) are written, and converts the read data into image data.
In step S1003, the character recognition means 411 performs an OCR process on the image data acquired by the reading means 410 so as to acquire the character codes (text data) and the positions (coordinates, etc.) of the typed text.
In step S1004, based on the character codes and the positions of the typed text acquired by the character recognition means 411, the identification means 404 identifies the fill-in ranges where hand-written characters, etc., are to be written.
In step S1005, the identification means 404 generates the template 900 as illustrated in, for example,
In step S1006, the template 900 generated by the identification means 404 is stored in a storage means such as the storage section 204 of
The above process is repeated until no interview paper to be registered is left.
In the above description, a case is described where a template is generated by using an interview paper where no characters, etc., are yet written. However, note that it is also possible to generate a template by using an interview paper where characters, etc., are already written. In this case, in step S1003, the typed text (typed characters) and the handwritten characters are distinguished from each other, so that only the distinguished typed text is processed.
In this case, it is possible to distinguish the handwritten characters from the typed text by, for example, using the characteristic that the confidence rating score of character recognition becomes lower when handwritten characters are recognized by the OCR process. Further, it is possible to determine that handwritten characters are written when the confidence rating score of the character recognition is lower than a predetermined threshold value. In this regard, a recognition result with a low confidence rating score indicates a high likelihood that the characters are wrongly recognized. Therefore, it is not appropriate to use text (characters) having a low confidence rating score as text to be searched for, and it is desirable (convenient) not to use such a result (text) in order, for example, to avoid wasteful searching.
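The confidence-based separation can be illustrated with a minimal sketch. The (text, score) result format and the 0.8 threshold are assumptions for illustration; real OCR engines report confidence in engine-specific ways, and the threshold would need to be tuned per engine.

```python
# Minimal sketch: separate typed text from likely handwriting by OCR
# confidence score, assuming results arrive as (text, score) pairs with
# scores in [0, 1]. The threshold value is an assumption to be tuned.

CONFIDENCE_THRESHOLD = 0.8

def split_by_confidence(ocr_results, threshold=CONFIDENCE_THRESHOLD):
    """Return (typed, handwritten): results at or above the threshold are
    kept as typed text; lower-scoring results are treated as likely
    handwriting and excluded from the searchable index."""
    typed = [r for r in ocr_results if r[1] >= threshold]
    handwritten = [r for r in ocr_results if r[1] < threshold]
    return typed, handwritten

# Invented example results: the label "Name" scores high, a handwritten
# answer "Taro" scores low.
results = [("Name", 0.97), ("Taro", 0.42), ("Address", 0.95)]
typed, handwritten = split_by_confidence(results)
print([t for t, _ in typed])        # ['Name', 'Address']
print([t for t, _ in handwritten])  # ['Taro']
```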
By performing the process described above, it becomes possible to register the template 900 in the document processing system 100 by using, for example, an interview paper where no data (characters, etc.) are written or an interview paper where data (characters, etc.) are written.
In step S1101, it is determined whether there exists, for example, an interview paper, where data (characters, etc.) are written, to be newly stored. When it is determined that such an interview paper exists (YES in step S1101), a document ID is updated and the process goes to step S1102. On the other hand, when it is determined that no such interview paper exists (NO in step S1101), the process ends. Here, the document ID refers to identification information to identify the document image, so that different values are assigned (allocated) to different document images. Further, note that the interview paper is an example only. For example, the document image may be, for example, another ledger paper.
In step S1102, the reading means 410 reads the interview paper where data (characters, etc.) are written, and converts the read data into image data.
In step S1103, the character recognition means 411 performs the OCR process on the image data acquired by the reading means 410 so as to acquire the character codes (text data) and the positions (coordinates, etc.) of the typed text. In this case, the character recognition means 411 may distinguish the typed text from the handwritten characters, so that the OCR process is performed only on the typed text. Otherwise, the character recognition means 411 may perform the OCR process without distinguishing the typed text from the handwritten characters.
In step S1104, for example, the identification means 404 makes a comparison between the recognition result of the typed text by the character recognition means 411 (typed text, etc.) and the information of the template 900 (typed text, etc.), so as to determine which template 900 is being used.
As an example method to determine the template 900, it is possible to count the number of appearances of the character codes and words in each of the documents to be compared. In this method, when the counted numbers of appearances are regarded as the respective dimensions of a vector, the characteristics of the documents can be expressed by the respective vectors, so that a degree of similarity between two documents to be compared can be acquired based on the Euclidean distance between the vectors. Therefore, among a plurality of templates 900, it is possible to determine (regard) the template 900 that has the shortest Euclidean distance to the vector acquired from the document to be identified as the template 900 of (corresponding to) that document. Further, by using the positions of the characters in addition to the numbers of appearances, it becomes possible to identify the template 900 more accurately. Note that the method of determining the template 900 described above is an example only. Any other appropriate method of determining the template 900 may alternatively be used.
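The vector-comparison idea can be sketched as follows, using a simple bag-of-characters count per document. The template IDs and texts are invented for illustration; an actual system would build the vectors from the OCR character codes and could also weight in character positions, as noted above.

```python
# Sketch of template matching: count character appearances in each
# document, treat the counts as a vector, and pick the template whose
# vector has the shortest Euclidean distance to the document's vector.
from collections import Counter
import math

def char_vector(text):
    """Bag-of-characters vector: character -> number of appearances."""
    return Counter(text.replace(" ", ""))

def distance(v1, v2):
    """Euclidean distance between two count vectors (missing keys count as 0)."""
    keys = set(v1) | set(v2)
    return math.sqrt(sum((v1[k] - v2[k]) ** 2 for k in keys))

def best_template(document_text, templates):
    """Return the ID of the closest template by Euclidean distance."""
    doc_vec = char_vector(document_text)
    return min(templates,
               key=lambda tid: distance(doc_vec, char_vector(templates[tid])))

# Invented template texts for illustration:
templates = {
    "interview_v1": "Name Age Address Symptoms",
    "survey_v1": "Question Answer Comments",
}
print(best_template("Name Age Address Symptoms extra notes", templates))
# interview_v1
```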
In step S1105, based on the template 900 determined in step S1104, the identification means 404 identifies the handwritten fill-in ranges of the image acquired by the reading means 410.
In step S1106, based on the identified handwritten fill-in ranges in step S1105, the identification means 404 generates the association information 409 of the image acquired by the reading means 410 in a format similar to that of the template 900 of
In step S1107, the association information storage means 403 stores the association information 409, which is generated by the identification means 404, in a storage means such as the storage section 204, etc. Further, the document storage means 402 stores the image data, which are acquired by the reading means 410, in association with the association information 409 in a storage means such as the storage section 204, etc., as the document data 408. Here, the document data 408 are associated with the association information 409 based on, for example, the document ID, etc.
By performing the process described above, a user can store (accumulate) the document data 408 such as an interview paper, where data (characters, etc.) are written, and the association information 409 in the document processing system 100.
A user can use the terminal device 103, etc., to browse necessary information from the document images stored in the document processing system 100.
In step S1201, a user inputs a searched-for word (searched-for character string).
In step S1202, the search-for means 405 searches for the character string which includes or corresponds to the input searched-for character string from among the character strings stored in the association information 409.
In step S1203, the display control means 406 extracts images of the handwritten fill-in ranges corresponding to the character strings searched for by the search-for means 405. In step S1204, the display control means 406 displays the extracted images on the terminal device 103 or the like. In this case, the display control means 406 may reduce the size of the images to obtain a compact list display (e.g., a thumbnail display).
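Steps S1202 and S1203 can be sketched as a substring search over the stored association information. The in-memory data layout, document IDs, label texts, and coordinates below are illustrative assumptions; an actual system would read the association information 409 from the storage means.

```python
# Sketch of steps S1202-S1203: find the stored character strings that
# include the searched-for word, and collect the fill-in range
# associated with each hit. All data here is invented for illustration.

association_info = {
    "doc-001": [("Name", (10, 20, 200, 60)),
                ("Your address", (10, 60, 200, 100))],
    "doc-002": [("Name", (10, 20, 200, 60)),
                ("Name of your doctor", (10, 100, 200, 140))],
}

def search_fill_in_ranges(word):
    """Return (doc_id, typed_text, fill_in_range) for every stored
    character string that includes the searched-for word."""
    hits = []
    for doc_id, entries in association_info.items():
        for typed_text, fill_in_range in entries:
            if word.lower() in typed_text.lower():
                hits.append((doc_id, typed_text, fill_in_range))
    return hits

for hit in search_fill_in_ranges("name"):
    print(hit)
# Matches "Name" in both documents and "Name of your doctor" in doc-002.
```

The returned fill-in ranges are what the display control means would crop out of the stored document images for the list display.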
In step S1205, a user is prompted to choose an image from the list display, so that an image is selected by the user.
Here, for example, in a case where there are plural document images having different interview paper IDs in the document processing system 100, in step S1206, a document image having the same interview paper ID 901 as that of the selected image is searched for.
In step S1207, an image is extracted that is included in the document image extracted in step S1206 and that is the image of the fill-in range which is the same as that of the image selected in step S1205.
In step S1208, the display control means 406 displays the images extracted in step S1207 in a list display. In this case, the size of the images may be reduced to obtain a compact list display (e.g., a thumbnail display).
In step S1209, for example, a user of the terminal device 103 is prompted to choose an image, so that an image is selected by the user.
In step S1210, an (overall) document image including the image selected by the user in step S1209 is displayed on the terminal device 103, etc.
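The narrowing performed in steps S1206 and S1207 can be sketched as filtering the stored documents by interview paper ID and reusing the selected fill-in range for each of them. The record layout and IDs are assumptions for illustration only.

```python
# Sketch of steps S1206-S1207: once the user selects one fill-in-range
# image, gather the same range from every stored document that shares
# the selected document's interview paper ID. Records are invented.

documents = [
    {"doc_id": "doc-001", "paper_id": "interview-A"},
    {"doc_id": "doc-002", "paper_id": "interview-A"},
    {"doc_id": "doc-003", "paper_id": "interview-B"},
]

def same_range_candidates(selected_doc_id, selected_range):
    """Return (doc_id, fill_in_range) pairs for every document with the
    same interview paper ID as the selected document; each pair names a
    region to crop from that document's stored image."""
    paper_id = next(d["paper_id"] for d in documents
                    if d["doc_id"] == selected_doc_id)
    return [(d["doc_id"], selected_range)
            for d in documents if d["paper_id"] == paper_id]

print(same_range_candidates("doc-001", (10, 20, 200, 60)))
# Both doc-001 and doc-002 share interview paper "interview-A".
```

Because the fill-in range is defined by the shared template, the same rectangle can be cropped from every document image with that interview paper ID.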
Here, as one example of a specific searching for process, a process is described in a case where an interview paper where a specific person has written data (characters, etc.) is searched for.
For example, in order to see the name of the person who wrote an interview paper, a user inputs the character string "name" as the searched-for word by using the input means 414 of the terminal device 103. In response to the input, the display means 415 of the terminal device 103 displays a list including not only an image of the handwritten characters written in the "name" column of the interview paper but also, for example, an image of the handwritten characters written in a question column including the character string "name". The process corresponds to steps S1201 through S1204 of
Next, the user selects an image, which corresponds to the “name” column of the interview paper, by using the input means 414 or the like from among the images displayed in the list by the display means 415, so that the display means 415 displays only a list of the handwritten character images of the “name” column of the interview paper. The process corresponds to steps S1205 through S1208 of
Then, the user selects an image where the name of the specific person is handwritten from among the images in the list displayed by the display means 415. By doing this, the overall image of the interview paper including the selected image is displayed on the display means 415. The process corresponds to steps S1209 and S1210 of
According to an embodiment, by performing the processes described above, for example, it becomes possible to digitize a document, etc., including handwritten characters written in a predetermined format, so that the document, etc., can be used easily.
A user 1, who will store (accumulate) a document image in the document processing system 100, performs a predetermined operation by, for example, using the image forming apparatus 102 (step S1301). In response, the image forming apparatus 102 sends a start request to the server apparatus 101 (step S1302).
Upon receiving the start request, the server apparatus 101 starts, for example, execution of an application (step S1303). By the application, the server apparatus 101 sends a scan request to the image forming apparatus 102 to scan a document (step S1304).
Upon receiving the scan request, the image forming apparatus 102 reads (scans) the document (step S1305), and performs an OCR process on the read image data (step S1306). Further, the image forming apparatus 102 transmits the acquired document image and a result of the OCR process (i.e., text data, coordinate information, etc.) to the server apparatus 101 (step S1307).
Upon receiving the document image and the result of the OCR process from the image forming apparatus 102, the server apparatus 101 identifies the fill-in range by using the identification means 404 (step S1308), generates the association information 409 including the character string in association with the fill-in range (step S1309), and stores the association information 409. Further, the document storage (accumulation) means 402 performs a document storage (accumulation) process that stores document data 408 received from the image forming apparatus 102 (step S1310). Here, note that the OCR process (step S1306) of
A user 2, who will perform searching in the document processing system 100, inputs a searched-for word (searched-for character string) by using, for example, the terminal device 103 (step S1401). Here, it is assumed that, for example, a program corresponding to the document processing system 100 runs in the terminal device 103. When the searched-for word is input, the terminal device 103 transmits the searched-for word to the server apparatus 101 (step S1402).
Upon receiving the searched-for word, the search-for means 405 of the server apparatus 101 performs a text searching process to search for the text (step S1403), and transmits an image list, which is based on a result of the searching process, to the terminal device 103 (step S1404).
The terminal device 103 causes the display means 415 to display a list of the received images (step S1405), and prompts the user 2 to select an image. When the user 2 selects an image (step S1406), the terminal device 103 transmits information of the selected image to the server apparatus 101 (step S1407).
Upon receiving the information of the selected image, the server apparatus 101 extracts images of the same fill-in range as that of the selected image from the document images having the same interview paper ID, etc., as that of the selected image (step S1408), and transmits a list of the extracted images to the terminal device 103 (step S1409).
The terminal device 103 causes the display means 415 to display a list of the received images (step S1410), and prompts the user 2 to select an image. When the user 2 selects an image (step S1411), the terminal device 103 transmits information of the selected image to the server apparatus 101 (step S1412).
Upon receiving the information of the selected image from the terminal device 103, the server apparatus 101 reads the document image including the selected image (step S1413), and transmits the selected document image to the terminal device 103 (step S1414).
The terminal device 103 displays the document image, which is received from the server apparatus 101, on the display means 415 (step S1415).
By doing this, the user 2 can browse a desired document image with a simple operation.
Next, with reference to
The screen 1503 of
When one screen is selected in the screen 1601 of
When the “Print” button is selected, the terminal device 103 transmits a print request to print the displayed document image to the server apparatus 101. In accordance with the received print request, the server apparatus 101 transmits a print instruction to print the document image to the image forming apparatus 102. The image forming apparatus 102 performs printing based on the received print instruction.
When the “Whole Screen Display” button 1706 is selected, the terminal device 103 displays the document image by using the entire screen. When the “Cancel” button 1707 is selected, the terminal device 103 cancels the current process and, for example, displays the input screen to input a searched-for word as illustrated in
As described above, in the document processing system 100 according to an embodiment, it becomes possible to digitize the document, etc., including handwritten characters written in a predetermined format, so that the document, etc. can be used easily.
Although the invention has been described with respect to specific embodiments for a complete and clear disclosure, the appended claims are not to be thus limited but are to be construed as embodying all modifications and alternative constructions that may occur to one skilled in the art that fairly fall within the basic teaching herein set forth.
For example, in the above embodiment, as described with reference to
In this case, for example, in
Further, the document processing system 100 may be, for example, a document processing apparatus that is realized by a program running in the image forming apparatus 102, the terminal device 103, etc.