1. Field of the Invention
The present invention relates to an image processing device for processing document images.
2. Description of the Related Art
Recently, as networks such as the Internet have become widespread, documents are commonly distributed in electronic form. On the other hand, distribution in paper form is still common. Thus, various techniques have been provided for obtaining reusable electronic data from a document even when it exists only on paper.
For example, a technique is known in which a document image obtained by scanning a paper document is sent from a terminal to a server, subjected to character-recognition on the server, converted into a reusable format, and then returned to the terminal (see the Japanese Patent Laid-Open No. H11-167532 (1999)).
Additionally, another technique is known that divides a document image into areas according to their type and outputs them individually (see the Japanese Patent Laid-Open No. H09-091450 (1997)).
Although the format in which users want to reuse data depends on the situation, the format should make it easy for users to extract data. In addition, because character recognition techniques have certain limitations, characters may be recognized incorrectly. If the recognition of reusable content is inaccurate, the content will be awkward for users to use. In the technique disclosed in the Japanese Patent Laid-Open No. H11-167532 (1999), only data in character format is reusable, and the information is converted without regard to recognition accuracy. For paper documents, however, not only the content itself but also the layout and positional relationships of the content often carry important meaning for reuse.
Additionally, the technique disclosed in the Japanese Patent Laid-Open No. H09-091450 (1997) divides a document into contents and outputs them individually, so the relationships between them may be lost.
Furthermore, it is another challenge to enable users to easily reuse character images contained in an image both as vector data and, after character recognition, as character codes.
An image processing device of the present invention includes: an analyzing unit for analyzing an image and extracting character areas; a character recognition unit for recognizing characters in the character areas extracted by the analyzing unit and obtaining character code data of the recognition result; a vectorization unit for vectorizing the character areas extracted by the analyzing unit and obtaining vector drawing data; a storage location determination unit for determining, in an electronic document, storage locations of the character code data obtained by the character recognition unit and the vector drawing data obtained by the vectorization unit; and an electronic document generation unit for generating the electronic document having the character code data and the vector drawing data in the storage locations determined by the storage location determination unit.
According to the techniques of the present invention, users can easily utilize desired data since both the character code data of the recognition result of the character areas and the vector drawing data are provided. In addition, the recognition result can be provided in a re-editable form. Furthermore, as the vector drawing data is arranged in the same layout as the input image, users can view the positional relationships between the vector drawing data of each character area in an understandable manner and thus can easily browse or edit the data.
Further features of the present invention will become apparent from the following description of exemplary embodiments (with reference to the attached drawings).
In the following, a preferred embodiment of the present invention will be described with reference to the drawings. It will be understood, however, that the configurations described in the embodiment are exemplary only and that the scope of the present invention is not limited to these configurations.
[System Configurations]
First, the whole configuration of a system including an image processing device according to the first embodiment of the present invention will be described with reference to
In
A personal computer (PC) 120 is connected to an image processing device 100 via a network such as LAN 110 and receives data sent from the image processing device 100. A display and edit program 121 executed on PC 120 can display the received electronic document on the screen and edit the document, and another reuse program 122 may utilize portions of the document on the PC.
[Overview of Image Processing Device 100]
Next, the overview of image processing device 100 according to the first embodiment will be described with reference to
In the embodiment, although each process illustrated in
In
An electronic document generation unit (electronic document generation program) 210 generates an electronic document from the above input image.
An electronic document 220 is generated by electronic document generation unit 210.
The blocks 211 to 215 schematically illustrate each process executed by electronic document generation unit 210, which are described in more detail below.
An analyzing unit 211 analyzes input document images to extract their character areas.
A character recognition unit 212 character-recognizes the character areas extracted by analyzing unit 211 and converts the recognized characters into character code data.
A vectorization unit 213 extracts outline information of each character contained in the character areas extracted by analyzing unit 211 and converts the extracted outline information into vector drawing data.
A storage location determination unit 214 determines, in electronic document 220, the storage locations of the character code data obtained from the character recognition unit 212 and of the vector drawing data obtained from the vectorization unit 213.
A format conversion unit 215 converts the formats of the character code data and the vector drawing data into those of electronic document 220 according to the storage locations determined by the storage location determination unit 214, thereby generating electronic document 220.
The electronic document 220 generated by the electronic document generation unit 210 is composed of each of the data 221 to 224 described below, and may be displayed and edited by display and edit program 121 in PC 120.
Layout editing data 221 is used by display and edit program 121 and stores vector drawing data 223 generated by the vectorization unit 213. In addition, text editing data 222 is used by display and edit program 121 and stores character code data 224 resulting from the recognition by character recognition unit 212.
[Processing Example by an Electronic Document Generation Unit]
In the following, a case in which the electronic document generation unit 210 processes input image 300 of
The analyzing unit 211 extracts character areas in input image 300 using well-known image analyzing techniques. Specifically, the analyzing unit 211 extracts the pixel clusters that compose each character in the image, and then extracts, as a character area, an area in which each cluster has the size of a character and the clusters are aligned at least vertically or horizontally. Any well-known method may be used to extract the pixel clusters of characters, including: extracting clusters of similarly colored pixels from the input multi-valued image; extracting clusters of black pixels obtained by binarizing the multi-valued image; or extracting pixels enclosed by edges obtained by generating differential edge information from the multi-valued image.
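The cluster extraction and alignment grouping described above can be sketched as follows. This is a simplified illustration rather than the claimed implementation: the binary-image representation, the function names, and the merging step are all assumptions made for the example.

```python
from collections import deque

def find_pixel_clusters(img):
    """Label 4-connected clusters of foreground (1) pixels in a binary
    image given as a list of rows; return each cluster's bounding box
    as (top, left, bottom, right)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                seen[y][x] = True
                q = deque([(y, x)])
                t = b = y
                l = r = x
                while q:
                    cy, cx = q.popleft()
                    t, b = min(t, cy), max(b, cy)
                    l, r = min(l, cx), max(r, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((t, l, b, r))
    return boxes

def character_area(boxes):
    """Merge aligned, character-sized clusters into one character-area
    bounding box (a simplification of the alignment test described
    above)."""
    t = min(b[0] for b in boxes)
    l = min(b[1] for b in boxes)
    bo = max(b[2] for b in boxes)
    r = max(b[3] for b in boxes)
    return (t, l, bo, r)
```

In practice the alignment test would also verify that the clusters have comparable sizes and lie on a common baseline; the merge above only computes the enclosing box.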
In the example of image 300 illustrated in
Now, an exemplary character recognition process will be briefly described. However, it will be understood that the description is an example only, and any other recognition processes may be used for the purpose.
In a character recognition process, it is first determined whether the characters in a character area are written vertically or horizontally, that is, their line direction. To achieve this, a method may be used that binarizes the image, generates its vertical and horizontal projections, and determines the direction whose projection has the lower variance to be the line direction.
Then the image is divided into separate character images. This may be achieved by finding line breaks using the horizontal projection of the binary image and dividing the image into line images, and then finding character breaks using the vertical projection and dividing each line image into character images.
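The direction test and line division just described can be illustrated with projection profiles. The sketch below is a simplified rendering under the assumption of a binary image given as a list of rows; the function names are inventions of this illustration.

```python
def variance(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / len(v)

def line_direction(img):
    """Pick the axis whose projection varies less as the line direction:
    for horizontally written text the per-column projection is nearly
    uniform, while the per-row projection alternates between text lines
    and blank gaps (high variance)."""
    row_proj = [sum(row) for row in img]        # foreground count per row
    col_proj = [sum(col) for col in zip(*img)]  # foreground count per column
    return 'horizontal' if variance(col_proj) <= variance(row_proj) else 'vertical'

def split_lines(img):
    """Divide a horizontally written image into line images at all-blank
    rows (valleys of the row projection); splitting each line into
    character images with the column projection works the same way."""
    row_proj = [sum(row) for row in img]
    lines, start = [], None
    for y, count in enumerate(row_proj):
        if count and start is None:
            start = y
        elif not count and start is not None:
            lines.append(img[start:y])
            start = None
    if start is not None:
        lines.append(img[start:])
    return lines
```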
Then a predetermined feature amount is calculated for each character image. A dictionary pre-storing the feature amounts of all characters is then searched for the character most similar to the character image. The character code of that character is the recognition result for the character image.
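The dictionary search could look like the following sketch. The coarse 2x2 grid of pixel counts is a toy stand-in for a real feature amount, and the dictionary format (a list of character/feature-vector pairs) is an assumption of this illustration.

```python
def features(char_img):
    """Toy feature amount: foreground-pixel counts over a 2x2 grid of
    the character image. Real systems use far richer features."""
    h, w = len(char_img), len(char_img[0])
    f = []
    for ys in (range(0, h // 2), range(h // 2, h)):
        for xs in (range(0, w // 2), range(w // 2, w)):
            f.append(sum(char_img[y][x] for y in ys for x in xs))
    return f

def recognize(char_img, dictionary):
    """Return the character code whose stored feature vector is most
    similar (smallest squared distance) to the input's features."""
    f = features(char_img)
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(f, entry[1]))
    return min(dictionary, key=dist)[0]
```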
In the example illustrated in
The vectorization unit 213 utilizes well-known vectorization techniques to generate vector drawing data of characters from the image data in each character area. The methods disclosed in Japanese Patent No. 3026592 and Japanese Patent Laid-Open No. 2005-346137 are examples of such vectorization techniques.
For example, the method disclosed in Japanese Patent No. 3026592 raster-scans an image and detects horizontal and vertical inter-pixel vectors based on the state of a pixel of interest and its neighboring pixels. Next, by extracting outlines of the image data based on the connection states between the inter-pixel vectors, it generates information called an outline vector, which describes the outline of connected pixel data as a set of inter-pixel vectors. In addition, the method disclosed in Japanese Patent Laid-Open No. 2005-346137 generates vector description data that remains high quality even when significantly magnified, by approximating outline vectors with straight lines or with quadratic or cubic Bezier curves.
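The inter-pixel vectors can be pictured as unit-length edges on the pixel grid between foreground and background. The sketch below only collects those edges; chaining them head-to-tail would then yield the closed outline vectors. The coordinate convention and function name are assumptions of this illustration, not the patented procedure.

```python
def boundary_edges(img):
    """Collect unit-length boundary edges between foreground pixels and
    the background (or the image border) of a binary image - a
    simplified stand-in for the horizontal and vertical inter-pixel
    vectors described above. Each edge is a (start, end) pair of grid
    points (x, y), oriented so the edges chain into closed loops."""
    h, w = len(img), len(img[0])
    bg = lambda y, x: not (0 <= y < h and 0 <= x < w and img[y][x])
    edges = set()
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            if bg(y - 1, x):                               # top side
                edges.add(((x, y), (x + 1, y)))
            if bg(y + 1, x):                               # bottom side
                edges.add(((x + 1, y + 1), (x, y + 1)))
            if bg(y, x - 1):                               # left side
                edges.add(((x, y + 1), (x, y)))
            if bg(y, x + 1):                               # right side
                edges.add(((x + 1, y), (x + 1, y + 1)))
    return edges
```

For a single foreground pixel this yields four edges forming one closed loop around the pixel, which is the degenerate case of an outline vector.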
The storage location determination unit 214 determines the storage locations of character code data and vector drawing data based on a determination rule specifying data storage locations such as illustrated in
The format conversion unit 215 converts the formats of the character code data and the vector drawing data according to the storage locations determined by the storage location determination unit 214 to generate electronic document 220.
[Specific Example of an Electronic Document]
Now, a specific example of an electronic document according to the first embodiment of the present invention will be described with reference to
Electronic document 500 illustrated in
A partition 501 in
On the other hand, a partition 502, enclosed by elements <Text> and </Text>, contains text editing data of an electronic document. The partition 502 also contains character code data 504 obtained as a result of character recognition.
Here, the names and data structures of the elements comply with the specification of display and edit program 121. Depending on the type of display and edit program 121, the electronic document may be output with names or data structures other than those illustrated in
When display and edit program 121 is the “PowerPoint” application program of MICROSOFT CORPORATION, vector drawing data 503 is displayed in a slide area (slide pane) and the character recognition result is displayed in a note area (note pane). In this case, electronic document 500 is in the “pptx” format.
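Such a two-partition document could be assembled with a standard XML library as in the sketch below. The <Layout> and <Text> element names follow the sample partitions described above, while the root element name, the path attribute, and the function signature are assumptions of this illustration.

```python
import xml.etree.ElementTree as ET

def build_electronic_document(vector_paths, recognized_text):
    """Assemble an electronic document with vector drawing data under
    <Layout> and character code data under <Text>. Element names mirror
    the sample; they are not a fixed specification."""
    doc = ET.Element('Document')          # root name assumed
    layout = ET.SubElement(doc, 'Layout')
    for d in vector_paths:
        # one drawing element per character outline, as path data
        ET.SubElement(layout, 'Path', {'d': d})
    text = ET.SubElement(doc, 'Text')
    text.text = recognized_text
    return ET.tostring(doc, encoding='unicode')
```

A real converter targeting “pptx” would instead emit the Office Open XML package structure expected by the display and edit program.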
[An Example Screen by a Display and Edit Program]
Next, an example screen of electronic document 500 of
Display window 601 shows the whole display window produced by display and edit program 121. Display window 601 includes layout editing window 602, text editing window 603, and summary information display window 604. If display and edit program 121 is a PowerPoint program, layout editing window 602 corresponds to a slide pane, text editing window 603 to a note pane, and summary information display window 604 to an outline pane.
Layout editing window 602 displays the content of layout editing data 221 in electronic document 220. As for electronic document 500 of
On layout editing window 602, a user can scale objects drawn with vector drawing data or change the color information of the objects. Further, the user can store an edited electronic document or print an electronic document exactly as displayed in layout editing window 602.
Text editing window 603 displays the content of text editing data in electronic document 220 as text data. As for electronic document 500 of
A user may use the content displayed on text editing window 603 as complementary information, such as annotations or comments on the document displayed on layout editing window 602. Furthermore, the content displayed on window 603 may be edited as text data.
Summary information window 604 displays, of the content of layout editing data 221 in electronic document 220, character information of data used for summary information. However, as data used for summary information does not exist in electronic document 500 of
As described above, the image processing device according to the embodiment generates electronic data in which information obtained by vectorizing the character areas of an image and information obtained by recognizing the characters in the image are stored in respective locations. As both the character code data and the vector drawing data generated from the input image are presented on the display screen of display and edit program 121, a user can immediately utilize both kinds of data. That is, in the embodiment, both the character code data of the recognition result of the character areas and the vector drawing data may be listed and provided in a re-editable manner. Furthermore, as the vector drawing data is arranged in the same layout as the input image, the positional relationships between the vector drawing data of the character areas are displayed in an easily understood manner, and the user can easily browse or edit the data.
Next, an image processing device according to a second embodiment of the present invention will be described with reference to
Similarly to
Input image 700 is inputted from scanner 101.
An electronic document generation unit 710 generates an electronic document from the input image 700.
An electronic document 720 is generated by the electronic document generation unit 710. Blocks 711 to 717 in
An analyzing unit 711 analyzes input document images and extracts their character areas.
A character recognition unit 712 character-recognizes the character areas extracted by the analyzing unit 711 and converts the recognized characters into character code data.
A character layout extracting unit 713 extracts character layout information from the character areas extracted by the analyzing unit 711. While this information can include, for example, the coordinates of a character area and the font, size, or color of the characters, at least the coordinates of the character area and the size of the characters are extracted here.
A vectorization unit 714 extracts outline information of each character contained in the character areas extracted by the analyzing unit 711 and converts the extracted information into vector drawing data.
A background generation unit 715 generates data for drawing the area other than the character areas in an input image, i.e., the background.
A storage location determination unit 716 determines the storage locations of data in the electronic document 720. In this case, the target data includes: character code data output by the character recognition unit 712; character drawing data with layout, obtained by combining the character code data and the character layout information; and vector drawing data output by the vectorization unit 714.
A format conversion unit 717 converts the character code data, the character drawing data with layout, the vector drawing data, and the background drawing data into the respective formats of the electronic document 720.
The electronic document 720 generated by the electronic document generation unit 710 is composed of data 721 to 727 described below and may be displayed and edited by the display and edit program 121 in PC 120.
The layout editing data 721 is used in the display and edit program 121. The data 721 includes two types of character drawing data with layout 723 and 724, background drawing data 725 and vector drawing data 726.
Meanwhile, text editing data 722 is used in the display and edit program 121, which is configured by character code data 727.
[Processing Example by an Electronic Document Generation Unit]
In the following, a case in which electronic document generation unit 710 processes input image 800 of
First, the analyzing unit 711 extracts character areas in input image 700. The process is similar to that of the analyzing unit 211 in
As for input image 800, areas 801, 802 and 803 enclosed by dotted lines are to be extracted as character areas.
The character recognition unit 712 recognizes characters in each character area to generate character code data. The process is similar to that of character recognition unit 212 in
Provided, from the example illustrated in
Then, the character layout extracting unit 713 obtains character layout information of each character area. The information extracted here is the coordinate information of the character area and the size information of the characters. Any well-known extraction technique may be used to obtain the character layout information. For example, the coordinate information of a character area may be obtained when the analyzing unit 711 extracts the area, and the character size may be obtained as the average of the character sizes generated when character recognition unit 712 recognizes each character, although any other method may be used for the purpose.
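A minimal sketch of this extraction step follows; the bounding-box representation (top, left, bottom, right) and the dictionary keys are assumptions made for the illustration.

```python
def character_layout_info(area_box, char_boxes):
    """Character layout information as described above: the character
    area's coordinates plus the average size (height, width) of its
    characters. Boxes are (top, left, bottom, right) tuples."""
    n = len(char_boxes)
    avg_h = sum(b[2] - b[0] + 1 for b in char_boxes) / n
    avg_w = sum(b[3] - b[1] + 1 for b in char_boxes) / n
    return {'area': area_box, 'char_size': (avg_h, avg_w)}
```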
Then, the vectorization unit 714 generates vector drawing data of characters from the image data in each character area. The process is similar to that of the vectorization unit 213 in
Then, the background generation unit 715 generates background drawing data by changing the color of the pixels of the character parts in a character area of the input image into a color analogous to that of the background surrounding the area. In other words, by overwriting the character pixels of the input image with the color of the surrounding area, a background image not including the characters of the input image can be generated. Any well-known technique may be used for the process of background generation unit 715. For example, the Japanese Patent Laid-Open No. 2007-272601 discloses a technique that vectorizes characters in an input image and fills the pixels in the characters with a background color.
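The overwrite can be sketched as follows for a grayscale image. Averaging all non-character pixels is a crude stand-in for the analogous-color fill described above, and the mask representation is an assumption of this illustration.

```python
def generate_background(img, char_mask):
    """Overwrite character pixels (mask == 1) with a color resembling
    the surroundings - here simply the average intensity of all
    non-character pixels, a simplification of the in-painting above."""
    h, w = len(img), len(img[0])
    others = [img[y][x] for y in range(h) for x in range(w)
              if not char_mask[y][x]]
    fill = sum(others) // len(others)
    return [[fill if char_mask[y][x] else img[y][x]
             for x in range(w)] for y in range(h)]
```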
Then, the storage location determination unit 716 determines the storage locations of character code data, character drawing data with layout, background drawing data and vector drawing data based on a determination rule specifying data storage locations, for example, as illustrated in
According to the determination rule of
In determining according to the determination rule of
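The rule can be pictured as a simple threshold on the character recognition rate. The threshold value below is an assumption (the embodiment only states that the storage location depends on the rate), while the element names follow the second embodiment's <HighRecognitionText>/<LowRecognitionText> convention.

```python
# Hypothetical threshold; the specification does not fix a value.
SIMILARITY_THRESHOLD = 0.8

def storage_element(recognition_rate):
    """Map a character area's recognition rate to the element that holds
    its character drawing data with layout: high-confidence text goes to
    <HighRecognitionText> (shown in the summary pane), low-confidence
    text to <LowRecognitionText> (hidden from it)."""
    if recognition_rate >= SIMILARITY_THRESHOLD:
        return 'HighRecognitionText'
    return 'LowRecognitionText'
```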
The format conversion unit 717 converts each data item into a format in accordance with the storage location determined by the storage location determination unit 716 to build electronic document 720.
[Specific Example of an Electronic Document]
Now, a specific example of an electronic document according to the second embodiment of the present invention will be described with reference to
Electronic document 1000 illustrated in
Partition 1001 enclosed by elements <Layout> and </Layout> contains layout editing data of an electronic document. Partition 1001 also contains character drawing data 1 with layout 1003, character drawing data 2 with layout 1004, background drawing data 1005, and vector drawing data 1006.
Partition 1002 enclosed by elements <Text> and </Text> contains text editing data of an electronic document and character code data 1007.
Here, similarly to the first embodiment, the name and data structure of elements of
[An Example Screen by a Display and Edit Program]
Next, an example screen of the display and edit program 121 will be described with reference to
A display window 1101 shows the whole display window produced by the display and edit program 121. The display window 1101 includes a layout editing window 1102, a text editing window 1103, and a summary information display window 1104.
A layout editing window 1102 displays the content of layout editing data 721 in the electronic document 720.
In the electronic document 1000 of
Note here that the background drawing data 1005 will be placed on the character drawing data with layout 1003 and 1004. In other words, the character drawing data with layout 1003 and 1004 will be displayed with the background drawing data 1005 overlaid thereon. Therefore, a user can see the vector drawing data 1006 drawn on a background based on the background drawing data 1005. Consequently, image information whose appearance is identical to that of input image 800 of
Similarly to the first embodiment described previously, in the layout editing window 1102, a user can scale objects drawn with vector drawing data or change the color information of the objects. In addition, the user can store an edited electronic document or print an electronic document exactly as it appears in the layout editing window 1102.
The text editing window 1103 displays the content of text editing data in electronic document 720 as text data.
As for exemplary electronic document 1000 of
The summary information window 1104 displays, out of the content of the layout editing data 721 in the electronic document 720, character code information contained in character drawing data 1 with layout (723).
Note that a display and edit program according to the embodiment defines strings of layout editing data enclosed by elements <HighRecognitionText> and </HighRecognitionText> as character drawing data 1 with layout, to be displayed in the summary information window. Further, strings of layout editing data enclosed by elements <LowRecognitionText> and </LowRecognitionText> are defined as character drawing data 2 with layout, not to be displayed in the summary information window. Consequently, as for the electronic document 1000 of
As described above, an image processing device according to the embodiment generates electronic data in which character drawing data with layout is stored in a storage location determined according to its character recognition rate. For the electronic data generated in this way, the display and edit program 121 also provides summary information for data with a high character recognition rate, so a user can more efficiently reuse highly reliable information. In other words, in the embodiment, the user can reuse data with a high character recognition rate without hesitation and in an easily browsable manner. Furthermore, the character drawing data with layout (i.e., the character codes resulting from character recognition) is drawn on the layout editing window, so that when a text search is performed, the found characters are highlighted and the user can easily locate the desired text on the screen.
Here, in the case of the display and edit program 121 being a PowerPoint program, the layout editing window 602 corresponds to a slide pane, text editing window 603 corresponds to a note pane, and summary information display window 604 corresponds to an outline pane. In this case, an electronic document is generated in which character strings whose recognition results have higher similarity are written in the layout editing data part, so that only such strings are displayed in the outline pane.
Now, the third embodiment of the present invention will be described.
In the above second embodiment, the storage location determination unit 716 has determined storage locations according to the determination rule of
In the third embodiment, character drawing data with layout whose character codes have a higher character recognition rate is stored as character drawing data 1 with layout (723) in the layout editing data 721 of the electronic document 720. On the other hand, character drawing data with layout whose character codes have a lower character recognition rate is stored in the same location, as character drawing data 1 with layout (723), after its character code information has been eliminated.
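This behavior can be sketched as a small transformation. The dictionary shape and the threshold value are assumptions of this illustration; only the removal of the character code information for low-rate areas mirrors the third embodiment.

```python
def third_embodiment_entry(chars, recognition_rate, threshold=0.8):
    """Per the third embodiment: areas with a high recognition rate keep
    their character codes, while low-rate areas are stored in the same
    location with the character code information removed, leaving an
    editable empty line for the user to fill in. The threshold is an
    assumed value for illustration."""
    return {'type': 'character_drawing_data_1_with_layout',
            'chars': chars if recognition_rate >= threshold else ''}
```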
[Specific Example of an Electronic Document]
Now, a specific example of an electronic document according to the third embodiment of the present invention will be described with reference to
Electronic document 1300 illustrated in
Partition 1301 is used for storing layout editing data of electronic document 1300, which stores, as character drawing data 1 with layout (723), data 1303 and 1305 indicated in
[An Example Screen by a Display and Edit Program]
Next, an example screen processed by display and edit program 121 will be described with reference to
A layout editing window 1402 and a text editing window 1403 each display information similar to
A summary information display window 1404 displays empty line 1405 between recognized character areas 801 and 803. The line 1405 corresponds to data in the character area 802 that was incorrectly recognized as “THIS IS A SAMBLE OF AN ELECTRONIC DOOUMENT”. A user can insert the string “THIS IS A SAMPLE OF AN ELECTRONIC DOCUMENT” into the empty line to round out the summary information. The user can do this manually or by copying the recognition result of the original string from the text editing window 1403 into empty line 1405.
In addition, as the character codes in the layout editing window are linked to the corresponding character codes in the summary information window, when characters are inserted into empty line 1405, they are also inserted into the corresponding partition 1406 of the area for character drawing with layout. Here, the characters in the layout editing window are not viewable at this time, since they are hidden behind the background image. However, by bringing the drawing area of those characters in front of the background and the drawing area of the vector drawing data, the input string becomes available, complete with the character layout information, i.e., the position and size extracted from the original image.
As described above, an image processing device according to the embodiment generates electronic data that stores character drawing data with layout whose output format is changed according to its character recognition rate. For the electronic data generated in this manner, the display and edit program 121 intentionally does not present character information with a low character recognition rate on the layout editing window and instead provides editable empty lines, so that users can easily edit or rebuild the electronic data. That is, in the embodiment, data with a low character recognition rate can be provided in a modifiable form.
In the case of the display and edit program 121 being a PowerPoint program, layout editing window 602 corresponds to a slide pane, text editing window 603 corresponds to a note pane, and summary information display window 604 corresponds to an outline pane. In this case, an electronic document is generated with its data written in the layout editing data partition such that strings whose character recognition results have higher similarity are displayed in the outline pane, while empty lines are displayed in the outline pane in place of character areas having lower similarity.
Various embodiments of the present invention have been described above.
It should be noted that objects of the present invention may be achieved by a system or computer device (or CPU or MPU) reading a program code from a storage medium that stores the code and executing the code. The program code embodies the steps of the flowchart illustrated in the embodiments described above. In this case, the program code itself read from the storage medium causes the computer to realize functions of the above embodiments. Therefore, the code and a computer readable storage medium storing the code also fall within the scope of the present invention.
Storage media such as, for example, a Floppy® disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, non-volatile memory card, and ROM may be used to provide the program code.
Additionally, the functions of the embodiments previously described may be realized by a computer reading and executing a program. Furthermore, the execution of a program also encompasses an operating system running on the computer performing part or all of the actual processing based on instructions of the program.
Still further, functions of the embodiments described above may also be realized by a function enhancement board inserted into a computer or a function enhancement unit coupled to the computer. In this case, first, a program read from a storage medium will be written in a memory in the board or unit. Thereafter, a CPU in the board or unit will perform a part or all of actual processes based on instructions of the program. Functions of the embodiments previously described may also be performed by such board or unit.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2008-171178, filed Jun. 30, 2008, which is hereby incorporated by reference herein in its entirety.
This application is a division of application Ser. No. 12/486,245, which was filed on Jun. 17, 2009, and which is now U.S. Pat. No. 8,571,359.
Foreign Patent Documents:
Japanese Patent No. 3026592 (May 1992)
Japanese Patent Laid-Open No. H09-091450 (April 1997)
Japanese Patent Laid-Open No. H11-167532 (June 1999)
Japanese Patent Laid-Open No. 2005-346137 (December 2005)
Japanese Patent Laid-Open No. 2007-272601 (October 2007)
Japanese Patent Laid-Open No. 2008-092419 (April 2008)
This application was published as US 2014/0023272 A1 in January 2014. Related U.S. application data: parent application Ser. No. 12/486,245 (filed June 2009); child application Ser. No. 14/034,723.