The present application claims priority to and incorporates by reference the entire contents of Japanese priority documents 2007-065668 filed in Japan on Mar. 14, 2007 and 2007-325144 filed in Japan on Dec. 17, 2007.
1. Field of the Invention
The present invention relates to a technology for processing a document image.
2. Description of the Related Art
A technology has been known for extracting line blocks by performing a line extraction process on a document image and then performing predetermined processing on the extracted line blocks.
For example, a “character direction identifying device, character processing device, program, and storage medium” for promptly identifying a character direction with small storage and a small number of calculations is disclosed in Japanese Patent Application Laid-open No. 2006-031546.
Furthermore, “language identification apparatus, program, and recording medium” for promptly discriminating languages used in a document is disclosed in Japanese Patent Application Laid-open No. 2005-063419.
Moreover, “method, device, and program for extracting title of document picture” for extracting a title candidate speedily and accurately is disclosed in Japanese Patent Application Laid-open No. 2003-058556.
However, in the technologies disclosed in Japanese Patent Application Laid-open No. 2006-031546 and Japanese Patent Application Laid-open No. 2005-063419, because a predetermined processing is performed on the whole image, the processing time tends to be long.
A technology for extracting an area from a document picture and performing character recognition on the extracted area is disclosed in Japanese Patent Application Laid-open No. 2006-031546. However, even if this technology is applied to typical character recognition, the character recognition needs to be performed on all the extracted areas. Therefore, processing time virtually the same as that for performing the character recognition on the whole document picture is required.
It is an object of the present invention to at least partially solve the problems in the conventional technology.
According to an aspect of the present invention, there is provided an image processing apparatus that includes an image processing unit that performs a predetermined image processing on each of a plurality of areas of an input image; a judging unit that judges whether a result of image processing performed by the image processing unit on an area satisfies a certain processing end condition; a stopping unit that causes the image processing unit to stop performing the image processing when judgment of the judging unit is affirmative; and an output unit that outputs the image processing result.
According to another aspect of the present invention, there is provided an image processing method including performing a predetermined image processing on each of a plurality of areas of an input image; judging whether a result of image processing performed at the performing on an area satisfies a certain processing end condition; stopping the performing when judgment at the judging is affirmative; and outputting a result of the image processing performed at the performing.
According to still another aspect of the present invention, there is provided a computer program product that realizes the above method on a computer.
The above and other objects, features, advantages and technical and industrial significance of this invention will be better understood by reading the following detailed description of presently preferred embodiments of the invention, when considered in connection with the accompanying drawings.
Exemplary embodiments of the present invention are explained in detail below with reference to the accompanying drawings.
When performing predetermined processing, such as optical character recognition (OCR) processing or character-orientation judging processing, on a document image using line data of the document image, an image processing apparatus 100 according to an embodiment can perform the predetermined processing at high speed, without requiring excessive computational resources, by performing the predetermined processing on the document image efficiently.
Among the functions realized by the CPU 1 executing various programs installed in the ROM 2, the characteristic functions of the image processing apparatus 100 are explained below.
The image receiving unit 10 receives a document image and stores the document image into the storage unit 3 as needed. The document image can be an image acquired by the scanner, an image created by using the PC, or an image received through the communication line. The document-image dividing unit 20 receives the document image from the image receiving unit 10. Alternatively, the document-image dividing unit 20 receives the document image from the storage unit 3.
The document-image dividing unit 20 divides the document image into a plurality of processing areas, and the processing unit 30 performs a predetermined processing, such as OCR processing and/or character-orientation judging processing, on the processing areas.
Specifically, the processing-area setting unit 21 of the document-image dividing unit 20 divides the document image into a plurality of processing areas and sets each area as a processing area. The document image can be divided by specifying the number of processing areas, or by specifying a width of processing areas.
Assuming that the document image is rectangular with one apex defined by coordinates (Xs, Ys) and the diagonally opposite apex defined by coordinates (Xe, Ye), the height H of the document image is Ye-Ys and the width W is Xe-Xs. By specifying a number Nbh of processing areas in the height direction of the document image, the document image can be divided into Nbh processing areas each having a length H/Nbh in the height direction. Alternatively, by specifying a number Nbw of processing areas in the width direction of the document image, the document image can be divided into Nbw processing areas each having a length W/Nbw in the width direction. The values Nbh and Nbw can be set as required at the time of dividing the document image.
Instead of specifying the number of divisions, a fixed length in the height direction or the width direction can be specified. By specifying a length H/Nbh in the height direction, the document image can be divided into Nbh processing areas. Alternatively, by specifying a length W/Nbw in the width direction, the document image can be divided into Nbw processing areas.
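The two division methods above can be sketched as follows. This is an illustrative sketch only; the function and variable names are not taken from the embodiment.

```python
def divide_by_count(ys, ye, nbh):
    """Divide the height range [Ys, Ye) of the document image into Nbh
    processing areas of equal height H/Nbh (illustrative sketch)."""
    step = (ye - ys) / nbh
    return [(ys + i * step, ys + (i + 1) * step) for i in range(nbh)]

def divide_by_length(ys, ye, length):
    """Divide the height range into processing areas of a fixed length;
    the last area may be shorter (illustrative sketch)."""
    areas, top = [], ys
    while top < ye:
        areas.append((top, min(top + length, ye)))
        top += length
    return areas
```

Division in the width direction using Nbw and W/Nbw is symmetric to the above.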
Subsequently, the block extracting unit 22 extracts blocks each of which circumscribes a black-pixel connected component, i.e., performs block extraction processing, and the line extracting unit 23 performs line extraction processing.
The line extraction processing performed by the line extracting unit 23 is explained below.
A basic operation (an image processing method) of the image processing apparatus 100 is explained below.
Then, in the processing unit 30, the predetermined processing is performed on every line extracted by the line extracting unit 23, and a result of the predetermined processing is stored. The control unit 33 then judges whether a non-processed line remains in the processing area. If the control unit 33 judges that a non-processed line remains, the processing unit 30 performs the predetermined processing on that line, and the control unit 33 makes the judgment again. If the control unit 33 judges that no non-processed line remains, the above operations are performed on the next processing area. In the present embodiment, the predetermined processing is performed for every line extracted by the line extracting unit 23; however, the predetermined processing can also be performed, for example, for every predetermined area. The predetermined area can be of any size as long as the OCR processing and the character (line) orientation judging processing can be performed on it.
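The control flow described above — process every line in the current processing area, then move to the next area — can be sketched as follows. The names are hypothetical; `extract_lines` and `process_line` stand in for the line extracting unit 23 and the processing unit 30.

```python
def process_document(areas, extract_lines, process_line):
    """For each processing area: extract its lines, apply the predetermined
    processing to every line, store the result, then move to the next area."""
    all_results = []
    for area in areas:
        line_results = []
        for line in extract_lines(area):      # while a non-processed line remains
            line_results.append(process_line(line))
        all_results.append(line_results)      # store the result for this area
    return all_results
```

Because only one area's lines are held at a time, the line buffers can be reused between areas, as noted below.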
The processing area is smaller than the whole document image, so the numbers of blocks and lines in the processing area are also smaller than those in the whole document image. Therefore, the areas secured in advance for storing block data and line data can be small, which is advantageous.
Moreover, by performing the predetermined processing on the lines for each processing area and storing a result of the predetermined processing for the whole document image, the areas for storing block data and line data can be reused repeatedly.
In the basic operation of the image processing apparatus 100, by dividing the document image into the processing areas, the document image can be subjected to the predetermined processing based on a result of the line extraction processing without preparing a large amount of computational resources for the maximum processing amount. Furthermore, an intermediate result can be analyzed by obtaining the result of the predetermined processing in units of processing areas, which makes it possible to decide to finish the operations early. Therefore, a user can obtain a desired result quickly with a small amount of computational resources, resulting in improved usability.
The control unit 33 (a judging unit) judges, every time the predetermined processing is performed on a line extracted by the line extracting unit 23, whether a result of the predetermined processing satisfies a processing end condition for the processing by the OCR unit 31 or the character-orientation judging unit 32. In the present embodiment, the control unit 33 makes this judgment every time the predetermined processing is performed on a line; however, for example, the control unit 33 can instead judge whether a result of the predetermined processing on a predetermined area satisfies the processing end condition every time the predetermined area is processed.
The processing end condition includes a first case in which a desired result is obtained during the predetermined processing by the OCR unit 31 or the character-orientation judging unit 32, and a second case in which the document image is found not to be suitable as a target for the processing. The first case includes, for example, a case in which a character string (character data) obtained by the OCR processing coincides with a preset keyword, and a case in which the number of lines whose character orientation (arrangement data of blocks in lines) has been judged exceeds a threshold number of lines with which the character orientation can be judged with a predetermined reliability. In these cases, the predetermined processing in operation by the OCR unit 31 or the character-orientation judging unit 32 can be ended before the whole document image is processed, so that the time required for the predetermined processing can be shortened.
The second case includes, for example, a case in which the number of blocks in a line in the processing area (a predetermined area) exceeds a threshold number of blocks that can exist in a line in the processing area. In such a case, the control unit 33 judges that the document image is not a text image, and the predetermined processing in operation by the OCR unit 31 or the character-orientation judging unit 32 can be ended before the whole document image is processed, so that unnecessary processing can be avoided.
When a result of the predetermined processing satisfies the processing end condition, the control unit 33 (a stopping unit) stops the predetermined processing on non-processed lines in the processing area. Then, the control unit 33 (an output unit) outputs the result of the predetermined processing on the processed lines as a result of the predetermined processing for the processing area. Furthermore, the control unit 33 instructs the OCR unit 31 or the character-orientation judging unit 32 to transfer the processing to the next processing area.
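The end-condition check can be sketched as follows. The two early returns correspond to the first case (a desired result, here a keyword match, is obtained) and the second case (the image is judged unsuitable because a line contains too many blocks). The names and the dictionary layout of a line are illustrative assumptions.

```python
def process_lines_with_end_conditions(lines, ocr, keyword, max_blocks):
    """Apply OCR line by line, stopping early when a processing end
    condition is satisfied (illustrative sketch)."""
    results = []
    for line in lines:
        if line["num_blocks"] > max_blocks:
            return results, "not_text"        # second case: unsuitable target
        text = ocr(line)
        results.append(text)
        if keyword in text:
            return results, "keyword_found"   # first case: desired result
    return results, "completed"               # all lines processed
```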
Examples of processing procedures (image processing methods) are explained based on the basic operation of the image processing apparatus 100 according to the embodiment. In the present embodiment, when the processing end condition is satisfied, the predetermined processing on a line is ended. The operation of ending the predetermined processing includes a case, explained in a first example, in which the predetermined processing in operation on a processing area is ended and the processing is then transferred to the next processing area, and a case, explained in a second example, in which the predetermined processing in operation on a processing area is ended and the entire operation is then finished without transferring the processing to the next processing area.
A processing procedure in the first example is explained referring to
The document image stored in the storage unit 3 by the image receiving unit 10 is read out from the storage unit 3 and is input to the document-image dividing unit 20, or the document image is directly input to the document-image dividing unit 20 by the image receiving unit 10 (Step S1).
The document-image dividing unit 20 sets processing areas by the processing-area setting unit 21. Specifically, the processing-area setting unit 21 divides the document image into a plurality of areas, and temporarily stores the divided areas as the processing areas (Step S2). At this time, a number is assigned to each processing area, and a first processing area is processed (Steps S3 and S4).
Before performing the block extraction processing (Step S6) and the line extraction processing (Step S7) on the first processing area, the document-image dividing unit 20 judges whether the whole document image (all the processing areas) has been processed (Step S5). If there is a plurality of processing areas, the document-image dividing unit 20 judges that the whole document image has not yet been processed, i.e., a non-processed processing area remains (“No” at Step S5), and the system control proceeds to Step S6.
The document-image dividing unit 20 performs the block extraction processing on the first processing area by the block extracting unit 22 (Step S6). Specifically, the document-image dividing unit 20 extracts blocks each circumscribing a pixel connected component, and records coordinates of the blocks. Thereafter, the document-image dividing unit 20 performs the line extraction processing on the first processing area by the line extracting unit 23 (Step S7). Specifically, the document-image dividing unit 20 couples adjacent blocks to form lines, and records coordinates of the lines.
The processing unit 30 performs the predetermined processing such as the OCR processing by the OCR unit 31 and the character-orientation judging processing by the character-orientation judging unit 32 on the lines extracted by the line extraction processing (Step S8).
The processing unit 30 judges whether the predetermined processing is performed on all the lines in the first processing area (Step S9).
When the processing unit 30 judges that the predetermined processing is performed on all the lines in the first processing area (“Yes” at Step S9), a result of the predetermined processing is stored in the storage unit 3 (Step S11). Then, a second processing area is taken as a target (Step S12), and the processing from Step S4 is performed on the second processing area in the same manner as the above.
When the processing unit 30 judges that the predetermined processing is not performed on all the lines in the first processing area (“No” at Step S9), the processing unit 30 judges whether the processing end condition to end the predetermined processing in operation is satisfied (Step S10).
As described above, the processing end condition includes the case in which a desired result can be obtained while the predetermined processing is performed, examples of which are a case in which a result of the OCR processing coincides with the preset keyword, and a case in which the number of lines whose character orientation is judged exceeds the predetermined threshold.
The character-orientation judging processing by the character-orientation judging unit 32 is explained in detail.
After a line is extracted, the character-orientation judging unit 32 calculates the height of each block in the line, and estimates the maximum line height for the case in which the line is skewed or all the blocks in the line are small. The maximum block height hs in the line is multiplied by a predetermined value A (e.g., 1.2), and the product is compared with the actual line height H. If the value calculated by multiplying the maximum block height hs by the predetermined value A is larger than the actual line height H, the maximum block height hs is regarded as the actual line height H. Next, a base line of the line is determined by calculating a regression line of the end points Ye of the blocks in the line. At this time, only the end points Ye located below half the height of the line are used. The calculated regression line is regarded as the base line of the line. Then, the blocks in the line are aligned according to the start points Ys of the blocks. The arrangement data of the aligned blocks is quantized to convert the blocks into a symbol sequence, and an appearance probability of the symbol sequence is calculated for each possible character orientation.
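Two of the steps described above — the line-height estimation and the base-line regression — might be sketched as follows. The block representation (a pair of a horizontal coordinate and an end point Ye) and all names are illustrative assumptions, not the embodiment's actual data structures; larger y means lower on the page, as in image coordinates.

```python
def estimate_line_height(block_heights, line_height, a=1.2):
    """If the maximum block height hs times the predetermined value A
    exceeds the recorded line height H, use hs as the line height
    (sketch of the heuristic for skewed lines or small blocks)."""
    hs = max(block_heights)
    return hs if hs * a > line_height else line_height

def base_line(blocks, line_height):
    """Least-squares regression line through the lower end points Ye of
    the blocks, using only end points below half the line height.
    Returns (slope, intercept) of the fitted base line."""
    pts = [(x, ye) for x, ye in blocks if ye > line_height / 2]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope, (sy - slope * sx) / n
```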
When the processing unit 30 judges that the processing end condition is not satisfied (“No” at Step S10), the system control returns to Step S8, and the processing unit 30 continues the predetermined processing on non-processed lines in the first processing area.
When the processing unit 30 judges that the processing end condition is satisfied (“Yes” at Step S10), the processing unit 30 ends the predetermined processing in operation, and the system control proceeds to Step S11, at which a result of the predetermined processing performed thus far is stored in the storage unit 3. Then, the second processing area is taken as a target (Step S12), and the processing from Step S4 is performed on the second processing area in the same manner as above. In other words, when it is judged that the processing end condition is satisfied, the predetermined processing in operation for the processing area is ended, and the target for processing is transferred to the next processing area.
When it is judged that the whole document image (all the processing areas) has been processed, i.e., no non-processed processing area remains (“Yes” at Step S5), the entire operation is finished.
A processing procedure in the second example is explained referring to
The processing at Steps S21 to S32 shown in
If the blocks extracted as shown in
In a third example, as shown in
A processing procedure in the third example is explained referring to
In the third example, the predetermined processing is performed considering both horizontal writing and vertical writing, and when it is judged that the processing end condition is satisfied, the predetermined processing in operation on a processing area is ended and the processing is transferred to the next processing area.
The document image stored in the storage unit 3 by the image receiving unit 10 is read out from the storage unit 3 and is input to the document-image dividing unit 20, or the document image is directly input to the document-image dividing unit 20 by the image receiving unit 10 (Step S41).
The document-image dividing unit 20 sets a line direction in which the predetermined processing is performed (a processing direction) to the horizontal direction by a line direction setting unit (not shown) (Step S42).
The document-image dividing unit 20 sets processing areas according to the line direction (the horizontal direction) by the processing-area setting unit 21. Specifically, the document-image dividing unit 20 divides the document image into a plurality of areas in the line direction, and temporarily stores the divided areas as the processing areas (Step S43). At this time, a number is assigned to each processing area, and a first processing area is processed (Steps S44 and S45).
Before performing the block extraction processing (Step S47) and the line extraction processing (Step S48) on the first processing area, the document-image dividing unit 20 judges whether the whole document image (all the processing areas) has been processed (Step S46). If there is a plurality of processing areas, the document-image dividing unit 20 judges that the whole document image has not yet been processed, i.e., a non-processed processing area remains (“No” at Step S46), and the system control proceeds to Step S47.
The document-image dividing unit 20 performs the block extraction processing on the first processing area by the block extracting unit 22 (Step S47). Specifically, the document-image dividing unit 20 extracts blocks each circumscribing a pixel connected component, and records coordinates of the blocks. Thereafter, the document-image dividing unit 20 performs the line extraction processing on the first processing area by the line extracting unit 23 (Step S48). Specifically, the document-image dividing unit 20 couples adjacent blocks to form lines, and records coordinates of the lines.
The processing unit 30 performs the predetermined processing such as the OCR processing by the OCR unit 31 and the character-orientation judging processing by the character-orientation judging unit 32 on the lines extracted by the line extraction processing (Step S49).
The processing unit 30 judges whether the predetermined processing is performed on all the lines in the first processing area (Step S50).
When the processing unit 30 judges that the predetermined processing is performed on all the lines in the first processing area (“Yes” at Step S50), a result of the predetermined processing is stored in the storage unit 3 (Step S52). Then, a second processing area is taken as a target (Step S53), and the processing is performed on the second processing area from Step S45 in the same manner as the above.
When the processing unit 30 judges that the predetermined processing is not performed on all the lines in the first processing area (“No” at Step S50), the processing unit 30 judges whether the processing end condition to end the predetermined processing in operation is satisfied (Step S51).
As described above, the processing end condition includes the case in which a desired result can be obtained while the predetermined processing is performed, examples of which are a case in which a result of the OCR processing coincides with the preset keyword, and a case in which the number of lines whose character orientation is judged exceeds the predetermined threshold.
When the processing unit 30 judges that the processing end condition is not satisfied (“No” at Step S51), the system control returns to Step S49, and the processing unit 30 continues the predetermined processing on non-processed lines in the first processing area.
When the processing unit 30 judges that the processing end condition is satisfied (“Yes” at Step S51), the processing unit 30 ends the predetermined processing in operation, and the system control proceeds to Step S52, at which a result of the predetermined processing performed thus far is stored in the storage unit 3. Then, the second processing area is taken as a target (Step S53), and the processing from Step S45 is performed on the second processing area in the same manner as above. In other words, when it is judged that the processing end condition is satisfied, the predetermined processing in operation on the processing area is ended, and the target for processing is transferred to the next processing area.
When it is judged that the whole document image (all the processing areas) has been processed, i.e., no non-processed processing area remains (“Yes” at Step S46), the document-image dividing unit 20 sets the line direction in which the predetermined processing is performed to the vertical direction by the line direction setting unit (Step S54).
The document-image dividing unit 20 sets processing areas according to the line direction (the vertical direction) by the processing-area setting unit 21 (Step S55). Thereafter, the processing at Steps S56 to S65 is performed in the same manner as those at Steps S44 to S53.
When it is judged that the whole document image (all the processing areas) has been processed, i.e., no non-processed processing area remains (“Yes” at Step S58), the entire operation is finished.
A processing procedure in a fourth example is explained referring to
In the fourth example, the predetermined processing is performed considering both horizontal writing and vertical writing, and when it is judged that the processing end condition is satisfied, the predetermined processing in operation on a processing area is ended and the processing is not transferred to the next processing area.
The processing at Steps S71 to S83 shown in
It is sufficient that the upper limit of the number of blocks in the processing area is set based on the case in which characters of the normally used minimum size fill the processing area. However, the number of blocks may exceed the upper limit when the document image is not a text image but, for example, a pointillist drawing or a background.
The binary halftone image shown in
To solve this problem, the control unit 33 judges whether the processing end condition is satisfied between the block extraction processing and the line extraction processing in each of the flowcharts in
When the processing end condition is satisfied, the control unit 33 judges that the document image in the processing area is not a text document, and the processing is transferred to the next processing area without performing the line extraction processing on that processing area. In other words, if the control unit 33 judges that a result of the line extraction processing cannot be obtained, the processing in operation is ended to avoid unnecessary computations, and the processing is transferred to the next processing area. Therefore, the line extraction processing can be performed effectively in terms of processing speed and computational resources. When the processing end condition is not satisfied, the line extraction processing is performed on the processing area.
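The check inserted between block extraction and line extraction can be sketched as follows. The names are illustrative; `None` marks an area judged non-text, for which line extraction is skipped and the next area is processed.

```python
def extract_lines_with_block_check(area, extract_blocks, extract_lines, max_blocks):
    """Perform block extraction, then judge the end condition before line
    extraction: if the block count exceeds the upper limit, the area is
    judged non-text and line extraction is skipped (illustrative sketch)."""
    blocks = extract_blocks(area)
    if len(blocks) > max_blocks:
        return None                  # not a text area; move to the next area
    return extract_lines(blocks)
```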
Characters are rarely present uniformly over the whole document image, so it is not appropriate to decide to end the processing based on a processing result of the block extraction processing of only one line (or one processing area). Therefore, the control unit 33 determines a processing result only when it coincides with a processing result of another line (or another processing area), so that the possibility of incorrectly judging the document image based on a local processing result can be reduced.
The condition of the document image can be recognized more clearly by increasing the number of times that a processing result of a line (or a processing area) needs to coincide with that of another line (or another processing area) (hereinafter, “the number of times of coincidence”), resulting in increased reliability of a processing result. A user can specify the number of times of coincidence as a processing-result determining condition using the keyboard 6 based on the reliability that the user requires, so that a processing result can be determined early while the desired reliability is secured, which is advantageous. The control unit 33 determines a processing result of a line (or a processing area) when the number of times of coincidence reaches the specified number of times.
In the case of specifying the number of times of coincidence, a large difference occurs in processing time between the following two cases A and B.
A: a processing result of a first processing line (or a first processing area) coincides with a processing result of the next processing line (or the next processing area)
B: a processing result of the first processing line (or the first processing area) coincides with a processing result of the last processing line (or the last processing area)
Therefore, to reliably shorten the processing time, it is also effective for the control unit 33 to judge the necessity of performing the predetermined processing on other lines (or other processing areas) based on the condition that processing results of consecutively processed lines (or processing areas) coincide with each other. With this condition, it is sufficient to check the number of times of continuous coincidence, so that the processing result can be determined without waiting for processing results obtained after the number of times of continuous coincidence is satisfied.
Even when a user specifies the number of times of continuous coincidence as the processing-result determining condition, a processing result can be determined early while desired reliability is secured.
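The continuous-coincidence condition can be sketched as follows: the result is finalized as soon as it has coincided over the required number of consecutive lines (or processing areas), without waiting for later results. The names are illustrative.

```python
def determine_by_continuous_coincidence(results, required):
    """Return the finalized result once `required` consecutive processing
    results coincide, or None if not yet determined (illustrative sketch)."""
    run = 1
    for prev, cur in zip(results, results[1:]):
        run = run + 1 if cur == prev else 1
        if run >= required:
            return cur               # determined early
    return None                      # not yet determined
```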
A pointillist drawing as shown in
The size of the block to be excluded can be set in advance based on the range of the size of a target character desired by a user, and can be proportionally changed according to the resolution. For example, if the user targets only large size characters, the size of the block to be excluded can be set large.
After the block extraction processing, blocks that are smaller than the block to be excluded are eliminated from the targets for processing.
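The exclusion of small blocks can be sketched as follows; the exclusion size is given at a base resolution and scaled proportionally with the image resolution. The names and the (width, height) block representation are illustrative assumptions.

```python
def filter_small_blocks(blocks, exclude_size, dpi, base_dpi=300):
    """Eliminate blocks smaller than the exclusion size, scaled in
    proportion to the resolution (illustrative sketch; blocks are
    (width, height) pairs, exclude_size is given at base_dpi)."""
    threshold = exclude_size * dpi / base_dpi
    return [b for b in blocks if b[0] >= threshold and b[1] >= threshold]
```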
Furthermore, any one of the above-described image processing methods can be easily embodied by recording a computer program describing the processing procedures in a general programming language on any kind of storage medium, such as a flexible disk, a CD-ROM, a DVD-ROM, or a magneto-optical disc (MO), and allowing a PC of the image processing apparatus to read the computer program. The computer program can also be directly read by PCs of image processing apparatuses 200 and 300 through a network such as the Internet or an intranet as shown in
According to the embodiment and the examples, the image processing is performed for each predetermined processing area of an input image as a target for processing. Every time the image processing is performed on a processing area of the input image, it is judged whether the processing end condition to end the image processing on the processing area is satisfied. When a result of the image processing satisfies the processing end condition, the image processing on non-processed processing areas is stopped. Therefore, the processing speed is further improved.
The embodiment and the examples described above are useful in a document processor such as an image forming apparatus and a scanner, and are especially suitable for adapting to an image processing apparatus (a document processor) without large storage capacity.
Although the invention has been described with respect to specific embodiments for a complete and clear disclosure, the appended claims are not to be thus limited but are to be construed as embodying all modifications and alternative constructions that may occur to one skilled in the art that fairly fall within the basic teaching herein set forth.
Number | Date | Country | Kind |
---|---|---|---
2007-065668 | Mar 2007 | JP | national |
2007-325144 | Dec 2007 | JP | national |