The present invention relates to an image correction apparatus and an image correction method for correcting a slant or meandering of a character row and the like produced in an image obtained by shooting an original such as a document by a hand scanner and the like.
Conventionally, various technologies for performing character recognition by shooting an original such as a document by a scanner and the like and performing OCR (Optical Character Recognition) using the shot image have been proposed.
Specifically, in an apparatus in which a user moves a relatively compact scanner such as a hand scanner over the original to shoot an image, it is difficult, depending on how the user handles the scanner, to keep the scanning direction constant relative to the alignment direction of the characters etc. of the original. On this account, a slant or meandering sometimes occurs in the resultant shot image compared to the original. As a result, when the degree of the slant or meandering is large, there is a problem that the characters cannot be cut out correctly and the character recognition rate is reduced.
As a method for correcting such slant or meandering of an image, a method using a projection profile has been proposed (for example, see Publication of Japanese Patent No. 3108979). In this method, a character image is made into image data consisting of an aggregate of pixels arranged in a two-dimensional manner, and the brightness value of each pixel is binarized to form two-dimensional binarized image data. A large number of mutually parallel scan lines are then set and scanned, the data representing the character image in the binarized image data is accumulated with respect to each scan line, and the distribution of the accumulated values in a direction perpendicular to the scan lines is obtained as the projection profile. An amount of rotation correction is then obtained based on the dispersion value of the projection profile.
However, in the image correction method as described above, since shift is performed in units of character elements, the meandering with respect to each character can be corrected but the distortion of the character element itself cannot be corrected, and thus, there is a problem that suitable character recognition cannot be performed in the subsequent processing such as OCR.
In light of these problems, an object of the invention is to provide an image correction method capable of correcting a slant or meandering of a character element row as well as the distortion of the character element itself.
An image correction apparatus of the invention is characterized by including: an image input part to which an image including plural character element rows is input; a row detection part for detecting a predetermined character element row from the plural character element rows; a correction amount calculating part for performing calculation of a position correction amount in a column direction with respect to each pixel column on the predetermined character element row; and a position correction part for correcting a position of each pixel column of the image so as to move it in the column direction based on the position correction amount calculated with respect to each pixel column in a predetermined direction.
According to such constitution, correction is performed on all of the pixel columns constituting the image to move them in the column direction, and thereby, not only the meandering and slant of character element rows can be corrected, but also the distortion of each character element can be corrected.
Further, the row detection part may have a histogram generating part for generating an integrated histogram along a row direction of the image, and detect the longest character element row based on the integrated histogram.
According to such constitution, by the simple processing of calculating the integrated histogram of the image, the predetermined character element row the position correction amount of which should be calculated can be selected while suppressing the load on the computation part.
Furthermore, the row detection part may have a pixel position extracting part for extracting a pixel position where a value of the integrated histogram generated in the histogram generating part becomes the maximum, and detect the longest character element row based on the pixel position.
According to such constitution, the predetermined character element row can be detected by detecting character element row including the pixel position where the value of the integrated histogram becomes the maximum.
Moreover, the row detection part may have a range identifying part for identifying a pixel position range where the value of the integrated histogram falls within a predetermined range as the longest character element row from the pixel position extracted in the pixel position extracting part.
According to such constitution, the predetermined character element row can be identified simply and clearly by determining the predetermined range in advance.
Further, the correction amount calculating part may have an end position detection part for detecting an end position in the column direction with respect to each character element of the predetermined character element row, and calculate the position correction amount based on a displacement amount of the end position with respect to each character element row.
According to such constitution, since the processing for detecting the end position in the column direction with respect to each character element of the predetermined character element row is performed, the load on the computing part can be reduced compared to the case where computation processing is performed with respect to all of the character element rows in the image.
Furthermore, the correction amount calculating part may calculate the displacement amount based on an envelope curve connecting the end positions detected by the end position detection part with respect to each character element.
According to such constitution, the displacement amount can be calculated by the simple processing of calculating the envelope curve on the predetermined character element row with respect to each character element.
Further, the image correction apparatus may include: an image input part to which an image including plural character element rows is input; a histogram generating part for generating an integrated histogram along a row direction of the image; a pixel position extracting part for extracting a pixel position where a value of the integrated histogram generated in the histogram generating part becomes the maximum; a range identifying part for identifying a pixel position range where the value of the integrated histogram falls within a predetermined range as a range of the longest character element row from the pixel position extracted in the pixel position extracting part; an end position detection part for detecting an end position in the column direction in the image with respect to each character element of the longest character element row; a position correction amount calculating part for calculating a displacement amount of the end positions with respect to each character element row based on an envelope curve connecting the end positions detected by the end position detection part with respect to each character element row; and a position correction part for correcting the image with respect to each pixel column so as to move it in the column direction based on the position correction amount.
According to such constitution, not only the meandering and slant of character element rows can be corrected, but also the distortion of each character element can be corrected.
Next, the image correction apparatus of the invention is characterized by including: an image input part to which a first image including plural character element rows is input; an expanded row generating part for generating a second image including plural expanded rows by expanding the first image in a row direction; a starting position detection part for detecting a starting position of the expanded row in the column direction with respect to each pixel column of the second image; a correction amount calculating part for calculating a position correction amount in a column direction with respect to each pixel column of the second image; and a position correction part for correcting a position of each pixel column of the first image so as to move it in the column direction based on the position correction amount.
According to such constitution, since the starting position of the expanded row in the column direction is detected to determine the range of the pixel positions constituting the character element row, there is a lower possibility that the character element rows superpose and the character element rows can be separated with higher accuracy compared to the case where the range of existence of the entire character is detected. Therefore, even when the character element rows are shot slanted to some extent, the slant as well as the meandering can be corrected simultaneously.
Further, the second image may be a binarized image having brightness values expressed by a value of 0 or 1.
According to such constitution, the amount of memory used can be reduced, processing can be performed rapidly, and the load on the computing part can be reduced, and thereby, mounting on portable information equipment and the like can be made easier.
Furthermore, the starting position detection part may perform detection of the starting position of the expanded row in the column direction by, while moving a pixel of interest in the column direction, detecting a brightness value of the pixel of interest with respect to each pixel column, and, when equal to or more than a predetermined number of pixels having brightness values of 0 continue, setting a position where the pixel having the brightness value of 0 is detected for the first time as the starting position.
According to such constitution, since the possibility that the noise information due to dirt or the like is regarded as a character element can be made lower, more suitable image correction can be performed.
Moreover, the starting position detection part may perform detection of the starting position of the expanded row in the column direction with respect to each of plural expanded rows, and the correction amount calculating part may calculate the position correction amount based on an average value of a starting position distribution of each of the plural expanded rows in the column direction.
According to such constitution, the influence by the character such as “j” or “p” protruding more downward than other characters is reduced, and thereby, more suitable image correction can be performed.
Next, the image correction apparatus of the invention is characterized by including: an image input part to which a binarized first image including plural character element rows is input; an expanded row generating part for generating a second image including plural expanded rows by expanding the first image in a row direction; a starting position detection part for, while moving a pixel of interest in a column direction, detecting a brightness value of the pixel of interest with respect to each pixel column of the second image, and, when equal to or more than a predetermined number of pixels having brightness values of 0 continue, detecting a position where the pixel having the brightness value of 0 is detected for the first time as a starting position of the expanded row; a correction amount calculating part for calculating a position correction amount with respect to each pixel column of the second image based on an average value of a starting position distribution of the plural expanded rows in the column direction; and a position correction part for correcting the first image with respect to each pixel column so as to move it in the column direction based on the position correction amount.
According to such constitution, since the starting position of the expanded row is detected to determine the range of the lower end position of the character element row, there is a lower possibility that the character element rows superpose and the character element rows can be separated with higher accuracy compared to the case where the range of existence of the entire character is detected. Therefore, even when the character element rows are shot slanted to some extent, the slant as well as the meandering can be corrected simultaneously.
Next, information equipment and a cellular phone device including the image correction apparatus of the invention may be provided.
According to such constitution, characters and the like can be input after their slant, meandering, and character element distortion have been corrected for easy character recognition etc. Therefore, especially in information equipment and cellular phone devices equipped with a character recognition function such as OCR, the accuracy of character reading can be made higher.
Next, the image correction method of the invention is characterized by including: a first step for detecting a predetermined character element row from an image including plural character element rows; a second step for calculating a position correction amount with respect to each pixel column of the predetermined character element row; and a third step for correcting the image with respect to each pixel column so as to move it in the column direction based on the position correction amount.
According to such constitution, correction is performed on all of the pixel columns constituting the image to move them in the column direction, and thereby, not only the meandering and slant of character element rows can be corrected, but also the distortion of each character element can be corrected.
Further, the image correction method of the invention may include: a first step for generating a second image including plural expanded rows by expanding a first image including plural character element rows in a row direction; a second step for detecting a starting position of the expanded row in a column direction with respect to each pixel column of the second image; and a third step for correcting a position of the first image so as to allow starting positions of the expanded row in the column direction to align with each other based on information of the starting positions of the expanded row in the column direction.
According to such constitution, since the starting position of the expanded row in the column direction is detected to determine the range of the lower end positions constituting the character element row, there is a lower possibility that the character element rows superpose and the character element rows can be separated with higher accuracy compared to the case where the range of existence of the entire character is detected. Therefore, even when the character element rows are shot slanted to some extent, the slant as well as the meandering can be corrected simultaneously.
Hereinafter, embodiments of the invention will be described in detail using the drawings.
(First Embodiment)
First, as the first embodiment of the invention, an image correction apparatus and an image correction method of the invention will be described.
As shown in
As image input part 1, a device selected from devices such as an optical device used for the publicly known hand scanner can be used.
Storage means 3 is connected to CPU 2 and, as a storage medium thereof, the publicly known flash memory and the like can be used.
Display part 5 can arbitrarily be selected from the publicly known display devices, for example, an LCD (Liquid Crystal Display), an EL (Electro-Luminescent), a CRT (Cathode Ray Tube), etc.
Next, the processing steps of the image correction method in the first embodiment of the invention will be described according to
First, an image shot in image input part 1 (hereinafter, referred to as “original image”) is loaded (developed) in storage means 3 via the CPU 2 (S1).
An example of original image 10 is shown in
Further, in the embodiment, as image shooting means in image input part 1, a CCD of 256×16 pixels is used. Furthermore, as storage means 3 for image development, a frame memory of the horizontal direction relative to the sheet surface of
Then, CPU 2 executes processing of correcting the slant of the entire image on original image 10 stored in storage means 3 (S2). The invention places no restriction on the slant correction processing, and publicly known methods can be used for it. For example, JP-A-1-156887 discloses a method of rotating original image 10 to plural angles, calculating a histogram along the row direction at each angle, and determining the angle at which the width of the histogram becomes the minimum as the angle to which original image 10 should be rotated. Such a method can be used, or any other publicly known method may be used.
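The angle search mentioned for step S2 can be illustrated with a minimal sketch. It is not the method of JP-A-1-156887 itself: instead of resampling the image at each angle, it projects the coordinates of the black pixels, and it takes the "width" of the histogram to be the number of occupied horizontal lines. The function names, the candidate angle list, and the pixel coding (0 = black, 1 = white) are assumptions made for illustration.

```python
import math

BLACK = 0  # assumed pixel coding: 0 = black, 1 = white


def histogram_width(image, angle_deg):
    """Count the horizontal lines occupied by black pixels after a trial rotation.

    Only the row coordinate of each black pixel is rotated, so this is a
    projection of the pixel coordinates rather than a resampled image.
    """
    s = math.sin(math.radians(angle_deg))
    c = math.cos(math.radians(angle_deg))
    occupied = set()
    for y, line in enumerate(image):
        for x, pixel in enumerate(line):
            if pixel == BLACK:
                occupied.add(round(y * c - x * s))
    return len(occupied)


def estimate_slant(image, candidate_angles):
    """Return the candidate angle (degrees) that gives the narrowest histogram."""
    return min(candidate_angles, key=lambda a: histogram_width(image, a))
```

For example, `estimate_slant(image, range(-15, 16))` tries whole-degree angles between -15 and 15 degrees; a finer or coarser step is a trade-off between accuracy and load on the CPU.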
In
By the comparison between slant corrected image 11 and original image 10, it is seen that the slant of the entire image has been corrected, but the above described meandering remains. As below, a method for correcting meandering of an image in the embodiment will be described.
Turning to
Note that, in the specification, the horizontal pixel alignment in the image data constituting original image 10 is referred to as “horizontal line”, and the vertical pixel alignment is referred to as “vertical line”.
As seen from
Subsequently, CPU 2 calculates the vertical pixel position where the number of black pixels becomes the maximum from the horizontal integrated histogram that has been calculated in the above described step S3, and regards the crest part including the maximum value as the longest character row (S5). In the embodiment, since crest part B includes the vertical pixel position where the number of black pixels becomes the maximum, character row B is regarded as the longest character row.
Further, CPU 2 determines the vertical width of the corresponding character row with respect to crest part B, which has been regarded as the longest (S5). Specifically, a vertical pixel position range where the number of black pixels becomes a predetermined ratio, R % relative to the maximum value (the range shown by W in
By the processing steps heretofore, longest character row range W along the vertical direction for determining an amount to be meandering corrected can be determined.
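The determination of longest character row range W described above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a list of horizontal lines, a black pixel is coded as 0, the crest is grown outward from the peak while the count stays at or above the given ratio of the maximum, and the names `horizontal_histogram` and `longest_row_range` are hypothetical.

```python
BLACK = 0  # assumed pixel coding: 0 = black, 1 = white


def horizontal_histogram(image):
    """Count the black pixels on each horizontal line."""
    return [sum(1 for pixel in line if pixel == BLACK) for line in image]


def longest_row_range(hist, ratio_percent):
    """Return (top, bottom) of the crest that contains the histogram maximum.

    The range is grown upward and downward from the peak position while the
    count stays at or above ratio_percent percent of the maximum value.
    """
    peak = max(range(len(hist)), key=lambda y: hist[y])
    threshold = hist[peak] * ratio_percent / 100.0
    top = peak
    while top > 0 and hist[top - 1] >= threshold:
        top -= 1
    bottom = peak
    while bottom < len(hist) - 1 and hist[bottom + 1] >= threshold:
        bottom += 1
    return top, bottom
```

Because only one histogram and one outward scan are needed, the cost is a single pass over the image plus a pass over the histogram.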
Next, in the vertical pixel position range determined as longest character row range W, CPU 2 scans slant corrected image 11 shown in
As seen from
Turning to
Then, CPU 2 calculates the amount to be vertically displaced with respect to each vertical line based on the vertical position displacement amount shown in
Thus, according to the image correction method or the image correction apparatus of the invention, the black pixel lower end position is detected with respect to each area regarded as one character, an envelope curve connecting the adjacent black pixel lower end positions is formed, and all of the vertical lines are displaced based on the vertical position displacement amount. Therefore, the distortion of the character element itself can also be improved.
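The lower end detection, envelope curve, and per-line displacement can be sketched as below. Several details are assumptions made only for illustration: a character area is taken to be a run of adjacent vertical lines containing black pixels, each lower end is placed at the center line of its area, the envelope is piecewise linear and flat beyond the outermost characters, the lower end of the leftmost character is used as the reference, and 0 codes a black pixel. All function names are hypothetical.

```python
BLACK, WHITE = 0, 1  # assumed pixel coding


def character_lower_ends(image, top, bottom):
    """Return [(center_line, lower_end_row), ...], one entry per character area."""
    width = len(image[0])
    ends, run = [], []
    for x in range(width + 1):          # one extra step flushes the last run
        lowest = None
        if x < width:
            for y in range(bottom, top - 1, -1):   # scan upward from the bottom
                if image[y][x] == BLACK:
                    lowest = y
                    break
        if lowest is not None:
            run.append((x, lowest))
        elif run:
            center = (run[0][0] + run[-1][0]) // 2
            ends.append((center, max(y for _, y in run)))
            run = []
    return ends


def envelope(ends, width):
    """Piecewise-linear curve through the lower ends, flat beyond both ends."""
    curve = []
    for x in range(width):
        if x <= ends[0][0]:
            curve.append(float(ends[0][1]))
        elif x >= ends[-1][0]:
            curve.append(float(ends[-1][1]))
        else:
            for (x0, y0), (x1, y1) in zip(ends, ends[1:]):
                if x0 <= x <= x1:
                    curve.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
                    break
    return curve


def shift_columns(image, shifts):
    """Move every vertical line by shifts[x] rows (positive = downward)."""
    height, width = len(image), len(image[0])
    out = [[WHITE] * width for _ in range(height)]
    for x in range(width):
        for y in range(height):
            ny = y + shifts[x]
            if 0 <= ny < height:
                out[ny][x] = image[y][x]
    return out


def correct_meandering(image, top, bottom):
    """Displace all vertical lines so that the envelope becomes level."""
    ends = character_lower_ends(image, top, bottom)
    if not ends:
        return [line[:] for line in image]
    curve = envelope(ends, len(image[0]))
    reference = curve[0]   # assumed reference: lower end of the leftmost character
    shifts = [round(reference - value) for value in curve]
    return shift_columns(image, shifts)
```

Because the displacement varies from one vertical line to the next inside a single character, the shape of the character itself is adjusted, not only its position.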
By the way, in the above description, the example in which the respective processing steps from step S2 to step S8 are realized with software is shown. However, the invention is not limited to that, and at least one step from step S1 to step S8 may be realized with hardware having a function of each step.
In
As described above, according to the image correction apparatus or the image correction method of the embodiment, since the correction is performed with respect to the entire image by detecting the longest character row by horizontal histogram calculation and detecting the displacement amount to be corrected with respect to the character row, the processing can be performed more rapidly compared to the case where the displacement correction is performed with respect to all of the character lines that constitute the image.
Next, a specific example in which the recognition accuracy of the original is improved by the image correction apparatus or the image correction method of the embodiment will be described.
The correct reading rate is calculated by performing OCR processing in CPU 2 based on the image stored in storage means 3 and calculating the rate of correctly recognized characters in the result. As samples, 20 randomly selected business cards are used. The number of characters shot and subjected to OCR is 390 characters of telephone numbers and 1026 characters of mail addresses and URLs.
First, for the telephone numbers in the business cards, the correct reading rate is improved by about 20% by the image correction apparatus or the image correction method of the embodiment, relative to the correct reading rate when no correction is performed.
Further, for the mail addresses and URLs in the business cards, the correct reading rate is likewise improved by about 25% relative to the correct reading rate when no correction is performed, and a higher correct reading rate can be obtained.
Furthermore, for all of the telephone numbers, mail addresses and URLs in the business cards taken together, the correct reading rate is improved by about 23% relative to the correct reading rate when no correction is performed. It is conceivable that this is because, according to the image correction apparatus or the image correction method of the invention, not only the meandering of the character rows but also the distortion of the image of the character itself can be corrected.
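As a rough consistency check of the figures above, the overall improvement can be estimated from the two groups. The text gives only approximate values and does not state how the overall figure was obtained, so this sketch assumes that each improvement is a gain in percentage points and that the groups are weighted by their character counts; the function name is hypothetical.

```python
def combined_improvement(groups):
    """Character-count-weighted mean of per-group gains (percentage points)."""
    total = sum(count for count, _ in groups)
    return sum(count * gain for count, gain in groups) / total


# 390 telephone-number characters at about 20, 1026 address/URL characters at about 25
overall = combined_improvement([(390, 20.0), (1026, 25.0)])
```

Under these assumptions the weighted value comes to a little under 24, which is of the same order as the stated overall improvement of about 23%.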
(Second Embodiment)
Next, as the second embodiment of the invention, another example of the image correction apparatus or the image correction method of the invention will be described.
As shown in
Note that, the image information of pixels in the invention refers to various kinds of information such as brightness information, color information, and density information with respect to pixels that constitute the image, and, in the embodiment, the brightness information of pixels is used.
As image input part 101, a device arbitrarily selected from devices such as an optical device used for the publicly known hand scanner can be used.
First storage means 103 and second storage means 104 are frame memory, respectively, and, as a storage medium thereof, the publicly known medium such as a flash memory can be used.
Display part 105 can arbitrarily be selected from the publicly known display devices, for example, an LCD (Liquid Crystal Display), an EL (Electro-Luminescent), a CRT (Cathode Ray Tube), etc.
Next, the processing steps when image correction apparatus 130 in the embodiment of the invention performs image correction will be described according to
First, an image shot in image input part 101 (hereinafter, referred to as “original image”) 110 is developed in first storage means 103 via CPU 102 as brightness information, which is the image information of pixels arranged in a two-dimensional manner (S10).
An example of original image 110 is shown in
Note that, in the embodiment, as described above, original image 110 is an aggregate of pixels arranged in a two-dimensional manner and a monochrome image of pixels each having a multilevel (256 levels of gray) brightness value.
Further, in the embodiment, as shooting means of image input part 101, a CCD of 256×16 pixels is used, and, as first storage means 103 for image development, a frame memory of the horizontal direction relative to the sheet surface (lateral)×the vertical direction relative to the sheet surface (longitudinal)=1000×400 pixels in
Then, CPU 102 executes binarization processing of storing either value of 0 (black) or 1 (white) with respect to each pixel as brightness information using the publicly known method on original image 110 stored in first storage means 103 (S11). By the binarization processing, reduction in used amount of memory and speeding up of the processing are made possible, and the load on CPU 102 can be suppressed.
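Step S11 leaves the binarization method open ("the publicly known method"). A minimal sketch using a fixed global threshold is shown below; the threshold value of 128 and the function name are assumptions, and an adaptive threshold could equally be substituted.

```python
def binarize(gray, threshold=128):
    """Map 256-level brightness values to 0 (black) or 1 (white).

    The fixed global threshold is an assumption for illustration only.
    """
    return [[0 if value < threshold else 1 for value in line] for line in gray]
```

Storing one of two values per pixel is what allows the later steps to work with simple comparisons against 0.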
In
Turning to
The horizontal expansion processing will be further described.
In
Subsequently, CPU 102 judges whether the brightness value of the pixel of interest is 0 (black) or not (S22). If the brightness value of the pixel of interest is 0 (black), the brightness values of the pixels in a predetermined range forward and rearward of the pixel of interest along the processing direction are set to 0 (black) in second storage means 104 (S23). On the other hand, if the brightness value of the pixel of interest is not 0 (is 1), this processing is not performed.
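Steps S22 and S23 can be sketched as a one-dimensional expansion along each horizontal line. The parameter `reach` (the number of pixels blackened on each side of a black pixel) stands in for the predetermined range, whose value is not given here, and the function name is hypothetical.

```python
BLACK, WHITE = 0, 1


def expand_horizontally(binary, reach):
    """Blacken `reach` pixels on both sides of every black pixel, line by line.

    The result is written to a separate buffer, mirroring the use of a second
    storage means, so that newly blackened pixels do not trigger further
    expansion.
    """
    height, width = len(binary), len(binary[0])
    expanded = [[WHITE] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x] == BLACK:
                for nx in range(max(0, x - reach), min(width, x + reach + 1)):
                    expanded[y][nx] = BLACK
    return expanded
```

With a reach at least as large as half the gap between characters, the characters of one row merge into a single horizontal band.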
Step S23 will be described using
Turning to
By such processing, horizontally expanded image 113 as shown in
As shown in
Here, turning to
The detection method of the expanded character row will be described using
In
Then, CPU 102 judges whether equal to or more than the predetermined number of the continuous black pixels are detected or not (S32), and, if equal to or more than the predetermined number of the continuous black pixels are detected, the position of the pixel where the black pixels start for the first time is stored as a starting position of the expanded character row (S34). On the other hand, if less than the predetermined number of the continuous black pixels are detected, the continuation is regarded not as an expanded character row but as noise information, and the pixel of interest is shifted (S36) and the processing proceeds to the starting position detection processing of the next expanded character row. Note that it is practically desirable that the predetermined number be approximately 20 pixels.
Then, whether the pixel of interest reaches the upper end of the vertical pixel column or not is judged (S35), if it reaches, the processing is ended.
The processing as described above is performed with respect to all of the vertical pixel columns (entire screen) that constitute horizontally expanded image 113. By such processing, a short continuation of black pixels is regarded as noise information, and thereby, only the information of the expanded character rows constituted by character rows can be drawn and processed and the constitution hardly affected by the noise information can be realized.
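The starting position detection for one vertical pixel column can be sketched as a run-length scan. The scan direction (from the lower end toward the upper end) is inferred from the end condition in step S35 and is an assumption, as are the function name and the list-of-lists image layout; the default run length of 20 follows the value suggested above.

```python
BLACK = 0


def starting_positions(expanded, min_run=20):
    """Return, for each vertical pixel column, the rows where expanded rows start.

    A run of black pixels shorter than min_run is treated as noise and ignored.
    """
    height, width = len(expanded), len(expanded[0])
    result = []
    for x in range(width):
        starts, run_start, run_len = [], None, 0
        # Scan from the bottom row up; y == -1 is a sentinel that flushes a
        # run touching the upper end of the column.
        for y in range(height - 1, -2, -1):
            black = y >= 0 and expanded[y][x] == BLACK
            if black:
                if run_len == 0:
                    run_start = y      # first black pixel of this run
                run_len += 1
            else:
                if run_len >= min_run:
                    starts.append(run_start)
                run_len = 0
        result.append(starts)
    return result
```

Each column yields a list because one column usually crosses several expanded character rows.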
A histogram obtained by performing the starting position detection processing of the expanded character rows on all of the vertical pixel columns as described above, and plotting, for each vertical position, the accumulated number of starting positions detected at that position, is shown in
Turning to
Note that, in step S14, in the case where the area of a crest of the histogram is less than the predetermined value, the crest is regarded as noise information and its integrated values are neglected. By such a system, suitable starting position detection of the expanded character rows with less influence of noise information can be performed.
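The histogram of starting positions and the rejection of small crests can be sketched as follows. A crest is assumed here to be a run of adjacent vertical positions with a non-zero count, and its area the sum of those counts; `min_area` stands in for the predetermined value, and both function names are hypothetical.

```python
def start_histogram(starts_per_column, height):
    """Accumulate, per vertical position, how many columns start a row there."""
    hist = [0] * height
    for starts in starts_per_column:
        for y in starts:
            hist[y] += 1
    return hist


def crests(hist, min_area):
    """Return (first_row, last_row) of each crest whose area reaches min_area."""
    found, begin = [], None
    for y, count in enumerate(hist + [0]):   # trailing 0 closes the last crest
        if count > 0 and begin is None:
            begin = y
        elif count == 0 and begin is not None:
            if sum(hist[begin:y]) >= min_area:
                found.append((begin, y - 1))
            begin = None
    return found
```

Each surviving crest corresponds to one group of starting positions, that is, to one expanded character row.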
Thus, according to the image correction method and the image correction apparatus of the invention, since the ranges of the lower positions of the characters are not superposed and easily separated by the grouping of the starting positions of the expanded character rows, the separation of the character rows can be performed with high accuracy.
Turning to
In
In
As seen from
By the way, in the above description, the example in which the respective processing steps from step S11 to step S17 are realized with software is shown. However, the invention is not limited to that, and at least one step from step S10 to step S17 may be realized with hardware having a function of each step.
According to the image correction apparatus of the invention, since the average value of the displacement amounts calculated with respect to each character row is used as the displacement amount to be corrected, even when a certain character row contains a character element protruding downward, for example, “j” or “p”, a system that is hardly affected adversely by that character element can be realized.
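The averaging over character rows can be sketched as below. The reference position of each row is not specified in the text available here, so this sketch assumes it is the starting position found in the leftmost column where that row is detected; the function names and the rounding to whole pixels are likewise assumptions.

```python
WHITE = 1


def average_displacements(starts_per_column, row_ranges):
    """Average, per vertical pixel column, the displacement of every row found.

    row_ranges lists the (first_row, last_row) range of each expanded row.
    """
    references = {}
    shifts = []
    for starts in starts_per_column:
        deltas = []
        for index, (first, last) in enumerate(row_ranges):
            hits = [y for y in starts if first <= y <= last]
            if not hits:
                continue
            # Assumed reference: the leftmost detection of this row.
            references.setdefault(index, hits[0])
            deltas.append(references[index] - hits[0])
        shifts.append(round(sum(deltas) / len(deltas)) if deltas else 0)
    return shifts


def shift_columns(image, shifts):
    """Move every vertical line of the first image by shifts[x] rows."""
    height, width = len(image), len(image[0])
    out = [[WHITE] * width for _ in range(height)]
    for x in range(width):
        for y in range(height):
            ny = y + shifts[x]
            if 0 <= ny < height:
                out[ny][x] = image[y][x]
    return out
```

In a column where one row is pushed down by a protruding character element while another row is not, the average is pulled back toward the displacement shared by the rows, which is the effect described above.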
Note that the image correction apparatus or the image correction method of the invention is not limited to the constitution described in the embodiment. For example, a system for performing correction on all of the vertical pixel columns constituting the image by referring to only the displacement amount to be corrected, which is calculated with respect to the character row including the largest integrated value based on the histogram of the starting points of the expanded character rows shown in
Image correction is performed with image correction apparatus 130 or the image correction method of the embodiment, using 20 randomly selected business cards as samples. The number of characters subjected to OCR is 390 characters of telephone numbers and 1026 characters of mail addresses and URLs. After the image correction, OCR processing is performed by CPU 102 based on the image stored in first storage means 103, and the rate of correctly recognized characters is calculated. As a result, for all of the telephone numbers, mail addresses and URLs in the business cards taken together, the correct reading rate is improved by about 25% relative to the correct reading rate when no correction is performed.
By the way, in the embodiment of the invention, the function of display part 105 is not described specifically. However, by arranging the display part so as to display the acquired original image, the binarized image, etc., or messages such as an error message or a message representing necessary input contents to the user, a user-friendly apparatus constitution can be realized.
Note that, in the first embodiment and the second embodiment of the invention, the example in which the slant and meandering of the respective characters, numerals, etc. on the image read by the image correction apparatus are corrected is described. However, the image correctable by the image correction apparatus of the invention is not limited to that. Needless to add, the image correction apparatus or the image correction method of the invention can also correct the slant, meandering, distortion, or the like of an image read from an original in which information such as barcodes and graphics is aligned in a direction in place of or in addition to characters, numerals and the like (such information is generically named “character element” in the specification).
(Third Embodiment)
Next, as the third embodiment of the invention, information equipment including the image correction apparatus or the image correction method of the invention will be described.
Since the used amount of memory can be reduced, processing can be performed rapidly, and the load on the computing part (CPU) can be reduced by using the image correction apparatus or the image correction method of the invention, mounting on portable information equipment and the like becomes easier. The example of mounting image correction apparatus 40, 130 on the information equipment such as a cellular phone device is shown in
Cellular phone device 150 has a constitution in which image correction apparatus 130 is built into a publicly known cellular phone device including antenna part 151, speaker part 152, display part 105 such as an LCD, key part 154, and microphone part 155.
By mounting the image correction apparatus of the invention such that the reading surface of image input part 101 of image correction apparatus 130, which reads information represented by the density of characters, graphics, etc., is provided on the lower surface of cellular phone device 150, a very user-friendly cellular phone device 150 with built-in image correction apparatus 130 can be provided.
Note that, in the embodiment, an example in which image correction apparatus 130 described in the second embodiment is mounted on cellular phone device 150 has been described, however, as a matter of course, mounting image correction apparatus 40 described in the first embodiment can exert the same effect.
As described above, since document information such as URLs and two-dimensional barcodes can be read and subjected to processing such as OCR by the cellular phone device equipped with the image correction apparatus of the invention, multifunctional information equipment such as a cellular phone device that has never existed before can be provided.
Needless to add, the information equipment here is not limited to the above described cellular phone device, but includes publicly known various kinds of information equipment such as a digital camera, compact personal computer, and PDA (personal digital assistant).
As described above, an image correction apparatus and an image correction method according to the invention have an advantage that the slant or meandering of character rows can be corrected and the distortion of the character element itself can also be corrected, and are useful as an image correction apparatus, an image correction method, and the like for correcting the slant or meandering of character rows and the like produced in an image obtained by shooting an original such as a document by a hand scanner and the like.
Number | Date | Country | Kind |
---|---|---|---|
2002-286766 | Sep 2002 | JP | national |
2002-308254 | Oct 2002 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP03/12518 | 9/30/2003 | WO |