Preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The operation panel 1 is an interface for receiving instructions from a user and includes an operating unit provided with various switches, buttons, and the like, and a display unit for displaying data, images, and other information to be presented to the user.
The image input apparatus 3 is a unit for optically reading an image of a document and includes a light source for emitting light towards the document to be read and an image sensor such as a charge coupled device (CCD). The image input apparatus 3 focuses a reflected image from the document placed at a predetermined reading position on the image sensor and outputs an analog electric signal of RGB colors (R: red, G: green, and B: blue). The analog electric signal outputted from the image input apparatus 3 is transferred to the image processing apparatus 4.
The image processing apparatus 4 converts the analog electric signal received from the image input apparatus 3 into a digital electric signal which is then subjected to appropriate image processing and the resultant image data is dispatched to the image output apparatus 7. The internal arrangement and operation of the image processing apparatus 4 will be described later in more detail.
The image output apparatus 7 is a unit for creating an image on a sheet such as paper or an OHP film based on the image signal received from the image processing apparatus 4. For forming an image desired by the user on the sheet through an electrophotographic method, the image output apparatus 7 includes a charger for charging a photoconductive drum to a predetermined potential, a laser writer for emitting a laser beam in response to the image data received from the outside to produce an electrostatic latent image on the photoconductive drum, a developer for applying toner to the electrostatic latent image produced on the photoconductive drum to visualize the image, and a transfer device (not shown) for transferring the toner image produced on the photoconductive drum onto the sheet. The method of forming images is not limited to the electrophotographic method using a laser writer but may be selected from any other applicable image forming methods including ink-jet printing, thermal printing, and sublimation printing.
The internal arrangement of the image processing apparatus 4 will be described next. An AD conversion section 40 is provided for converting an analog signal of RGB colors received from the image input apparatus 3 into a digital signal. A shading correction section 41 is provided for subjecting the digital signal of RGB colors received from the AD conversion section 40 to a process of eliminating various distortions developed in the illuminating system, the image focusing system, and the image sensing system of the image input apparatus 3. After being subjected to the shading correction process, the RGB signal is dispatched to an input tone correction section 42.
The input tone correction section 42 is provided for conducting processes such as removing page background density and adjusting image quality such as contrast. A segmentation process section 43 is provided for separating and allocating the pixels of the input image based on the RGB signal into a character region, a halftone region, and a photograph region. Based on the result of the separation, the segmentation process section 43 produces and dispatches a segmentation class signal indicative of the region to which each pixel is assigned to a black generation and under color removal section 46, a spatial filter process section 47, and a tone reproduction process section 49 connected downstream while transferring the received RGB signal directly to a document matching process section 44 connected downstream.
The document matching process section 44 examines whether or not the image (input image) of the input signal is similar to a stored image (referred to as a stored format hereinafter) and if so, examines whether or not the input image is an image formed by writing in the stored format. When the input image is the image formed by writing in the stored format, the region corresponding to the writing-in is extracted and stored in association with the stored format.
A color correction section 45 is provided for removing color impurity through checking the spectral characteristic of C, M, and Y components including useless absorption components in order to reproduce the colors with high fidelity. The color-corrected RGB signal is then transferred to the black generation and under color removal section 46. The black generation and under color removal section 46 is provided for producing a K (black) signal from the color-corrected C, M, and Y components of the primary color signal and subtracting the K signal from the original C, M, and Y signals to generate a new set of C, M, and Y signals. More particularly, this process generates four, i.e., C, M, Y, and K signals from the three original color signals.
The generation of the black signal may be conducted by a skeleton black method. Assuming that the input/output characteristic of the skeleton curve is y=f(x), the input data are C, M, and Y, the output data are C′, M′, Y′, and K′, and the UCR (under color removal) rate is α (where 0<α<1), the process of black generation and under color removal is expressed by the following expressions.
K′=f{min(C,M,Y)}
C′=C−αK′
M′=M−αK′
Y′=Y−αK′
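The following Python sketch illustrates the black generation and under color removal computation for a single pixel; the UCR rate of 0.7 and the identity skeleton curve are illustrative assumptions, not values prescribed by this embodiment.

```python
def black_generation_ucr(c, m, y, alpha=0.7, skeleton=lambda x: x):
    """Skeleton black method: K' = f{min(C, M, Y)}, followed by under color removal.

    c, m, y  : input color components of one pixel (e.g. 0-255)
    alpha    : assumed UCR rate, 0 < alpha < 1
    skeleton : skeleton-curve function y = f(x); the identity is only a placeholder
    """
    k = skeleton(min(c, m, y))
    return c - alpha * k, m - alpha * k, y - alpha * k, k

# Example: C=200, M=180, Y=150 gives K'=150 and reduced C', M', Y'
c2, m2, y2, k = black_generation_ucr(200, 180, 150)
```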
The spatial filter process section 47 is provided for subjecting the C, M, Y, and K signals of image data received from the black generation and under color removal section 46 to a spatial filtering process with the use of a digital filter based on the segmentation class signal, thus correcting the spatial frequency characteristic so as to eliminate any blur or granular deterioration in the output image.
For example, when the image is separated into a character region by the segmentation process section 43, for improving the reproducibility of achromatic or chromatic text, the data thereof is subjected to the spatial filtering process of the spatial filter process section 47 for sharpening or increasing the degree of enhancement for high frequencies. This is followed by the tone reproduction process section 49 performing a binarization or multivaluing process with a high-resolution screen suitable for reproduction of the high frequencies. When the image is separated into a halftone region by the segmentation process section 43, the data thereof is subjected to a low-pass filtering process of the spatial filter process section 47 for removing input halftone components. This is followed by an output tone correction process of the output tone correction section 48 for converting signals such as density signals to a halftone area rate signal which represents a characteristic of a color image display apparatus, and then a gradation reproducing process of the tone reproduction process section 49 for finally separating the image into pixels and reproducing the respective gray levels thereof. When the image is separated into a photograph region by the segmentation process section 43, the data thereof is subjected to the binarization or multivaluing process with a screen specified for reproduction of the gradation.
The image data on which the above described processes have been performed is temporarily stored in a memory unit (not shown), read out from the same at a desired timing and transferred to the image output apparatus 7.
The controller 440 may be a CPU for controlling the action of each unit of hardware described above. The feature point calculating section 441 is provided for extracting a connected region from a character string or ruled lines in the input image and calculating the centroid of the connected region as a feature point. The features calculating section 442 is provided for calculating features which remain unchanged regardless of rotation, enlargement, and reduction based on the feature points calculated by the feature point calculating section 441. The vote processing section 443 is provided for voting for a stored format based on the features calculated by the features calculating section 442. The similarity determining section 444 is provided for determining from the voting result whether or not the input image is similar to the stored format.
The written region extracting section 445 is provided for, when it is determined that the input image is similar to the stored format, extracting a character string and a picture written in the stored format from the input image. The stored process controlling section 446 is provided for, when it is determined that the input image is similar to the stored format, assigning an ID to the stored format and supplying the encoding/decoding process section 447 with each extracted region of the image data. When the input image is not similar to the stored format, a message is created and displayed on the operation panel 1 for prompting the user to store the input image as a stored format.
The encoding/decoding process section 447 encodes the image data extracted by the written region extracting section 445, using a method such as Modified Huffman (MH), Modified Read (MR), Modified Modified Read (MMR), or Joint Photographic Experts Group (JPEG). The MH encoding method encodes the run lengths of the white runs and black runs in each line into Huffman codewords and adds a line sync signal EOL to the end of the encoded data of each line. The MR encoding method is a modification of MH that encodes the data of interest correlated with the data in the previous line to increase the coding efficiency. More specifically, the data in the first line is encoded by the MH method and the data in the second line to the k-th line is encoded using the correlation with the previous line. The data of the (k+1)-th line is then encoded by the MH method, and these steps are repeated. The MMR encoding method is a modification of MR with k=∞, which constantly encodes using the correlation with the previous line. The JPEG method divides the image into blocks of a desired size and converts each block into the spatial frequency domain by a discrete cosine transform. The converted data is then quantized to reduce the data size and subjected to entropy coding using Huffman codes. The resultant encoded data is stored in a memory 449. When the encoded data is read out from the memory 449, it is decoded by the encoding/decoding process section 447. The combine process section 448 combines the decoded image data with the stored format.
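As an illustrative sketch (not the actual MH code tables), the run-length extraction step that underlies MH coding can be written in Python as follows; the assignment of Huffman codewords to the runs and the appended EOL code are omitted.

```python
def run_lengths(line):
    """Return alternating (value, length) runs for one binary scan line.

    MH coding would map each run length to a Huffman codeword and append an
    EOL sync code; only the run-length extraction is shown here.
    """
    runs, current, count = [], line[0], 0
    for pixel in line:
        if pixel == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = pixel, 1
    runs.append((current, count))
    return runs

# 0 = white, 1 = black
print(run_lengths([0, 0, 0, 1, 1, 0, 0, 0, 0, 1]))  # [(0, 3), (1, 2), (0, 4), (1, 1)]
```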
The processing performed by the document matching process section 44 will be described in more detail.
The achromatizing section 4410 is provided for, when the input data is a color image, achromatizing and converting the input data into a brightness signal or a luminance signal. For example, the luminance signal can be obtained from the next expression.
Yj=0.30Rj+0.59Gj+0.11Bj
Where Yj is the luminance of each pixel and Rj, Gj, and Bj are the three color component values of each pixel. Alternatively, the RGB signal may be converted into CIE1976L*a*b* signals (CIE: Commission Internationale de l'Eclairage, L*: Lightness, a*, b*: chroma).
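A minimal sketch of the achromatizing conversion using the luminance expression above; the array layout and the NumPy dependency are assumptions made for the example.

```python
import numpy as np

def achromatize(rgb):
    """Convert an H x W x 3 RGB image into a luminance image Yj = 0.30Rj + 0.59Gj + 0.11Bj."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.30 * r + 0.59 * g + 0.11 * b

# Example: a single pure-red pixel maps to a luminance of 0.30 * 255
print(achromatize(np.array([[[255, 0, 0]]], dtype=float)))
```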
The resolution converting section 4411 is provided for, when the input data has been optically scaled (magnified or reduced) at the image input apparatus 3, scaling it again so as to have a predetermined resolution. The resolution converting section 4411 also conducts a resolution converting process to reduce the resolution below that obtained when the image is read at the same magnification by the image input apparatus 3, in order to ease the amount of processing downstream. For example, image data read at 600 dpi (dots per inch) is converted to image data at 300 dpi.
The filtering section 4412 is provided for absorbing differences in the spatial frequency characteristic among different models of the image input apparatus. The image signal produced by a CCD may contain faults such as blur caused by optical components including lenses and mirrors, the aperture at the photosensitive surface of the CCD, transfer efficiency, residual images, scanning error or integration effects during physical scanning, and the like. The filtering section 4412 subjects the input data to a proper filtering process (an enhancing process) to correct the blur caused by degradation of the MTF. It can also suppress unwanted high frequency components which interfere with the processing downstream. Namely, a filter is used for performing the enhancing and smoothing processes.
The binarizing section 4413 is provided for producing from the achromatic image data binary image data suitable for calculation of the centroid.
The centroid calculating section 4414 is provided for calculating and outputting, as a feature point, a centroid in connected components in the binarized data to the features calculating section 442. The calculation of the centroid may be carried out by a known method. More specifically, the method includes labeling each pixel in the binarized data of the binary image, specifying the connected region defined by the pixels assigned with the same label, and calculating the centroid in the specified connected region as a feature point.
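The labeling and centroid calculation may be sketched as follows; this pure-Python version uses 4-connectivity and breadth-first search, which are assumptions made for the example rather than the specific known method referred to above.

```python
from collections import deque

def centroids(binary):
    """Label 4-connected foreground regions of a binary image (list of rows of 0/1)
    and return the centroid (x, y) of each connected region as a feature point."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    points, next_label = [], 1
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                queue, xs, ys = deque([(y, x)]), [], []
                labels[y][x] = next_label
                while queue:
                    cy, cx = queue.popleft()
                    xs.append(cx)
                    ys.append(cy)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                points.append((sum(xs) / len(xs), sum(ys) / len(ys)))  # centroid of the region
                next_label += 1
    return points
```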
The calculation of the features will be described. The features calculating section 442 calculates the features of an image from a plurality of feature points determined by the feature point calculating section 441. More particularly, one of the feature points is selected as a current feature point and four of the feature points which are close to the current feature point are then selected as surrounding feature points.
The similar processing is performed when any of the feature points P1, P2, P5, and P6 is selected as the current feature point. As changing the current feature point one by one, the features calculating section 442 calculates the invariant Hij (i=1, 2, . . . , 6 and j=1, 2, 3) when the current feature points are at P1, P2, . . . P6.
The features calculating section 442 calculates features (a hash value Hi) from the invariants calculated based on the current feature point. When the current feature point is at Pi, the hash value Hi is obtained as the remainder of dividing (Hi1×10²+Hi2×10¹+Hi3×10⁰) by E, where i is a positive integer expressing the number of the feature point and E is a constant that determines the range of the remainder. For example, when E=10, the remainder ranges from 0 to 9, and the calculated hash value thus falls within that range.
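A minimal sketch of the hash calculation, assuming the three invariants Hi1, Hi2, and Hi3 have already been computed for the current feature point and that E=10:

```python
def hash_value(hi1, hi2, hi3, e=10):
    """Combine the invariants for one current feature point into a hash value.

    The hash is the remainder of (Hi1*10^2 + Hi2*10^1 + Hi3*10^0) divided by E,
    so with E = 10 it falls in the range 0 to 9.
    """
    return (hi1 * 100 + hi2 * 10 + hi3) % e

print(hash_value(3, 7, 4))  # 374 mod 10 = 4
```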
Another method of calculating an invariant for the current feature point may comprise selecting four combinations based on the four surrounding feature points P1, P2, P3, P4 around the current feature point P3 as shown in the drawings.
The features are not limited to the hash value described above and may be calculated using any other hash function. The number of surrounding feature points is four in this embodiment but is not limited thereto; it may, for example, be six. In that case, five of the six feature points are extracted in six different combinations, three points are extracted from each set of five to determine the invariants, and the hash value is calculated from those invariants.
The image data stored as the stored format in the memory 449 is thus correlated with the hash value calculated in the foregoing manner.
The vote processing section 443 is provided for retrieving the hash table corresponding to the hash value (the features) calculated by the features calculating section 442 and voting for the document of the index stored in the table. The voting result is stored, specifying which of the feature points in the stored format is voted by the feature point of interest in the input image.
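A sketch of the voting step, assuming the hash table maps each hash value to (stored format ID, feature point index) entries; the dictionary layout is an assumption made for illustration.

```python
from collections import defaultdict

def vote(input_hashes, hash_table):
    """Tally votes for stored formats.

    input_hashes : hash values calculated from the feature points of the input image
    hash_table   : dict mapping a hash value to a list of (format_id, point_index)
                   entries registered for the stored formats
    """
    votes = defaultdict(int)
    for h in input_hashes:
        for format_id, _point_index in hash_table.get(h, []):
            votes[format_id] += 1
    return votes  # e.g. {"form_01": 42, "form_02": 3}
```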
The similarity determining section 444 is provided for determining the similarity of the input image based on the voting result received from the vote processing section 443 and notifying the controller 440 of a result of the judgment. The similarity determining section 444 compares the number of votes received from the vote processing section 443 with a predetermined threshold and, when the number of votes is equal to or larger than the threshold, determines that the input image is similar to the stored format. When the number of votes is smaller than the threshold, the similarity determining section 444 determines that no similar document is present and notifies the controller 440 of a result of the determination.
The foregoing method is illustrative, and the determination may be conducted through any other proper process, such as dividing the number of votes for each document by the maximum number of votes (the number of feature points calculated), normalizing the result, and carrying out the comparison and the determination on the normalized value.
The sequence of processing will be described for reading the image of a document of a predetermined format such as a ledger sheet with the image input apparatus 3 and subjecting the read image to the collating processing of the document matching process section 44.
The document matching process section 44 calculates the features in the input image from the feature points (Step S12). More specifically, the features calculating section 442 in the document matching process section 44 selects one of the feature points as the current feature point, calculates, based on the current feature point and its surrounding feature points, the invariant which remains unchanged during the movement or rotation of the input image, and determines the features in the input image from the invariant.
The document matching process section 44 retrieves the hash table based on the calculated features and votes for the stored format registered at the corresponding index (Step S13).
The document matching process section 44 determines the similarity between the input image and the stored format from the result of voting in Step S13 (Step S14) and examines whether or not the input image is similar to the stored format (Step S15). More specifically, the number of votes for the stored format listed in the hash table is compared with the predetermined threshold. When the number of votes is equal to or larger than the threshold, it is determined that the input image is similar to the stored format. When the number of votes is smaller than the threshold, it is determined that the input image is dissimilar to the stored format.
When determining that the input image is similar to the stored format (yes in S15), the document matching process section 44 extracts the region of the input image in which writing has been added to the stored format (Step S16). The process of extracting the written region will be described later in more detail.
The written region determined in the process of extracting the written region is then subjected to an encoding process (Step S17) and, accompanied by a form ID which indicates the correlation with the stored format, the encoded region is stored in the memory 449 (Step S18). When determining that the input image is not similar to the stored format (no in S15), the document matching process section 44 displays on the operation panel 1 a message prompting the user to store a format (Step S19).
Assuming that Pin is a matrix created using the coordinates of the feature points in the stored format, Pout is a matrix created using the coordinates of the feature points in the input image, and A is a transform matrix between the two matrixes Pin and Pout, the relationship of the coordinates between the stored format and the input image is expressed by the next expression.
Pout=Pin×A
As the matrix Pin is not a square matrix, both sides are multiplied by the transposed matrix Pin^T of Pin and then by the inverse matrix of Pin^T Pin to obtain the transform matrix A.
A=(Pin^T Pin)^(-1) Pin^T Pout
Then, the relationship between the coordinates (x′, y′) in the input image and the coordinates (x, y) in the stored format is expressed by the next expression.
(x′,y′,1)=(x,y,1)×A
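The computation of the transform matrix A and the coordinate mapping above may be sketched with NumPy as follows; the homogeneous (x, y, 1) layout of the matrices follows the expressions above, while the use of NumPy itself is an assumption of the example.

```python
import numpy as np

def estimate_transform(pin, pout):
    """Estimate A such that Pout = Pin x A.

    pin, pout : N x 3 arrays of homogeneous coordinates (x, y, 1) of the
                corresponding feature points in the stored format and in the
                input image; A = (Pin^T Pin)^(-1) Pin^T Pout.
    """
    return np.linalg.inv(pin.T @ pin) @ pin.T @ pout

def map_point(x, y, a):
    """Map a stored-format point (x, y) into the input image: (x', y', 1) = (x, y, 1) x A."""
    xp, yp, _ = np.array([x, y, 1.0]) @ a
    return xp, yp
```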
The transformation of the coordinates with the transform matrix A is used for determining a region to be extracted from the input image. The process of extracting the region from the image of, for example, a ledger sheet produced using the stored format will be described.
When the character strings written in, e.g., the name blank, the address blank, and the remarks blank are to be read and extracted as images, the rectangular regions denoted by the hatching in the drawing are determined as the regions to be extracted.
The coordinate system of the input image is converted into the coordinate system of the stored format using the inverse matrix of the transform matrix A, and a difference between the input image and the stored format is calculated for each region to be extracted (Step S22). In the case where the image data is specified in 256 gray levels, the difference is taken such that, when the pixel values vary within a small range of about 5 to 10, they are regarded as identical, in consideration of the reproducibility of the pixel values at the time of reading the document.
Then, the ratio of pixels which have been determined to be identical is calculated with respect to the pixels of the corresponding region in the stored format (Step S23), and it is examined whether or not the ratio is smaller than a threshold THwr (0.99, for example) (Step S24). When the ratio is smaller than the threshold THwr (yes in Step S24), it is determined that something has been written in (Step S25).
When the ratio is equal to or larger than the threshold THwr (no in Step S24), it is determined that no writing-in has been made (Step S26).
It is then examined whether or not the above steps have been completed for each of the regions extracted (Step S27). When it is determined that they have not been completed (no in Step S27), the procedure returns back to Step S22. When it is determined that they have been completed (yes in Step S27), the procedure of this flowchart is terminated.
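The per-region written-in decision (Steps S22 to S26) may be sketched as follows; the tolerance of 10 gray levels and the threshold THwr of 0.99 follow the values mentioned above, while the data layout is an assumption made for the example.

```python
def is_written(region_input, region_format, tolerance=10, thwr=0.99):
    """Decide whether one extracted region of the input image contains writing.

    region_input, region_format : equally sized 2-D sequences of 8-bit gray values,
                                  already aligned via the transform matrix A
    tolerance : pixel values differing by no more than this are treated as identical
    thwr      : threshold THwr on the ratio of identical pixels
    """
    identical = total = 0
    for row_in, row_fmt in zip(region_input, region_format):
        for a, b in zip(row_in, row_fmt):
            total += 1
            if abs(a - b) <= tolerance:
                identical += 1
    return identical / total < thwr  # True -> something has been written in
```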
For reading the image data from the memory 449, the user selects the image data consisting of a character string to be handled. Alternatively, the image data of each character string is associated with a keyword in advance, and the data may be retrieved using the keyword and displayed in thumbnails or in a sequential manner for permitting the selection. Since the image data of the character string is also correlated with the stored format and the form ID, the coordinates of each region are examined to produce a composite image in combination with the corresponding stored format. At this time, an editing process in which a specific region (e.g., the name blank) is not outputted, for example, may be performed. The editing process may be carried out using the operation panel 1 which is provided with an editing mode and adapted to display the content of the process on the display for permitting the selection through touching the panel.
Although in the first embodiment, the similarity between the input image and the stored image (stored format) is determined first and a desired region is extracted when it is determined that the two images are similar, the input image may first be subjected to the process of extracting the desired region before the similarity between the extracted region and that in the stored format is determined. In particular, the second embodiment includes reading of the image of a ledger sheet containing characters and ruled lines, extracting of the ruled lines from the image, and examining whether or not the extracted ruled lines are similar to those in a specific format stored in advance (referred to as a stored format hereinafter).
The operation panel 1 is an interface for receiving instructions from a user and includes an operating unit provided with various switches and buttons and a display unit for displaying data, images, and other information to be presented to the user.
The image input apparatus 3 is a unit for optically reading an image of a document and includes a light source for emitting light towards the document to be read and an image sensor such as a charge coupled device (CCD). The image input apparatus 3 focuses a reflected image from the document placed at a predetermined reading position on the image sensor and outputs an analog electric signal of RGB colors (R: red, G: green, and B: blue). The analog electric signal outputted from the image input apparatus 3 is transferred to the image processing apparatus 5. In the present embodiment, a ledger sheet is set as the document.
The image processing apparatus 5 converts the analog electric signal received from the image input apparatus 3 into a digital electric signal, which is then subjected to appropriate image processing and the resultant image data is dispatched to the image output apparatus 7. The internal arrangement and operation of the image processing apparatus 5 will be described later in more detail.
The image output apparatus 7 is a unit for creating an image on a sheet such as paper or an OHP film based on the image signal received from the image processing apparatus 5. For creating a desired image on the sheet by an electrophotographic method, the image output apparatus 7 includes a charger for charging a photoconductive drum to a predetermined potential, a laser writer for emitting a laser beam in response to the image data received from the outside to produce an electrostatic latent image on the photoconductive drum, a developer for applying toner to the electrostatic latent image produced on the surface of the photoconductive drum to visualize the image, and a transfer device (not shown) for transferring the toner image formed on the surface of the photoconductive drum onto the sheet. The method of forming images is not limited to the electrophotographic method using a laser writer but may be selected from any other applicable image forming methods including ink-jet printing, thermal printing, and sublimation printing.
The internal arrangement of the image processing apparatus 5 will be described next. An AD conversion section 51 is provided for converting an analog signal of RGB colors received from the image input apparatus 3 into a digital signal. A shading correction section 52 is provided for subjecting the digital signal of RGB colors received from the AD conversion section 51 to a process of eliminating various distortions developed in the illuminating system, the image focusing system, and the image sensing system of the image input apparatus 3. After being subjected to the shading correction process, the RGB signal is dispatched to a filing process section 50 and a document type discrimination section 53.
The filing process section 50 is provided for extracting a ruled line and a character from the input image and saving the extracted ruled line and character in association with each other. When the extracted ruled line is of a specific format which has been stored, it is not stored but the character image is stored together with an identifier for identifying the specific format (referred to as a form ID hereinafter). When the format of the extracted ruled line is not stored, the character image is stored in association with a form ID which is newly given for the format of the ruled line.
The document type discrimination section 53 is provided for converting the RGB signal (reflectance signal of RGB), which has been subjected to the processing by the shading correction section 52 for eliminating various distortions and adjusting the color balance, into a signal such as a density signal which can easily be handled by the image processing system commonly installed in a color image processing apparatus while discriminating the type of the document. The discrimination of the type of the document may be conducted by a known technique.
The input tone correction section 54 is provided for conducting image quality adjusting processes such as removing page background density and adjusting contrast. A segmentation process section 55 is provided for separating and allocating each pixel of the input image based on the RGB signal into a character region, a halftone region, and a photograph region. Based on the result of the separation, the segmentation process section 55 produces and dispatches a segmentation class signal indicative of the region to which each pixel is assigned to a black generation and under color removal section 58, a spatial filter process section 59, and a tone reproduction process section 61 connected downstream while transferring the signal from the input tone correction section 54 directly to a color correction section 56 connected downstream.
The color correction section 56 is provided for removing color impurity through checking the spectral characteristic of C, M, and Y components including useless absorption component in order to reproduce the colors with high fidelity. The color-corrected RGB signal is then transferred to a scaling section 57. The scaling section 57 is provided for enlargement or reduction of the size of the image in response to an instruction signal received from the operation panel 1.
The black generation and under color removal section 58 is provided for producing a K (black) signal from the color-corrected C, M, and Y components of the primary color signal and subtracting the K signal from the original C, M, and Y signals to generate a new set of C, M, and Y signals. More particularly, this process generates four, i.e., C, M, Y, and K signals from the three original color signals.
The generation of the black signal may be conducted by a skeleton black method. Assuming that the input/output characteristic of the skeleton curve is y=f(x), the input data are C, M, and Y, the output data are C′, M′, Y′, and K′, and the UCR (under color removal) rate is α (where 0<α<1), the process of black generation and under color removal is expressed by the following expressions.
K′=f{min(C,M,Y)}
C′=C−αK′
M′=M−αK′
Y′=Y−αK′
The spatial filter process section 59 is provided for subjecting the C, M, Y, and K signals of image data received from the black generation and under color removal section 58 to a spatial filtering process with the use of a digital filter based on the segmentation class signal thus to correct the spatial frequency characteristic for eliminating any blur or granular deterioration in the output image.
For example, when the image is separated into the character region by the segmentation process section 55, for improving the reproducibility of achromatic or chromatic text, the data thereof is subjected to the spatial filtering process performed by the spatial filter process section 59 for sharpening or increasing the degree of enhancement for high frequencies. This is followed by the tone reproduction process section 61 performing a binarization or a multivaluing process with a high-resolution screen suitable for reproduction of the high frequencies. When the image is separated into the halftone region by the segmentation process section 55, the data thereof is subjected to a low-pass filtering process in the spatial filter process section 59 for removing input halftone components. This is followed by an output tone correcting process by the output tone correction section 60 for converting a signal such as a density signal to a halftone area rate signal which represents a characteristic of a color image display apparatus, and then a gradation reproducing process by the tone reproduction process section 61 for finally separating the image into pixels and reproducing the respective gray levels thereof. When the image is separated into the photograph region by the segmentation process section 55, the data thereof is subjected to the binarization or multivaluing process with a screen specified for reproduction of the gradation.
The image data on which the above described processes have been performed is temporarily stored in a memory means (not shown) and read out from the same at a desired timing and transferred to the image output apparatus 7.
The controller 500 may be a CPU for controlling the processing of each unit of hardware described above.
The binarizing section 501 is provided for producing a binary input image from the input image. More specifically, the input image of the RGB signal is converted into a monochrome image. This converting process may be carried out by the next expression.
L=0.299×R+0.587×G+0.114×B
The binary input image used in the ruled line extracting section 502 and subsequent units is produced from the monochrome image produced by the converting process. An exemplary process of producing the binary input image will be described. One line of the input image is selected as the current line to be binarized, and its pixels are divided into groups; each group, together with the corresponding lines, defines a mask. When each group contains 128 pixels, for example, the mask size is expressed as 128 pixels by 128 lines. The average of the pixel values within the mask is then calculated and used as the threshold for the binarizing process. The binarizing process compares each pixel along the current line with this threshold, thereby producing the binary form of the input image.
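A minimal NumPy sketch of this block-average binarization, assuming an 8-bit grayscale image and the 128 by 128 mask size used in the example above:

```python
import numpy as np

def binarize(gray, mask=128):
    """Binarize a grayscale image using the local mean of each mask as the threshold."""
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h, mask):
        for x in range(0, w, mask):
            block = gray[y:y + mask, x:x + mask]
            out[y:y + mask, x:x + mask] = (block < block.mean()).astype(np.uint8)
    return out  # 1 = black (foreground), 0 = white
```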
The ruled line extracting section 502 is provided for extracting the ruled lines from the scanned image data and producing an image of the ruled lines. The process of extracting the ruled lines may be implemented by a method disclosed in Japanese Patent Application Laid-Open No. 1-214934. The disclosed method includes separating the image data into strips of a predetermined width and projecting each strip in the vertical direction. A candidate portion of a ruled line is then extracted from the projected data. After such candidates are extracted from each strip, the candidate in the neighboring strip which overlaps a candidate of interest over the largest extent is selected, and the two are connected as candidates of the same ruled line. By repeating, throughout the strips, this process of finding the candidate in the neighboring strip which overlaps the connected candidates over the largest extent, a group of connected candidates which may constitute the same ruled line is determined. As the projection in the horizontal direction of the group of connected candidates is defined as a partial projection at a right angle to the vertical projection of the strips, the coordinates at both ends of the ruled line are determined to obtain the ruled line. The groups of ruled lines thus obtained are joined or combined together to determine the final group of ruled lines. Not only horizontal ruled lines but also vertical ruled lines can be extracted in the same manner.
The matching process section 503 is provided for comparing the input image of the ruled lines with the specific format which has been stored. When the specific format has been stored, a form ID set for each stored format is obtained. When the specific format is not stored, the data of the ruled lines is assigned as a specific format to be stored and a new form ID is set therefor.
The action of collating with the specific format may be implemented by a method disclosed in Japanese Patent Application Laid-Open No. 8-255236. The disclosed method includes raster scanning and edge extracting the image of ruled lines to detect the start point of tracing for contour definition and tracing a closed curve along the boundary of the figure clockwise or counter-clockwise from the start point. The contour data determined by tracing the closed curve is stored as a coordinate point string. Using the contour data, the feature points in the image, including intersections and corners, are determined, and a frame is determined from a combination of the point strings. Then, a circumscribed figure of the input frame data is calculated.
Next, the coordinates of the center point of each frame are calculated. It is assumed that the coordinates of the four corners of the frame are expressed by, from the lower left, (x0, y0), (x1, y1), (x2, y2), and (x3, y3), while the intersection of the two diagonal lines is at (cx, cy). The difference (dx, dy) between the coordinates at the upper left of the stored format and the coordinates at the upper left of the input image is determined, and the coordinates of the center point of each frame in either the stored format or the input image of the ruled lines are corrected accordingly. The frame data is then correlated between the image and the format. For example, when the coordinates of the center of a frame in the stored format and in the image of the ruled lines are denoted by (tcx, tcy) and (icx, icy) respectively, the distance D is calculated by
D=(icx−tcx)²+(icy−tcy)²
A vote for similarity is cast when a frame is found corresponding to a frame in the stored format (D<dth, dth being the threshold of the distance). When the voting is completed for all the frames in the stored format, the number of votes is divided by n (the number of frames). More particularly, the similarity is determined as the number of corresponding frames divided by the number of frames in the ledger sheet. It is then determined from the similarity whether the image of the ruled lines has been stored as a specific format.
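The frame correspondence and similarity calculation may be sketched as follows; the frame centers are assumed to have been corrected with (dx, dy) already, and the value of the distance threshold dth is an assumption made for the example.

```python
def frame_similarity(format_centers, input_centers, dth=100.0):
    """Vote for stored-format frames that have a corresponding frame in the input image.

    D = (icx - tcx)^2 + (icy - tcy)^2 is compared with the distance threshold dth;
    the similarity is the number of corresponding frames divided by the number of
    frames n in the stored format.
    """
    votes = 0
    for tcx, tcy in format_centers:
        if any((icx - tcx) ** 2 + (icy - tcy) ** 2 < dth for icx, icy in input_centers):
            votes += 1
    return votes / len(format_centers)
```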
The character string extracting section 504 is provided for producing an image of characters from the binary input image and the image of the ruled lines. The process of extracting the character string may be implemented by taking the exclusive OR of the binary input image and the image of the ruled lines extracted by the ruled line extracting section 502. Since the exclusive OR cancels the ruled lines common to the two images, only the characters are extracted.
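A minimal sketch of the character extraction by exclusive OR, assuming both images are binary NumPy arrays of identical size:

```python
import numpy as np

def extract_characters(binary_input, ruled_lines):
    """Exclusive OR of the binary input image and the ruled line image.

    Ruled lines present in both images cancel out, leaving the character image."""
    return np.logical_xor(binary_input.astype(bool), ruled_lines.astype(bool)).astype(np.uint8)
```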
The storing process controlling section 505 is provided for determining whether or not the ruled line image is stored in the memory 507 and correlating the image of characters with a specific format. When the image of the ruled lines extracted by the ruled line extracting section 502 is of the specific format, the storing process controlling section 505 does not permit saving of the image of the ruled lines extracted. Meanwhile, the image of the characters extracted by the character string extracting section 504 is correlated with the specific format and stored in the memory 507.
Alternatively, when the image of the ruled lines extracted by the ruled line extracting section 502 is not of a stored specific format, the storing process controlling section 505 permits the extracted image of the ruled lines to be stored as a new specific format. The image of the ruled lines is assigned a form ID and then stored in the memory 507. Simultaneously, the image of the characters extracted by the character string extracting section 504 is correlated with the newly stored format of the ruled lines and stored in the memory 507.
The encoding/decoding process section 506 is provided for encoding the image data to be stored in the memory 507 and decoding the images of the ruled lines and characters in the encoded form retrieved from the memory 507. More particularly, when determined by the storing process controlling section 505 to be stored in the memory 507, the images of the ruled lines and characters are encoded and stored in the memory 507. The method of encoding may be selected from MH, MR, and MMR techniques. The image of the ruled lines and the image of the characters may be encoded separately by different encoding methods.
The method of decoding the images of the ruled lines and characters stored in the encoded form is the reverse of the encoding method. Although both the image of the ruled lines and the image of the characters are subjected to the encoding in this embodiment, one of them may be encoded.
The controller 500 controls the processing of reading the images of the ruled lines and characters of the specific format stored separately in the memory 507. For, e.g., producing a ledger sheet from the image data stored in the memory 507, a group of the character images are read and displayed on the operation panel 1 for allowing the user to select a desired one of the images. Then, the form ID associated with the selected image is retrieved for reading the ruled lines and characters of the selected image from the memory 507. The image of the ruled lines and the image of the characters are transferred to the image data combining section 508.
The images of the characters may be correlated with keywords respectively, and the result of search using the keywords may be displayed in thumbnails or in sequence for selection.
The image data combining section 508 is provided for combining the image of the ruled lines and the image of the characters read out from the memory 507 and outputting the combined image data (of the RGB signal) to the document type discrimination section 53. The image data is subjected to the processes from the document type discrimination section 53 to the tone reproduction process section 61 and then transferred to the image output apparatus 7 where it is printed on a sheet of paper to produce the ledger sheet.
The processes of the image processing system according to this embodiment will be described.
Similarly, when the image of another ledger sheet 20 shown in
Furthermore, when the image of a further ledger sheet 30 shown in
A procedure of carrying out the processes in the image processing system will now be described.
When received by the image processing apparatus 5, the analog RGB signal is transferred via the AD conversion section 51 and the shading correction section 52 to the filing process section 50. The binarizing section 501 in the filing process section 50 then generates a binary image from the input image (Step S32). The ruled line extracting section 502 extracts an image of ruled lines from the binary image produced by the binarizing section 501 (Step S33).
The matching process section 503 accesses a list of the ruled line images stored in the memory 507 and examines the similarity between the extracted ruled line image and the stored ruled line images (Step S34). It is then examined whether or not the ruled line image extracted in Step S33 has been stored as a specific format (Step S35).
When it is determined that the ruled line image has not been stored as a specific format (no in Step S35), the ruled line image is assigned with a new form ID (Step S36). The character string extracting section 504 calculates an exclusive OR of the input image and the ruled line image to extract character strings contained in the input image and generate an image of characters, i.e. text image (Step S37).
The storing process controlling section 505 stores the ruled line image assigned with the new form ID as a specific format in the memory 507 (Step S38). More particularly, the ruled line image is stored in the memory 507 after having been encoded by the encoding/decoding process section 506.
Also, the storing process controlling section 505 assigns the character image with a form ID corresponding to the form ID assigned to the ruled line image for correlating the ruled line image with the text image (Step S39). The text image is encoded by the encoding/decoding process section 506 and stored in the memory 507 (Step S40).
On the other hand, when it is determined that the ruled line image extracted has been stored as a specific format (yes in Step S35), the form ID assigned to the specific format is obtained (Step S41). The character string extracting section 504 calculates an exclusive OR of the input image and the ruled line image to extract character strings contained in the input image and generate a text image (Step S42).
The storing process controlling section 505 assigns the character image with a form ID corresponding to the form ID assigned to the ruled line image for correlating the ruled line image with the text image (Step S39). The text image is encoded by the encoding/decoding process section 506 and stored in the memory 507 (Step S40).
Although each of the processes is carried out with hardware according to both the first and second embodiments, it may be conducted by a computer that executes a set of computer programs (including an execution program, an intermediate code program, and a source program).
The RAM 104 is a volatile memory for temporarily saving the control program and a variety of data produced during the execution of the computer programs for conducting the above described processes. The hard disk drive 105 is a storage unit having a magnetic recording medium where the program codes of the computer program and the like are stored. The external storage 106 includes a reader for reading the program codes from the recording medium M for conducting the above described processes. The recording medium M may be a flexible disk (FD) or a CD-ROM, for example. The program codes read by the external storage 106 are stored in the hard disk drive 105. The CPU 101 loads the RAM 104 with the program codes received from the hard disk drive 105 for use in conducting and controlling the processing of the entire apparatus as described in the first embodiment for correlating the ruled line image with the character image to be stored and storing the two images in the hard disk drive 105.
The input section 107 functions as an interface for receiving the image data from the outside. The input section 107 may thus be connected with a color scanner. The display section 108 functions as an interface for displaying image data to be processed, image data being processed, processed image data, and other data. The display section 108 may be connected with an external display such as a liquid crystal display for displaying the image data. Alternatively, the display section 108 may have a built-in display provided for displaying the image data. The communication port 109 is an interface for communicating with an external printer 150. For printing the processed image data with the printer 150, the image processing apparatus 100 produces from the image data print data to be decoded by the printer 150 and dispatches the print data to the printer 150.
Although the various calculations are performed by the CPU 101 in the present embodiment, they may be conducted by a dedicated chip which performs calculations relating to image processing in response to commands from the CPU 101.
The recording medium M for saving the program codes of the computer program is not limited to the floppy disk or CD-ROM but may be selected from optical disks such as an MO, an MD, or a DVD, magnetic recording media such as a hard disk, card-type recording media such as an IC card, a memory card, or an optical card, and semiconductor memories such as a mask ROM, an EPROM (erasable programmable read only memory), an EEPROM (electrically erasable programmable read only memory), or a flash ROM. Also, the system may be connected with communication networks including the Internet for downloading the program codes of the computer program through a desired communication network to carry out the above described processes. Moreover, the program codes to be employed may be received electronically in the form of computer data signals buried in carriers.
The computer program may be provided as a single application program, a utility program, or a part of a composite program consisting of an application program and a utility program.
As this invention may be embodied in several forms without departing from the spirit of essential characteristics thereof, the present embodiments are therefore illustrative and not restrictive, since the scope of the invention is defined by the appended claims rather than by the description preceding them, and all changes that fall within metes and bounds of the claims, or equivalence of such metes and bounds thereof are therefore intended to be embraced by the claims.