This application claims priority from Japanese Patent Application No. JP2004-274393 filed on Sep. 22, 2004, which is incorporated herein by reference.
1. Field of the Invention
The present invention relates to an image reading apparatus, an image processing apparatus, and an image forming apparatus featuring an image extracting function.
2. Description of the Related Art
Up to now, in order to extract and print a target page from documents or images consisting of a plurality of pages stored in a storage apparatus, it has been necessary to input a page number via a keyboard or to select the requested image from minimized images displayed as a list.
For example, Japanese Patent Application Open to Public Inspection No. H05-73624 discloses an image information processing apparatus capable of printing a target page by reading and analyzing an image marked on an index sheet. Such an index sheet is a sheet including minimized images of plural pages. A user places a mark on the target page to be printed.
Character recognition technology for extracting a specific character in an image has become widely used. For example, Japanese Patent Application Open to Public Inspection No. JP2001-306554 discloses an image processing method and a print processing apparatus which automatically perform character conversion and color conversion on a character string extracted, via character recognition technology, from an optically read image.
When a target image is selected for printing by placing a mark on a printed index sheet containing plural pages of minimized images, the user has to search for the target page among a number of small printed images, which are difficult to tell apart and difficult to mark. The workload on the user is therefore increased.
An apparatus using character recognition technology merely extracts and displays a character string, or modifies an extracted character string. In order to selectively print a page which includes the target character string, the page which includes the character string extracted by the character recognition technology has to be confirmed, and the user has to specify that page for printing.
The present invention was achieved to solve the problems described above and to provide an image reading apparatus, an image processing apparatus or an image forming apparatus capable of extracting only a specific target page from a document having a plurality of pages, with less workload on a user.
These and other objects are attained by an image reading apparatus comprising: a reading section to read an original document having plural pages and to generate plural page data corresponding to the pages; a judging section to determine whether the page data includes at least one of a predetermined character, a predetermined symbol or predetermined attribution information; and an extracting section to extract a page which is determined by the judging section to include at least one of the predetermined character, the predetermined symbol or the predetermined attribution information.
The above objects are also attained by an image processing apparatus comprising: a judging section to determine whether each page data of plural page data corresponding to plural pages includes at least one of a predetermined character, a predetermined symbol or predetermined attribution information; and an extracting section to extract a page which is determined by the judging section to include at least one of the predetermined character, the predetermined symbol or the predetermined attribution information.
The above objects are also attained by an image forming apparatus comprising: a reading section to read an original document having plural pages and to generate page data corresponding to the plural pages; a printing section to print a page of the plural pages based on the page data; a judging section to determine whether the page data includes at least one of a predetermined character, a predetermined symbol or predetermined attribution information; and an extracting section to extract a page whose page data includes at least one of the predetermined character, the predetermined symbol and the predetermined attribution information, and to output the page data corresponding to the extracted page to the printing section.
Further, the above objects are attained by an image forming apparatus comprising: a printing section to print page data; a judging section to determine whether the page data includes at least one of a predetermined character, a predetermined symbol or predetermined attribution information; and an extracting section to extract a page which is determined by the judging section to include at least one of the predetermined character, the predetermined symbol or the predetermined attribution information, and to allow the printing section to print the page data of the extracted page.
According to the image reading apparatus, image processing apparatus and image forming apparatus of the present invention, since these apparatuses extract a page including a predetermined character, symbol or attribution information from plural pages of a document, identification on a page-by-page basis becomes possible with a reduced workload. Unlike the case of extracting merely a character string, it is not necessary for a user to confirm the page which includes an extracted character string and to conduct an operation to specify that page as the extraction target.
The invention itself, together with further objects and attendant advantages, will best be understood by reference to the following detailed description taken in conjunction with the accompanying drawings.
The first embodiment of the preferred embodiments will be described below with reference to the drawings.
Reading section 11 comprises a light source which irradiates a document, a line sensor which reads each line across the width of the page, a moving device which moves the reading position line by line in the longitudinal direction of the document, and an optical path configured of a lens and a mirror for forming an image by guiding light reflected from the document onto an image sensor. Analog image signals outputted from the line sensor are converted into digital image signals (A/D conversion). Reading section 11 includes an auto document feeder for continuously and sequentially reading plural document pages.
Setting section 12, shown in
Judging section 13 and page extracting section 14 comprise a circuit whose main components are a CPU (Central Processing Unit) (not shown), ROM (Read Only Memory) and RAM (Random Access Memory). The ROM stores programs which the CPU executes as well as various kinds of fixed data. The RAM functions as a memory for temporarily storing image data read by reading section 11.
Judging section 13 analyzes the image data temporarily stored in the RAM described above and conducts character recognition. The character recognition is conducted by a conventional OCR (Optical Character Recognition) algorithm and a pattern matching process. When plural groups of documents (for example, Job A, Job B, etc.) are read continuously, the judging and extracting processes are conducted for each page of the respective jobs.
Judging section 13 judges whether the character string set by the user is within the image data of any page which has been read (Step S53). If judging section 13 determines that the character string is included, the page corresponding to the image data which includes the character string is extracted (Step S54). The handling of extracted image data is designed so that the user can select one of several options. For example, page extracting section 14 transfers the extracted image data to an external apparatus; transfers the image data as a file to an external apparatus; causes the image data to be printed after the transfer; or stores the image data in external or internal memory. Further, a page extracted based on a judgment that its page data includes data corresponding to a predetermined character or predetermined symbol may be displayed on a display section (not shown) integrated in image reading apparatus 10, or on the display section of a computer (not shown). The extracted page may be displayed as a whole page, as a part of a page or as a minimized index image. When the display section is limited, only the page number of the extracted page may be displayed. The image data of pages which have not been extracted is deleted. Alternatively, the document pages which have not been extracted may be stored or outputted separately from the extracted pages.
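The judging and extracting steps (Steps S53 and S54) can be sketched as follows. This is a minimal illustration and not the disclosed implementation: it assumes that character recognition has already produced one text string per page, and the names `Page`, `judge` and `extract_pages` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int  # 1-based page number within the job
    text: str    # text assumed to have been recognized from the page image


def judge(page: Page, target: str) -> bool:
    """Return True when the set character string appears in the page (Step S53)."""
    return target in page.text


def extract_pages(pages: list[Page], target: str) -> list[Page]:
    """Keep only the pages judged to include the target string (Step S54)."""
    return [page for page in pages if judge(page, target)]


job_a = [
    Page(1, "Cover letter"),
    Page(2, "Order sheet: 10 units"),
    Page(3, "Delivery note"),
    Page(4, "Order sheet: 3 units"),
]
extracted = extract_pages(job_a, "Order sheet")
print([page.number for page in extracted])  # [2, 4]
```

The pages left out of `extracted` correspond to the non-extracted pages, which may then be deleted or handled separately.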
Judging section 13 continues the judging process as long as the set character string has not been detected (Step S64; N). When no character string is detected through the end of the page (Step S64; Y), judging section 13 checks whether the page is the last page. When it is not the last page (Step S65; N), judging section 13 processes the next page (Step S66); when it is the last page (Step S65; Y), the process is terminated (Return).
Image processing apparatus 100 of the second embodiment of the preferred embodiments will be described below. Image processing apparatus 100 functions to extract a page including the predetermined character string from a plurality of pages of a document. Image processing apparatus 100 is a computer comprising a CPU, a ROM, a RAM, a main body having various kinds of interfaces (I/F), and a keyboard, and operates as image processing apparatus 100 by executing predetermined computer programs.
Setting section 111 comprises a keyboard, a mouse and a display. Instead of using setting section 111, the specific character string serving as the judging reference may be inputted from an external apparatus. A large-capacity storage apparatus, such as a hard disk apparatus, is preferable as image storing section 114.
The document data to be judged by judging section 112 may be inputted from external scanning apparatus 102 and/or information processing apparatus 101 via a LAN. Document data stored in image storing section 114 may also be the data to be judged. The document data may also be inputted or received by using an interface function integrated in image processing apparatus 100. Here, reading section 102a of scanning apparatus 102 has the same configuration as reading section 11 of image reading apparatus 10. The document data includes image data as image information and printing data denoting the contents of the document by a code, such as a character code. Image storing section 114 may be provided outside image processing apparatus 100 and connected to it.
Image processing apparatus 100 receives the image data sent from scanning apparatus 102 and temporarily stores the image data in image storing section 114 and/or other memory. At this time, a management table is created to manage the storage location of the image data in the memory.
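One way to picture such a management table is sketched below. The layout is an assumption made for illustration only: the text does not specify the table format, so the class `PageStore` and its (offset, length) entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PageStore:
    """Holds page image data in one buffer plus a table of where each page lives."""
    buffer: bytearray = field(default_factory=bytearray)
    # Management table: page number -> (offset, length) within the buffer.
    table: dict[int, tuple[int, int]] = field(default_factory=dict)

    def store(self, page_number: int, image_data: bytes) -> None:
        """Append the page data and record its storage location."""
        self.table[page_number] = (len(self.buffer), len(image_data))
        self.buffer.extend(image_data)

    def load(self, page_number: int) -> bytes:
        """Look up the storage location and return the page data."""
        offset, length = self.table[page_number]
        return bytes(self.buffer[offset:offset + length])


store = PageStore()
store.store(1, b"page-one-pixels")
store.store(2, b"page-two")
print(store.table)    # {1: (0, 15), 2: (15, 8)}
print(store.load(2))  # b'page-two'
```

With a table of this kind, the judging step can fetch each page by number without scanning the whole memory.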
Judging section 112 determines whether the image data of each page stored in the memory includes the character string set by the user (Step S154), and the page whose page data is determined to include the character string is extracted (Step S155). The handling of extracted pages is either fixed in advance or selected by the user, as follows. For example, the image data of the extracted page is stored as it is; stored as a file; transferred to an external apparatus; and/or sent to an external printing apparatus with a print request. Further, a page extracted based on a determination that its page data includes data corresponding to a predetermined character or a predetermined symbol may be displayed on a display section (not shown) incorporated in image processing apparatus 100 or on a display section of information processing apparatus 101 connected to image processing apparatus 100. The extracted page may be displayed as a whole page, as a part of the page or as a minimized index image. When the display section is limited, only the page number of the extracted page may be displayed.
Non-extracted pages of document data are deleted. Alternatively, the non-extracted pages of document data may be stored or outputted separately from the extracted pages. When the document data is coded data, the presence of the character string is checked by matching the code data.
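For coded data, the check reduces to comparing character codes, with no character recognition step. The sketch below assumes, purely for illustration, that a page is held as UTF-8 encoded bytes; the function name `contains_coded` is hypothetical.

```python
def contains_coded(page_codes: bytes, target: str, encoding: str = "utf-8") -> bool:
    """Check for the target by comparing character codes, with no OCR step."""
    return target.encode(encoding) in page_codes


coded_page = "Order sheet No. 42".encode("utf-8")
print(contains_coded(coded_page, "Order sheet"))    # True
print(contains_coded(coded_page, "Delivery note"))  # False
```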
Image processing system 160 of the third embodiment of the preferred embodiments will be explained below. Image processing system 160 shown in
Printing apparatus 104 forms and outputs images corresponding to inputted image data or printing data onto a recording paper sheet by electrophotographic processing. Printing apparatus 104 is configured as a laser beam printer, which comprises, as an engine section, a conveying apparatus for recording paper sheets, a photosensitive drum, an electro-charger, a laser unit, a developing apparatus, a transferring/separating apparatus, a cleaning apparatus and a fixing apparatus.
In image processing apparatus 100, page extracting section 113 transfers the image data of the extracted page to printing apparatus 104 via a LAN (Step S186). Printing apparatus 104 prints and outputs the image corresponding to the image data transmitted from image processing apparatus 100 (Step S187).
With regard to how the image data is handled, the image data may be printed out; stored as it is; stored as a file; or transferred to an external apparatus, such as a management server.
Non-extracted pages of document data are deleted. The non-extracted pages of document data may be configured so as to be stored or outputted separately from extracted pages.
Image forming system 200 of the fourth embodiment of the preferred embodiments will be explained.
Since reading section 201 is substantially the same as reading section 11 of image reading apparatus 10; setting section 202 is substantially the same as setting section 12 of image reading apparatus 10; judging section 203 is substantially the same as judging section 13 of image reading apparatus 10; page extracting section 204 is substantially the same as page extracting section 14 of image reading apparatus 10; and printing section 205 is substantially the same as printing section 104a of printing apparatus 104 shown in
Judging section 203 judges whether the character string which a user has set exists in the image data of each page read by reading section 201 (Step S223). Page extracting section 204 extracts the page which is determined by judging section 203 to include the above-mentioned character string (Step S224), and printing section 205 prints out the page extracted by page extracting section 204 (Step S225). The image data of the extracted page is deleted after printing is completed. Non-extracted image data is deleted either before printing or after printing, together with the extracted image data.
While the preferred embodiments have been shown and described, it is to be understood that this disclosure is for the purpose of illustration and that various changes and modifications may be made without departing from the scope of the invention as set forth in the appended claims. For example, in the embodiments, the existence of the set character string is checked across the whole area of a page; however, the check may be limited to a specific area in the page. Namely, once a title, such as “order sheet”, is determined, it can be expected to appear in a specific portion of the order sheet. Accordingly, the processing load for the judgment and the processing time may be minimized by limiting the judging area.
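Limiting the judging area can be sketched as follows. The sketch assumes that recognized words carry positions expressed as fractions of the page size; `Word`, `Region` and `judge_in_region` are hypothetical names. An actual apparatus would presumably restrict recognition itself to the area to save processing; filtering already-recognized words stands in for that here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge, as a fraction of page width (0.0 to 1.0)
    y: float  # top edge, as a fraction of page height (0.0 to 1.0)


@dataclass
class Region:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


def judge_in_region(words: list[Word], target: str, region: Region) -> bool:
    """Judge only the words that fall inside the limited judging area."""
    text = " ".join(w.text for w in words if region.contains(w))
    return target in text


# Hypothetical judging area: the top 15% of the page, where a title would appear.
title_area = Region(left=0.0, top=0.0, right=1.0, bottom=0.15)
page = [Word("Order", 0.40, 0.05), Word("sheet", 0.52, 0.05), Word("order", 0.10, 0.60)]
print(judge_in_region(page, "Order sheet", title_area))  # True
print(judge_in_region(page, "order", title_area))        # False
```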
With regard to the judging reference, various additional conditions, such as attribution data, may be set. For example, in many cases the title of a document is set apart from other text by changing the character size or font or by adding special decoration. In order to extract such character strings, additional conditions, such as character size (attribution data), may be added to the detecting conditions.
An additional condition (attribution data) alone, such as character size or the existence of decoration, may also be used as the judging reference; for example, extracting a character string larger than 30 points.
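The use of character size as an additional or sole condition can be sketched as follows. The sketch assumes that recognition yields text runs annotated with a point size; `TextRun` and `judge_with_attributes` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextRun:
    text: str
    point_size: float  # assumed to be reported by the recognition step


def judge_with_attributes(runs: list[TextRun], target: Optional[str],
                          min_point_size: float) -> bool:
    """True if some run exceeds min_point_size and, when a target is given, contains it."""
    for run in runs:
        if run.point_size <= min_point_size:
            continue  # attribute condition not met
        if target is None or target in run.text:
            return True
    return False


page = [TextRun("Order sheet", 36.0), TextRun("Please ship the order sheet items.", 10.5)]
print(judge_with_attributes(page, "Order sheet", 30.0))  # True: large title matches
print(judge_with_attributes(page, "ship", 30.0))         # False: only in small body text
print(judge_with_attributes(page, None, 30.0))           # True: attribute-only judgment
```

Passing `None` as the target corresponds to using the attribution data alone as the judging reference.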
The maximum number of the characters and/or minimum number of characters may be set in order to improve the detecting accuracy of a character string and prevent misdetection of an unintended character string.
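One possible reading of this condition is sketched below: a line counts as a match only when it contains the set character string and its total length lies within the configured limits, so that a short title is detected but a long sentence that merely mentions the string is not. This interpretation and the name `judge_with_length_limits` are assumptions.

```python
def judge_with_length_limits(lines, target, min_chars=0, max_chars=None):
    """Accept a line only if it contains the target and its length is within limits."""
    for line in lines:
        stripped = line.strip()
        if target not in stripped:
            continue
        if len(stripped) < min_chars:
            continue
        if max_chars is not None and len(stripped) > max_chars:
            continue
        return True
    return False


title_page = ["Order sheet", "Item: widget, quantity: 10"]
body_page = ["Please refer to the attached Order sheet for details."]
print(judge_with_length_limits(title_page, "Order sheet", max_chars=20))  # True
print(judge_with_length_limits(body_page, "Order sheet", max_chars=20))   # False
```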
In the embodiments described above, the page corresponding to the page data including the set character string is extracted. However, it is also possible to extract the page corresponding to page data that does not include the set character string. In either case, each page data is checked for whether it includes at least one of a predetermined character, a predetermined symbol and predetermined attribution information, and the page corresponding to the page data which matches the judging reference is extracted. Further, the pages including the set character string and the other pages may be extracted selectively. For example, pages corresponding to page data including the set character string and pages corresponding to page data which do not include it are stored as respective files in the memory. With regard to a non-extracted page, the user can select how its image data is handled. For example, it may be deleted; stored as a file separately from the extracted pages; printed out separately from the extracted pages; or transferred separately from the extracted pages. The embodiments described above explain an example in which each page data is judged, page by page, for whether it includes any one of a predetermined character, a predetermined symbol or predetermined attribution information, and all pages matching the criteria (for example, a predetermined character is included, or no predetermined symbol is included) are extracted. However, the present invention is not limited to this embodiment.
Namely, the judging process may be stopped and the page extracted as soon as page data is found that matches the detecting condition, that is, as soon as any one of a predetermined character, a predetermined symbol or predetermined attribution information is detected in one of the pages to be searched.
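The variations described here (extracting all matching pages, extracting the pages that do not match, and stopping at the first match) can be sketched with one function. The per-page text list and the parameter names `invert` and `first_only` are hypothetical and serve only to illustrate the three modes.

```python
def extract(pages, target, invert=False, first_only=False):
    """Return the page numbers that meet the judging reference.

    pages: list of per-page text strings (page 1 first).
    invert: extract pages that do NOT include the target instead.
    first_only: stop judging as soon as one page meets the condition.
    """
    extracted = []
    for number, text in enumerate(pages, start=1):
        matches = (target in text) != invert
        if matches:
            extracted.append(number)
            if first_only:
                break  # remaining pages are never judged
    return extracted


pages = ["Cover", "Order sheet A", "Notes", "Order sheet B"]
print(extract(pages, "Order sheet"))                   # [2, 4]
print(extract(pages, "Order sheet", invert=True))      # [1, 3]
print(extract(pages, "Order sheet", first_only=True))  # [2]
```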
In the embodiments described above, when page data is generated from a document having a plurality of pages, the judging process is started after page data corresponding to all pages has been generated. However, the invention is not limited to this. Namely, the judging process and extracting process may be started after reading of the document having plural pages has begun but before page data corresponding to all of its pages has been generated.
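Starting the judgment before all page data exists can be sketched with generators, where each page is judged as soon as it is produced. The function `read_pages` is a hypothetical stand-in for the reading section, and the page text is again assumed to be already recognized.

```python
from typing import Iterable, Iterator


def read_pages(document: list[str]) -> Iterator[tuple[int, str]]:
    """Stand-in for the reading section: yields page data one page at a time."""
    for number, text in enumerate(document, start=1):
        print(f"read page {number}")
        yield number, text


def extract_while_reading(pages: Iterable[tuple[int, str]], target: str) -> Iterator[int]:
    """Judge each page as soon as its data exists, without waiting for later pages."""
    for number, text in pages:
        if target in text:
            print(f"extracted page {number}")
            yield number


document = ["Cover", "Order sheet", "Notes"]
result = list(extract_while_reading(read_pages(document), "Order sheet"))
print(result)  # [2]
```

When run, the "extracted page 2" message appears before "read page 3", showing that page 2 is judged and extracted before page 3 has been read.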
In the case of double-sided documents, when a predetermined character is detected on at least one side of a sheet, both pages may be extracted. In this case, it is preferable that the user can select whether both pages are extracted or only the page on which the character string exists is extracted.
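The selectable double-sided behavior can be sketched as follows. The sketch assumes that pages arrive in reading order with the front and back of each sheet adjacent (pages 1 and 2 on sheet 1, and so on); the name `extract_duplex` and the `both_sides` flag are hypothetical.

```python
def extract_duplex(pages, target, both_sides=True):
    """pages: per-page text in reading order; pages 1-2 are sheet 1, 3-4 sheet 2, etc."""
    extracted = []
    for front in range(0, len(pages), 2):
        sheet = list(range(front, min(front + 2, len(pages))))  # 0-based indices
        hits = [i for i in sheet if target in pages[i]]
        if not hits:
            continue
        chosen = sheet if both_sides else hits
        extracted.extend(i + 1 for i in chosen)  # report 1-based page numbers
    return extracted


pages = ["Order sheet", "Terms and conditions", "Memo", "Blank", "Notes", "Order sheet"]
print(extract_duplex(pages, "Order sheet", both_sides=True))   # [1, 2, 5, 6]
print(extract_duplex(pages, "Order sheet", both_sides=False))  # [1, 6]
```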
It is also possible to set plural judging references and to extract plural kinds of pages based on the respective judging references. For example, when character string A and character string B are set as judging references, a page corresponding to page data including character string A may be extracted as group A and a page corresponding to page data including character string B may be extracted as group B. It thus becomes possible to extract plural pages classified into plural classes in one reading operation.
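Classification by plural judging references can be sketched as a single pass over the pages. The function name `classify_pages` and the dictionary of group names are hypothetical; a page that includes both character strings is placed in both groups in this sketch, which is a choice the text leaves open.

```python
def classify_pages(pages, references):
    """Map each group name to the page numbers whose text includes its character string.

    references: dict of group name -> character string used as the judging reference.
    """
    groups = {name: [] for name in references}
    for number, text in enumerate(pages, start=1):
        for name, target in references.items():
            if target in text:
                groups[name].append(number)
    return groups


pages = ["Order sheet 1", "Invoice 1", "Order sheet 2", "Cover", "Invoice 2 for Order sheet 2"]
groups = classify_pages(pages, {"A": "Order sheet", "B": "Invoice"})
print(groups)  # {'A': [1, 3, 5], 'B': [2, 5]}
```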
Number | Date | Country | Kind |
---|---|---|---|
JP2004-274393 | Sep 2004 | JP | national |