Although electronic documents are utilized in almost every industry today, hardcopy (or paper) documents continue to be produced, copied, and circulated using photocopiers, fax machines, and other devices. Such hardcopy documents may contain a wealth of useful information, and thus remain important in today's information culture.
However, hardcopy documents may be quite voluminous, potentially leading to difficulty in managing the information contained therein. It is possible, for example, for a user to be interested in only a small portion of the information contained in a hardcopy document. While some types of documents may include indexes or section headings to help guide a user through the document, such indexes typically add to the size of the document and may not allow meaningful searching of the document because of limitations inherent in any pre-prepared guide. Users thus still may face re-reading substantial portions of a document in order to identify information of interest.
In contrast to hardcopy documents, many electronic documents may be rapidly searched upon a user providing a search request to a search engine. The search request may include words, partial words, phrases, etc., and may employ search logic such as Boolean operators, proximity operators, “followed by” operators, etc. Based on such a search request, a search engine may search the electronic document for “hits” (e.g., portions of the electronic document that correlate with or correspond to the search criteria, according to the particular search logic used by the search engine). These “hits” may be brought to the user's attention, and the user may view the “hits” within the electronic document. Depending on the particular search engine used, it may be necessary to negotiate multiple screens or windows during each print request.
Referring initially to
In the present illustration, copy machine 10 will be understood to include both a scanner and an imaging device, and thus may be referred to generally as a multi-functional device. Facsimile machine 14 may include both a scanner and a printer, and thus also may be referred to generally as a multi-functional device. As used herein, the term “multi-functional device” is not limited to copy machines or facsimile machines, or even to network devices, but rather is intended to designate a device characterized by plural document-processing functions (e.g., both scanning and printing). Such multi-functional devices also may be referred to as “all-in-one” devices or “printer-copier-fax” devices, regardless of style or size.
Communications network 12 may take the form of a local area network (LAN), a wide area network (WAN) such as the Internet, or any other network or combination of networks capable of providing for communication between network devices, such as copy machine 10, facsimile machine 14, and/or computer 16. In the depicted embodiment, although only unidirectional communication of electronic documents is described, it is to be understood that bi-directional communication is possible throughout network environment 5.
As indicated, copy machine 10 may be adapted to scan hardcopy documents, print hardcopy documents and/or produce copies of hardcopy documents. In connection with these basic functions, the copy machine 10 may be configured to produce corresponding electronic documents and/or to process electronic documents, whether produced by a scanner onboard the copy machine 10, or produced by another device. For example, where the copy machine 10 forms a part of a network environment, as shown in
In
In the exemplary embodiment of
Upon receiving original hardcopy document 18, copy machine 10 may employ an input device, such as onboard scanner 24, to scan the original hardcopy document, thereby producing electronic source document 18a. Electronic source document 18a may be at least temporarily stored in onboard memory 25 as an image file (e.g., a bitmap), and/or may be made available for processing by onboard processor 20. The electronic source document 18a also may be presented to a user, using an output device, such as onboard printer 26.
As will be explained further below, onboard processor 20 may be configured to convert an image file to a text-recognizable file (text, rich text, etc.), thereby producing a searchable electronic document 18b using optical character recognition (OCR) or similar technology. The searchable electronic document, in turn, may be stored in memory 25, and/or processed further to produce an electronic search report 19. The electronic search report 19 also may be stored in memory 25, as shown, and/or may be sent to onboard printer 26 for printing. As indicated, a resultant document in the form of a hardcopy search report 19a thus may be produced.
Alternatively or additionally, the electronic search report 19 may be communicated to another device via communications network 12. In
Operation of a multi-functional device such as copy machine 10 may be directed via a control panel 30, which may employ one or more user-input features (e.g., buttons, a touch screen, or similar features). With these features, a user may enter information, or select desired functions, to effect scanning of an original hardcopy document, searching of an electronic document and/or printing of a search report. Alternatively or additionally, a multi-functional device may be directed using another device, such as network computer 16.
In one embodiment, a stand-alone multi-functional device may include a keyboard or keypad for entry of search criteria and initiation of a search request. Upon initiating the search request, an original hardcopy document may be scanned by the onboard scanner, and automatically converted to a searchable electronic format, if necessary. The resulting searchable electronic document thus may be searched, automatically, using the entered search criteria, and an electronic search report produced. The electronic search report then may be presented, automatically, to the user. The search report may be presented on a display screen of the control panel and/or printed, automatically, as a hardcopy search report produced by the onboard printer.
Turning now to
As indicated in connection with the exemplary embodiment of
Processor 40 also may be employed to direct processing of the electronic source document 18a, including directing performance of a desired search. In one embodiment, processor 40 may be provided with user direction via a user interface 42, which may take the form of a control panel, or the like. The user thus may enter search criteria defined in relation to the electronic source document 18a and interpretable by the processor 40 to effect searching of the electronic source document 18a, as will be described below.
For example, upon identifying an electronic source document 18a of interest, whether by scanning an original hardcopy document 18 to create the electronic source document 18a, by receiving the electronic source document 18a (already in electronic form) via a communications link, or by some other means, the identified electronic source document 18a may be prepared for processing. Accordingly, the identified electronic source document 18a may be reviewed to determine whether it is in a searchable format. If the identified electronic source document 18a is determined to be in such a searchable format (e.g., PDF text, WORD®, text, rich text, etc.), the search may begin. However, if the identified electronic source document 18a is determined not to be in a searchable format, processor 40 may be employed to convert the electronic source document 18a from a non-searchable format to a searchable format.
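By way of illustration only, the determination of whether an electronic source document is in a searchable format might be sketched as follows. The extension sets and the function name `needs_conversion` are assumptions made for this example and do not appear in the disclosure; an actual device may inspect file contents rather than file names.

```python
from pathlib import Path

# Hypothetical extension sets, chosen only for this sketch. The disclosure
# names text, rich text and WORD files as examples of searchable formats and
# a bitmap as an example of an image file. PDF is deliberately left out,
# because a PDF may hold either text or a scanned image, so the extension
# alone cannot settle the question.
SEARCHABLE_EXTENSIONS = {".txt", ".rtf", ".doc", ".docx"}


def needs_conversion(filename: str) -> bool:
    """Return True when the document should be converted before searching."""
    suffix = Path(filename).suffix.lower()
    # Anything not known to be searchable (image files, unknown types)
    # is routed to the converter.
    return suffix not in SEARCHABLE_EXTENSIONS
```

Treating unknown types as non-searchable is one possible design choice; it errs toward running conversion rather than searching a file that holds no text.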
As indicated, processor 40 may employ a converter 40a configured to convert an image file (derived from the hardcopy document) to a searchable text file. Converter 40a may take the form of optical character recognition (OCR) software, firmware and/or hardware, or any other type of character, design or pattern recognition software, firmware and/or hardware. It will be appreciated that optical character recognition may involve recognition of printed or written text characters received by photo-scanning of the text. The text may be analyzed character-by-character for translation of characters into character codes, such as American Standard Code for Information Interchange (ASCII), which is commonly used in data processing.
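As a small illustration of the character-code step only (not of character recognition itself), recognized text may be mapped to ASCII codes as sketched below. The function name `to_ascii_codes` is hypothetical.

```python
from typing import List


def to_ascii_codes(recognized_text: str) -> List[int]:
    """Translate already-recognized characters into their ASCII codes.

    Raises UnicodeEncodeError if a character falls outside the ASCII range.
    """
    return list(recognized_text.encode("ascii"))


print(to_ascii_codes("apple"))  # prints [97, 112, 112, 108, 101]
```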
Processor 40 also may employ a search engine 40b, which may be configured to utilize specialized search logic to find and identify “hits” (e.g., portions) in the electronic source document 18a that correlate with or correspond to search criteria. The search engine 40b thus may generate an electronic search report 19 that includes (or identifies) excerpts of the original hardcopy document 18 meeting the search criteria. Those of skill in the art will be familiar with the myriad search logic terms that allow a user to define search criteria as precisely or as imprecisely as desired.
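One form of search logic, a proximity operator, might be sketched as follows. This is a minimal example under stated assumptions: the functions `tokenize` and `proximity_hits`, the tokenization rule, and the meaning of the gap parameter are all chosen for illustration and are not taken from the disclosure.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def proximity_hits(text: str, first: str, second: str,
                   max_gap: int) -> List[Tuple[int, int]]:
    """Return (i, j) token positions where `first` and `second` occur
    within `max_gap` tokens of each other, in either order."""
    tokens = tokenize(text)
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [j for j, t in enumerate(tokens) if t == second.lower()]
    return [(i, j) for i in first_pos for j in second_pos
            if i != j and abs(i - j) <= max_gap]
```

A “followed by” operator could be obtained from the same sketch by requiring `j > i` instead of comparing the absolute distance.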
An output manager 40c may be employed to present the electronic search report 19, generated by search engine 40b. The search report 19 may be presented automatically upon initiating the search, or may be presented in accordance with further user direction regarding presentation format and scope. As described above, the search report 19 may take the form of a printed hardcopy document, a displayed electronic document 19a′, or both. Where the search report 19 takes the form of an electronic document, the search report may be stored in memory 38 and/or communicated to an output device such as a printer, or a display. Such communication may take the form of an email message sent to a remote printer or computer via a communications network (as indicated generally in
By way of example, it will be appreciated that a user may desire to search a document for the term “apple.” A hardcopy source document thus may be placed in a document scanner, and the term “apple” entered via a user interface. Upon initiating the scan (as by entering a “start” command), the scan may commence. The resulting scanned image may be automatically converted by a converter 40a to a searchable electronic document, and then automatically searched by a search engine 40b to identify portions of the searchable electronic document that include the search term, “apple.” An output manager 40c then may automatically produce an electronic search report 19 including (or identifying) portions of the hardcopy source document which include the search term, “apple.” The electronic search report 19 then may be automatically printed, or otherwise presented, to a user. The electronic search report 19 may include reprints of sentences, paragraphs, pages and/or sections (based on user-selection) of the hardcopy source document which include the search term, “apple.”
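The sentence-level case of the foregoing example might be sketched as follows, operating on text that has already been converted to a searchable form. The function name `sentences_with_term` and the naive sentence-splitting rule are assumptions made for illustration only.

```python
import re
from typing import List


def sentences_with_term(text: str, term: str) -> List[str]:
    """Return the sentences that contain `term` as a whole word,
    ignoring case."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]
```

In this sketch the whole-word match means that “pineapple” and “apples” are not treated as hits for “apple”; a partial-word search, which the disclosure also contemplates, would drop the word-boundary markers.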
As noted above, an electronic document derived from a source document (e.g., the aforementioned scanned image, or corresponding searchable electronic source document) may be saved to memory (e.g., memory 38) for later access. Thus, should a user desire to search a particular hardcopy source document again (with the same or a different search request) the hardcopy document need not be scanned again, and converted to a searchable format again. Similarly, an electronic search report 19 derived from the searchable electronic document may be saved to memory for later access.
Memory 38 may be configured to store electronic documents permanently, or temporarily, in accordance with user direction and/or system needs. For example, memory 38 may be configured to store only the most recently generated electronic documents, may be configured to store electronic documents for a period of time after creation, or may be configured to store electronic documents indefinitely.
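The three storage policies just described (most recent documents only, storage for a period of time, or indefinite storage) might be sketched as a single store with optional limits. The class `DocumentStore` and its parameters are hypothetical and are offered only as one possible arrangement.

```python
import time
from collections import OrderedDict
from typing import Optional, Tuple


class DocumentStore:
    """In-memory store with an optional count limit and an optional age limit.

    With neither limit set, documents are kept indefinitely.
    """

    def __init__(self, max_documents: Optional[int] = None,
                 max_age_seconds: Optional[float] = None) -> None:
        self.max_documents = max_documents
        self.max_age_seconds = max_age_seconds
        self._docs: "OrderedDict[str, Tuple[float, str]]" = OrderedDict()

    def save(self, doc_id: str, content: str,
             now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._docs.pop(doc_id, None)  # re-saving makes the document newest
        self._docs[doc_id] = (now, content)
        if self.max_documents is not None:
            while len(self._docs) > self.max_documents:
                self._docs.popitem(last=False)  # discard the oldest entry

    def load(self, doc_id: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        entry = self._docs.get(doc_id)
        if entry is None:
            return None
        stored_at, content = entry
        if (self.max_age_seconds is not None
                and now - stored_at > self.max_age_seconds):
            del self._docs[doc_id]  # expired: remove and report as absent
            return None
        return content
```

The `now` argument exists only so that the age limit can be exercised without waiting; in ordinary use it would be omitted.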
Users may access electronic search reports 19 via a computer 16 or other suitable device that is in communication with the document search system. For example, it will be appreciated that an electronic search report 19 may be accessed by a remote computer for visual display. Similarly, an electronic search report 19 may be forwarded to a network printer, or network copy device, etc. for hardcopy presentation. Alternatively, or additionally, an electronic search report 19 may be forwarded to a remote facsimile machine (via a telecommunications network) for hardcopy presentation by such facsimile machine. Forwarding of the electronic search report 19 for presentation (printed or otherwise) may be effected automatically by the output manager 40c, or based on user direction in connection with the search request.
An electronic source document (18a;
At 56, a determination is made regarding whether the electronic source document 18a is searchable. If the electronic source document 18a is not searchable, the electronic source document is converted to a searchable electronic document, at 58, and the search criteria are applied to the searchable electronic document, at 60. If the electronic source document 18a is searchable, conversion of the electronic source document 18a is bypassed, and the search logic is applied directly to the electronic source document. It will be appreciated that the aforementioned determination, conversion and application of search logic may be achieved automatically, if desired.
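The determination, conversion and search steps at 56, 58 and 60 might be sketched as follows. The `SourceDocument` type, the `run_search` function and the caller-supplied `convert` callable are assumptions made for illustration; the converter is passed in so that no particular OCR technology is implied, and the search criteria are reduced here to a single case-insensitive term.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SourceDocument:
    content: str      # text, or a stand-in for image data in this sketch
    searchable: bool  # outcome of the determination at 56


def run_search(doc: SourceDocument, convert: Callable[[str], str],
               term: str) -> List[int]:
    """Return the character offsets at which `term` occurs."""
    if not term:
        return []
    if doc.searchable:
        text = doc.content           # conversion is bypassed
    else:
        text = convert(doc.content)  # 58: convert to a searchable document
    # 60: apply the search criteria (plain substring match in this sketch)
    lowered, needle = text.lower(), term.lower()
    hits: List[int] = []
    start = lowered.find(needle)
    while start != -1:
        hits.append(start)
        start = lowered.find(needle, start + 1)
    return hits
```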
Based on the results of the search, an electronic search report 19 is generated, at 62. As noted above, the electronic search report 19 may include excerpts from the searchable electronic document and/or may include references to relevant portions of the searchable electronic document. At 64, the search report 19 may be presented, whether by printing the electronic search report to present a hardcopy search report 19a, or by displaying the electronic search report on a display to present a visual search report 19a′. Such presentation may be effected automatically upon generating the search report and/or may be effected by user directive. For example, as noted above, an electronic search report 19 may be stored in memory, and accessed on demand.
It will be appreciated that the size and character of excerpts presented in the electronic search report may be user-selected. Moreover, the electronic search report may include additional descriptive information, such as a line number, page number, section and/or chapter for each excerpt. The descriptive information may further include the title of the document from which the excerpt is taken. This may be desirable, for example, if more than one document is searched at the same time. This descriptive information may be input by a user upon initiating a search, or taken from the source document directly. Thus, a user may be able to provide presentation directives that specify a desired excerpt size as well as what descriptive information, if any, is to be included in the electronic search report. It also may be desirable for the user to be provided with other options. For example, a user may desire to provide presentation directives regarding the method of delivery for the search results. This may include the location where the search results should be output, the manner in which the search results are output, etc. Some or all of these options may be available to the user at the time the search request is input to the multi-functional device.
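A search report carrying descriptive information and a user-selected excerpt size might be sketched as follows. The `Excerpt` record, the `build_report` function and the `context_lines` directive are hypothetical names chosen for this example, and the source document is assumed to be available as a list of page texts with one line of text per newline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Excerpt:
    title: str  # title of the document from which the excerpt is taken
    page: int   # 1-based page number
    line: int   # 1-based line number within the page
    text: str   # the excerpt itself


def build_report(title: str, pages: List[str], term: str,
                 context_lines: int = 0) -> List[Excerpt]:
    """Collect one excerpt per line containing `term` (case-insensitive).

    `context_lines` widens each excerpt by that many lines before and
    after the hit, without crossing page boundaries.
    """
    report: List[Excerpt] = []
    needle = term.lower()
    for page_no, page in enumerate(pages, start=1):
        lines = page.splitlines()
        for line_no, line in enumerate(lines, start=1):
            if needle in line.lower():
                lo = max(0, line_no - 1 - context_lines)
                hi = min(len(lines), line_no + context_lines)
                report.append(
                    Excerpt(title, page_no, line_no, "\n".join(lines[lo:hi])))
    return report
```

Including the title in each excerpt keeps entries distinguishable when the reports for several documents are combined, which is the situation in which the title is described as being useful.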
It will be appreciated that the aforementioned method may be completed entirely by a multi-functional device (such as the aforementioned copy machine), or may be completed by plural devices (e.g., a printer and a scanner) related by a communications network. Similarly, it will be appreciated that the aforementioned document search system may be housed in a unitary multi-functional device (such as the aforementioned copy machine), or may be distributed across a network environment including distinct network devices capable of performing one or more of the operations described herein.
Although the present disclosure includes specific embodiments, these embodiments are not to be considered in a limiting sense as numerous variations are possible. The following claims particularly point out certain combinations and subcombinations regarded as novel and nonobvious. Other combinations and subcombinations of features, functions, elements, and/or properties may be claimed through amendment of the present claims or through presentation of new claims in this or a related application. Such claims, whether broader, narrower, equal, or different in scope to the original claims, are regarded as included within the subject matter of the present disclosure.
This application claims priority from copending U.S. Provisional Patent Application Ser. No. 60/554,306, which was filed on Mar. 17, 2004 and entitled “Document Search System,” the complete disclosure of which is incorporated herein by reference for all purposes.
Number | Date | Country
---|---|---
60554306 | Mar 2004 | US