A document processing unit may facilitate an exchange and/or collection of information.
For example, an employee may use a printer or copier to create multiple copies of a memo or report to be distributed to other employees of the company. As another example, a person might use a scanner to capture images of bills, receipts, and the like for tax purposes.
In some cases, a person or business might establish an archive rule or policy to retain information. For example, copies of documents related to a certain financial transaction (e.g., a corporate acquisition or merger) might need to be retained for a pre-determined period of time to comply with governmental regulations. Other types of documents that may need to be retained might be associated with, for example, medical records, educational transcripts, and/or legal documents.
To ensure that documents are retained, a company policy handbook might let employees know that certain types of documents need to be stored, for example, in a company archive (e.g., on an archive server). Even with such an approach, however, employees might forget the policy or mistakenly store copies of documents in a wrong location (e.g., making it difficult to later retrieve the information). Thus, it can be very difficult to monitor and control the archiving of information, especially when a relatively large number of people, documents, and/or document processing units are involved.
Note that it may be desirable to retain information that is collected or created via the document processing unit 150. For example, copies of documents related to a certain financial transaction (e.g., a corporate acquisition or merger) might need to be retained for a pre-determined period of time to comply with governmental regulations. Other types of documents that may need to be retained might be associated with, for example, medical records, educational transcripts, and/or legal documents.
To ensure that documents are retained, a company policy handbook might let employees know that certain types of documents need to be stored, for example, in a company archive (e.g., on an archive server). Even with such an approach, however, employees might forget the policy or mistakenly store copies of documents in a wrong location (e.g., making it difficult to later retrieve the information). Thus, it can be very difficult to monitor and control the archiving of information, especially when a relatively large number of people, documents, and/or document processing units 150 are involved.
Accordingly, a method and mechanism to efficiently, accurately, and automatically help ensure compliance with these types of archive policies may be provided in accordance with some embodiments described herein. In particular, the document processing unit 150 of
Note that
Any of the devices illustrated in
All systems and processes discussed herein may be embodied in program code stored on one or more non-transitory computer-readable media. Such media may include, for example, a floppy disk, a CD-ROM, a DVD-ROM, magnetic tape, solid state Random Access Memory (“RAM”) or Read Only Memory (“ROM”) storage units. Embodiments are therefore not limited to any specific combination of hardware and software.
At 202, a “document processing unit” may receive information associated with a document to be processed. As used herein, the phrase “document processing unit” might refer to, for example, a printer, a scanner, a copier, a facsimile machine, and/or a multi-function document processing unit (e.g., that acts as both a printer and a copier).
At 204, the document processing unit may “automatically” analyze the received information in view of at least one pre-determined “archive policy.” As used herein, an action may be “automatic” if it requires little or no human intervention. Moreover, as used herein the phrase “archive policy” may refer to, for example, any rule that may be applied to the processing of documents, such as a rule associated with the detection of particular words or phrases. For example, a business might want to store all documents that are associated with a particular governmental contract. As another example, an educational institution might want to store all documents related to students. Note that any archive policy described herein might be associated with keywords, a text search, a pattern search (e.g., looking for a sequence of numbers arranged “XXX-XX-XXXX” where X is a numeric character to detect potential Social Security numbers), an Optical Character Recognition (“OCR”) analysis, an Intelligent Character Recognition (“ICR”) process, and/or an image analysis (e.g., looking for a watermark or bar code associated with a particular type of product).
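By way of illustration only, the keyword and pattern searches described above might be sketched as follows. The names `SSN_PATTERN`, `contains_ssn_pattern`, and `contains_keyword` are hypothetical and are not part of any described embodiment; the sketch assumes the document text has already been extracted (e.g., by an OCR analysis).

```python
import re

# Hypothetical pattern for the "XXX-XX-XXXX" layout, where X is a digit.
# The lookarounds prevent a match inside a longer run of digits.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def contains_ssn_pattern(text: str) -> bool:
    """Return True if the text contains a potential Social Security number."""
    return SSN_PATTERN.search(text) is not None


def contains_keyword(text: str, keywords: list[str]) -> bool:
    """Return True if any keyword appears in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)
```

A pattern match of this kind only indicates a potential Social Security number; an actual embodiment might combine it with other checks.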
According to some embodiments, an archive policy might be associated with detecting a presence of a date and time (e.g., retaining documents associated with a particular time period). Note that instead of looking for and detecting certain types of material, an archive policy might be associated with detecting missing information. For example, an archive policy might note that a document is missing copyright information (e.g., “Materials Copyrighted 2015©”) or an indication that a word or phrase is trademarked (e.g., with a “®” or “™” symbol) and store copies of those documents in an archive.
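As a minimal sketch of a rule that triggers on missing information, the following checks whether extracted text lacks any copyright marking. The function name `is_missing_copyright` and the particular markers searched for are assumptions made for illustration only.

```python
import re

# Assumed markers of a copyright notice: the "©" symbol, "(c)", or the
# word "copyright" (which also matches "Copyrighted").
COPYRIGHT_MARKER = re.compile(r"©|\(c\)|copyright", re.IGNORECASE)


def is_missing_copyright(text: str) -> bool:
    """Return True when no copyright marker is found in the text."""
    return COPYRIGHT_MARKER.search(text) is None
```

Under such a rule, a document for which `is_missing_copyright` returns True might be a candidate for storage in an archive.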
At 206, the document processing unit may automatically determine, based on the analysis of 204, whether or not to apply a policy “action,” associated with the pre-determined archive policy, to the processing of the document. As used herein the phrase “policy action” may refer to, for example, automatically storing a copy of the document in an archive. For example, a printer may simply decide that all documents that include the words “receipt” or “invoice” will be automatically copied to an archive. According to other embodiments, a policy action may refer to a determination of an appropriate storage location (e.g., a particular server, database, or electronic folder) for a document. For example, if a document included the words “TOP SECRET” near the top margin, a copier might automatically encrypt an image of the document and store the encrypted copy of the document in a secure archive server.
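The determination at 206 might be sketched, purely as an example, as a function that maps document text to a storage location. The location labels `"secure-archive"` and `"general-archive"`, the function name `choose_archive`, and the treatment of the first three lines as the “top margin” are all hypothetical. Encryption of the stored copy is omitted from the sketch.

```python
from typing import Optional


def choose_archive(text: str, top_lines: int = 3) -> Optional[str]:
    """Return a hypothetical archive label, or None if no action applies."""
    # Assumption: the first few lines of text approximate the top margin.
    header = "\n".join(text.splitlines()[:top_lines])
    if "TOP SECRET" in header:
        return "secure-archive"
    lowered = text.lower()
    if "receipt" in lowered or "invoice" in lowered:
        return "general-archive"
    return None
```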
As other examples, the policy action might be associated with an automatic generation of a file name. For example, a printer might detect the name of a company in a document being printed and automatically store a copy of the document in an archive using a file name of “PR_Info_xx/xx/xxxx.yy” (where xx/xx/xxxx represents the date the document was printed and yy incrementally represents the number of times the archive rule was triggered on that particular day).
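One possible way to generate such file names is sketched below. The class name `FileNamer` is hypothetical, and the sketch assumes a month/day/year date order and a two-digit, zero-padded counter, neither of which is specified above.

```python
from collections import defaultdict
from datetime import date


class FileNamer:
    """Generates names of the form PR_Info_xx/xx/xxxx.yy (illustrative only)."""

    def __init__(self, prefix: str = "PR_Info") -> None:
        self.prefix = prefix
        # Number of times the rule has been triggered on each date.
        self._counts = defaultdict(int)

    def next_name(self, printed_on: date) -> str:
        self._counts[printed_on] += 1
        count = self._counts[printed_on]
        # Assumption: mm/dd/yyyy order and a zero-padded two-digit counter.
        return f"{self.prefix}_{printed_on:%m/%d/%Y}.{count:02d}"
```

Note that the “/” characters would likely need to be replaced with another separator before such a name is used on a file system that reserves that character.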
According to some embodiments, the application of an archive policy may be based at least in part on a user identifier. For example, a user might enter his or her employee identifier into a copier. In this case, different policies might be applied to different employees. For example, copies of documents scanned by a supervisor might be automatically saved while documents scanned by other employees are not. Note that the user identifier might be based on, for example, a communication between a document processing unit and a user device, such as a user's smartphone, Radio Frequency IDentifier (“RFID”) keychain, or employee card with a magnetic strip. According to other embodiments, biometric information (e.g., a fingerprint) or a facial recognition process may be used to determine a user identifier. Note that application of an archive policy may be based on a user's title or role in a company. For example, copies of documents printed by a person working in a human resources department might be automatically archived while documents printed by other employees are not.
According to some embodiments, the application of an archive policy may be based at least in part on a processing function type. For example, a policy might indicate that a certain type of document should automatically be saved in an archive when it is printed but not when it is sent via facsimile.
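Restricting a policy by user role and by processing function type might be sketched as follows. The class names `JobContext` and `PolicyScope`, and the role and function labels, are hypothetical; the sketch assumes the user's role has already been determined (e.g., from an employee identifier).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobContext:
    user_role: str      # e.g., "human_resources" (hypothetical label)
    function_type: str  # e.g., "print", "scan", "copy", or "fax"


@dataclass(frozen=True)
class PolicyScope:
    # Assumption: an empty set means the policy is not restricted.
    roles: frozenset = frozenset()
    function_types: frozenset = frozenset()

    def applies_to(self, job: JobContext) -> bool:
        role_ok = not self.roles or job.user_role in self.roles
        function_ok = (not self.function_types
                       or job.function_type in self.function_types)
        return role_ok and function_ok
```

For example, a scope limited to the role `"human_resources"` and the function `"print"` would apply when a person in human resources prints a document but not when the same person sends it via facsimile.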
Note that in the example of
The processor 410 communicates with a storage device 430. The storage device 430 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, and/or semiconductor memory devices. The storage device 430 stores a program 412 and/or policy engine 414 for controlling the processor 410. The processor 410 performs instructions of the programs 412, 414, and thereby operates in accordance with any of the embodiments described herein. For example, the processor 410 may receive information associated with a document to be processed. The processor 410 may then automatically analyze the received information in view of at least one pre-determined archive policy. The processor 410 may then automatically determine, based on the analysis, whether to apply a policy action, associated with the pre-determined archive policy, to the processing of the document. For example, the processor 410 might automatically save a copy of a document in an archive 460.
The programs 412, 414 may be stored in a compressed, uncompiled and/or encrypted format. The programs 412, 414 may furthermore include other program elements, such as an operating system, a database management system, and/or device drivers used by the processor 410 to interface with peripheral devices.
As used herein, information may be “received” by or “transmitted” to, for example: (i) the document processing system 400 from another device; or (ii) a software application or module within the document processing system 400 from another software application, module, or any other source.
In some embodiments (such as shown in
The policy identifier 502 may be, for example, a unique alphanumeric code identifying a policy that is to be applied to a document being processed. The policy rule 504 may define the ways in which a document is to be analyzed. For example, the policy rule 504 might indicate that Social Security numbers should be detected (e.g., by looking for certain patterns or by matching values within another database) or that keywords should be detected (e.g., looking for student names). The policy action 506 may indicate one or more tasks that will be executed when the policy rule 504 is satisfied. For example, the policy action 506 might indicate that a copy of a document should be stored in an archive, that a file name should be generated, and/or where a copy of the document should be stored. The priority 508 might help a document processing unit determine which policy action 506 should be performed when multiple policy rules 504 are satisfied simultaneously.
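A record holding the policy identifier 502, policy rule 504, policy action 506, and priority 508 might be represented, as one illustrative possibility, by the structure below. The identifiers `PolicyRecord`, `POLICY_TABLE`, and `satisfied_policies`, the example entries, and the representation of a rule as a callable over extracted text are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class PolicyRecord:
    policy_id: str               # policy identifier 502
    rule: Callable[[str], bool]  # policy rule 504, applied to document text
    action: str                  # policy action 506 (hypothetical label)
    priority: int                # priority 508


# Hypothetical example entries.
POLICY_TABLE: List[PolicyRecord] = [
    PolicyRecord("P_101", lambda t: "invoice" in t.lower(),
                 "store_copy_in_archive", 1),
    PolicyRecord("P_102", lambda t: "student" in t.lower(),
                 "generate_file_name", 2),
]


def satisfied_policies(text: str,
                       table: List[PolicyRecord] = POLICY_TABLE
                       ) -> List[PolicyRecord]:
    """Return every record whose rule is satisfied by the text."""
    return [record for record in table if record.rule(text)]
```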
The policy rules 504 and policy actions 506 may be defined, reviewed, and/or adjusted by an administrator and/or users of a document processing unit. For example,
The display 600 may, for example, comprise a sequential list of archive policies along with one or more rules and/or actions associated with that policy. According to some embodiments, selection of a policy may result in a display of all documents that have been archived based on that policy (and where those documents are stored).
Note that embodiments described herein might be implemented using any number of different architectures.
According to some embodiments, the document processing unit 750 includes an OCR/ICR platform 760. The OCR/ICR platform 760 may, for example, detect handwritten, typewritten, or printed text in a scanned document and output the data as machine-encoded text that a document analyzer 770 may read and interpret. Note that paper documents might be input to the document processing unit 750 via the optical scanner 110, and electronic documents may be sent to the document processing unit 750 via a computer device 720, such as over a computer network. The input format of these documents may not be consistent with the format required by various components of the document processing system 750. As a result, a document format converter may convert an input document format into a format that is consumable by the components of the document processing system 750 (e.g., the OCR/ICR platform 760).
The document processing unit 750 may also include a policy database 500 according to some embodiments. The policy database 500 may be configured and maintained by a system administrator and contain a set of rules, such as rules associated with a presence or lack of presence of particular content. For example, a rule might detect the presence of the word “Invoice” in a document or a document name. As another example, a rule might detect the presence of Social Security numbers in a document. The policy database 500 may further define actions to take when rule violations are detected. For example, the actions might be associated with storing a copy of the document in an archive, determining an appropriate location for the document, and/or determining an appropriate file name for the document. The policy database 500 may also include a priority level to be used when multiple rules are triggered.
The document processing unit 750 may also include the document analyzer 770 according to some embodiments. The inputs to the document analyzer 770 may be the policy rules as well as the document being processed. The document analyzer 770 may then evaluate each rule in the context of the current document and output a result to a policy enforcer 780. According to some embodiments, there are two classes of analysis that may be processed by the document analyzer 770: (i) a text based analysis, and (ii) an image based analysis. The text based analysis may employ techniques such as OCR algorithms and ICR (e.g., to detect handwriting). The image based analysis might, for example, search for specified images in the document.
According to some embodiments, the document processing unit 750 may also include the policy enforcer 780. The inputs to the policy enforcer 780 may be the output of the document analyzer 770 and the action list and priority levels from the policy database 500. The policy enforcer 780 may be responsible for deciding one or more final actions taken by the system 700 (such as to create a file name, select a storage location, automatically copy a document, etc.). The policy enforcer 780 may make this decision based on the results generated by the document analyzer 770 and the priority of each rule. That is, a plurality of pre-determined archive policies are each associated with a policy priority, and actions actually performed by the document processing system 750 may be further based on those policy priorities.
Consider, for example, a situation where the document analyzer 770 detects two events that each have an associated action required by the policy enforcer 780. The first event has an associated low priority action of creating a file name using a first rule and the second event has an associated high priority action of creating a file name using a second rule. In this case, the policy enforcer 780 may decide to not perform the actions associated with the first event (and name the file in accordance with the second rule). According to other embodiments, two files might be created instead.
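The priority-based decision in this example might be sketched as follows. The names `TriggeredAction` and `resolve` are hypothetical, and the sketch assumes that a larger number denotes a higher priority and that actions of different kinds do not conflict with one another.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TriggeredAction:
    kind: str      # e.g., "create_file_name" (hypothetical label)
    detail: str    # e.g., which naming rule to use
    priority: int  # assumption: larger value means higher priority


def resolve(actions: List[TriggeredAction]) -> List[TriggeredAction]:
    """Keep only the highest-priority action of each kind."""
    best: Dict[str, TriggeredAction] = {}
    for action in actions:
        current = best.get(action.kind)
        if current is None or action.priority > current.priority:
            best[action.kind] = action
    return list(best.values())
```

With a low priority and a high priority file name action, `resolve` returns only the high priority one. An embodiment that creates two files instead would simply skip this step and perform both actions.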
The policy enforcer 780 may arrange for a copy of the document to be automatically stored into an archive database 460 via a database manager 790. The database manager 790 might comprise, for example, a Database Management System (“DBMS”) that facilitates the storage, maintenance, and/or access of information stored in the archive 460. The database manager 790 may also provide encrypted data storage for security as well as reliability features (e.g., storage redundancy) and/or efficiency advantages (e.g., parallel communications). According to some embodiments, the document processing unit 750 and/or database manager 790 is also associated with a user control panel. The user control panel might comprise, for example, a human interface used to support user operations, such as: (i) creating, modifying, or deleting archive policy rules, (ii) accessing information from an archive, and/or (iii) printing documents stored in the archive.
Note that in the example of
Accordingly, a method and mechanism to efficiently, accurately, and automatically help ensure compliance with archive policies may be provided in accordance with some embodiments described herein.
The following illustrates various additional embodiments and does not constitute a definition of all possible embodiments, and those skilled in the art will understand that the present invention is applicable to many other embodiments. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above-described apparatus and methods to accommodate these and other embodiments and applications.
Although embodiments have been described with respect to particular types of archive policies, note that embodiments may be associated with other types of policies. For example, an archive policy might be associated with a company's newly developed products, competitors, and/or customers. Moreover, while embodiments have been illustrated using particular ways of applying policies to documents, note that embodiments might be associated with audio and/or video information (e.g., displayed on a monitor, captured via a web video camera, and/or spoken over a telephone).
Embodiments have been described herein solely for the purpose of illustration. Persons skilled in the art will recognize from this description that embodiments are not limited to those described, but may be practiced with modifications and alterations limited only by the spirit and scope of the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US11/67511 | 12/28/2011 | WO | 00 | 6/5/2013 |