Claims
- 1. A method for managing a plurality of native documents to be uploaded to a document management computer system, the steps comprising:
a) determining a file type for each native document of the plurality of native documents; b) creating a fingerprint for each native document; c) de-duplicating each native document in accordance with the fingerprint; d) extracting data from each native document; e) associating extracted data with a corresponding native document; and f) distributing the plurality of native documents and extracted data substantially equally amongst a plurality of nodes of the document management computer system.
- 2. The method for managing the plurality of native documents according to claim 1, further comprising the step of extracting native document(s) included in the plurality of documents from an archive file.
- 3. The method for managing the plurality of native documents according to claim 1, wherein the fingerprint for each native document is created using an MD5 checksum.
- 4. The method for managing the plurality of native documents according to claim 1, wherein step (c) further comprises comparing the fingerprint of each native document with a plurality of fingerprints comprised of the fingerprints for each native document to be uploaded.
- 5. The method for managing the plurality of native documents according to claim 1, wherein step (c) further comprises comparing the fingerprint of each native document with at least one fingerprint corresponding to a native document stored in the document management computer system.
- 6. The method for managing the plurality of native documents according to claim 4, further comprising discarding native documents that are determined to be the same in accordance with the comparison of fingerprints.
- 7. The method for managing the plurality of native documents according to claim 5, further comprising discarding native documents that are determined to be the same in accordance with the comparison of fingerprints.
- 8. The method for managing the plurality of native documents according to claim 1, wherein step (d) further comprises creating at least one data file corresponding to the extracted data for each native document.
- 9. The method for managing the plurality of native documents according to claim 1, wherein step (d) further comprises creating a plurality of data files corresponding to the extracted data for each native document.
- 10. The method for managing the plurality of native documents according to claim 9, wherein the plurality of data files includes files selected from a group consisting of a text file, a meta data file, an XML file and an HTML file.
- 11. The method for managing the plurality of native documents according to claim 10, wherein in step (e), a data table is created for at least one native document for defining an association with the plurality of data files.
- 12. The method for managing the plurality of native documents according to claim 1, wherein in step (e), a data table is created for at least one native document for defining an association with extracted data.
- 13. A program product, comprising executable code transportable by at least one machine readable medium, wherein execution of the code by at least one programmable computer causes the at least one programmable computer to perform a sequence of steps, comprising the steps recited in claim 1.
- 14. A method for searching a plurality of native documents stored in a document management computer system having a plurality of computer nodes storing the plurality of native documents, the steps comprising:
a) defining search criteria for searching the plurality of native documents; b) executing in parallel searches in accordance with the search criteria for each of the plurality of nodes, wherein each computer node scores each search result in accordance with the search criteria; c) ranking the search results in accordance with the score determined in each computer node; d) omitting certain documents represented by the search results in accordance with a user's predefined permission level; and e) displaying final search results to a user.
- 15. The method for searching a plurality of native documents according to claim 14, further comprising comparing the user's predefined permission level with a document classification for each native document represented by the search results.
- 16. The method for searching a plurality of native documents according to claim 15, further comprising determining whether or not a user is permitted to view each native document represented by the search results in accordance with the comparison of the user's predefined permission level and the document classifications.
- 17. A program product, comprising executable code transportable by at least one machine readable medium, wherein execution of the code by at least one programmable computer causes the at least one programmable computer to perform a sequence of steps, comprising the steps recited by claim 14.
- 18. A method for managing attributes of at least one native document produced from a search of a plurality of native documents stored in a document management computer system, the steps comprising:
a) defining search criteria for searching the plurality of native documents; b) executing a search in accordance with the defined search criteria; c) displaying search results; d) modifying document attributes of at least one document represented by the search results to create a user defined classification; and e) storing the user defined classification associated with the at least one document, wherein the user defined classification is maintained for future searches.
- 19. The method for managing attributes of at least one native document according to claim 18, wherein modifying document attributes includes adding a comment to be displayed when the at least one document is later viewed.
- 20. The method for managing attributes of at least one native document according to claim 19, further comprising designating the comment as public so as to be displayed to users in addition to the user who authored the comment when later viewing the document.
- 21. The method for managing attributes of at least one native document according to claim 19, further comprising designating the comment as private so as to be displayed only to the user who authored the comment when later viewing the document.
- 22. The method for managing attributes of at least one native document according to claim 18, further comprising selectively sending a link to at least one document of the search results to another user.
- 23. The method for managing attributes of at least one native document according to claim 18, wherein modifying document attributes includes selectively categorizing the at least one document represented by the search results.
- 24. The method for managing attributes of at least one native document according to claim 18, wherein modifying document attributes includes selectively sending a link to the at least one document represented by the search results to a user.
- 25. A method for searching a plurality of native documents stored in a document management computer system, the steps comprising:
a) defining search criteria for searching the plurality of native documents; b) executing a search in accordance with the defined search criteria; c) displaying search results as links to data files representative of associated native documents; and d) selectively viewing a native document represented by at least one link of the search results displayed to the user.
- 26. The method for searching a plurality of native documents stored in a document management computer system according to claim 25, wherein the native document is downloaded to a user interface that sent a request to selectively view the native document.
- 27. A method for producing search results of a plurality of native documents stored in a computer system in accordance with a user-defined search query, comprising:
a) providing at least one server in communication with the computer system for storing the plurality of native documents to be searched; b) receiving the user-defined search query; c) sending a search query to the computer system in accordance with the user-defined search query; d) based on results of step (c), receiving search results from the computer system corresponding to the user-defined search query; and e) attributing at least one user defined classification to at least one document represented by the search results received in step (d), wherein the user defined classification is displayed when the at least one document is later viewed.
- 28. A method for producing search results of a plurality of native documents stored in a computer system in accordance with a user-defined search query comprising:
a) providing a Website hosted by a server interfacing with the computer system and a user connected via a user interface over a communication network; b) under control of the user interface, displaying the search results of the plurality of native documents in accordance with the user-defined search query; and c) in response to at least one user-defined classification selected by the user, attributing the user-defined classification to at least one native document represented by the search results, wherein the user-defined classification is displayed when a link representing the at least one native document is later viewed.
- 29. An electronic document management system comprising:
a plurality of computer nodes for storing a plurality of native documents; and a computer in communication with the plurality of computer nodes for receiving a plurality of input files to be uploaded to the plurality of computer nodes, wherein the computer is configured to determine the type of native document for each of the plurality of input files, to assign a unique identification tag to each native document, and to eliminate duplicate native documents based on the unique identification tags, for producing a subset of input files to be uploaded to the plurality of computer nodes, wherein the subset of input files are distributed substantially equally amongst the plurality of computer nodes.
- 30. The electronic document management system according to claim 29, wherein the computer is further configured to extract data from each native document.
- 31. The electronic document management system according to claim 30, wherein the computer creates a text file corresponding to the extracted data.
- 32. The electronic document management system according to claim 29, wherein the computer creates a data file selected from a group consisting of a text file, a meta data file, an XML file, and an HTML file.
- 33. The electronic document management system according to claim 29, wherein the subset of input files and associated data extracted therefrom are distributed substantially equally amongst the plurality of computer nodes.
- 34. An electronic document management system comprising a PC type computer connected in a parallel cluster, said computer using an operating system that stores electronic documents in a hard disk drive throughout the cluster, said operating system defining a document identification tag where each document is identified by its file extension that is converted to ASCII text and given a unique identification number, each of a plurality of documents having at least one of either meta-data, text or attachments identified for retrieval that are indexed for web-based retrieval from the cluster database, said identification of the plurality of documents forming a cluster database that is web-searchable by use of a predetermined descriptive term.
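By way of non-limiting illustration only, the steps recited in claims 1 and 14 may be sketched as follows. This sketch is not part of the claims and is not the disclosed implementation. The function names (`fingerprint`, `ingest`, `search_node`, `search`), the record layout, the round-robin distribution, the term-count scoring, and the integer permission and classification levels are all hypothetical assumptions chosen for brevity. The file type is assumed to follow from the file extension, and data extraction is reduced to decoding the bytes as text.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def fingerprint(data: bytes) -> str:
    # Claim 3 names an MD5 checksum as one way to create the fingerprint.
    return hashlib.md5(data).hexdigest()


def ingest(documents, num_nodes, classifications=None, stored_fingerprints=()):
    """Sketch of claim 1. `documents` maps a file name to its raw bytes."""
    classifications = classifications or {}
    seen = set(stored_fingerprints)      # fingerprints already in the system (claim 5)
    nodes = [[] for _ in range(num_nodes)]
    kept = 0
    for name, data in documents.items():
        file_type = os.path.splitext(name)[1].lower()   # step (a), assumed: by extension
        fp = fingerprint(data)                           # step (b)
        if fp in seen:                                   # step (c): discard duplicates
            continue
        seen.add(fp)
        record = {                                       # steps (d) and (e)
            "name": name,
            "type": file_type,
            "fingerprint": fp,
            "text": data.decode("utf-8", errors="replace"),
            "classification": classifications.get(name, 0),
        }
        nodes[kept % num_nodes].append(record)           # step (f), assumed: round-robin
        kept += 1
    return nodes


def search_node(node, term):
    # Each node scores its own results; here the score is a simple term count.
    hits = []
    for record in node:
        score = record["text"].lower().count(term.lower())
        if score > 0:
            hits.append((score, record))
    return hits


def search(nodes, term, permission_level):
    """Sketch of claim 14: parallel per-node search, ranking, permission filter."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        per_node = list(pool.map(lambda node: search_node(node, term), nodes))
    merged = [hit for hits in per_node for hit in hits]
    merged.sort(key=lambda hit: hit[0], reverse=True)    # rank by per-node score
    # Omit documents whose classification exceeds the user's permission level.
    return [rec["name"] for _, rec in merged if rec["classification"] <= permission_level]
```

In this sketch, a duplicate is detected solely by an identical fingerprint, and "substantially equally" is approximated by assigning each retained document to the next node in turn, so node sizes differ by at most one document. Other distribution and scoring schemes would serve equally well for the purposes of the illustration.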
RELATED APPLICATIONS
[0001] This application claims priority from Provisional Application Serial No. 60/438,508 filed on Jan. 8, 2003, entitled: “ELECTRONIC DOCUMENT MANAGEMENT”, the entire disclosure of which is hereby incorporated by reference herein.
Provisional Applications (1)
| Number | Date | Country |
| 60438508 | Jan 2003 | US |