Claims
- 1. A method for reducing duplication of files in a network system including one or more sending systems and one or more receiving systems, the method comprising:
determining a digital signature for a digital file received by a receiving system; comparing the digital signature against stored digital signatures of stored digital files accessible by the receiving system; and determining whether to store the digital file based on the comparison of the digital signature of the digital file against the stored digital signatures of the digital files accessible by the receiving system.
- 2. The method of claim 1 wherein the digital file comprises an electronic mail message.
- 3. The method of claim 2 wherein the digital file includes an attachment in an electronic mail message.
- 4. The method of claim 3 wherein determining the digital signature for the digital file includes determining the digital signature of the attachment.
- 5. The method of claim 1 further comprising storing the digital file when a result of the comparison reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures accessible by the receiving system.
- 6. The method of claim 6 wherein the digital signature for the digital file and the stored digital signatures are compared by comparing the digital signature for the attachment with the stored digital signatures corresponding to attachments of the digital files accessible by the receiving system.
- 7. The method of claim 6 further comprising storing the digital signature when the result of the comparison reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 8. The method of claim 6 further comprising generating a location identifier for the digital file indicating a location of the digital file when the result of the comparison indicates that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 9. The method of claim 8 further comprising storing the location identifier if the file has been received more than a storage threshold number of times.
- 10. The method of claim 1 further comprising storing a location identifier for a previously-stored digital file corresponding to one of the stored digital signatures when the result of the comparison reveals that the one stored digital signature matches the digital signature for the digital file received.
- 11. The method of claim 10 further comprising not redundantly storing the digital file when a result of the comparison reveals that the digital signature of the digital file corresponds to at least one of the stored digital signatures.
- 12. The method of claim 1 wherein determining the digital signature includes applying a hashing technique to all or part of the digital file.
- 13. The method of claim 12 wherein applying the hashing technique includes applying an MD5 algorithm to the digital file.
- 14. The method of claim 12 wherein applying the hashing technique includes applying a version of an SHA algorithm to the digital file.
- 15. The method of claim 1 wherein the digital signature is determined from less than all of the digital file.
- 16. The method of claim 1 wherein the digital signature is determined based on a name of the digital file.
- 17. The method of claim 1 wherein the digital signature is determined based on a size of the digital file.
- 18. The method of claim 1 further comprising verifying that the digital file received by the receiving system corresponds to a stored digital file.
- 19. The method of claim 18 wherein verifying that the digital file corresponds to the stored digital file includes verifying that at least a portion of a name of the digital file corresponds to at least a portion of a name of the stored digital file.
- 20. The method of claim 18 wherein verifying that the digital file corresponds to the stored digital file includes verifying based on a size of the digital file.
- 21. The method of claim 18 wherein verifying that the digital file corresponds to the stored digital file includes verifying based on a hash performed on the digital file.
- 22. The method of claim 18 wherein verifying that the digital file corresponds to the stored digital file includes verifying based on data in the digital file.
- 23. The method of claim 1 further comprising adding a counter set to an initial value when adding the digital signature to the stored digital signatures.
- 24. The method of claim 23 further comprising incrementing the counter when the digital signature is determined to match one of the stored digital signatures.
- 25. The method of claim 23 further comprising decrementing the counter when a user deletes a user copy of the digital file.
- 26. The method of claim 23 further comprising deleting the digital file when the counter is decremented below a file deletion threshold.
- 27. The method of claim 23 further comprising removing the digital signature from the stored digital signatures of stored digital files when the counter falls below a signature deletion threshold.
- 28. The method of claim 23 further comprising deleting the location identifier when the counter is decremented below a location identifier threshold.
- 29. The method of claim 1 wherein determining whether to store the digital file includes determining whether the digital file has been replaced with a location identifier a high volume threshold number of times per stored instance.
- 30. The method of claim 29 further comprising getting the location identifier for a previously-stored version of the digital file when the digital file has not been replaced a high volume threshold number of times per stored instance.
- 31. The method of claim 30 further comprising replacing the digital file with the location identifier.
- 32. The method of claim 29 further comprising storing the digital file when the digital file has been replaced a high volume threshold number of times per stored instance.
- 33. The method of claim 32 further comprising storing the location identifier for the stored digital file.
- 34. An apparatus for reducing duplication of files in a network system, the apparatus comprising:
an interface structured and arranged to receive a digital file; at least one signature processor structured and arranged to determine a digital signature of the digital file; a comparing device structured and arranged to compare the digital signature against stored digital signatures of digital files accessible by the receiving system; and at least one decision processor that is structured and arranged to determine whether to store the digital file based on a result of the comparison performed by the comparing device.
- 35. The apparatus of claim 34 wherein the digital file includes an electronic mail message.
- 36. The apparatus of claim 34 wherein the decision processor is structured and arranged to store the digital file when the result of the comparison reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 37. The apparatus of claim 36 wherein the decision processor is structured and arranged to add the digital signature to the stored digital signatures when the result of the comparison performed by the comparing device reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 38. The apparatus of claim 36 wherein the decision processor is structured and arranged to create a location identifier for the digital file indicating a location of the digital file when the result of the comparison performed by the comparing device reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 39. The apparatus of claim 38 wherein the decision processor is structured and arranged to store the location identifier with the digital signature of the digital file after files with the digital signature have been received more than a storage threshold number of times.
- 40. The apparatus of claim 34 wherein the decision processor is structured and arranged to store the digital file when the comparison reveals that the digital signature of the digital file is found in the stored digital signatures of the digital files accessible to the receiving system.
- 41. The apparatus of claim 40 wherein the decision processor is structured and arranged to not redundantly store a location identifier for a digital file corresponding to one of the digital signatures when the result of the comparison reveals that the one stored digital file signature matches the digital signature for the digital file received.
- 42. The apparatus of claim 34 wherein the signature processor is structured and arranged to determine the digital signature by applying a hashing technique to all or part of the digital file.
- 43. The apparatus of claim 42 wherein the signature processor is structured and arranged to determine the digital signature by applying an MD5 algorithm to the digital file.
- 44. The apparatus of claim 42 wherein the signature processor is structured and arranged to determine the digital signature by applying a version of an SHA algorithm to the digital file.
- 45. The apparatus of claim 34 wherein the signature processor is structured and arranged to determine the digital signature from less than all of the digital file.
- 46. The apparatus of claim 34 wherein the signature processor is structured and arranged to determine the digital signature based on a name of the digital file.
- 47. The apparatus of claim 34 wherein the signature processor is structured and arranged to determine the digital signature based on a size of the digital file.
- 48. The apparatus of claim 34 wherein the signature processor is structured and arranged to verify that the digital file received corresponds to a stored digital file.
- 49. The apparatus of claim 34 wherein the decision processor is structured and arranged to include adding a counter set to an initial value when the digital signature of the digital file is added to the stored digital signatures of digital files.
- 50. The apparatus of claim 34 further comprising a user interface enabling a user to access the digital file.
- 51. The apparatus of claim 34 wherein the user interface includes an electronic mailbox.
- 52. The apparatus of claim 51 wherein the electronic mailbox includes one or more location identifiers.
- 53. The apparatus of claim 34 further comprising one or more SMTP relays.
- 54. The apparatus of claim 34 further comprising a file separator structured and arranged to separate the digital file into one or more constituent components.
- 55. The apparatus of claim 54 wherein at least one of the constituent components includes header information.
- 56. The apparatus of claim 54 wherein at least one of the constituent components is content of an electronic mail message.
- 57. The apparatus of claim 54 wherein at least one of the constituent components is an attachment.
- 58. The apparatus of claim 54 wherein the device is structured and arranged to create a link between more than one constituent component of the digital file.
- 59. The apparatus of claim 58 wherein the link includes a location identifier.
- 60. The apparatus of claim 34 wherein the signature processor is structured and arranged to determine the digital signature of the digital file received by the receiving system on a device physically distinct from the interface structured and arranged to receive a digital file.
- 61. The apparatus of claim 60 wherein the signature processor forwards one or more digital signatures to a data store of digital signatures for digital files accessible to the receiving system.
- 62. The apparatus of claim 34 wherein a local receiving system in a group of two or more receiving systems maintains a local data store of digital signatures corresponding to digital files received by the local receiving system.
- 63. A computer program, stored on a computer readable medium, for reducing duplication of files in a network system comprising one or more sending nodes and one or more receiving systems, the computer program comprising:
a signature processing code segment that is operable to make a computer processor determine a digital signature for a digital file received by a receiving system; a comparing code segment that is operable to make a computer processor compare the digital signature against stored digital signatures of digital files accessible by the receiving system; and a decision code segment that is operable to make a computer processor determine whether to store the digital file based on a result of the comparison performed by the comparing code segment.
- 64. The computer program of claim 63 wherein the decision code segment is structured and arranged to add the digital signature to the stored digital signatures when the result of the comparison performed by the comparing code segment reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 65. The computer program of claim 63 wherein the decision code segment is structured and arranged to create a location identifier for the digital file indicating a location of the digital file when the result of the comparison performed by the comparing code segment reveals that the digital signature of the digital file does not correspond to any of the stored digital signatures.
- 66. The computer program of claim 65 wherein the decision code segment is structured and arranged to store the location identifier with the digital signature of the digital file after files with the digital signature have been received more than a storage threshold number of times.
- 67. The computer program of claim 63 wherein the decision code segment is structured and arranged to store the digital file when the comparison reveals that the digital signature of the digital file is found in the stored digital signatures of the digital files accessible to the receiving system.
- 68. The computer program of claim 67 wherein the decision code segment is structured and arranged to store a location identifier for a digital file corresponding to one of the digital signatures when the result of the comparison reveals that the one stored digital file signature matches the digital signature for the digital file received.
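By way of non-limiting illustration only, the steps recited in claims 1, 5, 7, 8, 10, 11, and 23 to 27 might be sketched in Python as follows. The class `DedupStore`, its method names, the `blob/` location-identifier format, the in-memory dictionaries, and the default threshold value are all hypothetical and are not taken from the claims. SHA-256 is assumed here as one possible version of an SHA algorithm, and the verification step of claims 18 to 22 is omitted for brevity.

```python
import hashlib
from dataclasses import dataclass

PREFIX = "blob/"  # hypothetical location-identifier format


@dataclass
class Entry:
    location: str   # location identifier of the single stored copy
    count: int = 1  # counter set to an initial value when the signature is added


class DedupStore:
    """Hypothetical in-memory sketch of a receiving system's data store."""

    def __init__(self, file_deletion_threshold=1):
        self.signatures = {}  # digital signature -> Entry
        self.files = {}       # location identifier -> file contents
        self.file_deletion_threshold = file_deletion_threshold

    @staticmethod
    def signature(data: bytes) -> str:
        # Assumption: SHA-256 over the whole file; the claims also allow
        # MD5 or a signature derived from only part of the file.
        return hashlib.sha256(data).hexdigest()

    def receive(self, data: bytes) -> str:
        """Return a location identifier, storing the file only if it is new."""
        sig = self.signature(data)
        entry = self.signatures.get(sig)
        if entry is not None:
            # Signature matches a stored one: do not store the file again,
            # increment the counter and reuse the existing location.
            entry.count += 1
            return entry.location
        # No match: store the file, its signature, and a new location identifier.
        location = PREFIX + sig
        self.files[location] = data
        self.signatures[sig] = Entry(location)
        return location

    def delete_user_copy(self, location: str) -> None:
        """Decrement the counter; drop file and signature below the threshold."""
        sig = location[len(PREFIX):]
        entry = self.signatures[sig]
        entry.count -= 1
        if entry.count < self.file_deletion_threshold:
            del self.files[location]
            del self.signatures[sig]
```

In this sketch, two users receiving the same attachment would each hold the same location identifier while only one copy is kept, and the copy is removed only after both users delete their references.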
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Application No. 60/334,578, filed Dec. 3, 2001, and entitled “REDUCING DUPLICATION OF FILES ON A NETWORK.”
Provisional Applications (1)

| Number | Date | Country |
| --- | --- | --- |
| 60334578 | Dec 2001 | US |