The present application includes a computer program listing appendix on compact disc. Two duplicate compact discs are provided herewith. Each compact disc contains a plurality of files of the computer program listing as follows:
Converted to ASCII Files:
The computer program listing appendix is hereby expressly incorporated by reference in the present application.
The invention relates generally to the certification of files, and more particularly, to a method and apparatus for using digital fingerprinting to certify the content and date associated with a file.
File integrity is critical in today's business environment. Every business has critical business records, for example, compliance records for the Health Insurance Portability and Accountability Act of 1996 (HIPAA), as well as internal control files for managing customers, manufacturing processes, and other sensitive areas. These records are only as good as the company's ability to prove their integrity, that is, the ability to prove specific content at a specific point in time.
Electronic records have many advantages over paper records. Unfortunately, electronic records can be easily modified, rendering these records less reliable in terms of integrity. This lack of reliability complicates efforts to demonstrate control of files and processes in the event of business or legal proceedings.
Thus, there is a long-felt need to provide a means to ensure integrity of electronic files.
The invention broadly comprises a computer-based method for certifying files using a specially programmed computer. The method sets parameters for identifying files to process and parameters for a processing schedule. An identified file is digitally fingerprinted and, in some aspects, the fingerprint is compared to fingerprints of previously processed files. If the fingerprints for the file do not match any of the fingerprints of previously processed files, the file has not been processed. Then, in some aspects, a copy of the file is archived. In some aspects, the archived file is renamed and/or converted to a read-only file. Processing also includes creating a Bulk Certification Record (BCR), adding the fingerprint to the BCR, and generating log and detail files listing details of the method operation. At the end of a session, the method transmits the BCR to a base computer, which compiles BCR information into a Daily Certification Record (DCR). A digital fingerprint is made of the DCR, and the DCR and the DCR fingerprint are given a respective sequential number. The method also publishes the DCR, DCR fingerprint, and the respective sequential numbers both electronically and in print media. The present invention also includes an apparatus to certify a file.
It is a general object of the present invention to provide a method and apparatus for maintaining the integrity of electronic files.
It is another object of the present invention to provide a method and apparatus for certifying the content of an electronic file and the time and date associated with the content.
It is still another object of the present invention to provide a method and apparatus for storing and managing certified electronic files.
It is a further object of the present invention to provide a method and apparatus for publicly publishing certification records regarding certified electronic files.
These and other objects and advantages of the present invention will be readily appreciable from the following description of preferred embodiments of the invention and from the accompanying drawings and claims.
FIGS. 2a and 2b are a process flow chart illustrating a present invention computer-based method and apparatus for certifying a file;
FIGS. 3a through 3f are a programming flow chart for a present invention method and apparatus;
FIGS. 3g and 3h are a programming flow chart further illustrating the collection of digital fingerprints shown in
FIGS. 3i and 3j are a programming flow chart further illustrating the transmission of collected digital fingerprints shown in
At the outset, it should be appreciated that like drawing numbers on different drawing views identify identical, or functionally similar, structural elements of the invention. While the present invention is described with respect to what is presently considered to be the preferred aspects, it is to be understood that the invention as claimed is not limited to the disclosed aspects.
Furthermore, it is understood that this invention is not limited to the particular methodology, materials and modifications described and as such may, of course, vary. It is also understood that the terminology used herein is for the purpose of describing particular aspects only, and is not intended to limit the scope of the present invention, which is limited only by the appended claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs. Although any methods, devices or materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods, devices, and materials are now described.
In the drawings and written description of the invention, we utilize screen captures taken while operating the software to illustrate the best mode of the invention known to the inventors at the time of application for patent and to enable those having ordinary skill in the art to use the invention. We also include an appendix containing the source code for the computer program of the invention to enable one having ordinary skill in the art to make the invention. The software of the present invention is operatively arranged to operate with a conventional web browser, such as those commercially available from Netscape or Microsoft Corporation. It should be understood that the present invention is not limited to any particular web browser. The present invention is compatible with a variety of operating systems, for example Windows 2000 and Windows XP. It should be understood that the present invention is not limited to any particular operating system.
Processing element 18 includes configuring element 22 and transceiver 24. Computer 14 includes packaging element 26. The general operation of each of the elements noted above is now briefly described. Detailed descriptions regarding these operations are provided below. Element 22 is used to set the run schedule for the apparatus and to set various file parameters associated with operation of apparatus 10. Transceiver element 24 sends information regarding the first file, typically after the file is processed by element 18, to packaging element 26. Packaging element 26 receives the information regarding the processed file and performs operations to complete the certification of the file.
Regarding the run schedule, in general, the certification process for apparatus 10 is defined by a certification period, for example, a 24-hour period. It should be understood that the apparatus 10 is not limited to any particular time duration for a certification period. The files and associated fingerprints processed by apparatus 10 during a certification period are certified as a group at the end of the period. The general cycle of operations performed by computer 12 can be referred to as the fingerprinting operations. Each execution of these operations is called a session or run. Apparatus 10 can execute multiple sessions within a certification period. For example, within a 24-hour certification period, hourly sessions can be performed. The intervals for the sessions can be default settings in element 22 or can be modified by a user via user interface 28. Also, a user can manually initiate a session at any time using interface 28. It should be understood that the operations for apparatus 10 are applicable to more than one file during a respective session or certification period.
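The relationship between a certification period and the sessions within it can be sketched as follows. This is a minimal illustration only; the function name `session_times` and its parameters are hypothetical and are not taken from the program listing appendix. It assumes a fixed interval between sessions and ignores manually initiated sessions.

```python
from datetime import datetime, timedelta


def session_times(period_start: datetime,
                  period_length: timedelta = timedelta(hours=24),
                  interval: timedelta = timedelta(hours=1)) -> list[datetime]:
    """Return the scheduled start time of each session in one certification period."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    end = period_start + period_length
    times = []
    current = period_start
    # Sessions start at period_start and repeat until the period ends.
    while current < end:
        times.append(current)
        current += interval
    return times
```

With the defaults shown, a 24-hour certification period yields 24 hourly sessions; a different period length or interval can be passed in the same way.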
File parameters in element 22 also can be default settings or can be inputted or modified by a user via interface 28. In some aspects, file parameters include file locations, file identifiers, archive bit control, and selection of a location in which to store digital fingerprints. In some aspects, copying a selected file to archive 20 is optional and an archive select is included among the file parameters. In some aspects, renaming a file copy in archive 20, further described below, and/or converting a file copy in archive 20 to a read-only file, also described further below, are optional. In these cases, file parameters include a rename select and a read-only select, respectively. File locations refer to locations in which to look for files to certify. For example, searches can be directed to specific folders or file locations. File identifiers refer to identification of files to certify. Files may be selected based on a number of criteria, including time of last modification or the file name matching a specific pattern. When multiple folders are specified for scanning, each folder may have its own selection criteria. Some programs include an archive bit that lets other programs know if the file has been backed up or otherwise archived. For one aspect of archive bit control, files that have the archive bit set are selected for certification processing. For another aspect of archive bit control, for files having an archive bit, the bit is cleared after the file is fingerprinted.
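As an illustration of per-folder selection criteria, the following sketch selects files by name pattern and time of last modification. The names `FolderCriteria` and `select_files` are hypothetical, archive bit handling is omitted because it is platform specific, and nothing here is taken from the program listing appendix.

```python
import fnmatch
import os
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class FolderCriteria:
    """Hypothetical selection criteria for one folder to scan."""
    folder: str
    name_pattern: str = "*"                  # e.g. "*.doc"
    modified_since: Optional[float] = None   # POSIX timestamp; None means no limit


def select_files(criteria_list: list[FolderCriteria]) -> Iterator[str]:
    """Yield paths of files that satisfy the criteria of the folder they are in."""
    for criteria in criteria_list:
        for root, _dirs, names in os.walk(criteria.folder):
            for name in sorted(names):
                # File identifier: the name must match the folder's pattern.
                if not fnmatch.fnmatch(name, criteria.name_pattern):
                    continue
                path = os.path.join(root, name)
                # File identifier: skip files last modified before the cutoff.
                if (criteria.modified_since is not None
                        and os.path.getmtime(path) < criteria.modified_since):
                    continue
                yield path
```

Because each `FolderCriteria` carries its own pattern and cutoff, two folders can be scanned in one pass with different selection rules.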
In response to the run schedule parameters in element 22, element 18 initiates the fingerprinting operations, further described below, in computer 12. For example, if an hourly run schedule is selected in element 22, element 18 initiates the fingerprinting operations each hour until a period ends. Element 18 searches or “crawls” the locations designated by the search parameters and identifies files meeting the file identifier parameters. If, within a run, no files are found meeting the identifier parameters, element 18 sends a corresponding signal to report generator 30. For each selected file, element 18 computes a digital fingerprint. This fingerprint (sometimes called a file signature or hash) is computationally unique to the contents of the file. This means that any modification to the file, no matter how slight, results in a different fingerprint value. This fingerprint is a one-way value. This means the fingerprint is computed based on file contents but the file contents can in no way be determined given a fingerprint. The present invention utilizes industry standard algorithms, such as Message Digest 5 (MD5) or Secure Hash Algorithm 1 (SHA1), for computing the fingerprint. It should be understood, however, that any suitable fingerprint algorithm known in the art can be used by the present invention.
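A digital fingerprint of the kind described can be computed with a standard hashing library. The sketch below uses Python's `hashlib`; the function name `fingerprint_file` and the chunk size are illustrative assumptions, not part of the program listing appendix.

```python
import hashlib


def fingerprint_file(path: str, algorithm: str = "sha1",
                     chunk_size: int = 65536) -> str:
    """Return the hexadecimal digest of a file's contents.

    The file is read in chunks so that large files need not fit in memory.
    `algorithm` may be any name accepted by hashlib.new, such as "md5" or "sha1".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Changing a single byte of the file changes the returned value, which is the property the certification relies on.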
As described below, a copy of the digital fingerprint for each file selected for certification in computer 12 is stored in fingerprint memory location 32. Although location 32 is shown in the same computer 12 as the processing element, it should be understood that location 32 can be in a different computer 12 (not shown), and that the disposition of location 32 in different computers is included in the spirit and scope of the claims. It is possible that a digital fingerprint for the selected file already exists in computer 12, for example, the selected file has been previously certified by apparatus 10. Therefore, to prevent unnecessary operations in apparatus 10 and to prevent archive 20 from being overburdened with duplicate files, element 18 determines if the selected file has already been certified. In some aspects, the foregoing determination regarding previous certification is optional. In these aspects, the file parameters noted above include a “fingerprint repeat” select to enable or disable the determination function. Since the digital fingerprint for each file certified by apparatus 10 is stored in element 32, in some aspects, element 18 includes a comparison element 33 that compares the digital fingerprint of the selected file to the fingerprints stored in element 32. If the print for the selected file matches a print in location 32, the selected file has been previously certified and does not require further processing. Then, operations on the selected file are suspended. If the print for the selected file does not match any print in location 32, the selected file, hereafter referred to as the subject file, has not yet been certified, and the subject file and subject file fingerprint are further operated upon by processing element 18.
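The comparison against previously stored fingerprints, including the optional “fingerprint repeat” select, might look like the following. The class `FingerprintStore` is a hypothetical in-memory stand-in for location 32, and `needs_processing` is an illustrative name; a real store would persist its contents.

```python
class FingerprintStore:
    """In-memory stand-in for the fingerprint memory location (hypothetical)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_certified(self, fingerprint: str) -> bool:
        """Return True if this fingerprint was stored by an earlier run."""
        return fingerprint in self._seen

    def add(self, fingerprint: str) -> None:
        """Record the fingerprint of a newly processed file."""
        self._seen.add(fingerprint)


def needs_processing(store: FingerprintStore, fingerprint: str,
                     check_repeats: bool = True) -> bool:
    """Decide whether a selected file should be processed further.

    With check_repeats disabled, every selected file is treated as a
    subject file, matching the aspects in which the determination is optional.
    """
    if check_repeats and store.is_certified(fingerprint):
        return False
    return True
```

A set gives constant-time membership tests, so the check stays cheap as the number of certified files grows.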
The first time a file is identified for certification within a certification period, element 18 creates ticket storage element 34, also known as a Bulk Certification Record (BCR). The BCR is a ticket that identifies the aggregating point for digital fingerprints in a given certification period. The BCR includes a detailed record or text file. Alternately, the same information in the BCR can be populated into a database at the user's election. After creating the BCR, element 18 signals transceiver element 24, which relays the signal to base transceiver element 36 in packaging element 26. Base transceiver element 36 assigns a BCR identifier (BCRI) for ticket storage element 34 and transmits the BCRI to transceiver 24. Transceiver 24 transmits the BCRI to ticket storage element 34. In some aspects, this value consists of the text “IPBCR” followed by 9 digits. Once the BCR is in place, element 18 adds the fingerprint for the subject file to the BCR. In some aspects, the BCR stores, for each digital fingerprint in the BCR, the time and/or date the digital fingerprint was created, and/or the file name.
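A BCR and its identifier can be modeled roughly as shown below. Only the identifier shape (the text “IPBCR” followed by 9 digits) comes from the description; how the base computer chooses the digits is not stated, so the integer `serial` argument is an assumption, as are the names `format_bcri`, `BcrEntry`, and `BulkCertificationRecord`.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# "IPBCR" followed by exactly 9 digits, as described for some aspects.
BCRI_PATTERN = re.compile(r"IPBCR\d{9}")


def format_bcri(serial: int) -> str:
    """Format a serial number as a BCRI (the serial source is an assumption)."""
    if not 0 <= serial <= 999_999_999:
        raise ValueError("serial must fit in 9 digits")
    return f"IPBCR{serial:09d}"


@dataclass
class BcrEntry:
    fingerprint: str
    file_name: str
    created_at: str  # ISO 8601 timestamp of fingerprint creation


@dataclass
class BulkCertificationRecord:
    bcri: str
    entries: list[BcrEntry] = field(default_factory=list)

    def add(self, fingerprint: str, file_name: str,
            created_at: Optional[datetime] = None) -> None:
        """Append a fingerprint with its file name and creation time."""
        stamp = created_at or datetime.now(timezone.utc)
        self.entries.append(BcrEntry(fingerprint, file_name, stamp.isoformat()))
```

Entries are kept in insertion order, which matters later when the order of fingerprints within the record is validated.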
In some aspects, element 18 automatically stores a copy of the subject file in archive 20. In some aspects, element 18 stores a copy of the subject file in archive 20 in response to a selection made by a user of apparatus 10, as described above for file parameters. In some aspects, element 18 automatically converts the file in archive 20 to a read-only file. This option prevents a user from inadvertently modifying a file that has been certified and archived, since such modification invalidates the original certification of the file. That is, the contents of the modified file would no longer match the contents of the file at the time the file was originally fingerprinted and certified. In some aspects, element 18 converts the file in archive 20 to a read-only file in response to a selection made by a user of apparatus 10, as described above for file parameters.
Processing element 18 simplifies operation of apparatus 10 for the user by making it easy for the user to select files to certify, save copies of certified files, and identify files that have been certified. For example, the user does not need to execute any steps beyond those already required for the particular program, for example, a word processing program, being used to generate or modify the file, once apparatus 10 is configured. In some aspects, element 18 automatically renames the subject file copy in archive 20 according to the syntax selected in element 22. In some aspects, element 18 renames the subject file copy in archive 20 according to the syntax selected in element 22 in response to a selection made by a user of apparatus 10, as described above for file parameters. In some aspects, the rename includes the original name for the selected file, to facilitate later identification of the file copy, and appends an identifier related to the certification process. For example, a file entitled “test.doc” can be modified to “test<>.doc”, where <> is the identifier. In some aspects, the identifier is the date and/or time of day that the file was digitally fingerprinted.
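The archive copy, rename, and read-only conversion can be sketched as one small routine. The timestamp layout (`_YYYYMMDD_HHMMSS` before the extension) is an assumed rename syntax chosen for illustration, and the names `archive_name` and `archive_copy` are hypothetical; the description leaves the actual syntax to the configuration.

```python
import os
import shutil
import stat
from datetime import datetime


def archive_name(file_name: str, stamp: datetime) -> str:
    """Keep the original name and insert a date/time identifier before the extension."""
    base, ext = os.path.splitext(file_name)
    return f"{base}_{stamp:%Y%m%d_%H%M%S}{ext}"


def archive_copy(path: str, archive_dir: str, stamp: datetime,
                 read_only: bool = True) -> str:
    """Copy a subject file into the archive under its certification name."""
    os.makedirs(archive_dir, exist_ok=True)
    target = os.path.join(archive_dir,
                          archive_name(os.path.basename(path), stamp))
    shutil.copy2(path, target)  # copy2 also preserves file timestamps
    if read_only:
        # Clear all write permission bits on the archived copy.
        os.chmod(target, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target
```

Under this assumed syntax, “test.doc” fingerprinted at 14:30:00 on 1 March 2005 would be archived as “test_20050301_143000.doc”.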
Generator 30 can provide a report for each session completed. The reports can be sent to computer 12, for example, to user interface 28 or to a database in computer 12. In some aspects, the user can select the database location using interface 28. Also, reports can be sent using email element 38. Generator 30 can provide a report for a successful session or a report for an unsuccessful session.
At the end of each session, element 18 passes the digital fingerprints in the BCR and the BCRI to transceiver element 24, which transmits the contents to base transceiver element 36 in computer 14. Only the fingerprints of the files, not the files themselves, are transmitted. In some aspects, the BCR passes the date and/or time a digital fingerprint in the BCR was created. In some aspects, a file name for a digital fingerprint in the BCR is passed to element 24. In some aspects, the BCRI is written to the application log file, and can be included in any “success” message. Thus, the BCRI provides a means of tracing the transmission of a specific fingerprint to computer 14. Typically, transceiver 24 communicates with transceiver 36 using a network connection. It should be understood that any type of network connection known in the art can be used by apparatus 10. Examples of possible network connections include the Internet, FTP, and VPN. The first step in the communication is to verify information in a user file, identifying computer 12, so that fingerprint information can be attributed to a session specific to an account associated with computer 12. Multiple user files can be supplied to a single site, and the selection of the appropriate file is specified in a file in computer 14.
During transmission, element 24 constructs a session digital fingerprint, also referred to as a composite digital fingerprint, which is based on the data fingerprints accumulated during a respective session and their sequence within the BCR. In some aspects, the composite digital fingerprint incorporates the date and/or time a digital fingerprint in the BCR was created. The session fingerprint validates the set of fingerprints included in the session, and their order in the session. After all individual fingerprints are transmitted, the session fingerprint is transmitted to transceiver 36 for validation by computer 14. Transceiver 36 constructs a second session fingerprint for the fingerprint data received at computer 14. If there is a mismatch between the session fingerprint sent from computer 12 and the value computed by transceiver 36, this indicates that an error has occurred during transmission, and transceiver 36 sends an error message to transceiver 24. In turn, transceiver 24 notifies generator 30, which can provide a report regarding the error.
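One way to build a composite fingerprint that depends on both the set of fingerprints and their order is to hash them in sequence together with their positions. The exact construction used by the invention is not given in this description, so the following is an assumed scheme for illustration, with hypothetical names `session_fingerprint` and `verify_session`.

```python
import hashlib
import hmac


def session_fingerprint(fingerprints: list[str], algorithm: str = "sha1") -> str:
    """Hash the fingerprints in order, so any change or reordering alters the result."""
    digest = hashlib.new(algorithm)
    for index, fingerprint in enumerate(fingerprints):
        # Including the index and a separator keeps entry boundaries unambiguous.
        digest.update(f"{index}:{fingerprint}\n".encode("utf-8"))
    return digest.hexdigest()


def verify_session(received: list[str], claimed: str) -> bool:
    """Receiver side: recompute the session fingerprint and compare with the sender's."""
    return hmac.compare_digest(session_fingerprint(received), claimed)
```

If `verify_session` returns False, the receiver would report a transmission error back to the sender, which corresponds to the mismatch case described above.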
Computer 14 also includes compiling element 40, sequencing element 42, and publishing element 44. Typically, computer 14 is enabled to receive BCR information from multiple users (not shown). In some aspects, computer 14 also receives other unrelated files corresponding to other documentation processes. Compiling element 40 creates a periodic summary file, which summarizes the activities of computer 14 in the course of a certification period. In some aspects, this summary file is called a Daily Certification Record (DCR). Thus, the DCR lists the BCRs and unrelated files received during a certification period. Sequencing element 42 creates a digital fingerprint of the DCR and assigns a respective sequential number to the DCR and the digital fingerprint of the DCR. Publishing element 44 publishes the DCR, the DCR fingerprint, and the respective sequential numbers. In some aspects, element 44 publishes in an electronic registry available to the public (not shown). In some aspects, element 44 publishes in a print journal available to the public (not shown). In some aspects, the electronic registry and the print journal are published daily and monthly, respectively.
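The compile, fingerprint, and number steps on the base computer can be sketched as follows. The JSON layout of the DCR, the use of a single shared sequence number for the DCR and its fingerprint, and the names `compile_dcr`, `seal_dcr`, and `PublishedRecord` are all assumptions made for illustration; the description does not specify the record format or the numbering scheme.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedRecord:
    sequence_number: int
    dcr_text: str
    dcr_fingerprint: str


def compile_dcr(period: str, bcrs: dict[str, list[str]]) -> str:
    """Summarize one certification period as deterministic text.

    `bcrs` maps each BCR identifier to the fingerprints it contains.
    Sorting keys makes the text, and therefore its fingerprint, reproducible.
    """
    return json.dumps({"period": period, "bcrs": bcrs}, sort_keys=True, indent=2)


def seal_dcr(dcr_text: str, sequence_number: int) -> PublishedRecord:
    """Fingerprint the DCR and attach a sequential number for publication."""
    fingerprint = hashlib.sha1(dcr_text.encode("utf-8")).hexdigest()
    return PublishedRecord(sequence_number, dcr_text, fingerprint)
```

Anyone holding the published DCR text can recompute its fingerprint and compare it with the published value, which is what makes public publication useful as evidence.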
In some aspects (not shown), apparatus 10 does not copy a subject file and therefore, apparatus 10 does not include archive 20 or an alternate storage location. For these aspects, file parameters in element 22 include a read-only select, to convert a subject file to a read-only file, and a rename select, to rename a subject file. The read-only conversion and renaming operations are as described above for the copy of the subject file in archive 20. For the foregoing aspects, the remainder of the operations described above for apparatus 10 is applicable.
FIGS. 2a and 2b are a process flow chart illustrating a present invention computer-based method and apparatus for certifying a file. In
If step 66 identifies files, step 76 digitally fingerprints the identified files and compares the fingerprints to fingerprints in a fingerprint storage location selected in step 60. This location holds fingerprints for files already processed. In some aspects, the location holds fingerprints only for files processed earlier in the certification period or session. Step 78 queries the fingerprint comparison. If fingerprints match, then the identified file has already been processed and step 80 discontinues operations on the file. If fingerprints do not match, the file has not yet been processed, and step 82 processes the identified file.
If the identified file is the first file processed in the subject certification period, step 82 creates a BCR. Then, step 82 communicates with the base computer and step 86 assigns a BCR identifier (BCRI) for the BCR and communicates the BCRI to the first computer. Then, step 82 adds the fingerprint for the identified file to the BCR. If these options are selected in step 60, step 82 copies the file to the archive, appends the name for the copy in the archive according to the parameters selected in step 60 and changes the file in the archive to a read-only file. Step 82 also copies the fingerprint for the identified file to the fingerprint storage location. At the end of each session, for each file added to the BCR in that session, at least a portion of the information in the BCR is transmitted to the base computer in step 82. Step 82 also creates a session fingerprint and transmits the session fingerprint to the base computer. Step 86, in turn, computes a second session fingerprint for the information actually received in the base computer and compares the first and second session fingerprints. If the fingerprints do not match, an error has occurred during transmission and step 86 notifies the local computer of the error. Step 68 generates a report regarding the success or failure of operations in step 82.
If step 74 signals the end of the certification period, step 84 closes out the BCR. Then, step 86 creates a period summary file, in some aspects called a Daily Certification Record (DCR), and adds the BCR to the DCR. Step 88 digitally fingerprints the DCR and assigns a respective sequential number to the DCR and the digital fingerprint for the DCR. Step 90 publishes the DCR, the DCR fingerprint, and respective sequential numbers for the DCR and the DCR fingerprint in an electronic registry in the public domain. Step 92 publishes the DCR, the DCR fingerprint, and the respective sequential numbers for the DCR and the DCR fingerprint in a paper journal.
FIGS. 3a through 3f are a programming flow chart for a present invention method and apparatus.
FIGS. 3g and 3h are a programming flow chart further illustrating the collection of digital fingerprints shown in
FIGS. 3i and 3j are a programming flow chart further illustrating the transmission of collected digital fingerprints shown in
Thus, it is seen that the objects of the invention are efficiently obtained, although changes and modifications to the invention should be readily apparent to those having ordinary skill in the art, without departing from the spirit or scope of the invention as claimed. Although the invention is described by reference to a specific preferred embodiment, it is clear that variations can be made without departing from the scope or spirit of the invention as claimed.