MANIPULATING THE ORIGINAL CONTENT OF AT LEAST ONE ORIGINAL READ-ONLY COMPUTER FILE IN A COMPUTER FILE-SYSTEM IN A COMPUTER SYSTEM

Information

  • Patent Application
  • 20080183734
  • Publication Number
    20080183734
  • Date Filed
    January 31, 2007
  • Date Published
    July 31, 2008
Abstract
The present invention provides a method and system of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system. In an exemplary embodiment, the method, system, and service include (1) transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content, (2) storing the transformed content in a transformed computer file, (3) splitting the transformed computer file into a first file (F_1) and a second file (F_2), and (4) associating the first file (F_1) with the second file (F_2) in the file-system. In an exemplary embodiment, the non-length-preserving data transformation algorithm includes a length-increasing data transformation algorithm (i.e., encryption).
Description
FIELD OF THE INVENTION

The present invention relates to computer file-systems, and particularly relates to a method and system of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system.


BACKGROUND OF THE INVENTION

A computer system typically includes a computer file-system. A computer system typically includes an operating system. The operating system may include a framework for in-line monitoring of accesses to the file-system. Such a framework could be a file-system filter driver.


Need for Manipulating the Original Content of an Original Read-Only Computer File

Such a file-system filter driver would logically reside above the file-system stack and would have the ability to monitor and modify input/output requests that are sent to and completed from the underlying file-system. In addition, such a file-system filter driver could allow sophisticated file-data manipulation features, such as file data encryption and file data compression. Some modern operating systems support using file-system filter drivers to perform non-length preserving data transformations (e.g., file data encryption, file data compression). However, other operating systems do not provide such support. Thus, for those operating systems, there is a need to manipulate the original content of an original read-only computer file.


Challenges in Manipulating the Original Content of an Original Read-Only Computer File

For example, an operating system based on Microsoft Corporation's Windows NT kernel (e.g., Windows 2000, Windows XP, Windows Server 2003) does not support manipulating the original content of an original read-only computer file. Specifically, such an operating system does not support using file-system filter drivers to perform non-length preserving data transformations. Namely, in such an operating system, the underlying file-system discloses the on-disk length of the file to the operating system's cache manager directly, without giving any of the mounted filter drivers a chance to transform the length appropriately. For example, if a file were encrypted using an algorithm that increases the file length (i.e., transformed via a non-length preserving data transformation), the cache manager would see the encrypted file length, which is larger than the decrypted length of the file. If the file were paged in by the cache manager for caching, the cache manager would attempt to stream in data past the decrypted end-of-file. Such an attempt would lead to incorrect computer system behavior or an application crash. Specifically, if the encrypted file were a kernel driver for the operating system, the computer system could display an operating system blue-screen and/or experience kernel panics.
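By way of illustration only, the length mismatch described above can be sketched numerically. The sketch assumes a block cipher with 16-byte blocks and PKCS#7-style padding; neither is specified in this application, and the function name is hypothetical.

```python
BLOCK_SIZE = 16  # assumed cipher block size in bytes (not specified in the source)


def padded_length(original_length: int, block_size: int = BLOCK_SIZE) -> int:
    """Length after PKCS#7-style padding, which always adds 1..block_size bytes."""
    return (original_length // block_size + 1) * block_size


original_length = 1000                       # decrypted length of the file
on_disk_length = padded_length(original_length)
# Number of bytes a cache manager would try to read past the decrypted end-of-file
overrun = on_disk_length - original_length
```

Under these assumptions, a 1000-byte file occupies 1008 bytes on disk, so a reader that trusts the on-disk length overruns the decrypted content by 8 bytes.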


Prior Art

As shown in prior art FIG. 1, a typical prior art system (1) transforms the original content of a computer file via a non-length preserving data transformation algorithm, thereby resulting in transformed content, (2) stores the transformed content in a transformed computer file, (3) creates an in-memory state version of the original content, (4) implements each file system operation to support input/output to the in-memory state version, and (5) caches the in-memory state version. Unfortunately, such a system duplicates the functionality of the underlying file system by implementing a mini-file system in (3). Also, the system duplicates file system input/output support for the in-memory state version in (4). Also, the system duplicates file system caching in (5).


Therefore, a method and system of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system, is needed.


SUMMARY OF THE INVENTION

The present invention provides a method and system of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system. In an exemplary embodiment, the method and system include (1) transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content, (2) storing the transformed content in a transformed computer file, (3) splitting the transformed computer file into a first file (F_1) and a second file (F_2), and (4) associating the first file (F_1) with the second file (F_2) in the file-system. In an exemplary embodiment, the non-length-preserving data transformation algorithm includes a length-increasing data transformation algorithm (i.e., encryption).


In an exemplary embodiment, the splitting includes (a) writing an amount of the transformed content to the first file (F_1) that equals the size of the original computer file and (b) saving the remainder of the transformed content to the second file (F_2). In an exemplary embodiment, the writing includes writing the first N bytes of the transformed computer file to the first file (F_1), where N equals the length of the original computer file. In an exemplary embodiment, the saving includes saving the bytes after the first N bytes of the transformed computer file to the second file (F_2). In an exemplary embodiment, the saving includes denying direct open requests for the second file (F_2).


In an exemplary embodiment, the associating includes, (a) if the file-system supports at least one alternate data stream, writing the second file (F_2) as an alternate data stream of the first file (F_1) and, (b) if the file-system does not support at least one alternate data stream, naming the second file (F_2) with a name derived from the name of the first file (F_1). In a further embodiment, the associating includes, if the file-system does not support at least one alternate data stream, removing the second file (F_2) from a directory listing of the file-system. In a further embodiment, the naming includes naming the second file (F_2) with a name derived from a cryptographic hash of at least the name of the first file (F_1). In a further embodiment, the naming includes naming the second file (F_2) with a name derived from a cryptographic hash of the name of the first file (F_1) and the data of the first file (F_1).


In a further embodiment, the present invention further includes, if a read request is received for the first file (F_1), retrieving the original content from the first file (F_1). In a further embodiment, the retrieving includes (a) recognizing that the first file (F_1) resulted from the transforming, (b) locating the second file (F_2) associated with the first file (F_1), (c) given the byte offset and byte length of the request and based on the non-length-preserving data transformation algorithm used in the transforming, determining whether to retrieve the transformed content from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2), (d) retrieving the transformed content, based on the determining, from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2), (e) performing the inverse transformation of the non-length-preserving data transformation algorithm on the retrieved transformed content, thereby resulting in the original content, and (f) returning the original content.


In a further embodiment, the present invention further includes, if a close request is received for the first file (F_1), closing the first file (F_1) and the second file (F_2).


The present invention also provides a computer program product usable with a programmable computer having readable program code embodied therein of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system. In an exemplary embodiment, the computer program product includes (1) computer readable code for transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content, (2) computer readable code for storing the transformed content in a transformed computer file, (3) computer readable code for splitting the transformed computer file into a first file (F_1) and a second file (F_2), and (4) computer readable code for associating the first file (F_1) with the second file (F_2) in the file-system.


The present invention also provides a method of providing a service to manipulate the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system. In an exemplary embodiment, the method includes (1) transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content, (2) storing the transformed content in a transformed computer file, (3) splitting the transformed computer file into a first file (F_1) and a second file (F_2), and (4) associating the first file (F_1) with the second file (F_2) in the file-system.





THE FIGURES


FIG. 1 is a flowchart of a prior art technique.



FIG. 2 is a flowchart in accordance with an exemplary embodiment of the present invention.



FIG. 3 is a flowchart of the splitting step in accordance with an exemplary embodiment of the present invention.



FIG. 4A is a flowchart of the writing step in accordance with an exemplary embodiment of the present invention.



FIG. 4B is a flowchart of the saving step in accordance with an exemplary embodiment of the present invention.



FIG. 4C is a flowchart of the saving step in accordance with an exemplary embodiment of the present invention.



FIG. 5 is a block diagram in accordance with an exemplary embodiment of the present invention.



FIG. 6A is a flowchart of the associating step in accordance with an exemplary embodiment of the present invention.



FIG. 6B is a flowchart of the associating step in accordance with a further embodiment of the present invention.



FIG. 7A is a flowchart of the naming step in accordance with an exemplary embodiment of the present invention.



FIG. 7B is a flowchart of the naming step in accordance with an exemplary embodiment of the present invention.



FIG. 8A is a block diagram in accordance with an exemplary embodiment of the present invention.



FIG. 8B is a block diagram in accordance with an exemplary embodiment of the present invention.



FIG. 9A is a flowchart of the retrieving step in accordance with an exemplary embodiment of the present invention.



FIG. 9B is a flowchart of the retrieving step in accordance with a further embodiment of the present invention.



FIG. 10 is a flowchart of the closing step in accordance with an exemplary embodiment of the present invention.





DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a method and system of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, where the computer system includes an operating system including a framework for in-line monitoring of accesses to the file-system. In an exemplary embodiment, the method and system include (1) transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content, (2) storing the transformed content in a transformed computer file, (3) splitting the transformed computer file into a first file (F_1) and a second file (F_2), and (4) associating the first file (F_1) with the second file (F_2) in the file-system. In an exemplary embodiment, the non-length-preserving data transformation algorithm includes a length-increasing data transformation algorithm (i.e., encryption).


Referring to FIG. 2, in an exemplary embodiment, the present invention includes a step 212 of transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content, a step 214 of storing the transformed content in a transformed computer file, a step 216 of splitting the transformed computer file into a first file (F_1) and a second file (F_2), and a step 218 of associating the first file (F_1) with the second file (F_2) in the file-system.


Splitting

Referring to FIG. 3, in an exemplary embodiment, splitting step 216 includes a step 312 of writing an amount of the transformed content to the first file (F_1) that equals the size of the original computer file and a step 314 of saving the remainder of the transformed content to the second file (F_2).


Writing


Referring to FIG. 4A, in an exemplary embodiment, writing step 312 includes a step 412 of writing the first N bytes of the transformed computer file to the first file (F_1), where N equals the length of the original computer file.


Saving


Referring to FIG. 4B, in an exemplary embodiment, saving step 314 includes a step 422 of saving the bytes after the first N bytes of the transformed computer file to the second file (F_2).
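Writing step 412 and saving step 422 can be sketched together as a single split at byte N. The following is a minimal sketch operating on in-memory byte strings rather than on real files; the function name is hypothetical.

```python
from typing import Tuple


def split_transformed(transformed: bytes, original_length: int) -> Tuple[bytes, bytes]:
    """Split transformed content into (F_1, F_2) at N = original_length."""
    if len(transformed) < original_length:
        raise ValueError("transformed content is shorter than the original")
    first = transformed[:original_length]    # F_1: the first N bytes
    second = transformed[original_length:]   # F_2: the bytes after the first N
    return first, second


# Example: 1008 bytes of transformed content for a 1000-byte original
f1, f2 = split_transformed(b"A" * 1008, 1000)
```

Because F_1 has exactly the length of the original file, the length reported by the underlying file-system for F_1 matches the length a reader of the original content expects.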


Denying


Referring to FIG. 4C, in an exemplary embodiment, saving step 314 includes a step 432 of denying direct open requests for the second file (F_2).
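Denying step 432 might be modeled, purely as an assumption for illustration, by a filter that remembers which paths hold second files and refuses any open request for them that does not originate from the filter itself. The class, its methods, and the `internal` flag are hypothetical and do not correspond to any real driver interface.

```python
from typing import Set


class OpenFilter:
    """Toy model of a filter that guards second files (F_2) against direct opens."""

    def __init__(self) -> None:
        self._protected: Set[str] = set()

    def protect(self, path: str) -> None:
        """Register a path that holds a second file (F_2)."""
        self._protected.add(path)

    def allow_open(self, path: str, internal: bool = False) -> bool:
        """Allow the filter's own opens; refuse all other opens of protected paths."""
        return internal or path not in self._protected


open_filter = OpenFilter()
open_filter.protect("/data/report.bin.f2")
```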


Referring to FIG. 5, in an exemplary embodiment, the present invention (a) transforms the original content 510 via a non-length preserving data transformation algorithm transformer 520, thereby resulting in transformed content, (b) stores the transformed content in a transformed computer file 530, and (c) splits, via a splitter 540, the transformed computer file into a first file (F_1) 550 and a second file (F_2) 560.


Associating

Referring to FIG. 6A, in an exemplary embodiment, associating step 218 includes a step 612 of, if the file-system supports at least one alternate data stream, writing the second file (F_2) as an alternate data stream of the first file (F_1) and a step 614 of, if the file-system does not support at least one alternate data stream, naming the second file (F_2) with a name derived from the name of the first file (F_1). Referring to FIG. 6B, in a further embodiment, associating step 218 further includes a step 622 of, if the file-system does not support at least one alternate data stream, removing the second file (F_2) from a directory listing of the file-system.
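The choice made in steps 612 and 614, together with the hiding of step 622, can be sketched as follows. The sketch only computes names and filters a listing; it does not touch a real file-system. The stream name `f2`, the suffix `.f2`, and both function names are hypothetical. The `file:stream` form is the syntax NTFS uses to address a named stream.

```python
from typing import Iterable, List


def second_file_location(first_path: str, supports_ads: bool,
                         stream_name: str = "f2", suffix: str = ".f2") -> str:
    """Return where F_2 lives relative to F_1 (default names are hypothetical)."""
    if supports_ads:
        # NTFS addresses a named stream as "<file>:<stream name>"
        return f"{first_path}:{stream_name}"
    # Fallback: a sibling file whose name is derived from the name of F_1
    return first_path + suffix


def visible_entries(entries: Iterable[str], hidden: Iterable[str]) -> List[str]:
    """Filter a directory listing so that second files (F_2) are not shown."""
    hidden_set = set(hidden)
    return [name for name in entries if name not in hidden_set]
```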


Naming


Referring to FIG. 7A, in an exemplary embodiment, naming step 614 includes a step 712 of naming the second file (F_2) with a name derived from a cryptographic hash of at least the name of the first file (F_1). Referring to FIG. 7B, in a further embodiment, naming step 712 further includes a step 722 of naming the second file (F_2) with a name derived from a cryptographic hash of the name of the first file (F_1) and the data of the first file (F_1).
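Naming steps 712 and 722 can be sketched with a standard hash function. SHA-256 is an assumption made for this sketch, since the application does not name a particular cryptographic hash, and the function name is hypothetical.

```python
import hashlib
from typing import Optional


def derived_name(first_name: str, first_data: Optional[bytes] = None) -> str:
    """Derive a name for F_2 from the name of F_1 and, optionally, the data of F_1."""
    digest = hashlib.sha256()                   # assumed hash; not specified in the source
    digest.update(first_name.encode("utf-8"))   # step 712: hash at least the name
    if first_data is not None:
        digest.update(first_data)               # step 722: also hash the data of F_1
    return digest.hexdigest()


name_only = derived_name("driver.sys")
name_and_data = derived_name("driver.sys", b"first N bytes of transformed content")
```

Because the derivation is deterministic, the second file can later be located from the first file alone, without a separate lookup table.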


Referring to FIG. 8A, in an exemplary embodiment, the present invention (a) associates the first file (F_1) 550 with the second file (F_2) 560 in the file-system via an associator 812 and, (b) if the file-system supports at least one alternate data stream, writes the first file (F_1) as the primary data stream 814 of the first file (F_1) and writes the second file (F_2) as an alternate data stream 816 of the first file (F_1). Referring to FIG. 8B, in an exemplary embodiment, the present invention (a) associates the first file (F_1) 550 with the second file (F_2) 560 in the file-system via an associator 822 and, (b) if the file-system does not support at least one alternate data stream, names the second file (F_2) with a name derived from the name of the first file (F_1), such that the content of the first file (F_1) 824 is associated with the content of the second file (F_2) 826.


Retrieving the Original Content

Referring to FIG. 9A, in a further embodiment, the method, system, and service further include a step 912 of, if a read request is received for the first file (F_1), retrieving the original content from the first file (F_1). Referring to FIG. 9B, in a further embodiment, retrieving step 912 further includes a step 922 of recognizing that the first file (F_1) resulted from transforming step 212, a step 924 of locating the second file (F_2) associated with the first file (F_1), a step 926 of, given the byte offset and byte length of the request and based on the non-length-preserving data transformation algorithm used in transforming step 212, determining whether to retrieve the transformed content from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2), a step 928 of retrieving the transformed content, based on determining step 926, from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2), a step 930 of performing the inverse transformation of the non-length-preserving data transformation algorithm on the retrieved transformed content, thereby resulting in the original content, and a step 932 of returning the original content.
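Determining step 926 can be sketched as a mapping from a byte range of the transformed content onto the two files. The sketch assumes that the requested offset and length have already been converted into a range of the transformed content, a conversion that depends on the transformation algorithm and is not shown. The function name and the tuple layout are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, int, int]  # (file label, offset within that file, byte count)


def plan_read(t_offset: int, t_length: int, n: int) -> List[Segment]:
    """Map a range of the transformed content onto F_1 (length n) and/or F_2."""
    if t_length <= 0:
        return []
    segments: List[Segment] = []
    end = t_offset + t_length
    if t_offset < n:
        # Part (or all) of the range lies in the first N bytes, stored in F_1
        segments.append(("F_1", t_offset, min(end, n) - t_offset))
    if end > n:
        # Part (or all) of the range lies after byte N, stored in F_2
        start = max(t_offset, n)
        segments.append(("F_2", start - n, end - start))
    return segments
```

The three possible outcomes (F_1 only, F_2 only, or both) correspond to the three alternatives named in steps 926 and 928; the retrieved bytes would then be concatenated in order and passed to the inverse transformation of step 930.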


In a specific embodiment, recognizing step 922 includes checking a special attribute of the first file (F_1) in order to determine whether the first file (F_1) resulted from transforming step 212.


In an exemplary embodiment, if the non-length-preserving data transformation algorithm includes a length-decreasing data transformation algorithm (i.e., compression), the present invention includes padding the transformed file such that the length of the padded file (F_1) equals the length of the original file and such that the length of the second file (F_2) is 0 bytes.
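The padding described above can be sketched with a general-purpose compressor. The use of zlib and of zero bytes as padding are assumptions made for this sketch; the application names neither. The sketch relies on the zlib stream being self-terminating, so that the padding can be ignored on decompression.

```python
import zlib
from typing import Tuple


def compress_and_pad(original: bytes) -> Tuple[bytes, bytes]:
    """Return (F_1, F_2) where F_1 is padded to the original length and F_2 is empty."""
    compressed = zlib.compress(original)
    if len(compressed) > len(original):
        # Incompressible input grew instead of shrinking; padding does not apply
        raise ValueError("compressed content is longer than the original")
    padding = b"\x00" * (len(original) - len(compressed))
    return compressed + padding, b""


def recover(first: bytes) -> bytes:
    """Invert compress_and_pad; bytes after the end of the zlib stream are ignored."""
    decompressor = zlib.decompressobj()
    # Trailing padding is collected in decompressor.unused_data, not in the output
    return decompressor.decompress(first)


original = b"abc" * 500
f1, f2 = compress_and_pad(original)
```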


Closing the Files

Referring to FIG. 10, in a further embodiment, the method, system, and service further include a step 1012 of, if a close request is received for the first file (F_1), closing the first file (F_1) and the second file (F_2).
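Closing step 1012 can be sketched as a handle that owns both files and releases them together. The class name is hypothetical, and in-memory streams stand in for real file handles.

```python
import io
from typing import BinaryIO


class PairedHandle:
    """Holds the handles of F_1 and F_2 so that one close request closes both."""

    def __init__(self, first: BinaryIO, second: BinaryIO) -> None:
        self.first = first
        self.second = second

    def close(self) -> None:
        try:
            self.first.close()
        finally:
            # Close F_2 even if closing F_1 raised an error
            self.second.close()


handle = PairedHandle(io.BytesIO(b"first N bytes"), io.BytesIO(b"remainder"))
handle.close()
```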


General

The present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. In an exemplary embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, and microcode.


Furthermore, the present invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer system or any instruction execution system. The computer program product includes the instructions that implement the method of the present invention. A computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include compact disk—read only memory (CD-ROM), compact disk—read/write (CD-R/W), and DVD.


A computer system suitable for storing and/or executing program code includes at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution. Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the computer system either directly or through intervening I/O controllers. Network adapters may also be coupled to the computer system in order to enable the computer system to become coupled to other computer systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.


CONCLUSION

Having fully described a preferred embodiment of the invention and various alternatives, those skilled in the art will recognize, given the teachings herein, that numerous alternatives and equivalents exist which do not depart from the invention. It is therefore intended that the invention not be limited by the foregoing description, but only by the appended claims.

Claims
  • 1. A method of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, wherein the computer system comprises an operating system comprising a framework for in-line monitoring of accesses to the file-system, the method comprising: transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content; storing the transformed content in a transformed computer file; splitting the transformed computer file into a first file (F_1) and a second file (F_2); and associating the first file (F_1) with the second file (F_2) in the file-system.
  • 2. The method of claim 1 wherein the splitting comprises: writing an amount of the transformed content to the first file (F_1) that equals the size of the original computer file; and saving the remainder of the transformed content to the second file (F_2).
  • 3. The method of claim 2 wherein the writing comprises writing the first N bytes of the transformed computer file to the first file (F_1), wherein N equals the length of the original computer file.
  • 4. The method of claim 3 wherein the saving comprises saving the bytes after the first N bytes of the transformed computer file to the second file (F_2).
  • 5. The method of claim 2 wherein the saving comprises denying direct open requests for the second file (F_2).
  • 6. The method of claim 1 wherein the associating comprises: if the file-system supports at least one alternate data stream, writing the second file (F_2) as an alternate data stream of the first file (F_1); and if the file-system does not support at least one alternate data stream, naming the second file (F_2) with a name derived from the name of the first file (F_1).
  • 7. The method of claim 6 further comprising, if the file-system does not support at least one alternate data stream, removing the second file (F_2) from a directory listing of the file-system.
  • 8. The method of claim 6 wherein the naming comprises naming the second file (F_2) with a name derived from a cryptographic hash of at least the name of the first file (F_1).
  • 9. The method of claim 8 wherein the naming comprises naming the second file (F_2) with a name derived from a cryptographic hash of the name of the first file (F_1) and the data of the first file (F_1).
  • 10. The method of claim 1 further comprising, if a read request is received for the first file (F_1), retrieving the original content from the first file (F_1).
  • 11. The method of claim 10 wherein the retrieving comprises: recognizing that the first file (F_1) resulted from the transforming; locating the second file (F_2) associated with the first file (F_1); given the byte offset and byte length of the request and based on the non-length-preserving data transformation algorithm used in the transforming, determining whether to retrieve the transformed content from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2); retrieving the transformed content, based on the determining, from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2); performing the inverse transformation of the non-length-preserving data transformation algorithm on the retrieved transformed content, thereby resulting in the original content; and returning the original content.
  • 12. The method of claim 1 further comprising, if a close request is received for the first file (F_1), closing the first file (F_1) and the second file (F_2).
  • 13. A system of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, wherein the computer system comprises an operating system comprising a framework for in-line monitoring of accesses to the file-system, the system comprising: a transforming module configured to transform the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content; a storing module configured to store the transformed content in a transformed computer file; a splitting module configured to split the transformed computer file into a first file (F_1) and a second file (F_2); and an associating module configured to associate the first file (F_1) with the second file (F_2) in the file-system.
  • 14. The system of claim 13 wherein the splitting module comprises: a writing module configured to write an amount of the transformed content to the first file (F_1) that equals the size of the original computer file; and a saving module configured to save the remainder of the transformed content to the second file (F_2).
  • 15. The system of claim 14 wherein the writing module comprises a writing module configured to write the first N bytes of the transformed computer file to the first file (F_1), wherein N equals the length of the original computer file.
  • 16. The system of claim 15 wherein the saving module comprises a saving module configured to save the bytes after the first N bytes of the transformed computer file to the second file (F_2).
  • 17. The system of claim 14 wherein the saving module comprises a denying module configured to deny direct open requests for the second file (F_2).
  • 18. The system of claim 13 wherein the associating module comprises: a writing module configured, if the file-system supports at least one alternate data stream, to write the second file (F_2) as an alternate data stream of the first file (F_1); and a naming module configured, if the file-system does not support at least one alternate data stream, to name the second file (F_2) with a name derived from the name of the first file (F_1).
  • 19. The system of claim 18 further comprising a removing module configured, if the file-system does not support at least one alternate data stream, to remove the second file (F_2) from a directory listing of the file-system.
  • 20. The system of claim 18 wherein the naming module comprises a naming module configured to name the second file (F_2) with a name derived from a cryptographic hash of at least the name of the first file (F_1).
  • 21. The system of claim 20 wherein the naming module comprises a naming module configured to name the second file (F_2) with a name derived from a cryptographic hash of the name of the first file (F_1) and the data of the first file (F_1).
  • 22. The system of claim 13 further comprising a retrieving module configured, if a read request is received for the first file (F_1), to retrieve the original content from the first file (F_1).
  • 23. The system of claim 22 wherein the retrieving module comprises: a recognizing module configured to recognize that the first file (F_1) resulted from the transforming; a locating module configured to locate the second file (F_2) associated with the first file (F_1); a determining module configured, given the byte offset and byte length of the request and based on the non-length-preserving data transformation algorithm used in the transforming, to determine whether to retrieve the transformed content from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2); a retrieving module configured to retrieve the transformed content, based on the determining, from the first file (F_1), from the second file (F_2), or from the first file (F_1) and the second file (F_2); a performing module configured to perform the inverse transformation of the non-length-preserving data transformation algorithm on the retrieved transformed content, thereby resulting in the original content; and a returning module configured to return the original content.
  • 24. The system of claim 13 further comprising a closing module configured, if a close request is received for the first file (F_1), to close the first file (F_1) and the second file (F_2).
  • 25. A computer program product usable with a programmable computer having readable program code embodied therein of manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system, wherein the computer system comprises an operating system comprising a framework for in-line monitoring of accesses to the file-system, the computer program product comprising: computer readable code for transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content; computer readable code for storing the transformed content in a transformed computer file; computer readable code for splitting the transformed computer file into a first file (F_1) and a second file (F_2); and computer readable code for associating the first file (F_1) with the second file (F_2) in the file-system.
  • 26. A method of providing a service to manipulate the original content of at least one original read-only computer file in a computer file-system in a computer system, wherein the computer system comprises an operating system comprising a framework for in-line monitoring of accesses to the file-system, the method comprising: transforming the original content via a non-length-preserving data transformation algorithm, thereby resulting in transformed content; storing the transformed content in a transformed computer file; splitting the transformed computer file into a first file (F_1) and a second file (F_2); and associating the first file (F_1) with the second file (F_2) in the file-system.