Memory is a precious resource on embedded systems. For many embedded devices, flash memory is the storage medium of choice. Flash memory, however, is an expensive non-volatile memory that may be written to only a limited number of times before it fails. The failure occurs because each flash sector can sustain only a limited number of write events before it burns out. In order to save cost, many systems attempt to minimize the amount of flash memory required. While the NTFS (New Technology File System) provides compression support that would save memory space, it is not typically used with flash memory. Using NTFS with flash memory may cause the memory to fail quickly, since NTFS regularly writes a log file to a specific sector on the media, thereby exceeding that sector's allowed write events. Additionally, NTFS requires a larger amount of space overhead as compared to other file systems. The File Allocation Table (FAT) file system is commonly used with flash memory. Sector or volume based compression that is used in conjunction with FAT compresses the entire volume, which may cause some applications and operating system components to perform slowly or improperly.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Individual files within a FAT volume may be compressed while other files remain uncompressed. A FAT Compression Filter (FCF) program intercepts file requests to the file system and performs compression and decompression tasks relating to the files on the FAT volume. An API may be used to configure and perform actions relating to the compression and decompression of the files stored on a FAT volume. The use of individual file compression with the FAT file system helps to ensure that the flash memory has a long life and does not quickly fail while still providing the benefits of individual file compression. The FAT Compression Filter allows individual files within a volume to be excluded from being compressed. Generally, files that are excluded from being compressed are files that when compressed would adversely affect an application's performance.
Referring now to the drawings, in which like numerals represent like elements, various embodiments will be described. In particular,
Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Other computer system configurations may also be used, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. Distributed computing environments may also be used where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Referring now to
As illustrated, computer 100 includes a central processing unit 5 (“CPU”), a system memory 7, including a random access memory 9 (“RAM”) and a read-only memory (“ROM”) 11, and a system bus 12 that couples the memory to the CPU 5. System memory 7 may be any combination of non-volatile memory and volatile memory. A basic input/output system containing the basic routines that help to transfer information between elements within the computer, such as during startup, is stored in the ROM 11. The computer 100 further includes a mass storage device 14 for storing an operating system 16, application programs, and other program modules, which will be described in greater detail below.
The mass storage device 14 is connected to the CPU 5 through a mass storage controller (not shown) connected to the bus 12. The mass storage device 14 and its associated computer-readable media provide non-volatile storage for the computer 100. Although the description of computer-readable media contained herein refers to a mass storage device, such as a hard disk, DVD drive or CD-ROM drive, the computer-readable media can be any available media that can be accessed by the computer 100.
By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, digital versatile disks (“DVD”), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 100.
According to various embodiments, the computer 100 may operate in a networked environment using logical connections to remote computers through a network 18, such as the Internet. The computer 100 may connect to the network 18 through a network interface unit 20 connected to the bus 12. The network interface unit 20 may also be utilized to connect to other types of networks and remote computer systems. The connection may be a wired and/or wireless connection. The computer 100 also includes an input/output controller 22 for receiving and processing input from a number of devices, such as: a keyboard, mouse, electronic stylus and the like. Similarly, the input/output controller 22 may provide output to a display 28, speakers, or some other type of device.
As mentioned briefly above, a number of program modules and data files may be stored in the memory of the computer 100, including an operating system 16 suitable for controlling the operation of a computing device, such as the WINDOWS MOBILE or WINDOWS XP operating systems from MICROSOFT CORPORATION of Redmond, Wash. The computing device 100 may be an embedded system that includes an embedded operating system as well as other embedded data, files and applications.
The operating system 16 may utilize the FAT file system. Generally, the FAT file system allows an operating system to keep track of the location and sequence of each piece of a file. Additionally, the FAT file system allows the operating system 16 to identify the clusters that are unassigned and available for new files. When a request is received to read a file, the FAT file system reassembles each piece of the file into one unit for viewing.
According to one embodiment, all or some of the memory may be flash memory, or some other suitable memory for embedded systems. The mass storage device 14 and RAM 9 may also store one or more program modules. In particular, the mass storage device 14 and the RAM 9 may store a FAT compression filter (FCF) program 10. The FCF program 10 is operative to provide functionality for interacting with and compressing/decompressing files 24 and interacting with operating system 16. For example, FCF program 10 is configured to individually intercept calls to the FAT file system, perform the compression and decompression tasks, and return the data to/from the volume on the mass storage device or to/from the requesting application. The use of individual file compression with the FAT file system helps to ensure that the flash memory has a long life and does not quickly fail while providing individual file compression. Individual files within a FAT volume may be excluded from being compressed. As such, an exclusion list 26 may be utilized to facilitate excluding specific files from being compressed. Other types of indicators may be used to indicate whether or not a file should be compressed. For example, each file may have an indicator within a header, the filename may indicate whether it should be compressed, and the like. Generally, the files that are excluded from compression are files that are required early in the boot process of a computing device or those files that adversely affect an application's performance when compressed. The determination of the files to be excluded from compression may be configured by an authorized user. For example, in one application, an authorized user may be a system administrator whereas in another application an authorized user may be the user of computing device 100. Additional details regarding the operation of the FCF program 10 will be provided below.
Generally, FAT compression system 200 allows individual files within a FAT volume to be compressed while other files remain uncompressed. The FAT Compression Filter (FCF) program 10 intercepts file system requests 204 made by an application (e.g. application 202) to the file system 222 and performs the compression and decompression tasks relating to the files. Files that are typically excluded from being compressed are boot files and files that when compressed adversely affect an application's performance. The files stored on the FAT volume may be a mixture of compressed files 236 and uncompressed files 240. The files may also reside on one or more FAT volumes (e.g. FAT volume 1 230 and FAT volume 2 250). The FCF program 10 allows individual files within a volume to be excluded from being compressed.
Exclusion list 232 is used to identify the files that should not be compressed. Exclusion list 232 may also include folders or paths that are not to be compressed. The exclusion list 232 may be configured to compress all or part of the files and/or subdirectories that are contained within a folder or below a specified path. The exclusion list 232 may also include a checksum to allow the FCF program 10 to determine whether the exclusion list file has been tampered with or has been corrupted. Other methods may also be used to determine if the exclusion list has been tampered with and/or corrupted.
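By way of illustration only, the exclusion list and its checksum might be sketched as follows. The on-disk layout (a CRC32 line followed by one entry per line), the function names, and the use of CRC32 are assumptions made for this sketch and are not specified in the description. A CRC32 detects accidental corruption; detecting deliberate tampering would generally call for a keyed or cryptographic check instead.

```python
import zlib
from pathlib import PureWindowsPath


def serialize_exclusions(entries):
    """Return exclusion-list bytes: a CRC32 line followed by one entry per line."""
    body = "\n".join(entries).encode("utf-8")
    return f"{zlib.crc32(body):08x}\n".encode("ascii") + body


def load_exclusions(blob):
    """Parse the blob; raise ValueError when the checksum does not match."""
    checksum_line, _, body = blob.partition(b"\n")
    if int(checksum_line, 16) != zlib.crc32(body):
        raise ValueError("exclusion list is corrupted or was modified")
    return [line for line in body.decode("utf-8").split("\n") if line]


def is_excluded(path, entries):
    """True when path equals an entry or lies below an excluded folder or path."""
    target = PureWindowsPath(path)
    for entry in entries:
        excluded = PureWindowsPath(entry)
        if target == excluded or excluded in target.parents:
            return True
    return False
```

In this sketch an entry that names a folder excludes everything below it, which matches the folder and path behavior described above.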
FCF program 10 includes settings 212. The settings 212 may include many different types of settings relating to the operation of FCF program 10. For example, settings 212 may include a list of files to always exclude from being compressed, a default compression algorithm, a minimum compression threshold, and the like. Settings 212 may be configured globally, by volume, by folder, or by file.
The FCF program 10 also includes a volume list 210 that defines which FAT volumes are attached and that include files that should be compressed by FCF program 10. When a new FAT volume is accessed, the FCF program 10 checks the root of the FAT volume for a configuration file 231. If the configuration file 231 exists and specifies that it is to be attached to the FCF program 10 then the volume is attached to the FCF program 10 and the volume list 210 is updated. Similarly, when a volume is unattached the volume is removed from volume list 210. Many other ways may be used to determine whether a FAT volume is attached to FCF program 10. For example, any FAT volume that resides on a computing device may be automatically attached, only specified FAT volume(s) are attached, and the like.
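The attach and detach bookkeeping described above may be sketched as follows. The configuration syntax (an `attach=true` line), the `VolumeList` class, and the `read_config` callable are hypothetical names introduced for illustration; the description does not define the format of configuration file 231.

```python
def wants_attach(config_text):
    """True when a hypothetical 'attach=true' line appears in the configuration text."""
    for line in config_text.splitlines():
        key, _, value = line.partition("=")
        if key.strip().lower() == "attach":
            return value.strip().lower() in ("1", "true", "yes")
    return False


class VolumeList:
    """Tracks which FAT volumes are attached to the filter."""

    def __init__(self, read_config):
        # read_config: callable mapping a volume root to its config text, or None if absent
        self._read_config = read_config
        self.attached = set()

    def on_volume_accessed(self, root):
        """Attach the volume when its root configuration file asks for it."""
        text = self._read_config(root)
        if text is not None and wants_attach(text):
            self.attached.add(root)
        return root in self.attached

    def on_volume_removed(self, root):
        """Drop the volume from the list when it is unattached."""
        self.attached.discard(root)
```

Passing the configuration reader in as a callable keeps the sketch independent of any particular storage driver.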
Both compressed files 236 and uncompressed files 240 reside on a FAT volume. According to one embodiment, each compressed file 236 includes a header 234 that is utilized by the FCF program 10. According to one embodiment, the header 234 includes a signature; a compression type; a checksum; and compression mapping information. Among other uses, the FCF program 10 uses header 234 to identify whether a file is compressed. When a file includes header 234 then the file is compressed. When a file does not include header 234 then the file is not compressed. This allows system 200 to read files without a separate mapping file, as well as making the files portable and allowing for different compression algorithms to be used on the same file system. The unique signature within the header 234 may also be used to identify the file as a compressed file.
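One possible layout for such a header is sketched below. The signature value `FCF1`, the field widths, and the choice of per-chunk compressed sizes as the mapping information are all assumptions for illustration; the description names the four fields but does not fix their encoding.

```python
import struct
import zlib

SIGNATURE = b"FCF1"                  # hypothetical signature value
FIXED = struct.Struct("<4sBII")      # signature, compression type, checksum, chunk count


def pack_header(compression_type, chunk_sizes, payload):
    """Build a header: fixed fields followed by one uint32 compressed size per chunk."""
    fixed = FIXED.pack(SIGNATURE, compression_type, zlib.crc32(payload), len(chunk_sizes))
    return fixed + struct.pack(f"<{len(chunk_sizes)}I", *chunk_sizes)


def parse_header(data):
    """Return (compression_type, checksum, chunk_sizes, header_length), or None if no header."""
    if len(data) < FIXED.size or data[:4] != SIGNATURE:
        return None                  # no signature: treat the file as uncompressed
    _, ctype, checksum, count = FIXED.unpack_from(data)
    end = FIXED.size + 4 * count
    if len(data) < end:
        return None                  # truncated mapping table
    sizes = list(struct.unpack_from(f"<{count}I", data, FIXED.size))
    return ctype, checksum, sizes, end
```

Because the header travels with the file, a reader needs no separate mapping file to decide whether and how to decompress it.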
The compression type within header 234 may be used to specify a compression algorithm to be used in performing the compression on the file. According to one embodiment, files are compressed by default using a ZIP compression algorithm, such as the MSZip compression algorithm. Other compression algorithms may be specified. For example, an LZNT compression algorithm may be used. Compression algorithms offer different advantages. Generally, the tradeoff is between space and performance. The ability to select a compression algorithm allows applications and devices to be optimized for their particular use.
Other methods may be used to identify the compression algorithm. For example, all files may be compressed using a default compression algorithm, a list may be included that identifies each file and its compression algorithm, and the like. Including the type of compression algorithm within the header 234 of each of the compressed files 236 helps to ensure that the compressed file 236 will be accessible even if the file system supports a different default compression algorithm. According to one embodiment, once a file has been compressed using one compression algorithm, any updates to the file continue to use the same compression algorithm. To change the compression algorithm, the file is uncompressed by FCF program 10 and then recompressed by FCF program 10 using the selected compression algorithm.
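The decompress-then-recompress step for changing algorithms can be illustrated with a short sketch. Python's `zlib` and `bz2` modules stand in for the MSZip and LZNT algorithms named in the description, and the `CODECS` table and `recompress` function are hypothetical names.

```python
import bz2
import zlib

# Stand-in codecs: name -> (compress, decompress)
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
}


def recompress(stored, current, target):
    """Decompress with the current algorithm, then compress with the target one."""
    plain = CODECS[current][1](stored)
    return CODECS[target][0](plain)
```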
When application 202 requests data to be read from an attached FAT volume (e.g. FAT volume 230), the FCF program 10 identifies whether or not the file is compressed. According to one embodiment, the FCF program 10 determines whether the file includes header 234. If the file does include the header, the FCF program 10 reads the data from the file, decompresses the requested portion of the file, and passes the requested data back to the requesting application 202 through file system requests 204. When the file does not include the header, the FCF program 10 passes back the requested data without performing any decompression on the data.
When a write is requested by application 202, the FCF program 10 receives the request through file system requests 204 and determines whether the file is compressed or should be compressed (e.g. a copy to a different volume, the file does not currently exist, etc.). When the file does not already exist in the FAT volume, then the exclusion list 232 is accessed to determine whether to compress the file before it is written to the FAT volume. As with the read request, a determination is made as to whether the file includes header 234. When the file includes header 234, the FCF program 10 determines the compression algorithm specified in header 234 and uses the specified compression algorithm to compress the data before writing the data to the file on the FAT volume. When the file does not include the header, the data is written to the file on the FAT volume without being compressed.
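The write-time decision described above reduces to a small function, sketched here under stated assumptions. The function name `plan_write`, its parameters, and the string algorithm names are hypothetical; the default of MSZip follows the default mentioned elsewhere in the description.

```python
def plan_write(exists, has_header, excluded, header_algorithm=None,
               default_algorithm="MSZip"):
    """Return the algorithm name to compress with, or None to write uncompressed."""
    if not exists:
        # New file: the exclusion list decides whether it is compressed at all.
        return None if excluded else default_algorithm
    if has_header:
        # Existing compressed file: keep using the algorithm named in its header.
        return header_algorithm
    # Existing file without a header: write the data as-is.
    return None
```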
When copying a file on a FAT volume, the compression system 200 writes a new file to the specified location. If the file is copied to a location that specifies the file to be compressed then the file is compressed before being stored on the FAT volume. Moving a file within the same FAT volume changes the file location in the file allocation table and does not change the compression of the file. Alternatively, a move may involve determining whether the file should be compressed or uncompressed in the new location. In this example, the move would be treated as a copy with the original file being removed from the FAT volume after being moved. Similarly, moving a file across volumes involves copying the file to the new volume and then deleting the file on the original volume.
According to one embodiment, if the file is being copied to a volume on another device where it will be stored in a compressed format, then the file is recompressed on the destination device by the FCF program. This helps to ensure that each device may interact with the compressed files. According to another embodiment, the file may be copied to the new location in the compressed format. In this situation it should be ensured that the destination device includes support for the specified compression algorithm.
According to one embodiment, if the file is to be compressed, the FCF program 10 determines if the file in a compressed state meets a minimum compression threshold (e.g. a savings of <5% by default). Other thresholds may be utilized. If the file does not meet the minimum compression threshold then the file is stored as an uncompressed file to help ensure that there is no degradation in performance. Files excluded from compression for not meeting the minimum compression threshold are added to the exclusion list 232 and are marked as not meeting the minimum compression threshold. Any file that is marked as not meeting the minimum compression threshold may be periodically retested according to the specified settings. The value for the minimum compression threshold may be stored within settings 212 and may be configured many different ways. For example, the minimum compression threshold may be configured using API 238.
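A minimal sketch of the threshold test follows. It reads the 5% figure as the minimum fractional savings a file must reach to be stored compressed, which is an interpretation; `zlib` stands in for the configured algorithm, and `maybe_compress` and `min_savings` are hypothetical names.

```python
import zlib


def maybe_compress(data, min_savings=0.05):
    """Return (stored_bytes, is_compressed); keep data raw when savings fall short."""
    if not data:
        return data, False
    packed = zlib.compress(data)
    savings = 1 - len(packed) / len(data)   # fraction of space saved by compressing
    if savings < min_savings:
        return data, False                  # not worth it: store uncompressed
    return packed, True
```

A caller that receives `False` could then record the file in the exclusion list as not meeting the threshold, so that it can be retested later.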
According to one embodiment, boot files known to the FCF program 10 remain uncompressed and may not be compressed. The boot files may be dynamically identified by the FCF program 10 by searching the run-time registry for boot drivers. These boot drivers may be added to the exclusion list 232 and/or settings 212 and marked as mandatory. When marked as mandatory, the file is never compressed.
API 238 provides an interface to interact with and adjust settings relating to compressing individual files on a FAT volume. API 238 may be utilized to remove a file or path from the exclusion list 232; commit exclusion list changes now; set whether a specific file (or files within a folder or files below a path) are to be either compressed or uncompressed; update the compressed state of files; apply changes to only new files; attach/detach a volume; and change the default compression type. A command line tool may also be used to configure the settings relating to compressing the files on a FAT volume. For example, the command line tool may be used to attach or detach a volume to the FCF program, display the exclusion list, and the like. The following is a list of exemplary functions that may be utilized within API 238. Other combinations of functions may also be utilized.
An Update Exclusion List function is used to add, remove, display, and change information in the exclusion list 232.
A Convert Files function is used to make changes to the compressed state of a file, or files within a directory structure. According to one embodiment, the Convert Files function may utilize the following arguments. The “Subdirs” argument forces the changing of all files within the directory and its subdirectories to the specified compression state. The “C” or compress argument compresses the file. The “U” or uncompress argument decompresses the file. The FORCE argument in combination with any of the other arguments forces the change to the file regardless of the file's inclusion within the exclusion list 232. An argument may also be supplied that specifies the compression algorithm to use (e.g. -LZNT, -MSZip, and the like).
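As a hedged illustration, the Convert Files arguments could be exposed through a command line parser such as the one below. The program name `fcfconvert`, the dash-prefixed spellings of the C, U, Subdirs and FORCE arguments, and the rule that exactly one of C or U must be given are assumptions of this sketch, not details taken from the description.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical Convert Files command line."""
    parser = argparse.ArgumentParser(prog="fcfconvert", allow_abbrev=False)
    state = parser.add_mutually_exclusive_group(required=True)
    state.add_argument("-C", dest="state", action="store_const", const="compress",
                       help="compress the file")
    state.add_argument("-U", dest="state", action="store_const", const="uncompress",
                       help="decompress the file")
    parser.add_argument("-Subdirs", dest="subdirs", action="store_true",
                        help="apply to all files in the directory and its subdirectories")
    parser.add_argument("-FORCE", dest="force", action="store_true",
                        help="apply even when the file is on the exclusion list")
    parser.add_argument("-LZNT", dest="algorithm", action="store_const", const="LZNT")
    parser.add_argument("-MSZip", dest="algorithm", action="store_const", const="MSZip")
    parser.add_argument("path", help="file or directory to convert")
    return parser
```

For example, `build_parser().parse_args(["-C", "-Subdirs", "-LZNT", "\\Data"])` yields a namespace whose `state` is `"compress"` and whose `algorithm` is `"LZNT"`.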
The FCF program intercepts file requests before they are sent to the file system. To account for the change in file structure due to compression, the file request is modified by mapping the offset from the uncompressed file to the compressed file. In this example, a request for Chunks 2 and 3 is sent to the stack of the file system, which handles the disk IO. The file system then returns the compressed data (chunks 2 and 3) of compressed file 312. The FCF program intercepts the returned data, decompresses the data, truncates any extra data that was not requested and then returns the data as requested.
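The offset mapping and truncation steps may be sketched as follows, assuming the file is compressed in fixed-size chunks of uncompressed data. The 4096-byte chunk size, the function names, and the use of `zlib` in place of the configured algorithm are assumptions for illustration.

```python
import zlib

CHUNK = 4096  # hypothetical uncompressed chunk size


def compress_chunks(data, chunk=CHUNK):
    """Compress data chunk by chunk; return the list of compressed chunks."""
    return [zlib.compress(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def read_range(chunks, offset, length, chunk=CHUNK):
    """Return `length` bytes at uncompressed `offset`, touching only the needed chunks."""
    if length <= 0:
        return b""
    first = offset // chunk                     # first chunk covering the request
    last = (offset + length - 1) // chunk       # last chunk covering the request
    plain = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk              # position of the request inside `plain`
    return plain[start:start + length]          # truncate data that was not requested
```

With these values, a request for 6000 bytes at offset 5000 falls entirely within the second and third chunks, so only those two are read and decompressed before the extra bytes are trimmed.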
Referring now to
Moving to decision operation 420, a determination is made as to whether the file from which data has been requested is compressed. According to one embodiment, the file is compressed when it includes a header.
When the file is not compressed, the process flows to operation 430 where the requested data is retrieved from the uncompressed file. The process then moves to operation 460 where the data is returned.
When the file is compressed, the process flows to operation 440 where the requested data is located and retrieved from the compressed file. According to one embodiment, the header within the compressed file includes mapping information that indicates where to access the requested data within the compressed file.
Moving to operation 450, the retrieved data is decompressed using the specified compression algorithm. The operation then moves to operation 460 where the data is returned to the requesting application. The process then moves to an end operation and returns to processing other actions.
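The read flow of operations 420 through 460 can be condensed into a short sketch. For brevity it decompresses the whole file rather than only the mapped portion, uses a bare signature in place of a full header, and relies on `zlib` as a stand-in algorithm; the `FCF1` signature and the `handle_read` name are hypothetical.

```python
import zlib

SIGNATURE = b"FCF1"  # hypothetical marker standing in for the full header


def handle_read(stored, offset, length):
    """Return the requested bytes, decompressing first when the file carries a header."""
    if stored.startswith(SIGNATURE):                        # decision operation 420
        plain = zlib.decompress(stored[len(SIGNATURE):])    # operations 440 and 450
    else:
        plain = stored                                      # operation 430
    return plain[offset:offset + length]                    # operation 460
```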
Moving to decision operation 520, a determination is made as to whether the write request is for a file that already exists on the FAT volume. When the file does not already exist, the process flows to operation 540 where the file is created (See
When the file already exists, the process flows to decision operation 530 where a determination is made as to whether the file is compressed. When the file is not compressed, the process flows to operation 560 where the uncompressed data is written to the file.
When the file is compressed, the process flows to operation 550 where the data that is associated with the write request is compressed using the selected compression algorithm. The header is also updated to include any changes to the mapping information. The process then moves to operation 560 where the compressed data is written to the file.
The process then moves to an end block and returns to processing other actions.
When the file is to be compressed, the process moves to operation 620 where the data is compressed using the selected compression algorithm. The process then flows to optional operation 630 where a header is created. As discussed above, the header includes information relating to the compression of the file as well as mapping information.
Moving to operation 640, the compressed data and header (if included) are written to the new file. The process then moves to an end operation and returns to processing other actions.
The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.