Not applicable.
Field of the Invention: The present invention relates generally to data compression and archiving. More particularly, the present invention relates to a system and method for dynamically creating two-stage self-extracting archives. More specifically, the method relates to intelligent selective linking of the decompressor/decryptor in the code segment of a self-extracting archive to reduce the overall size of the file.
Definitions: As used herein, the following terms shall generally have the indicated meanings:
Archive: a collection of files created for the purpose of storage or transmission, usually in compressed and otherwise transformed form. An archive generally includes structural information and archive data.
Self-Extracting Archive: a compressed archive file containing a compressed file archive as well as associated programming to extract this information. While typical archive files require a second executable file or program to extract from the archive, self-extracting archives generally do not require such a program or executable file.
Algorithm: a specific computational technique used for processing information.
Compression Algorithm: a specific computational technique for encoding information in fewer bits than an unencoded representation would use, through the use of specific encoding schemes.
File: a set of one or more typed forks, also possessing optional attributes, which may include, but are not limited to directory, name, extension, type, creator, creation time, modification time, and access time.
Archive Data: file data in transformed form.
Archive Creation: the process of combining one or more files and their attributes into an archive.
Full Archive Expansion: the process of recreating forks, files, and their attributes from an archive.
Inverse Algorithm: transformation of data that is the inverse of another algorithm.
Background Discussion: Current archiving software such as STUFFIT®, ZIP®, RAR® and similar products create a self-extracting archive by statically linking the code segment of the self-extracting archive. When creating a self-extracting archive, archiving software currently in use must add every possible algorithm (as well as the supporting data necessary to extract files, e.g., tables or dictionaries) to the code segment. This may (and typically does) result in the creation of an unnecessarily large self-extracting archive.
When a self-extracting archive is created, not all of the algorithms need to be added to the self-extracting archive because only a subset of the possible algorithms is necessary for expansion of the archive. However, using the currently available archiving software, such as the utilities mentioned above, all algorithms are linked to the archive at the time of archive creation, whether or not these algorithms are utilized during the decompression process. Some of the algorithm code is therefore superfluous. The addition of such superfluous algorithm code to the archive results in a needlessly large archive size, sometimes even larger than the original uncompressed data.
In the existing approach, a fixed subset of the available algorithms is supported in order to limit the size of the code segment of a self-extracting archive, and compression choices are limited to that fixed subset. This traditional approach may lead to any or all of the following potential problems: (1) algorithm code that will not be executed during expansion may nonetheless be included; (2) algorithms that might produce a smaller archive may be excluded; (3) an algorithm that is both included and used may nonetheless save less in the archived data than it adds in code size.
It would therefore be desirable to provide a method of dynamically selecting the algorithms to be applied when the archive is created, and limiting the executable code included in the self-extracting archive to include only the corresponding inverse algorithms so as to facilitate a considerable reduction in the size of the resulting self-extracting archive.
The needed solution to the above-described problem is provided by the present invention, which is a method of dynamically creating two-stage self-extracting archives. The method is implemented on a data processing computer, wherein during the archive creation process the executable code segments for inverse algorithms are selectively added to the self-extracting archive, but only for algorithms applied during archive creation. This archive creation process results in a considerably smaller self-extracting archive. To achieve even further space saving, the original data can be reprocessed, and any algorithm applied in the archive creation process that saved less space than the size added by its corresponding inverse algorithm can be eliminated. Selected inverse algorithms are also compressed, and a compact inverse algorithm is provided as ready-to-execute code. This compact inverse algorithm restores the selected inverse algorithms to an executable state, and then causes them to be executed on the compressed file data.
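The selective inclusion described above can be illustrated with a minimal sketch. It is not the inventor's implementation: the `Algorithm` record, the module names, and the byte sizes are all hypothetical values chosen only to show the difference between linking every inverse algorithm and linking only those whose forward algorithm was applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    name: str               # hypothetical identifier for an algorithm
    inverse_code_size: int  # assumed size in bytes of its inverse code and tables


def select_inverse_modules(available, used_names):
    """Keep only the inverse modules whose forward algorithm was applied."""
    return [alg for alg in available if alg.name in used_names]


# Hypothetical catalogue of everything the archiver could apply.
AVAILABLE = [
    Algorithm("lz", 20_000),
    Algorithm("text_optimizer", 200_000),
    Algorithm("jpeg_recompress", 150_000),
]

# Suppose only one algorithm was actually applied while creating the archive.
used = {"lz"}
selected = select_inverse_modules(AVAILABLE, used)

static_size = sum(alg.inverse_code_size for alg in AVAILABLE)    # link everything
selective_size = sum(alg.inverse_code_size for alg in selected)  # link what was used
```

With these assumed figures, linking everything would add 370,000 bytes of inverse code, while selective linking adds 20,000 bytes.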
The invention will be understood and its various objects and advantages will become apparent when consideration is given to the following detailed description thereof. Such description makes reference to the annexed drawings.
Referring first to
At this first stage, all of the code modules used to prepare the archive are filtered separately 105. Furthermore, the savings in storage are increased by separately calculating the savings achieved by each algorithm. For example, if the text optimizer comprises 100 kB of code and its dictionary comprises 100 kB, and using the optimizer does not produce at least 200 kB of savings in the archive, then no overall savings were achieved; the files are re-coded without the text optimizer, and the text optimizer code is removed from the subset of algorithms. This technique of re-coding the original archive leads to efficient storage in the self-extracting archive.
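The per-algorithm check in this example reduces to simple arithmetic, sketched below. The function names and the tuple layout (code size, supporting data size, savings in the archive, all in kB) are assumptions made for illustration. The threshold follows the text: an algorithm is kept only if it saves at least as much as its code and supporting data add.

```python
def net_saving_kb(code_kb, support_kb, saved_kb):
    """Savings in the archive data minus the size the inverse algorithm adds."""
    return saved_kb - (code_kb + support_kb)


def filter_algorithms(candidates):
    """Split candidates into (kept, dropped) by net saving.

    candidates maps a name to (code_kb, support_kb, saved_kb).
    """
    kept, dropped = [], []
    for name, (code_kb, support_kb, saved_kb) in candidates.items():
        if net_saving_kb(code_kb, support_kb, saved_kb) >= 0:
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


# Hypothetical measurements: the text optimizer matches the example in the
# text (100 kB code + 100 kB dictionary) and is assumed to save only 150 kB.
candidates = {
    "text_optimizer": (100, 100, 150),
    "lz": (20, 0, 500),
}
kept, dropped = filter_algorithms(candidates)
```

Here the text optimizer falls 50 kB short of paying for itself, so it lands in `dropped` and the files would be re-coded without it. The sketch treats each algorithm's savings as independent; in practice, removing one algorithm may change what the others save, which is why the data is re-coded rather than merely relabelled.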
In the second stage of the archiving process 106, to further the savings in storage, a secondary archive structure of the code part of the self-extracting archive is prepared with another compact compressor 107. The code archive 108 includes the algorithm code module 109, the main code for parsing and extracting the archive in a compressed format (such as STUFFIT®, ZIP® or RAR®), the user interface code, and so forth; all of the code segments are compressed. This facilitates a further reduction in the size of the self-extracting archive 110, which also includes the file data 111 and the code for the compact inverse algorithm 112 used to load and decompress the necessary algorithms. The self-extracting archive may be saved on any of a number of suitable data storage media, such as ROM, flash memory, hard disks, floppy disks, magnetic tapes, optical discs, and so forth, using any of a number of suitable storage devices, including hard disk drives, tape drives, compact disc drives, digital video disc drives, Blu-ray disc drives, flash memory data storage devices, and the like. [STUFFIT® is a registered trademark of Smith Micro Computer, Inc., of Aliso Viejo, Calif.; ZIP® is a registered trademark of Iomega Corporation, San Diego, Calif.; RAR® is a registered trademark of Eugene Roshal of Chelyabinsk, Russian Federation.]
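One way to picture the second stage is the sketch below. It makes several assumptions not found in the text: `zlib` stands in for the compact compressor, the `SFX2` marker and the length-prefixed section layout are invented for illustration, and the stub is an opaque byte string rather than real executable code.

```python
import struct
import zlib

MAGIC = b"SFX2"  # hypothetical marker, not taken from the source


def pack_section(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def build_archive(stub: bytes, code_modules: dict, file_data: bytes) -> bytes:
    """Assemble stub + compressed code archive + file data into one blob."""
    # Join all code modules first so the compressor sees them as one stream.
    blob = b"".join(
        pack_section(name.encode()) + pack_section(code)
        for name, code in sorted(code_modules.items())
    )
    code_archive = zlib.compress(blob, 9)
    return (MAGIC + pack_section(stub)
            + pack_section(code_archive) + pack_section(file_data))


def read_sections(archive: bytes) -> list:
    """Split an archive built by build_archive back into its sections."""
    if archive[:4] != MAGIC:
        raise ValueError("not a sketch archive")
    sections, pos = [], 4
    while pos < len(archive):
        (length,) = struct.unpack_from(">I", archive, pos)
        pos += 4
        sections.append(archive[pos:pos + length])
        pos += length
    return sections


# Hypothetical, highly repetitive "code" so the effect of compression is visible.
modules = {"lz_inverse": b"decode " * 500, "ui": b"draw " * 500}
archive = build_archive(b"STUB", modules, b"compressed file data")
stub, code_archive, file_data = read_sections(archive)
raw_code_size = sum(len(code) for code in modules.values())
```

Only the stub, which plays the role of the compact inverse algorithm, needs to remain uncompressed; everything else in the code part is stored in compressed form.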
Referring next to
At the second stage 303, the compressed files are extracted from the concatenated archive 304 and the original files are restored 305. Upon completion, the code segments that were temporarily extracted and run on the user's machine are disposed of 306.
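The expansion flow, including disposal of the temporarily extracted code, might be sketched as follows. This is an illustration under stated assumptions: the `expand` function and its arguments are hypothetical, `zlib.decompress` stands in both for the compact inverse algorithm and for the restored inverse algorithms, and the restored code modules are written to disk but not actually executed.

```python
import tempfile
import zlib
from pathlib import Path


def expand(code_modules: dict, compressed_files: dict, out_dir: Path):
    """Restore code to a temporary location, restore files, then clean up.

    Returns the restored file paths and the (now deleted) temporary path.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        # Stage one: restore the compressed code modules to a temporary directory.
        for name, packed in code_modules.items():
            (tmp_path / name).write_bytes(zlib.decompress(packed))
        # Stage two: restore the original files. zlib.decompress is a stand-in
        # for running the restored inverse algorithms on the file data.
        restored = []
        for name, packed in compressed_files.items():
            target = out_dir / name
            target.write_bytes(zlib.decompress(packed))
            restored.append(target)
    # Leaving the with-block deletes the temporary directory and the
    # code modules that were extracted into it.
    return restored, tmp_path
```

Tying the lifetime of the extracted code to a context manager means the disposal step runs even if restoring a file raises an error.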
It will be appreciated by those with skill in the art that the above-described method reduces the size of self-extracting archives by dynamically creating two-stage self-extracting archives which selectively include an appropriate/optimal decompressor/decryptor in the code segment of the archive. This advances the art of reducing demands on expensive hardware resources, such as disk storage space, and data communications resources, such as transmission bandwidth. The algorithms involved in the method steps are encoded and stored as a program on a computer-readable medium. Thus, the method is implemented on a programmable device, such as a suitable encoder/decoder, which executes the instructions for dynamically creating a two-stage self-extracting archive.
The above disclosure is sufficient to enable one of ordinary skill in the art to practice the invention, and provides the best mode of practicing the invention presently contemplated by the inventor. While there is provided herein a full and complete disclosure of the preferred embodiments of this invention, it is not desired to limit the invention to the exact construction, dimensional relationships, and operation shown and described. Various modifications, alternative constructions, changes and equivalents will readily occur to those skilled in the art and may be employed, as suitable, without departing from the true spirit and scope of the invention. Such changes might involve alternative materials, components, structural arrangements, sizes, shapes, forms, functions, operational features or the like.
Therefore, the above description and illustrations should not be construed as limiting the scope of the invention, which is defined by the appended claims.
The present application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/326,132 filed Apr. 20, 2010 (Apr. 20, 2010).
| Number | Date | Country |
|---|---|---|
| 61326132 | Apr 2010 | US |