1. Field of Invention
The present invention relates to data compression, and more specifically to a lossless method and apparatus for compressing text, audio and image data for storage and transmission, which significantly reduces the amount of data accessed and transmitted between media devices.
2. Description of Related Art
Over the past decades, efficient transmission media such as xDSL have made the Internet and other networking technologies widely popular for data communication. The convenience of the Internet and networking has led more users to transmit larger data files more frequently, which causes “traffic jams” in networking environments. The amount of data sent over networks such as the Internet appears to grow faster than the bandwidth of the transmission media and technology. A method and apparatus for lossless data compression can reduce the data rate, thereby making transmission more efficient and easing the “traffic jam” problem in data transmission.
Most semiconductor memories dissipate a certain amount of power during data access, which includes writing, erasing, reading and retaining data. For instance, the DRAM (Dynamic Random Access Memory) consumes a lot of power because its storage element is most likely a deep trench capacitor, which inherently leaks current once electronic charge is stored in it; the memory cells therefore need to be refreshed from time to time, which raises power consumption. In an SRAM (Static Random Access Memory), the junction diodes of each transistor do not leak as severely as the DRAM capacitor, but they still leak about 1 µA of current per one thousand bit cells.
Because it dissipates no power while retaining data, the non-volatile memory (NVM) has become a popular device for mass data storage.
Flash memory is the most commonly used NVM device. A flash memory can be programmed (written) byte by byte or word by word, with a word length ranging from 8 bits to thousands of bits, but it can be erased only block by block; that is, during erasure a whole block of flash memory cells is erased. During reading, like most memories, flash memory outputs data byte by byte or word by word at a speed of tens of nanoseconds per output. In contrast, programming and erasing operations take much longer, on a scale of milliseconds to tens of seconds depending on the block size. Because a high voltage must be applied to the gate and to the drain or source of the memory cell during programming and erasing, writing or erasing flash memory data consumes much more power than in other memory devices.
The advantage of consuming no power while retaining data has made flash memory a key memory in mass storage applications. Such applications include, but are not limited to, memory cards such as CF (Compact Flash) cards, mainly used in digital cameras; SD (Secure Digital) cards, another popular memory card for digital cameras; and USB memory disks, a popular portable storage device.
Due to the high complexity of manufacturing and the limited number of suppliers, the unit cost of flash memory is higher than that of other semiconductor devices. As a result, the end-product prices of mass storage devices such as memory cards and USB memory disks are materially higher.
The present invention is related to a method and apparatus for compressing data before it is transmitted or saved into a storage device, which significantly reduces the amount of data to be transmitted and stored, and hence improves the performance of data transmission or of writing data to a storage device and reduces the cost of the storage device.
According to one embodiment of the present invention, a lossless compression method is applied to compress data from a media before it is sent to a storage device or a transmission line.
According to another embodiment of the present invention, a lossless decompression method is applied to recover data from a storage device or from the end node of a transmission line.
According to another embodiment of the present invention, a lossless compression method is applied to compress data from a so-called “File System” and store it in a sector of the storage memory.
According to another embodiment of the present invention, a lossless decompression method is applied to recover the data from the storage memory so that it can be executed by the controller to accurately map data from the storage device to the media it accesses.
According to another embodiment of the present invention, a lossless compression code or its execution code is saved into the flash memory. When the storage device is connected to a PC or to other media such as the Internet, a TV, a radio station or a set-top box, the lossless compression code or the execution code is read out from the flash memory and loaded into the PC or the media to compress the data before it is sent to the storage device.
According to another embodiment of the present invention, a lossless compression code or its execution code is saved into a PC as a software driver. When a storage device or a transmission media is connected to the PC or other media for data access or transmission, the data file to be sent passes through the lossless compression code or the execution code and is compressed to a smaller size before being stored in the storage device or transmitted to the destination.
According to another embodiment of the present invention, a lossless decompression code or its execution code is saved into a PC as a software driver. When a storage device or a transmission media is connected to the PC or other media for data access or transmission, the data file received from the storage device or transmission source passes through the lossless decompression procedure and is recovered as the original data file at the destination.
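The compress-before-store and recover-after-read embodiments can be illustrated with a minimal sketch. It assumes Python's standard zlib module (DEFLATE) as a stand-in for the compression code, which the description does not specify; the function names are hypothetical.

```python
import zlib

def compress_before_store(data: bytes) -> bytes:
    """Stand-in for the compression driver: shrink data before it is stored or sent."""
    return zlib.compress(data, 9)

def recover_after_read(blob: bytes) -> bytes:
    """Stand-in for the decompression driver: restore the original bytes exactly."""
    return zlib.decompress(blob)

# Repetitive sample data, chosen only so that the size reduction is visible.
original = b"lossless compression keeps every bit. " * 64
stored = compress_before_store(original)
restored = recover_after_read(stored)

# Lossless means the recovered file is bit-identical to the source file.
print(restored == original, len(original), len(stored))
```

The achievable size reduction depends entirely on the data; the sample here is deliberately repetitive.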
According to another embodiment of the present invention, a certain number of data types are supported, and a certain number of state machines are implemented to drive the sequence of the lossless compression procedure according to the type of data file to be stored or transmitted.
According to another embodiment of the present invention, a data path with an ALU, an arithmetic unit and a multiplier is implemented and shared to execute the compression operation for each type of data.
It is to be understood that both the foregoing general description and the following detailed description are exemplary, and are intended to provide further explanation of the invention as claimed.
The present invention relates specifically to a method and apparatus for lossless compression. The method and apparatus compress data downloaded from a PC, the Internet or another media and store the data to a storage device, which results in a significant data reduction and hence reduces the cost of the semiconductor memory or the time or bandwidth of data transmission.
In the past decade, falling semiconductor memory prices and the commercialization of consumer products that use a large amount of memory, such as digital cameras and mobile phones, have made mobile storage devices more popular because of their convenience and portability. Popular mobile storage devices include memory cards and USB memory drives. Examples of such memory cards include the CF (Compact Flash) card, the SD (Secure Digital) card and the MM (Multimedia) card. These cards can be used as storage devices in digital cameras as well as in mobile phones.
Most semiconductor memories dissipate a certain amount of power during data access, which includes writing, erasing, reading and retaining data. For instance, the DRAM (Dynamic Random Access Memory) consumes a lot of power because its storage element is mostly a deep trench capacitor, which inherently leaks current once electronic charge is stored in the capacitor; the memory cells therefore need to be refreshed from time to time, which raises power consumption. In an SRAM (Static Random Access Memory), the junction diodes of each transistor do not leak as severely as the DRAM capacitor, but they still leak about 1 µA of current per one thousand bit cells.
Because it dissipates no power while retaining data, the non-volatile memory (NVM) has become a popular device in mass data storage applications.
Flash memory is the most commonly used NVM device. A flash memory can be programmed (written) byte by byte or word by word, with a word length ranging from 8 bits to thousands of bits, but it can be erased only block by block; that is, during erasure a whole block of flash memory cells is erased. During reading, like most memories, flash memory outputs data byte by byte or word by word at a speed of tens of nanoseconds per output. Programming and erasing operations take much longer, on a scale of milliseconds to tens of seconds depending on the block size. Because a high voltage must be applied to the gate and to the drain or source of the memory cell during programming and erasing, writing or erasing flash memory data consumes much more power than in other memory devices. As a mass storage device, flash memory undergoes frequent program and erase operations and therefore consumes more power, as described above.
The design of the embodiment as shown in the conceptual figure in
A software solution can be implemented like the block diagram illustrated in
In some closed-system applications, the storage device need not comply with any external data format, since the system data format can be unique and the data format within the storage device is defined accordingly. In other system applications, the storage device needs to make the data stored into or read from the flash memory fully compliant with the file format. In this case, lossless compression reduces the amount of data but alters the data layout, so the starting and ending addresses no longer match those in the original file format; this needs to be corrected at some point during data recovery.
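One possible way to correct the shifted addresses is to keep a translation table from the original (logical) file extents to the compressed (physical) extents. The following is a minimal sketch under that assumption; the table layout, the names Extent, store_files and read_file, and the use of zlib are all hypothetical and are not taken from the description.

```python
import zlib
from dataclasses import dataclass

@dataclass
class Extent:
    logical_start: int   # start address as the external file format sees it
    logical_len: int     # original (uncompressed) length
    physical_start: int  # where the compressed bytes actually sit in flash
    physical_len: int    # compressed length

def store_files(files):
    """Compress each file and record both its original and its actual location."""
    flash = bytearray()
    table = []
    logical = 0
    for data in files:
        blob = zlib.compress(data)
        table.append(Extent(logical, len(data), len(flash), len(blob)))
        flash += blob
        logical += len(data)
    return bytes(flash), table

def read_file(flash, table, logical_start):
    """Map an original start address back to the compressed extent and recover it."""
    for e in table:
        if e.logical_start == logical_start:
            blob = flash[e.physical_start:e.physical_start + e.physical_len]
            data = zlib.decompress(blob)
            if len(data) != e.logical_len:
                raise ValueError("recovered length does not match the table")
            return data
    raise KeyError(logical_start)

files = [b"A" * 1000, b"B" * 500]
flash, table = store_files(files)
print(table[1].logical_start, table[1].physical_start)
```

With such a table, the external system can keep addressing files by their original start and end addresses while the flash memory holds only the compressed extents.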
In the VLSI chip implementation of the storage device controller, given the high cost of manufacturing flash memory and the high power consumption of writing and erasing flash memory data, data compression becomes critical to cost reduction: a 4X compression ratio reduces the flash memory cost by a factor of 4. Since the external data type might not be known before the data is sent into the micro-controller, a lossless data compression algorithm is needed in the storage device application to maintain data quality and remain compliant with most systems.
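The cost relationship can be made concrete with a short calculation. This sketch assumes that flash cost scales linearly with capacity and that the stated compression ratio is actually achieved for the data at hand; the function name is hypothetical.

```python
import math

def flash_bytes_needed(raw_bytes: int, compression_ratio: float) -> int:
    """Flash capacity needed to hold raw_bytes at a given compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return math.ceil(raw_bytes / compression_ratio)

raw = 1024 ** 3                       # 1 GiB of uncompressed data
needed = flash_bytes_needed(raw, 4.0)
print(needed // 1024 ** 2, "MiB")     # prints "256 MiB"
```

In practice the ratio varies with the data type, so the saving is an upper-level estimate rather than a guarantee.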
Since different data types have different formats and vary widely in data organization, it is not feasible for a lossless data compression algorithm to support too many types of data. According to the present invention, some popular data types are supported in the lossless data compression. According to a statistical survey, “Word”, “Power Point”, “Notepad” and “Excel” are the most popular document and text file formats under Windows®. For image files, “.bmp” is the most popular raw data format. For audio files, “.WAV” is the most popular raw audio format. The “.AVI” file is a popular raw audio-video format comprising “.WAV” raw audio data and “.bmp” raw image data. Many lossless data compression algorithms have been developed and applied to various applications. In a feasible hardware implementation, a certain number of state machines are implemented to control the sequence of lossless data compression according to the data type.
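The type-dependent control sequence can be sketched in software as a lookup from data type to an ordered list of states. The file extensions, the state names and the pass-through handling of unsupported types below are assumptions made for illustration only; the description does not define them.

```python
import os
from enum import Enum, auto

class DataType(Enum):
    DOCUMENT = auto()
    IMAGE = auto()
    AUDIO = auto()
    UNSUPPORTED = auto()

# Assumed extensions for the document, image and audio types named in the text.
EXTENSION_TO_TYPE = {
    ".doc": DataType.DOCUMENT, ".ppt": DataType.DOCUMENT,
    ".xls": DataType.DOCUMENT, ".txt": DataType.DOCUMENT,
    ".bmp": DataType.IMAGE,
    ".wav": DataType.AUDIO,
}

# Hypothetical state sequences; one "state machine" per supported data type.
SEQUENCES = {
    DataType.DOCUMENT: ["READ_HEADER", "COMPRESS_DOCUMENT", "WRITE_OUTPUT"],
    DataType.IMAGE: ["READ_HEADER", "COMPRESS_IMAGE", "WRITE_OUTPUT"],
    DataType.AUDIO: ["READ_HEADER", "COMPRESS_AUDIO", "WRITE_OUTPUT"],
    DataType.UNSUPPORTED: ["READ_HEADER", "PASS_THROUGH", "WRITE_OUTPUT"],
}

def classify(filename: str) -> DataType:
    """Pick the data type from the file extension (case-insensitive)."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_TYPE.get(ext, DataType.UNSUPPORTED)

def control_sequence(filename: str) -> list:
    """Return the ordered states that would drive compression for this file."""
    return list(SEQUENCES[classify(filename)])

print(classify("photo.BMP").name, control_sequence("song.wav"))
```

Because only one file type is processed at a time, a single dispatcher like this can select which sequence drives the shared compression logic.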
One of the most popular lossless data compression algorithms is the LZ algorithm, a dictionary-based lossless compression developed by Dr. Lempel and Dr. Ziv. A dictionary-based lossless algorithm saves previously seen patterns in a storage device and compares each incoming pattern against them; if a match is found, a pair (starting point, matching length) is assigned to represent the target pattern. Another lossless data compression algorithm is RAR compression, which achieves 4X to more than 10X lossless compression of “Word” and “Power Point” document data. Besides the dictionary-based document compression, according to one embodiment of the present invention, a proprietary lossless image compression algorithm is developed and applied to compress .bmp image documents. According to another embodiment of the present invention, a proprietary lossless audio compression algorithm is developed and applied to compress .WAV raw audio data.
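The dictionary-based matching described above can be sketched as a simplified LZ77-style coder. This is a naive illustration and not the proprietary algorithm of the invention: the window size, the minimum match length and the token layout are assumptions, and the matching pair is expressed as (distance back, matching length) rather than an absolute starting point.

```python
def lz_compress(data: bytes, window: int = 4096, min_match: int = 3,
                max_match: int = 255) -> list:
    """Return tokens: an int is a literal byte, a tuple is (distance, length)."""
    tokens = []
    i, n = 0, len(data)
    while i < n:
        best_len, best_dist = 0, 0
        # Search the previously seen data (the "dictionary") for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < n
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))  # pair replaces the pattern
            i += best_len
        else:
            tokens.append(data[i])                # no useful match: emit literal
            i += 1
    return tokens

def lz_decompress(tokens: list) -> bytes:
    """Rebuild the original bytes from literals and (distance, length) pairs."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-wise copy handles overlapping matches
        else:
            out.append(tok)
    return bytes(out)

sample = b"abcabcabcabc"
tokens = lz_compress(sample)
print(tokens)                           # [97, 98, 99, (3, 9)]
print(lz_decompress(tokens) == sample)  # True
```

The exhaustive search is quadratic in the window size; practical implementations use hash tables or similar structures to find matches quickly.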
Since the target data file is loaded to the storage device sequentially, no two types of data will be read at the same time. For saving the gate count and the cost of the hardware implementation as shown in
To cover more applications, according to an embodiment of the present invention, another lossless data compression engine 424 is implemented to compress the data of the “File System” program before saving it into the flash memory 44. A File System is mainly used to indicate the file format and the starting and ending locations of a file. During data manipulation, the micro-controller copies the compressed File System from the flash memory, recovers it sector by sector, and saves it sequentially into temporary buffers 421, 42, 423. The decompressed File System in the temporary buffer can be executed sequentially without occupying a large amount of buffer space: a small buffer stores a certain length of the File System program, and a buffer fullness or emptiness pointer indicates the position of the program being executed. When the buffer fullness falls below a predetermined level, the File System Codec 423 accelerates decompression of the compressed File System program to avoid emptying the program buffer. In a practical implementation, the small temporary buffer storing the decompressed File System can be organized as a ping-pong buffer: while one buffer is executing the file management function, the other receives and saves the decompressed File System, so that operation proceeds without wait states.
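The ping-pong arrangement can be sketched as follows. The sketch assumes that each sector is compressed independently, that zlib stands in for the File System codec, and that a 512-byte sector is used; in hardware the fill and execute steps would overlap in time, whereas this software model simply alternates them. All names are hypothetical.

```python
import zlib

SECTOR = 512  # assumed sector size in bytes

def compress_sectors(image: bytes) -> list:
    """Compress the File System image one sector at a time."""
    return [zlib.compress(image[i:i + SECTOR])
            for i in range(0, len(image), SECTOR)]

def run_from_pingpong(sectors: list, execute) -> None:
    """Alternate two small buffers: one is executed while the other is refilled."""
    if not sectors:
        return
    buffers = [b"", b""]
    fill = 0
    buffers[fill] = zlib.decompress(sectors[0])      # prime the first buffer
    for nxt in range(1, len(sectors) + 1):
        run = fill          # the buffer just filled becomes the one to execute
        fill ^= 1           # the other buffer becomes the one to refill
        if nxt < len(sectors):
            buffers[fill] = zlib.decompress(sectors[nxt])
        execute(buffers[run])

image = bytes(range(256)) * 5          # 1280 bytes, i.e. three sectors
consumed = []
run_from_pingpong(compress_sectors(image), consumed.append)
print(b"".join(consumed) == image)     # True
```

At most two decompressed sectors are resident at any time, which is the point of the small temporary buffer described above.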
According to another embodiment of this invention in the application of data transmission, before a data file is sent to a transmission media such as the Internet or Ethernet (networking), the data file can be compressed to a smaller amount of data as shown in
According to an embodiment of this invention, the data files 61 to be sent out can be compressed by using the corresponding lossless compression mechanism 62. If the data files are to be sent to another destination through either the Internet or Ethernet, they can be compressed 62 and packed, with an execution file for lossless decompression inserted at a predetermined location in the packed data files. The decompression execution file can include the complete set of decompression algorithms, or only the decompression algorithms corresponding to the types of data to be transmitted.
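The packing step can be sketched with a simple container in which the decompression execution file sits at a fixed, predetermined location right after a header. The container layout, the magic value and the function names are hypothetical, and zlib stands in for the compression mechanism; a real receiver would run the embedded execution file rather than call zlib directly.

```python
import struct
import zlib
from typing import Dict, Tuple

MAGIC = b"PKG1"  # hypothetical container identifier

def pack(files: Dict[str, bytes], decompressor: bytes) -> bytes:
    """Pack compressed files, placing the decompressor right after the header."""
    out = bytearray(MAGIC)
    out += struct.pack("<I", len(decompressor))
    out += decompressor                      # predetermined location
    out += struct.pack("<I", len(files))
    for name, data in files.items():
        name_b = name.encode("utf-8")
        blob = zlib.compress(data)
        out += struct.pack("<HI", len(name_b), len(blob))
        out += name_b + blob
    return bytes(out)

def unpack(package: bytes) -> Tuple[bytes, Dict[str, bytes]]:
    """Extract the decompressor and recover every file from the package."""
    if package[:4] != MAGIC:
        raise ValueError("not a package")
    pos = 4
    (stub_len,) = struct.unpack_from("<I", package, pos)
    pos += 4
    stub = package[pos:pos + stub_len]
    pos += stub_len
    (count,) = struct.unpack_from("<I", package, pos)
    pos += 4
    files = {}
    for _ in range(count):
        name_len, blob_len = struct.unpack_from("<HI", package, pos)
        pos += struct.calcsize("<HI")
        name = package[pos:pos + name_len].decode("utf-8")
        pos += name_len
        files[name] = zlib.decompress(package[pos:pos + blob_len])
        pos += blob_len
    return stub, files

package = pack({"a.txt": b"hello " * 100}, b"STUB")
stub, recovered = unpack(package)
print(stub, len(recovered["a.txt"]))
```

Including only the decompression routines for the data types actually present in the package would keep the embedded execution file small.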
The lossless data compression method and apparatus of the present invention significantly reduce the amount of data to be stored or transmitted. The present invention significantly shortens the time needed to write data to and read data from a storage device or to send it through a transmission media, which also results in a significant saving in power dissipation.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present invention without departing from the scope or the spirit of the invention. In view of the foregoing, it is intended that the present invention cover modifications and variations of this invention provided they fall within the scope of the following claims and their equivalents.