The present disclosure is related to Huffman coding.
As is well-known, Huffman codes of a set of symbols are generated based at least in part on the probability of occurrence of source symbols. A binary tree, commonly referred to as a “Huffman Tree,” is generated to extract the binary code and the code length. See, for example, D. A. Huffman, “A Method for the Construction of Minimum-Redundancy Codes,” Proceedings of the IRE, Volume 40, No. 9, pages 1098 to 1101, 1952. D. A. Huffman, in the aforementioned paper, describes the process this way:
The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of this specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
As previously described, Huffman codes for a set of symbols are generated based, at least in part, on the probability of occurrence of the source symbols. Accordingly, a binary tree, commonly referred to as a Huffman tree, is generated to extract the binary code and the code length. For example, in one application for text compression standards, such as GZIP, although, of course, the invention is not limited in scope to this particular application, the Huffman tree information is passed from encoder to decoder as a set of code lengths along with the compressed text data. Both the encoder and decoder generate a unique Huffman code based on the code length information. However, generating the length information for the Huffman codes by constructing the corresponding Huffman tree is inefficient and often redundant. After the Huffman codes are produced from the Huffman tree, the codes are abandoned because the encoder and decoder will generate the Huffman codes based on the length information. Therefore, it would be desirable if the length information could be determined without producing a Huffman tree.
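To illustrate why code lengths alone suffice for both encoder and decoder, the following is a minimal sketch of the canonical code assignment specified for the DEFLATE format used by GZIP (RFC 1951, section 3.2.2). It is not part of the embodiment described here; the function name `canonical_codes` and the list-of-lengths input are assumptions made for illustration only.

```python
def canonical_codes(lengths):
    """Assign canonical prefix codes from code lengths (0 means unused).

    Illustrative sketch of the RFC 1951 scheme; names are hypothetical.
    """
    max_len = max(lengths, default=0)
    # Count how many symbols use each code length (length 0 is ignored).
    bl_count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            bl_count[n] += 1
    # Compute the numerically smallest code for each length.
    next_code = [0] * (max_len + 2)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    # Hand out consecutive code values to symbols in index order.
    codes = {}
    for sym, n in enumerate(lengths):
        if n:
            codes[sym] = format(next_code[n], "0{}b".format(n))
            next_code[n] += 1
    return codes


# The example lengths from RFC 1951 for symbols A..H (indices 0..7).
print(canonical_codes([3, 3, 3, 3, 3, 2, 4, 4]))
```

Because the assignment depends only on the lengths and the symbol order, an encoder and a decoder that run the same routine arrive at the same codes, which is why the tree itself need not be transmitted.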
One embodiment, in accordance with the invention, of a method of generating code lengths for codes to be encoded, using a data structure, is provided. In this particular embodiment, the data structure is sorted, symbols in the data structure are combined, and symbol length is updated based, at least in part, on the frequency of the symbols being coded. In this particular embodiment, the data structure aids in the extraction of lengths of Huffman codes from a group of symbols without generating a Huffman tree where the probability of occurrence of the symbols is known. Although the invention is not limited in scope to this particular embodiment, experimental results show efficiency both in terms of computation and usage of memory suitable for both software and hardware implementation.
In this particular embodiment, although, again, the invention is not limited in scope in this respect, the data structure to be employed has at least two portions. As has previously been indicated, it is noted that the invention is not restricted in scope to this particular data structure. Clearly, many modifications to this particular data structure may be made and still remain within the spirit and scope of what has been described. For this embodiment, however, one portion is illustrated in FIG. 2. This portion of the data structure tracks or stores the index and length information for each non-zero frequency symbol. As illustrated in
As illustrated,
The second part or portion of the data structure for this particular embodiment, after initialization using the data or symbols in
As previously described, initially, each symbol to be coded is assigned a different bit flag. Again, in this particular embodiment, although the invention is, again, not limited in scope in this respect, the code length initially comprises zero for each symbol. As shall be described in more detail hereinafter, in this particular embodiment, with the data structure initialized, symbol flags are combined beginning with the smallest frequency symbols. The symbols are then resorted and frequency information is updated to reflect the combination. These operations of combining symbol flags and resorting are then repeated until no more symbols remain to be combined.
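The initialization just described may be sketched as follows. This is an illustrative sketch only: the function name `init_structure`, the list-based layout, the sample frequencies, and the choice of descending sort order (so that the smallest-frequency groups sit at the end) are assumptions and are not taken from the figures.

```python
def init_structure(freqs):
    """Build both portions of the data structure from per-symbol frequencies.

    Hypothetical layout:
      part1[k] = [symbol index, code length]   (first portion)
      part2[j] = [group frequency, bit flags]  (second portion)
    """
    # First portion: one entry per non-zero-frequency symbol, length zero.
    part1 = [[i, 0] for i, f in enumerate(freqs) if f > 0]
    # Second portion: one group per symbol; bit k of the flags marks part1[k],
    # so every symbol starts with a different single-bit flag.
    part2 = [[freqs[entry[0]], 1 << k] for k, entry in enumerate(part1)]
    # Assumption: groups are kept sorted by descending frequency.
    part2.sort(key=lambda group: group[0], reverse=True)
    return part1, part2


part1, part2 = init_structure([5, 0, 2, 9])  # symbol 1 never occurs
print(part1)
print(part2)
```

Symbols with zero frequency are omitted from both portions, consistent with the first portion tracking only non-zero frequency symbols.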
As previously described, the process is begun by initializing the data structure, such as the embodiment previously described, and setting a “counter” designated here “no_of_group”, to the number of non-zero frequency symbols, here 16. Next, while this “counter,” that is, no_of_group, is greater than one, the following operations are performed.
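Begin
while (no_of_group > 1) {
  merge the two smallest-frequency groups (add the frequencies; bitwise OR the bit flags) and resort;
  increase by one the length of each symbol flagged in the merged group;
  reduce no_of_group by one;
}/* end of while */
End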
As illustrated in
It is likewise noted, although the invention is not limited in scope in this respect, that the merger or combining operation for the group frequency may be implemented in this particular embodiment by simply adding the frequencies together and a merger/combining operation for the second field of the data structure for this particular embodiment may be implemented as a “bitwise” logical OR operation. This provides advantages in terms of implementation in software and/or hardware. Another advantage of this particular embodiment is efficient use of memory, in addition to the ease of implementation of operations, such as summing and logical OR operations.
As previously described, a combining or merge operation results in two “groups” or “rows” being combined into one. Therefore, memory that has been allocated may be reused and the dynamic allocation of new memory after initialization is either reduced or avoided.
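The merge described above may be sketched as follows, assuming each group is held as a two-element list of frequency and bit flags and that the groups are kept sorted by descending frequency. The name `merge_last_two` and this layout are hypothetical and chosen only for illustration.

```python
def merge_last_two(groups):
    """Merge the two lowest-frequency groups of a descending-sorted list.

    Each group is a [frequency, bit_flags] pair. The list shrinks by one
    entry, so no additional rows are needed after initialization.
    """
    low_freq, low_flags = groups.pop()
    next_freq, next_flags = groups.pop()
    # Frequencies are summed; bit flags are combined with a bitwise OR.
    merged = [low_freq + next_freq, low_flags | next_flags]
    # Re-insert the merged group so the list stays in descending order.
    pos = 0
    while pos < len(groups) and groups[pos][0] >= merged[0]:
        pos += 1
    groups.insert(pos, merged)
    return merged


groups = [[9, 0b100], [5, 0b001], [2, 0b010]]
print(merge_last_two(groups), groups)
```

In a fixed-size array implementation the merged row could overwrite one of the two rows it replaces, which is one way the reuse of allocated memory noted above might be realized.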
Next, the length information in the first portion or part of the data structure for this particular embodiment is updated to reflect the previous merging or combining operation. This is illustrated, for example, for this particular embodiment, in FIG. 4. One way to implement this operation, although the invention is not restricted in scope in this respect, is by scanning the “one” bits of the merged bit flags. That is, in this particular embodiment, the second field in the second portion of the data structure is scanned, and the length information is increased or augmented by one in the corresponding entries in the first portion or part of the data structure.
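One possible way to scan the “one” bits is sketched below. The sketch assumes that bit k of the merged flags corresponds to entry k of a plain list of lengths; the name `bump_lengths` and that layout are hypothetical.

```python
def bump_lengths(lengths, merged_flags):
    """Add one to lengths[k] for every bit k that is set in merged_flags."""
    k = 0
    while merged_flags:
        if merged_flags & 1:
            lengths[k] += 1  # symbol k belongs to the merged group
        merged_flags >>= 1
        k += 1


lengths = [0, 0, 0, 0]
bump_lengths(lengths, 0b1010)  # merged group contains symbols 1 and 3
print(lengths)
```

Each merge pushes every symbol in the merged group one level deeper in the implied code tree, which is why a single increment per flagged symbol is sufficient.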
Next, the “counter,” here no_of_group, is reduced by one. The previous operations are repeated until the counter reaches the value one in this particular embodiment.
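Putting the operations together, the complete loop may be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation of the embodiment: the function name `huffman_code_lengths`, the tuple layout, the descending sort order, and the sample frequencies are all chosen for illustration.

```python
def huffman_code_lengths(freqs):
    """Return {symbol index: code length} for non-zero-frequency symbols.

    No tree is built; groups of symbols are tracked as bit flags instead.
    """
    symbols = [i for i, f in enumerate(freqs) if f > 0]
    lengths = [0] * len(symbols)  # first portion: lengths start at zero
    # Second portion: (frequency, bit flags), sorted by descending frequency.
    groups = sorted(((freqs[s], 1 << k) for k, s in enumerate(symbols)),
                    key=lambda group: group[0], reverse=True)
    no_of_group = len(groups)
    while no_of_group > 1:
        # Merge the two smallest-frequency groups (sum and bitwise OR).
        f2, b2 = groups.pop()
        f1, b1 = groups.pop()
        freq, flags = f1 + f2, b1 | b2
        # Re-insert the merged group, keeping the descending order.
        pos = 0
        while pos < len(groups) and groups[pos][0] >= freq:
            pos += 1
        groups.insert(pos, (freq, flags))
        # Add one to the length of every symbol flagged in the merged group.
        k = 0
        while flags:
            if flags & 1:
                lengths[k] += 1
            flags >>= 1
            k += 1
        no_of_group -= 1
    return dict(zip(symbols, lengths))


print(huffman_code_lengths([5, 9, 12, 13, 16, 45]))
```

For the sample frequencies the resulting lengths are 4, 4, 3, 3, 3 and 1, which satisfy the Kraft equality (2/16 + 3/8 + 1/2 = 1). Note that this sketch leaves a lone symbol with length zero; how a single-symbol input should be handled is not addressed here.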
It should be noted that for this particular embodiment, once the “counter” reaches one, as illustrated in
As previously described, for this particular embodiment of a method of generating code length information, several advantages exist. As previously discussed, in comparison, for example, with generating the Huffman tree, memory usage is reduced and the dynamic allocation of memory may be avoided or the amount of memory to be dynamically allocated is reduced. Likewise, computational complexity is reduced.
Likewise, as previously described, the operations employed to implement the previously described embodiment are relatively easy to implement in hardware or software, although the invention is not limited in scope to these particular operations. Thus, Huffman code length information may be extracted or produced without generating a Huffman tree.
In an alternative embodiment in accordance with the present invention, a method of encoding symbols may comprise encoding symbols using code length information; and generating the code length information without using a Huffman tree, such as, for example, using the embodiment previously described for generating code length information, although the invention is, of course, not limited in scope to the previous embodiment. It is, of course, understood in this context, that the length information is employed to encode symbols where the length information is generated from a Huffman code. Likewise, in another alternative embodiment in accordance with the present invention, a method of decoding symbols may comprise decoding symbols, wherein the symbols have been encoded using code length information and the code length information was generated without using a Huffman tree. It is, again, understood in this context, that the length information employed to encode symbols is generated from a Huffman code. Again, one approach to generate the code length information comprises the previously described embodiment.
It will, of course, be understood that, although particular embodiments have just been described, the invention is not limited in scope to a particular embodiment or implementation. For example, one embodiment may be in hardware, whereas another embodiment may be in software. Likewise, an embodiment may be in firmware, or any combination of hardware, software, or firmware, for example. Likewise, although the invention is not limited in scope in this respect, one embodiment may comprise an article, such as a storage medium. Such a storage medium, such as, for example, a CD-ROM, or a disk, may have stored thereon instructions, which when executed by a system, such as a computer system or platform, or an imaging system, may result in an embodiment of a method in accordance with the present invention being executed, such as a method of generating Huffman code length information, for example, as previously described. Likewise, embodiments of a method of initializing a data structure, encoding symbols, and/or decoding symbols, in accordance with the present invention, may be executed.
While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.
This patent application is a continuation of U.S. patent application Ser. No. 09/704,392, filed Oct. 31, 2000 now U.S. Pat. No. 6,636,167, titled “A Method of Generating Huffman Code Length Information.” The subject patent application also is related to U.S. patent application Ser. No. 09/704,380, filed Oct. 31, 2000, titled “A Method of Performing Huffman Decoding,” by Acharya et al., assigned to the assignee of the present invention and herein incorporated by reference. The subject patent application also is related to U.S. patent application Ser. No. 10/293,187, titled “A Method of Performing Huffman Decoding,” by Acharya et al., assigned to the assignee of the present invention. The subject patent application also is related to U.S. patent application Ser. No. 10/391,892, titled “A Method of Performing Huffman Decoding,” by Acharya et al., assigned to the assignee of the present invention.
Number | Name | Date | Kind |
---|---|---|---|
4813056 | Fedele | Mar 1989 | A |
5467088 | Kinouchi et al. | Nov 1995 | A |
5778371 | Fujihara | Jul 1998 | A |
5875122 | Acharya | Feb 1999 | A |
5973627 | Bakhmutsky | Oct 1999 | A |
5995210 | Acharya | Nov 1999 | A |
6009201 | Acharya | Dec 1999 | A |
6009206 | Acharya | Dec 1999 | A |
6047303 | Acharya | Apr 2000 | A |
6075470 | Little et al. | Jun 2000 | A |
6091851 | Acharya | Jul 2000 | A |
6094508 | Acharya et al. | Jul 2000 | A |
6108453 | Acharya | Aug 2000 | A |
6124811 | Acharya et al. | Sep 2000 | A |
6130960 | Acharya | Oct 2000 | A |
6151069 | Dunton et al. | Nov 2000 | A |
6151415 | Acharya et al. | Nov 2000 | A |
6154493 | Acharya et al. | Nov 2000 | A |
6166664 | Acharya | Dec 2000 | A |
6178269 | Acharya | Jan 2001 | B1 |
6195026 | Acharya | Feb 2001 | B1 |
6215908 | Pazmino et al. | Apr 2001 | B1 |
6215916 | Acharya | Apr 2001 | B1 |
6229578 | Acharya et al. | May 2001 | B1 |
6233358 | Acharya | May 2001 | B1 |
6236433 | Acharya et al. | May 2001 | B1 |
6236765 | Acharya | May 2001 | B1 |
6269181 | Acharya | Jul 2001 | B1 |
6275206 | Tsai et al. | Aug 2001 | B1 |
6285796 | Acharya et al. | Sep 2001 | B1 |
6292114 | Tsai et al. | Sep 2001 | B1 |
6292144 | Taflove et al. | Sep 2001 | B1 |
6301392 | Acharya | Oct 2001 | B1 |
6348929 | Acharya | Feb 2002 | B1 |
6351555 | Acharya et al. | Feb 2002 | B1 |
6356276 | Acharya | Mar 2002 | B1 |
6366692 | Acharya | Apr 2002 | B1 |
6366694 | Acharya | Apr 2002 | B1 |
6373481 | Tan et al. | Apr 2002 | B1 |
6377280 | Acharya et al. | Apr 2002 | B1 |
6381357 | Tan et al. | Apr 2002 | B1 |
6392699 | Acharya | May 2002 | B1 |
6449380 | Acharya et al. | Sep 2002 | B1 |
6535648 | Acharya | Mar 2003 | B1 |
6556242 | Dunton et al. | Apr 2003 | B1 |
6563439 | Acharya et al. | May 2003 | B1 |
6563948 | Tan et al. | May 2003 | B2 |
6574374 | Acharya | Jun 2003 | B1 |
6600833 | Tan et al. | Jul 2003 | B1 |
6608912 | Acharya et al. | Aug 2003 | B2 |
6625308 | Acharya et al. | Sep 2003 | B1 |
6625318 | Tan et al. | Sep 2003 | B1 |
6628716 | Tan et al. | Sep 2003 | B1 |
6628827 | Acharya | Sep 2003 | B1 |
6633610 | Acharya | Oct 2003 | B2 |
6636167 | Acharya et al. | Oct 2003 | B1 |
6639691 | Acharya | Oct 2003 | B2 |
6640017 | Tsai et al. | Oct 2003 | B1 |
6646577 | Acharya et al. | Nov 2003 | B2 |
6650688 | Acharya et al. | Nov 2003 | B1 |
6653953 | Becker et al. | Nov 2003 | B2 |
6654501 | Acharya et al. | Nov 2003 | B1 |
6658399 | Acharya et al. | Dec 2003 | B1 |
6662200 | Acharya | Dec 2003 | B2 |
6678708 | Acharya | Jan 2004 | B1 |
6681060 | Acharya et al. | Jan 2004 | B2 |
6690306 | Acharya et al. | Feb 2004 | B1 |
6694061 | Acharya | Feb 2004 | B1 |
6697534 | Tan et al. | Feb 2004 | B1 |
6707928 | Acharya et al. | Mar 2004 | B2 |
6725247 | Acharya | Apr 2004 | B2 |
6731706 | Acharya et al. | May 2004 | B1 |
6731807 | Pazmino et al. | May 2004 | B1 |
6738520 | Acharya et al. | May 2004 | B1 |
6748118 | Acharya et al. | Jun 2004 | B1 |
6751640 | Acharya | Jun 2004 | B1 |
6757430 | Metz et al. | Jun 2004 | B2 |
6759646 | Acharya et al. | Jul 2004 | B1 |
6766286 | Acharya | Jul 2004 | B2 |
6775413 | Acharya | Aug 2004 | B1 |
6795566 | Acharya et al. | Sep 2004 | B2 |
6795592 | Acharya et al. | Sep 2004 | B2 |
6798901 | Acharya et al. | Sep 2004 | B1 |
6813384 | Acharya et al. | Nov 2004 | B1 |
6825470 | Bawolek et al. | Nov 2004 | B1 |
6834123 | Acharya et al. | Dec 2004 | B2 |
20020063789 | Acharya et al. | May 2002 | A1 |
20020063899 | Acharya et al. | May 2002 | A1 |
20020101524 | Acharya | Aug 2002 | A1 |
20020118746 | Kim et al. | Aug 2002 | A1 |
20020122482 | Hyun et al. | Sep 2002 | A1 |
20020161807 | Acharya | Oct 2002 | A1 |
20020174154 | Acharya | Nov 2002 | A1 |
20020181593 | Acharya et al. | Dec 2002 | A1 |
20030021486 | Acharya | Jan 2003 | A1 |
20030053666 | Acharya et al. | Mar 2003 | A1 |
20030063782 | Acharya et al. | Apr 2003 | A1 |
20030067988 | Kim et al. | Apr 2003 | A1 |
20030072364 | Kim et al. | Apr 2003 | A1 |
20030108247 | Acharya | Jun 2003 | A1 |
20030123539 | Kim et al. | Jul 2003 | A1 |
20030126169 | Wang et al. | Jul 2003 | A1 |
20030174077 | Acharya et al. | Sep 2003 | A1 |
20030194008 | Acharya et al. | Oct 2003 | A1 |
20030194128 | Acharya et al. | Oct 2003 | A1 |
20030210164 | Acharya et al. | Nov 2003 | A1 |
20040017952 | Acharya et al. | Jan 2004 | A1 |
20040022433 | Acharya et al. | Feb 2004 | A1 |
20040042551 | Acharya et al. | Mar 2004 | A1 |
20040047422 | Acharya et al. | Mar 2004 | A1 |
20040057516 | Kim et al. | Mar 2004 | A1 |
20040057626 | Acharya et al. | Mar 2004 | A1 |
20040071350 | Acharya et al. | Apr 2004 | A1 |
20040080513 | Acharya | Apr 2004 | A1 |
20040146208 | Pazmino et al. | Jul 2004 | A1 |
20040158594 | Acharya | Aug 2004 | A1 |
20040169748 | Acharya | Sep 2004 | A1 |
20040169749 | Acharya | Sep 2004 | A1 |
20040172433 | Acharya et al. | Sep 2004 | A1 |
20040174446 | Acharya | Sep 2004 | A1 |
20040240714 | Acharya et al. | Dec 2004 | A1 |
Number | Date | Country |
---|---|---|
0 907 288 | Apr 1999 | EP |
Number | Date | Country |
---|---|---|
20030210164 A1 | Nov 2003 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09704392 | Oct 2000 | US |
Child | 10454553 | | US |