The present invention relates to the field of image processing. More specifically, the present invention relates to improved hardware parallelization.
When encoding an image or an image block, the pixels are typically processed in a raster scan order. Thus, the generated bitstream for an image or an image block is in the same order. Because of this form of encoding, encoding and decoding of the image occurs in a serial fashion. Therefore, a more efficient process of encoding and decoding is desired when the total number of bits generated for the image or image block is fixed (constant).
Bi-directional bitstream ordering is able to be used for expedited processing. The first part of the bitstream is coded in a standard format, but the end of the bitstream is coded in reverse order. In encoding and decoding, parallel processing is able to be implemented to provide more efficient (parallel and hence faster) encoding and decoding where a bitstream is separated and processed in parallel.
In one aspect, a method implemented in a controller of a device comprises placing a first set of bits of a block in a bitstream starting at the beginning of the bitstream and placing a second set of bits in the bitstream starting at the end of the bitstream in a reverse order. The block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream. The block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block. The device is an encoder. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
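As a minimal sketch of the placement step in this aspect, the following Python packs two sets of symbols into a fixed-length bit list: the first set is written forward from the first bit, the second set is written backward from the last bit, and any unused bits are left as zero padding in the middle. The unary code and the names `unary` and `pack_bidirectional` are assumptions made for illustration only; they are not the VLC or the syntax of the RAC or FNB codec.

```python
def unary(value):
    """Toy prefix code (an assumption): `value` ones, then a terminating zero."""
    return [1] * value + [0]


def pack_bidirectional(first_set, second_set, total_bits):
    """Pack two sets of symbols into a fixed-length list of bits.

    The first set is written forward starting at index 0. The second set is
    written backward starting at the last index, so a reader that begins at
    the end and walks toward the middle sees its codewords in natural order.
    """
    front = [bit for value in first_set for bit in unary(value)]
    back = [bit for value in second_set for bit in unary(value)]
    if len(front) + len(back) > total_bits:
        raise ValueError("symbols do not fit in the fixed bit budget")
    # Unused bits sit in the middle of the stream as zero padding.
    padding = [0] * (total_bits - len(front) - len(back))
    # Reversing `back` places its first bit at the very end of the stream.
    return front + padding + back[::-1]


bits = pack_bidirectional([1, 0], [2, 0], total_bits=8)
```

Because the total length is fixed, the position of the last bit is known before any symbol is decoded, which is what lets a second reader start from the end without first parsing the front of the stream.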
In another aspect, a method of decoding a bitstream in a controller of a device comprises decoding a first set of bits of a block in the bitstream using a first processor and decoding a second set of bits in the bitstream using a second processor, wherein the second set of bits are in a reverse order. The block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream. The block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block. The controller comprises hardware logic gates. The controller comprises a memory and a processor. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
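The decoding step in this aspect can be sketched in the same hedged way. Two workers read the same fixed-length bit list, one forward from the first bit and one backward from the last bit. The unary code, the function names, and the use of Python threads as stand-ins for the first and second processors are all assumptions for illustration; the sketch also ignores DPCM prediction, so it shows only the independence of the two read positions.

```python
from concurrent.futures import ThreadPoolExecutor


def read_unary(bits, pos, step, count):
    """Decode `count` toy unary symbols, moving `step` (+1 or -1) per bit."""
    symbols = []
    for _ in range(count):
        value = 0
        while bits[pos] == 1:
            value += 1
            pos += step
        pos += step  # skip the terminating zero
        symbols.append(value)
    return symbols


def decode_bidirectional(bits, symbols_per_set):
    """Decode both sets at once from opposite ends of the bitstream."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # First worker: starts at the first bit and reads forward.
        first = pool.submit(read_unary, bits, 0, +1, symbols_per_set)
        # Second worker: starts at the last bit and reads backward.
        second = pool.submit(read_unary, bits, len(bits) - 1, -1, symbols_per_set)
        return first.result(), second.result()


# Bits for first set [1, 0] and second set [2, 0] with one zero pad in the middle.
first_set, second_set = decode_bidirectional([1, 0, 0, 0, 0, 0, 1, 1], 2)
```

Neither worker needs to know where the other's codewords end, since each has a fixed starting point and a known symbol count.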
In another aspect, a device comprises an encoding module for placing a first set of bits of a block in a bitstream starting at the beginning of the bitstream and placing a second set of bits in the bitstream starting at the end of the bitstream in a reverse order, and a decoding module for decoding the first set of bits in the bitstream using a first processor and decoding the second set of bits in the bitstream using a second processor. The block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream. The block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block. The encoder module and the decoder module comprise hardware logic gates. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
In yet another aspect, a device comprises a memory for storing an application, the application for decoding a first set of bits in a bitstream using a first processor and decoding a second set of bits in the bitstream using a second processor, wherein the second set of bits are in a reverse order, and a processing component coupled to the memory, the processing component configured for processing the application. The block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream. The block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block.
In another aspect, an encoder comprises a bitstream size and reconstruction quality computation component for computing a bitstream size and a reconstruction quality using different values of a quantization number, a mode decision component for determining a best mode of the bitstream size and reconstruction quality, and a differential pulse code modulation encoding component for encoding a block of data using the best mode, wherein a variable length generation and concatenation block utilizes parallel processing to process bits of the block. The block is two lines and a first set of bits and a second set of bits are each approximately half of the bitstream, wherein the second set of bits are in reverse order. The block is more than two lines and a first set of bits are even lines and a second set of bits are odd lines of the block, wherein the second set of bits are in reverse order.
A random access capability (RAC) codec is described in U.S. patent application Ser. No. 12/789,010, filed May 27, 2010, titled, “IMAGE COMPRESSION METHOD WITH RANDOM ACCESS CAPABILITY,” which is hereby incorporated by reference. A fixed number of bits (FNB) codec is described in U.S. patent application Ser. No. 13/035,060, filed Feb. 25, 2011, titled, “METHOD OF COMPRESSION OF DIGITAL IMAGES USING A FIXED NUMBER OF BITS PER BLOCK,” which is hereby incorporated by reference.
A modified RAC codec or a modified FNB codec and its bitstream syntax are able to have the bits reordered in the bitstream to enable parallel decoding of each block. Parallel decoding branches within each block are able to reduce decoder delay or decoder gate size. The changes will also reduce delay and gate size in the Variable Length Coding (VLC) concatenation part of the encoder.
In the case of coding an 8×2 block in Differential Pulse Code Modulation (DPCM) mode in the RAC codec as shown in
In the RAC codec, samples are coded and decoded in raster scan order as shown in
If the starting location of the VLC bits for the sample 9 were known in advance, then samples 2 and 9 are able to be decoded in parallel, followed by samples 3 and 10 in parallel, samples 4 and 11 in parallel and so on. Therefore, a bi-directional syntax includes placing the bits of samples 1 to 8 from the beginning of the fixed-length bitstream and placing the bits of samples 9 to 16 from the end of the fixed-length bitstream, in reverse order, as shown in
Using bi-directional syntax and decoding, there is no change to the samples in the first line (samples 1-8). However, samples (or pixels) in the second line are written from the last bit of the bitstream, in reverse order. In the RAC codec, there is a fixed number of bits per block. With the bi-directional syntax, samples 2 and 9 are able to be decoded together, since there is no prediction dependency between them, and their VLC decoding is able to start together. Refinement bits and zero pads (if any) are placed in the middle of the bitstream as shown in
The syntax takes 9 clock cycles to decode in the 8×2 block case as opposed to the original 16 clock cycles. The decoding order and its parallelization are shown in
When a block has more than two lines, even lines are decoded in one decoding branch and odd lines are decoded in another branch as shown in
In the FNB codec, 8 of the 9 modes do not have any prediction dependency among the pixels; therefore, bi-directional parallel decoding is able to be applied. Only the LEFT mode has a prediction dependency. With a standard prediction chain, at least 16 clocks are used: sample 1 -> sample 2 -> sample 3 -> ... -> sample 16. A new LEFTRIGHT mode is able to replace LEFT, and the total delay is again reduced to 9 clocks. In some embodiments, since two VLC codewords are in the header, one of the codewords is able to be put at the tail of the bitstream. The new bitstream is shown in
In some embodiments, the bi-directional syntax application(s) 1030 include several applications and/or modules. Modules include an encoding module for encoding a bitstream according to the syntax described herein and a decoding module for decoding the bitstream according to the syntax described herein. In some embodiments, modules include one or more sub-modules as well. In some embodiments, fewer or additional modules are able to be included.
Examples of suitable computing devices include a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television, a home entertainment system or any other suitable computing device.
To utilize the bi-directional bitstream ordering, a device encodes or decodes data such as an image or video using the specified order to enable expedited processing. In decoding, parallel processing is able to be implemented to provide more efficient decoding. The encoding and decoding utilizing the bi-directional bitstream is able to occur automatically without user intervention.
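The 9-cycle figure given earlier for the 8×2 block can be reproduced with a short schedule sketch. It assumes, as one reading of the description, that sample 1 is decoded alone, that each remaining first-line sample k is then paired with second-line sample k + 7, and that sample 16 is decoded alone in the final cycle. The function name `decode_schedule` and its `width` parameter are hypothetical.

```python
def decode_schedule(width=8):
    """List the samples decoded in each clock cycle for a width x 2 block.

    Samples are numbered 1..2*width in raster scan order.
    """
    cycles = [[1]]  # cycle 1: the first sample on its own
    for k in range(2, width + 1):
        # First-line sample k alongside second-line sample k + width - 1.
        cycles.append([k, k + width - 1])
    cycles.append([2 * width])  # final cycle: the last second-line sample
    return cycles


schedule = decode_schedule()
for cycle, samples in enumerate(schedule, start=1):
    print(f"cycle {cycle}: samples {samples}")
```

For the default width of 8 the schedule has 9 entries, against 16 cycles when the samples are decoded one at a time in raster scan order.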
In operation, the bi-directional bitstream ordering speeds up the RAC and FNB decoding within a block. The technique is able to be applied to any codec where the block bit budget is fixed, and the number of bits is known in advance, before starting the encoding or decoding. The technique is demonstrated for the RAC codec and FNB codec herein. There is no performance loss compared to the RAC codec and FNB codec. The method is useful for both encoding and decoding.
1. A method implemented in a controller of a device comprising:
2. The method of clause 1 wherein the block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream.
3. The method of clause 1 wherein the block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block.
4. The method of clause 1 wherein the device is an encoder.
5. The method of clause 1 wherein the device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
6. A method of decoding a bitstream in a controller of a device comprising:
7. The method of clause 6 wherein the block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream.
8. The method of clause 6 wherein the block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block.
9. The method of clause 6 wherein the controller comprises hardware logic gates.
10. The method of clause 6 wherein the controller comprises a memory and a processor.
11. The method of clause 6 wherein the device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
12. A device comprising:
13. The device of clause 12 wherein the block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream.
14. The device of clause 12 wherein the block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block.
15. The device of clause 12 wherein the encoder module and the decoder module comprise hardware logic gates.
16. The device of clause 12 wherein the device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone/iPad, a video player, a DVD writer/player, a Blu-ray® writer/player, a television and a home entertainment system.
17. A device comprising:
18. The device of clause 17 wherein the block is two lines and the first set of bits and the second set of bits are each approximately half of the bitstream.
19. The device of clause 17 wherein the block is more than two lines and the first set of bits are even lines and the second set of bits are odd lines of the block.
20. An encoder comprising:
21. The encoder of clause 20 wherein the block is two lines and a first set of bits and a second set of bits are each approximately half of the bitstream, wherein the second set of bits are in reverse order.
22. The encoder of clause 20 wherein the block is more than two lines and a first set of bits are even lines and a second set of bits are odd lines of the block, wherein the second set of bits are in reverse order.
The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that other various modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.
This application claims priority under 35 U.S.C. §119(e) of the U.S. Provisional Patent Application Ser. No. 61/432,879, filed Jan. 14, 2011 and titled, “PROPOSAL FOR CHANGES IN PE-L2 AND PE-H2 CODECS FOR POTENTIALLY BETTER HARDWARE PARALLELIZATION.” The Provisional Patent Application, Ser. No. 61/432,879, filed Jan. 14, 2011 and titled, “PROPOSAL FOR CHANGES IN PE-L2 AND PE-H2 CODECS FOR POTENTIALLY BETTER HARDWARE PARALLELIZATION” is also hereby incorporated by reference in its entirety for all purposes.
Number | Date | Country
---|---|---
61432879 | Jan 2011 | US