1. Field of the Invention
The invention is related to the field of decoding systems and, in particular, to parallel decoding of Run Length Limited (RLL) encoded datastreams.
2. Statement of the Problem
Print data transmitted between various processes within a printing system is typically encoded to reduce the amount of bandwidth required to transmit the data. Before the encoded print data is finally printed, the print data is decoded. In some cases, however, the printing speed of the printing system is not limited by a print engine printing the data, but instead by the speed at which the printing system decodes the print data prior to printing the data.
One type of data encoding is RLL encoding. RLL encoding is a lossless compression scheme which bounds the length of runs of repeat data during which the signal does not change. If the runs are too long, then clock recovery becomes difficult. If the runs are too short, then the communication channel might attenuate the high frequencies within the signal.
Apple Computer, Inc. introduced an RLL encoding scheme called PackBits with the release of the Macintosh® computer. A PackBits datastream includes packets with a one-byte header followed by one or more bytes of data. The header is a signed byte that defines both the type of the data that follows (literal data or repeat data) and the number of bytes of encoded literal data or encoded repeat data.
One problem with decoding RLL datastreams, such as PackBits, is that the decoding scheme inherently requires serial processing of each byte of the datastream to determine how to treat each subsequent byte of the datastream. The serial processing of each byte of the datastream can limit the performance of systems relying on the decoded output of an RLL datastream, such as printing systems.
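The serial dependency can be seen in a minimal Python sketch of a conventional PackBits decoder. The sketch is illustrative only and is not part of the embodiments described herein. The names `to_signed` and `packbits_decode_serial` are hypothetical, and the header convention used (0 to 127 literal, -1 to -127 repeat, -128 skipped) is the commonly documented PackBits convention, assumed here.

```python
def to_signed(byte: int) -> int:
    """Interpret an unsigned byte (0..255) as a signed 8-bit value."""
    return byte - 256 if byte > 127 else byte


def packbits_decode_serial(stream: bytes) -> bytes:
    """Decode a PackBits stream one packet at a time (serial baseline)."""
    out = bytearray()
    pos = 0
    while pos < len(stream):
        n = to_signed(stream[pos])
        if n >= 0:
            # Literal packet: copy the next 1+n bytes unchanged.
            count = 1 + n
            out += stream[pos + 1 : pos + 1 + count]
            pos += 1 + count
        elif n != -128:
            # Repeat packet: emit the next byte 1-n times.
            out += stream[pos + 1 : pos + 2] * (1 - n)
            pos += 2
        else:
            # Assumed convention: -128 is a skip header with no data.
            pos += 1
    return bytes(out)


# Header 2 -> 3 literal bytes; header 0xFB (-5) -> next byte repeated 6 times.
print(packbits_decode_serial(bytes([2, 0x41, 0x42, 0x43, 0xFB, 0x5A])))
# prints b'ABCZZZZZZ'
```

Note that `pos` for each packet is only known after the previous header has been read, which is the dependency that forces byte-serial processing in this baseline.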
Embodiments herein describe parallel decoding of RLL encoded datastreams. Sequential headers defining blocks of RLL encoded data are identified from the datastream. The blocks of RLL encoded data are parsed from the datastream and decoded in parallel to generate a decoded output. Decoding the datastream in parallel improves decoding performance as compared to decoding the datastream serially.
One embodiment comprises a decoding system including a parsing system, a first decoder, and a second decoder. The parsing system is operable to receive a Run Length Limited (RLL) encoded datastream, and to identify a first header within the datastream that defines a first number of data blocks subsequent to the first header and a first RLL encoding of the first number of data blocks. The parsing system is further operable to identify a following header within the datastream that defines a following number of data blocks subsequent to the following header and a following RLL encoding of the following number of data blocks. The first decoder is operable to decode the first number of data blocks based on the first RLL encoding, and the second decoder is operable to decode the following number of data blocks based on the following RLL encoding in parallel with the first decoder to generate an output.
Another embodiment comprises a method of parallel decoding of a Run Length Limited (RLL) encoded datastream. According to the method, the RLL datastream is received. A first header within the datastream is identified that defines a first number of data blocks subsequent to the first header and a first RLL encoding of the first number of data blocks. A following header within the datastream is identified that defines a following number of data blocks subsequent to the following header and a following RLL encoding of the following number of data blocks. The first number of data blocks are decoded in parallel with the following number of data blocks. The first number of data blocks are decoded based on the first RLL encoding. The following number of data blocks are decoded based on the following RLL encoding. The decoding of the first number of data blocks and the following number of data blocks generates a decoded output.
Other exemplary embodiments may be described below.
Some embodiments of the present invention are now described, by way of example only, and with reference to the accompanying drawings. The same reference number represents the same element or the same type of element on all drawings.
The figures and the following description illustrate specific exemplary embodiments of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within the scope of the invention. Furthermore, any examples described herein are intended to aid in understanding the principles of the invention, and are to be construed as being without limitation to such specifically recited examples and conditions. As a result, the invention is not limited to the specific embodiments or examples described below, but by the claims and their equivalents.
In step 302, parsing system 102 of decoding system 100 receives datastream 108. Datastream 108 may comprise any RLL encoded datastream, such as the RLL encoded datastream illustrated in
In step 304, parsing system 102 identifies a following header 203 within datastream 108. Following header 203 defines a following number of data blocks 208 that are subsequent to following header 203 in datastream 108. Following header 203 also defines a following RLL encoding for the following number of data blocks 208.
In step 306, first decoder 104 and second decoder 106 decode their respective portions of datastream 108 in parallel to generate an output. First decoder 104 decodes the first number of data blocks 207 based on the first RLL encoding. Second decoder 106 decodes the following number of data blocks 208 based on the following RLL encoding in parallel with first decoder 104 to generate an output 110. By identifying and decoding data blocks 207 and 208 in parallel, decoding system 100 provides for an improved decode performance of datastream 108 as compared to decoding datastream 108 in a serial manner. After generating an output 110, additional downstream headers may be identified within datastream 108 and blocks of data decoded until print datastream 108 is decoded in its entirety. As datastream 108 is processed in this manner, steps 302-306 may be repeated as illustrated in
In some cases, print data is encoded in an RLL format for transmission between print controller processes to reduce the amount of bandwidth and/or the memory storage usage of the print data. For example, a host may generate a Page Description Language (PDL) print datastream for a printer. After the printer receives the PDL print datastream, the printer may rasterize the PDL datastream and generate an RLL encoded datastream for subsequent processing and transmission to a print engine for output. In addition, while
In step 502, print controller 402 identifies first header 202 within datastream 108 based on previous header 201. For example, rasterizer 406 may receive PDL print data 410 from host system 404, and convert PDL print data 410 into datastream 108. If datastream 108 is a PackBits datastream, then print controller 402 may first identify previous header 201 as defining literal data. In a PackBits datastream, a header may be a literal header, a repeat header, or a skip header.
The following table illustrates how headers are encoded in PackBits:
When the signed header ranges from 0 to 127, the header defines the bytes subsequent to the header in the datastream as literal data. The value of the header also defines the number of bytes of literal data as 1+n, where n is the signed value of the header. For example, if previous header 201 has a value of 2, then previous header 201 defines that the next 3 bytes in datastream 108 are literal data bytes. This is indicated by data block 206 in
After print controller 402 identifies previous header 201, print controller 402 may then generate an offset within datastream 108 to locate first header 202 based on previous header 201. In the example, previous header 201 defines 3 bytes of literal encoded data (e.g., previous header 201 has a value of 2 such that previous header 201 defines 2+1 bytes of subsequent literal data in datastream 108). Thus, print controller 402 may calculate a 4 byte offset (i.e., 1 header byte plus 3 data bytes) from previous header 201 to locate first header 202. After locating first header 202 in datastream 108, print controller 402 may then identify, for example, that first header 202 has a value of 4. Because first header 202 resides within the range from 0 to 127, first header 202 also defines 5 bytes (i.e., 4+1) of literal data in datastream 108. This is indicated by data block 207.
In step 504, print controller 402 identifies a second header 203 within datastream 108 based on first header 202. In continuing with the example, first header 202 defines 5 bytes of literal encoded data. Print controller 402 may then identify second header 203 within datastream 108 based on a 6 byte offset (1 header byte plus 5 data bytes) from first header 202. After locating second header 203 in datastream 108, print controller 402 may then identify second header 203 as defining repeat encoded data.
Referring again to table 1 above, when the header value resides within a range of −1 to −127, the header defines one byte of data repeated 1-n times in the decoded output, where n is the signed value of the header. For example, if a header has a value of −5, then the header defines that the following byte is repeated 6 times (i.e., 1-(−5)) in the decoded output. Because the data byte is repeated, only one byte of data in the datastream is used to represent the decoded output.
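The literal and repeat header rules described above can be summarized in a short Python sketch. The function name `classify_header` is hypothetical, and treating the value −128 as the skip header is an assumption based on the commonly documented PackBits convention rather than on the text above.

```python
from typing import Tuple


def classify_header(byte: int) -> Tuple[str, int]:
    """Return (kind, decoded byte count) for one PackBits header byte."""
    # Reinterpret the raw byte (0..255) as a signed 8-bit value n.
    n = byte - 256 if byte > 127 else byte
    if 0 <= n <= 127:
        # Literal header: the next 1+n bytes are copied as-is.
        return ("literal", 1 + n)
    if -127 <= n <= -1:
        # Repeat header: the next byte is repeated 1-n times.
        return ("repeat", 1 - n)
    # Assumption: the remaining value, -128, is the skip header.
    return ("skip", 0)


print(classify_header(2))     # prints ('literal', 3)
print(classify_header(0xFB))  # 0xFB is -5 as a signed byte; prints ('repeat', 6)
```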
In step 506, print controller 402 decodes the first number of data bytes defined by first header 202 in step 502 (i.e., data block 207) in parallel with the second number of data bytes defined by second header 203 in step 504 (i.e., data block 208) to generate output 110. Because a portion of datastream 108 remains to be decoded, processing returns to step 502. Steps 502-506 will be described with reference to headers 204-205 and data blocks 209-210 in
Returning to step 502, print controller 402 identifies header 204 based on header 203 (previously described as second header 203). In the example, header 203 defines repeat encoded data. Thus, header 203 defines one byte of data, repeated (1-n) times in the decoded output. Print controller 402 may then identify header 204 based on a 2 byte offset (1 header byte plus 1 repeat byte) from header 203. After locating header 204, print controller 402 may then identify header 204, for example, as defining 4 bytes of literal encoded data as indicated in data block 209.
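The offset arithmetic used to walk from header to header can be sketched as follows. This is a minimal Python illustration, not the described hardware. The function name `header_offsets` and the payload byte values in `example` are hypothetical, while the header values mirror the example above (2, then 4, then −5, then a header defining 4 literal bytes, which under the 1+n rule has the value 3).

```python
from typing import List, Tuple


def header_offsets(stream: bytes) -> List[Tuple[int, int]]:
    """Return (offset, signed header value) for each header in the stream."""
    found = []
    pos = 0
    while pos < len(stream):
        n = stream[pos] - 256 if stream[pos] > 127 else stream[pos]
        found.append((pos, n))
        if n >= 0:
            pos += 1 + (1 + n)  # 1 header byte plus 1+n literal bytes
        elif n != -128:
            pos += 2            # 1 header byte plus 1 repeat byte
        else:
            pos += 1            # assumed skip header: header byte only
    return found


example = bytes(
    [2, 10, 11, 12]              # previous header 201: 3 literal bytes
    + [4, 20, 21, 22, 23, 24]    # first header 202: 5 literal bytes
    + [0xFB, 0x5A]               # header 203 (-5): one byte repeated 6 times
    + [3, 30, 31, 32, 33]        # header 204: 4 literal bytes
)
print(header_offsets(example))
# prints [(0, 2), (4, 4), (10, -5), (12, 3)]
```

The printed offsets reproduce the 4 byte, 6 byte, and 2 byte steps described in the example.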
In step 504, print controller 402 identifies a header 205 based on header 204. In the example, header 204 defines 4 bytes of literal encoded data. Print controller 402 may then identify header 205 based on a 5 byte offset (1 header byte plus 4 data bytes) from header 204. After locating header 205, print controller 402 may then identify header 205 as defining 4 bytes of literal data (e.g., header 205 has a value of 3) as indicated in block 210.
In step 506, print controller 402 decodes the number of data bytes defined by header 204 in step 502 (i.e., data block 209) in parallel with the number of data bytes defined by header 205 in step 504 (i.e., data block 210) to generate output 110. Steps 502-506 are repeated until datastream 108 is decoded in its entirety. After print controller 402 generates output 110, accumulator 408 combines output 110 into datastream 410 for print engine 412 to generate a printed output.
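Steps 502-506 can be approximated in software by splitting the datastream into header-defined blocks and decoding two consecutive blocks at a time. The following Python sketch is an assumption-laden illustration rather than the described implementation: the names `split_blocks`, `decode_block`, and `decode_in_pairs` are hypothetical, the two-worker thread pool merely stands in for the two parallel decoders, and −128 is assumed to be the skip header.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, Tuple

Block = Tuple[int, bytes]  # (signed header value, encoded payload bytes)


def split_blocks(stream: bytes) -> Iterator[Block]:
    """Walk the headers and yield each header with its encoded payload."""
    pos = 0
    while pos < len(stream):
        n = stream[pos] - 256 if stream[pos] > 127 else stream[pos]
        if n >= 0:
            size = 1 + n   # literal payload length
        elif n != -128:
            size = 1       # a single repeat byte
        else:
            size = 0       # assumed skip header carries no payload
        yield n, stream[pos + 1 : pos + 1 + size]
        pos += 1 + size


def decode_block(block: Block) -> bytes:
    """Decode one block independently of every other block."""
    n, payload = block
    if n >= 0:
        return payload
    if n != -128:
        return payload * (1 - n)
    return b""


def decode_in_pairs(stream: bytes) -> bytes:
    """Decode consecutive blocks two at a time, keeping stream order."""
    blocks = list(split_blocks(stream))
    out = bytearray()
    with ThreadPoolExecutor(max_workers=2) as pool:
        for i in range(0, len(blocks), 2):
            # pool.map returns results in input order, so the first
            # block's output always precedes the following block's output.
            for decoded in pool.map(decode_block, blocks[i : i + 2]):
                out += decoded
    return bytes(out)


print(decode_in_pairs(bytes([2, 65, 66, 67, 0xFB, 90, 0, 33])))
# prints b'ABCZZZZZZ!'
```

In CPython, threads will not speed up this byte-level work; the sketch only shows that, once the headers are located, each block can be decoded without reference to its neighbors.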
Decoder 604 receives eight bytes of data comprising datastream 108 from buffer 602, decodes the first byte of datastream 108 (in our example, we will consider first header 202 (see
Flow control 618 is a flow control state machine across all decoding stages, collecting stall signals from the decoding stages to halt reads at the system input. Pointer 620 contains information about the current decode location within datastream 108. Literal data 626 and literal data 628 transmit literal data decoded from decoder 604 to decoder 608 and decoder 612. Decode states 606 and 610 track current pointers and counters for each corresponding stage's progress in decoding the datastream.
Decoder 608 receives up to two valid header decodes from decoder 604. Decoder 608 implements the repeat byte command by multiplying one to eight bytes of data output. Decoder 608 implements the literal output by accepting from one to eight bytes at a time from decoder 604 and passing the literal output on to decoder 612. Decoder 608 may receive ignore/skip headers from decoder 604, but decoder 608 ignores the ignore/skip headers. Decoder 608 recognizes valid data from decoder 604 by decoding a passed-on header from decoder 604. Decoder 608 will stall using state machine 622 if decoder 604 is implementing a repeat output over multiple cycles.
Decoder 612 accepts valid repeat and literal data from decoder 608. Decoder 612 can process up to two full eight-byte words from decoder 608. Decoder 612 may also be stalled by state machine 624 when accumulator 614 is full. State machine 624 may also stall decoder 608 and decoder 604 when this occurs. Accumulator 614 recombines the parallelized multiple decoded data streams while maintaining the original order of the incoming data from decoder 612. Accumulator 614 is designed to be twice as wide as the preceding data path to accommodate dual streams of data without causing a stall in the decode stages. The ability of accumulator 614 to output 16 bytes at a time allows the overall throughput to stay very high even if a decode stage stalls occasionally due to poor compression.
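The order-preserving, double-width behavior of the accumulator can be modeled in software. The following Python sketch is a simplified behavioral model under stated assumptions, not a description of the hardware: the class name `Accumulator`, its methods, and the `flush` operation are hypothetical, and stall handling is omitted.

```python
from typing import List


class Accumulator:
    """Behavioral model: accepts up to two decoded words per cycle and
    emits 16-byte words in the original stream order."""

    WORD = 16  # output width in bytes, twice the eight-byte data path

    def __init__(self) -> None:
        self._pending = bytearray()

    def push(self, first: bytes, second: bytes = b"") -> None:
        # The first stream's bytes are appended before the second
        # stream's bytes, which preserves the original ordering.
        self._pending += first
        self._pending += second

    def pop_words(self) -> List[bytes]:
        """Return every complete 16-byte word currently available."""
        words = []
        while len(self._pending) >= self.WORD:
            words.append(bytes(self._pending[: self.WORD]))
            del self._pending[: self.WORD]
        return words

    def flush(self) -> bytes:
        """Return any trailing bytes at the end of the datastream."""
        rest = bytes(self._pending)
        self._pending.clear()
        return rest


acc = Accumulator()
acc.push(b"ABCDEFGH", b"IJKL")   # 12 bytes pending, no full word yet
acc.push(b"MNOP", b"QR")         # 18 bytes pending
print(acc.pop_words())           # prints [b'ABCDEFGHIJKLMNOP']
print(acc.flush())               # prints b'QR'
```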
Any of the various elements shown in the figures or described herein may be implemented as hardware, software, firmware, or some combination of these. For example, an element may be implemented as dedicated hardware. Dedicated hardware elements may be referred to as “processors”, “controllers”, or some similar terminology. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, a network processor, application specific integrated circuit (ASIC) or other circuitry, field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), non volatile storage, logic, or some other physical hardware component or module.
Also, an element may be implemented as instructions executable by a processor or a computer to perform the functions of the element. Some examples of instructions are software, program code, and firmware. The instructions are operational when executed by the processor to direct the processor to perform the functions of the element. The instructions may be stored on storage devices that are readable by the processor. Some examples of the storage devices are digital or solid-state memories, magnetic storage media such as magnetic disks and magnetic tapes, hard drives, or optically readable digital data storage media.
Although specific embodiments were described herein, the scope of the invention is not limited to those specific embodiments. The scope of the invention is defined by the following claims and any equivalents thereof.