The invention relates generally to embedded microprocessor architectures and more specifically to a clip instruction for SIMD microprocessor architectures and a method of performing a clip operation using such a clip instruction.
Single instruction multiple data (SIMD) architectures have become increasingly important as demand for video processing in electronic devices has increased. The SIMD architecture exploits the data parallelism that is abundant in data manipulations often found in media related applications, such as discrete cosine transforms (DCT) and filters. Data parallelism exists when a large mass of data of uniform type needs the same instruction performed on it. Thus, in contrast to a single instruction single data (SISD) architecture, in a SIMD architecture a single instruction may be used to effect an operation on a wide block of data. SIMD architecture exploits parallelism in the data stream while SISD can only operate on data sequentially.
An example of an application that takes advantage of SIMD is one where the same value is being added to a large number of data points, a common operation in many media applications. One example of this is changing the brightness of a graphic image. Each pixel of the image may consist of three values for the brightness of the red, green and blue portions of the color. To change the brightness, the R, G and B values, or alternatively the YUV values, are read from memory, a value is added to each, and the resulting values are written back to memory. A SIMD processor enhances performance of this type of operation over that of a SISD processor. A reason for this improvement is that in SIMD architectures, data is understood to be in blocks and a number of values can be loaded at once. Instead of a series of instructions to incrementally fetch individual pixels, a SIMD processor will have a single instruction that effectively says “get all these pixels.” Another advantage of SIMD machines is that multiple pieces of data are operated on simultaneously. Thus, a single instruction can say “perform this operation on all the pixels.” Thus, SIMD machines are much more efficient in exploiting data parallelism than SISD machines.
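The brightness example can be sketched in C as follows. This is only an illustrative model, not code from the disclosure: the function names, the `LANES` constant and the flat array layout of pixel components are assumptions made for the sketch, and the inner loop of the second function merely stands in for what a single vector add instruction would do in hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* assumed: eight 16-bit components per 128-bit vector */

/* SISD style: one component is fetched, adjusted and stored per iteration. */
void brighten_scalar(int16_t *px, size_t n, int16_t delta)
{
    assert(px != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        px[i] = (int16_t)(px[i] + delta);
}

/* SIMD style, modelled in plain C: each outer iteration represents ONE
 * vector instruction that adds delta to LANES components at once. */
void brighten_blocks(int16_t *px, size_t n, int16_t delta)
{
    assert(px != NULL || n == 0);
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (size_t lane = 0; lane < LANES; lane++)
            px[i + lane] = (int16_t)(px[i + lane] + delta);
    for (; i < n; i++) /* leftover components that do not fill a vector */
        px[i] = (int16_t)(px[i] + delta);
}
```

Both functions produce the same result; the difference is that the blocked form issues roughly one eighth as many load, add and store operations when each block maps to a single vector instruction.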
SIMD architectures have particular promise for video encoding/decoding applications where many repetitive numerical computations must be performed on relatively large blocks of data. Numerical computation algorithms, such as those common in video encoding/decoding, often require results to be clipped to be within a specified range of values. For example, in video processing, a system will have a maximum pixel depth depending on the system's resolution. If the value of an intermediate calculation result, such as an interpolation or other calculation, exceeds the maximum value, the final result will have to be clipped to the saturation value, for example, the maximum pixel value.
Clipping is typically implemented in software using a sequence of instructions that first test the intermediate value and then conditionally assign the final value, for example, if value>maximum, then value=maximum. Such a software clipping implementation incurs a high overhead due to the number of calculations required to test each value. The sequential nature of a software implementation makes it very difficult to optimize in processors designed to exploit instruction level parallelism, such as, for example, SISD reduced instruction set (RISC) machines or very long instruction word (VLIW) machines. Some processors do implement clipping at the hardware level using specialized processor instructions; however, the clipping ranges of these instructions are fixed to some value, typically a power of two.
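A minimal C sketch of the conventional software approach is given below. The function names and the choice of 16-bit signed values are assumptions for illustration. Each value requires its own compare and conditional assignment, which is the per-value overhead described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clip one value to [lo, hi] using explicit test-then-assign steps. */
int16_t clip_scalar(int16_t value, int16_t lo, int16_t hi)
{
    assert(lo <= hi);
    if (value > hi) /* test against the maximum ... */
        value = hi; /* ... then conditionally assign */
    if (value < lo) /* test against the minimum ... */
        value = lo; /* ... then conditionally assign */
    return value;
}

/* Clipping a buffer repeats the tests sequentially for every element. */
void clip_buffer(int16_t *v, size_t n, int16_t lo, int16_t hi)
{
    for (size_t i = 0; i < n; i++)
        v[i] = clip_scalar(v[i], lo, hi);
}
```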
Thus, there exists a need for a SIMD microprocessor architecture that ameliorates at least some of the above-noted deficiencies of conventional systems. At least one embodiment of the invention may provide a parameterizable microprocessor clip instruction. The parameterizable microprocessor clip instruction according to this embodiment may comprise a destination register operand, a source register operand of a value to be clipped, and a second source operand containing the control parameter specifying the manner in which clipping is to be performed, wherein the control parameter comprises a range type and range specifier. It should be appreciated that in the context of a SIMD machine, the source operand containing the “value” to be clipped is really referring to the values to be clipped because a 128-bit register is used to hold 8 16-bit values to be clipped by a single instruction.
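The behavior of such an instruction can be modelled in C as sketched below. The sketch assumes eight signed 16-bit lanes per 128-bit register, as stated above. The two range types, the `clip_ctrl_t` layout and the name `vclip` are hypothetical and chosen only for illustration; this passage does not define the actual encoding of the range type or range specifier.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8 /* eight 16-bit values in one 128-bit register */

typedef struct { int16_t lane[LANES]; } vec128_t;

/* Hypothetical range types; the real encoding is not given here. */
typedef enum {
    RANGE_ZERO_TO_MAX, /* clip to [0, spec] */
    RANGE_SYMMETRIC    /* clip to [-spec, +spec] */
} range_type_t;

/* Hypothetical control parameter: a range type plus a range specifier. */
typedef struct {
    range_type_t type;
    int16_t spec;
} clip_ctrl_t;

/* Software model of one clip instruction: all lanes are clipped by one call. */
vec128_t vclip(vec128_t src, clip_ctrl_t ctrl)
{
    assert(ctrl.spec >= 0);
    int16_t hi = ctrl.spec;
    int16_t lo = (ctrl.type == RANGE_SYMMETRIC) ? (int16_t)-ctrl.spec : 0;
    vec128_t dst;
    for (int i = 0; i < LANES; i++) {
        int16_t v = src.lane[i];
        dst.lane[i] = (v > hi) ? hi : (v < lo) ? lo : v;
    }
    return dst;
}
```

Because the limits come from the control operand rather than from the opcode, the same instruction can serve ranges that are not powers of two.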
Accordingly, at least one embodiment of the invention may provide a method of causing a microprocessor to perform a clip operation. The method according to this embodiment may comprise providing an assembly instruction to the microprocessor, the instruction comprising an input address, an output address and a controlling parameter, decoding the instruction with logic in the microprocessor, retrieving a data input from the input address, determining a specific clip operation based on the controlling parameter, performing the clip operation on the data input, and writing the result to the output address.
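The sequence of steps in this method can be illustrated with a small software model. Everything specific in the sketch is an assumption made for illustration: the four-entry register file, the `clip_instr_t` fields, and in particular the packing of the controlling parameter (bit 15 selecting a symmetric range, bits 14 to 0 holding the limit) are hypothetical and are not taken from the disclosure.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8
#define NUM_REGS 4 /* assumed size of the modelled vector register file */

typedef struct { int16_t lane[LANES]; } vec128_t;

/* Hypothetical instruction fields: output address, input address, control. */
typedef struct {
    unsigned dst;  /* output (destination) register index */
    unsigned src;  /* input (source) register index */
    uint16_t ctrl; /* hypothetical: bit 15 = symmetric, bits 14..0 = limit */
} clip_instr_t;

void execute_clip(vec128_t regs[NUM_REGS], clip_instr_t in)
{
    /* Decode: validate the operand fields of the instruction. */
    assert(in.dst < NUM_REGS && in.src < NUM_REGS);

    /* Retrieve the data input from the input address. */
    vec128_t data = regs[in.src];

    /* Determine the specific clip operation from the controlling parameter. */
    int16_t hi = (int16_t)(in.ctrl & 0x7FFF);
    int16_t lo = (in.ctrl & 0x8000) ? (int16_t)-hi : 0;

    /* Perform the clip operation on every lane of the data input. */
    for (int i = 0; i < LANES; i++) {
        if (data.lane[i] > hi) data.lane[i] = hi;
        if (data.lane[i] < lo) data.lane[i] = lo;
    }

    /* Write the result to the output address. */
    regs[in.dst] = data;
}
```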
Another embodiment of the invention may provide a method of performing a clip operation with a single parameterizable assembly language-based clip instruction executing on a microprocessor. The method of performing a clip operation with a single parameterizable assembly language-based clip instruction executing on a microprocessor may comprise specifying a source address of a data input, a destination address of a clipped output and a controlling parameter in a single instruction, obtaining the data input at the source address, performing the clip operation on the data input in accordance with the controlling parameter, and storing the result at the destination address.
At least one other embodiment of the invention may provide a parameterizable assembly language program instruction for performing a clip operation in a video processing application. The parameterizable assembly language program instruction according to this embodiment may comprise an instruction name for a particular microprocessor instruction, a first instruction input operand comprising a destination register address to write an instruction result, a second instruction input operand comprising a source register address containing a value to be clipped, and a third instruction input operand comprising a controlling parameter.
These and other embodiments and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
In order to facilitate a fuller understanding of the present disclosure, reference is now made to the accompanying drawings, in which like elements are referenced with like numerals. These drawings should not be construed as limiting the present disclosure, but are intended to be exemplary only.
The following description is intended to convey a thorough understanding of the embodiments described by providing a number of specific embodiments and details involving microprocessor architecture and systems and methods for performing clip operations with a parameterizable clip instruction. It should be appreciated, however, that the present invention is not limited to these specific embodiments and details, which are exemplary only. It is further understood that one possessing ordinary skill in the art, in light of known systems and methods, would appreciate the use of the invention for its intended purposes and benefits in any number of alternative embodiments, depending upon specific design and other needs.
Referring now to
Conventionally, clipping is implemented in software using a sequence of instructions that first test the intermediate value and then conditionally assign the final value, for example, if value>maximum, then value=maximum. Such a software clipping implementation incurs a high overhead due to the number of calculations required to test each value. The sequential nature of a software implementation makes it very difficult to optimize in processors designed to exploit instruction level parallelism, such as, for example, SISD reduced instruction set (RISC) machines or very long instruction word (VLIW) machines. Some processors do implement clipping at the hardware level using specialized processor instructions; however, the clipping ranges of these instructions are fixed to some value, typically a power of two. Therefore, various embodiments of this invention provide a parameterizable clip instruction for a microprocessor that enables adjustment of clipping parameters.
Referring to
In the example of
In the table 110 of
Referring now to
The embodiments of the present inventions are not to be limited in scope by the specific embodiments described herein. For example, although many of the embodiments disclosed herein have been described with reference to systems and methods for performing clip operations with a parameterizable clip instruction, the principles herein are equally applicable to other aspects of microprocessor design and function. Indeed, various modifications of the embodiments of the present inventions, in addition to those described herein, will be apparent to those of ordinary skill in the art from the foregoing description and accompanying drawings. Thus, such modifications are intended to fall within the scope of the following appended claims. Further, although some of the embodiments of the present invention have been described herein in the context of a particular implementation in a particular environment for a particular purpose, those of ordinary skill in the art will recognize that its usefulness is not limited thereto and that the embodiments of the present inventions can be beneficially implemented in any number of environments for any number of purposes. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the embodiments of the present inventions as disclosed herein.
This application claims priority to U.S. Provisional Patent Application No. 60/721,108 titled “SIMD Architecture and Associated Systems and Methods,” filed Sep. 28, 2005, the disclosure of which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
60721108 | Sep 2005 | US