The present disclosure is generally related to a bit splitting instruction.
Advances in technology have resulted in smaller and more powerful computing devices. For example, there currently exist a variety of portable personal computing devices, including wireless computing devices, such as portable wireless telephones, personal digital assistants (PDAs), and paging devices that are small, lightweight, and easily carried by users. More specifically, portable wireless telephones, such as cellular telephones and internet protocol (IP) telephones, can communicate voice and data packets over wireless networks. Further, many such wireless telephones include other types of devices that are incorporated therein. For example, a wireless telephone can also include a digital still camera, a digital video camera, a digital recorder, and an audio file player. Also, such wireless telephones can process executable instructions, including software applications, such as a web browser application, that can be used to access the Internet. As such, these wireless telephones can include significant computing capabilities.
Wireless telephones that perform multimedia processing such as audio or video decoding may often perform a bit unpacking operation. For example, the bit unpacking operation may extract specific bits from a coded bit stream during decoding of a compressed object. Current bit unpacking operations utilize at least two instructions to extract bits from a bit stream. The first instruction may pull or extract a group of bits from a source register and the second instruction may perform a shift operation on the remaining bits of the source register (e.g., to align the bits). Alternatively, a first instruction may store a first portion of bits from the source register into a first destination register and a second instruction may store a second portion of bits from the source register into a second destination register.
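The conventional two-step approach described above can be sketched as two separate operations, one to extract a field and one to realign what is left. This is only an illustrative model: the 32-bit register width, the choice to extract from the least significant end, and the helper names `extract_low_bits` and `shift_out_low_bits` are assumptions, not taken from any particular instruction set.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def extract_low_bits(source: int, count: int) -> int:
    """First operation: copy the `count` least significant bits of the source."""
    return source & ((1 << count) - 1)


def shift_out_low_bits(source: int, count: int) -> int:
    """Second operation: shift the remaining bits down to realign them."""
    return (source & MASK32) >> count


stream = 0b101101101101  # hypothetical coded bits
field = extract_low_bits(stream, 4)     # low four bits of the stream
stream = shift_out_low_bits(stream, 4)  # remaining bits, realigned to bit 0
```

Each unpacked field costs two operations in this model, which is the overhead a single combined instruction is meant to remove.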
A single instruction that performs both bit extraction and alignment is disclosed. For example, the instruction may be used to perform bit splitting operations during audio or video decoding at an electronic device. In one implementation, the instruction may specify a source value (e.g., a bit stream) and an offset value. The source value may be data stored in a source register, such as data representing a compressed audio/video object. The offset value may be specified as an immediate value (e.g., a numerical constant) or by a register indicator that identifies an offset register storing the offset value. For example, the register indicator may be “R1,” where register R1 stores the offset value. When the instruction is executed, a first result (e.g., a first set of bits from the bit stream) and a second result (e.g., a second set of bits from the bit stream) may be generated. The first result and the second result may be stored in a first destination register and a second destination register, respectively. The first destination register and the second destination register may be registers of a destination register pair. Alternatively, or in addition, the first destination register and the second destination register may be registers of a destination register file. For example, the first result may be extracted bits from a coded audio/video bit stream and the second result may be remaining bits of the coded audio/video bit stream. The remaining bits may be shifted to the least significant bit position of a register (e.g., in preparation for a subsequent bit unpacking operation).
In a particular embodiment, an apparatus includes a memory storing an instruction that specifies a source value and an offset value. Upon execution, the instruction generates a first result of the instruction and a second result of the instruction. The first result is a first portion of the source value and the second result is a second portion of the source value. For example, if the source value is 32 bits and the offset value is eight (8), the first portion may be the 8 least significant bits of the source value and the second portion may be the remaining 24 bits of the source value.
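The 32-bit example above can be modeled in a few lines. The function name `bit_split`, the `width` parameter, and the rejection of offsets outside the range 0 to `width` are assumptions made for this sketch; the disclosure does not specify behavior for such offsets.

```python
from typing import Tuple


def bit_split(source: int, offset: int, width: int = 32) -> Tuple[int, int]:
    """Model of a bit split: return (first portion, second portion).

    The first portion is the `offset` least significant bits of the source.
    The second portion is the remaining bits, shifted down to bit 0.
    """
    if not 0 <= offset <= width:
        raise ValueError("offset outside the assumed valid range")
    source &= (1 << width) - 1            # keep only `width` bits
    first = source & ((1 << offset) - 1)  # low `offset` bits
    second = source >> offset             # remaining bits, right-aligned
    return first, second


first, second = bit_split(0xAABBCCDD, 8)
print(hex(first), hex(second))  # 0xdd 0xaabbcc
```

With a 32-bit source and an offset of 8, the first portion holds 8 bits and the second portion holds the remaining 24 bits.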
In another particular embodiment, a method includes receiving a single instruction that indicates a source value and an offset value. The method includes executing the single instruction to generate a first result and a second result of the instruction. The method further includes storing the first result in a first destination register and storing the second result in a second destination register. The first result is a first portion of the source value and the second result is a second portion of the source value.
In another particular embodiment, an apparatus includes means for storing an instruction that specifies a source value and an offset value. The apparatus further includes means for executing the instruction to generate a first result of the instruction and a second result of the instruction. The first result is a first portion of the source value and the second result is a second portion of the source value.
In another particular embodiment, a non-transitory computer-readable medium includes program code that, when executed by a processor, causes the processor to receive a single instruction that indicates a source value and an offset value, to execute the single instruction to generate a first result of the instruction and a second result of the instruction, and to store the first result in a first destination register and store the second result in a second destination register. The first result is a first portion of the source value and the second result is a second portion of the source value.
One particular advantage provided by at least one of the disclosed embodiments is reduced code size and fewer execution cycles for applications (e.g., embedded multimedia processing applications) due to use of a single instruction (instead of multiple instructions) to perform bit extraction (e.g., to unpack words of varying lengths from a continuous bit stream).
Other aspects, advantages, and features of the present disclosure will become apparent after review of the entire application, including the following sections: Brief Description of the Drawings, Detailed Description, and the Claims.
An instruction for performing bit extraction may include a source value (e.g., a bit stream) and an offset value (e.g., an immediate value or a register indicator of an offset register storing the immediate value). The instruction may optionally identify destination registers. When the instruction is executed, a first result and a second result may be generated. The first result may be a first set of bits from the bit stream and the second result may be a second set of bits from the bit stream. The second result may include bits from the source value that are not included in the first result. The first result may be stored in a first destination register and the second result may be stored in a second destination register. The first result and the second result may both be shifted to the least significant bit positions in their respective destination registers. When a single instruction is invoked for performing bit extraction, fewer processor execution cycles and a reduction in code size may be achieved.
In a particular embodiment, a bit splitting instruction may specify a source value and an offset value. Alternately, the bit splitting instruction may specify an offset value, a source register, and destination registers. The source register may include the source value. For example, as illustrated in
During operation, the bit splitting instruction specifying the source value 112 (or the source register 110 including the source value 112) and the offset value 114 may be stored in a memory (e.g., as illustrated in
To illustrate, the offset value 114 (e.g., ‘4’) may indicate that the first portion 122 of the source register 110 includes bits “X3 X2 X1 X0” (i.e., the 4 least significant bits of the source register 110). The second portion 132 may be the remaining bits of the source register 110 that are not in the first portion 122 (i.e., bits “X15 . . . X4”). The first portion 122 may be stored in the first destination register 120 of the destination register pair 140, and the second portion 132 may be stored in the second destination register 130 of the destination register pair 140. Further, the first portion 122 may be shifted to the least significant bit position of the first destination register 120 and the second portion 132 may be shifted to the least significant bit position of the second destination register 130. Thus, during decoding of a compressed object, specific bits of a coded bit stream may be extracted and aligned in multiple destinations using a single instruction.
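The 16-bit illustration can also be followed symbolically, tracking bit labels rather than bit values. The sketch below is a hypothetical model: `split_labels` is an illustrative helper, and bits are listed most significant first to match the “X15 . . . X0” notation.

```python
def split_labels(labels, offset):
    """Split a list of bit labels (most significant first) at `offset`.

    Returns (first portion, second portion), where the first portion is the
    `offset` least significant labels and the second portion is the rest.
    """
    if not 0 <= offset <= len(labels):
        raise ValueError("offset outside the assumed valid range")
    cut = len(labels) - offset
    return labels[cut:], labels[:cut]


source_bits = [f"X{i}" for i in range(15, -1, -1)]  # X15 ... X0
first_portion, second_portion = split_labels(source_bits, 4)

print(" ".join(first_portion))   # X3 X2 X1 X0
print(second_portion[0], "...", second_portion[-1])  # X15 ... X4
```

Because each portion is right-aligned in its own destination, X0 lands in bit 0 of the first destination register and X4 lands in bit 0 of the second.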
Referring to
During operation, the bit splitting instruction specifying the source value 112 and the offset register 240 may be stored in a memory (e.g., as illustrated in
When executed, the bit splitting instruction generates a first result and a second result. The first result may be the first portion 122 of the source register 110 and the second result may be the second portion 132 of the source register 110. In a particular embodiment, the source value 112 may be a concatenation of the first result and the second result (i.e., the first portion 122 and the second portion 132). The first portion 122 and the second portion 132 may be stored in a first register and a second register of a register pair. To illustrate, the offset value 114 (e.g., ‘4’) stored in the offset register 240 (e.g., register R1) may indicate that the first portion 122 of the source register 110 includes bits “X3 X2 X1 X0” (i.e., the 4 least significant bits of the source register 110). The second portion 132 may be the remaining bits of the source register 110 that are not in the first portion 122 (i.e., bits “X15 . . . X4”). The first portion 122 may be stored in the first destination register 120 of the destination register pair 140, and the second portion 132 may be stored in the second destination register 130 of the destination register pair 140. Further, the first portion 122 may be shifted to the least significant bit position of the first destination register 120 and the second portion 132 may be shifted to the least significant bit position of the second destination register 130.
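The register-offset form can be modeled with a small register file, which also makes the concatenation property easy to check. Everything named here is an assumption for illustration: the dictionary-based register file, the register names R0, R2, and R3, the 16-bit width, and the function `execute_bit_split`. Only the use of R1 as the offset register follows the example in the text.

```python
WIDTH = 16  # assumed register width for this sketch


def execute_bit_split(regs, src, offset_reg, dst_first, dst_second):
    """Model a bit split whose offset is read from a register."""
    offset = regs[offset_reg]
    if not 0 <= offset <= WIDTH:
        raise ValueError("offset outside the assumed valid range")
    value = regs[src] & ((1 << WIDTH) - 1)
    regs[dst_first] = value & ((1 << offset) - 1)  # first portion, right-aligned
    regs[dst_second] = value >> offset             # second portion, right-aligned


regs = {"R0": 0xBEEF, "R1": 4, "R2": 0, "R3": 0}
execute_bit_split(regs, src="R0", offset_reg="R1", dst_first="R2", dst_second="R3")

# Concatenating the second portion above the first portion rebuilds the source.
rebuilt = (regs["R3"] << regs["R1"]) | regs["R2"]
print(hex(regs["R2"]), hex(regs["R3"]), hex(rebuilt))  # 0xf 0xbee 0xbeef
```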
Referring to
The bit splitting instruction 350 may specify a source value (e.g., data stored in source register 110) and an offset value, as illustrated in
The instruction cache 310 may be coupled to a sequencer 314 via a bus 311. The sequencer 314 may receive general interrupts 316, which may be retrieved from an interrupt register (not shown). In a particular embodiment, the instruction cache 310 may be coupled to the sequencer 314 via a plurality of current instruction registers (not shown), which may be coupled to the bus 311 and associated with particular threads (e.g., hardware threads) of the processor 300. In a particular embodiment, the processor 300 may be an interleaved multi-threaded processor including six (6) threads.
In a particular embodiment, the bus 311 may be a one-hundred and twenty-eight bit (128-bit) bus and the sequencer 314 may be configured to retrieve instructions from the memory 302 via instruction packets (e.g., a very long instruction word (VLIW) instruction packet including one or more bit splitting instructions 350) having a length of thirty-two (32) bits each. The bus 311 may be coupled to a first instruction execution unit 318, a second instruction execution unit 320, a third instruction execution unit 322, and a fourth instruction execution unit 324. It should be noted that there may be fewer or more instruction execution units. Each instruction execution unit 318-324 may be coupled to a general register file 326 via a first bus 328. The general register file 326 may also be coupled to the sequencer 314, the data cache 312, and the memory 302 via a second bus 330. The general register file 326 may include the destination registers 360 (e.g., the destination register pair 140 of
The system 300 may also include supervisor control registers 332 and global control registers 334 to store bits that may be accessed by control logic within the sequencer 314 to determine whether to accept interrupts (e.g., the general interrupts 316) and to control execution of instructions.
In a particular embodiment, any of the execution units 318-324 may execute the bit splitting instruction 350 to generate a first result and a second result. In another embodiment, some, but not all, of the execution units 318-324 may execute the bit splitting instruction 350. The first result may be a first portion of the source value stored in the source register 110 and the second result may be a second portion of the source value stored in the source register 110. The first portion may be stored in a first of the destination registers 360 and the second portion may be stored in a second of the destination registers 360. Further, the first portion and the second portion may be shifted to the least significant bits of their respective destination registers. Thus, during decoding of compressed objects, specific bits of a coded bit stream may be extracted from a source register and aligned in destination registers using a single instruction. The data in the first destination register may subsequently be decoded. The data in the second destination register may be subjected to another bit splitting operation after (or while) the data in the first destination register is decoded. Use of the bit splitting instruction 350 may reduce overall code size and require fewer processor execution cycles, because a single instruction performs the bit extraction.
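The repeated use described above, where the second result feeds the next bit splitting operation, can be sketched as a loop that unpacks fields of varying lengths. This is a software model under stated assumptions: `unpack_fields`, the 32-bit width, and the sample field widths are all hypothetical, and each loop iteration stands in for one execution of the instruction.

```python
def unpack_fields(stream, field_widths, width=32):
    """Unpack variable-length fields from the least significant end of a stream.

    Each iteration models one bit split: the first result is the extracted
    field and the second result becomes the source for the next split.
    """
    remaining = stream & ((1 << width) - 1)
    fields = []
    for n in field_widths:
        first = remaining & ((1 << n) - 1)  # extracted field
        remaining >>= n                     # realigned remainder
        fields.append(first)
    return fields, remaining


fields, leftover = unpack_fields(0xAABBCCDD, [4, 8, 12])
print([hex(f) for f in fields], hex(leftover))  # ['0xd', '0xcd', '0xbbc'] 0xaa
```

Three fields are unpacked with three modeled splits, where a separate extract-and-shift sequence would take six operations.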
It should be noted that the system 300 depicted in
Referring to
The method 400 may include receiving an instruction that indicates a source value and an offset value, at 402. For example, in
The method 400 may further include storing the first result in a first destination register and storing the second result in a second destination register, at 406. In a particular embodiment, the first destination register and the second destination register may be part of a destination register pair (e.g., the destination register pair 140 of
The method 400 may also include shifting the second result to a least significant bit of the second destination register, at 408. For example, in
The method 400 of
Referring to
When processed, the bit splitting instruction 350 generates a first result of the bit splitting instruction 350 and a second result of the bit splitting instruction 350. Upon generating the first and second results, the DSP 564 or a component thereof may store the first result in a first destination register and store the second result in a second destination register. The first destination register (D1 in
In a particular embodiment, an input device 530 and a power supply 544 are coupled to the system-on-chip device 522. Moreover, in a particular embodiment, as illustrated in
It should be noted that although
In conjunction with the described embodiments, an apparatus is disclosed that includes means for storing an instruction that specifies a source value and an offset value. For example, the means for storing may be the memory 302 of
The apparatus may also include means for executing the instruction to generate a first result of the instruction and a second result of the instruction, where the first result is a first portion of the source value and the second result is a second portion of the source value. For example, the means for executing may include one or more of the execution units 318, 320, 322, and 324 of
The apparatus may further include means for storing the first result of the instruction. For example, the means for storing the first result may include the first destination register 120 of the register pair 140 of
The apparatus may also include means for storing the second result of the instruction. For example, the means for storing the second result may include the second destination register 130 of the register pair 140 of
The apparatus may further include means for shifting the second portion to a least significant bit of the means for storing the second result. For example, the means for shifting may be the execution units 318-324 of
Those of skill in the art would further appreciate that the various illustrative logical blocks, configurations, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. Various illustrative components, blocks, configurations, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art. An exemplary non-transitory (e.g., tangible) storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a computing device or user terminal.
The previous description of the disclosed embodiments is provided to enable a person skilled in the art to make or use the disclosed embodiments. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other embodiments without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope possible consistent with the principles and novel features as defined by the following claims.