This application claims the benefit under 35 U.S.C. §119(a) of Korean Patent Application No. 10-2010-0022493, filed on Mar. 12, 2010, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.
1. Field
The following description relates to a computer processor, and more particularly, to a processor for supporting a Multiple-Input Multiple-Output (MIMO) operation.
2. Description of the Related Art
Complex instruction set computing is a computer instruction set architecture in which each instruction can execute several low-level operations, such as a load from memory, an arithmetic operation, and a memory store, all in a single instruction. A “complex instruction” refers to an instruction that processes several basic operations simultaneously, for example, a Multiply and ACcumulate (MAC) operation, which multiplies first input data by second input data and adds the result of the multiplication to third input data. Other examples of complex instructions include: saving many registers on the stack at once, moving large blocks of memory, complex and/or floating-point arithmetic (sine, cosine, square root, etc.), performing an atomic test-and-set instruction, and instructions that combine an ALU operation with an operand from memory rather than a register. By using such a complex instruction, the number of registers and cycles required to process a plurality of basic operations is reduced, as compared with instructions in which the basic operations are consecutively processed. In particular, the complex instruction is useful in improving the performance of a multimedia application that needs to repeat a predetermined type of operation.
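As a minimal illustrative sketch only, the following Python functions contrast a MAC computed as two consecutive basic operations, which needs a temporary value, with a single operation that takes three inputs and produces one output. The function names are hypothetical and are not part of the description.

```python
def mac_as_basic_ops(a, b, c):
    """MAC as two consecutive basic operations.

    `tmp` stands in for the extra register that the intermediate
    product occupies between the two operations.
    """
    tmp = a * b      # basic operation 1: multiply
    return tmp + c   # basic operation 2: add


def mac_complex(a, b, c):
    """MAC as one complex operation with three inputs and one output."""
    return a * b + c
```

Both functions return the same value for the same inputs; the difference modeled here is only the intermediate value held between the two basic operations.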
In general, such a complex instruction requires multiple inputs (at least three) and/or multiple outputs (at least two). A “Multiple Input Multiple Output (MIMO) instruction” refers to a complex instruction in which at least one of the input and the output is multiple. A “MIMO operation” is defined as an operation that is performed in a processor by executing the MIMO instruction.
In one approach to processing the MIMO instruction, a processor or a Functional Unit (FU) may be configured to have at least three input register ports and at least two output register ports. Alternatively, using another method, an interconnection may be installed to connect at least two adjacent FUs to each other in a processor including a plurality of FUs such that the FUs connected to each other simultaneously execute a plurality of basic operations in a single cycle. However, these two methods require additional hardware, such as input/output ports or an interconnection.
In one general aspect, there is provided a processor for supporting a Multiple Input Multiple Output (MIMO) operation, the processor including: a scheduler configured to: map multiple inputs of a MIMO instruction to K sequential cycles, K being an integer greater than or equal to 2, respectively, and map multiple outputs of the MIMO instruction to L sequential cycles, L being an integer greater than or equal to 2, and a functional unit (FU) configured to: read a register during the K sequential cycles to execute a MIMO operation, and write a result of the MIMO operation into a register during the L sequential cycles.
In the processor, the FU may include: a reading unit configured to read multiple pieces of input data from the register during respective K sequential cycles, a MIMO executing unit configured to: generate multiple pieces of output data by receiving the multiple pieces of input data from the reading unit, and execute the MIMO operation during a predetermined number of cycles, and a writing unit configured to write the multiple pieces of output data received from the MIMO executing unit into the register during respective L sequential cycles.
In the processor, the reading unit may be further configured to simultaneously transfer the multiple pieces of input data to the MIMO executing unit, and the MIMO executing unit may be further configured to simultaneously transfer the multiple pieces of output data to the writing unit.
In the processor, a plurality of the FUs may be provided, and the scheduler may be further configured to map the multiple inputs and the multiple outputs to at least two FUs which are connected to each other.
In the processor, the FU may include two input register ports and one output register port.
In the processor, the processor may be configured to use a fixed bit instruction encoding.
In the processor, the processor may be configured to support a Very Long Instruction Word (VLIW).
In another general aspect, there is provided a method of processing a Multiple Input Multiple Output (MIMO) instruction in a processor, the method including: reading multiple pieces of input data from a register during a single cycle or during K sequential cycles, K being an integer greater than or equal to 2, generating multiple pieces of output data by executing a MIMO operation during a predetermined number of cycles by use of the multiple pieces of input data, and writing the multiple pieces of output data into a register during a single cycle or during L sequential cycles, L being an integer greater than or equal to 2, wherein at least one of the reading of multiple pieces of input data and the writing of multiple pieces of output data is performed during a plurality of cycles.
In the method, the processor may include a plurality of functional units (FUs), and the reading of multiple pieces of input data, the executing of MIMO operation, and the writing of multiple pieces of output data may be performed by one of the plurality of FUs.
In the method, the processor may include a plurality of functional units (FUs), and the reading of multiple pieces of input data and the writing of multiple pieces of output data may be performed by at least two of the plurality of FUs that are connected to each other.
In the method, the executing of the MIMO operation may be performed by at least two of the plurality of FUs that are connected to each other.
In the method, in the reading of multiple pieces of input data, at most two pieces of input data may be read from the register at each of the K sequential cycles, and in the writing of multiple pieces of output data, at most one piece of output data may be written into the register at each of the L sequential cycles.
The method may further include using a fixed bit instruction encoding.
The method may further include supporting a Very Long Instruction Word (VLIW).
In another general aspect, there is provided a method of processing a Multiple Input Multiple Output (MIMO) instruction in a processor, the method including at least one of: processing multiple inputs of the MIMO instruction by reading a register during K sequential cycles, K being an integer greater than or equal to 2, and processing multiple outputs of the MIMO instruction by executing a MIMO operation, and then writing a result of the MIMO operation into a register during L sequential cycles, L being an integer greater than or equal to 2.
The method may further include performing scheduling in which at least one of: the multiple inputs are mapped to the K sequential cycles, and the multiple outputs are mapped to the L sequential cycles.
In the method, the processor may include a plurality of functional units (FUs), and the scheduling may be performed such that the multiple inputs and the multiple outputs are processed by one of the plurality of FUs.
In the method, the processor may include a plurality of functional units (FUs), and the scheduling may be performed such that the multiple inputs and the multiple outputs are processed by at least two of the plurality of FUs which are connected to each other.
In the method, the MIMO operation may be performed by one of the at least two of the plurality of FUs that are connected to each other.
In the method, the processor may include a plurality of functional units (FU), and each of the plurality of FUs: may read at most two pieces of input data from the register for each cycle of the K sequential cycles, and may write at most one piece of output data into the register for each cycle of the L sequential cycles.
Other features and aspects may be apparent from the following detailed description, the drawings, and the claims.
Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience.
The following detailed description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. Accordingly, various changes, modifications, and equivalents of the systems, apparatuses and/or methods described herein will be suggested to those of ordinary skill in the art. The progression of processing steps and/or operations described is an example; however, the sequence of steps and/or operations is not limited to that set forth herein and may be changed as is known in the art, with the exception of steps and/or operations necessarily occurring in a certain order. Also, descriptions of well-known functions and constructions may be omitted for increased clarity and conciseness.
Examples will be described with reference to accompanying drawings in detail.
In a broad sense, the MIMO operation represents an operation that is executed by a complex instruction having at least one of multiple inputs and multiple outputs. For example, at least three inputs may be provided and/or at least two outputs may be provided. However, in a narrower sense, the MIMO operation represents an operation that is executed by a complex instruction having both multiple inputs and multiple outputs. For example, an operation executed by an instruction having three inputs and one output, or by an instruction having two inputs and two outputs, may be classified as the MIMO operation in the broad sense but not in the narrow sense. Meanwhile, an operation executed by an instruction having at least three inputs and at least two outputs may be classified as the MIMO operation in both senses. The processor 100 may support the narrow sense-MIMO operation or the broad sense-MIMO operation.
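The two senses can be expressed as a small classification rule. The following Python sketch is illustrative only: the function name and the returned labels are hypothetical, and the thresholds follow the definition given earlier that multiple inputs means at least three and multiple outputs means at least two.

```python
MIN_MULTI_INPUTS = 3   # "multiple inputs": at least three
MIN_MULTI_OUTPUTS = 2  # "multiple outputs": at least two


def classify(num_inputs, num_outputs):
    """Return "narrow", "broad", or "none" for an instruction's operand counts.

    "narrow" means MIMO in both the narrow and the broad sense;
    "broad" means MIMO in the broad sense only.
    """
    multi_in = num_inputs >= MIN_MULTI_INPUTS
    multi_out = num_outputs >= MIN_MULTI_OUTPUTS
    if multi_in and multi_out:
        return "narrow"
    if multi_in or multi_out:
        return "broad"
    return "none"
```

For instance, `classify(3, 2)` yields `"narrow"`, whereas `classify(2, 2)` yields `"broad"` because only the outputs are multiple.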
The processor 100 may include an input register port RPi and an output register port RPO used for the FU 120 (see
The processor 100 may be a device using a fixed bit instruction encoding. For example, the processor 100 may use a single register file having a size of 64 bits or 128 bits. If the processor 100 uses a fixed bit instruction encoding, data processing speed may be improved.
The processor 100 may further include a Very Long Instruction Word (VLIW) machine for supporting the MIMO operation. The VLIW machine represents a Central Processing Unit (CPU) architecture that may be designed to take advantage of Instruction Level Parallelism (ILP). The VLIW machine may include a plurality of FUs to process a plurality of instructions simultaneously. At least one of the FUs may perform a MIMO operation. Input instructions may be grouped into as many instruction bundles as respective FUs, and instructions included in a single instruction bundle may be distributed to respective FUs and simultaneously processed. The VLIW machine may include a limited number of input register ports and output register ports, and may use a fixed bit instruction encoding.
The processor 100 may process multiple inputs and multiple outputs of the MIMO instruction by grouping the multiple inputs into multi-cycle inputs and grouping the multiple outputs into multi-cycle outputs. The “multi-cycle input” refers to a register reading that is performed by a single FU 120 over a plurality of sequential cycles, and the “multi-cycle output” refers to a register writing that is performed by a single FU 120 over a plurality of sequential cycles. However, it should be appreciated that the processor 100 may not need to group inputs among cycles if the MIMO instruction uses only two inputs, or to group outputs among cycles if the MIMO instruction uses only one output. In order to process multiple inputs and/or multiple outputs during a plurality of respective cycles, the processor 100 may include a scheduler 110. The term “scheduler” is arbitrarily selected and the scheduler 110 may be referred to using another term, for example, a “mapper” or a “controller.” The scheduler 110 may map multiple inputs of the MIMO instruction to K sequential cycles, in which K is an integer greater than or equal to 2. The scheduler 110 may map multiple outputs of the MIMO instruction to L sequential cycles, in which L is an integer greater than or equal to 2. The L sequential cycles mapped to the multiple outputs may follow the K sequential cycles and one or more cycles for a MIMO operation.
The FU 120 may perform the MIMO operation using multiple pieces of input data which are fetched by reading a register during the K sequential cycles mapped by the scheduler 110. The FU 120 may write multiple pieces of output data which are obtained through the MIMO operation into the register during the L sequential cycles mapped by the scheduler 110. As such, the FU 120 may include a reading unit 122, a MIMO executing unit 124, and a writing unit 126. The reading unit 122, the MIMO executing unit 124, and the writing unit 126 may be subdivided in logical terms, but at least two of the reading unit 122, the MIMO executing unit 124, and the writing unit 126 may be integrated. Such a logical division of the reading unit 122, the MIMO executing unit 124, and the writing unit 126 is made for the sake of convenience, and the functions of the components are not fixed. Any combination of integrating the reading unit 122, the MIMO executing unit 124, and the writing unit 126 may be used.
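The mapping performed by the scheduler 110 can be sketched as follows. This Python example is illustrative only and rests on assumptions that the description does not fix: operands are packed greedily in order, the FU has two input register ports and one output register port, and the function name `map_to_cycles` and the register names are placeholders.

```python
def map_to_cycles(operands, ports):
    """Group operands into sequential cycles, at most `ports` per cycle.

    Assumption: greedy, in-order packing. The description only requires
    that each cycle uses no more registers than there are ports.
    """
    if ports < 1:
        raise ValueError("ports must be at least 1")
    return [operands[i:i + ports] for i in range(0, len(operands), ports)]


# Four inputs through two input ports -> K = 2 sequential input cycles.
read_cycles = map_to_cycles(["v1", "v2", "v3", "v4"], ports=2)

# Two outputs through one output port -> L = 2 sequential output cycles.
write_cycles = map_to_cycles(["v10", "v20"], ports=1)
```

Under these assumptions, the number of cycles is the operand count divided by the port count, rounded up; an instruction whose operands fit within the ports yields a single cycle and needs no grouping.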
The reading unit 122 may read multiple pieces of input data from the register during respective K sequential cycles mapped by the scheduler 110. During each cycle of the K sequential cycles, the reading unit 122 may read from, at most, a number of registers corresponding to the number of input register ports, which are included in the FU 120. In addition, the reading unit 122 may simultaneously transfer multiple pieces of input data read during the K sequential cycles to the MIMO executing unit 124.
The MIMO executing unit 124 may perform a MIMO operation by use of the multiple pieces of input data that are simultaneously transferred from the reading unit 122. The number of cycles operating in the MIMO executing unit 124 is not limited, and may vary depending on an algorithm of the MIMO operation. As a result of the MIMO operation, the MIMO executing unit 124 may generate multiple pieces of output data, and may transfer the multiple pieces of output data together to the writing unit 126.
The writing unit 126 may write the multiple pieces of output data into a register during respective L sequential cycles mapped by the scheduler 110. During each cycle of the L sequential cycles, the writing unit 126 may write to, at most, a number of registers corresponding to the number of output register ports that are included in the FU 120. In this manner, the multiple pieces of output data may be simultaneously transferred from the MIMO executing unit 124 to the writing unit 126, but the writing unit 126 may write the multiple pieces of output data during respective L sequential cycles.
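The cooperation of the reading unit 122, the MIMO executing unit 124, and the writing unit 126 can be modeled with a short simulation. The Python sketch below is illustrative only: the class, its method, the trace format, and the sample operation are hypothetical, the register file is modeled as a dictionary, and the port counts default to two input ports and one output port as an assumption.

```python
class FunctionalUnit:
    """Toy model of an FU with a limited number of register ports."""

    def __init__(self, regfile, in_ports=2, out_ports=1):
        self.regfile = regfile      # register name -> value
        self.in_ports = in_ports
        self.out_ports = out_ports

    def run(self, srcs, dsts, op, exec_cycles=1):
        """Execute one MIMO instruction and return a per-cycle trace."""
        trace = []
        latched = []  # reading unit: holds earlier reads until all arrive
        for i in range(0, len(srcs), self.in_ports):
            group = srcs[i:i + self.in_ports]
            latched.extend(self.regfile[r] for r in group)
            trace.append(("read", tuple(group)))

        # Executing unit: receives all inputs together.
        results = op(*latched)
        if len(results) != len(dsts):
            raise ValueError("operation must produce one value per destination")
        trace.extend([("execute", ())] * exec_cycles)

        # Writing unit: receives all outputs together, writes them over
        # sequential cycles limited by the number of output ports.
        pending = list(zip(dsts, results))
        for i in range(0, len(pending), self.out_ports):
            group = pending[i:i + self.out_ports]
            for reg, value in group:
                self.regfile[reg] = value
            trace.append(("write", tuple(reg for reg, _ in group)))
        return trace


def sample_op(a, b, c, d):
    """Hypothetical four-input, two-output operation."""
    return (a * b + c * d, a * b - c * d)


regs = {"v1": 1, "v2": 2, "v3": 3, "v4": 4}
fu = FunctionalUnit(regs)
trace = fu.run(["v1", "v2", "v3", "v4"], ["v10", "v20"], sample_op)
```

With these settings the trace contains two read cycles, one execute cycle, and two write cycles, which mirrors the multi-cycle input and multi-cycle output behavior without any additional register port.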
If the MIMO instruction shown in
The reading unit 122 may read the four registers (reg v1, reg v2, reg v3, and reg v4) over the two sequential input cycles that are scheduled by the scheduler 110. Input data read at the first input cycle of the two sequential input cycles, e.g., input data of reg v1 and reg v2, may be held during one cycle and transferred to the MIMO executing unit 124 together with input data, which may be read at the second input cycle, e.g., input data of reg v3 and reg v4. In
The MIMO executing unit 124 may perform a MIMO operation by use of the received multiple pieces of input data. After the sequential input cycles, e.g., the K cycles, the MIMO operation may proceed during at least one cycle, hereinafter referred to as an “execution cycle.” The MIMO executing unit 124 may simultaneously transfer multiple pieces of output data that are generated as a result of the MIMO operation to the writing unit 126.
The writing unit 126 may write one of the multiple pieces of output data transferred from the MIMO executing unit 124 into a register, for example, reg v10, at the first output cycle of the two sequential output cycles. The writing unit 126 may write the remainder of the multiple pieces of output data transferred from the MIMO executing unit 124 into a register, for example, reg v20, at the second output cycle of the two sequential output cycles. In
As described above, the example of the processor 100 including the FU 120 shown in
If a MIMO instruction shown in
The reading unit 122 may read the six registers (reg v1, reg v2, reg v3, reg v4, reg v5, and reg v6) over the three sequential input cycles that are scheduled by the scheduler 110. Input data read at the first input cycle, e.g., input data of reg v1 and reg v2, may be held during two cycles. Input data read at the second input cycle, e.g., input data of reg v3 and reg v4, may be held during one cycle. The input data of reg v1, reg v2, reg v3, and reg v4 may be transferred to the MIMO executing unit 124 together with input data read at the third input cycle, for example, reg v5 and reg v6. In
The MIMO executing unit 124 may perform a MIMO operation by use of the received multiple pieces of input data. After the sequential input cycles, the MIMO operation may proceed for at least one cycle. The MIMO executing unit 124 may simultaneously transfer multiple pieces of output data that are generated as a result of the MIMO operation to the writing unit 126.
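The hold durations of the input data and the overall cycle count in this example can be tabulated with a short sketch. The Python functions below are illustrative only; their names are hypothetical, and the total assumes that the input cycles, the execution cycles, and the output cycles run back to back without overlap, which the description does not require.

```python
def hold_cycles(num_input_cycles):
    """Cycles each read group is held before the joint transfer.

    Data read at input cycle c (0-based) waits until the last input
    cycle, so it is held for (num_input_cycles - 1 - c) cycles.
    """
    return [num_input_cycles - 1 - c for c in range(num_input_cycles)]


def total_cycles(k, exec_cycles, l):
    """Assumption: K input, execution, and L output cycles do not overlap."""
    return k + exec_cycles + l


holds_six_inputs = hold_cycles(3)    # three input cycles of two registers each
latency = total_cycles(3, 1, 3)      # one execution cycle assumed
```

For three input cycles the hold durations are two cycles, one cycle, and zero cycles, matching the data of reg v1 and reg v2, reg v3 and reg v4, and reg v5 and reg v6, respectively.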
The writing unit 126 may write one of the multiple pieces of output data transferred from the MIMO executing unit 124 into a register, e.g., reg v10, at the first output cycle of the three sequential output cycles. The writing unit 126 may write the second output data of the output data transferred from the MIMO executing unit 124 into a register, e.g., reg v20, at the second output cycle of the three sequential output cycles. The writing unit 126 may write the third output data of the output data transferred from the MIMO executing unit 124 into a register, e.g., reg v30, at the last output cycle of the three sequential output cycles. In
It should be appreciated that the FU 120 of the processor 100 shown in
The processor 200 may process multiple inputs and multiple outputs of the MIMO operation by grouping the multiple inputs into multi-cycle inputs and grouping the multiple outputs into multi-cycle outputs. For example, the processor 200 may allow the FU0 and FU1 to process the multiple inputs and multiple outputs in cooperation with each other. In order for the Functional Units (FU0 and FU1) to process multiple inputs and/or multiple outputs during a plurality of respective cycles, the processor 200 may include a scheduler 210. The scheduler 210 may map multiple inputs of the MIMO instruction to K sequential cycles, in which K is an integer greater than or equal to 2. The scheduler 210 may map multiple outputs of the MIMO instruction to L sequential cycles, in which L is an integer greater than or equal to 2.
The Functional Units (FU0 and FU1) connected to each other may read a register during the K sequential cycles mapped by the scheduler 210, and one of the Functional Units, for example, FU0, may perform a MIMO operation by use of multiple pieces of input data. The Functional Units (FU0 and FU1) may write multiple pieces of output data, which are obtained through the MIMO operation, into the register during the L sequential cycles mapped by the scheduler 210. As such, the Functional Units (FU0 and FU1) may include a reading unit 222, a MIMO executing unit 224, and a writing unit 226 (see
As shown in
The MIMO executing unit 224 may perform a MIMO operation by use of the multiple pieces of input data that are simultaneously transferred from the reading unit 222. The number of cycles operating in the MIMO executing unit 224 is not limited, and may vary depending on an algorithm of the MIMO operation. As a result of the MIMO operation, the MIMO executing unit 224 may generate multiple pieces of output data, and may simultaneously transfer the multiple pieces of output data to the writing unit 226.
The writing unit 226 may write the multiple pieces of output data into the register during respective L sequential cycles mapped by the scheduler 210. During each cycle of the L sequential cycles, the writing unit 226 may write to, at most, a number of registers corresponding to the number of output register ports that are included in the Functional Units (FU0 and FU1) 220. In this manner, the multiple pieces of output data may be simultaneously transferred from the MIMO executing unit 224 to the writing unit 226, but the writing unit 226 may write the multiple pieces of output data during respective L sequential cycles.
If a MIMO instruction shown in
The reading unit 222 may read the eight registers (reg v1, reg v2, reg v3, reg v4, reg v5, reg v6, reg v7, and reg v8) over the two sequential input cycles that are scheduled by the scheduler 210. Input data read at the first input cycle of the two sequential input cycles, e.g., input data of reg v1 to reg v4, may be held for one cycle and transferred to the MIMO executing unit 224 together with input data which are read at the second input cycle, e.g., input data of reg v5 to reg v8. In
The MIMO executing unit 224 may perform a MIMO operation by use of the received multiple pieces of input data. After the sequential input cycles, the MIMO operation may proceed for at least one execution cycle. The MIMO executing unit 224 may transfer multiple pieces of output data that are generated as a result of MIMO operation to the writing unit 226.
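How the register accesses of this example may be spread over two connected FUs can be sketched as follows. The Python example is illustrative only: the function name is hypothetical, each FU is assumed to have two input register ports and one output register port, and the assignment of particular registers to FU0 or FU1 is an assumption, since the description only states which registers are accessed at each cycle.

```python
def map_across_fus(operands, fu_names, ports_per_fu):
    """Return one dict per cycle mapping an FU name to its operands.

    Assumption: operands are packed in order, filling the ports of the
    first FU before those of the next FU within each cycle.
    """
    per_cycle = len(fu_names) * ports_per_fu
    schedule = []
    for start in range(0, len(operands), per_cycle):
        chunk = operands[start:start + per_cycle]
        cycle = {}
        for idx, fu in enumerate(fu_names):
            part = chunk[idx * ports_per_fu:(idx + 1) * ports_per_fu]
            if part:
                cycle[fu] = part
        schedule.append(cycle)
    return schedule


fus = ["FU0", "FU1"]

# Eight source registers, two input ports per FU -> two input cycles.
reads = map_across_fus([f"v{i}" for i in range(1, 9)], fus, ports_per_fu=2)

# Four destination registers, one output port per FU -> two output cycles.
writes = map_across_fus(["v10", "v20", "v30", "v40"], fus, ports_per_fu=1)
```

Under these assumptions, reg v1 to reg v4 are read at the first input cycle and reg v5 to reg v8 at the second, and the four destination registers are written two per output cycle.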
The writing unit 226 may write some of the multiple pieces of output data transferred from the MIMO executing unit 224 into registers, e.g., reg v10 and reg v20, at the first output cycle of the two sequential output cycles. The writing unit 226 may write the remainder of the multiple pieces of output data transferred from the MIMO executing unit 224 into registers, e.g., reg v30 and reg v40, at the second output cycle of the two sequential output cycles. In
Different from the processor 100 shown in
As described above, the processor for supporting a MIMO operation may relieve or minimize the need for additional hardware used to process a MIMO instruction, such as a register port or interconnections, such that the structure of the processor is simpler. In addition, the processor for supporting a MIMO operation may minimize the use of the registers for processing a MIMO operation, and may increase the processing speed.
The processes, functions, methods and/or software described above may be recorded, stored, or fixed in one or more computer-readable storage media that include program instructions to be implemented by a computer to cause a processor to execute or perform the program instructions. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. The media and program instructions may be those specially designed and constructed, or they may be of the kind well-known and available to those having skill in the computer software arts. Examples of computer-readable media include magnetic media, such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks and DVDs; magneto-optical media, such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like. Examples of program instructions include machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter. The described hardware devices may be configured to act as one or more software modules in order to perform the operations and methods described above, or vice versa. In addition, a computer-readable storage medium may be distributed among computer systems connected through a network and computer-readable codes or program instructions may be stored and executed in a decentralized manner.
As a non-exhaustive illustration only, the device described herein may refer to mobile devices such as a cellular phone, a personal digital assistant (PDA), a digital camera, a portable game console, and an MP3 player, a portable/personal multimedia player (PMP), a handheld e-book, a portable tablet and/or laptop PC, a global positioning system (GPS) navigation, and devices such as a desktop PC, a high definition television (HDTV), an optical disc player, a setup and/or set top box, and the like consistent with that disclosed herein.
A computing system or a computer may include a microprocessor that is electrically connected with a bus, a user interface, and a memory controller. It may further include a flash memory device. The flash memory device may store N-bit data via the memory controller. The N-bit data is processed or will be processed by the microprocessor and N may be 1 or an integer greater than 1. Where the computing system or computer is a mobile apparatus, a battery may be additionally provided to supply operation voltage of the computing system or computer.
It will be apparent to those of ordinary skill in the art that the computing system or computer may further include an application chipset, a camera image processor (CIS), a mobile Dynamic Random Access Memory (DRAM), and the like. The memory controller and the flash memory device may constitute a solid state drive/disk (SSD) that uses a non-volatile memory to store data.
A number of example embodiments have been described above. Nevertheless, it will be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2010-0022493 | Mar 2010 | KR | national |