This application claims the benefit under 35 U.S.C. § 119(a) of Korean Patent Application No. 10-2010-0049852, filed on May 27, 2010, the entire disclosure of which is incorporated herein by reference for all purposes.
1. Field
The following description relates to an equal model conservation technique for a pipeline processor.
2. Description of the Related Art
An instruction is processed at several stages in a pipeline processor. For example, an instruction may be processed at a fetch stage, a decoding stage, an execution stage, and a write-back stage. In the pipeline processor, because a plurality of instructions sequentially pass through the individual stages of the pipeline while overlapping each other during execution, programs can be processed at high speeds.
Examples of a pipeline processor include an equal model processor and a less-than-or-equal (LTE) model processor. In the pipeline processor, a maximum latency may be set for each instruction. For example, a value of maximum latency for each instruction may depend on a processor type.
“Maximum latency” indicates a maximum time period within which an operation corresponding to each instruction is executed and the result of the execution is written in a target register. In the equal model processor, the result of the execution of an instruction is written in the target register after the set maximum latency elapses. Meanwhile, in the LTE model processor, the result of execution of an instruction may be written in the target register before the set maximum latency elapses. Accordingly, the equal model processor allows flexible scheduling through multiple assignment of a target register because the result of the execution of an instruction is not written in the target register until the maximum latency elapses. In contrast, in the LTE model processor, because the result of execution of an instruction can be written in a target register before the maximum latency elapses, multiple assignment of the target register is limited.
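For illustration only, the timing difference between the two models can be sketched in Python. This sketch is not part of the specification; the function names and cycle arithmetic are assumptions used purely to restate the definitions above.

```python
def equal_model_write_cycle(issue_cycle, max_latency):
    # Equal model: the result is written exactly when the set
    # maximum latency elapses, never earlier.
    return issue_cycle + max_latency

def lte_model_write_cycle(issue_cycle, max_latency, actual_latency):
    # LTE model: the result may be written as soon as the actual
    # latency elapses, at or before the maximum latency.
    assert actual_latency <= max_latency
    return issue_cycle + actual_latency
```

Under this sketch, the equal model's fixed write cycle is what makes multiple assignment of a target register safe: the register slot is known to be unused before `issue_cycle + max_latency`.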
In one general aspect, there is provided a pipeline processor comprising a pipeline processing unit to process an instruction according to a plurality of stages, and an equal model compensator to store the results of the processing of some or all of the instructions located in the pipeline processing unit and to write the processing results in a register file based on the latency of each instruction.
The equal model compensator may store the processing results in response to a backup signal for flushing the pipeline processing unit.
The equal model compensator may write the processing results in the register file in response to a restore signal for restarting the pipeline processing unit.
The equal model compensator may store data to be written in the register file upon the processing of the instruction, an address of a location at which the data is to be written, and write-enable information indicating whether or not the data is to be written in the register file.
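The stored triple described above (data, target address, write-enable) can be pictured as a simple record. The class and field names below are illustrative only and do not appear in the specification.

```python
from dataclasses import dataclass

@dataclass
class BackupEntry:
    data: int           # value to be written to the register file
    address: int        # register-file location at which to write it
    write_enable: bool  # whether the write should actually take place
```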
The equal model compensator may write the stored processing results in the register file after latency of an instruction corresponding to the stored processing results elapses.
The plurality of stages may include a fetch stage, a decoding stage, a plurality of execution stages, and a write-back stage.
In response to a backup signal for flushing the pipeline processing unit, the pipeline processor may process instructions corresponding to remaining execution stages except for a first execution stage and may process instructions corresponding to a write-back stage.
The equal model compensator may store results obtained by processing the instructions corresponding to remaining execution stages except for the first execution stage and may store results obtained by processing the instructions corresponding to the write-back stage.
The equal model compensator may be disposed in front of and/or behind the write-back stage.
The equal model compensator may store the results of the processing in a FIFO fashion according to a clock or cycle of the pipeline processing unit.
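As a rough sketch of the FIFO behavior just described, the storage can be modeled as a queue that accepts one entry per cycle while backing up and releases the oldest entry per cycle while restoring. The class and method names are assumed for illustration.

```python
from collections import deque

class CompensatorStorage:
    """Backs up processing results first-in first-out, one entry per cycle."""

    def __init__(self):
        self._fifo = deque()

    def backup(self, entry):
        # Called each cycle while the back-up signal is asserted.
        self._fifo.append(entry)

    def restore(self):
        # Called each cycle while the restore signal is asserted;
        # returns the oldest stored entry, preserving program order.
        return self._fifo.popleft() if self._fifo else None
```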
In another aspect, there is provided an equal model conservation method comprising storing processing results that are a result of the processing of some or all of the instructions located in a pipeline, and writing the processing results in a register file based on the latency of each instruction.
The equal model conservation method may further comprise, after storing the processing results, executing the instructions and another event.
The other event may include pipeline debugging or an interrupt service routine.
The processing results may include data to be written in a register file upon the processing of the instruction, an address of a location at which the data is to be written, and a write-enable indicating whether or not the data is to be written in the register file.
The storing of the processing results may comprise storing the processing results in a FIFO fashion according to a clock or cycle for processing the pipeline.
The writing of the processing results may comprise writing the stored processing results in the register file after latency of an instruction corresponding to the stored processing results elapses.
Other features and aspects may be apparent from the following description, the drawings, and the claims.
Throughout the drawings and the description, unless otherwise described, the same drawing reference numerals should be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience.
The following description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. Accordingly, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein may be suggested to those of ordinary skill in the art. Also, descriptions of well-known functions and constructions may be omitted for increased clarity and conciseness.
Referring to
For example, when the pipeline processor 100 empties a pipeline processing unit 101 for the purposes of debugging, the pipeline processor 100 may store the processing results of some or all of the instructions existing in the pipeline processing unit 101 when an event related to debugging is generated. For example, the pipeline processor 100 may store the processing results in a storage 109 in a FIFO fashion. After the event is terminated, the pipeline processor 100 may send the processing results stored in the storage 109 to the register file 103 based on the latency of each instruction.
In the example of
For example, the pipeline processing unit 101 may process instructions according to a plurality of stages based on a predetermined operation clock or clock cycle. For example, the stages may include a fetch stage 104, a decoding stage 105, an execution stage 106, and a write-back stage 107. It should be appreciated that the number of the stages may be changed according to the purpose of use and application.
The fetch stage 104 corresponds to a construction or process for fetching instructions from an external memory. The decoding stage 105 corresponds to a construction or process for decoding the fetched instructions and interpreting their content. The execution stage 106 corresponds to a construction or process for obtaining operands based on the interpreted content of the instructions and executing the instructions. The write-back stage 107 corresponds to a construction or process for writing back the results of the execution.
For example, the pipeline processing unit 101 may fetch an instruction in a first cycle, decode the instruction in a second cycle, execute the instruction in a third cycle, and write-back the result of the execution in a fourth cycle. In this example, the execution stage 106 occupies one cycle, however, it should be appreciated that the execution stage 106 may occupy a plurality of cycles.
The pipeline processing unit 101 may simultaneously process a plurality of instructions in parallel. For example, a first instruction may be fetched in a first cycle, and in a second cycle the first instruction may be decoded and simultaneously a second instruction may be fetched. In other words, as seen in a snapshot of the pipeline processing unit 101 at a certain moment, a first instruction may be located in the execution stage 106, a second instruction may be located in the decoding stage 105, a third instruction may be located in a fetch stage 104, and the like.
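The snapshot described above can be reproduced with a short illustrative sketch of a four-stage pipeline in which instruction i is fetched in cycle i and advances one stage per cycle. The function name and stage labels are assumptions, not part of the specification.

```python
STAGES = ["fetch", "decode", "execute", "write-back"]

def stage_of(instr_index, cycle):
    # Instruction instr_index enters the fetch stage in cycle instr_index
    # and moves forward one stage on each subsequent cycle.
    pos = cycle - instr_index
    return STAGES[pos] if 0 <= pos < len(STAGES) else None
```

In cycle 2, for example, the first instruction (index 0) occupies the execution stage while the second is being decoded and the third is being fetched, matching the snapshot above.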
The equal model compensator 102 includes a controller 108 and a storage 109. In some embodiments, the equal model compensator 102 may also include the register file 103. As another example, the register file 103 may be included in the pipeline processing unit 101. As another example, the register file 103 may be disposed externally from the equal model compensator 102 and the pipeline processing unit 101. The controller 108 receives a certain event from an outside source and controls the pipeline processing unit 101 and the storage 109 according to the received event. The storage 109 temporarily stores the processing results of instructions processed by the pipeline processing unit 101 in a FIFO fashion.
For example, the controller 108 may control the pipeline processing unit 101 and storage 109 such that some or all of the instructions existing in the pipeline processing unit 101 are processed and the processing results are temporarily stored, in response to a backup signal for flushing the pipeline processing unit 101.
As another example, the controller 108 may control the pipeline processing unit 101 and storage 109 such that the temporarily stored results of the processing are transferred to the register file 103 and instructions are filled in the pipeline processing unit 101, in response to a restore signal for restarting the pipeline processing unit 101.
In the example illustrated in
When the controller 108 sends the back-up signal to the pipeline processing unit 101 and storage 109, in a first cycle, the first instruction {circle around (1)} located in the execution stage 106 is executed to transfer the result of the execution {circle around (1)}′ to the write-back stage 107, as illustrated in
In a second cycle, the result of the execution {circle around (1)}′ is transferred to the storage 109 and the storage 109 stores the result of the execution {circle around (1)}′, as illustrated in
In a third cycle, the result of the execution {circle around (2)}′ is stored in the storage 109, as illustrated in
Likewise, in a fourth cycle, the result of the execution {circle around (3)}′ is stored in the storage 109, NOP is filled in the remaining stages 104, 105, 106 and 107 and then the pipeline processing unit 101 is flushed, as illustrated in
When a certain event is generated to flush the pipeline processing unit 101, each instruction of the pipeline processing unit 101 is processed over stages according to an operation clock or clock cycle, and the processing results are sequentially stored in the equal model compensator 102.
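The drain sequence over the four cycles above can be sketched as follows. This is an illustrative model only: each cycle the instruction leaving the final stage is backed up into the FIFO and a NOP enters at the front, until the pipeline is flushed. All names are assumed.

```python
NOP = "NOP"

def drain_pipeline(stages):
    """stages: instructions ordered from the fetch stage toward the
    write-back stage. Returns (fifo, flushed_stages)."""
    fifo = []
    work = list(stages)
    # Each cycle the oldest remaining instruction completes and is
    # backed up; a NOP fills the vacated front of the pipeline.
    while any(s != NOP for s in work):
        completed = work.pop()
        if completed != NOP:
            fifo.append(completed)
        work.insert(0, NOP)
    return fifo, work
```

Note that the FIFO receives the results in program order (oldest instruction first), which is what allows them to be restored later while still satisfying each instruction's latency.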
In the examples of
Also, in the current example, when an event for flushing the pipeline processing unit 101 is generated, all instructions existing in the pipeline processing unit 101 are processed and backed up; however, it is also possible for only some of the instructions that exist in the pipeline processing unit 101 to be processed and backed up. In this example, which instructions are processed may be determined based on the purpose of use and the application. For example, it is possible that when the execution stage 106 is divided into a plurality of sub-execution stages, instructions after a first sub-execution stage may be flushed. For example, instructions corresponding to execution stages following the first sub-execution stage and instructions corresponding to a write-back stage may be flushed.
As illustrated in
For example, as illustrated in
In the next cycle, as illustrated in
In the described examples, the result of the execution {circle around (1)}′ of the first instruction and the result of the execution {circle around (2)}′ of the second instruction are the resultant values obtained after the instructions {circle around (1)} and {circle around (2)} pass through each of the stages of the pipeline processing unit 101. The result of the execution {circle around (1)}′ and the result of the execution {circle around (2)}′ pass through the storage 109 in a FIFO fashion over several cycles and are written in the register file 103. Accordingly, the result of the execution {circle around (1)}′ and the result of the execution {circle around (2)}′ satisfy the latency of an equal model.
As another example, even when a trigger signal, such as an interrupt or breakpoint, is generated at an arbitrary time, it is possible to empty a pipeline with no delay, perform an instruction for debugging, and restart the pipeline processing unit 101.
Referring to
The controller 108 applies a restore signal 201 and a back-up signal 202 to the storage 109. The restore signal 201 corresponds to a signal for restarting a pipeline processing unit and the back-up signal 202 corresponds to a signal for flushing the pipeline processing unit, for example, the pipeline processing unit 101 shown in
In the example of
For example, the FIFO unit 205 may be connected to a register other than the register file 103 through an interface.
In status #1, in response to the restore signal 201 being “0,” the first switch 203 may select the output 301 of the write-back stage 107. In response to the back-up signal 202 being “0”, the second switch 204 may select the output 302 of the first switch 203. Accordingly, the output 301 of the write-back stage 107 may detour the FIFO unit 205 and may be transferred to the register file 103.
In status #2, in response to the restore signal 201 being “0,” the first switch 203 may select the output 301 of the write-back stage 107. In response to the back-up signal 202 being “1,” the second switch 204 may select a default value (for example, “0”). Accordingly, the output of the write-back stage 107 may be stored in the FIFO unit 205.
The output of the write-back stage 107 that is stored in the FIFO unit 205 may be referred to as EQ model data. For example, EQ model data may include data to be written in a register file according to execution of an instruction, an address of a location at which the data will be written, a write enable indicating whether or not the data can be written in the register file, and the like.
In status #3, in response to the restore signal 201 being “1,” the first switch 203 may select the output 304 of the FIFO unit 205. In response to the back-up signal 202 being “0,” the second switch 204 may select the output 302 of the first switch 203 which corresponds to the output 304 of the FIFO unit 205. Accordingly, a value stored in the FIFO unit 205 may be transferred to the register file 103.
In status #4, in response to the restore signal 201 being “1,” the first switch 203 may select the output 304 of the FIFO unit 205. In response to the back-up signal 202 being “1,” the second switch 204 may select a default value (for example, “0”). Accordingly, a value stored in the FIFO unit 205 may be backed up again.
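The four statuses can be summarized in a small sketch of the two-switch data path. The function and signal names below are assumptions used only to restate the selection logic described above.

```python
def route(restore, backup, wb_output, fifo_output, default=0):
    """Returns (value sent to the register file, value pushed into the FIFO)."""
    first = fifo_output if restore else wb_output   # first switch selection
    to_regfile = default if backup else first       # second switch selection
    fifo_in = first if backup else None             # entry stored on back-up
    return to_regfile, fifo_in
```

Status #1 passes the write-back output straight to the register file, status #2 diverts it into the FIFO, status #3 drains the FIFO to the register file, and status #4 recirculates the FIFO output back into the FIFO.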
Referring to the example pipeline processor of
Referring to
Referring to
After the results of the processing are stored, in 702 the instructions located in the pipeline and other events are executed. For example, a code for debugging or an interrupt service routine code may be filled in the pipeline processing unit and executed.
After the debugging or the interrupt processing is completed, the results of the processing are written in a register file in consideration of latency of each instruction, in 703. For example, after debugging is complete, the controller creates a restore signal and applies the restore signal to the pipeline processing unit and the storage. Subsequently, the results of the processing stored in the FIFO fashion in the storage are one by one transferred to a register file.
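The overall method (back up, execute the event, restore) can be sketched end to end as follows. This is an illustrative model only; the function name is assumed, and the step comments refer to the operations 702 and 703 described above.

```python
from collections import deque

def handle_event(pipeline_results, event):
    """Back up in-flight processing results, run the event, then write
    the results to the register file in FIFO order."""
    fifo = deque(pipeline_results)     # back up results in a FIFO fashion
    event()                            # 702: debugging / interrupt routine
    register_file = []
    while fifo:                        # 703: transfer one entry per cycle
        register_file.append(fifo.popleft())
    return register_file
```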
Furthermore, a compiler of the processor may know the maximum latency for each instruction and may consider the maximum latency of each instruction during scheduling.
The processes, functions, methods, and/or software described above may be recorded, stored, or fixed in one or more computer-readable storage media that includes program instructions to be implemented by a computer to cause a processor to execute or perform the program instructions. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. Examples of computer-readable storage media include magnetic media, such as hard disks, floppy disks, and magnetic tape; optical media such as CD ROM disks and DVDs; magneto-optical media, such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like. Examples of program instructions include machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter. The described hardware devices may be configured to act as one or more software modules in order to perform the operations and methods described above, or vice versa. In addition, a computer-readable storage medium may be distributed among computer systems connected through a network and computer-readable codes or program instructions may be stored and executed in a decentralized manner.
A number of examples have been described above. Nevertheless, it should be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2010-0049852 | May 2010 | KR | national |
Number | Name | Date | Kind |
---|---|---|---|
4498136 | Sproul, III | Feb 1985 | A |
5764971 | Shang et al. | Jun 1998 | A |
6694425 | Eickemeyer | Feb 2004 | B1 |
6708267 | Flacks et al. | Mar 2004 | B1 |
20020087840 | Kottapalli et al. | Jul 2002 | A1 |
20020116601 | Skrzeszewski et al. | Aug 2002 | A1 |
20020144093 | Inoue et al. | Oct 2002 | A1 |
20040111591 | Arimilli et al. | Jun 2004 | A1 |
20050081021 | Huang | Apr 2005 | A1 |
20050108510 | Johnson | May 2005 | A1 |
20050216708 | Katayama | Sep 2005 | A1 |
20050262389 | Leijten | Nov 2005 | A1 |
20070074006 | Martinez et al. | Mar 2007 | A1 |
20070083742 | Abernathy | Apr 2007 | A1 |
20070088935 | Feiste et al. | Apr 2007 | A1 |
20070103474 | Huang et al. | May 2007 | A1 |
20070106885 | Rychlik | May 2007 | A1 |
20070186085 | Yim et al. | Aug 2007 | A1 |
20090119456 | Park et al. | May 2009 | A1 |
Number | Date | Country |
---|---|---|
10-1998-702203 | Jul 1998 | KR |
10-2001-0077997 | Aug 2001 | KR |
10-2001-0100879 | Nov 2001 | KR |
10-2003-0088892 | Nov 2003 | KR |
10-2004-0049254 | Jun 2004 | KR |
10-2007-0080089 | Aug 2007 | KR |
10-2009-0046609 | May 2009 | KR |
WO 9625705 | Aug 1996 | WO |
Entry |
---|
Shen, John P., Lipasti, Mikko H. “Modern processor design: Fundamentals of superscalar processors” McGraw Hill, pp. 206-209, 2005. |
Korean Office Action dated Feb. 22, 2016 in counterpart Korean Application No. 10-2010-0049852 (10 pages in Korean with English translation). |
Number | Date | Country |
---|---|---|
20110296143 A1 | Dec 2011 | US |