This patent application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2011-063108 filed on Mar. 22, 2011, the entire contents of which are incorporated herein by reference.
The disclosures herein are related to an arithmetic processing unit and an arithmetic processing method.
The advancement of the semiconductor manufacturing technology has led to a significant improvement in microfabrication and high-integration of transistors in a central processing unit (CPU). However, simultaneously, failures of the transistors integrated in the CPU frequently occur due to the microfabrication process and high integration of the transistors. In order to prevent such failures of the transistors, there is proposed a technique for implementing a failure detecting circuit for detecting the failures of the transistors in the CPU. With this technique, the failure detecting circuit is configured to detect the failures of the transistors prior to affecting operations of the CPU. Thus, even if some of the transistors utilized in the CPU have failed, the CPU is prevented from malfunctioning by detecting the failures of the transistors in advance. Specifically, if the detected failures of the transistors are correctable, the detected failures may be corrected and hence, the CPU may be able to continue to run without being interrupted by the failures of the transistors.
There is a known technology for correcting errors caused by the aforementioned failures of the transistors. In this technology, an error correcting circuit may be provided for correcting errors in data utilized for executing an arithmetic operation, and if the data include errors, the error correcting circuit corrects such errors. Such data utilized for executing the arithmetic operation are retrieved from a register file such as a fixed-point register or a floating-point register of the CPU. With this technology, a pipeline is cleared at the time that an error is detected and the instruction is executed again after the detected error is corrected. This has enabled the CPU to continue to run the program executing operation without terminating the execution of the program.
Note that if a memory access instruction such as a load instruction to access a noncache area has been engaged in accessing the noncache area at the time that an error is detected, the program executing operation of the CPU is controlled such that the program is terminated without allowing the load instruction to be executed again. This is because the load instruction for accessing a noncache area may have changed the contents of data in the access destination while reading the data for the first time. If the contents of the data in the access destination that have been changed are retrieved for a second time by executing the load instruction, erroneous data may be retrieved as a result. For example, the load instruction may serve as a “read-modify-write” instruction to retrieve data and modify the retrieved data simultaneously. Such an instruction (i.e., load instruction) may be utilized for controlling a semaphore or a mutex to manage a synchronization mechanism. Such a load instruction is generally executed to access a noncache area rather than a cache, because an access via a cache is less likely to directly read or write data in the access destination. Further, even if the load instruction is a simple read instruction, an access destination may be a memory having a data structure in which reading one entry transitions to a next entry, such as a first-in-first-out (FIFO) buffer or a stack. In such a case, the load instruction is likewise executed to access a noncache area rather than a cache.
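By way of a non-limiting illustration, the hazard of reexecuting a load whose read has a side effect may be sketched as follows. The sketch is not part of the related art itself; the class `FifoDevice` and the function `load_with_retry` are hypothetical names introduced only to model an access destination in which reading one entry transitions to the next entry.

```python
from collections import deque


class FifoDevice:
    """Hypothetical noncache access destination: every read consumes one entry."""

    def __init__(self, entries):
        self._entries = deque(entries)

    def load(self):
        # Reading has a side effect: the head entry is removed.
        return self._entries.popleft()


def load_with_retry(device, retry):
    """Issue the load once, or twice when an error forces reexecution."""
    value = device.load()
    if retry:
        # Naively reexecuting the load returns the next entry, not the original one.
        value = device.load()
    return value


first_try = load_with_retry(FifoDevice([10, 20, 30]), retry=False)
replayed = load_with_retry(FifoDevice([10, 20, 30]), retry=True)
print(first_try, replayed)  # prints "10 20"
```

In this sketch the replayed load observes the second entry, which corresponds to the erroneous data retrieval described above.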
Thus, it may be undesirable to control the operation of the CPU to terminate a program at the time that an error is detected even when the load instruction for accessing the noncache area has already been engaged in accessing the noncache area. Accordingly, it is desirable to control the operation of the CPU to continue to execute the program without terminating the program at the time that an error is detected.
Further, if the CPU is provided with a circuit for correcting errors such as an error correcting code (ECC) circuit, it may be necessary to validate the control operation of the CPU at the time that an error is generated. There is a technology to validate the control operation of the CPU at the time that an error is generated by intentionally causing an error. However, in order to validate the operation of the CPU, it is preferable to validate the operation without terminating the execution of the program. A typical technique for validating the operation at the time that an error is generated includes creating a special program that will not generate a load instruction to access a noncache area and validating the operation of the CPU by executing such a created special program. However, if the operation is validated only by executing such a special program, the validation coverage may be small. Further, extra time and cost may be required for creating the special program. Accordingly, it is desirable to validate the operation by utilizing an ordinary program that is not specifically created when data retrieved from the fixed-point register or the floating-point register for performing the arithmetic operation are found to be erroneous.
Patent Document 1: International Publication WO2008/152728
Patent Document 2: International Publication WO2008/155795
Patent Document 3: Japanese Laid-open Patent Publication No. 5-274173
According to an aspect of an embodiment, an arithmetic processing unit includes a cache memory; a register configured to hold data used for arithmetic processing; a correcting controller configured to detect an error in data retrieved from the register; a cache controller configured to access a cache area of a memory space via the cache memory or a noncache area of the memory space without using the cache memory in response to an instruction executing request for executing a requested instruction, and notify a report indicating that the requested instruction is a memory access instruction for accessing the noncache area; and an instruction executing controller configured to delay execution of other instructions subjected to error detection by the correcting controller while the cache controller executes the memory access instruction for accessing the noncache area when the instruction executing controller receives the notified report.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention as claimed.
In the following, a description is given with reference to the accompanying drawings of embodiments.
Each of the cores 14 incorporates a primary cache. In viewing from instruction-executing controllers inside the cores 14, each of the instruction-executing controllers inside the cores 14 is configured to access a primary cache and further access a secondary cache outside the cores 14. The cache memories in the CPU 11 are arranged in a hierarchical configuration. Thus, when an error occurs in the cache memory, a penalty caused by accessing a main storage may be reduced, owing to the hierarchical configuration of the cache memories. In this example, the secondary cache 15, which may be accessed faster than the main storage, is arranged between the primary cache and the main storage (i.e., DIMM 12). With this configuration, when the error occurs in the cache memory, the penalty may be reduced by lowering the frequency in accessing the main storage.
An interconnect part 13 is configured to control data exchange between the CPU 11 and the external devices or other nodes (e.g., other CPUs). In the configuration of the information processing system 10, only one CPU is implemented on a CPU/memory board. A noncache area accessed by a load instruction or the like may include registers inside the MAC 16 and the interconnect part 13.
The core 14 includes an instruction buffer 21, an instruction decoder 22, a reservation station for address generation (RSA) 23, a reservation station for execution (RSE) 24, and a reservation station for floating (RSF) 25. The core 14 further includes a reservation station for branch (RSBR) 26, a commit stack entry (CSE) 27, a primary data cache controller 28, an arithmetic unit 29, an arithmetic unit 30, a next program counter (NEXTPC) 31 and a program counter (PC) 32. The core 14 further includes a fixed-point renaming register 33, a floating-point renaming register 34, a fixed-point register 35, a floating-point register 36, an error detecting-correcting controller 37, a branch predicting mechanism 38 and an instruction fetch address generator 39. The core 14 further includes a primary instruction cache 40 and a pipeline clearing controller 41. The primary data cache controller 28 includes an operand address generator 42 and a primary data cache 43.
The instruction fetch address generator 39 is configured to generate an instruction fetch address based on an instruction address supplied from the program counter 32 and information acquired from the branch predicting mechanism 38. When the instruction fetch address generator 39 generates the instruction fetch address, the branch predicting mechanism 38 performs branch prediction based on information acquired from the RSBR 26. The instruction fetch address generator 39 issues an instruction fetch address and an instruction fetch request to the primary instruction cache 40 to fetch an instruction corresponding to the instruction fetch address. The fetched instruction is then stored in an instruction buffer 21. The instruction buffer 21 supplies the instructions sequentially stored in the order of program instructions to the instruction decoder 22. The instruction decoder 22 sequentially decodes the instructions in the order of program instructions and issues the decoded instructions in the order of program instructions. The instruction decoder 22 creates entries that indicate respective instructions to the RSA 23, RSE 24, RSF 25 and RSBR 26 based on types of the decoded instructions by issuing the decoded instructions.
The RSA 23 is a reservation station configured to control the created entries regardless of the order of program instructions (i.e., out of program instruction order) so as to generate a main storage operand address and execute a load instruction or a store instruction. The operand address generator 42 generates an address of an access destination based on the control carried out by the RSA 23, such that the load instruction or the store instruction is executed corresponding to the generated address in the primary data cache 43. The data retrieved based on the load instruction are stored in a register specified by the fixed-point renaming register 33 or the floating-point renaming register 34. The RSE 24 is a reservation station for controlling the created entries regardless of the program instruction order (i.e., out of program instruction order) so as to execute a fixed point arithmetic operation on the data in the specified register. The arithmetic unit 29 carries out a fixed point arithmetic operation on data in the specified register of the fixed-point renaming register 33 based on the control carried out by the RSE 24 and stores the arithmetic operation result in the specified register of the fixed-point renaming register 33. The RSF 25 is a reservation station for controlling the created entries regardless of the program instruction order (i.e., out of program instruction order) so as to execute a floating point arithmetic operation on the data in the specified register. The arithmetic unit 30 carries out a floating point arithmetic operation on data in the specified register of the floating-point renaming register 34 based on the control carried out by the RSF 25 and stores the arithmetic operation result in the specified register of the floating-point renaming register 34. 
The RSBR 26 is a reservation station for executing a branch instruction and supplies information on a branch instruction destination to the next program counter 31 and the branch predicting mechanism 38.
The instruction decoder 22 further creates entries of all the decoded instructions in the CSE 27 configured to control the completion of the instructions in the order of program instructions. When the instructions are executed based on the controls performed by the RSA 23, RSE 24, RSF 25 and RSBR 26, respective reports on instruction execution completion are generated along with identifiers of the executed (completed) instructions. The entries corresponding to the executed (completed) instructions are released from the CSE 27 in the order of program instructions and the completion of the instructions is sequentially finalized in the order of program instructions based on the released entry of a corresponding one of the executed instructions. When the completion of the instructions released from the CSE 27 is finalized, resources corresponding to the instructions are updated. When the load instruction, the fixed-point arithmetic operation instruction, and the floating-point arithmetic operation instruction are carried out, the data in the fixed-point renaming register 33 and the floating-point renaming register 34 are transferred to the fixed-point register 35 and the floating-point register 36 such that the executed instruction results are reflected in the registers accessible via software. Simultaneously, a value of the program counter 32 is updated corresponding to the value of the next program counter 31 while the value of the next program counter 31 is changed by an appropriate amount such that the changed value of the next program counter 31 indicates the address of the next instruction to be fetched. Accordingly, the program counter 32 indicates the address of the next instruction subsequent to the executed (completed) instruction released from the CSE 27. Note that if the execution of the branch instruction is completed, the branch destination address is stored in the next program counter 31.
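The relationship between out-of-order execution completion and in-order finalization may be illustrated by the following minimal sketch. It is a toy model under simplifying assumptions, not the actual circuit of the CSE 27; the class `CommitStack` and its method names are hypothetical.

```python
class CommitStack:
    """Toy model of in-order completion; not the actual CSE 27 design."""

    def __init__(self):
        self.entries = []        # instruction ids in program order
        self.completed = set()   # ids whose execution completion was reported
        self.committed = []      # ids whose completion was finalized, in order

    def allocate(self, instr_id):
        # The decoder creates one entry per decoded instruction.
        self.entries.append(instr_id)

    def report_completion(self, instr_id):
        # Completion reports may arrive in any order.
        self.completed.add(instr_id)
        self._release()

    def _release(self):
        # Only the oldest entry may be released, so finalization
        # always follows the order of program instructions.
        while self.entries and self.entries[0] in self.completed:
            self.committed.append(self.entries.pop(0))


cse = CommitStack()
for instr in ("load", "add", "fmul"):
    cse.allocate(instr)
cse.report_completion("fmul")
cse.report_completion("add")
after_out_of_order = list(cse.committed)
cse.report_completion("load")
print(after_out_of_order, cse.committed)
```

In this sketch nothing is finalized while the oldest instruction is still outstanding, even though the two younger instructions have already reported completion.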
The pipeline clearing controller 41 is configured to cancel the executed result of the instruction when a predetermined condition is satisfied, for example, when the execution of the branch instruction has failed, or when the later-described error is generated. Accordingly, a pipeline of the instruction executed by the core 14 is cleared (flushed). Respective instructions in an execution phase, such as an instruction fetch, an instruction decode, an instruction issue, an instruction execute and an instruction completion wait, are aligned in the instruction fetch address generator 39, the instruction buffer 21, the instruction decoder 22, the RSA 23, the RSE 24, the RSF 25, the RSBR 26, the CSE 27, and the like. These instructions in the execution phases are deleted by clearing (flushing) the pipeline based on the instruction executed by the pipeline clearing controller 41. Accordingly, no instructions in the execution phases are aligned in the instruction fetch address generator 39, the instruction buffer 21, the instruction decoder 22, the RSA 23, the RSE 24, the RSF 25, the RSBR 26, the CSE 27, and the like.
When the error detecting-correcting controller 37 reads data having a 1-bit error from the fixed-point register 35 or the floating-point register 36, the error detecting-correcting controller 37 detects the 1-bit error and corrects the detected 1-bit error. The error detecting-correcting controller 37 may use an error correction code (ECC) to detect and correct the 1-bit error. For validating the control operation, the error detecting-correcting controller 37 intentionally causes a 1-bit error in the data to be retrieved from the fixed-point register 35 or the floating-point register 36.
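The embodiment does not specify which error correction code is used. Purely as an illustration of 1-bit error detection and correction, the following sketch assumes a textbook Hamming(7,4) code; the function names `hamming74_encode`, `inject_error` and `hamming74_correct` are hypothetical, and `inject_error` merely mimics the intentional 1-bit error used for validation.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword laid out as positions 1..7."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def inject_error(codeword, position):
    """Flip one bit (position is 1-based), mimicking an intentional error."""
    flipped = list(codeword)
    flipped[position - 1] ^= 1
    return flipped


def hamming74_correct(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome is the corrected position."""
    bits = list(codeword)
    s1 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]
    s2 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]
    s3 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        # The syndrome equals the 1-based position of the single flipped bit.
        bits[syndrome - 1] ^= 1
    return [bits[2], bits[4], bits[5], bits[6]], syndrome


word = hamming74_encode([1, 0, 1, 1])
data, syndrome = hamming74_correct(inject_error(word, 5))
print(data, syndrome)  # prints "[1, 0, 1, 1] 5"
```

A register file would normally use a wider code over the full register width; the 7-bit code is chosen here only to keep the sketch short.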
Initially, a basic control process for executing a load instruction to access a noncache area is described. The primary data cache controller 28 accesses a cache area or a noncache area based on a request for executing an instruction while reporting to the instruction-executing controller 50 that the instruction requested for execution is a load instruction for accessing the noncache area. When the instruction-executing controller 50 receives the report indicating that the instruction requested for execution is the load instruction for accessing the noncache area, the instruction-executing controller 50 delays execution of other instructions while allowing the primary data cache controller 28 to execute the load instruction for accessing the noncache area. Accordingly, an error may not be detected by the error detecting-correcting controller 37 (see
In order to implement the aforementioned control process, the noncache access reexecuting mode instructing part 51 is provided in the instruction-executing controller 50 and various signals are exchanged as illustrated in
When the signal indicating the unexecuted access to the noncache area is asserted, the instruction-executing controller 50 waits for the load instruction to be aligned at the head of the unfinalized, uncompleted instructions. When the load instruction is aligned at the head of the unfinalized, uncompleted instructions, the instruction-executing controller 50 asserts a noncache access reexecuting mode signal. Specifically, when the entry corresponding to the load instruction is aligned at the head of the instructions among the stored entries, the instruction-completing controller (CSE) 27 requests the noncache access reexecuting mode instructing part 51 to reexecute the noncache access. In response to that request, a “1” may be stored in the noncache access reexecuting mode instructing part 51 to assert the noncache access reexecuting mode signal. Further, in the instruction-executing controller 50, the pipeline clearing controller 41 clears (flushes) the pipeline in response to the request for reexecuting the noncache access received from the instruction-completing controller 27, and execution is initiated again by refetching the load instruction. Specifically, in the instruction-executing controller 50, the instruction decoder 22 decodes the refetched load instruction to issue the decoded load instruction, and the operand address executing controller 23 requests the primary data cache controller 28 to execute the decoded load instruction. At this moment, since the noncache access reexecuting mode signal is being asserted in the instruction-executing controller 50, the instruction decoder 22 will not issue the instructions subsequent to the load instruction, thereby delaying the execution of the other instructions.
When the instruction-executing controller 50 requests the primary data cache controller 28 to execute the load instruction while the noncache access reexecuting mode signal is being asserted, the primary data cache controller 28 executes the load instruction to access the noncache area. When the primary data cache controller 28 executes the load instruction, the instruction-executing controller 50 negates the noncache access reexecuting mode signal to initiate issuing of other instructions subsequent to the executed load instruction. Specifically, in the instruction-executing controller 50, the instruction-completing controller 27 reports the completion of the load instruction execution to the noncache access reexecuting mode instructing part 51. In response to the execution completion report, the noncache access reexecuting mode signal output by the noncache access reexecuting mode instructing part 51 is switched to a negate state. Further, the instruction decoder 22 initiates issuing of other instructions subsequent to the load instruction in response to the negate state of the noncache access reexecuting mode signal.
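The sequence described above (withholding the first access, asserting the mode, flushing, refetching, executing the load alone, and negating the mode) may be sketched as follows. This is a deliberately simplified in-order toy model, so the load is always the oldest instruction when it is handled; the class `ToyCore`, the trace labels and the instruction names are hypothetical and do not correspond to actual signals of the embodiment.

```python
class ToyCore:
    """In-order toy model of the reexecuting mode; all names are hypothetical."""

    def __init__(self, noncache_loads):
        self.noncache_loads = set(noncache_loads)
        self.rerun_mode = False  # models the noncache access reexecuting mode signal
        self.trace = []

    def cache_request(self, instr):
        """Return True if the instruction was actually executed."""
        if instr in self.noncache_loads and not self.rerun_mode:
            # First request: the access is withheld and reported as unexecuted.
            self.trace.append(("unexecuted", instr))
            return False
        self.trace.append(("executed", instr))
        return True

    def run(self, program):
        pc = 0
        while pc < len(program):
            instr = program[pc]
            if self.cache_request(instr):
                self.trace.append(("commit", instr))
                if self.rerun_mode:
                    # Completion of the load negates the mode, so that
                    # subsequent instructions may be issued again.
                    self.rerun_mode = False
                    self.trace.append(("mode_off", instr))
                pc += 1
            else:
                # The load is the oldest instruction here: assert the mode,
                # flush, and refetch from the same address (pc is unchanged).
                self.rerun_mode = True
                self.trace.append(("mode_on_flush", instr))
        return self.trace


trace = ToyCore({"ld_nc"}).run(["add", "ld_nc", "sub"])
for event in trace:
    print(event)
```

In the resulting trace the noncache access is executed exactly once, and no other instruction is executed between the assertion and the negation of the mode.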
If the primary data cache controller 28 is in a validation mode, the primary data cache controller 28 may report to the instruction-executing controller 50 that the executed load instruction is the load instruction to access a noncache area. If, on the other hand, the primary data cache controller 28 is not in the validation mode, the primary data cache controller 28 may execute the load instruction to access the noncache area without reporting to the instruction-executing controller 50 that the executed load instruction is the load instruction to access the noncache area. That is, only when the primary data cache controller 28 is in the validation mode, the primary data cache controller 28 may reexecute the load instruction while allowing the pipeline clearing controller 41 to clear (flush) the pipeline and delaying the execution of other instructions. By contrast, when the primary data cache controller 28 is not in the validation mode, the primary data cache controller 28 may execute the load instruction in a similar manner as other instructions executed in a normal control operation mode. Note that whether the primary data cache controller 28 is in the validation mode may be indicated by the contents of a validation mode signal based on the settings of the validation mode register 52. Accordingly, because the specific control over the load instruction for accessing the noncache area is performed only when the primary data cache controller 28 is in the validation mode, the execution of the load instructions for accessing the noncache area is not slowed down in the normal operation mode. If the load instruction for accessing the noncache area is specifically controlled while the primary data cache controller 28 is in the validation mode, the error correcting control operation may be effectively validated without the necessity of creating a special validation program.
As illustrated in
Only the noncache access instruction is redecoded in step S7 of
In step S12 of
S22, the instruction decoder 22 proceeds with step S24 so as to determine whether one instruction has been decoded in the noncache access reexecuting mode. If the instruction decoder 22 determines that no instruction has yet been decoded in the noncache access reexecuting mode, the instruction decoder 22 proceeds with step S26 so as to decode only one instruction to issue the decoded instruction. The issued instruction is a first instruction in the noncache access reexecuting mode (i.e., the first instruction in the noncache access reexecuting mode after clearing (flushing) the pipeline), which is the load instruction for accessing the noncache area. Specifically, the instruction decoder 22 creates one entry and stores the load instruction corresponding to the created entry in the operand address executing controller (RSA) 23. If, on the other hand, the instruction decoder 22 determines that one instruction has already been decoded in the noncache access reexecuting mode in step S24, the instruction decoder 22 proceeds with step S25 so as not to issue an instruction subsequent to the load instruction.
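The decision made by the instruction decoder 22 in steps S24 to S26 may be sketched as a single function. The behavior outside the noncache access reexecuting mode is not described in the passage above, so the first branch (issuing every pending instruction and resetting the flag) is an assumption; the function name `decode_step` and its parameters are hypothetical.

```python
def decode_step(rerun_mode, decoded_in_mode, pending):
    """Return (issued_instructions, decoded_in_mode) for one decode cycle.

    rerun_mode      -- True while the noncache access reexecuting mode is on
    decoded_in_mode -- True once one instruction was decoded in that mode
    pending         -- instructions waiting in program order
    """
    if not rerun_mode:
        # Assumed normal operation: issue whatever is pending, reset the flag.
        return list(pending), False
    if not decoded_in_mode:
        # Corresponds to step S26: decode and issue exactly one instruction,
        # which is the refetched load for accessing the noncache area.
        return pending[:1], True
    # Corresponds to step S25: hold every instruction after the load.
    return [], True


issued, flag = decode_step(True, False, ["ld_nc", "sub", "mul"])
print(issued, flag)  # prints "['ld_nc'] True"
held, flag = decode_step(True, flag, ["sub", "mul"])
print(held, flag)    # prints "[] True"
```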
If, on the other hand, the primary data cache controller 28 determines that the instruction requested for execution (i.e., the load instruction) corresponds to the noncache access reexecuting mode in step S33, the primary data cache controller 28 executes the access to the noncache area in step S36. When the primary data cache controller 28 completes the execution of the access to the noncache area, the primary data cache controller 28 proceeds with step S37 so as to report the completion of the instruction execution to the instruction-completing controller (CSE) 27.
After the instruction pipeline is cleared (flushed), the corresponding load instruction (i.e., the load instruction for accessing the noncache area) is refetched, the refetched load instruction is decoded and an entry corresponding to the refetched load instruction is created in the instruction-completing controller 27. The instruction-completing controller 27 waits for receiving the execution completion report corresponding to the load instruction from the primary data cache controller 28. When the instruction-completing controller 27 receives the report on the instruction execution completion, the instruction-completing controller 27 stores the execution completion report in the entry indicated by the simultaneously received entry number (i.e., the entry of the load instruction). In step S43, the instruction-completing controller 27 determines whether the signal indicating unexecuted access to the noncache area is in an on state corresponding to the entry determined as the entry of the completed instruction (i.e., the entry of the load instruction). If, on the other hand, the instruction-completing controller 27 determines that the signal indicating unexecuted access to the noncache area is not in an on state corresponding to the entry determined as the entry of the completed instruction, the instruction-completing controller 27 proceeds with step S47 so as to determine whether to finalize the completion of the load instruction. In this case, it may be necessary to finalize the completion of the instructions in the order of the program instructions. When the completion of the instruction is finalized (“YES” in step S47), the noncache access instruction reexecuting mode is switched off (step S48) and the resources such as the registers are updated (step S49).
If the validation mode is in an on state (“YES” in step S39), the primary data cache controller 28 determines whether the instruction requested for execution corresponds to the noncache access reexecuting mode in step S33. Thereafter, the primary data cache controller 28 controls the execution or non-execution of access to the noncache area based on the determination result indicating that the instruction requested for execution corresponds to or does not correspond to the noncache access reexecuting mode. The control operation in this case is similar to that illustrated in
As illustrated in
When the noncache access instruction reexecuting request signal +NONCACHE_ACCESS_RERUN_REQUEST output by the instruction-completing controller 27 is “1” and the pipeline clear signal +PIPELINE_CLEAR is “0”, “1” is set to the latch circuit 65. Accordingly, the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is switched to an assert state (i.e., “1” in this example). The output of the latch circuit 65 is maintained as “1” by the feedback path until a condition is satisfied, in which a program head instruction completion indicating signal +TOQ_CSE_END output by the instruction-completing controller 27 is “1” and the pipeline clear signal +PIPELINE_CLEAR is “0”. That is, the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is maintained in the assert state.
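The set and hold behavior of the latch circuit 65 described above may be expressed as a next-state function. The exact gate arrangement is given only in the drawing, so the Boolean expression below is an assumption derived from the two conditions stated in the text; the function name `next_rerun_mode` is hypothetical and signals are modeled as integers 0 and 1.

```python
def next_rerun_mode(mode, rerun_request, toq_cse_end, pipeline_clear):
    """Next value of +NONCACHE_ACCESS_RERUN_MODE (assumed gate structure)."""
    # Set condition: rerun request is "1" while the pipeline clear signal is "0".
    set_term = rerun_request and not pipeline_clear
    # Release condition: head instruction completed while pipeline clear is "0".
    release = toq_cse_end and not pipeline_clear
    # Feedback path: keep the current value until the release condition holds.
    hold_term = mode and not release
    return int(bool(set_term or hold_term))


mode = 0
history = []
# Each tuple is (rerun_request, toq_cse_end, pipeline_clear) for one cycle.
for req, end, clr in [(1, 0, 0), (0, 0, 1), (0, 0, 0), (0, 1, 0), (0, 0, 0)]:
    mode = next_rerun_mode(mode, req, end, clr)
    history.append(mode)
print(history)  # prints "[1, 1, 1, 0, 0]"
```

Under this assumed structure the mode survives the pipeline clear that follows the request and is negated only when the head instruction completes.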
The instruction-issuing controller 70 is sequentially supplied with instructions in the order of the program instructions and generates respective 1-bit signals +D0_REL, +D1_REL, and +D2_REL indicating the issuing of the instructions. The signals +D0_REL, +D1_REL, and +D2_REL correspond to the instructions in the order of the program instructions and are sequentially switched to “1” in that order. A “0” is assigned as an initial setting of the latch circuit 79. With this condition, when the +D0_REL signal is switched to “1”, the output of the AND circuit 73 is switched to “1” and the +D0_ISSUE signal indicating the issuing of the first instruction is switched to “1”. Further, when the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is switched to “1”, the output of the AND circuit 77 is switched to “1”, and hence “1” is set to the latch circuit 79. The output of the latch circuit 79 is updated by the result of an AND operation of the output signal of the latch circuit 79 and the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE. Accordingly, the “1” is maintained as the output of the latch circuit 79 while the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is “1”. Further, the AND circuit 73 carries out an AND operation of the inverted signal of the output of the latch circuit 79 and the +D0_REL signal. Accordingly, the +D0_ISSUE signal will not be “1”, and hence no instruction is issued, while the output of the latch circuit 79 is “1”. Moreover, the +D1_REL and +D2_REL signals are blocked by the AND circuits 75 and 76 while either the +NONCACHE_ACCESS_RERUN_MODE signal or the output signal of the latch circuit 79 is “1”. Thus, the +D1_ISSUE and +D2_ISSUE signals will not be “1”, and the corresponding instructions are not issued, while the +NONCACHE_ACCESS_RERUN_MODE signal or the output signal of the latch circuit 79 is “1”.
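The issue gating described above may be sketched as one combinational step plus the update of the latch circuit 79. The inputs of the AND circuit 77 are not spelled out in the text; the sketch assumes that it combines the +D0_ISSUE signal with the +NONCACHE_ACCESS_RERUN_MODE signal. The function name `issue_cycle` is hypothetical and signals are modeled as integers 0 and 1.

```python
def issue_cycle(d_rel, rerun_mode, latch79):
    """Return ((d0_issue, d1_issue, d2_issue), next_latch79) for one cycle."""
    d0_rel, d1_rel, d2_rel = d_rel
    # AND circuit 73: the first instruction issues unless the latch is set.
    d0_issue = d0_rel & (1 - latch79)
    # AND circuits 75 and 76: later slots are blocked by the mode or the latch.
    blocked = rerun_mode | latch79
    d1_issue = d1_rel & (1 - blocked)
    d2_issue = d2_rel & (1 - blocked)
    # Assumed AND circuit 77 (set) combined with the hold path of the latch.
    next_latch79 = rerun_mode & (d0_issue | latch79)
    return (d0_issue, d1_issue, d2_issue), next_latch79


latch = 0
cycles = []
# Rerun mode stays "1" for two cycles, then is negated.
for mode in (1, 1, 0, 0):
    issued, latch = issue_cycle((1, 1, 1), mode, latch)
    cycles.append(issued)
print(cycles)  # prints "[(1, 0, 0), (0, 0, 0), (0, 0, 0), (1, 1, 1)]"
```

In this sketch one further cycle passes after the mode is negated before issuing resumes, because the latch is cleared only at the end of that cycle; whether the actual circuit behaves this way depends on timing details not stated in the text.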
Accordingly, only the head of the instructions among the program instructions is issued after the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is switched to “1”, which may inhibit the subsequent instruction from being issued. Note that the configuration of the instruction-issuing part of the instruction decoder 22 illustrated in
The operand address generator 80 generates an operand address for the instruction requested for execution in response to the instruction executing request received from the operand address executing controller (RSA) 23. The noncache access controller 81 determines whether the instruction requested for execution is the load instruction in response to the instruction executing request. The noncache access controller 81 further outputs a noncache access signal +NONCACHE_LOAD_REQUEST (see
The execution completion selecting circuit 86 selects one of the execution completion signal from the primary data cache 84, the execution completion signal from the noncache area 85, and the signal indicating unexecuted access to the noncache area from the AND circuit 83. The execution completion selecting circuit 86 generates a signal +L1_DCACHE_EXEC_COMP indicating the completion of the instruction execution by selecting one of the execution completion signals and transmits the generated signal to the instruction-completing controller 27. Further, if the execution completion selecting circuit 86 selects the signal indicating unexecuted access to the noncache area as the execution completion signal, the execution completion selecting circuit 86 transmits a noncache area access unexecuted signal +NOT_EXEC_NONCACHE_LOAD to the instruction-completing controller 27.
If, on the other hand, the validation mode signal +ERROR_INJECTION_MODE is “1”, the following control process is carried out. In this condition, if the noncache access signal +NONCACHE_LOAD_REQUEST is “1”, and the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is “0”, the access to the noncache area 85 is not executed. Further, if the noncache access signal +NONCACHE_LOAD_REQUEST is “1”, and the noncache access instruction reexecuting mode signal +NONCACHE_ACCESS_RERUN_MODE is “1”, the access to the noncache area 85 is executed. Similar to the configuration in
According to at least one embodiment, the load instruction for accessing the noncache area may be executed alone in a state where other instructions are unexecuted. Accordingly, an error will not be detected in the data retrieved from the registers when the load instruction for accessing the noncache area is being executed. That is, when an error is detected in the data retrieved from the registers, the load instruction for accessing the noncache area will not be in execution. Accordingly, the program executing operation may be continued without being interrupted.
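The conditions under which the primary data cache controller executes or withholds the access to the noncache area may be summarized by the following sketch. It assumes that the access is withheld exactly when the validation mode signal and the noncache access signal are “1” and the reexecuting mode signal is “0”, and that the access is executed directly when the validation mode signal is “0”; the function name `noncache_access_signals` and the key `execute_access` are hypothetical, and signals are modeled as integers 0 and 1.

```python
def noncache_access_signals(error_injection_mode, noncache_load_request, rerun_mode):
    """Return whether the access is executed and the unexecuted report signal."""
    # Withhold the access only for a noncache load seen outside the
    # reexecuting mode while the validation mode is on (assumed condition).
    not_exec = error_injection_mode & noncache_load_request & (1 - rerun_mode)
    # Any other noncache load request is executed.
    execute = noncache_load_request & (1 - not_exec)
    return {"execute_access": execute, "+NOT_EXEC_NONCACHE_LOAD": not_exec}


for inputs in [(1, 1, 0), (1, 1, 1), (0, 1, 0)]:
    print(inputs, noncache_access_signals(*inputs))
```

The first case corresponds to the first request of the load in the validation mode, the second case to the reexecuted load, and the third case to normal operation.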
The invention is not limited to the embodiments described so far. Various modifications may be made within the scope of the inventions described in the claims.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority or inferiority of the invention.
Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2011-063108 | Mar 2011 | JP | national |