The invention relates to the field of superscalar technology, in particular to a method for superscalar delay optimization.
In a superscalar issue queue, every issue cycle requires scanning the issue queue from beginning to end to find the commands that are ready to issue. When the issue queue is long, especially in the case of a centralized issue queue (CIQ), several executable commands must be selected from a queue holding a very large number of commands. The resulting delay is large and lengthens the cycle time, which greatly reduces the clock frequency and execution efficiency of the processor. The delay is proportional to the capacity of the issue queue: the more commands the issue queue can hold, the greater the delay.
In traditional superscalar techniques, scanning from beginning to end for the commands that need to be issued has a worst-case time complexity of O(n). Wake-up takes place only after this scan, so it must wait until the O(n) scan of the entire issue queue has completed within one cycle.
The technical problem to be solved by the invention is to reduce the delay of scanning the entire issue queue from beginning to end in each clock cycle, as required by traditional superscalar technology, to one-third of its original value.
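The traditional selection described above can be illustrated with a minimal software sketch. It is not a hardware description: the `Command` class, its `ready` flag, and the function `scan_select` are hypothetical names introduced only for illustration, and the count of examined entries is used as a stand-in for scan delay.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    ready: bool  # hypothetical flag: True when the command is prepared for issue


def scan_select(issue_queue, issue_width):
    """Scan the whole queue front to back and pick up to issue_width ready commands.

    Returns the selected commands and the number of entries examined.
    Every entry is examined, so the work grows linearly with the queue length.
    """
    selected = []
    examined = 0
    for cmd in issue_queue:
        examined += 1
        if cmd.ready and len(selected) < issue_width:
            selected.append(cmd)
    return selected, examined


# A 12-entry queue in which every fourth command is ready (illustrative data only).
queue = [Command(f"c{i}", ready=(i % 4 == 3)) for i in range(12)]
picked, examined = scan_select(queue, issue_width=2)
print([c.name for c in picked], examined)
```

In this sketch the number of examined entries always equals the queue length, which is the proportionality between delay and queue capacity that the background section describes.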
In order to solve the above technical problem, the technical solution provided by the invention is a method for superscalar delay optimization, wherein an issue queue is divided into three types of queue, namely a ready queue, a wait 1 queue, and a wait 2 queue; the length of each type of queue is one-third of the length of the issue queue, so that the delay of scanning the entire issue queue from beginning to end per clock cycle is reduced to one-third of the original; in the issue queue, the details are:
in the issue queue, the three queues include three command entry and exit modes, specifically:
The advantages of the invention compared with the prior art are as follows: in the invention, an issue queue is divided into three types of queue, namely a ready queue, a wait 1 queue, and a wait 2 queue; the length of each type of queue is one-third of the length of the issue queue, so that the delay of scanning the entire issue queue from beginning to end per clock cycle is reduced to one-third of the original. Further, the three command entry and exit modes operate in parallel.
Further, the positions at the front of the ready queue, equal in number to the issue width, are set as the issue ports: the first position is the issue port for the first issued command, the second position is the issue port for the second issued command, and so on up to the position given by the issue width. The commands that need to be issued and executed can therefore be found in the issue queue with a time complexity of O(1) within one cycle; that is, wake-up can follow after a very small constant time.
Further, in the issue queue, the three command entry and exit modes are applicable to any command type, provided that each command has at most two source registers and one destination register (for example, RISC instructions, ARM instructions, micro-instructions that meet the condition, and any other instruction type that meets the condition).
In a specific implementation of the invention, several important parameters are involved:
The type of command stored in the queues in this solution: any command type, provided that each command has at most two source registers and one destination register (for example, RISC instructions, ARM instructions, micro-instructions that meet the condition, and any other instruction type that meets the condition).
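The operand condition above can be expressed as a simple admission check. The following is a minimal sketch only; the `Command` record, its field names, and the function `fits_issue_queue` are hypothetical and are not taken from the invention.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Command:
    opcode: str
    sources: Tuple[str, ...]    # names of source registers (hypothetical encoding)
    dest: Optional[str] = None  # a single optional field, so at most one destination


def fits_issue_queue(cmd: Command) -> bool:
    """Return True when the command has at most two source registers.

    The destination limit is enforced by the record layout itself,
    because `dest` can hold only one register name or None.
    """
    return len(cmd.sources) <= 2


add = Command("add", sources=("r1", "r2"), dest="r3")
fma = Command("fma", sources=("r1", "r2", "r3"), dest="r4")
print(fits_issue_queue(add), fits_issue_queue(fma))
```

Under this sketch, a three-source command such as the illustrative `fma` would have to be split into commands that meet the condition before entering the queues.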
In one embodiment of the invention, the core principle of this solution is as follows: an issue queue is divided into three types of queue, namely a ready queue, a wait 1 queue, and a wait 2 queue; the length of each type of queue is one-third of the length of the issue queue, so that the delay of scanning the entire issue queue from beginning to end per clock cycle is reduced to one-third of the original.
In detail, the issue queue includes: 1) ready queue: the data of both source registers is ready, and the command can be issued directly;
In one embodiment of the invention, in the issue queue, the three queues include three command entry and exit modes, specifically:
The above three processes run in parallel, thus reducing the queue scan delay to one-third of the original.
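The one-third figure can be checked with a small arithmetic sketch. It assumes, as the background section states, that scan delay is proportional to queue length, and it models parallel scanning by taking the longest of the three sub-queues as the delay. The function names and the 96-entry capacity are hypothetical.

```python
def scan_delay(queue_lengths):
    """Delay, in entries scanned, when the given queues are scanned in parallel.

    Parallel scans finish when the longest queue has been scanned.
    """
    return max(queue_lengths)


def compare(total_entries):
    """Return (delay of one monolithic queue, delay of three equal sub-queues)."""
    if total_entries % 3 != 0:
        raise ValueError("this sketch assumes the capacity divides evenly by three")
    per_queue = total_entries // 3
    single = scan_delay([total_entries])
    split = scan_delay([per_queue, per_queue, per_queue])  # ready, wait 1, wait 2
    return single, split


single, split = compare(96)
print(single, split, split / single)
```

The ratio depends on the three scans genuinely overlapping in time; if they were performed one after another, the total would return to the full queue length.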
In one embodiment of the invention, the working principle of the solution is as follows: the positions at the front of the ready queue, equal in number to the issue width, are set as the issue ports: the first position is the issue port for the first issued command, the second position is the issue port for the second issued command, and so on up to the position given by the issue width. The commands that need to be issued and executed can therefore be found in the issue queue with a time complexity of O(1) within one cycle; that is, wake-up can follow after a very small constant time.
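The issue-port arrangement can be sketched in software as a queue whose first positions are read directly. This is a behavioral illustration under stated assumptions, not the circuit of the invention: the `ReadyQueue` class and its methods are hypothetical, and every command placed in it is assumed to be ready already.

```python
from collections import deque


class ReadyQueue:
    """Ready queue whose first `issue_width` positions act as the issue ports."""

    def __init__(self, issue_width):
        self.issue_width = issue_width
        self.entries = deque()

    def push(self, cmd):
        """Append a command that is already ready to issue."""
        self.entries.append(cmd)

    def issue(self):
        """Issue the commands sitting in the issue-port positions.

        At most `issue_width` entries are removed from the front, so the work
        per cycle is bounded by the issue width and not by the queue length.
        """
        count = min(self.issue_width, len(self.entries))
        return [self.entries.popleft() for _ in range(count)]


rq = ReadyQueue(issue_width=2)
for name in ("a", "b", "c"):
    rq.push(name)
print(rq.issue())  # commands at port positions 1 and 2
print(rq.issue())  # the remaining command has moved up to port position 1
```

Because selection reads fixed positions, no search over the queue is needed; the remaining entries simply move toward the front after each issue.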
The invention and its embodiments are described above, and this description is not restrictive. What is shown in the drawings is only one of the embodiments of the invention, and the actual structure is not limited thereto. In summary, structures, methods, and embodiments similar to the technical solution that are devised by those of ordinary skill in the art without creative effort and without departing from the purpose of the invention shall all fall within the protection scope of the invention. The protection scope of the invention is defined by the appended claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2023105185126 | May 2023 | CN | national |