Claims
- 1. A processor, comprising:
an instruction pipeline having a plurality of stages; a result pipeline having a plurality of stages; an execution unit connected to the instruction pipeline and the result pipeline, wherein the execution unit includes an operand input and a result output, wherein the operand input receives an operand from the instruction pipeline and wherein the execution unit transmits a result to the result output as a function of the operand received by the operand input; and a reorder buffer, wherein the reorder buffer supplies instructions and operands to the instruction pipeline and receives results from the result pipeline and wherein the instruction pipeline and the result pipeline wrap around the reorder buffer to create counter rotating queues.
- 2. The processor of claim 1, wherein the execution unit includes a plurality of stages, wherein each stage operates under control of a clock.
- 3. The processor of claim 1, wherein the execution unit is a wavefront processor.
- 4. The processor of claim 1, wherein the instruction pipeline is two instructions wide.
- 5. The processor of claim 1, wherein the result output is connected to the data pipeline and wherein the result output transmits a result to the result pipeline as a function of the operand received by the operand input.
- 6. The processor of claim 1, wherein the result output is connected to the instruction pipeline, wherein the result output transmits a result to the instruction pipeline as a function of the operand received by the operand input and wherein the instruction pipeline subsequently copies the result to the result pipeline.
- 7. The processor of claim 1, wherein the reorder buffer is implemented with nonassociative memory.
- 8. The processor of claim 7, wherein each result must travel at least one half trip around the result pipeline after being recovered.
- 9. The processor of claim 8, wherein each result recovered into the result pipeline after a halfway point is marked as needing to pass the reorder buffer.
- 10. The processor of claim 8, wherein each result recovered into the result pipeline carries a tag identifying the instruction with which the result is associated.
- 11. The processor of claim 10, wherein the tag identifies the reorder buffer register associated with the instruction.
- 12. The processor of claim 1, wherein each result recovered into the result pipeline carries a tag identifying the instruction with which the result is associated.
- 13. The processor of claim 12, wherein the tag identifies the reorder buffer register associated with the instruction.
- 14. The processor of claim 1, wherein the processor further comprises:
a cache, wherein the cache stores recently accessed data and instructions; an instruction prefetch unit; and a branch prediction unit connected to the instruction prefetch unit; wherein the reorder buffer receives an instruction from the instruction prefetch unit and launches the instruction, with its operands, down the instruction pipeline.
- 15. The processor of claim 1, wherein the reorder buffer uses nonassociative memory.
- 16. The processor of claim 1, wherein the reorder buffer is distributed across two or more segments of the instruction pipeline.
- 17. The processor of claim 1, wherein the reorder buffer is configured as two segments, wherein each instruction in the instruction pipeline includes an instruction tag and wherein a reorder buffer tag is appended to each instruction tag, wherein the reorder buffer tag identifies the reorder buffer which issued the instruction.
- 18. The processor of claim 1, wherein each result in the result pipeline includes a tag identifying whether the result is valid and whether the result is a predicted value.
- 19. The processor of claim 1, wherein partial results are stored in a consumer array within the instruction pipeline.
- 20. A computer system comprising:
memory; and a processor; wherein the processor includes:
a cache connected to the memory, wherein the cache stores recently accessed data and instructions; an instruction prefetch unit; a branch prediction unit connected to the instruction prefetch unit; an instruction pipeline having a plurality of stages; a result pipeline having a plurality of stages; an execution unit connected to the instruction pipeline and the result pipeline, wherein the execution unit includes an operand input and a result output, wherein the operand input receives an operand from the instruction pipeline and wherein the result output transmits a result to the result pipeline as a function of the operand received by the operand input; and a reorder buffer, wherein the reorder buffer receives instructions from the instruction prefetch unit, supplies instructions and operands to the instruction pipeline and receives results from the result pipeline and wherein the instruction pipeline and the result pipeline wrap around the reorder buffer to create counter rotating queues.
- 21. The processor of claim 20, wherein the execution unit includes a plurality of stages, wherein each stage operates under control of a clock.
- 22. The processor of claim 20, wherein the execution unit is a wavefront processor.
- 23. The processor of claim 20, wherein the instruction pipeline is two instructions wide.
- 24. The processor of claim 20, wherein the result output is connected to the data pipeline and wherein the result output transmits a result to the result pipeline as a function of the operand received by the operand input.
- 25. The processor of claim 20, wherein the result output is connected to the instruction pipeline, wherein the result output transmits a result to the instruction pipeline as a function of the operand received by the operand input and wherein the instruction pipeline subsequently copies the result to the result pipeline.
- 26. The processor of claim 20, wherein the reorder buffer is implemented with nonassociative memory.
- 27. The processor of claim 26, wherein each result must travel at least one half trip around the result pipeline after being recovered.
- 28. The processor of claim 27, wherein each result recovered into the result pipeline after a halfway point is marked as needing to pass the reorder buffer.
- 29. The processor of claim 27, wherein each result recovered into the result pipeline carries a tag identifying the instruction with which the result is associated.
- 30. The processor of claim 29, wherein the tag identifies the reorder buffer register associated with the instruction.
- 31. The processor of claim 20, wherein each result recovered into the result pipeline carries a tag identifying the instruction with which the result is associated.
- 32. The processor of claim 31, wherein the tag identifies the reorder buffer register associated with the instruction.
- 33. The processor of claim 20, wherein the processor further comprises:
a cache, wherein the cache stores recently accessed data and instructions; an instruction prefetch unit; and a branch prediction unit connected to the instruction prefetch unit; wherein the reorder buffer receives an instruction from the instruction prefetch unit and launches the instruction, with its operands, down the instruction pipeline.
- 34. The processor of claim 20, wherein the reorder buffer uses nonassociative memory.
- 35. The processor of claim 20, wherein the reorder buffer is distributed across two or more segments of the instruction pipeline.
- 36. The processor of claim 20, wherein the reorder buffer is configured as two segments, wherein each instruction in the instruction pipeline includes an instruction tag and wherein a reorder buffer tag is appended to each instruction tag, wherein the reorder buffer tag identifies the reorder buffer which issued the instruction.
- 37. The processor of claim 20, wherein each result in the result pipeline includes a tag identifying whether the result is valid and whether the result is a predicted value.
- 38. The processor of claim 20, wherein partial results are stored in a consumer array within the instruction pipeline.
- 39. A method of executing instructions within a counterflow pipeline processor having an instruction pipeline, a data pipeline, a reorder buffer and a plurality of execution units, including a first execution unit, the method comprising:
fetching an instruction; determining operands for the instruction; issuing the instruction into the instruction pipeline; determining, at the first execution unit, if the instruction is ready for execution; if the instruction is ready for execution, loading the operands into the first execution unit; monitoring for a result from the first execution unit; on receiving a result, storing the result in the result pipeline; determining if the instruction has executed; and if the instruction has not executed by the end of the instruction pipeline, wrapping the instruction back into the instruction pipeline.
- 40. The method according to claim 39, wherein writing the result to the reorder buffer includes:
determining if the result was stored in the result pipeline over half a pipeline length before reaching the reorder buffer; and if not, writing the result from the reorder buffer to the result pipeline.
- 41. The method according to claim 39, wherein writing the result to the reorder buffer includes:
determining if the instruction was invalidated; and if so, deleting the result from the result pipeline.
- 42. The method according to claim 39, wherein storing the result in the result pipeline includes storing, with the result in the result pipeline, a tag associated with the instruction.
- 43. A processor, comprising:
an instruction pipeline having a plurality of stages, including a first and a second stage; a result pipeline having a plurality of stages, including a first and a second stage; first and second execution units, wherein the first and second execution units are connected to the first and second stages, respectively, of the instruction pipeline and the result pipeline, wherein each execution unit includes an operand input and a result output, wherein the operand input receives an operand from its respective stage of the instruction pipeline and wherein the result output transmits a result to its respective stage of the result pipeline as a function of the operand received by the operand input; and first and second reorder buffers, wherein the first reorder buffer supplies instructions and operands to the first stage of the instruction pipeline and receives results from the first stage of the result pipeline and wherein the second reorder buffer supplies instructions and operands to the second stage of the instruction pipeline and receives results from the second stage of the result pipeline.
- 44. The processor of claim 43, wherein each execution unit includes a plurality of stages, wherein each stage operates under control of a clock.
- 45. The processor of claim 43, wherein one of the execution units is a wavefront processor.
- 46. A computer system having memory and a processor, wherein the processor is capable of executing a plurality of instructions, including a first instruction, wherein the processor comprises:
a plurality of instruction pipelines; a plurality of result pipelines; and a plurality of reorder buffers, wherein each reorder buffer receives instructions from one instruction pipeline and issues instructions to a second instruction pipeline, wherein each reorder buffer receives data from one result pipeline and issues data to a second result pipeline and wherein each reorder buffer includes:
a register file having a plurality of registers, wherein each register includes a data entry and a tag field; and a register alias table having a plurality of register alias table entries, wherein each register alias table entry includes a pipeline field and a register field, wherein the pipeline field shows which instruction pipeline the first instruction was dispatched into and wherein the register field shows the register into which the first instruction will write its result.
- 47. The computer system according to claim 46, wherein each register alias table entry further includes a last field which points to the register alias table entry which previously was going to write to the first register.
- 48. The computer system according to claim 46, wherein each register further includes an alias field, wherein the alias field is capable of holding the register alias table entry which is assigned to write to that register.
- 49. In a computer system having a plurality of threads, including a first and second thread, a method of executing more than one thread at a time, the method comprising:
providing a first and a second reorder buffer; reading first instructions and first operands associated with the first thread from the first reorder buffer; executing one of the first instructions and storing a result in the first reorder buffer, wherein storing the result includes marking the result with a tag associating the result with the first thread; reading second instructions and second operands associated with the second thread from the second reorder buffer; and executing one of the second instructions and storing a result in the second reorder buffer, wherein storing the result includes marking the result with a tag associating the result with the second thread.
- 50. In a counterflow pipeline processing system having an instruction pipeline and a data pipeline, both of which feed back into a reorder buffer, a method of recovering from incorrect speculations, wherein the method comprises:
detecting a mispredicted branch, wherein the mispredicted branch includes a first instruction; invalidating, in the reorder buffer, all instructions after the mispredicted branch; if the first instruction is in the instruction pipeline and can execute, executing the instruction and invalidating results associated with that instruction when they reach the reorder buffer; and if the instruction reaches the end of the instruction pipeline, deleting the instruction.
- 51. A method of controlling data speculation, comprising:
providing an instruction; obtaining an operand associated with the instruction, wherein obtaining an operand includes:
determining whether the operand is valid; determining whether the operand is a speculative value; and marking the operand as a function of whether the operand is valid and whether the operand is a speculative value; executing the instruction to generate a result as a function of the operand; and if the operand was a speculative value, checking for a nonspeculative value for the operand, comparing the nonspeculative value against the speculative value and, if the speculative value was correct, saving the result.
- 52. The method of controlling data speculation according to claim 51, wherein marking the operand includes attaching a valid bit and a speculative bit to the operand.
- 53. A method of controlling data speculation within a computer system having a processor, the method comprising:
providing an instruction; obtaining an operand associated with the instruction, wherein obtaining an operand includes:
determining whether the operand is valid; determining whether the operand is a speculative value; and marking the operand as a function of whether the operand is valid and whether the operand is a speculative value; executing the instruction to generate a result as a function of the operand; and if the operand was a speculative value, checking for a nonspeculative value for the operand, comparing the nonspeculative value against the speculative value and, if the speculative value was correct, saving the result.
- 54. The method of controlling data speculation according to claim 53, wherein marking the operand includes attaching a valid bit and a speculative bit to the operand.
- 55. The method of controlling data speculation according to claim 53, wherein executing the instruction to generate a result includes preventing the speculative value from being retired to permanent memory.
- 56. The method of controlling data speculation according to claim 53, wherein executing the instruction to generate a result includes:
assigning a priority based on whether the operands of the instructions are real or speculated values; and giving the instructions with real values a higher priority in execution.
- 57. The method of controlling data speculation according to claim 53, wherein the processor includes a pipeline and wherein comparing the nonspeculative value against the speculative value includes:
comparing the nonspeculative value against the speculative value while the instruction still resides in the pipeline; and modifying the valid and speculative bits as a function of the comparison.
- 58. A microprocessor comprising:
a results pipeline; an instruction pipeline; a reorder buffer which provides instructions and operands to the instruction pipeline and receives results from the results pipeline; and control logic for data speculation, wherein the control logic includes: means for determining whether an operand associated with an instruction is valid and a speculative value; means for marking the operand as a function of whether the operand is valid and whether the operand is a speculative value; means for executing the instruction to generate a result as a function of the operand; and means for comparing a nonspeculative value against the speculative value; and if the operand was a speculative value, checking for a nonspeculative value for the operand, comparing the nonspeculative value against the speculative value and, if the speculative value was correct, saving the result.
- 59. A computer system comprising:
memory; and a processor; wherein the processor includes: a results pipeline; an instruction pipeline; a reorder buffer which provides instructions and operands to the instruction pipeline and receives results from the results pipeline; and control logic for data speculation, wherein the control logic includes: means for determining whether an operand associated with an instruction is valid and a speculative value; means for marking the operand as a function of whether the operand is valid and whether the operand is a speculative value; means for executing the instruction to generate a result as a function of the operand; and means for comparing a nonspeculative value against the speculative value; and if the operand was a speculative value, checking for a nonspeculative value for the operand, comparing the nonspeculative value against the speculative value and, if the speculative value was correct, saving the result.
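The data speculation steps recited in claims 51 to 57 can be illustrated with a minimal Python sketch. It is not the claimed hardware: the names `Operand`, `obtain_operand`, and `execute_with_speculation` are hypothetical, the `resolve` callback stands in for whatever mechanism later supplies the nonspeculative value, and both discarding the result on a wrong speculation and the specific way the bits are updated are assumptions, since the claims only state that a result is saved when the speculative value was correct.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Operand:
    """An operand value carrying a valid bit and a speculative bit."""
    value: Optional[int]
    valid: bool
    speculative: bool


def obtain_operand(actual: Optional[int], predicted: Optional[int]) -> Operand:
    """Mark an operand by whether it is valid and whether it is speculative.

    `actual` and `predicted` are hypothetical inputs: the real value if it is
    already available, otherwise a predicted value, otherwise nothing.
    """
    if actual is not None:
        return Operand(actual, valid=True, speculative=False)
    if predicted is not None:
        return Operand(predicted, valid=True, speculative=True)
    return Operand(None, valid=False, speculative=False)


def execute_with_speculation(
    op: Callable[[int], int],
    operand: Operand,
    resolve: Callable[[], int],
) -> Optional[int]:
    """Execute `op` on the operand; return the result only if it may be saved."""
    if not operand.valid:
        return None  # no usable operand yet
    result = op(operand.value)
    if not operand.speculative:
        return result  # real operand: the result can be saved directly
    # Speculative operand: check for the nonspeculative value and compare.
    actual = resolve()
    if actual == operand.value:
        operand.speculative = False  # assumption: confirmed value becomes real
        return result
    operand.valid = False  # assumption: a wrong guess invalidates the operand
    return None
```

In this sketch, a call such as `execute_with_speculation(lambda x: x * 2, obtain_operand(None, 7), lambda: 7)` saves the result because the predicted operand matches the value that later resolves, while a mismatch returns `None` so that nothing computed from the wrong guess is retired.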
Parent Case Info
[0001] This application is a Divisional of U.S. application Ser. No. 09/792,781, filed Feb. 23, 2001, which is a Divisional of U.S. application Ser. No. 09/638,974 filed Aug. 15, 2000, now issued as U.S. Pat. No. 6,247,115, which is a Divisional of U.S. application Ser. No. 09/164,016 filed Sep. 30, 1998, now issued as U.S. Pat. No. 6,163,839.
Divisions (3)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09792781 | Feb 2001 | US |
| Child | 10054632 | Jan 2002 | US |
| Parent | 09638974 | Aug 2000 | US |
| Child | 09792781 | Feb 2001 | US |
| Parent | 09164016 | Sep 1998 | US |
| Child | 09638974 | Aug 2000 | US |