Claims
- 1. An apparatus comprising:
fetch logic to fetch first instruction information for a first thread and to fetch second instruction information for a second thread, the fetch logic further to mark the second instruction information as speculative; and
blocker logic to prevent data associated with a store instruction executed by the second thread from being stored in a memory system and to prevent forwarding of the data associated with the store instruction to the first thread.
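The store-blocker and dependence-blocker behavior recited in claim 1 can be sketched in software purely for illustration; the claim covers hardware logic, and all class and method names below are hypothetical:

```python
class SpeculativeStoreBuffer:
    """Illustrative model of the blocker logic: stores from the
    speculative (second) thread are buffered, never committed to
    memory, and never forwarded to the non-speculative thread."""

    def __init__(self):
        self.memory = {}        # committed architectural state
        self.spec_entries = {}  # speculative stores, held back

    def store(self, thread_is_speculative, addr, data):
        if thread_is_speculative:
            # Store blocker: hold the data, do not commit to memory.
            self.spec_entries[addr] = data
        else:
            self.memory[addr] = data

    def load(self, thread_is_speculative, addr):
        if thread_is_speculative and addr in self.spec_entries:
            # The speculative thread may observe its own buffered stores.
            return self.spec_entries[addr]
        # Dependence blocker: the non-speculative thread never sees
        # data buffered on behalf of the speculative thread.
        return self.memory.get(addr)
```

In this sketch, a load issued by the non-speculative thread for an address written only speculatively returns the committed value (here, nothing), mirroring the forwarding prevention of claim 1.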
- 2. The apparatus of claim 1, wherein blocker logic further comprises:
store blocker logic to prevent data associated with a store instruction executed by the second thread from being stored in the memory system; and dependence blocker logic to prevent forwarding of the data associated with the store instruction to the first thread.
- 3. The apparatus of claim 1, wherein fetch logic further comprises:
first fetch logic to fetch first instruction information for the first thread; and second fetch logic to fetch second instruction information for the second thread, wherein the second fetch logic is further to mark the second instruction information as speculative.
- 4. The apparatus of claim 3, wherein:
the memory system is further to store instructions; first fetch logic is further to fetch the first instruction information from the memory system; and second fetch logic is further to fetch the second instruction information from the memory system.
- 5. The apparatus of claim 1, wherein:
the second instruction information corresponds to the predicted execution control path of the first thread.
- 6. The apparatus of claim 1, further comprising:
a cache, accessible by both the first and second threads, to store the first and second instruction information.
- 7. The apparatus of claim 6, wherein:
the cache is a trace cache.
- 8. The apparatus of claim 6, wherein:
the cache is an execution instruction cache.
- 9. The apparatus of claim 3, wherein:
the first fetch logic and the second fetch logic are logically independent sequencers implemented in a single shared physical fetch unit.
- 10. A method comprising:
identifying a code region that is predicted to incur at least a predetermined quantity of performance loss during execution of the code region; identifying one or more spawning pairs that each includes a spawn point and a target point; selecting one of the one or more spawning pairs, the target point of the selected spawning pair being associated with the code region; and generating an enhanced binary code that includes one or more instructions to cause, during execution of a first thread, spawning of a second thread at the selected spawn point, the instructions further to cause the second thread to execute the instruction associated with the selected target point.
- 11. The method of claim 10, wherein:
the target point for each of the one or more identified spawning pairs is a control-quasi-independent point.
- 12. The method of claim 10, wherein:
identifying one or more spawning pairs further includes approximating a reaching probability for each of the spawning pairs.
- 13. The method of claim 12, wherein:
identifying one or more spawning pairs further includes identifying spawning pairs that have at least a threshold approximated reaching probability.
- 14. The method of claim 10, wherein:
selecting further includes determining that the selected spawning pair encompasses the code region.
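The selection method of claims 10 through 14 can be sketched as follows. The data layout, the numeric threshold, and the reading of "encompasses" (spawn point before the region, target point within it) are assumptions for illustration only:

```python
def select_spawning_pair(pairs, region, threshold=0.9):
    """Pick a (spawn, target) pair whose approximated reaching
    probability meets the threshold (claim 13) and which
    encompasses the costly code region (claim 14).

    pairs:  iterable of dicts with 'spawn', 'target', 'reach_prob'
    region: (start, end) instruction-address range predicted to
            incur the performance loss (claim 10)
    """
    start, end = region
    for pair in pairs:
        if pair["reach_prob"] < threshold:
            continue  # below the minimum approximated reaching probability
        # Assumed reading of "encompasses": the spawn point precedes
        # the region and the target point falls within it.
        if pair["spawn"] <= start and start <= pair["target"] <= end:
            return pair
    return None
```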
- 15. An article comprising:
a machine-readable storage medium having a plurality of machine accessible instructions which, if executed by a machine, cause the machine to perform operations comprising:
identifying a code region that is predicted to incur at least a predetermined quantity of performance loss during execution of the code region; identifying one or more spawning pairs that each includes a spawn point and a target point; selecting one of the one or more spawning pairs, the target point of the selected spawning pair being associated with the code region; and generating an enhanced binary code that includes one or more instructions to cause, during execution of a first thread, spawning of a second thread at the selected spawn point, the instructions further to cause the second thread to execute the instruction associated with the selected target point.
- 16. The article of claim 15, wherein:
the target point for each of the one or more identified spawning pairs is a control-quasi-independent point.
- 17. The article of claim 15, wherein:
instructions that provide for identifying one or more spawning pairs further include instructions that provide for approximating a reaching probability for each of the spawning pairs.
- 18. The article of claim 17, wherein:
instructions that provide for identifying one or more spawning pairs further include instructions that provide for identifying spawning pairs that have at least a threshold approximated reaching probability.
- 19. The article of claim 15, wherein:
instructions for selecting further include instructions that provide for determining that the selected spawning pair encompasses the code region.
- 20. A method comprising:
executing, in a second thread context, a current instruction associated with a speculative thread while concurrently executing one or more instructions associated with a non-speculative thread in a first thread context, wherein executing the current instruction further includes:
responsive to determining that instruction information for the current instruction is not present in a cache:
fetching instruction information for the current instruction; indicating that the instruction information is associated with the speculative thread; and placing the instruction information into the cache; wherein executing the current instruction further includes, responsive to the current instruction being a store instruction, blocking commission of store data associated with the store instruction to a memory.
- 21. The method of claim 20, wherein executing the current instruction further includes:
responsive to determining that the current instruction is a store instruction, preventing bypass of the store data to a load instruction executed by the non-speculative thread.
- 22. The method of claim 20, further comprising:
responsive to an instruction in the non-speculative thread, spawning the speculative thread in the second thread context.
- 23. The method of claim 20, wherein executing in a first thread context one or more instructions associated with a non-speculative thread further comprises:
preventing the forwarding of store data to the non-speculative thread, wherein the store data is associated with the speculative thread.
- 24. The method of claim 20, wherein:
executing in a first thread context one or more instructions associated with a non-speculative thread further includes fetching instruction information via a first logical fetch unit; and fetching instruction information for the current instruction further includes fetching instruction information via a second logical fetch unit.
- 25. The method of claim 20, wherein:
fetching instruction information further comprises fetching and decoding an instruction from an instruction cache.
- 26. The method of claim 20, wherein:
fetching instruction information further comprises building a trace.
- 27. The method of claim 26, wherein:
placing the instruction information into the cache further comprises placing the trace into a trace cache.
- 28. The method of claim 20, wherein:
placing the instruction information into the cache further comprises placing a decoded instruction in an execution instruction cache.
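The per-instruction flow of claims 20 through 28 can be read as the following sketch. The `Instr` record, the `fetch` callback, and the dictionary-based cache are invented stand-ins for the hardware structures the claims describe:

```python
from dataclasses import dataclass

@dataclass
class Instr:
    addr: int          # instruction address
    opcode: str        # e.g. "store"
    addr_operand: int = 0
    data: int = 0

def execute_speculative(instr, cache, store_buffer, fetch):
    """Execute one instruction in the speculative thread context."""
    # Claim 20: on a cache miss, fetch the instruction information,
    # mark it as belonging to the speculative thread, and install it.
    if instr.addr not in cache:
        info = fetch(instr.addr)
        info["speculative"] = True
        cache[instr.addr] = info
    # Claim 20: a speculative store is buffered rather than committed
    # to memory (and, per claim 21, is never bypassed to loads in the
    # non-speculative thread).
    if instr.opcode == "store":
        store_buffer.append((instr.addr_operand, instr.data))
```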
- 29. An article comprising:
a machine-readable storage medium having a plurality of machine accessible instructions; wherein, when the instructions are executed by a processor, the instructions provide for
executing, in a second thread context, a current instruction associated with a speculative thread while concurrently executing one or more instructions associated with a non-speculative thread in a first thread context, wherein the instructions that provide for executing the current instruction further include instructions that provide for:
responsive to determining that instruction information for the current instruction is not present in a cache:
fetching instruction information for the current instruction; indicating that the instruction information is associated with the speculative thread; and placing the instruction information into the cache; wherein instructions that provide for executing the current instruction further include instructions that provide for, responsive to the current instruction being a store instruction, blocking commission of store data associated with the store instruction to a memory.
- 30. The article of claim 29, wherein instructions that provide for executing the current instruction further provide for:
responsive to determining that the current instruction is a store instruction, preventing bypass of the store data to a load instruction executed by the non-speculative thread.
- 31. The article of claim 29, wherein the instructions further provide for:
responsive to a spawn instruction in the non-speculative thread, spawning the speculative thread in the second thread context.
- 32. The article of claim 29, wherein instructions that provide for executing in a first thread context one or more instructions associated with a non-speculative thread further provide for:
preventing the forwarding of store data to the non-speculative thread, wherein the store data is associated with the speculative thread.
- 33. The article of claim 29, wherein:
instructions that provide for executing in a first thread context one or more instructions associated with a non-speculative thread further provide for fetching instruction information via a first logical fetch unit; and instructions that provide for fetching instruction information for the current instruction further provide for fetching instruction information via a second logical fetch unit.
- 34. The article of claim 29, wherein:
instructions that provide for fetching instruction information further include instructions that provide for fetching and decoding an instruction from an instruction cache.
- 35. The article of claim 29, wherein:
instructions that provide for fetching instruction information further provide for building a trace.
- 36. The article of claim 35, wherein:
instructions that provide for placing the instruction information into the cache further provide for placing the trace into a trace cache.
- 37. The article of claim 29, wherein:
instructions that provide for placing the instruction information into the cache further provide for placing a decoded instruction in an execution instruction cache.
- 38. An apparatus comprising:
means for executing, in a second thread context, a current instruction associated with a speculative thread while concurrently executing one or more instructions associated with a non-speculative thread in a first thread context, wherein means for executing the current instruction further includes:
means for, responsive to determining that instruction information for the current instruction is not present in a cache:
fetching instruction information for the current instruction; indicating that the instruction information is associated with the speculative thread; and placing the instruction information into the cache; wherein means for executing the current instruction further includes, responsive to the current instruction being a store instruction, means for blocking commission of store data associated with the store instruction to a memory.
- 39. The apparatus of claim 38, wherein means for executing the current instruction further includes:
means for, responsive to determining that the current instruction is a store instruction, preventing bypass of the store data to a load instruction executed by the non-speculative thread.
- 40. The apparatus of claim 38, further comprising:
means for, responsive to an instruction in the non-speculative thread, spawning the speculative thread in the second thread context.
- 41. The apparatus of claim 38, wherein means for executing in a first thread context one or more instructions associated with a non-speculative thread further comprises:
means for preventing the forwarding of store data to the non-speculative thread, wherein the store data is associated with the speculative thread.
- 42. The apparatus of claim 38, wherein:
means for executing in a first thread context one or more instructions associated with a non-speculative thread further includes first means for fetching instruction information; and means for fetching instruction information for the current instruction further includes second means for fetching instruction information.
- 43. The apparatus of claim 38, wherein:
means for fetching instruction information further comprises means for fetching and decoding an instruction from an instruction cache.
- 44. The apparatus of claim 38, wherein:
means for fetching instruction information further comprises means for building a trace.
- 45. The apparatus of claim 44, wherein:
means for placing the instruction information into the cache further comprises means for placing the trace into a trace cache.
- 46. The apparatus of claim 38, wherein:
means for placing the instruction information into the cache further comprises means for placing a decoded instruction in an execution instruction cache.
- 47. A system comprising:
a dynamic random access memory; a first fetch unit to fetch first instruction information for a first thread; a second fetch unit to fetch second instruction information for a second thread; a store blocker mechanism to prevent data associated with a store instruction executed by the second thread from being stored in the memory; and a dependence blocker mechanism to prevent forwarding of the data associated with the store instruction to the first thread.
- 48. The system of claim 47, further comprising:
a memory hierarchy to store instructions, the memory hierarchy including the dynamic random access memory; wherein the first fetch unit is further to fetch the first instruction information from the memory hierarchy; and wherein the second fetch unit is further to fetch the second instruction information from the memory hierarchy.
- 49. The system of claim 47, further comprising:
a cache to store the first and second instruction information.
- 50. The system of claim 49, wherein:
the cache is a trace cache.
- 51. The system of claim 49, wherein:
the cache is an execution instruction cache.
- 52. The system of claim 47, wherein:
the first fetch unit and the second fetch unit are logically independent sequencers implemented in a single shared physical fetch unit.
- 53. The system of claim 47, wherein:
the first fetch unit and the second fetch unit are physically distinct from each other.
- 54. A compiler comprising:
a cache miss analyzer to determine a code region expected to incur at least a predetermined quantity of performance loss due to cache misses during execution of the code region; a spawning pair identifier to identify one or more candidate spawning pairs that have at least a minimum approximated reaching probability; a spawning pair selector to select one of the one or more candidate spawning pairs such that the selected spawning pair encompasses the code region; and a code generator to generate one or more instructions that provide for spawning a speculative thread at a spawn point associated with the selected spawning pair, the instructions further providing for the speculative thread to execute a target point associated with the selected spawning pair.
- 55. The compiler of claim 54, wherein:
the spawning pair identifier is further to identify candidate spawning pairs such that a target point of a particular spawning pair is a control-quasi-independent point associated with a spawn point of the particular spawning pair.
- 56. The compiler of claim 54, wherein:
the code generator is further to generate one or more instructions to speculatively compute a live-in value for the speculative thread.
- 57. The compiler of claim 54, wherein:
the code generator is further to generate instructions that provide for speculative preexecution, in the speculative thread, of one or more instructions from the code region.
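The compiler of claims 54 through 57 can be summarized as a four-stage pipeline: analyze, identify, select, generate. This sketch strings the stages together with invented data structures and thresholds; it is not the claimed implementation:

```python
def compile_with_helper_threads(profile, pairs, codegen,
                                loss_threshold=1000, prob_threshold=0.9):
    """Illustrative flow mirroring claim 54's four components."""
    # 1. Cache-miss analyzer: find the region whose predicted miss
    #    penalty (cycles, per this sketch) exceeds the threshold.
    region = max(profile, key=lambda r: r["miss_cycles"])
    if region["miss_cycles"] < loss_threshold:
        return None
    # 2. Spawning-pair identifier: keep pairs with at least the
    #    minimum approximated reaching probability.
    candidates = [p for p in pairs if p["reach_prob"] >= prob_threshold]
    # 3. Spawning-pair selector: pick a pair that encompasses the region
    #    (assumed reading: spawn before the region, target at its start).
    for p in candidates:
        if p["spawn"] <= region["start"] <= p["target"]:
            # 4. Code generator: emit instructions that spawn the
            #    speculative thread at the spawn point, executing from
            #    the target point.
            return codegen(p["spawn"], p["target"])
    return None
```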
RELATED APPLICATIONS
[0001] The present patent application is a continuation-in-part of prior U.S. patent application Ser. No. 10/356,435, filed on Jan. 31, 2003, entitled “Control-Quasi-Independent-Points Guided Speculative Multithreading.”
Continuation in Parts (1)

|        | Number   | Date     | Country |
|--------|----------|----------|---------|
| Parent | 10356435 | Jan 2003 | US      |
| Child  | 10423633 | Apr 2003 | US      |