Claims
- 1. In a data processing system having a memory hierarchy including a cache and a lower-level memory system, a method comprising the steps of:
receiving a data element having a write with inject attribute associated therewith from a data producer; forwarding said data element to the cache without accessing the lower-level memory system; and updating at least one cache line containing said data element in the cache.
- 2. The method of claim 1 wherein said step of receiving said data element comprises the step of receiving said data element at an input of a microprocessor.
- 3. The method of claim 1 wherein said step of receiving said data element comprises the step of receiving at least a portion of a data communication frame.
- 4. The method of claim 1 wherein said step of receiving said data element comprises the step of receiving said data element with said write with inject attribute using a link substantially compatible with the HyperTransport™ I/O Link Specification, Revision 1.03.
- 5. The method of claim 4 wherein said step of receiving further comprises the step of detecting the write with inject attribute from a reserved command field encoding in a request packet.
- 6. The method of claim 1 further comprising the step of performing said steps of forwarding said data element to the cache and updating at least one cache line containing said data element in the cache only if the cache is the owner of said data element.
- 7. The method of claim 1 wherein said step of forwarding comprises the step of temporarily storing said data element in a buffer.
- 8. The method of claim 7 further comprising the step of performing said step of forwarding conditionally depending upon whether the data processing system requires said buffer for another purpose before said step of forwarding is performed.
- 9. The method of claim 7 wherein said step of forwarding further comprises the steps of:
sending a probe prefetch to a central processing unit coupled to the cache; issuing a read request by the central processing unit in response to said probe prefetch; sending said data element to the cache in response to said read request; and removing said data element from said buffer.
- 10. The method of claim 9 wherein said step of sending said probe prefetch comprises the step of sending a broadcast probe prefetch to a plurality of nodes in the data processing system.
- 11. The method of claim 9 wherein said step of sending said probe prefetch comprises the step of sending a directed probe prefetch to a node associated with the cache.
- 12. The method of claim 1 wherein said step of forwarding comprises the step of checking a directory associated with the cache and sending said data element to the cache only if the cache line is owned by the cache and is present in the directory in a predetermined state.
- 13. The method of claim 12 wherein said step of updating said at least one cache line further comprises the step of updating said at least one cache line only if said at least one cache line is present in the directory in a reserved state or a modified state.
- 14. The method of claim 1 wherein said step of updating comprises the step of marking said at least one cache line as modified.
- 15. The method of claim 1 further comprising the step of reading said data element from the cache using an I/O driver program executing on a central processing unit.
- 16. A method for use in a data processing system having a plurality of nodes each including a central processing unit and an associated cache comprising the steps of:
receiving a write with inject packet having a data element associated therewith from a data producer; checking a directory to see if said data element is already present in said directory in a predetermined state; if said data element is not present in said directory in said predetermined state, creating a directory entry for said data element and writing said data element to a lower-level memory system; and if said data element is already present in said directory in said predetermined state, forwarding said data element to a cache that is the owner of said data element without accessing said lower-level memory system.
- 17. The method of claim 16 wherein said step of creating said directory entry comprises the step of creating said directory entry in a written state.
- 18. The method of claim 17 further comprising the step of changing said directory entry from said written state to a reservation state in response to a central processing unit associated with said cache reading said data element.
- 19. The method of claim 18 wherein said step of checking said directory comprises the step of checking said directory to see if said data element is present in said directory in said reservation state.
- 20. The method of claim 19 further comprising the step of changing said directory entry from said reservation state to a modified state in response to said step of forwarding said data element to said cache and to said cache updating at least one cache line associated with said data element.
- 21. The method of claim 20 wherein said step of checking said directory further comprises the step of checking said directory to see if said data element is present in said directory in said modified state.
- 22. The method of claim 16 wherein said step of forwarding comprises the step of temporarily storing said data element in a buffer.
- 23. The method of claim 22 further comprising the step of performing said step of forwarding conditionally depending upon whether the data processing system requires said buffer for another purpose before said step of forwarding is performed.
- 24. The method of claim 23 wherein said step of forwarding further comprises the steps of:
sending a probe prefetch to a central processing unit associated with said cache; receiving a read request by said central processing unit in response to said probe prefetch; sending said data element to said cache in response to said read request; and removing said data element from said buffer.
- 25. The method of claim 24 wherein said step of sending said probe prefetch comprises the step of sending a directed probe prefetch to a node having a cache that is the owner of said data element.
- 26. A data processor comprising:
a central processing unit including a cache, said central processing unit adapted to initiate a prefetch read in response to receiving a probe prefetch; a host bridge coupled to said central processing unit and adapted to receive a write with inject packet for a data element from a data producer; and a memory controller coupled to said central processing unit and to said host bridge and adapted to be coupled to a lower level memory system, and having an output coupled to said central processing unit, wherein said memory controller includes a buffer and stores said data element from said host bridge in said buffer, provides said probe prefetch to said central processing unit in response to receiving said data element, and provides said data element from said buffer in response to a prefetch read from said central processing unit.
- 27. The data processor of claim 26 wherein said central processing unit, said host bridge, and said memory controller are coupled together by means of a crossbar switch.
- 28. The data processor of claim 26 wherein said host bridge is adapted to be coupled to said data producer using a link substantially compatible with the HyperTransport™ I/O Link Specification, Revision 1.03.
- 29. The data processor of claim 26 wherein said memory controller temporarily stores data in said buffer before providing it to said lower level memory system, and wherein said memory controller removes said data element from said buffer without writing said data element to said lower-level memory system if said central processing unit reads said data element before said memory controller writes it to memory.
- 30. The data processor of claim 26 wherein said central processing unit, said host bridge, and said memory controller are combined into a single integrated circuit.
- 31. A data processor comprising:
a central processing unit including a cache; a host bridge coupled to said central processing unit and adapted to receive a write with inject packet for a data element from a data producer; and a directory/memory controller coupled to said central processing unit and to said host bridge and adapted to be coupled to a lower level memory system, and having an output coupled to said central processing unit, wherein said directory/memory controller is responsive to said write with inject packet to check a directory thereof to see if a cache state of a line associated with said data element is in a predetermined state, and if so to send said data element to said central processing unit for storage in said cache without accessing said lower level memory system.
- 32. The data processor of claim 31 wherein said central processing unit, said host bridge, and said directory/memory controller are coupled together by means of a crossbar switch.
- 33. The data processor of claim 31 wherein said host bridge is adapted to be coupled to said data producer using a link substantially compatible with the HyperTransport™ I/O Link Specification, Revision 1.03.
- 34. The data processor of claim 31 wherein said directory/memory controller temporarily stores data in a buffer before providing it to said lower level memory system, and wherein said directory/memory controller removes said data element from said buffer without writing said data element to said lower-level memory system if said central processing unit reads said data element before said directory/memory controller writes it to memory.
- 35. The data processor of claim 31 wherein said central processing unit, said host bridge, and said directory/memory controller are combined into a single integrated circuit.
- 36. A data processor comprising:
a central processing unit including a cache; a host bridge coupled to said central processing unit and adapted to receive a write with inject packet for a data element from a data producer; and means coupled to said central processing unit, to said host bridge, and to a lower-level memory system for forwarding said data element to said central processing unit for storage in said cache without accessing said lower-level memory system.
- 37. The data processor of claim 36 wherein said central processing unit is adapted to initiate a prefetch read in response to receiving a probe prefetch and said means for forwarding comprises a memory controller including a buffer, wherein said memory controller stores said data element from said host bridge in said buffer, provides said probe prefetch to said central processing unit in response to receiving said data element, and provides said data element from said buffer in response to said prefetch read from said central processing unit.
- 38. The data processor of claim 37 wherein said memory controller temporarily stores data in said buffer before providing it to said lower level memory system, and wherein said memory controller removes said data element from said buffer without writing said data element to said lower-level memory system if said central processing unit reads said data element before said memory controller writes it to memory.
- 39. The data processor of claim 36 wherein said means for forwarding comprises a directory/memory controller, wherein said directory/memory controller is responsive to said write with inject packet to check a directory thereof to see if a cache state of a line associated with said data element is in a predetermined state and is owned by said cache, and if so to send said data element to said central processing unit for storage in said cache without accessing said lower level memory system.
- 40. The data processor of claim 36 wherein said directory/memory controller temporarily stores data in a buffer before providing it to said lower level memory system, and wherein said directory/memory controller removes said data element from said buffer without writing said data element to said lower-level memory system if said central processing unit reads said data element before said directory/memory controller writes it to memory.
- 41. The data processor of claim 36 wherein said central processing unit, said host bridge, and said means for forwarding are coupled together by means of a crossbar switch.
- 42. The data processor of claim 36 wherein said host bridge is adapted to be coupled to said data producer using a link substantially compatible with the HyperTransport™ I/O Link Specification, Revision 1.03.
- 43. The data processor of claim 36 wherein said central processing unit, said host bridge, and said means for forwarding are combined into a single integrated circuit.
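The directory-based method of claims 16 through 21 can be read as a small state machine. The following Python sketch is illustrative only and is not part of the claims: the names `Node`, `LineState`, `write_with_inject`, and `cpu_read`, the dictionary-backed directory, cache, and memory, and the `memory_writes` counter are all assumptions introduced here. The sketch treats the reservation and modified states as the "predetermined state", as claims 19 and 21 suggest.

```python
from dataclasses import dataclass, field
from enum import Enum


class LineState(Enum):
    """Directory states named in claims 17 through 21."""
    WRITTEN = "written"
    RESERVATION = "reservation"
    MODIFIED = "modified"


@dataclass
class Node:
    """Hypothetical single node: one directory, one cache, one memory."""
    directory: dict = field(default_factory=dict)  # address -> LineState
    cache: dict = field(default_factory=dict)      # address -> data
    memory: dict = field(default_factory=dict)     # lower-level memory
    memory_writes: int = 0                         # counts memory accesses

    def write_with_inject(self, addr, data):
        """Handle a write with inject packet from a data producer."""
        state = self.directory.get(addr)
        if state in (LineState.RESERVATION, LineState.MODIFIED):
            # Line is in the predetermined state: forward to the owning
            # cache without accessing lower-level memory (claim 16), then
            # mark the directory entry as modified (claim 20).
            self.cache[addr] = data
            self.directory[addr] = LineState.MODIFIED
            return "injected"
        # Otherwise create a directory entry in the written state and
        # write the data element to lower-level memory (claims 16, 17).
        self.directory[addr] = LineState.WRITTEN
        self.memory[addr] = data
        self.memory_writes += 1
        return "written_to_memory"

    def cpu_read(self, addr):
        """CPU reads the data element, e.g. from an I/O driver."""
        if addr in self.cache:
            return self.cache[addr]
        data = self.memory[addr]
        self.cache[addr] = data
        if self.directory.get(addr) is LineState.WRITTEN:
            # A CPU read moves the entry from written to reservation
            # (claim 18).
            self.directory[addr] = LineState.RESERVATION
        return data


node = Node()
first = node.write_with_inject(0x40, b"frame0")   # goes to memory
seen = node.cpu_read(0x40)                        # written -> reservation
second = node.write_with_inject(0x40, b"frame1")  # injected into cache
```

In this sketch only the first write reaches lower-level memory. Once the central processing unit has read the line, later write with inject packets for the same address are delivered to the cache directly.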
CROSS REFERENCE TO RELATED COPENDING APPLICATION
[0001] Related subject matter is contained in copending U.S. patent application Ser. No. 10/261,642, filed Sep. 30, 2002, entitled “Method and Apparatus for Reducing Overhead in a Data Processing System with a Cache” invented by Patrick Conway and assigned to the assignee hereof.