Claims
- 1. A computer system, comprising: a plurality of agents, interconnected by a common bus, the agents including at least one processor and at least one memory, the processor to execute a memory-to-memory copy instruction by: atomically reading data from a source location in the memory to the processor, atomically writing the data from the processor to a target location, and unlocking the source location to other agents following the atomic reading and before the atomic writing.
- 2. The computer system of claim 1, wherein the processor comprises: a processor core comprising a plurality of registers, a cache system, and a bus interface coupled to the processor core and the cache system.
- 3. The computer system of claim 2, wherein the data item has a width that exceeds a width of the registers.
- 4. The computer system of claim 2, wherein the data item has a width that equals a width of cache lines within the cache system.
- 5. The computer system of claim 1, further comprising: a plurality of clusters coupled together by a network fabric, wherein a first cluster includes the agents, a third agent of which is a bridge to the network fabric, the second cluster comprises a second plurality of agents coupled by a common second bus, the second agents including at least one second processor, second memory and second bridge.
- 6. The computer system of claim 5, wherein: the first and second memories are configured to provide a universal memory space, and the source and target locations can be in either memory.
- 7. The computer system of claim 1, wherein execution of the memory-to-memory copy instruction further causes the processor to return a version stamp provided in the data to an application layer of a program.
- 8. The computer system of claim 1, wherein the processor is a multi-threaded processor.
- 9. The computer system of claim 8, wherein, during operation of the atomic read of one thread of the processor, the source location is locked against use by other threads of the processor.
- 10. The computer system of claim 8, wherein, during operation of the atomic write of one thread of the processor, the target location is locked against use by other threads of the processor.
- 11. A method of implementing an atomic memory-to-memory copy of data, comprising: atomically reading data to a thread from a source location, after the atomic reading, unlocking the source location for use by other threads, and atomically writing the data from a thread to a target location.
- 12. The method of claim 11, further comprising, before the atomic writing, reading data from the source location by another thread.
- 13. The method of claim 11, wherein the reading and writing respectively transfer a quantity of data larger than an internal register of the thread.
- 14. The method of claim 11, wherein the reading and writing respectively transfer a quantity of data up to a cache line of a computer system in which the thread is located.
- 15. The method of claim 11, wherein the reading is directed to a predetermined address in a system memory, a first portion of the address representing a cache line from which the data is to be read and a second portion of the address representing a location within the cache line where a version stamp is located.
- 16. The method of claim 15, wherein, in a system having aligned cache lines of L bytes in length, the second address portion is log2(L) in length.
- 17. The method of claim 15, wherein a length of the version stamp is defined by an instruction used by software to invoke the method.
- 18. A data transfer method, comprising: copying an array of data from a first space in a memory to a second space in a memory, the copying comprising, for each location in the first space: locking the location in the first space, reading a data unit from the location in the first space to a thread, releasing the lock when the reading concludes, and writing the data unit to a location in the second space; wherein, for each location in the first space, other threads are permitted access to the location before the respective reading occurs and after the respective writing occurs on the data unit associated therewith.
- 19. The data transfer method of claim 18, further comprising: prior to each reading, storing a first version stamp associated with the data unit corresponding thereto, subsequent to the reading, determining a second version stamp associated with the corresponding data unit, and comparing the first and second version stamps, and if the version stamps do not match, repeating the copying with respect to the corresponding data unit.
- 20. The data transfer method of claim 19, wherein the array includes a single version stamp.
- 21. The data transfer method of claim 19, wherein each data unit includes a version stamp.
- 22. A method comprising: atomically reading data from a source location to a thread, and atomically acquiring exclusive ownership of a target location and writing the data to the target location, unlocking the source location for use by other threads after the atomically reading and before the atomic acquiring and writing.
- 23. The method of claim 22, wherein the reading and writing respectively transfer a quantity of data larger than an internal register of the thread.
- 24. The method of claim 22, wherein the reading and writing respectively transfer a quantity of data up to a cache line of a computer system in which the thread is located.
- 25. The method of claim 22, wherein the reading is directed to a predetermined address in a system memory, a first portion of the address representing a cache line from which the data is to be read and a second portion of the address representing a location within the cache line where a version stamp is located.
- 26. The method of claim 25, wherein a length of the version stamp is defined by an instruction used by software to invoke the method.
- 27. The method of claim 25, wherein, in a system having aligned cache lines of L bytes in length, the second portion is log2(L) in length.
- 28. The method of claim 22, wherein the atomic reading operation and the atomic acquisition-and-writing operation are performed pursuant to execution of a single memory-to-memory copy instruction.
- 29. The method of claim 28, further comprising, upon conclusion of the instruction, determining a version number of the data.
- 30. The method of claim 22, wherein the atomic reading operation and the atomic acquisition-and-writing operation respectively read the data to and write the data from a processor.
- 31. The method of claim 22, further comprising: prior to the reading, storing a first version stamp associated with the data, subsequent to the reading, determining a second version stamp associated with the data, and comparing the first and second version stamps, and if the version stamps do not match, repeating the method.
- 32. The method of claim 22, further comprising: prior to the reading, storing a first version stamp associated with the data, subsequent to the reading, determining a second version stamp associated with the data, and comparing the first and second version stamps, and if the version stamps do not match, incrementing a counter, and if the counter exceeds a predetermined value, locking the source location and copying the data from the source location to the target location while the source location is continuously locked.
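For illustration only, the sequence recited in claims 22, 31 and 32 can be approximated in software. The following is a minimal sketch, not the claimed hardware instruction: it assumes each memory location is modeled by a hypothetical `Location` object guarded by an ordinary mutex, that the version stamp is an integer stored alongside the data, and that the stamp comparison is made before and after one read-then-write pass. The names `Location`, `atomic_read`, `atomic_write`, `read_stamp`, `copy_with_retry` and the parameter `max_mismatches` are all illustrative assumptions.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Location:
    """Hypothetical stand-in for one lockable memory location."""
    data: bytes = b""
    version: int = 0  # version stamp kept with the data (assumed integer)
    lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )


def read_stamp(loc):
    """Return the current version stamp of a location."""
    with loc.lock:
        return loc.version


def atomic_read(src):
    """Lock the source, take a snapshot, and unlock before returning."""
    with src.lock:
        return src.data, src.version


def atomic_write(dst, data, version):
    """Acquire exclusive ownership of the target and write the snapshot."""
    with dst.lock:
        dst.data = data
        dst.version = version


def copy_with_retry(src, dst, max_mismatches=3):
    """Copy src to dst; the source is unlocked between read and write.

    If the stamp changes across a pass, the pass is repeated. Once the
    mismatch counter exceeds max_mismatches, fall back to copying while
    the source stays locked for the whole transfer.
    """
    mismatches = 0
    while True:
        first = read_stamp(src)          # stamp stored prior to the reading
        data, stamp = atomic_read(src)   # source is unlocked after this call
        atomic_write(dst, data, stamp)   # target locked only for the write
        second = read_stamp(src)         # stamp determined after the reading
        if first == second:
            return stamp
        mismatches += 1
        if mismatches > max_mismatches:
            with src.lock:               # source continuously locked
                atomic_write(dst, src.data, src.version)
                return src.version


src = Location(data=b"x" * 64, version=7)
dst = Location()
returned_stamp = copy_with_retry(src, dst)
```

In this sketch the common path never holds both locks at once, which mirrors the claimed unlocking of the source before the write. Only the fallback path holds the source lock across the write, so it assumes the source and target are distinct objects.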
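The address layout recited in claims 15, 16, 25 and 27 can likewise be illustrated with a short sketch. It assumes a 64-byte aligned cache line, which is an example value and not taken from the claims, and the function name `split_address` is hypothetical. The low log2(L) bits are treated as the offset of the version stamp within the line, and the remaining high bits identify the line.

```python
from typing import Tuple


def split_address(address: int, line_bytes: int = 64) -> Tuple[int, int]:
    """Split an address into (cache line base, offset of the stamp in the line).

    line_bytes is the aligned cache line length L; it must be a power of two
    so that the offset occupies exactly log2(L) low-order bits.
    """
    if line_bytes <= 0 or line_bytes & (line_bytes - 1):
        raise ValueError("line_bytes must be a power of two")
    offset_bits = line_bytes.bit_length() - 1        # log2(L)
    line_base = (address >> offset_bits) << offset_bits
    stamp_offset = address & ((1 << offset_bits) - 1)
    return line_base, stamp_offset


line_base, stamp_offset = split_address(0x1238, line_bytes=64)
# line_base == 0x1200, stamp_offset == 0x38
```

With L = 64 the offset field is 6 bits wide, so one address selects both the line to copy and the position of the stamp inside it.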
CROSS-REFERENCE TO RELATED APPLICATION
This application is a continuation application that claims the benefit of U.S. patent application Ser. No. 10/230,288 (filed Aug. 29, 2002) (allowed Jan. 31, 2003), which is a divisional application of U.S. patent application Ser. No. 09/736,433 dated Dec. 15, 2000, now U.S. Pat. No. 6,502,170 B2, issued Dec. 31, 2002, which applications are incorporated herein in their entireties.
Continuations (1)
| | Number | Date | Country |
| Parent | 10/230288 | Aug 2002 | US |
| Child | 10/379716 | | US |