Claims
- 1. An array processing system comprising:
a digital signal processor (DSP); a direct memory access (DMA) controller; a plurality of processing element (PE) local memories; a direct memory access bus utilized by the DMA controller to access the plurality of PE local memories; a system memory; a system control bus (SCB) connected to the DMA controller; and a system data bus (SDB) connecting the DMA controller and a plurality of devices, wherein the DMA controller transfers data between the plurality of devices on the SDB.
- 2. The array processing system of claim 1 wherein the system memory is on the SDB and the plurality of PE local memories are on a direct memory access bus as controlled by the DMA controller to move data between the system memory and the plurality of PE local memories.
- 3. The array processing system of claim 1 wherein the DSP functions as an SCB master and utilizes the SCB to program the DMA controller with read and write addresses and register values to initiate control operations and read status.
- 4. The array processing system of claim 1 further comprising a host processor connected to the SCB, and which both functions as an SCB master and utilizes the SCB to program the DMA controller with read and write addresses and register values to initiate control operations and read status.
- 5. The array processing system of claim 1 wherein the DMA controller further utilizes the SCB to send synchronization messages to other SCB bus slaves.
- 6. The array processing system of claim 5 wherein the other SCB bus slaves comprise DSP control registers or a host input/output block.
- 7. The array processing system of claim 5 further comprising a host processor connected to the SCB, and wherein the DSP or the host processor can poll registers in bus slaves on the SCB to receive status data from the DMA controller.
- 8. The array processing system of claim 1 wherein the DMA controller operates to perform write operations to slave addresses which are programmed to cause interrupt side effects to the DSP allowing DMA controller messages to be handled by interrupt service routines.
- 9. A direct memory access (DMA) controller for supporting efficient data transfers concurrent with host processor computation comprising:
a first transfer controller which can operate as an independent processor or can work together with another transfer controller to carry out data transfers; a system control bus (SCB) connected to the transfer controller; a system data bus (SDB) connected to the transfer controller; the first transfer controller operating as both a bus master and a bus slave on both the SCB and the SDB.
- 10. The DMA controller of claim 9 further comprising a second transfer controller which can operate as an independent processor or can work together with another transfer controller to carry out data transfers.
- 11. The DMA controller of claim 9 further comprising a first DMA bus independently connecting the first transfer controller to a plurality of local memories.
- 12. The DMA controller of claim 10 further comprising a second DMA bus independently connecting the second transfer controller to a plurality of local memories.
- 13. The DMA controller of claim 11 wherein the plurality of local memories comprise separate data random access memories (RAM) for at least four processing elements (PEs), data RAM for a sequence processing element and instruction RAM for the sequence processing element.
- 14. A transfer controller comprising:
a set of execution units including an instruction control unit (ICU), a system transfer unit (STU), a core transfer unit (CTU), and an event control unit (ECU); an inbound data queue (IDQ) comprising a data first in first out buffer; an outbound data queue (ODQ) comprising a data first in first out buffer; a system data bus (SDB); a system control bus (SCB); and a direct memory access (DMA) bus.
- 15. The transfer controller of claim 14 wherein the STU is connected to the system data bus (SDB) and controls the writing of data from the SDB to the IDQ, and wherein data is read from the IDQ under control of the CTU to be sent through the DMA bus to core memories or to the ICU in the case of instruction fetches.
- 16. The transfer controller of claim 14 wherein the CTU is connected to the DMA bus and data from the DMA bus is written to the ODQ under control of the CTU to be sent through the SDB to an SDB device or memory under control of the STU.
- 17. The transfer controller of claim 14 wherein the CTU is connected to the DMA bus and reads DMA instructions from a memory attached to the DMA bus and forwards these read instructions to the ICU for initial decoding.
- 18. The transfer controller of claim 14 wherein the ECU is connected to receive signals from external devices.
- 19. The transfer controller of claim 14 wherein the ECU is connected to the SCB and receives commands from the SCB.
- 20. The transfer controller of claim 14 wherein the ECU is connected to the ICU and receives instruction data from the ICU.
- 21. The transfer controller of claim 14 wherein the ECU generates output signals which may be used to generate interrupts on a host control processor in a system including said transfer controller.
- 22. The transfer controller of claim 14 wherein said controller fetches its own stream of DMA instructions.
- 23. The transfer controller of claim 22 wherein said DMA instructions include transfer, branch, load, synchronization and status control instructions.
- 24. The transfer controller of claim 14 wherein transfer-type instructions are fetched by the ICU and dispatched for further decoding and execution by the STU and CTU.
- 25. The transfer controller of claim 24 wherein a transfer-system-inbound instruction moves data from the SDB to the IDQ and is executed by the STU.
- 26. The transfer controller of claim 24 wherein a transfer-core-outbound instruction moves data from the DMA bus to the ODQ and is executed by the CTU.
- 27. The transfer controller of claim 24 wherein a transfer-system-outbound instruction moves data from the ODQ to the SDB and is executed by the STU.
- 28. The transfer controller of claim 24 wherein two transfer instructions are required to move data between an SDB system memory and one or more SP or PE local memories on the DMA bus, and both transfer instructions are executed concurrently.
- 29. The transfer controller of claim 24 wherein the transfer-type instructions include an address parameter which is decoded by the STU and which refers to an address on the SDB.
- 30. The transfer controller of claim 14 wherein control-type instructions include an address parameter which is decoded by the core transfer unit and which refers to addresses on the DMA bus to processing element and sequence processor local memories.
- 31. An instruction format for a transfer instruction comprising:
a base opcode field indicating that an instruction is of transfer type; a C/S field which indicates a type of transfer unit for the instruction, a core transfer unit (CTU) or a system transfer unit (STU), respectively; an I/O field which indicates that the instruction is inbound or outbound, respectively; an execute (X) field which when set indicates a start transfer event should start immediately after loading the instruction, and when not set indicates that the instruction is loaded but transfer is not initiated; a data type field which indicates the size of each element transferred; an address mode field which refers to a data access pattern which must be generated by a transfer unit, either the CTU or STU; a transfer count field which indicates the number of data elements of the size specified by the data type field that are to be transferred before an end of transfer occurs; and an address parameter field which specifies a starting address for the transfer.
- 32. The instruction format of claim 31 further comprising:
an other parameter field which follows the address parameter field depending on the addressing mode used.
- 33. The instruction format of claim 31 wherein an address of a data element within a PE local memory space is specified with three variables, a PE ID, a base value and an index value.
- 34. The instruction format of claim 31 wherein to support data reorderings the address of a PE data element is specified by a pair: PE data address=(PE ID, BitReversalSelect (Base & Index)), where the function BitReversalSelect is a permutation and selection function to support fast Fourier transform (FFT) data reorderings within each local processing element (PE) memory.
- 35. A method for multicast addressing to enable parallel distribution of a data element to more than one and up to all processing elements (PEs) in a system simultaneously comprising the steps of:
specifying an addressing mode in an address mode field of a transfer control instruction (TCI); and utilizing another parameter in the TCI to specify which PEs are to accept data transfer.
- 36. The method of claim 35 further comprising the step of:
utilizing a 16 bit field to specify any combination of up to 16 PEs to receive the data element.
- 37. The method of claim 35 further comprising the step of:
utilizing a 4 bit number to specify any set of 16 PEs to receive the data element, the 4 bit number being used in combination with a PE VID-to-PID translation table to specify the PEs to receive the data element.
- 38. The method of claim 35 further comprising the step of:
utilizing an encoded M-bit value to specify the PEs to receive the data element simultaneously.
- 39. A method for permuting data before the data is sent to processing elements (PE) for inbound transfers or before being sent to system memories for outbound transfers comprising the steps of:
reordering data within a data element; and performing other stream oriented operations including masking, data merging or complementing.
- 40. The method of claim 39 wherein said step of data merging further comprises:
performing a logical AND operation with a mask followed by performing a logical OR operation with a constant.
- 41. The method of claim 39 wherein said step of complementing further comprises using a logical XOR operation with a specified mask.
- 42. A method for performing processing element (PE) packing-gather operations comprising the following steps:
setting a packing-gather operations type indicator to indicate that each PE drives data onto a different group of data wires to return to a transfer controller ODQ; and determining the packing-gather operations type indicator to control PE driving of data.
- 43. The method of claim 42 wherein a byte size gather over 4 PEs specifies that each PE supplies one byte of a 32-bit word to be returned to the ODQ for return to the system data bus.
- 44. A method for performing processing element (PE) relative gather-sum operations comprising:
specifying a summary transfer parameter, N, specifying a number of data elements which are to be summed as they are read from local memories; summing the N elements to form a single data element; repeating the previous steps; and transferring a single data element to an ODQ as a sum result for every N elements read from local memories.
- 45. The method of claim 44 further comprising the step of transferring the sum result from the ODQ to a system data bus.
- 46. A method for performing processing element (PE) relative unpack-distribute operations comprising the steps of:
setting an unpack-distribute operations type indicator to indicate that each PE LMIU receives data from a different group of data wires to be written to its local memory; and determining the unpack-distribute operations type indicator to control each PE LMIU's receipt of data.
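
For orientation only, several of the data operations recited in claims 34, 36, 40, 41, 43, 44 and 46 can be sketched in software. The following Python sketch is illustrative and is not the claimed hardware. All function names are hypothetical. It assumes that "Base & Index" in claim 34 combines the base with a bit-reversed index by addition, that bit i of the 16 bit field in claim 36 selects PE i, that PE i drives byte lane i (bits 8i to 8i+7) in the byte size gather of claim 43, and that only complete groups of N elements are summed in claim 44. None of these choices is stated in the claims.

```python
from typing import List, Tuple


def bit_reverse(index: int, width: int) -> int:
    """Reverse the low `width` bits of `index` (FFT-style reordering)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def pe_data_address(pe_id: int, base: int, index: int, width: int) -> Tuple[int, int]:
    # Assumption: the bit-reversed index is added to the base value.
    return (pe_id, base + bit_reverse(index, width))


def multicast_targets(pe_mask: int, num_pes: int = 16) -> List[int]:
    # Assumption: bit i of the mask selects PE i.
    return [pe for pe in range(num_pes) if (pe_mask >> pe) & 1]


def merge(word: int, and_mask: int, or_constant: int) -> int:
    # Data merging: logical AND with a mask, then logical OR with a constant.
    return (word & and_mask) | or_constant


def complement(word: int, xor_mask: int) -> int:
    # Complementing: logical XOR with a specified mask.
    return word ^ xor_mask


def gather_sum(elements: List[int], n: int) -> List[int]:
    # One sum result per complete group of n elements read (assumption:
    # a trailing partial group is dropped).
    return [sum(elements[i:i + n]) for i in range(0, len(elements) - n + 1, n)]


def pack_gather_bytes(pe_bytes: List[int]) -> int:
    # Assumption: PE i supplies byte lane i of the returned 32-bit word.
    word = 0
    for lane, value in enumerate(pe_bytes[:4]):
        word |= (value & 0xFF) << (8 * lane)
    return word


def unpack_distribute_bytes(word: int) -> List[int]:
    # Inverse of pack_gather_bytes: PE i receives byte lane i.
    return [(word >> (8 * lane)) & 0xFF for lane in range(4)]
```

Under these assumptions, `multicast_targets(0b1010)` selects PEs 1 and 3, and `pack_gather_bytes` followed by `unpack_distribute_bytes` returns the original four bytes.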
Parent Case Info
[0001] The present invention claims the benefit of U.S. Provisional Application Ser. No. 60/184,668 entitled “Methods and Apparatus for Providing Bit-Reversal and Multicast Functions Utilizing DMA Controller” filed Feb. 24, 2000 and incorporated by reference herein in its entirety.
Provisional Applications (1)
| Number | Date | Country |
| 60184668 | Feb 2000 | US |