Claims
- 1. An apparatus for the two cycle computation of complex multiplication, the apparatus comprising:
a first storage means for storing a first complex operand and a second complex operand, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi; multiplier means for simultaneously performing multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; a second storage means for storing products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; adder means for simultaneously performing additions and subtractions in a second cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr) if a nonconjugated operation is being performed, said adder means further for simultaneously performing additions and subtractions in the second cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi) if a conjugated operation is being performed; and a third storage means for storing the results of said adder means.
- 2. The apparatus of claim 1 further comprising:
accumulator means for simultaneously performing accumulation in the second cycle of operation to accumulate the results of said adder means with the current contents of said third storage means, wherein said third storage means is further for storing the results of said accumulator means.
- 3. The apparatus of claim 2 further comprising:
extended precision storage means, wherein said accumulator means is further for simultaneously performing accumulation in the second cycle of operation to accumulate the results of said adder means with both the current contents of said third storage means and the current contents of said extended precision storage means, wherein said extended precision storage means is for storing extended precision results of said accumulator means.
- 4. The apparatus of claim 3 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, the real and imaginary results are each 32 bits, and the extended precision results are each 8 bits.
- 5. The apparatus of claim 1 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, and the real and imaginary results are each 32 bits.
- 6. The apparatus of claim 1 wherein multiplier means is further for simultaneously performing multiplications in the second cycle of operation utilizing a second pair of operands.
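As an illustration of claims 1 and 6, the two cycles can be modeled in software as two stages separated by a product register. This is a minimal sketch, not the claimed hardware: the function names `multiply_stage`, `add_stage` and `pipelined_multiply` are hypothetical, and Python integers stand in for the fixed-width registers.

```python
def multiply_stage(xr, xi, yr, yi):
    """Cycle 1: form all four partial products at once."""
    return (xr * yr, xr * yi, xi * yr, xi * yi)


def add_stage(products, conjugate=False):
    """Cycle 2: combine the stored products into (real, imaginary)."""
    rr, ri, ir, ii = products
    if conjugate:
        # Conjugated operation: X * conj(Y)
        return (rr + ii, ir - ri)
    # Nonconjugated operation: X * Y
    return (rr - ii, ri + ir)


def pipelined_multiply(pairs, conjugate=False):
    """Model the overlap of claim 6: while the adder handles operand
    pair n, the multiplier already works on operand pair n+1."""
    results = []
    product_reg = None  # models the second storage means
    # One extra iteration drains the last products from the register.
    for operands in list(pairs) + [None]:
        if product_reg is not None:
            results.append(add_stage(product_reg, conjugate))
        product_reg = multiply_stage(*operands) if operands is not None else None
    return results
```

For example, with X = 1 + 2j and Y = 3 + 4j, `add_stage(multiply_stage(1, 2, 3, 4))` gives `(-5, 10)`, and the conjugated form gives `(11, 2)`. In this model, n operand pairs occupy n + 1 loop iterations, which is the point of overlapping the two stages.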
- 7. An apparatus for the two cycle computation of complex multiplication, the apparatus comprising:
a first storage means for storing a first complex operand and a second complex operand, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi; multiplier means for simultaneously performing multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; a second storage means for storing products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; a third storage means; and adder/accumulator means for simultaneously performing additions and subtractions in a second cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr) if a nonconjugated operation is being performed, said adder/accumulator means further for simultaneously performing additions and subtractions in the second cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi) if a conjugated operation is being performed, said adder/accumulator means is further for simultaneously performing accumulation in the second cycle of operation to accumulate the results with the current contents of said third storage means, wherein said third storage means is further for storing the accumulated results of said adder/accumulator means.
- 8. The apparatus of claim 7 further comprising:
extended precision storage means, wherein said adder/accumulator means is further for simultaneously performing accumulation in the second cycle of operation to accumulate the results of said adder/accumulator means with both the current contents of said third storage means and the current contents of said extended precision storage means, wherein said extended precision storage means is for storing extended precision results of said adder/accumulator means.
- 9. A two cycle method of complex multiplication of a first complex operand and a second complex operand stored in a first memory device, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi, the method comprising the steps of:
performing simultaneous multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; storing products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi in a second memory device in the first cycle of operation; performing simultaneous additions and subtractions in a second cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr), if a nonconjugated operation is being performed; performing simultaneous additions and subtractions in the second cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi), if a conjugated operation is being performed; and storing the real and imaginary results of the additions and subtractions in a third memory device in the second cycle of operation.
- 10. The method of claim 9 further comprising, before the step of storing the real and imaginary results, the step of:
performing accumulations in the second cycle of operation to accumulate the results of the additions and subtractions with the current contents of the third memory device.
- 11. The method of claim 9 further comprising, before the step of storing the real and imaginary results, the step of:
performing accumulations in the second cycle of operation to accumulate the results of the additions and subtractions with both the current contents of the third memory device and the contents of an extended precision register.
- 12. The method of claim 11 further comprising the step of:
storing the extended precision accumulated result in the extended precision register.
- 13. The method of claim 11 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, the real and imaginary results are each 32 bits, and the extended precision results are each 8 bits.
- 14. The method of claim 9 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, and the real and imaginary results are each 32 bits.
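The accumulation of claims 10 to 13 can be sketched as follows. The sketch assumes, without confirmation from the claims, that the 8-bit extended precision register holds the upper bits of a 40-bit two's-complement accumulator whose lower 32 bits live in the third memory device. The helper names are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK8 = (1 << 8) - 1
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def accumulate_extended(acc32, ext8, addend):
    """Add `addend` to the 40-bit quantity ext8:acc32 (assumed layout).

    acc32 is the low 32 bits and ext8 the high 8 bits, both as
    unsigned bit patterns. Returns the updated (acc32, ext8) pair,
    wrapping at 40 bits.
    """
    total = (((ext8 << 32) | acc32) + addend) & MASK40
    return total & MASK32, (total >> 32) & MASK8


def read_accumulator(acc32, ext8):
    """Recombine the two registers into one signed 40-bit value."""
    return to_signed((ext8 << 32) | acc32, 40)
```

Accumulating the value 2^30 eight times, for instance, yields 2^33, which no longer fits in 32 bits but is still represented exactly once the extension bits are included.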
- 15. An apparatus for the single cycle computation of complex multiplication, the apparatus comprising:
a first storage means for storing a first complex operand and a second complex operand, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi; multiplier means for simultaneously performing multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; adder means for simultaneously performing additions and subtractions in the first cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr) if a nonconjugated operation is being performed, said adder means further for simultaneously performing additions and subtractions in the first cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi) if a conjugated operation is being performed; and a third storage means for storing the results of said adder means.
- 16. The apparatus of claim 15 further comprising:
accumulator means for simultaneously performing accumulation in the first cycle of operation to accumulate the results of said adder means with the current contents of said third storage means, wherein said third storage means is further for storing the results of said accumulator means.
- 17. The apparatus of claim 16 further comprising:
extended precision storage means, wherein said accumulator means is further for simultaneously performing accumulation in the first cycle of operation to accumulate the results of said adder means with both the current contents of said third storage means and the current contents of said extended precision storage means, wherein said extended precision storage means is for storing extended precision results of said accumulator means.
- 18. The apparatus of claim 17 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, the real and imaginary results are each 32 bits, and the extended precision results are each 8 bits.
- 19. The apparatus of claim 15 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, and the real and imaginary results are each 32 bits.
- 20. The apparatus of claim 15 wherein multiplier means is further for simultaneously performing multiplications in the second cycle of operation utilizing a second pair of operands.
- 21. An apparatus for the single cycle computation of complex multiplication, the apparatus comprising:
a first storage means for storing a first complex operand and a second complex operand, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi; multiplier means for simultaneously performing multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; a second storage means; and adder/accumulator means for simultaneously performing additions and subtractions in the first cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr) if a nonconjugated operation is being performed, said adder/accumulator means further for simultaneously performing additions and subtractions in the first cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi) if a conjugated operation is being performed, said adder/accumulator means is further for simultaneously performing accumulation in the second cycle of operation to accumulate the results with the current contents of said second storage means, wherein said second storage means is further for storing the accumulated results of said adder/accumulator means.
- 22. The apparatus of claim 21 further comprising:
extended precision storage means, wherein said adder/accumulator means is further for simultaneously performing accumulation in the first cycle of operation to accumulate the results of said adder/accumulator means with both the current contents of said second storage means and the current contents of said extended precision storage means, wherein said extended precision storage means is for storing extended precision results of said adder/accumulator means.
- 23. A single cycle method of complex multiplication of a first complex operand and a second complex operand stored in a first memory device, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi, the method comprising the steps of:
performing simultaneous multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; storing products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi in a second memory device in the first cycle of operation; performing simultaneous additions and subtractions in the first cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr), if a nonconjugated operation is being performed; performing simultaneous additions and subtractions in the first cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi), if a conjugated operation is being performed; and storing the real and imaginary results of the additions and subtractions in a second memory device in the first cycle of operation.
- 24. The method of claim 23 further comprising, before the step of storing the real and imaginary results, the step of:
performing accumulations in the first cycle of operation to accumulate the results of the additions and subtractions with the current contents of the second memory device.
- 25. The method of claim 23 further comprising, before the step of storing the real and imaginary results, the step of:
performing accumulations in the first cycle of operation to accumulate the results of the additions and subtractions with both the current contents of the second memory device and the contents of an extended precision register.
- 26. The method of claim 25 further comprising the step of:
storing the extended precision accumulated result in the extended precision register.
- 27. The method of claim 25 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, the real and imaginary results are each 32 bits, and the extended precision results are each 8 bits.
- 28. The method of claim 23 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, and the real and imaginary results are each 32 bits.
- 29. A method of calculating at least one element of a covariance matrix R_{M×M} = U × U^H comprising the steps of:
providing a data array U_{M×K} having M elements and K samples, wherein each element in U comprises a 16 bit real value and a 16 bit imaginary value; and calculating a first element of R utilizing a multiply complex conjugate long extended precision accumulate (MPYCXJLXA) instruction which executes M times.
- 30. The method of claim 29 wherein the MPYCXJLXA instruction is pipelineable.
- 31. The method of claim 29 wherein the first element of R is at least 39 complex signed bits.
- 32. The method of claim 29 wherein the MPYCXJLXA instruction completes execution in 2 cycles.
- 33. The method of claim 29 wherein the MPYCXJLXA instruction completes execution in a single cycle.
- 34. The method of claim 29 wherein the step of calculating a first element further comprises the steps of:
initiating a first execution of the MPYCXJLXA instruction in a first cycle; and initiating a second execution of the MPYCXJLXA instruction in a second cycle, the second cycle immediately following the first cycle.
- 35. The method of claim 29 further comprising the step of:
calculating a second element of R utilizing the MPYCXJLXA instruction which executes M times.
- 36. The method of claim 35 wherein:
the step of calculating a first element utilizes a first processing element (PE) and the step of calculating a second element utilizes a second PE; and both the step of calculating a first element and the step of calculating a second element occur simultaneously.
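The covariance computation of claims 29 to 36 can be illustrated with a software stand-in for the instruction. In this sketch, `mpycxjlxa` is a hypothetical function modeling one conjugated multiply-accumulate step, complex values are `(real, imag)` integer tuples, and the loop simply follows the matrix definition R = U × U^H by summing over the samples in a row. Register widths and the mapping onto processing elements are not modeled.

```python
def mpycxjlxa(acc, x, y):
    """Hypothetical stand-in for one instruction step:
    acc += x * conj(y), with x, y and acc as (real, imag) tuples."""
    xr, xi = x
    yr, yi = y
    return (acc[0] + xr * yr + xi * yi,
            acc[1] + xi * yr - xr * yi)


def covariance_element(U, i, j):
    """R[i][j] = sum over k of U[i][k] * conj(U[j][k])."""
    acc = (0, 0)
    for x, y in zip(U[i], U[j]):
        acc = mpycxjlxa(acc, x, y)
    return acc


def covariance_matrix(U):
    """Every (i, j) element depends only on rows i and j of U, so
    different elements could be computed on different processing
    elements at the same time."""
    m = len(U)
    return [[covariance_element(U, i, j) for j in range(m)]
            for i in range(m)]
```

For a 2×2 array with rows `[(1, 2), (3, 4)]` and `[(5, 6), (7, 8)]`, the diagonal elements come out real, `(30, 0)` and `(174, 0)`, and the off-diagonal elements are complex conjugates of each other, `(70, 8)` and `(70, -8)`, as expected for a matrix of the form U × U^H.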
- 37. An apparatus for the two cycle computation of complex multiplication, the apparatus comprising:
a first storage register for storing a first complex operand and a second complex operand, the first complex operand including real component Xr and imaginary component Xi, the second complex operand including real component Yr and imaginary component Yi; a multiplier for simultaneously performing multiplications in a first cycle of operation to produce products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; a second storage register for storing products Xr*Yr, Xr*Yi, Xi*Yr and Xi*Yi; an adder for simultaneously performing additions and subtractions in a second cycle of operation to produce real result (Xr*Yr)−(Xi*Yi) and imaginary result (Xr*Yi)+(Xi*Yr) if a nonconjugated operation is being performed, said adder further for simultaneously performing additions and subtractions in the second cycle of operation to produce real result (Xr*Yr)+(Xi*Yi) and imaginary result (Xi*Yr)−(Xr*Yi) if a conjugated operation is being performed; and a third storage register for storing the results of said adder.
- 38. The apparatus of claim 37 further comprising:
an accumulator for simultaneously performing accumulation in the second cycle of operation to accumulate the results of said adder with the current contents of said third storage register, wherein said third storage register is further for storing the results of said accumulator.
- 39. The apparatus of claim 38 further comprising:
an extended precision storage register, wherein said accumulator is further for simultaneously performing accumulation in the second cycle of operation to accumulate the results of said adder with both the current contents of said third storage register and the current contents of said extended precision storage register, wherein said extended precision storage register is for storing extended precision results of said accumulator.
- 40. The apparatus of claim 39 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, the real and imaginary results are each 32 bits, and the extended precision results are each 8 bits.
- 41. The apparatus of claim 37 wherein:
the complex operand components Xr, Xi, Yr and Yi are each 16 bits, and the real and imaginary results are each 32 bits.
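The bit widths recited in the claims (16-bit components, 32-bit results) can be checked with a short range calculation. This sketch assumes the components are signed two's-complement integers, which the claims do not state, and uses hypothetical helper names.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def complex_multiply(xr, xi, yr, yi, conjugate=False):
    """Return (real, imag) using the formulas recited in the claims."""
    if conjugate:
        return (xr * yr + xi * yi, xi * yr - xr * yi)
    return (xr * yr - xi * yi, xr * yi + xi * yr)


# A single 16x16 signed product is at most 2**30 in magnitude,
# so each partial product fits comfortably in 32 bits.
largest_product = INT16_MIN * INT16_MIN

# With every component at the most negative 16-bit value, the
# imaginary result is the sum of two such products.
worst_real, worst_imag = complex_multiply(
    INT16_MIN, INT16_MIN, INT16_MIN, INT16_MIN)
```

Under this assumption, the sum of two partial products reaches 2^31 only in the single corner case where all four components equal the most negative value. Every other input stays within the signed 32-bit range for one multiplication, while repeated accumulation is where additional precision bits become useful.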
RELATED APPLICATIONS
[0001] The present application claims the benefit of U.S. Provisional Application Serial No. 60/244,861 entitled “Methods and Apparatus for Efficient Complex Long Multiplication and Covariance Matrix Implementation” and filed Nov. 1, 2000, which is incorporated by reference herein in its entirety.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60244861 | Nov 2000 | US |