Claims
- 1. An arithmetic circuit including at least one borrow parallel counter and at least one 4-bit one-hot digital signal, said circuit achieving high performance while expending low power, said circuit comprising:
a full-adder, which adds three bits represented by two 4-b 1-hot signals and a binary signal respectively without intermediate conversion.
- 2. The arithmetic circuit of claim 1, wherein said borrow parallel counter is constructed of Complementary Metal Oxide Semiconductor (CMOS) and uses greater weighted input bits.
- 3. The arithmetic circuit of claim 1, wherein a very-large-scale integration (VLSI) design is improved by increasing speed of a calculation performed by said arithmetic circuit, decreasing area-transistor count, improving nMOS/pMOS ratio, and decreasing power dissipation.
- 4. The arithmetic circuit of claim 1, wherein said circuit includes lower switching activity and use of fewer hot lines as compared with a binary circuit for use in low-power high-performance arithmetic applications.
- 5. A multiplier circuit including borrow parallel multiplier circuits and virtual multiplier circuits using borrow parallel counters providing low-power, high-speed, and small-area features, said multiplier comprising:
regular and unified layouts for small multipliers of n×n, where 3≦n≦9, including a single array of almost identical borrow counters; reduced line connections, including partial product bit generation and its connections to the bit reduction networks; and substantially the same delay for almost all output bits, wherein transistor sizing and delay equalization are minimized.
- 6. The multiplier circuit of claim 5, wherein a “borrow-effect” re-arranges input bits to be processed so that the actual bits to each column are balanced and equal.
- 7. The multiplier circuit of claim 5, wherein a total length of line connections in said multiplier is minimized due to only a single counter being used in each column.
- 8. A multiplier triple-expansion non-Booth circuit comprising a partial product bit matrix decomposition circuit for efficient generation of large multipliers from smaller multipliers, wherein each expansion triples the size of the large multipliers.
- 9. The circuit of claim 8, further minimizing inter-connections and being self-testable at high-speed and low-power, and having high VLSI performance without an extra built-in test circuit and complex wiring.
- 10. The circuit of claim 8, wherein said multipliers have only about 9% to 20% more transistors than minimum existing Booth multipliers.
- 11. The circuit of claim 8, wherein said circuit is used in pipelined and multiply-accumulate (MAC) processors for performing natural four stage operations selected from one of base virtual multiplication, level-1, level-2 bit reductions and the fast final addition.
- 12. The circuit of claim 11, wherein said circuit further performs natural four stage operations with equalized delays.
- 13. A multiplier circuit utilizing 4-b 1-hot encoded signals and borrow bits, the circuit comprising:
at least two input numbers, each of said input numbers being trisected into three segments; a plurality of Carry Select Adders (CSAs); a plurality of multipliers interconnected to the CSAs, said multipliers being arranged to minimize the interconnection to the CSAs; and a plurality of output bits.
- 14. The multiplier circuit of claim 13, further comprising a plurality of levels of 3:2 and 4:2 counters and a latch for each of said output bits.
- 15. The multiplier circuit of claim 13, wherein a 54×54-b pipelined multiplier is implemented in an area of 434.8×769.5=334,578.6 μm2 with a 0.18 μm technology, achieving a frequency of 1 GHz at a 1.8 V supply and low-power performance.
- 16. The multiplier circuit of claim 13, wherein at least 9 multipliers are used, said multipliers being selected from one of
6×6-b (4, 2)−(3, 2) based virtual multiplier totaling 18×18-b, and 6×6-b borrow parallel virtual multiplier totaling 18×18-b.
- 17. The multiplier circuit of claim 13, wherein fewer transistors for signal type conversion from non-binary to binary are required.
- 18. The multiplier circuit of claim 13, wherein said CSAs are 4-b 1-hot borrow parallel counters including a 5—1 counter, wherein said 5—1 counter uses 78 transistors, about two thirds being nMOS transistors, and 56 transistors being used to pass 4-b 1-hot signals, thereby reducing power-consuming activities.
- 19. The multiplier circuit of claim 18, wherein said CSAs implement equations
A1+A2+A3+A4+2A5 = s0+2s1+4Q; Xo = s0; Yo = Xi XOR s1; Zo = Xi; S = Yi XOR Q; and C = (Zi AND Yi′) OR (Q AND Yi), where A1-A5 are input bits with A5 being a borrow bit; s0, s1 and Q are temporary parameters; and Xo, Yo, Zo and Xi, Yi, Zi are in-stage carry (out/in) bits.
- 20. A small borrow parallel multiplier circuit for processing a plurality of bit inputs, the multiplier comprising:
an array including a plurality of identical counters with a simple layout arranged in a plurality of columns, wherein “borrow-effect” naturally re-arranges bits being processed so that the actual number of bits processed in each column is balanced; minimal line connections within each line, wherein a single counter is used in each column; and a plurality of output bits having similar delay, wherein said multiplier requires little cost in transistor sizing and delay equalization.
- 21. The multiplier circuit of claim 20, wherein said delay is selected from one of about 0.6 ns and 2 times a (4, 2) delay.
- 22. The multiplier circuit of claim 20, wherein said multiplier has the same height as a single 5—1 counter, providing extra regularity and compact layout.
- 23. The multiplier circuit of claim 20, wherein a 6×6 multiplier implemented in 0.18 μm CMOS technology has an area of 12.87×16.0 μm2 when using a 5—1 counter and an area of 26.5×85.5 μm2 when using a 5—1—1 counter.
- 24. The multiplier circuit of claim 20, wherein a CSA block of an 18×18 multiplier has an area of about 34.2×85.5×3 μm2.
- 25. The multiplier circuit of claim 20, wherein a CSA block of a 54×54 multiplier has an area of about 48.7×85.5×9 μm2.
- 26. The multiplier circuit of claim 20, wherein a 54×54 multiplier including a CSA block has a layout in a rectangular area with a height of ((26.5+5)×3+34.2)×3+48.7=434.8 μm and a width of 85.5×9=769.5 μm, equaling an area of 434.8×769.5=334,578.6 μm2.
- 27. The multiplier circuit of claim 20, wherein components of said multiplier are modular and repeated, a low-power and pipeline frequency of 1 GHz is achieved, and said multiplier is self-testable, as provided by a triple expansion logic scheme.
- 28. A method of optimizing only one column of a plurality of CSA block columns in a triple expansion scheme of a multiplier for processing a plurality of bit inputs, the method comprising the steps of:
providing a first level of application of a triple expansion scheme P×P, where P is (3m+z1), m is an integer multiplier, and z1 is {0, 1, −1}; and expanding the first level of application according to an E×E, where E is (3P+z2) and z2 is {0, 1, −1}.
- 29. The method of claim 28, wherein m=4, z1=−1, and z2=−1.
- 30. The method of claim 28, wherein m=6, z1=0, and z2=0.
- 31. The method of claim 28, wherein m=7, z1=0, and z2=1.
- 32. The method of claim 28, wherein m=5, z1=0, and z2=−1.
- 33. The method of claim 28, wherein m=8, z1=0, and z2=0.
- 34. The method of claim 28, wherein m=9, z1=0, and z2=0.
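The equations recited in claim 19 can be checked with a small behavioral model. The sketch below is only a binary-level illustration: it does not model the 4-b 1-hot signal encoding or the transistor-level counter, the function and key names are hypothetical, and it assumes AND binds tighter than OR in the expression for C.

```python
from itertools import product


def borrow_counter(a1, a2, a3, a4, a5, xi=0, yi=0, zi=0):
    """Evaluate the claim-19 equations for one stage (behavioral, binary view)."""
    # A5 is the borrow bit, so it enters the sum with weight 2.
    total = a1 + a2 + a3 + a4 + 2 * a5
    # Temporary parameters: total = s0 + 2*s1 + 4*Q.
    s0, s1, q = total & 1, (total >> 1) & 1, (total >> 2) & 1
    return {
        "s0": s0, "s1": s1, "Q": q,
        "Xo": s0,                         # Xo = s0
        "Yo": xi ^ s1,                    # Yo = Xi XOR s1
        "Zo": xi,                         # Zo = Xi
        "S": yi ^ q,                      # S = Yi XOR Q
        "C": (zi & (1 - yi)) | (q & yi),  # C = (Zi AND Yi') OR (Q AND Yi)
    }


def decomposition_holds():
    """Check s0 + 2*s1 + 4*Q against the weighted input sum for all 32 inputs."""
    for bits in product((0, 1), repeat=5):
        out = borrow_counter(*bits)
        if out["s0"] + 2 * out["s1"] + 4 * out["Q"] != sum(bits[:4]) + 2 * bits[4]:
            return False
    return True
```

Since the weighted input sum is at most 6, the three temporary parameters s0, s1 and Q are always enough to represent it.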
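The trisection recited in claims 13 and 16 (each input split into three segments, nine small multipliers combined into one larger product) can be illustrated arithmetically. This is a minimal sketch under stated assumptions: plain integer addition stands in for the CSA reduction network, the 1-hot signals and borrow bits are not modeled, and the helper names are hypothetical.

```python
def trisect(value, seg_bits):
    """Split a non-negative integer into three seg_bits-wide segments, low first."""
    mask = (1 << seg_bits) - 1
    return [(value >> (i * seg_bits)) & mask for i in range(3)]


def trisected_multiply(a, b, seg_bits=6):
    """Multiply two 3*seg_bits-wide numbers using nine seg_bits x seg_bits products."""
    a_segs = trisect(a, seg_bits)
    b_segs = trisect(b, seg_bits)
    total = 0
    for i, a_seg in enumerate(a_segs):
        for j, b_seg in enumerate(b_segs):
            # Each small product is weighted by the positions of its two segments.
            total += (a_seg * b_seg) << ((i + j) * seg_bits)
    return total
```

With `seg_bits=6` the nine 6×6-b products combine into an 18×18-b product; calling the function with `seg_bits=18` applies the same step once more to reach 54×54-b.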
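The multiplier sizes implied by claims 28 to 34 follow directly from P = 3m+z1 and E = 3P+z2. The sketch below simply evaluates those two formulas for the parameter sets recited in the dependent claims; the function and dictionary names are hypothetical.

```python
def expansion_sizes(m, z1, z2):
    """Return (P, E) with P = 3m + z1 and E = 3P + z2, as recited in claim 28."""
    if z1 not in (0, 1, -1) or z2 not in (0, 1, -1):
        raise ValueError("z1 and z2 must each be 0, 1, or -1")
    p = 3 * m + z1
    e = 3 * p + z2
    return p, e


# (m, z1, z2) as recited in claims 29 through 34, keyed by claim number.
CLAIMED_PARAMETERS = {
    29: (4, -1, -1),
    30: (6, 0, 0),
    31: (7, 0, 1),
    32: (5, 0, -1),
    33: (8, 0, 0),
    34: (9, 0, 0),
}

CLAIMED_SIZES = {
    claim: expansion_sizes(*params) for claim, params in CLAIMED_PARAMETERS.items()
}
```

For example, the parameters of claim 30 give P = 18 and E = 54, matching the 18×18 and 54×54 multipliers named in claims 24 and 25, while those of claims 29 and 31 give E = 32 and E = 64.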
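The layout arithmetic of claim 26 can be reproduced from the dimensions given in claims 23 to 25. In this sketch the variable names are hypothetical, and the 5 μm term is labeled `extra_um` because the claim does not say what it represents.

```python
def layout_54x54():
    """Recompute the claim-26 height, width (μm) and area (μm2), rounded to 0.1."""
    mult6_height_um = 26.5   # 6x6 multiplier height (claim 23, 5-1-1 counter case)
    extra_um = 5.0           # unlabeled 5 μm term from claim 26 (meaning assumed unknown)
    csa18_height_um = 34.2   # CSA block height of the 18x18 multiplier (claim 24)
    csa54_height_um = 48.7   # CSA block height of the 54x54 multiplier (claim 25)
    unit_width_um = 85.5     # common width unit, repeated 9 times

    height = ((mult6_height_um + extra_um) * 3 + csa18_height_um) * 3 + csa54_height_um
    width = unit_width_um * 9
    # Round to one decimal place to hide floating-point noise.
    return round(height, 1), round(width, 1), round(height * width, 1)
```

The rounded results agree with the 434.8 μm height, 769.5 μm width and 334,578.6 μm2 area stated in claims 15 and 26.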
GOVERNMENT RIGHTS
[0001] This invention was funded, at least in part, under grants from the National Science Foundation, Nos. MIP-9630870, CCR-0073469 and New York State Office of Advanced Science, Technology & Academic Research (NYSTAR, MDC) No. 1023263. The Government may therefore have certain rights in the invention.
Provisional Applications (2)
| Number | Date | Country |
|---|---|---|
| 60431373 | Dec 2002 | US |
| 60431372 | Dec 2002 | US |