This application is related to the following applications, each of which is being filed concurrently with this application and is incorporated by reference: (1) U.S. application Ser. No. 09/788,683, now U.S. Pat. No. 7,237,097, titled “Partial Bitwise Permutations”; (2) U.S. application Ser. No. 09/788,670, titled “Binary Polynomial Multiplier”; (3) U.S. application Ser. No. 09/788,682, now U.S. Pat. No. 7,162,621, titled “Configurable Instruction Sequence Generation”; and (4) U.S. application Ser. No. 09/788,685, now U.S. Pat. No. 7,181,484, titled “Extended-Precision Accumulation of Multiplier Output”.
This invention relates to microprocessor instructions for performing polynomial arithmetic, and more particularly to microprocessor instructions for performing polynomial multiplications.
Reduced instruction set computer (RISC) architectures were developed as industry trends tended towards larger, more complex instruction sets. By simplifying instruction set designs, RISC architectures make it easier to use techniques such as pipelining and caching, thus increasing system performance.
RISC architectures usually have fixed-length instructions (e.g., 16-bit, 32-bit, or 64-bit), with few variations in instruction format. Each instruction in an instruction set architecture (ISA) may have the source registers always in the same location. For example, a 32-bit ISA may always have source registers specified by bits 16-20 and 21-25. This allows the specified registers to be fetched for every instruction without requiring any complex instruction decoding.
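By way of illustration only, the fixed-position register fields described above may be extracted with a shift and a mask, with no dependence on the opcode. The sketch below assumes the bit positions given in the example (bits 21-25 and 16-20 of a 32-bit word); the names RS_SHIFT, RT_SHIFT, and decode_sources are hypothetical and are not taken from any particular ISA definition.

```python
# Hypothetical field layout taken from the example in the text: two 5-bit
# source-register fields at bits 21-25 and bits 16-20 of a 32-bit word.
RS_SHIFT = 21
RT_SHIFT = 16
REG_MASK = 0x1F  # five bits select one of 32 registers


def decode_sources(instruction: int) -> tuple[int, int]:
    """Return the two source-register numbers encoded in a 32-bit word.

    The same shift-and-mask applies to every instruction, so the register
    numbers are available without first decoding the opcode.
    """
    first = (instruction >> RS_SHIFT) & REG_MASK
    second = (instruction >> RT_SHIFT) & REG_MASK
    return first, second


# Build a word whose fields name registers 3 and 7, then decode it.
word = (3 << RS_SHIFT) | (7 << RT_SHIFT)
print(decode_sources(word))  # (3, 7)
```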
Cryptographic systems (“cryptosystems”) are increasingly used to secure transactions, to encrypt communications, to authenticate users, and to protect information. Many private-key cryptosystems, such as the Data Encryption Standard (DES), are relatively simple computationally and frequently reducible to hardware solutions performing sequences of XORs, rotations, and permutations on blocks of data. Public-key cryptosystems, on the other hand, may be mathematically more subtle and computationally more difficult than private-key systems.
While different public-key cryptography schemes have different bases in mathematics, they tend to have a common need for integer computation across very large ranges of values, on the order of 1024 bits. This extended-precision arithmetic is often modular (i.e., operations are performed modulo a value range), and in some cases binary polynomial instead of two's-complement. For example, RSA public-key cryptosystems use extended-precision modular exponentiation to encrypt and decrypt information, and elliptic curve cryptosystems use extended-precision modular polynomial multiplication.
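As a minimal sketch of the kind of extended-precision modular exponentiation mentioned above, the following uses the well-known right-to-left square-and-multiply method. The function name mod_exp is hypothetical, and the sketch relies on Python's arbitrary-precision integers rather than on any hardware described herein; it is not presented as the method used by any particular RSA implementation.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus by square-and-multiply.

    Each loop iteration consumes one exponent bit, so a 1024-bit exponent
    needs about 1024 modular squarings plus one modular multiplication
    for every set bit.
    """
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    return result


# Cross-check against Python's built-in three-argument pow().
print(mod_exp(4, 13, 497) == pow(4, 13, 497))  # True
```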
Public-key cryptosystems have been used extensively for user authentication and secure key exchange, while private-key cryptography has been used extensively to encrypt communication channels. As the use of public-key cryptosystems increases, it becomes desirable to increase the performance of extended-precision modular arithmetic calculations.
In one general aspect, an instruction set architecture includes an instruction for performing polynomial arithmetic. The instruction includes one or more opcodes that identify the instruction as an instruction for performing a polynomial arithmetic operation. Additionally, the instruction identifies one or more registers. The instruction may be processed by performing the polynomial arithmetic operation using the identified registers.
Implementations may provide an instruction for performing binary polynomial addition, which may be implemented using a multiplier. The result of a polynomial arithmetic operation may be stored in one or more result registers. Polynomial arithmetic operations may include multiplication, where the contents of identified registers are multiplied together. Operations also may include polynomial multiplication-addition, where the contents of identified registers are multiplied together and then added to one or more result registers. The result registers may include a high-order register and a low-order register. Polynomial arithmetic operations may be performed on polynomials stored in registers. The polynomials may be encoded as a binary representation of coefficients.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features and advantages will be apparent from the description and drawings, and from the claims.
Many public-key cryptosystems use extended-precision modular arithmetic to encrypt and decrypt data. For example, many elliptic curve (EC) cryptosystems heavily use binary polynomial multiplication and addition to encrypt and decrypt data. Performance of elliptic curve cryptosystems may be enhanced by modifying a programmable CPU multiplier to be responsive to newly defined instructions dedicated to polynomial operations.
When using elliptic curves defined over GF(2^163) (as recommended by the IEEE 1363-2000 standard), the main operation needed is multiplication over the field GF(2^163). Each of the 2^163 elements can be represented as a polynomial of degree at most 162 with coefficients equal to 0 or 1. In this representation, two elements may be added using a simple bitwise XOR, and two polynomials, a(X) and b(X), may be multiplied by computing a(X)b(X) mod P(X), where the product a(X)b(X) is a polynomial of degree at most 324, and P(X) is an irreducible polynomial as specified by the IEEE 1363-2000 standard.
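The field operations described above may be sketched in software as follows. For brevity the example uses a toy field, GF(2^4) with P(X) = X^4 + X + 1, rather than the 163-bit field of the standard, whose reduction polynomial is not reproduced in this description. The names gf2_add and gf2_mulmod are hypothetical, and the interleaved reduction shown is only one possible way to compute a(X)b(X) mod P(X).

```python
def gf2_add(a: int, b: int) -> int:
    """Add two binary polynomials: coefficient-wise addition mod 2 is XOR."""
    return a ^ b


def gf2_mulmod(a: int, b: int, p: int) -> int:
    """Return a(X)*b(X) mod P(X), with bit i holding the coefficient of X^i.

    `p` must include its leading term, and `a` must already be reduced
    (degree lower than that of `p`).
    """
    degree = p.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a(X)
        b >>= 1
        a <<= 1                  # multiply a(X) by X
        if (a >> degree) & 1:
            a ^= p               # reduce as soon as the degree reaches deg(P)
    return result


P = 0b10011  # X^4 + X + 1, irreducible over GF(2); toy modulus for this sketch
print(format(gf2_mulmod(0b0010, 0b1000, P), "04b"))  # X * X^3 = X^4 = X + 1
```

Reducing after every shift keeps every intermediate value within the field width, which is one reason this form is common in software; a hardware-assisted approach could instead form the full double-width product first and reduce it afterwards.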
Polynomial multiplication has the same form as modular multiplication, ab mod p, over the integers, except that: (1) regular addition is replaced by an XOR; and (2) regular 32-bit multiplication is replaced by a 32-bit carry-free multiplication. Therefore, polynomial modular multiplication may be performed using shifts and XORs instead of shifts and adds.
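The correspondence between integer multiplication and carry-free multiplication can be seen by writing both as the same loop, differing only in the accumulation step. This is an illustrative sketch; the function names int_mul and carryless_mul are hypothetical.

```python
def int_mul(a: int, b: int) -> int:
    """Ordinary multiplication of non-negative integers by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result += a      # integer addition propagates carries
        a <<= 1
        b >>= 1
    return result


def carryless_mul(a: int, b: int) -> int:
    """Binary polynomial multiplication by shift-and-XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # XOR adds coefficients mod 2, with no carries
        a <<= 1
        b >>= 1
    return result


print(int_mul(0b10011, 0b11))                      # 57, i.e. 19 * 3
print(format(carryless_mul(0b10011, 0b11), "b"))   # 110101
```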
Referring to
Because some operations, such as floating point calculations and integer multiply/divide, cannot always be performed in a single clock cycle, some instructions merely begin execution of an instruction. After sufficient clock cycles have passed, another instruction may be used to retrieve a result. For example, when an integer multiply instruction takes five clock cycles, one instruction may initiate the multiplication calculation, and another instruction may load the results of the multiplication into a register after the multiplication has completed. If a multiplication has not completed by the time a result is requested, the pipeline may stall until the result is available.
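The split between an instruction that begins a multi-cycle operation and a later instruction that retrieves its result may be modeled, very roughly, as follows. This is a toy timing model only: the class MultiplyUnit, its methods, and the cycle bookkeeping are hypothetical, and the five-cycle latency is simply the figure used in the example above.

```python
class MultiplyUnit:
    """Toy multi-cycle multiplier: a result is readable `latency` cycles after issue."""

    def __init__(self, latency: int = 5) -> None:
        self.latency = latency
        self.ready_cycle = 0
        self.result = 0

    def issue(self, a: int, b: int, cycle: int) -> None:
        """Begin a multiplication at the given clock cycle."""
        self.result = a * b
        self.ready_cycle = cycle + self.latency

    def read(self, cycle: int) -> tuple[int, int]:
        """Return (result, stall_cycles) for a read attempted at `cycle`."""
        stall = max(0, self.ready_cycle - cycle)
        return self.result, stall


unit = MultiplyUnit(latency=5)
unit.issue(6, 7, cycle=0)
print(unit.read(cycle=1))  # (42, 4): read too early, pipeline stalls 4 cycles
print(unit.read(cycle=5))  # (42, 0): result already available, no stall
```

In this model, independent instructions scheduled between the issue and the read reduce or eliminate the stall.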
Referring to
Execution unit 2010 is the primary mechanism for executing instructions within processor core 2000. Execution unit 2010 includes a register file 2011 and an arithmetic logic unit (ALU) 2012. In one implementation, the register file 2011 includes thirty-two 32-bit general-purpose registers that may be used, for example, in scalar integer operations and address calculations. The register file 2011, which includes two read ports and one write port, may be fully bypassed to minimize operation latency in the pipeline. ALU 2012 supports both logical and arithmetic operations, such as addition, subtraction, and shifting.
The MDU 2020 performs multiply and divide operations. In one implementation, the MDU 2020 includes a 32-bit by 16-bit (32×16) Booth-encoded multiplier (not shown), result-accumulation registers (HI register 2021 and LO register 2022), a divide state machine, and all multiplexers and control logic required to perform these functions. In one pipelined implementation, 32×16 multiply operations may be issued every clock cycle to MDU 2020, so that a 32-bit number may be multiplied by a 16-bit number every clock cycle. However, the result will not be available in the HI/LO registers (2021 and 2022) until the multiplication has finished. The result may be accessed with the instructions MFHI and MFLO. These instructions move results from the HI register 2021 and LO register 2022, respectively, to an indicated register. For example, “MFHI $7” moves the contents of the HI register 2021 to general purpose register $7.
Two instructions, multiply-add (MADD/MADDU) and multiply-subtract (MSUB/MSUBU), may be used to perform the multiply-add and multiply-subtract operations. The MADD instruction multiplies two numbers and then adds the product to the current contents of the HI register 2021 and the LO register 2022. The result then is stored in the HI/LO registers (2021 and 2022). Similarly, the MSUB instruction multiplies two operands and then subtracts the product from the HI register 2021 and the LO register 2022, storing the result in the HI/LO registers (2021 and 2022). The instructions MADD and MSUB perform operations on signed values. MADDU and MSUBU perform the analogous operations on unsigned values.
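The multiply-add and multiply-subtract behavior described above may be sketched as operations on a 64-bit accumulator formed from the HI and LO registers. The sketch assumes 32-bit operands and assumes that the accumulator wraps modulo 2^64; the helper names (join, split, to_signed32) and the function signatures are hypothetical and model only the arithmetic, not the pipeline.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def join(hi: int, lo: int) -> int:
    """Combine HI and LO into one 64-bit accumulator value."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def split(acc: int) -> tuple[int, int]:
    """Split a 64-bit accumulator value into (HI, LO)."""
    return (acc >> 32) & MASK32, acc & MASK32


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def maddu(hi: int, lo: int, rs: int, rt: int) -> tuple[int, int]:
    """Unsigned multiply-add: HI/LO <- HI/LO + rs * rt."""
    return split((join(hi, lo) + (rs & MASK32) * (rt & MASK32)) & MASK64)


def msubu(hi: int, lo: int, rs: int, rt: int) -> tuple[int, int]:
    """Unsigned multiply-subtract: HI/LO <- HI/LO - rs * rt."""
    return split((join(hi, lo) - (rs & MASK32) * (rt & MASK32)) & MASK64)


def madd(hi: int, lo: int, rs: int, rt: int) -> tuple[int, int]:
    """Signed multiply-add: operands are treated as two's-complement values."""
    return split((join(hi, lo) + to_signed32(rs) * to_signed32(rt)) & MASK64)


# The same bit patterns give different results when treated as signed or unsigned.
print([hex(v) for v in maddu(0, 0, 0xFFFFFFFF, 0xFFFFFFFF)])  # ['0xfffffffe', '0x1']
print([hex(v) for v in madd(0, 0, 0xFFFFFFFF, 0xFFFFFFFF)])   # ['0x0', '0x1']
```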
Referring to
In one implementation, the registers identified by rs 3011 and rt 3012 contain binary polynomials (i.e., the polynomial's coefficients are reduced modulo two). Thus, each coefficient is either a “1” or a “0”. The polynomials are encoded in a 32-bit register with each bit representing a polynomial coefficient. For example, the polynomial “x^4+x+1” would be encoded as “10011” because the coefficients of x^3 and x^2 are “0” and the remaining coefficients are “1”.
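The encoding just described, in which bit i of a word holds the coefficient of x^i, may be illustrated with a pair of helper functions. The names encode and decode are hypothetical and are used here only to make the bit layout concrete.

```python
def encode(exponents: list[int]) -> int:
    """Encode a binary polynomial, given as a list of exponents, as an integer.

    XOR is used so that a term listed twice cancels, matching reduction of
    the coefficients modulo two.
    """
    word = 0
    for e in exponents:
        word ^= 1 << e
    return word


def decode(word: int) -> list[int]:
    """Return the exponents of the non-zero terms, highest degree first."""
    return [i for i in range(word.bit_length() - 1, -1, -1) if (word >> i) & 1]


# x^4 + x + 1 has terms with exponents 4, 1, and 0.
print(format(encode([4, 1, 0]), "b"))  # 10011
print(decode(0b10011))                 # [4, 1, 0]
```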
The MULTP instruction 3010 permits two polynomials to be multiplied. For example, (x^4+x+1)(x+1)=x^5+x^4+x^2+2x+1. Reducing the coefficients modulo two yields x^5+x^4+x^2+1. If the polynomials are encoded in the binary representation above, the same multiplication may be expressed as (10011)(11)=110101.
The sizes of the instruction and the operands may be varied arbitrarily; the 32-bit design described is merely by way of example. In a 32-bit implementation, a 32-bit word value stored in rs 3011 may be polynomial-basis multiplied by a 32-bit word value stored in rt 3012, treating both operands as binary polynomial values, to produce a 64-bit result. The low-order 32-bit word may be placed in LO register 2022, and the high-order 32-bit word result may be placed in HI register 2021. In some implementations, no arithmetic exceptions may occur. If the registers specified by rs 3011 and rt 3012 do not contain 32-bit sign-extended values, the result of the operation may be unpredictable.
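A software model of the 32-bit behavior described above is sketched below: two 32-bit words are multiplied as binary polynomials, and the 64-bit result is split into high-order and low-order words corresponding to the HI and LO registers. The function name multp and its tuple return value are hypothetical conveniences; the sketch models the arithmetic result only and says nothing about how a hardware multiplier would produce it.

```python
MASK32 = 0xFFFFFFFF


def multp(rs: int, rt: int) -> tuple[int, int]:
    """Return (HI, LO) for the binary polynomial product of two 32-bit words."""
    rs &= MASK32
    rt &= MASK32
    product = 0
    for bit in range(32):
        if (rt >> bit) & 1:
            product ^= rs << bit   # add rs(x) * x^bit without carries
    # Two operands of degree at most 31 give a product of degree at most 62,
    # so the result always fits in 64 bits.
    return (product >> 32) & MASK32, product & MASK32


hi, lo = multp(0b10011, 0b11)
print(hi, format(lo, "b"))  # 0 110101
```

After such an operation, the two halves would be read back separately, in the manner described for the MFHI and MFLO instructions.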
Referring to
The MADDP instruction 3020 performs multiplication as discussed above. Binary polynomial addition is analogous to a bitwise XOR. For example, the binary polynomial addition (x^4+x+1)+(x+1) gives x^4+2x+2. Reducing the coefficients modulo 2 yields x^4, which may be expressed as “10000”.
Similarly, the sizes of the instruction and the operands may be varied arbitrarily. In one implementation, a 32-bit word value stored in rs 3021 may be polynomial-basis multiplied by a 32-bit word value stored in rt 3022, treating both operands as binary polynomial values, to produce a 64-bit result. This result then may be polynomial-basis added to the contents of the HI register 2021 and the LO register 2022. The 64-bit result includes a low-order 32-bit word and a high-order 32-bit word. The low-order 32-bit word may be placed in LO register 2022, and the high-order 32-bit word result may be placed in HI register 2021. If the registers specified by rs 3021 and rt 3022 do not contain 32-bit sign-extended values, the result of the operation may be unpredictable.
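The multiply-add behavior described above may be modeled in the same way: the binary polynomial product of two 32-bit words is XORed into the 64-bit value held in the HI and LO registers. The function names clmul32 and maddp are hypothetical, and HI and LO are passed in and returned explicitly here only so that the sketch is self-contained.

```python
MASK32 = 0xFFFFFFFF


def clmul32(a: int, b: int) -> int:
    """Carry-free (binary polynomial) product of two 32-bit words."""
    a &= MASK32
    b &= MASK32
    product = 0
    for bit in range(32):
        if (b >> bit) & 1:
            product ^= a << bit
    return product


def maddp(hi: int, lo: int, rs: int, rt: int) -> tuple[int, int]:
    """Return the new (HI, LO) after HI/LO <- HI/LO XOR (rs * rt)."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc ^= clmul32(rs, rt)         # polynomial addition is a bitwise XOR
    return (acc >> 32) & MASK32, acc & MASK32


# Start with x^4 + x + 1 in LO and accumulate the product 1 * (x + 1):
hi, lo = maddp(0, 0b10011, 0b1, 0b11)
print(hi, format(lo, "b"))  # 0 10000, i.e. x^4
```

Because polynomial addition has no carries, accumulating the same product twice restores the original HI/LO contents.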
In addition to polynomial arithmetic implementations using hardware (e.g., within a microprocessor or microcontroller), implementations also may be embodied in software disposed, for example, in a computer usable (e.g., readable) medium configured to store the software (i.e., a computer readable program code). The program code causes the enablement of the functions or fabrication, or both, of the systems and techniques disclosed herein. For example, this can be accomplished through the use of general programming languages (e.g., C, C++), hardware description languages (HDL) including Verilog HDL, VHDL, AHDL (Altera HDL) and so on, or other available programming and/or circuit (i.e., schematic) capture tools. The program code can be disposed in any known computer usable medium including semiconductor, magnetic disk, optical disk (e.g., CD-ROM, DVD-ROM) and as a computer data signal embodied in a computer usable (e.g., readable) transmission medium (e.g., carrier wave or any other medium including digital, optical, or analog-based medium). As such, the code can be transmitted over communication networks including the Internet and intranets.
It is understood that the functions accomplished and/or structure provided by the systems and techniques described above can be represented in a core (e.g., a microprocessor core) that is embodied in program code and may be transformed to hardware as part of the production of integrated circuits. Also, the systems and techniques may be embodied as a combination of hardware and software. Accordingly, other implementations are within the scope of the following claim.
Number | Date | Country |
---|---|---|
20020116428 A1 | Aug 2002 | US |