Title of Invention  METHOD AND APPARATUS FOR MODULAR MULTIPLYING AND CALCULATING UNIT FOR MODULAR MULTIPLYING 

Abstract  The invention relates to a method for modular multiplying a multiplicand (C) by a multiplier (M) using a modulus (N), said multiplicand (C), said multiplier (M) and said modulus (N) being polynomials of a variable (x) over the field GF(2^n), within a cryptographic calculation, said multiplicand (C), said multiplier (M) and said modulus (N) being parameters in said cryptographic calculation, said method comprising the following steps: performing (210) a multiplication lookahead method to obtain a multiplication shift value (sz), said multiplication shift value (sz) being incremented at a power of said multiplier, which is not present in the multiplier polynomial; multiplying (214) said variable (x) raised to the power of said multiplication shift value (sz) by an intermediate result polynomial (Z) to obtain a shifted intermediate result polynomial (Z'); performing a reduction lookahead method (212) to obtain a reduction shift value (sN), said reduction shift value (sN) being equal to the difference of the degree of said shifted intermediate result polynomial (Z') and the degree of said modulus polynomial (N); multiplying (216) said variable (x) raised to the power of said reduction shift value (sN) by said modulus polynomial (N) to obtain a shifted modulus polynomial (N'); summing (218) said shifted intermediate result polynomial (Z') and said multiplicand (C) and subtracting said shifted modulus polynomial (N') to obtain an updated intermediate result polynomial (Z); and repeating (226) steps (a) to (e) until all the powers of said multiplier (M) have been processed so that a multiplication result is obtained, which is a parameter of the cryptographic calculation, wherein in the repetition of steps (a) to (e) in step (d) said updated intermediate result polynomial (Z) of the previous step (e) is used as said intermediate result polynomial (Z), and in step (c) said shifted modulus polynomial (N') of the previous step (d) is used as a modulus polynomial (N).
Full Text  Description
The present invention relates to methods and apparatuses for performing a modular multiplication and, for example, to the modular multiplication for elliptic curves over GF(2^n).
Cryptography is one of the essential applications for modular arithmetic. Depending on the form of the modulus N, two cryptography methods are basically distinguished. If the modulus is an integer, we speak of a Z/NZ arithmetic. The parameter N stands for a prime number or for a product of prime numbers. The parameter Z stands for the integers. The RSA equation is an example of the case in which the modulus is composed of two prime numbers:
C = M^E mod N.
As is known, C is an encrypted message, M is an unencrypted or plain message, E is the public key and N is the modulus.
In contrast, the GF(2^n) arithmetic is characterised in that the modulus N(x) is a polynomial of a variable x. The polynomial includes a sum of individual powers of x, a coefficient being associated with each power of x. The exponent of the highest power of x is called the degree of the polynomial. If the coefficients are from the field GF(2), we speak of a GF(2^n) modulus or, more generally, of a GF(2^n) arithmetic, respectively. The GF(2^n) arithmetic is, for example, used in the cryptography of elliptic curves.
A polynomial f(x) ∈ GF(2)[x] of the degree n-1 is given by the n coefficients a_{n-1}, ..., a_0, wherein the a_i must be from the set GF(2) and wherein a_{n-1} per definition is 1:
f(x) = a_{n-1} x^{n-1} + ... + a_1 x + a_0.
The field GF(2^n) is given by an irreducible polynomial of the degree n and by the polynomials over GF(2) of a degree smaller than or equal to n-1. The addition, in GF(2^n), of two elements, that is polynomials, is given by XORing their coefficient vectors with a length of n.
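The XOR rule for the addition can be illustrated with a minimal Python sketch. The representation is an assumption made here for illustration only: a polynomial over GF(2) is stored as an integer whose bit i is the coefficient of x^i. The helper names `gf2_add` and `poly_to_str` are hypothetical and do not come from the specification.

```python
def gf2_add(a: int, b: int) -> int:
    """Add two GF(2) polynomials; bit i of each int is the coefficient of x^i."""
    return a ^ b  # coefficient-wise addition modulo 2 is a bitwise XOR


def poly_to_str(p: int) -> str:
    """Render a polynomial stored as an int, highest power first."""
    terms = []
    for i in range(p.bit_length() - 1, -1, -1):
        if (p >> i) & 1:
            terms.append("1" if i == 0 else "x" if i == 1 else f"x^{i}")
    return " + ".join(terms) or "0"


f = 0b1011  # x^3 + x + 1
g = 0b0110  # x^2 + x
print(poly_to_str(gf2_add(f, g)))  # x^3 + x^2 + 1  (the two x terms cancel)
print(poly_to_str(gf2_add(f, f)))  # 0  (every element is its own negative)
```

Since each element is its own additive inverse, subtraction in this representation is the same XOR.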
The multiplication, in GF(2^n), of two elements, that is polynomials, is obtained by multiplying the polynomials over GF(2) and subsequently reducing the obtained product modulo the irreducible polynomial N(x) of the degree n, which defines the corresponding field. Thus, the product polynomial, that is the polynomial which results from the multiplication of a first polynomial f(x) by a second polynomial g(x), must be subjected to a polynomial division with the modulus polynomial N(x) as the divisor to perform the modular operation. The result of f(x) * g(x) mod N(x) is the remainder polynomial resulting from the polynomial division.
Before different manners for efficiently performing the modular multiplication over both Z/NZ and GF(2^n) are dealt with, it should be noted that the modular exponentiation over both Z/NZ and GF(2^n) can be split into multiplications by means of the well-known Square and Multiply Algorithm. Thus, the following equation is to be solved:
C(x) = M(x)^E mod N(x).
The Square and Multiply Algorithm is based on the fact that the exponent E is split into a sum of powers of two:
E = sum over i of e_i * 2^i, with e_i ∈ {0, 1}.
The following example is to illustrate this. In binary representation, the following is to apply: E = 1011. Thus, the following relation applies:
E = 1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 1 * 2^0.
Thus, the following applies:
C(x) = M(x)^(2^3) * M(x)^(2^1) * M(x)^(2^0) mod N(x).
For the Z/NZ arithmetic the equations described above apply accordingly, with the difference that, instead of M(x), M must be written and, instead of N(x), N must be written.
In the art, a well-known, efficient and frequently used possibility to calculate the modular multiplication is known as the Montgomery multiplication and is, for example, described in "Handbook of Applied Cryptography", Menezes, van Oorschot, Vanstone, CRC Press, pages 600 to 603. The Montgomery reduction is a technique allowing an efficient implementation of the modular multiplication without the classic modular reduction step being explicitly carried out. Generally, in the Montgomery reduction the division operation is expressed by simple shift operations.
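The classic definition of the multiplication in GF(2^n) given at the beginning of this passage (multiply the polynomials, then take the remainder of the polynomial division by N(x)) can be sketched as follows. This is a minimal Python illustration, not the Montgomery method and not the inventive method; polynomials are assumed to be stored as integers with bit i holding the coefficient of x^i, and the function names are hypothetical.

```python
def clmul(f: int, g: int) -> int:
    """Multiply two GF(2) polynomials without any reduction."""
    product = 0
    while g:
        if g & 1:
            product ^= f  # adding a partial product is an XOR
        f <<= 1           # f(x) * x: move on to the next power of x
        g >>= 1
    return product


def poly_mod(p: int, n_poly: int) -> int:
    """Remainder of the polynomial division p(x) / N(x); N(x) must be nonzero."""
    deg_n = n_poly.bit_length() - 1
    while p.bit_length() - 1 >= deg_n:
        # Align N(x) with the leading term of p(x) and cancel that term.
        p ^= n_poly << (p.bit_length() - 1 - deg_n)
    return p


def mulmod(f: int, g: int, n_poly: int) -> int:
    """f(x) * g(x) mod N(x)."""
    return poly_mod(clmul(f, g), n_poly)


# Example in GF(2^4) with N(x) = x^4 + x + 1:
# (x^2 + x) * (x^2 + x + 1) = x^4 + x, and x^4 = x + 1 mod N(x), so the result is 1.
print(mulmod(0b0110, 0b0111, 0b10011))  # 1
```

The separate division step in `poly_mod` is the part that the shift-based methods discussed in this description aim to avoid or to interleave with the multiplication.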
Meanwhile, an extension of the Montgomery multiplication operation to the finite field GF(2^n) is also known. This extension is described in "Montgomery multiplication in GF(2^k)", Koc, Acar, Designs, Codes and Cryptography, Vol. 14, 1998, pages 57 to 69. This extension is also described in "A Scalable and Unified Multiplier Architecture for Finite Fields Z/NZ and GF(2^n)", Erkay Savas et al., Cryptographic Hardware and Embedded Systems (CHES 2000), pages 281 to 289, Springer Lecture Notes.
It is a disadvantage of the Montgomery multiplication over Z/NZ or GF(2^n) that, although the division operation for the modular reduction, which is difficult to implement in hardware, is replaced by shift operations, no lookahead methods are used to accelerate the modular multiplication operation in hardware.
DE 3631992 C2 discloses a method in which the modular multiplication over Z/NZ can be accelerated using a multiplication lookahead method and a reduction lookahead method. The method described in DE 3631992 C2 is also called the ZDN method and is described in detail referring to Fig. 9. After a starting step 900 of the algorithm, the global variables M, C and N are initialised. It is the objective to calculate the following modular multiplication: Z = M * C mod N. M is called the multiplier, C being called the multiplicand. Z is the result of the modular multiplication, N being the modulus.
Then, various local variables, which do not have to be dealt with now, are initialised. Two lookahead methods are then applied. In the multiplication lookahead method GEN_MULT_LA, a multiplication shift value sz as well as a multiplication lookahead parameter a are calculated (910) using various lookahead rules. The current contents of the Z register is then subjected (920) to a left-shift operation by sz digits. Essentially in parallel to this, a reduction lookahead method GEN_Mod_LA (930) is performed to calculate a reduction shift value sN and a reduction parameter b.
In a step 940, the current contents of the modulus register, that is N, is shifted by sN digits to produce a shifted modulus value N'. The central three-operands operation of the ZDN method takes place in a step 950. Here, the intermediate result Z' obtained after step 920 is added to the multiplicand C, which is multiplied by the multiplication lookahead parameter a, and to the shifted modulus N', which is multiplied by the reduction lookahead parameter b. Depending on the current situation, the lookahead parameters a and b may have a value of +1, 0 or -1. One case is that the multiplication lookahead parameter a is +1 and the reduction lookahead parameter b is -1 so that the multiplicand C is added to a shifted intermediate result Z' and the shifted modulus N' is subtracted therefrom. Among others, a will have a value of 0 if the multiplication lookahead method would allow more than a preset number of individual left shifts, that is, if sz were greater than the maximum allowed value of sz, which is also called k. For the case that a equals 0 and that Z', due to the preceding modular reduction, that is the preceding subtraction of the shifted modulus, is still quite small, and especially smaller than the shifted modulus N', no reduction needs to take place so that the parameter b equals 0.
The steps 910 to 950 are performed until all the digits of the multiplier have been processed, that is, until m equals 0, and until a parameter n also equals 0. The parameter n indicates whether the shifted modulus N' is still greater than the original modulus N and thus whether, despite the fact that all the digits of the multiplier have already been processed, further reduction steps must be performed by subtracting the modulus from Z. Finally, it is determined whether Z is smaller than 0. If this is the case, the modulus N must be added to Z as a final reduction so that in the end a positive result Z of the modular multiplication is obtained.
In a step 960, the modular multiplication by means of the ZDN method is finished. The multiplication shift value sz as well as the multiplication parameter a, which are calculated in step 910 by the multiplication lookahead algorithm, arise from the topology of the multiplier as well as from the lookahead rules used, which are described in DE 3631992 C2. The reduction shift value sN and the reduction parameter b are determined by comparing the current contents of the Z register with a value of 2/3 times N, as is also described in DE 3631992 C2. It is due to this comparison that the ZDN method has its name (ZDN = Zwei Drittel N = two thirds N).
As is illustrated in Fig. 9, the ZDN method reduces the modular multiplication to a three-operands addition (block 950 in Fig. 9), the multiplication lookahead method and, together with it, the reduction lookahead method being utilized to increase the computing time efficiency. Compared to the Montgomery reduction for Z/NZ, a computing time advantage by a factor in the order of magnitude of 3 can thus be obtained.
As has already been explained, the ZDN method described in DE 3631992 C2 only works for the Z/NZ arithmetic. It is, however, not suitable for the GF(2^n) arithmetic. Thus, at present, there is no method in which computing-time-efficient lookahead methods can be utilized to accelerate the modular multiplication over GF(2^n).
It is the object of the present invention to provide a concept for quickly performing a modular multiplication over GF(2^n). This object is achieved by a method for modular multiplying according to claim 1, an apparatus for modular multiplying according to claim 7 or a calculating unit according to claim 11.
The present invention is based on the understanding that an acceleration of the modular multiplication over GF(2^n) can be obtained by utilizing both a multiplication lookahead method and a reduction lookahead method.
In the multiplication lookahead method, a multiplication shift value is calculated. In the reduction lookahead method, which preferably runs in parallel to the multiplication lookahead method, a reduction shift value is calculated, the reduction shift value equalling the difference of the degree of an intermediate result polynomial shifted by the multiplication shift value and the degree of the current modulus polynomial. While the intermediate result polynomial is multiplied by the variable which is raised to the power of the multiplication shift value, the modulus polynomial is multiplied by the variable which is raised to the power of the reduction shift value. Thus, a three-operands addition may also be formulated for the GF(2^n) arithmetic so that a new intermediate result polynomial can be calculated by summing the last intermediate result polynomial shifted by the multiplication shift value and the multiplicand and by then subtracting therefrom the modulus polynomial shifted by the reduction shift value in order to obtain an updated intermediate result polynomial. All the steps are then repeated, however, using the updated intermediate result polynomial and the modulus polynomial shifted in the last step, to successively sum all the partial products, that is, until all the powers of the multiplier have been processed.
The three-operands addition is especially simple in the case of the GF(2^n) arithmetic since the coefficients of the powers of the variable x have either the value "0" or the value "1". Thus, both the addition and the subtraction become a simple XORing so that a calculating unit which is solely adapted for the GF(2^n) arithmetic does not even require an adder as an arithmetic unit but solely a bitwise XORing of the three operands.
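A single three-operands step of this kind can be sketched in Python as follows. The sketch assumes that the shift values sz and sN have already been supplied by the two lookahead methods, that polynomials are stored as integers with bit i holding the coefficient of x^i, and that sN is not negative in this step. The function name `three_operand_step` is hypothetical.

```python
def three_operand_step(z: int, c: int, n_cur: int, sz: int, sn: int):
    """One iteration: shift Z and N, then combine Z', C and N' by XOR.

    z:     current intermediate result polynomial Z
    c:     multiplicand C
    n_cur: current contents of the modulus register N
    sz:    multiplication shift value (assumed to be given)
    sn:    reduction shift value (assumed to be given and >= 0 here)
    Returns the updated Z and the shifted modulus N'.
    """
    z_shifted = z << sz       # Z' = Z * x^sz
    n_shifted = n_cur << sn   # N' = N * x^sN
    # Over GF(2), adding C and subtracting N' are both plain XOR operations.
    return z_shifted ^ c ^ n_shifted, n_shifted


# Example with N(x) = x^4 + x + 1: Z = x^3 + x, C = x^2 + x, sz = 1, sN = 0.
# Z' = x^4 + x^2 has the same degree as N', so the leading terms cancel.
z_new, n_new = three_operand_step(0b1010, 0b0110, 0b10011, sz=1, sn=0)
print(bin(z_new))  # 0b1
```

In the example, sN = 0 is exactly the difference between the degree of Z' (4) and the degree of the current modulus polynomial (4).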
In the case of a dual calculating unit, that is, a calculating unit which is to carry out a modular multiplication in both Z/NZ and GF(2^n), the three-operands adder already present for the ZDN method can simply be modified for GF(2^n) operations by disabling, that is not taking into account, the carry for each bit of the adder.
It should be noted that the inventive method for calculating the modular multiplication in GF(2^n) has a serial-parallel architecture. The three-operands addition preferably always takes place in parallel, that is for all the bits of the addends, which typically have a width of 150 to 1100 bits, while a new partial product is calculated in each next serial iteration of the inventive method and is added to the already existing intermediate result in a subsequent parallel three-operands addition.
The inventive concept for calculating a modular multiplication is advantageous in that it provides a maximum acceleration by a factor in the order of magnitude of two compared with the Montgomery multiplication for GF(2^n) as well.
A further advantage of the inventive concept is that now an efficient method for calculating the modular multiplication in GF(2^n) is given so that, for example, the ECDSA algorithm (ECDSA = Elliptic Curve Digital Signature Algorithm) over GF(2^n) can be calculated. This algorithm is described in "Public Key Cryptography for the Financial Services Industry: The Elliptic Curve D.S.A.", ANSI X9.62 - 1998.
It is pointed out that the elliptic curve cryptography is preferred to a cryptography based on a modular arithmetic regarding an integer in that similar standards of security can be obtained with considerably smaller numbers. While for the RSA method over Z/NZ good standards of security are obtained with numbers having a width of 1024 bits, polynomials over GF(2), the degree of which is in the range of 150 to 300 powers of the variable x, suffice for this.
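The carry disabling mentioned at the beginning of this passage can be modelled with a simplified bit-slice adder in Python. This is a hypothetical software model for illustration only, not the circuit of the calculating unit: a ripple-carry adder for two operands with a switch `carry_enabled`. With the carry enabled it performs an integer addition, with the carry disabled it degenerates into the bitwise XOR needed for GF(2^n).

```python
def ripple_add(x: int, y: int, width: int, carry_enabled: bool = True) -> int:
    """Bit-serial model of an adder whose carry path can be switched off."""
    result, carry = 0, 0
    for i in range(width):
        xb = (x >> i) & 1
        yb = (y >> i) & 1
        result |= (xb ^ yb ^ carry) << i           # sum bit of a full adder
        if carry_enabled:
            carry = (xb & yb) | (carry & (xb ^ yb))  # carry out of a full adder
        else:
            carry = 0                                # carry is not taken into account
    return result


print(ripple_add(0b1011, 0b0110, 8, carry_enabled=True))   # 17  (11 + 6 in Z/NZ mode)
print(ripple_add(0b1011, 0b0110, 8, carry_enabled=False))  # 13  (0b1011 XOR 0b0110)
```

A third operand can be added by applying `ripple_add` a second time; with the carry disabled the result is the XOR of all three operands, regardless of the order.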
A further advantage of the present invention is that the inventive concept for calculating the modular multiplication can easily be integrated into already existing calculating units for the ZDN method since the actual long-number calculating unit, that is the three-operands adder, can simply be adapted for GF(2^n) by disabling the bitwise carry. Even if the arithmetic units for the reduction lookahead algorithm and for the multiplication lookahead algorithm for GF(2^n) are different from the corresponding apparatuses for Z/NZ, this is not decisive for the overall performance of the calculating unit since only additions, shifts or subtractions with small numbers, which may have a width of 8 or 16 bits, occur in these units. The chip area for these arithmetic units, which are also called control units, is therefore not a major contribution compared to the long-number calculating unit, that is the three-operands adder, which can very well have a width of over 2048 bits (in a dual implementation for Z/NZ and GF(2^n)).
A further advantage of the inventive concept for a modular multiplication in GF(2^n) is that, compared to the ZDN method which only works for the Z/NZ arithmetic, many operations can be simplified. Thus, in the inventive GF(2^n) modular multiplication no comparison with 2/3 times the modulus must be performed. In GF(2^n), this comparison can simply be substituted by the comparison of the degree of the intermediate result polynomial and the degree of the modulus polynomial. Since the expected value for the multiplication shift value and the expected value for the reduction shift value are identical, the two lookahead methods are decoupled from each other so that there is the potential for the two lookahead methods to work independently of each other, which in turn brings about computing time advantages.
In the following, preferred embodiments of the present invention are described in detail referring to the accompanying drawings, in which: Fig.
1 is a flow chart for illustrating the modular exponentiation in GF(2^n); Fig. 2 is a high-level flow chart of the inventive method; Fig. 3 is a flow chart of the multiplication lookahead method for calculating the multiplication shift value; Fig. 4 is a flow chart of the reduction lookahead method for calculating the reduction shift value; Fig. 5 is a part of a three-operands adding unit for the GF(2^n) arithmetic or the Z/NZ arithmetic; Fig. 6 is a detailed illustration of the carry disabling function; Fig. 7 is a block diagram of a Z/NZ / GF(2^n) calculating unit; Figs. 8a to 8c are schematic illustrations of the calculation of the reduction shift value; and Fig. 9 is a general diagram of the ZDN method for performing a modular multiplication in Z/NZ.
Fig. 1 is a general flow chart for splitting a modular exponentiation C(x) = (M(x))^E mod N(x) into a series of multiplications. M(x) and N(x) are polynomials of the variable x. E is an exponent in binary representation having a bit length L(E). The algorithm basically consists of examining whether a bit of the exponent E, that is E(e), equals 1. If this is the case, the current contents of the result register is multiplied by M(x), the modulo reduction with the modulus polynomial N(x) being performed immediately after that. If, however, a bit of the exponent equals 0, no multiplication by M(x) is performed. In both cases the current contents of the register C(x) is multiplied by itself, that is, squared, the modulo reduction then taking place. The index for the digit of the exponent, that is e, is then incremented by 1, the loop then being passed through again. This is performed until all the digits of the exponent E have been processed, that is, until e equals L(E). The algorithm is then finished and the register for C(x) contains the result of the modular exponentiation. The central operation of the modular exponentiation is thus the modular multiplication of a multiplicand C(x) by a multiplier M(x).
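The splitting of the modular exponentiation into modular multiplications can be sketched in Python as follows. Two points are assumptions of this sketch and not statements about Fig. 1: the exponent bits are scanned from the most significant bit downwards, and the squaring is performed before the conditional multiplication by M(x), which is the common left-to-right form of the Square and Multiply Algorithm. The helper `gf2_mulmod` is a plain multiply-then-divide stand-in for the modular multiplication; all names are hypothetical.

```python
def gf2_mulmod(f: int, g: int, n_poly: int) -> int:
    """f(x) * g(x) mod N(x) over GF(2); bit i of an int is the coefficient of x^i."""
    product = 0
    while g:
        if g & 1:
            product ^= f
        f, g = f << 1, g >> 1
    deg_n = n_poly.bit_length() - 1
    while product.bit_length() - 1 >= deg_n:
        product ^= n_poly << (product.bit_length() - 1 - deg_n)
    return product


def gf2_modexp(m_poly: int, e: int, n_poly: int) -> int:
    """C(x) = M(x)^E mod N(x) by square and multiply."""
    c = 1                                       # result register C(x)
    for bit in bin(e)[2:]:                      # exponent bits, most significant first
        c = gf2_mulmod(c, c, n_poly)            # squaring, needed for every bit
        if bit == "1":
            c = gf2_mulmod(c, m_poly, n_poly)   # multiplication only for 1 bits
    return c


# Example: in GF(2^4) with N(x) = x^4 + x + 1 the element x has order 15.
print(gf2_modexp(0b10, 15, 0b10011))  # 1
```

For the exponent E = 1011 from the earlier example this performs four squarings and three multiplications by M(x), each of which is one modular multiplication.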
Fig. 2 shows a high-level block diagram of the inventive method for modular multiplying a multiplicand by a multiplier. The method starts at a start block 200. In a block 202, the global variables M, C and N, which are polynomials of the variable x, are initialised. In a block 204, the intermediate result polynomial Z is then initialised to 0. In a block 206, the control variable m is initialised to L(M). L(M) indicates the length of the multiplier M in bits. L(M) thus corresponds to the degree of the multiplier polynomial. In a block 208, a control variable n is initialised to 0. The function of the control variable n will be explained later.
Subsequently, a multiplication lookahead method 210 and a reduction lookahead method 212 are preferably performed in parallel. The multiplication lookahead method serves to calculate a multiplication shift value sz and preferably a multiplication lookahead parameter a as well. The reduction lookahead method serves to calculate a reduction shift value sN and preferably a reduction lookahead parameter b as well.
In a block 214, a shifted intermediate result polynomial Z' is calculated by multiplying the current intermediate result polynomial Z by the variable x which is raised to the power of the multiplication shift value sz. In a block 216, a shifted modulus polynomial N' is calculated, preferably in parallel, by multiplying the current modulus polynomial N by the variable x which is raised to the power of the reduction shift value sN.
In a block 218, the so-called three-operands addition, which is the central operation of the inventive multiplication method, is carried out. In block 218, an updated intermediate result polynomial Z is calculated, which is obtained by the addition of the shifted intermediate result polynomial Z', of the multiplicand C multiplied by the multiplication lookahead parameter a and of the shifted modulus polynomial N' multiplied by the reduction lookahead parameter b.
In a block 220, it is examined whether the control variable m equals 0 and whether the control variable n equals 0 at the same time. If the control variable m equals 0, this means that all the bits of the multiplier M(x) have been processed. If the control variable n equals 0, this means that the shifted modulus polynomial N' corresponds to the original polynomial N of block 202 again.
If these two conditions are met, block 220 will thus be answered with a YES so that the result of the modular multiplication, that is Z(x), can be output in a block 222. The method for modular multiplying is then finished in a block 224.
If block 220, however, is answered with a NO, this either means that there are still bits of the multiplier which have not been processed or that the modulus polynomial N' which is held in the register for the modulus polynomial is still greater than the original modulus polynomial defined in block 202. Expressed differently, this means that the degree of the current polynomial held in the register for the modulus polynomial is greater than the degree of the original modulus polynomial N which has been defined in block 202. If this is the case, a return will be performed, as is shown by a feedback 226 in Fig. 2, to carry out both the multiplication lookahead method and the reduction lookahead method again. In contrast to the first step, in which the Z register, due to the initialisation in block 204, has been set to 0, it is now the result of the three-operands operation 218 of the previous step which is in the Z register. In the same way, it is no longer the original modulus N defined in block 202 which is in the modulus register N, but the modulus polynomial N' which has been shifted by the reduction shift value sN.
The original modulus N(x) which has been defined in block 202 will thus only be in the N register during the first iteration step, while during the iteration (iteration loop 226) it is always the shifted modulus polynomial which is in the modulus register, that is a modulus polynomial which has been multiplied by the variable x which is raised to the power of a reduction shift value sN.
Reference will now be made to Fig. 3, which shows a more detailed illustration of the multiplication lookahead method, that is of block 210 in Fig. 2. The multiplication lookahead method starts at a block 300. It receives, as global variables, the parameter m of Fig. 2, a further control variable curk, which will be explained later, as well as the multiplier M. This is illustrated by a block 302 in Fig. 3. In a block 304, the multiplication shift value sz is initialised to 0. Furthermore, the multiplication lookahead parameter a, which will be explained later, is initialised to a value of 1 (block 306). It is then examined in a block 308 whether the current bit, that is the coefficient of the currently processed power of x, equals 0 or not. If it is determined in block 308 that the currently processed bit of the multiplier is unequal to 0, that is if the determination of block 308 is answered with a YES, the control variable m will be incremented by 1 in a block 310. Furthermore, the multiplication shift value sz is also incremented by 1 in a block 312. In a block 314, the resulting parameters of the multiplication lookahead method, that is the multiplication lookahead parameter a and the multiplication shift value sz, are output.
If the question in block 308 is answered with a NO, a jump to a further determination block 316 will be performed. It is determined here whether the control variable m is still smaller than the length, that is the degree, of the multiplier M.
In addition, it is examined whether the current multiplication shift value sz is still smaller than, that is unequal to, the parameter curk. If both questions are answered with a YES, a jump to a block 318 will be performed to increment the parameter m by 1. Furthermore, in a block 320, the multiplication shift value sz is also incremented by 1. Subsequently, the next bit of the multiplier M is examined, which is illustrated in Fig. 3 by a feedback branch 322. If it is, however, determined in block 316 that one of the two questions is answered with a NO, a jump to a block 324 will be performed, in which the multiplication lookahead parameter a is set to 0. It can thus be seen that the multiplication lookahead parameter a which is output in block 314 can either be 0 or 1. The multiplication lookahead method is then finished in a block 326.
In the following, the mode of operation of the multiplication lookahead method will be explained. The multiplication lookahead method which is utilized according to the invention is a lookahead algorithm for the GF(2^n) multiplication with variable shifts over zeros, wherein the number of the variable shifts cannot become arbitrarily great but can, at most, be equal to a value CURk. "CURk" means "current k", that is "current value of the parameter k".
In the following, a multiplier polynomial with the coefficients "10001" will be examined as an example. First, the most significant bit thereof is examined. This bit has the value "1" so that block 308 is answered with a YES, which leads to the parameter m being incremented by 1 and to the multiplication shift value sz being increased by 1 as well. The multiplication lookahead algorithm is thus already finished since the examined bit of the multiplier had the value "1", so that in the three-operands addition the multiplicand C must be added. In a next passage of the multiplication lookahead algorithm the second bit is examined.
This bit has a value of 0 so that block 308 is answered with a NO. Since the examined bit is just the second bit of the multiplier and since the multiplication shift value sz, due to the initialisation in block 304, is 0, block 316 will be answered with a YES so that the control variable m is incremented by 1 (318) and the multiplication shift value is also incremented by 1 (320). Block 308 is then entered again via the branch 322. Since the next bit also has a value of "0", this block is again answered with a NO and block 316 is again the current one. m is still smaller than L(M) so that this question is answered positively. sz now has the value 1. When it is supposed that CURk has a value of 2, this question is also answered positively so that in blocks 318 and 320 increments of m and sz take place again. After the passage of block 320, sz has the value of 2. A jump via the branch 322 to block 308 is again performed to determine whether the next bit is a 1 or a 0. For the present example, block 308 is again answered with a NO since in this case the third bit in a sequence of zeros is examined. Block 316, however, is now answered with a NO since sz is 2 and the variable CURk is also 2. This means that the multiplication lookahead method is, so to speak, cancelled even if the third 0 could also be used to effect a further shift. sz, however, must be limited upwards since otherwise an infinitely long Z register would have to be provided to be able to store the shifted intermediate result polynomial Z', which is calculated in step 214 of Fig. 2. CURk is thus set depending on the current movement of the Z register to allow the greatest possible shift value sz, which contributes to a gain in speed, on the one hand, but to manage with a limited register length for the shifted intermediate result polynomial Z' on the other hand. The three-operands operation in block 218 of Fig.
2 thus degenerates into a two-operands operation since the parameter a in block 324 of Fig. 3 has been set to 0. As can be seen from Fig. 3, in the branch of block 324 no further increment of m took place so that in a renewed passage of the multiplication lookahead algorithm it is now the third 0 bit in the sequence that is examined in block 308. Since this bit has a value of 0, block 308 is again answered with a NO so that the multiplication shift value sz is incremented by 1 and the control variable m is also incremented in block 318. The last bit of the multiplier, that is the "1", is now examined. Since this bit is unequal to 0, block 308 is answered with a YES, the control variable is incremented for a last time and the multiplication shift value sz is also incremented again, whereupon the multiplication lookahead algorithm for this iteration is finished (block 326). All the bits of the multiplier have now been examined so that the iteration loop 226 of Fig. 2 is finished since it is examined in block 220 whether m equals 0, which now applies for the present example.
In the following, reference is made to Fig. 4 to describe the reduction lookahead method which, in Fig. 2, is designated with the reference numeral 212. In a block 400, the reduction lookahead method starts. In a block 402, various global variables are defined, of which especially N and Z are to be emphasised. N is the register value for the modulus polynomial of the preceding step, while Z is the updated intermediate result polynomial of the preceding step. k is the maximum shift value for Z, CURk is the current shift value for Z and MAX is the length, that is the number of bits, of an overflow buffer which serves to store the left-shifted polynomials N and Z. When block 216 of Fig.
2 is considered, it can be seen that, if an arbitrarily great reduction shift value sN were allowed, an arbitrarily great register for N would have to be provided, as in the analogous case of the multiplication lookahead method. This, however, is not desirable for reasons of space and efficiency so that by means of the parameter MAX it is also taken into account that the modulus polynomial, too, can only be shifted by a certain number of bits to the left, that is upwards.
In a block 404, a parameter si which will be described later is initialised to 0. It is then determined in a block 406 whether the parameter n, which indicates the number of bits of N which are in the overflow buffer, equals 0, or whether si equals k. If block 406 is answered with a YES, a jump to a block 408 will be performed, in which the reduction lookahead parameter b is set to 0. If, however, the question in block 406 is answered with a NO, the parameter n is incremented by 1 (block 410). At the same time, the parameter si is incremented by 1, as is illustrated by a block 412. Subsequently, the central comparison takes place in a block 414, by means of which it is to be determined by how much the modulus polynomial is to be shifted so that in the three-operands operation (block 218 of Fig. 2) a modular reduction of the intermediate result polynomial takes place. For this, the auxiliary reduction shift value si is determined so that the degree of the polynomial which arises from multiplying the updated intermediate result polynomial of the preceding step by x raised to the power of si is equal to the degree of the current modulus polynomial. This is carried out step by step, as is indicated by an iteration loop 416, until either a YES result is obtained in block 406 or a YES result is obtained in block 414. If block 414 is answered with a YES, the reduction lookahead parameter b will be set to 1 in a block 418.
In a block 420, a new parameter n is then calculated from the difference of the multiplication shift value sz and the current value n. The actual reduction shift value sN is then calculated in a block 422 by forming the difference of the multiplication shift value sz and the auxiliary reduction shift value si. It is pointed out that the multiplication shift value sz is provided by the multiplication lookahead algorithm, which actually runs in parallel, as is indicated by an arrow 230 in Fig. 2. Without the introduction of the auxiliary reduction parameter si, only a serial implementation would be feasible, in which the multiplication lookahead method is followed by the reduction lookahead method, which is not desirable for reasons of efficiency. Therefore, the auxiliary reduction parameter si is used, by means of which the calculation of the reduction shift value sN can be prepared to such an extent that the extensive iteration loop (branch 416 in Fig. 4) can actually be processed in parallel to the multiplication lookahead algorithm, while the actual calculation of the reduction shift value sN is performed by quickly forming the difference of the two short numbers sz and si.

Thus the sequence is the following: sz and si are calculated in parallel. sz is then delivered from the multiplication lookahead algorithm via the branch 230 of Fig. 2, which can also be seen in Fig. 4, to the reduction lookahead algorithm, so that the reduction shift value sN is directly available in the next cycle. This will be explained later referring to Figs. 8a to 8c.

After block 422, it is determined in a block 424 whether n is greater than MAX minus k. If this question is answered with YES, a new curk is calculated in a block 426. If the question in block 424 is answered with NO, curk is set equal to k in a block 428.
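The division of work described above, a slow search for si that does not need sz followed by a single subtraction once sz arrives, can be illustrated with a few lines. The sketch assumes polynomials stored as Python integers; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def degree(p: int) -> int:
    """Degree of a GF(2) polynomial held as an int; -1 for the zero polynomial."""
    return p.bit_length() - 1


def reduction_shift(sz: int, si: int) -> int:
    """Fast final step (compare block 422): sN is the difference of sz and si."""
    return sz - si


z = 0b1011      # assumed intermediate result polynomial, degree 3
n = 0b100101    # assumed modulus polynomial, degree 5

# Prepared without knowing sz; taken directly from the degrees here
# instead of the iterative loop.
si = degree(n) - degree(z)

sz = 6          # assumed to be delivered later by the multiplication lookahead
sn = reduction_shift(sz, si)

# After shifting, both polynomials have the same degree, so that the
# subtraction of the shifted modulus cancels the leading coefficient.
assert degree(z << sz) == degree(n << sn)
```

Only the last subtraction depends on sz, which is why the two lookahead methods can run almost entirely side by side.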
In a block 430, the result values of the reduction lookahead method, that is b and sN, are output, so that the reduction lookahead method is finished in a block 432. For a detailed explanation of the multiplication lookahead parameter a and the reduction lookahead parameter b as well as the storage management parameters n, MAX, k and curk, reference is made to DE 3631992 C2. Unlike the ZDN method for Z/NZ, in which the parameters a and b can take values of +1, 0 and -1, the corresponding parameters a and b in the inventive method can only take the values of 0 and 1.

The lookahead parameters a and b are only optionally required for the modular multiplication according to the present invention, namely in the case that storage locations of arbitrary size are not available for N and Z. Generally, however, the inventive method can easily be carried out under the condition that arbitrarily great registers are available, in which case the multiplication lookahead method is never cancelled but always performed until a "1" is found in the multiplier. Until then, referring to block 214 of Fig. 2, sz may have reached a great value, so that the shifted intermediate result polynomial Z' can possibly take a very great value. Since a 1 has been found in the multiplier, the multiplicand is then added to the shifted intermediate result polynomial Z' in block 218. It is, however, an essential feature that a modular reduction also takes place concurrently with each multiplication step, so that the numerical values as a whole can be kept to a tolerable level. For this, the reduction shift value sN according to the present invention is selected in such a way that the degree of the shifted modulus polynomial is equal to the degree of the current, that is shifted, intermediate result polynomial Z'.
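A minimal sketch of the multiplication lookahead for both cases discussed above, with and without a limit on the shift value, might look as follows. It assumes that the multiplier is given as a string of bits with the most significant bit first and that the shift value also counts the terminating 1 bit, as in the walk-through of Fig. 3; the function name and the return convention are hypothetical.

```python
from typing import Optional, Tuple


def multiplication_lookahead(bits: str, pos: int,
                             k: Optional[int] = None) -> Tuple[int, int, int]:
    """Scan the multiplier bits starting at index pos.

    Returns (sz, a, new_pos). a = 1 if the scan ended on a 1 bit, so that the
    multiplicand is to be added; a = 0 if the scan was cancelled at the limit k
    or ran out of bits, so that only the shift is to be carried out.
    With k = None the registers are assumed to be unlimited.
    """
    sz = 0
    while pos < len(bits):
        bit = bits[pos]
        pos += 1
        sz += 1
        if bit == "1":
            return sz, 1, pos
        if k is not None and sz == k:
            break
    return sz, 0, pos


print(multiplication_lookahead("0001", 0))        # (4, 1, 4)
print(multiplication_lookahead("0001", 0, k=2))   # (2, 0, 2)
print(multiplication_lookahead("0001", 2, k=2))   # (2, 1, 4)
```

With k = 2 the single shift by 4 is split into a pure shift by 2 (a = 0) followed by a shift by 2 with addition of the multiplicand (a = 1), which corresponds to the degenerated two-operands case.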
If the shifted modulus polynomial is then subtracted from the sum of Z'(x) and C(x), the updated intermediate result Z will typically be smaller than Z', so that a reduction has been obtained. Thus, it can be seen that the updated intermediate result polynomial Z, which is calculated by step 218 in Fig. 2, is not necessarily reduced with regard to the original modulus polynomial of block 202, but may be reduced during the whole iteration only with regard to a left-shifted modulus polynomial, that is a modulus polynomial with a higher degree. However, this need not be the case. If this case does arise, step 220, in which it is determined whether n equals 0, that is whether N has bits in the overflow buffer or not, ensures that further subtractions of the modulus from the updated intermediate result take place, so that Z is progressively reduced into the original remainder class. If n equals 0, there are no more bits of N in the overflow buffer, which in turn means that the finally obtained shifted modulus polynomial equals the original modulus polynomial of block 202.

Thus, it can be seen that the inventive method for modular multiplying can basically also be carried out without the lookahead parameters a and b, in which case, however, theoretically unlimited registers for Z and N would be required if arbitrary multipliers were assumed. If a storage limit is provided for Z and N, that is if the lookahead parameters a and b may be 0, a multiplication lookahead parameter a equal to 0 means that no multiplicand is added to the shifted Z'. Analogously, a reduction lookahead parameter b equal to 0 means that the shifted modulus polynomial is greater than the shifted intermediate result polynomial Z', for which reason no reduction is required, so that the modulus subtraction can be omitted as well. In such a case the three-operands operation would degenerate completely.
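The interplay of shift, addition of the multiplicand and subtraction of the shifted modulus for the case of unlimited registers can be sketched in a few lines. This is an illustration under simplifying assumptions and not the register-level method of Fig. 2: polynomials are stored as Python integers, the reduction shift value is recomputed from the degrees in every pass instead of being tracked through si and the overflow buffer, and the function names are hypothetical. The check values use the reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) known from AES.

```python
def degree(p: int) -> int:
    """Degree of a GF(2) polynomial held as an int; -1 for the zero polynomial."""
    return p.bit_length() - 1


def gf2_modmul(c: int, m: int, n: int) -> int:
    """Return C * M mod N for GF(2) polynomials stored as ints."""
    if n == 0:
        raise ValueError("modulus polynomial must not be zero")
    bits = bin(m)[2:]                 # multiplier bits, most significant first
    z = 0
    pos = 0
    while pos < len(bits):
        # Multiplication lookahead: skip zeros and stop after the next 1.
        sz, a = 0, 0
        while pos < len(bits) and a == 0:
            a = int(bits[pos])
            pos += 1
            sz += 1
        z_shifted = z << sz           # Z' = x^sz * Z
        # Reduction lookahead: align the degree of N' with that of Z'.
        sn = degree(z_shifted) - degree(n)
        b = 1 if sn >= 0 else 0
        n_shifted = (n << sn) if b else 0   # N' = x^sN * N
        # Three-operands operation; in GF(2) adding and subtracting are XOR.
        z = z_shifted ^ (c if a else 0) ^ n_shifted
    # Further subtractions until Z lies in the original remainder class.
    while degree(z) >= degree(n):
        z ^= n << (degree(z) - degree(n))
    return z


print(hex(gf2_modmul(0x57, 0x83, 0x11B)))  # 0xc1
print(hex(gf2_modmul(0x57, 0x13, 0x11B)))  # 0xfe
```

The closing loop plays the role of the further subtractions mentioned above: during the iteration Z is only reduced with regard to a shifted modulus, and only at the end is it brought below the degree of the original modulus.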
At this stage it is also pointed out that, in the case of a limited buffer for the registers Z and N, care must be taken that N is kept from its Home MSB by at least k bits as long as the variable m has not reached the value 0.

It is further pointed out that in the case of a GF(2^n) arithmetic, that is if the coefficients of the polynomial can only be 0 or 1, the addition operation corresponds to the subtraction operation and can generally be carried out as an XORing. If, however, coefficients of the polynomial were allowed in a different number system, for example an octal or a decimal number system, the subtraction would, of course, not correspond to the addition.

In the following, reference is made to Figs. 8a to 8c to illustrate the calculation of the reduction shift value sN using the auxiliary reduction shift value si. In Fig. 8a, an intermediate result polynomial Z and a modulus polynomial N are illustrated. Merely as an example, the intermediate result polynomial has a degree of 4, that is 4 bits, while the modulus polynomial has a degree of 9, that is 9 bits. It is further assumed that in block 214 of Fig. 2 a shifted intermediate result polynomial Z' is calculated, which is obtained by multiplying Z by the variable x raised to the power of sz. It is assumed that there were 8 zeros in the multiplier, so that the multiplication shift value sz was 8. To obtain a modular reduction, the modulus N must reach the order of magnitude of the shifted intermediate result polynomial Z'. According to the invention, the modulus polynomial N is to be shifted so far that the degree of the shifted intermediate result polynomial Z' and the degree of the shifted modulus polynomial N are equal. As can be seen from Fig. 8b, a reduction shift value sN equal to 3 is required for this. It can also be seen from Fig. 
8b that sN can really only be determined once sz has been calculated, which would mean that a parallel execution of blocks 210 and 212 of Fig. 2, as is preferred for the present invention, is not possible. For this reason, the auxiliary shift parameter si is introduced. As can be seen from Fig. 8a, the auxiliary shift parameter si equals the difference between the degree of the intermediate result polynomial Z and the degree of the modulus polynomial N. It is an advantage of si that this value can be calculated without knowing the sz of the current step. It can be seen from Fig. 8c that sz always equals the sum of si and sN. sN thus always depends on sz and si in such a way that the following equation applies: sN = sz - si. The time-consuming iterative method for determining sN can thus be split into a time-consuming iterative method for determining si (loop 416) and a fast difference operation (block 422 in Fig. 4). Thus, a nearly parallel execution of both lookahead methods is possible, the only serial component being that, before block 422 (Fig. 4) is calculated, the actual value of sz must have been calculated and provided by the multiplication lookahead algorithm (arrow 230 in Fig. 2).

As has already been explained, an essential advantage of the inventive concept for calculating the modular multiplication over GF(2^n) is that it can be integrated into the already existing long-number calculating unit for the ZDN method. Fig. 5 shows a part of an inventively adapted three-operands calculating unit for performing the three-operands addition with Z, aC and bN. In Fig. 5, three bit slices [i], [i-1], [i-2] connected to one another are illustrated. Each bit slice includes a three-bit counter 500 and a full adder 510 to obtain, on the output side, a bit Z[i], Z[i-1] and Z[i-2], respectively, of the updated intermediate result polynomial. The full adder further has a carry output for the carry input of the next higher full adder.
If, for example, polynomials with a degree of 200 are processed, 200 bit-slice adders of Fig. 5 must be connected in parallel. To modify a bit slice for GF(2^n), as is shown in Fig. 5, an AND gate 520 must be inserted between the upper output of the three-bit counter and the second lowest input of the full adder of the next higher stage. If a 0 is fed into the enable input 530, the value x will always be 0. The function of the full adder 510 then always degenerates to an addition of y and 0. In the case of Z/NZ, however, the enable input of the AND gate is provided with a "1", so that the AND gate has no further effect. In GF(2^n), the output of the AND gate is thus 0. In Z/NZ, however, x is required, and the output of the AND gate can be unequal to 0. The enabling is thus realised by an AND gate. The addition in the full adder, however, becomes trivial for the case of GF(2^n), that is with a 0 at the enable input 530.

Fig. 6 shows the situation at the AND gate 520. The calculating unit, which is partially illustrated in Fig. 5, has the effect of a normal adder if the enable signal SC = 1. It has, however, the effect of an XOR circuit if the enable signal SC = 0.

Fig. 7 shows a schematic block diagram of a calculating unit for Z/NZ and GF(2^n). The calculating unit is grouped around the long-number arithmetic unit 700, which performs the three-operands operation already described either for Z/NZ or for GF(2^n). The calculating unit further includes a Z/NZ control unit 710 as well as a GF(2^n) control unit 720 and a mode selection apparatus 730. If the calculating unit is to calculate operations modulo an integer, the mode selection 730 controls the arithmetic unit 700 in such a way that a true addition operation is performed, the arithmetic unit being connected to the Z/NZ control unit 710 on the input side and the output side.
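The effect of the enable signal on the bit slices of Figs. 5 and 6 can be reproduced with a small software model. The sketch is only a behavioural illustration under stated assumptions: the three operands are taken as non-negative integers that are simply added (the signs of aC and bN in the Z/NZ case are ignored), the width parameter and the function name are hypothetical, and the bit slices are evaluated one after another instead of in parallel.

```python
def three_operand_add(z: int, c: int, n: int, width: int, sc: int) -> int:
    """Model of `width` chained bit slices with enable signal sc.

    sc = 1: the counter's high-order output reaches the next slice, so the
            result is the integer sum z + c + n modulo 2**width.
    sc = 0: the AND gate blocks that output, no carry can arise, and the
            result is the bitwise XOR z ^ c ^ n.
    """
    result = 0
    counter_high_prev = 0   # high-order counter output of the next lower slice
    carry = 0               # carry output of the next lower full adder
    for i in range(width):
        # Three-bit counter (compare 500): count the ones among the input bits.
        count = ((z >> i) & 1) + ((c >> i) & 1) + ((n >> i) & 1)
        y, high = count & 1, count >> 1
        # AND gate (compare 520) controlled by the enable input (compare 530).
        x = counter_high_prev & sc
        # Full adder (compare 510): adds y, x and the incoming carry.
        total = y + x + carry
        result |= (total & 1) << i
        carry = total >> 1
        counter_high_prev = high
    return result


print(three_operand_add(0b1011, 0b0110, 0b0101, 8, sc=1))  # 22, that is 11 + 6 + 5
print(three_operand_add(0b1011, 0b0110, 0b0101, 8, sc=0))  # 8, that is 11 ^ 6 ^ 5
```

With sc = 0 the value x is always 0, so the full adder only ever sees y and 0 and never produces a carry, which is the degeneration to an XOR circuit described above.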
If, however, the calculating unit is to operate a GF(2^n) arithmetic, the mode selection 730 controls the arithmetic unit 700 in such a way that an XOR operation is performed instead of an addition, and that the input and the output of the arithmetic unit are connected to the GF(2^n) control unit 720. Thus, separate arithmetic units are no longer required to house both an integer modulo arithmetic and a polynomial modulo arithmetic in one calculating unit. It is pointed out that, since the three-operands operation is performed for all bits in parallel, most of the chip area is consumed by the arithmetic unit 700, while the further, smaller calculations which are to be carried out in the control units 710 and 720 can be managed with far shorter numbers, so that they hardly make a difference as far as chip area is concerned. In contrast to a calculating unit which requires an individual arithmetic unit for each of the integer arithmetic and the polynomial arithmetic, the inventive concept for calculating the modular multiplication thus allows a reduction of the chip area by almost 50%. Especially for smart cards, this significant saving in chip area leads to considerable competitive advantages.

We Claim

1. 
Method for modular multiplying a multiplicand (C) by a multiplier (M) using a modulus (N), said multiplicand (C), said multiplier (M) and said modulus (N) being polynomials of a variable (x) from the body GF(2^n), within a cryptographic calculation, said multiplicand (C), said multiplier (M) and said modulus (N) being parameters in said cryptographic calculation, said method comprising the following steps:

(a) performing (210) a multiplication lookahead method to obtain a multiplication shift value (sz), said multiplication shift value (sz) being incremented at a power of said multiplier, which is not present in the multiplier polynomial;

(b) multiplying (214) said variable (x) raised to the power of said multiplication shift value (sz) by an intermediate result polynomial (Z) to obtain a shifted intermediate result polynomial (Z');

(c) performing a reduction lookahead method (212) to obtain a reduction shift value (sN), said reduction shift value (sN) being equal to the difference of the degree of said shifted intermediate result polynomial (Z') and the degree of said modulus polynomial (N);

(d) multiplying (216) said variable (x) raised to the power of said reduction shift value (sN) by said modulus polynomial (N) to obtain a shifted modulus polynomial (N');

(e) summing (218) said shifted intermediate result polynomial (Z') and said multiplicand (C) and subtracting said shifted modulus polynomial (N') to obtain an updated intermediate result polynomial (Z); and

(f) repeating (226) steps (a) to (e) until all the powers of said multiplier (M) have been processed so that a multiplication result is obtained, which is a parameter of the cryptographic calculation,

wherein in the repetition of steps (a) to (e), in step (b) said updated intermediate result polynomial (Z) of the previous step (e) is used as said intermediate result polynomial (Z), and in step (c) said shifted modulus polynomial (N') of the previous step (d) is used as a modulus polynomial (N).

2. 
Method as claimed in claim 1, wherein said multiplying (214) in step (b) is carried out by shifting said intermediate result polynomial (Z) by a number of digits equalling said multiplication shift value (sz), and wherein said multiplying (216) in step (d) is carried out by shifting said modulus polynomial (N) by a number of digits equalling said reduction shift value (sN).

3. Method as claimed in claims 1 or 2, wherein coefficients of said polynomials can only take the values "0" or "1", and wherein said summing and subtracting (218) in step (e) is carried out by bitwise XORing said shifted intermediate result polynomial (Z'), said multiplicand (C) and said shifted modulus polynomial (N').

4. Method as claimed in one of the preceding claims, wherein said step of said reduction lookahead method (212) to obtain a reduction shift value (sN) comprises the following steps: determining (414) an auxiliary shift value (si) so that the degree of said modulus polynomial (N) and the degree of said updated intermediate result polynomial (Z) of the previous step (e) multiplied by a variable which is raised to the power of said auxiliary shift value (si) are equal, and forming (422) the difference of said multiplication shift value (sz) and said auxiliary shift value (si) to obtain said reduction shift value (sN).

5. Method as claimed in claim 4, wherein said step of performing said multiplication lookahead method (210) and said step of determining (414) said auxiliary shift value (si) are carried out parallel to each other.

6. 
Method as claimed in one of the preceding claims, wherein said multiplication shift value (sz) is limited to a maximum multiplication shift value (k), wherein said step of performing (210) said multiplication lookahead method comprises the following steps: if said multiplication shift value equals said maximum multiplication shift value (k), equating said multiplication shift value (sz) with said maximum shift value (k), creating (306, 324) a multiplication lookahead parameter (a) with a predetermined value, and wherein said step of summing comprises the following steps: if said multiplication lookahead parameter (a) has said predetermined value, summing only said shifted intermediate result polynomial (Z') and said shifted modulus polynomial (N').

7. Apparatus for modular multiplying a multiplicand (C) by a multiplier (M) using a modulus (N), said multiplicand (C), said multiplier (M) and said modulus (N) being polynomials of a variable (x) from the body GF(2^n), within a cryptographic calculation, said multiplicand (C), said multiplier (M) and said modulus (N) being parameters in said cryptographic calculation, said apparatus comprising:

(a) means for (210) performing a multiplication lookahead method to obtain a multiplication shift value (sz), said multiplication shift value (sz) being incremented at a power of said multiplier, which is not present in the multiplier polynomial;

(b) means for (214) multiplying said variable (x) which is raised to the power of said multiplication shift value (sz) by an intermediate result polynomial (Z) to obtain a shifted intermediate result polynomial (Z');

(c) means for (212) performing a reduction lookahead method to obtain a reduction shift value (sN), said reduction shift value (sN) being equal to the difference of the degree of said shifted intermediate result polynomial (Z') and the degree of said modulus polynomial (N);

(d) means for (216) multiplying said variable (x) which is raised to the power of said reduction 
shift value (sN) by said modulus polynomial (N) to obtain a shifted modulus polynomial (N');

(e) means for (218) summing said shifted intermediate result polynomial (Z') and said multiplicand (C) and subtracting said shifted modulus polynomial (N') to obtain an updated intermediate result polynomial (Z); and

(f) means for (226) repeatedly controlling said means (a) to (e) until all the powers of said multiplier (M) have been processed,

wherein in a repeated control of said means (a) to (e) said means (214) for multiplying to obtain a shifted intermediate result polynomial is arranged to use said updated intermediate result polynomial (Z) of the previous control of said means (218) for summing as an intermediate result polynomial (Z), and said means (212) for performing a reduction lookahead method is arranged to use, in a repeated control, as the modulus polynomial (N), said shifted modulus polynomial of the previous control of said means (216) for multiplying to obtain a shifted modulus polynomial.

8. Apparatus as claimed in claim 7, wherein said means for (214) multiplying to obtain a shifted intermediate result polynomial (Z') and said means for (216) multiplying to obtain a shifted modulus polynomial (N') are implemented as controllable shift registers to perform, depending on said multiplication shift value (sz) or on said reduction shift value (sN), a shift of the register contents by a corresponding number of digits.

9. Apparatus as claimed in claims 7 or 8, wherein said means for (218) summing and for subtracting comprises a bitwise XORing device for XORing said intermediate result polynomial (Z'), said multiplicand (C) and said shifted modulus polynomial (N').

10. 
Apparatus as claimed in claims 7 or 8, wherein said means for (218) summing and subtracting comprises:

a counter (500) with three input lines and two output lines, wherein a bit of said intermediate result polynomial (Z) can be applied to a first input line, wherein a bit of said multiplicand (C) can be applied to a second input line, and wherein a bit of said shifted modulus polynomial (N') can be applied to a third input line;

a full adder (510) with three inputs and one output, a low-order output of said counter (500) being connected to a higher order input line of said full adder (510);

a switch (520) connected between a higher order output line of said counter (500) and a middle input of a full adder (510) for a higher order bit; and

a control unit (530) for opening said switch (520) when polynomials are to be processed.

11. Apparatus as claimed in claim 7, wherein a calculating unit is formed as the apparatus for multiplying the multiplicand by the multiplier using the modulus, the calculating unit additionally being formed for multiplying a multiplicand integer by a multiplier integer using a modulus integer, wherein the means for summing is formed as a three-operands adder (700) comprising a carry disabling means (730), the means for summing being arranged for combining either integer operands or polynomial operands, wherein the apparatus comprises a control means (730) for controlling said carry disabling means so that a carry is deactivated when the polynomial operands are processed by the means for summing and so that the carry is activated when the integer operands are processed by the means for summing.
12. Apparatus as claimed in claim 11, wherein said three-operands adder (700) having the carry disabling means comprises:

a counter (500) with three input lines and two output lines, wherein a bit of an intermediate result polynomial is applicable to a first input line, wherein a bit of said multiplicand (C) polynomial is applicable to a second input line, and wherein a bit of said shifted modulus polynomial is applicable to a third input line;

a full adder (510) with three inputs and one output, a low-order output of said counter (500) being connected to a higher order input line of said full adder (510);

a switch (520) being connected between a higher order output line of said counter (500) and a middle input of a full adder (510) for a next higher bit; and

a control unit (530) for opening said switch (520) when polynomials are to be processed.

13. Apparatus as claimed in claim 12, wherein a plurality of three-operands adders are formed, the number of three-operands adders being greater than or equal to the number of digits of the modulus integer or the modulus polynomial.

Patent Number: 212731
Indian Patent Application Number: 881/KOLNP/2003
PG Journal Number: 50/2007
Publication Date: 14-Dec-2007
Grant Date: 12-Dec-2007
Date of Filing: 10-Jul-2003
Name of Patentee: INFINEON TECHNOLOGIES AG.
Applicant Address: ST. MARTIN STRASSE 53, 81669 MUNCHEN
Inventors:
PCT International Classification Number: G 06 F 7/72
PCT International Application Number: PCT/EP02/00719
PCT International Filing date: 2002-01-24
PCT Conventions: