Fast Modular Multiplication

A modular multiplication requires a remainder operation, which is a slow operation if the modulus is a general integer. For example, contemporary processors can multiply two 64-bit integers, producing a 128-bit result, with a latency of 3 or 4 clock cycles. But, dividing a 128-bit integer by a 64-bit integer, producing a 64-bit quotient and a 64-bit remainder, is considerably slower (tens of clock cycles).

If the modulus is a power of two, say 2n2^n,the remainder operation is very fast; the remainder is just the last n bits of the number being remaindered. In 1985, Peter Montgomery came up with a beautiful way to explore this to efficiently perform general remaindering operations without performing an expensive division.

The Greatest Common Divisor

Let pkp_k be the kk-th prime number, so that p1=2,p2=3,p3=5p_1=2, p_2=3, p_3=5, and so on,

Each positive integer can be factored into prime factors in a unique way (this is the fundamental theorem of arithmetic).

Let a=Πk=1pkaka = \Pi^\infty_{k=1}p^{a_k}_k, where aka_k is the number of times pkp_k divides aa. Since aa is a finite number, almost all of the aka_k values will be zero.

Likewise of bb, let b=Πk=1pkbkb = \Pi^\infty_{k=1}p^{b_k}_k.

Then gcd(a,b)=Πk=1pkmin(ak,bk)gcd(a,b) =\Pi^\infty_{k=1}p^{min(a_k,b_k)}_k and lcm(a,b)=Πk=1pkmax(ak,bk)lcm(a,b) =\Pi^\infty_{k=1}p^{max(a_k,b_k)}_k

If gcd(a,b)=1gcd(a,b) = 1 then aa and bb are said to be relatively prime (or coprime).

The greatest common division can be generalized to a polynomial with integer coefficients!

Algorithm

Assume that a0a \ge 0 and that b0b \ge 0. Then:

  • gcd(a,b)=gcd(b,a)gcd(a,b) = gcd(b,a), and so gcd(a,b)=gcd(max(a,b),min(a,b))gcd(a,b) = gcd(max(a,b), min(a,b)). Thus, by exchanging aa with bb if necessary, we may assume that aba \ge b.

  • as any positive integer divides 0 we have gcd(a,0)=agcd(a,0) = a for a>0a \gt 0. The mathematicians say that gcd(0,0)=0gcd(0,0) = 0, and so we can say that gcd(a,0)=agcd(a,0) = a as long as a0a \ge 0.

  • If aba \ge b the gcd(a,b)=gcd(ab,b)gcd(a,b) = gcd(a-b,b). We can keep subtracting bb from (the updated) aa until it becomes smaller than bb, and so gcd(a,b)=gcd(amodb,b)=gcd(b,amodb)gcd(a,b) = gcd(a mod b, b) = gcd(b, a mod b).

These observations give rise to the following so-called Euclid's algorithm (coded in C, but it can easily be translated to another programming language):

long gcd(long a, long b)
{
    while(b != 0) { long c = a % b; a = b; b = c; } return a;
}

The GNU MP library has a function, mpz_gcd, for this; pari-gp does this with the gcd function.

Last updated