Primality - Factorization

Christophe Ritzenthaler
November 9, 2009
1 Prime and factorization
Definition 1.1. An integer p > 1 is called a prime number (nombre premier) if it
has only 1 and p as divisors.
Example 1. There are infinitely many prime numbers. The biggest known generic one is
$$(((((((((2^3+3)^3+30)^3+6)^3+80)^3+12)^3+450)^3+894)^3+3636)^3+70756)^3+97220.$$
Interested readers may read http://www.cs.uwaterloo.ca/journals/JIS/VOL8/Caldwell/caldwell78.html for the origin of this number. It has 20,562 decimal digits and the proof was built using fastECPP on several networks of workstations.
We write P for the set of prime numbers. To estimate the efficiency of some algorithms, we need results on the density of primes.
Theorem 1.1. Let $\pi(x) = \#\{p \le x \text{ prime}\}$. One has
$$\pi(x) \sim \frac{x}{\log x}.$$
Let $n \ge 2$ be an integer and $c$ an integer prime to $n$. Let $\pi_{n,c}(x) = \#\{p \le x \text{ prime},\ p = kn + c\}$. One has
$$\pi_{n,c}(x) \sim \frac{1}{\varphi(n)} \frac{x}{\log x}.$$
To find a prime number of size about $x$ by picking random integers, the expected number of attempts is then of the order of $\log x$. Indeed, the probability to fail in $k$ attempts is $(1 - 1/\log x)^k$, so the probability to succeed is
$$1 - (1 - 1/\log x)^k \sim 1 - e^{-k/\log x},$$
which is close to 1 for any $k = (\log x)^{1+\epsilon}$.
Remark 1. For x ≥ 17, one has π(x) > x/ log x and for x > 1 one has π(x) <
1.25506(x/ log x).
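The density estimate can be checked numerically. The following is a minimal sketch (the helper names `primes_up_to` and `prime_count` are illustrative and not part of these notes) that counts primes with the sieve of Eratosthenes and compares $\pi(x)$ with $x/\log x$.

```python
import math

def primes_up_to(limit):
    """Return the list of primes <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out the multiples of i, starting at i*i.
            count = len(range(i * i, limit + 1, i))
            is_prime[i * i :: i] = bytearray(count)
    return [i for i in range(limit + 1) if is_prime[i]]

def prime_count(x):
    """pi(x): the number of primes <= x."""
    return len(primes_up_to(x))

for x in (10**3, 10**4, 10**5):
    estimate = x / math.log(x)
    # Columns: x, pi(x), x/log x, ratio pi(x) / (x/log x)
    print(x, prime_count(x), round(estimate, 1),
          round(prime_count(x) / estimate, 3))
```

The ratio in the last column decreases slowly towards 1, and stays between the two bounds of Remark 1 for the values tried here.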
Let us finish with the fundamental result.
Theorem 1.2. Every integer $a > 1$ can be written as the product of prime numbers
$$a = \prod_{p \in P} p^{e(p)}$$
with $e(p) \ge 0$ and $e(p) = 0$ except for finitely many primes $p$. Up to permutation, the factors in this product are uniquely determined.
2 Prime numbers
Producing big prime numbers is very important for cryptographic applications. For a given $n$, there is no known generic algorithm which directly computes a random prime number less than $n$. However a result by Hadamard and de la Vallée Poussin shows that $\#\{p \le n \text{ prime}\} \sim n/\log n$ (see Theorem 1.1). So the usual method is to pick random numbers and to test whether they are prime or not. This requires fast algorithms to test primality.
2.1 Primality test
2.1.1 Trial division
The simplest algorithm is based on the following result.
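Proposition 2.1. An integer $n > 1$ is a composite number if and only if it has a prime divisor $p$ such that $p \le \sqrt{n}$.
Proof. If $n$ is composite, then $n = ab$ with $1 < a \le b$, so $a \le \sqrt{n}$ and any prime divisor $p$ of $a$ satisfies $p \le \sqrt{n}$. Conversely, a prime divisor $p \le \sqrt{n} < n$ is a proper divisor of $n$, so $n$ is not prime.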
This proposition suggests that one can try all prime numbers less than or equal to $\sqrt{n}$, obtained using the sieve of Eratosthenes. Following the estimate on the density of primes (see Theorem 1.1), it means that we make up to $\sqrt{n}/\log\sqrt{n}$ divisions, leading to an exponential algorithm in $O(e^{(\frac{1}{2}-\epsilon)\log n})$.
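As an illustration, here is a minimal sketch of primality testing by trial division. For simplicity it divides by 2 and by every odd number up to $\sqrt{n}$ instead of only by primes, which is an assumption of this sketch and costs a few more divisions; the function name is hypothetical.

```python
import math

def is_prime_trial_division(n):
    """Decide primality of n by trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    # By Proposition 2.1, a composite n has a divisor <= sqrt(n).
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

print([p for p in range(2, 50) if is_prime_trial_division(p)])
```

The number of loop iterations grows like $\sqrt{n}$, which is exponential in the number of digits of $n$.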
2.1.2 Fermat test and Carmichael numbers
By Fermat's little theorem, one knows that if $n$ is a prime number then $a^{n-1} \equiv 1 \pmod{n}$ for all $a \in \mathbb{Z}$ coprime with $n$. If the theorem were an equivalence, we would have an easy polynomial algorithm to test if a number is a prime. Unfortunately, this is not the case.
Example 2. Consider $n = 341 = 11 \cdot 31$. One has $2^{340} \equiv 1 \pmod{341}$. Such a number is called pseudo-prime (pseudo-premier) in base 2.
We can prove that there are infinitely many pseudo-primes in base 2 by showing that if $n$ is such a number then so is $2^n - 1$. Indeed because $n$ is a pseudo-prime in base 2 one has $n \mid 2^{n-1} - 1$, i.e. there is $c$ such that $nc = 2^{n-1} - 1$. Now
$$2^{(2^n-1)-1} - 1 = 2^{2(2^{n-1}-1)} - 1 = 2^{2nc} - 1.$$
The last expression is divisible by $2^n - 1$ so
$$2^{(2^n-1)-1} \equiv 1 \pmod{2^n - 1}.$$
To finish the proof, one has to show that $2^n - 1$ is not a prime. Since $n = ab$ with $1 < a < n$, $2^n - 1$ is divisible by $2^a - 1$.
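The construction can be tried on $n = 341$. The short sketch below (the function name is hypothetical) checks that $N = 2^{341} - 1$ satisfies the Fermat congruence in base 2 and is composite, using Python's built-in modular exponentiation.

```python
def passes_fermat_base2(n):
    """True if 2^(n-1) is congruent to 1 modulo n."""
    return pow(2, n - 1, n) == 1

n = 341                 # pseudo-prime in base 2 (Example 2)
N = 2**n - 1            # the next pseudo-prime produced by the argument

print(passes_fermat_base2(n))      # Fermat congruence for n
print(passes_fermat_base2(N))      # Fermat congruence for N = 2^n - 1
print(N % (2**11 - 1) == 0)        # 11 divides 341, so 2^11 - 1 divides N
```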
An idea is then to change the value of $a$: for instance $3^{340} \equiv 56 \pmod{341}$, which proves that 341 is composite. Unfortunately, there are composite numbers $n$ that are pseudo-prime in every base coprime to $n$. Such numbers are called Carmichael numbers (for instance $561 = 3 \cdot 11 \cdot 17$). It has been shown by Alford, Granville and Pomerance in 1994 that there are infinitely many Carmichael numbers, so the Fermat test cannot be made completely reliable. Let us show some properties of these numbers.
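These facts are easy to reproduce. The sketch below (with a hypothetical helper `passes_fermat`) runs the Fermat test on 341 in bases 2 and 3, and on 561 in every base coprime to 561.

```python
from math import gcd

def passes_fermat(n, a):
    """True if a^(n-1) is congruent to 1 modulo n."""
    return pow(a, n - 1, n) == 1

print(passes_fermat(341, 2))    # base 2 does not detect that 341 is composite
print(pow(3, 340, 341))         # base 3 does: the result is not 1

# 561 passes the Fermat test in every base coprime to it.
coprime_bases = [a for a in range(1, 561) if gcd(a, 561) == 1]
print(all(passes_fermat(561, a) for a in coprime_bases))
```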
Proposition 2.2. An (odd) composite number n ≥ 3 is a Carmichael number if and
only if it is square free and for each prime divisor p of n, p − 1 divides n − 1.
Proof. First it is easy to see that a Carmichael number is odd: indeed $(-1)^{n-1} \equiv 1 \pmod{n}$ if and only if $n$ is odd.
Let $n$ be a Carmichael number; for any $a$ prime to $n$ one has
$$a^{n-1} \equiv 1 \pmod{n}.$$
Let $p$ be a prime divisor of $n$. There exists a primitive element modulo $p$ that is prime to $n$. Indeed, let $a$ be a primitive element modulo $p$ and $n = p^r \cdot m$ with $m$ coprime to $p$. There exists an element (still denoted $a$) in $\mathbb{Z}/p^r\mathbb{Z}$ lifting the initial $a$ (because the morphism $\mathbb{Z}/p^r\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$ is surjective). We pick $s \in \mathbb{Z}/m\mathbb{Z}$ coprime to $m$ and since $\mathbb{Z}/n\mathbb{Z} \simeq \mathbb{Z}/p^r\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$ we construct the element $a \in \mathbb{Z}/n\mathbb{Z}$ image of $(a, s)$. Such an element satisfies the required properties. Now, one has of course $a^{n-1} \equiv 1 \pmod{p}$, but as $a$ is primitive modulo $p$, $p - 1$ divides $n - 1$.
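Now suppose that $n = p^2 m$ and write $a = 1 + pm$. One has
$$a^p \equiv 1 + p^2 m + \dots \equiv 1 \pmod{n}.$$
So the order of $a$ is $p$. But $p$ does not divide $n - 1$ (since $p \mid n$), while $a^{n-1} \equiv 1 \pmod{n}$ forces the order of $a$ to divide $n - 1$, so we get a contradiction.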
Conversely, let $n$ be a square-free integer such that $p - 1$ divides $n - 1$ for all prime divisors $p$ of $n$. Let $a$ be prime to $n$. One has
$$a^{p-1} \equiv 1 \pmod{p}$$
and because $n - 1$ is a multiple of $p - 1$,
$$a^{n-1} \equiv 1 \pmod{p}.$$
Using the Chinese Remainder theorem for all the factors $p$, one gets
$$a^{n-1} \equiv 1 \pmod{n}.$$
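Proposition 2.2 gives a practical criterion once the factorization of $n$ is known. The following sketch applies it to small integers; the factorization is done by naive trial division, and both function names are illustrative choices rather than notation from these notes.

```python
def prime_factorization(n):
    """Factor n by trial division; returns {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_carmichael(n):
    """Criterion of Proposition 2.2: n composite, odd, square-free,
    and p - 1 divides n - 1 for every prime divisor p of n."""
    if n < 3 or n % 2 == 0:
        return False
    factors = prime_factorization(n)
    composite = sum(factors.values()) >= 2
    square_free = all(e == 1 for e in factors.values())
    return (composite and square_free
            and all((n - 1) % (p - 1) == 0 for p in factors))

print([n for n in range(3, 3000) if is_carmichael(n)])
```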
Corollary 2.1. Any Carmichael number is the product of at least 3 distinct odd primes.
Proof. Because a Carmichael number is without square factor and is not prime it has at least two prime factors. Let us assume that $n = pq$ with $p < q$. Then $q - 1$ divides $pq - 1 = p(q - 1) + p - 1$, so $q - 1$ divides $p - 1$. This is absurd since $0 < p - 1 < q - 1$.
Example 3. Show that if $6m + 1$, $12m + 1$ and $18m + 1$ are primes then $n = (6m + 1)(12m + 1)(18m + 1)$ is a Carmichael number. First by the Chinese Remainder theorem, one can see that if $n = ab$ with $a, b$ coprime then for any $x$ prime to $n$ one has
$$x^{\mathrm{lcm}(\varphi(a), \varphi(b))} \equiv 1 \pmod{n}.$$
Now $\mathrm{lcm}(\varphi(6m + 1), \varphi(12m + 1), \varphi(18m + 1)) = 36m$ and also $36m \mid n - 1$. One can check that 1729 is such a number.
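The example can be explored by a small search. The sketch below (function names are hypothetical) lists the values of $m$ for which the three numbers are prime, checks that $36m$ divides $n - 1$, and confirms by brute force that $n$ passes the Fermat test in every coprime base.

```python
from math import gcd, isqrt

def is_prime(n):
    """Naive primality test by trial division."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def chernick_numbers(m_max):
    """Return (m, n) for 1 <= m <= m_max such that 6m+1, 12m+1 and
    18m+1 are all prime, with n their product."""
    found = []
    for m in range(1, m_max + 1):
        p, q, r = 6 * m + 1, 12 * m + 1, 18 * m + 1
        if is_prime(p) and is_prime(q) and is_prime(r):
            found.append((m, p * q * r))
    return found

def passes_fermat_in_all_bases(n):
    """Brute-force check of a^(n-1) = 1 (mod n) for all a coprime to n."""
    return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)

for m, n in chernick_numbers(10):
    print(m, n, (n - 1) % (36 * m) == 0, passes_fermat_in_all_bases(n))
```

The value $m = 1$ gives $7 \cdot 13 \cdot 19 = 1729$.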
2.1.3 Lucas test
Let $n > 1$ be an integer. We will show that if there exists an $a$ such that $a^{n-1} \equiv 1 \pmod{n}$ and $a^q \not\equiv 1 \pmod{n}$ for all $q \mid n - 1$, $q \ne n - 1$, then $n$ is prime. This is a very good test for Fermat numbers $F_m$, i.e. numbers of the form $n = 2^{2^m} + 1$ (for $m = 0, \dots, 32$ only the first five are prime; $F_{33}$ is so big that it may be many years before we can decide its nature). But obviously this test is not good for a generic prime since we must know the factorization of $n - 1$.
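Let us assume that such an $a$ exists and let $d$ be the order of $a$ in $(\mathbb{Z}/n\mathbb{Z})^*$. Since $a^{n-1} \equiv 1 \pmod{n}$, $d \mid (n - 1)$. More exactly, as no proper divisor of $n - 1$ is the order of $a$, one has $d = n - 1$. Now $n - 1 = d \mid \varphi(n)$. This is possible only if $n$ is prime, since $\varphi(n) \le n - 1$ with equality exactly when $n$ is prime.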
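A minimal sketch of the test follows. It assumes the prime factors of $n - 1$ are supplied by the caller, and it uses the standard observation that it is enough to check the exponents $(n - 1)/q$ for the primes $q$ dividing $n - 1$, since every proper divisor of $n - 1$ divides one of them. The function name and argument layout are illustrative.

```python
def lucas_test(n, a, prime_factors_of_n_minus_1):
    """Return True if a proves that n is prime (a has order n - 1).
    A False answer only means that this a is inconclusive."""
    if pow(a, n - 1, n) != 1:
        return False
    # Every proper divisor of n - 1 divides (n - 1)/q for some prime q.
    return all(pow(a, (n - 1) // q, n) != 1
               for q in prime_factors_of_n_minus_1)

# Fermat numbers: n - 1 is a power of 2, so 2 is its only prime factor.
F4 = 2**(2**4) + 1
F5 = 2**(2**5) + 1
print(lucas_test(F4, 3, [2]))   # a = 3 proves that F4 = 65537 is prime
print(lucas_test(F5, 3, [2]))   # inconclusive: F5 is in fact composite
```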
2.1.4 Rabin-Miller test
Contrary to the Fermat test, the Rabin-Miller test can prove the compositeness of any composite number (i.e. there is no analog of Carmichael numbers for this test). But the Rabin-Miller test is a Monte-Carlo algorithm: it always stops; if it answers yes, the number is composite, and if it answers no then the answer is correct with a probability greater than 3/4.
Let $n$ be an odd positive integer and $s = \max\{r \in \mathbb{N} : 2^r \mid n - 1\}$. Let $d = (n - 1)/2^s$.
Lemma 2.1 (Miller). If $n$ is a prime and if $a$ is an integer prime to $n$ then we have either $a^d \equiv 1 \pmod{n}$ or there exists $r \in \{0, \dots, s - 1\}$ such that $a^{2^r d} \equiv -1 \pmod{n}$.
Proof. The order of $a$ is a divisor of $n - 1 = 2^s d$. If it divides $d$, then $a^d \equiv 1 \pmod{n}$. If it does not, then the order of $a^d$ is $2^r$ for some $r \in \{1, \dots, s\}$. So $a^{2^r d} \equiv 1 \pmod{n}$ and $a^{2^{r-1} d}$ is a non-trivial square root of 1 modulo the prime $n$, so $a^{2^{r-1} d} \equiv -1 \pmod{n}$.
If we find an a which is prime to n and that satisfies neither of the conditions, then
n is composite. Such an integer a is called a witness (témoin) for the compositeness of
n.
Example 4. Let $n = 561$. $a = 2$ is a witness for $n$. Indeed here $s = 4$, $d = 35$ and $2^{35} \equiv 263 \pmod{561}$, $2^{2 \cdot 35} \equiv 166 \pmod{561}$, $2^{4 \cdot 35} \equiv 67 \pmod{561}$, $2^{8 \cdot 35} \equiv 1 \pmod{561}$.
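The computation of this example can be sketched as follows. The helpers `miller_sequence` and `is_witness` are hypothetical names; the first returns $s$, $d$ and the list $a^d, a^{2d}, \dots, a^{2^{s-1}d}$ modulo $n$, the second tests the two conditions of Lemma 2.1.

```python
def miller_sequence(a, n):
    """Write n - 1 = 2^s * d with d odd and return
    (s, d, [a^d, a^(2d), ..., a^(2^(s-1) d)] mod n)."""
    s, d = 0, n - 1
    while d % 2 == 0:
        s, d = s + 1, d // 2
    x = pow(a, d, n)
    sequence = [x]
    for _ in range(s - 1):
        x = x * x % n
        sequence.append(x)
    return s, d, sequence

def is_witness(a, n):
    """True if a satisfies neither condition of Lemma 2.1
    (a is assumed to be prime to the odd number n)."""
    _, _, sequence = miller_sequence(a, n)
    return sequence[0] != 1 and (n - 1) not in sequence

print(miller_sequence(2, 561))
print(is_witness(2, 561))
```

Since $-1 \equiv 560$ never appears in the sequence and $2^{35} \not\equiv 1$, the base 2 proves that 561 is composite although 561 is a Carmichael number.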
For the efficiency of the Rabin-Miller test, it is important that there are sufficiently
many witnesses for the compositeness of a composite number.
Theorem 2.1 (Rabin). If n ≥ 3 is an odd composite number, then the set {1, . . . , n − 1}
contains at most (n − 1)/4 numbers that are prime to n and not witnesses for the
compositeness of n.
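Before the proof, the bound can be checked experimentally on small numbers. The sketch below (all names are illustrative) counts, for each odd composite $n < 1000$, the integers in $\{1, \dots, n - 1\}$ that are prime to $n$ and are not witnesses, and compares the count with $(n - 1)/4$.

```python
from math import gcd, isqrt

def is_witness(a, n):
    """True if a (prime to the odd number n) satisfies neither
    condition of Lemma 2.1."""
    s, d = 0, n - 1
    while d % 2 == 0:
        s, d = s + 1, d // 2
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return False
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return False
    return True

def non_witness_count(n):
    """Number of a in {1, ..., n-1} prime to n that are not witnesses."""
    return sum(1 for a in range(1, n)
               if gcd(a, n) == 1 and not is_witness(a, n))

def is_composite(n):
    return any(n % p == 0 for p in range(2, isqrt(n) + 1))

odd_composites = [n for n in range(9, 1000, 2) if is_composite(n)]
print(all(4 * non_witness_count(n) <= n - 1 for n in odd_composites))
print(non_witness_count(9))   # the bound (9 - 1)/4 = 2 is attained here
```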
Proof. Let $k$ be the maximum value of $r$ for which there is an integer $a$ prime to $n$ that satisfies the second identity $a^{2^r d} \equiv -1 \pmod{n}$. We set $m = 2^k d$. Let $n = \prod_p p^{e(p)}$ be the prime factorization of $n$. Let
$$\begin{aligned}
J &= \{a : \gcd(a, n) = 1,\ a^{n-1} \equiv 1 \pmod{n}\},\\
K &= \{a : \gcd(a, n) = 1,\ a^m \equiv \pm 1 \pmod{p^{e(p)}} \text{ for all } p \mid n\},\\
L &= \{a : \gcd(a, n) = 1,\ a^m \equiv \pm 1 \pmod{n}\},\\
M &= \{a : \gcd(a, n) = 1,\ a^m \equiv 1 \pmod{n}\}.
\end{aligned}$$
We have $M \subset L \subset K \subset J \subset (\mathbb{Z}/n\mathbb{Z})^*$. For each $a$ which is not a witness for the compositeness of $n$, the residue class of $a$ belongs to $L$. We will prove that the index of $L$ in $(\mathbb{Z}/n\mathbb{Z})^*$ is at least four.
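The index of $M$ in $K$ is a power of 2. Indeed, the square of each element of $K$ belongs to $M$, so one can consider the morphism
$$\sigma : K \to M, \quad x \mapsto x^2.$$
Let $I$ denote the image of $\sigma$. The kernel of $\sigma$ consists of elements of order at most 2, so it is a group of order $2^i$ for some $i$ and $\#I = \#K/2^i$. Now since $\#I$ divides $\#M$ we can write $\#M = \#I \cdot t$, and $[K : M] = 2^i/t$ is a power of 2. Hence the index of $L$ in $K$, which divides $[K : M]$, is also a power of 2, say $2^j$. If $j \ge 2$ then we are finished.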
If $j = 1$ (i.e. $[K : L] = 2$) then $n$ has two prime divisors. It follows from Cor. 2.1 that $n$ is not a Carmichael number. This implies that $J$ is a proper subgroup of $(\mathbb{Z}/n\mathbb{Z})^*$ and the index of $J$ in $(\mathbb{Z}/n\mathbb{Z})^*$ is at least 2. Therefore the index of $L$ in $(\mathbb{Z}/n\mathbb{Z})^*$ is at least 4.
Finally, let $j = 0$. Then $n$ is a prime power, say $n = p^e$ with $e > 1$. But $\phi : (\mathbb{Z}/n\mathbb{Z})^* \to \mathbb{Z}/(p - 1)\mathbb{Z} \times \mathbb{Z}/p^{e-1}\mathbb{Z}$ is an isomorphism. As $n - 1$ is prime to $p$, $a^{n-1} \equiv 1 \pmod{n}$ if and only if $\phi(a) = (\mu, 0)$. So $[(\mathbb{Z}/n\mathbb{Z})^* : J] = \#\mathbb{Z}/p^{e-1}\mathbb{Z} = p^{e-1}$. This is bigger than 4 except for $n = 9$, which can be checked by hand.
To apply the Rabin-Miller test, we choose a random number $a \in \{2, \dots, n - 1\}$. If $\gcd(a, n) > 1$ then $n$ is composite. Otherwise we compute $a^d, a^{2d}, \dots, a^{2^{s-1} d}$. If we find a witness for the compositeness of $n$, then we have proved that $n$ is composite. By Th. 2.1, the probability that $n$ is composite and that $a$ is not a witness is less than 1/4. So if we repeat the test $t$ times we can make this probability less than $(1/4)^t$. For $t = 10$ this probability is less than $10^{-6}$.
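The procedure just described can be sketched in a few lines. This is only an illustration under simple assumptions: the function name, the handling of $n < 4$ and the optional `rng` argument (used to make runs reproducible) are choices of this sketch, not part of the notes.

```python
import random
from math import gcd

def rabin_miller(n, t=10, rng=None):
    """Return False if n is proved composite, True if n passed t rounds
    (n is then prime with high probability)."""
    rng = rng or random.Random()
    if n < 2:
        return False
    if n < 4:
        return True               # 2 and 3 are prime
    if n % 2 == 0:
        return False
    s, d = 0, n - 1
    while d % 2 == 0:             # n - 1 = 2^s * d with d odd
        s, d = s + 1, d // 2
    for _ in range(t):
        a = rng.randrange(2, n)   # random a in {2, ..., n - 1}
        if gcd(a, n) > 1:
            return False
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue              # a is not a witness
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break             # a is not a witness
        else:
            return False          # a is a witness: n is composite
    return True

print([n for n in range(2, 60) if rabin_miller(n)])
```

A prime is never rejected, so the only possible error is to accept a composite number, with probability less than $(1/4)^t$.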
Remark 2. Under the Generalized Riemann hypothesis (which is conjectural but believed true), it can be proved that there is always a witness for the compositeness of $n$ with $a \le O((\log n)^2)$.
If we want an absolute test, Adleman, Pomerance, Rumely, Cohen and Lenstra have given an algorithm (APRCL) which is slower but still feasible on numbers of 1000 digits (it runs in $O(|n|_2^{C \log\log |n|_2})$, where $|n|_2$ denotes the binary length of $n$).
In 2002, M. Agrawal, N. Kayal and N. Saxena found a deterministic polynomial algorithm to solve the problem of primality.
3 Factorization
Now given an n that is known to be composite, how can we find its decomposition in
prime factors ? We are going to present algorithms to obtain a non-trivial factor. By
repeating inductively the algorithm, we can then factorize the number.
3.1 Trial division
To find small prime factors of $n$, a table of all prime numbers below a fixed bound $B$ is precomputed. This can be done using the sieve of Eratosthenes. A typical bound is $B = 10^6$.
Example 5. We want to factor $n = 3^{21} + 1$. Trial division with primes less than 50 yields the factors $2^2$, $7^2$, 43. If we divide $n$ by those factors, we obtain $m = 1241143$. Since $2^{m-1} \equiv 793958 \pmod{m}$, this number is still composite.
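This example can be reproduced with the short sketch below. For simplicity it loops over all integers below the bound instead of a precomputed prime table (an assumption of this sketch); the function name is hypothetical.

```python
def strip_small_factors(n, bound):
    """Divide out every prime factor of n smaller than `bound`.
    Returns ({prime: exponent}, remaining cofactor)."""
    found = {}
    for p in range(2, bound):
        # A composite p never divides n here, because its prime
        # factors have already been removed.
        while n % p == 0:
            found[p] = found.get(p, 0) + 1
            n //= p
    return found, n

factors, m = strip_small_factors(3**21 + 1, 50)
print(factors, m)
# Fermat test in base 2 on the cofactor: a result other than 1
# proves that m is composite.
print(pow(2, m - 1, m))
```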
3.2 Pollard p − 1 method
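This algorithm is efficient when $n$ has a prime factor $p$ such that $p - 1$ has only small prime divisors. Indeed, by Fermat's little theorem, one has
$$a^k \equiv 1 \pmod{p}$$
for all $a$ prime to $p$ and all multiples $k$ of $p - 1$. If $p - 1$ has only small prime divisors, one can try
$$k = \prod_{q \in P,\ q^e \le B} q^e,$$
where $B$ is a given bound and, for each prime $q$, $q^e$ is the largest power of $q$ not exceeding $B$. If $p - 1$ divides $k$ then $p$ divides $\gcd(a^k - 1, n)$; if moreover $a^k - 1$ is not divisible by $n$, then $\gcd(a^k - 1, n)$ is a non-trivial factor of $n$.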
Example 6. Let $n = 1241143$ be the number of the previous example. We set $B = 13$. Then $k = 8 \cdot 9 \cdot 5 \cdot 7 \cdot 11 \cdot 13$ and $\gcd(2^k - 1, n) = 547$. So $n = 547 \cdot 2269$, which are both prime numbers.
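A minimal sketch of the method, checked on this example, is given below. The names `smooth_exponent` and `pollard_p_minus_1` are illustrative, and $a^k$ is reduced modulo $n$ before taking the gcd, which does not change the result.

```python
from math import gcd, isqrt

def smooth_exponent(B):
    """k = product over primes q <= B of the largest power q^e <= B."""
    k = 1
    for q in range(2, B + 1):
        if all(q % r for r in range(2, isqrt(q) + 1)):   # q is prime
            power = q
            while power * q <= B:
                power *= q
            k *= power
    return k

def pollard_p_minus_1(n, B, a=2):
    """Return a non-trivial factor of n, or None if the bound B
    (or the base a) does not lead to one."""
    k = smooth_exponent(B)
    g = gcd(pow(a, k, n) - 1, n)
    return g if 1 < g < n else None

n = 1241143
print(smooth_exponent(13))          # 8 * 9 * 5 * 7 * 11 * 13
p = pollard_p_minus_1(n, 13)
print(p, n // p)
```

The factor 547 is found because $546 = 2 \cdot 3 \cdot 7 \cdot 13$ divides $k$, whereas $2268 = 2^2 \cdot 3^4 \cdot 7$ does not.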
3.3 Quadratic sieve
3.3.1 Idea
The quadratic sieve finds integers $x, y$ such that
$$x^2 \equiv y^2 \pmod{n} \quad \text{and} \quad x \not\equiv \pm y \pmod{n}.$$
Then $n$ is a divisor of $x^2 - y^2 = (x - y)(x + y)$ but of neither $x - y$ nor $x + y$. Hence $g = \gcd(x - y, n)$ is a proper divisor of $n$.
Example 7. Let $n = 7429$, $x = 227$, $y = 210$. Then $x^2 - y^2 = n$, $x - y = 17$ so $17 \mid n$.
3.3.2 Determination of x and y
The idea from the previous section is also used in other factoring algorithms, such as
the number field sieve (NFS), but those algorithms have different ways of finding x, y.
We describe how x, y are found in the quadratic sieve.
Let $m = \lfloor \sqrt{n} \rfloor$ and
$$f(X) = (X + m)^2 - n.$$
We first explain the procedure on an example.
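Example 8. Let $n = 7429$. Then $m = 86$. One has
$$\begin{aligned}
f(-3) &= 83^2 - 7429 = -540 = -1 \cdot 2^2 \cdot 3^3 \cdot 5,\\
f(1) &= 87^2 - 7429 = 140 = 2^2 \cdot 5 \cdot 7,\\
f(2) &= 88^2 - 7429 = 315 = 3^2 \cdot 5 \cdot 7.
\end{aligned}$$
This implies
$$\begin{aligned}
83^2 &\equiv -1 \cdot 2^2 \cdot 3^3 \cdot 5 \pmod{7429},\\
87^2 &\equiv 2^2 \cdot 5 \cdot 7 \pmod{7429},\\
88^2 &\equiv 3^2 \cdot 5 \cdot 7 \pmod{7429}.
\end{aligned}$$
If the last two congruences are multiplied then we obtain
$$(87 \cdot 88)^2 \equiv (2 \cdot 3 \cdot 5 \cdot 7)^2 \pmod{n}.$$
Therefore we can set $x \equiv 87 \cdot 88 \equiv 227 \pmod{n}$ and $y \equiv 2 \cdot 3 \cdot 5 \cdot 7 \equiv 210 \pmod{n}$.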
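The arithmetic of this example can be replayed with the sketch below. The helper `factor_over_base` is a hypothetical name; it returns the exponents of $-1$ and of the primes 2, 3, 5, 7 in a value $f(s)$, or `None` when the value has another prime factor.

```python
from math import gcd, isqrt

n = 7429
m = isqrt(n)                      # m = 86

def f(s):
    return (s + m) ** 2 - n

def factor_over_base(value, primes):
    """Exponents of -1 and of the given primes in `value`,
    or None if `value` does not factor completely over them."""
    if value == 0:
        return None
    exponents = {}
    if value < 0:
        exponents[-1] = 1
        value = -value
    for p in primes:
        while value % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            value //= p
    return exponents if value == 1 else None

for s in (-3, 0, 1, 2):
    print(s, f(s), factor_over_base(f(s), [2, 3, 5, 7]))

x = (87 * 88) % n                 # product of the left-hand sides
y = 2 * 3 * 5 * 7                 # square root of the right-hand side product
print(x, y, gcd(x - y, n))
```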
In the example we have presented numbers $s$ for which the value $f(s)$ has only small prime factors. Then we use the congruence
$$(s + m)^2 \equiv f(s) \pmod{n}.$$
From those congruences, we select a subset whose product yields squares on the left- and the right-hand sides. The left-hand side of each congruence is a square anyway. Also we know the prime factorization of each right-hand side. The product of a number of right-hand sides is a square if the exponents of $-1$ and of all prime factors are even. In the next section, we explain how an appropriate subset of congruences is chosen.
Table 1: Factor base and sieving

# decimal digits of n          |  50 | 60 | 70 | 80 | 90 | 100 | 110 | 120
# factor base in thousand      |   3 |  4 |  7 | 15 | 30 |  51 | 120 | 245
# sieving interval in million  | 0.2 |  2 |  5 |  6 |  8 |  14 |  16 |  26

3.3.3 Choosing appropriate congruences
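The selection process is controlled by coefficients $\lambda_i \in \{0, 1\}$. If $\lambda_i = 1$ the congruence $i$ is chosen; otherwise it is not. The product of the right-hand sides of the chosen congruences is
$$(-1 \cdot 2^2 \cdot 3^3 \cdot 5)^{\lambda_1} \cdot (2^2 \cdot 5 \cdot 7)^{\lambda_2} \cdot (3^2 \cdot 5 \cdot 7)^{\lambda_3} = (-1)^{\lambda_1} \cdot 2^{2\lambda_1 + 2\lambda_2} \cdot 3^{3\lambda_1 + 2\lambda_3} \cdot 5^{\lambda_1 + \lambda_2 + \lambda_3} \cdot 7^{\lambda_2 + \lambda_3}.$$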
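We want this number to be a square, so we have to solve the following linear system:
$$\begin{aligned}
\lambda_1 &\equiv 0 \pmod 2,\\
2\lambda_1 + 2\lambda_2 &\equiv 0 \pmod 2,\\
3\lambda_1 + 2\lambda_3 &\equiv 0 \pmod 2,\\
\lambda_1 + \lambda_2 + \lambda_3 &\equiv 0 \pmod 2,\\
\lambda_2 + \lambda_3 &\equiv 0 \pmod 2.
\end{aligned}$$
A solution is $\lambda_1 = 0$, $\lambda_2 = \lambda_3 = 1$.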
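For a system this small, the solutions can simply be enumerated. The sketch below tries every choice of $(\lambda_1, \lambda_2, \lambda_3)$ and keeps the non-zero ones for which all exponents are even. Exhaustive search is an assumption made for brevity and is exponential in the number of congruences; it stands in for linear algebra over $\mathbb{Z}/2\mathbb{Z}$. The function name is hypothetical.

```python
from itertools import product

# Exponents of -1, 2, 3, 5, 7 in the three right-hand sides.
exponents = [
    {-1: 1, 2: 2, 3: 3, 5: 1},    # -540 = -1 * 2^2 * 3^3 * 5
    {2: 2, 5: 1, 7: 1},           #  140 = 2^2 * 5 * 7
    {3: 2, 5: 1, 7: 1},           #  315 = 3^2 * 5 * 7
]
factor_base = [-1, 2, 3, 5, 7]

def square_selections(exponents, factor_base):
    """All non-zero 0/1 vectors lambda whose selected product is a square."""
    solutions = []
    for lam in product((0, 1), repeat=len(exponents)):
        if not any(lam):
            continue              # skip the empty selection
        if all(sum(l * e.get(p, 0) for l, e in zip(lam, exponents)) % 2 == 0
               for p in factor_base):
            solutions.append(lam)
    return solutions

print(square_selections(exponents, factor_base))
```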
In general we choose a positive integer $B$. Then we look for integers $s$ such that $f(s)$ has only prime factors that belong to the factor base
$$F(B) = \{p \in P,\ p \le B\} \cup \{-1\}.$$
Such values $f(s)$ are called $B$-smooth. If we have found as many values for $s$ as the factor base has elements, then we try to solve the corresponding linear system over $\mathbb{Z}/2\mathbb{Z}$. Algorithms faster than Gaussian elimination exist in this case.
3.3.4 Sieving
It remains to be shown how the values of $s$ are found for which $f(s)$ is $B$-smooth. One possibility is to compute the value $f(s)$ for $s = 0, \pm 1, \pm 2, \dots$ and to test by trial division whether $f(s)$ is $B$-smooth. Unfortunately, those values typically are not $B$-smooth, so this approach is very inefficient: the factor base is large for large $n$ (see Tab. 1) and most of the trial divisions are wasted. A more efficient method is to use sieving techniques, which are described as follows.
We explain a simplified version that shows the main idea. We fix a sieving interval $S = \{-C, -C + 1, \dots, 0, 1, \dots, C\}$. We want to find all $s \in S$ such that $f(s)$ is $B$-smooth.
To find out which of the values $f(s)$ are divisible by a prime number $p$ in the factor base, we proceed the other way round. We fix a prime $p$. The equation $f(s) \equiv 0 \pmod{p}$ has at most two solutions $s_{i,p}$ modulo $p$, which can be computed quickly. Then we run through the values $s_{i,p} + kp \in S$, and at each step we divide the corresponding $f(s)$ by $p$. Prime powers can be treated similarly.
Example 9. Let $n = 7429$, $m = 86$. The factor base is the set $\{2, 3, 5, 7\} \cup \{-1\}$. As sieve interval, we use the set $S = \{-3, \dots, 3\}$.

s               |   -3 |   -2 |   -1 |   0 |   1 |   2 |   3
(s + m)^2 - n   | -540 | -373 | -204 | -33 | 140 | 315 | 492
Sieve with 2    | -135 |      |  -51 |     |  35 |     | 123
Sieve with 3    |   -5 |      |  -17 | -11 |     |  35 |  41
Sieve with 5    |   -1 |      |      |     |   7 |   7 |
Sieve with 7    |      |      |      |     |   1 |   1 |
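The sieve of this example can be sketched as follows. Two simplifications are assumptions of this sketch: the roots of $f(s) \equiv 0 \pmod{p}$ are found by brute force, which is only reasonable for tiny primes, and prime powers are handled by repeated division. The function name is hypothetical.

```python
from math import isqrt

def sieve_interval(n, primes, C):
    """Divide f(s) = (s + m)^2 - n by the primes of the factor base on
    S = {-C, ..., C}. Returns {s: what is left of f(s)}."""
    m = isqrt(n)
    remaining = {s: (s + m) ** 2 - n for s in range(-C, C + 1)}
    for p in primes:
        # Solutions of f(s) = 0 (mod p), by brute force over residues.
        roots = [r for r in range(p) if ((r + m) ** 2 - n) % p == 0]
        for r in roots:
            start = -C + ((r + C) % p)   # smallest s >= -C with s = r (mod p)
            for s in range(start, C + 1, p):
                # Only positions s = r (mod p) are touched.
                while remaining[s] != 0 and remaining[s] % p == 0:
                    remaining[s] //= p
    return remaining

left = sieve_interval(7429, [2, 3, 5, 7], 3)
smooth = [s for s, value in left.items() if abs(value) == 1]
print(left)
print(smooth)
```

The values of $s$ whose entry has been reduced to $\pm 1$ are exactly those for which $f(s)$ factors over the factor base.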
Remark 3. The optimum size of the factor base is roughly
$$B = e^{\frac{\sqrt{2}}{4} \sqrt{\log n \, \log\log n}}$$
and the sieving interval is in $C = B^3$. The heuristic running time is $L_n(1/2, 1)$. The fastest current algorithm is NFS, which is in $L_n(1/3, (64/9)^{1/3})$.