RSA Encryption Algorithm

The essence is that a user has a private key (essentially two integers, d and n) of which d is revealed to nobody else. Corresponding with this private key is a public key (also essentially two integers, e and n) that is revealed to anyone who wishes to know it.

The RSA algorithm (the one we and the browsers currently use) works because:
            M = E d mod n
                 where E = M e mod n       and       0 < M < n       and       0 < E < n
     Here we think in terms of M being a message, and E being the encrypted form of it. However, the symmetry of it all means that anything encrypted with one key can be decrypted by the other. The really clever bit is that even if you know e and n, you can only find d if you can find the two prime numbers which multiplied together give n. This is not easy for large values of n.

Encryption works by a system encrypting a message using the recipient's public key. Only someone who possesses the private key can read the message.

Authentication works by a system confirming that the user knows the private key. This is done by taking the user's public key, using it to encrypt a test string, and asking the user's to machine to decrypt it and send it back. Only a machine that knows the user's private key can do this.

Why does this magic work?

From the RSA standard (PKCS#1):
This section describes RSA key generation.

Each entity shall select a positive integer e as its public exponent.

Each entity shall privately and randomly select two distinct odd primes p and q such that ( p – 1 ) and e have no common divisors, and ( q – 1 ) and e have no common divisors.

The public modulus n shall be the product of the private prime factors p and q:
           n = p q

The private exponent shall be a positive integer d such that  d e – 1   is divisible by both   p – 1   and   q – 1.

If we take a message M and encrypt it to produce E, and then decrypt it to produce M', to prove (show?) that the algorithm works we need to show that M' = M.
           M' = E d mod n
                = ( M e mod n ) d mod n
                = ( M e ) d mod n
                = M  e d mod n
                = M × M ( e d – 1 ) mod n

So the algorithm will work if:
           M ( e d – 1 ) mod n   =   1
                for all values of M that can be used. (Clearly it does not work for M = 0.)

Because e d – 1   is a multiple of   p – 1 , this will be true if:
           M ( p – 1 ) mod n   =   1



Fermat's Little Theorem states (proof see below):
           M p mod p   =   M mod p
and for values of M that are not divisible by p:
           M ( p – 1 ) mod p   =   1

Therefore:        M ( e d – 1 ) mod p   =   1
and:                 M ( e d – 1 ) mod n   =   1 + p Np       where    Np < q
Similarly:          M ( e d – 1 ) mod n   =   1 + q Nq       where    Nq < p
But p and q are distinct odd primes, so for:       p Np = q Nq
we must have:      Np = Nq = 0
Therefore:
           M ( e d – 1 ) mod n   =   1   Q.E.D.

Funny values of M.

The previous proof works for all values of M which are not divisible by either p or q. We only need consider the case of M beiong a multiple of p, as the same argument can apply to q just by swapping over p and q.

We need to look again at:
           M' = M × M ( e d – 1 ) mod n
We consider M = k p, remembering that ( e d – 1 ) = N ( q – 1 ) and n = p q . Thus we want to evaluate:
           M' = k p (k p) N ( q – 1 ) mod n

Because the whole expression is a multiple of p, we can take the modulus with respect to q of the other part, knowing that we are discarding a multiple of n = p q. Thus we have:
           M' = k p [ (k p) N ( q – 1 ) mod q ] mod n

The section in square brackets will be one unless (k p) N happens to be a multiple of q. However, this cannot be, because p is prime and thus not a multiple of q, and k < q, because M < n.

Therefore we have:     M' = k p mod n     from which we can discard mod n, because:     M = k p < n,     leaving:
           M' = k p = M    Q.E.D.


Fermat's Little Theorem: M p mod p   =   M mod p

The method consists of proving that if the theorem is true for M it is true for M + 1. Given that it is trivially true for M = 1, we will then be able to deduce that it is true for bigger values of M.

The binomial theorem tells us:
       ( M + 1 ) p   = M p   +   p M (p–1)   +   ½ p(p–1) M (p–2)   +   . . . . .   +   pCr M (p–r)   +   . . . . .   +   p M   +   1
The numbers   pCr   are called binomial coefficients, and are of the form:
            ( p × (p–1) × (p–2) . . . (pr+1) ) / ( 1 × 2 × 3 . . . r )
In the above equation for   ( M + 1 ) p   all the terms except for the first and last are multiples of p. There is no possibility of p cancelling with anything in the denominator because it is prime.

Thus:        ( M + 1 ) p   mod p   = ( ( M p mod p ) + 1 )   mod p
So if     M p mod p   =   M     is true for some value of M, then it is also true for M + 1. However, it is obviously true for M = 1, therefore for all values of M:
           M p mod p   =   M mod p          Q. E. D.


The Binomial Theorem

This theorem states that:
           ( x + 1 ) n   = x n   +   n x (n–1)   +   ½ n(n–1) x (n–2)   +   . . . . .   +   nCr x (n–r)   +   . . . . .   +   n x   +   1
       where
           nCr   =   n × (n–1) × (n–2) × . . . × (nr+1)
      1 × 2 × 3 × . . . × r

Imagine   ( x + 1 )   written out n times.
The coefficient of  x (n–r)  is obtained by choosing the 1 term out of r of the terms, and the x term out of   n – r   of the terms.
There are   n   ways of choosing the first one, and then   n – 1   ways of choosing the second one, and then   n – 2   ways of choosing the third one, and so on.
However, say our first choice is the ninth term, and then take it in turn the all the other terms in order to get the   x (n–2)   term, we will later take the tenth term with all the others, including the ninth term. In fact, we shall include every pairing twice; hence the factor ½.
When we calculate the   x (n–3)   term we find that we have taken each possible arrangement of each choice, when we only wanted one. There are 6 ways of arranging 3 objects, and in general the number of ways of arranging r objects is:
           1 × 2 × 3 × . . . × r
           which is why we divide by this number in calculating   nCr.


, Tutorial on Chinese remainder theorem