RSA algorithm

RSA (Rivest–Shamir–Adleman) stops people from understanding messages they are not allowed to read. A message only some people can understand is called an encrypted message. Any message can become an encrypted message. An encrypted message can be given to anyone because they will not understand what it says. To understand the encrypted message a person must first get the original message back. This is called decryption.

RSA uses two special numbers. One number is called the public key and allows anyone to write an encrypted message. The second number is called a private key and allows understanding the encrypted message (decryption). Both numbers belong together. They are a pair. RSA gives everyone a different pair of numbers to use.

Reading and writing encrypted messages is called cryptography. There are different kinds of cryptography. Cryptography that uses two numbers (keys), like RSA, is called asymmetric cryptography and public key cryptography.

Usually only reading encrypted messages is important. Anyone may write them. This is why the public key can be given to everyone and is called public. The private key needs to be kept secret.

RSA uses math that is easy to do but hard to undo. Someone who does not have the private key must do a lot of work to find it. Reading an encrypted message with the private key is easy.

The math takes a large number and finds numbers that can be multiplied to equal the large number. The smaller numbers are called factors. If a number has no factors it is called a prime number otherwise it is called a composite number. Finding the smallest factors for a large number is called prime factorization.

Concepts used

RSA uses these concepts from cryptography:

A one-way function that is easy to compute; but finding a way to reverse it, or computing the reverse is very hard.
A discrete logarithm, which works much like the normal logarithm, but only whole numbers are used, and in general, a modulus (a math operation that finds the remainder of a number when divided by another number) is used. As an example, a^x=b, modulo n. The discrete logarithm is about finding the smallest number x that makes this equation true, when a, b and n are known.
It is potentially solved quickly by Shor's algorithm.

Generating keys

The keys for the RSA algorithm are generated the following way:

Choose two different large random prime numbers [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math]. This should be kept secret.
Calculate [math]\displaystyle{ n = p q }[/math].
- [math]\displaystyle{ n }[/math] is the modulus for the public key and the private keys.
Calculate the totient: [math]\displaystyle{ \phi (n) = (p-1)(q-1) }[/math].
Choose an integer [math]\displaystyle{ e }[/math] such that [math]\displaystyle{ 1 \lt e \lt \phi(n) }[/math], and [math]\displaystyle{ e }[/math] is co-prime to [math]\displaystyle{ \phi(n) }[/math].
i.e.: [math]\displaystyle{ e }[/math] and [math]\displaystyle{ \phi (n) }[/math] share no factors other than [math]\displaystyle{ 1 }[/math]: [math]\displaystyle{ \gcd(e, \phi(n)) = 1 }[/math]; See greatest common divisor.
- [math]\displaystyle{ e }[/math] is released as the public key exponent.
Compute [math]\displaystyle{ d }[/math] to satisfy the congruence relation [math]\displaystyle{ d e \equiv 1 \!\!\!\!\pmod{\phi(n)} }[/math].
i.e.: [math]\displaystyle{ de = 1 + x\cdot\phi(n) }[/math] for some integer [math]\displaystyle{ x }[/math]. (Simply to say: Calculate [math]\displaystyle{ d = (1 + x\cdot\phi(n))/e }[/math] to be an integer)
- [math]\displaystyle{ d }[/math] is kept as the private key exponent.

Notes on the above steps:

Step 1: Numbers can be probabilistically tested for primality.
Step 3: changed in PKCS#1 v2.0 to [math]\displaystyle{ \lambda(n) = {\rm lcm}(p-1, q-1) }[/math] instead of [math]\displaystyle{ \phi(n) = (p-1)(q-1) }[/math].
Step 4: A popular choice for the public exponents is [math]\displaystyle{ e = 2^{16} + 1 = 65537 }[/math]. Some applications choose smaller values such as [math]\displaystyle{ e = 3 }[/math], [math]\displaystyle{ 5 }[/math], or [math]\displaystyle{ 35 }[/math] instead. This is done to make encryption and signature verification faster on small devices like smart cards but small public exponents may lead to greater security risks.
Steps 4 and 5 can be performed with the extended Euclidean algorithm [en]; see modular arithmetic.

The public key is made of the modulus [math]\displaystyle{ n\, }[/math] and the public (or encryption) exponent [math]\displaystyle{ e\, }[/math].
The personal key is made of p,q and the private (or decryption) exponent [math]\displaystyle{ d\, }[/math] which must be kept secret.

For efficiency a different form of the private key can be stored:
- [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math]: the primes from the key generation;
- [math]\displaystyle{ d\bmod(p - 1) }[/math] and [math]\displaystyle{ d\bmod(q - 1) }[/math]: often called dmp1 and dmq1;
- [math]\displaystyle{ q^{-1}\bmod{p} }[/math]: often called iqmp.
All parts of the private key must be kept secret in this form. [math]\displaystyle{ p\, }[/math] and [math]\displaystyle{ q\, }[/math] are sensitive since they are the factors of [math]\displaystyle{ n\, }[/math], and allow computation of [math]\displaystyle{ d\, }[/math] given [math]\displaystyle{ e\, }[/math]. If [math]\displaystyle{ p\, }[/math] and [math]\displaystyle{ q\, }[/math] are not stored in this form of the private key then they are securely deleted along with other intermediate values from key generation.
Although this form allows faster decryption and signing by using the Chinese Remainder Theorem (CRT) it is considerably less secure since it enables side channel attacks [en]. This is a particular problem if implemented on smart cards, which benefit most from the improved efficiency.
(Start with [math]\displaystyle{ y \equiv x^e \!\!\!\!\pmod{n} }[/math], and let the card decrypt that. So it computes [math]\displaystyle{ (y^d \bmod{p}) }[/math] or [math]\displaystyle{ (y^d \bmod{q}) }[/math], whose results give some value [math]\displaystyle{ z }[/math]. Now, induce an error in one of the computations. Then [math]\displaystyle{ \gcd(z-x,n) }[/math] will reveal [math]\displaystyle{ p }[/math] or [math]\displaystyle{ q }[/math].)

Encrypting message

Alice gives her public key [math]\displaystyle{ (n, e) }[/math] to Bob and keeps her private key secret. Bob wants to send message [math]\displaystyle{ M }[/math] to Alice.

First he turns [math]\displaystyle{ M }[/math] into a number [math]\displaystyle{ m }[/math] smaller than [math]\displaystyle{ n }[/math] by using an agreed-upon reversible protocol known as a padding scheme. He then computes the ciphertext [math]\displaystyle{ c\, }[/math] corresponding to:

[math]\displaystyle{ c = m^e \mod{n} }[/math]

This can be done quickly using the method of exponentiation by squaring. Bob then sends [math]\displaystyle{ c\, }[/math] to Alice.

Decrypting message

Alice can recover [math]\displaystyle{ m\, }[/math] from [math]\displaystyle{ c\, }[/math] by using her private key [math]\displaystyle{ d\, }[/math] in the following procedure:

Given [math]\displaystyle{ m\, }[/math], she can recover the original distinct prime numbers, applying the Chinese remainder theorem to these two congruences yields

[math]\displaystyle{ m^{ed} \equiv m \bmod{pq} }[/math].

Thus,

[math]\displaystyle{ c^d \equiv m \bmod{n} }[/math].

Therefore:

[math]\displaystyle{ m = c^d\ mod\ n }[/math]

A working example

Here is an example of RSA encryption and decryption. The prime numbers used here are too small to let us securely encrypt anything. You can use OpenSSL to generate and examine a real keypair.

1. Choose two random prime numbers [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q \, }[/math]:

[math]\displaystyle{ p = 61 }[/math] and [math]\displaystyle{ q = 53 \, }[/math];

2. Compute [math]\displaystyle{ n = p q \, }[/math]:

[math]\displaystyle{ n = 61 \times 53 = 3233 \! }[/math];

3. Compute the totient [math]\displaystyle{ \phi(n) = (p-1)(q-1) }[/math]:

[math]\displaystyle{ \phi(n) = (61 - 1)(53 - 1) = 3120 \! }[/math];

4. Choose [math]\displaystyle{ e\gt 1 }[/math] coprime to [math]\displaystyle{ 3120 \, }[/math]:

[math]\displaystyle{ e = 17 \, }[/math];

5. Choose [math]\displaystyle{ d\, }[/math] to satisfy [math]\displaystyle{ e d \equiv 1 \bmod{\phi(n)} }[/math]:

[math]\displaystyle{ d = 2753 \, }[/math], with [math]\displaystyle{ 17 \times 2753 = 46801 = 1 + 15 \times 3120 }[/math].

The public key is ([math]\displaystyle{ n=3233 }[/math], [math]\displaystyle{ e=17 }[/math]). For a padded message [math]\displaystyle{ m\, }[/math]the encryption function [math]\displaystyle{ c = m^e \bmod{n} }[/math] becomes:

[math]\displaystyle{ c = m^{17} \bmod 3233\, }[/math]

The private key is ([math]\displaystyle{ n=3233 }[/math], [math]\displaystyle{ d=2753 }[/math]). The decryption function [math]\displaystyle{ m = c^d \bmod{n} }[/math] becomes:

[math]\displaystyle{ m = c^{2753} \bmod 3233\, }[/math]

For example, to encrypt [math]\displaystyle{ m=123 }[/math], we calculate

[math]\displaystyle{ c = 123^{17} \bmod 3233 = 855 }[/math]

To decrypt [math]\displaystyle{ c = 855 }[/math], we calculate

[math]\displaystyle{ m = 855^{2753} \bmod 3233 = 123 }[/math]

Both of these calculations can be computed fast and easily using the square-and-multiply algorithm for modular exponentiation.

Deriving RSA equation from Euler's theorem

RSA can easily be derived using Euler's theorem and Euler's totient function.

Starting with Euler's theorem,

[math]\displaystyle{ m^{\phi (n)} \equiv 1 \pmod{n} }[/math]

we must show that decrypting an encrypted message, with the correct key, will give the original message. Given a padded message m, the ciphertext c, is calculated by

[math]\displaystyle{ c \equiv m^e \pmod n }[/math]

Substituting into the decryption algorithm, we have

[math]\displaystyle{ c^d \equiv (m^e)^d \equiv m^{ed} \pmod n }[/math]

We want to show this value is also congruent to m. The values of e and d were chosen to satisfy,

[math]\displaystyle{ ed \equiv 1 \pmod {\phi (n)} }[/math]

Which is to say, there exists some integer k, such that

[math]\displaystyle{ ed = k\times \phi(n) + 1 }[/math]

Now we substitute into the encrypted then decrypted message,

[math]\displaystyle{ \begin{align} m^{ed} &\equiv m^{k \phi(n) + 1} \\ &\equiv m \times m^{k\phi(n)} \\ &\equiv m \times \left(m^{\phi(n)}\right)^k \pmod n \end{align} }[/math]

We apply Euler's theorem, and achieve the result.

[math]\displaystyle{ m\times (1)^k \equiv m \pmod n }[/math]

Padding schemes

When used in practice, RSA must be combined with some form of padding scheme, so that no values of M result in insecure ciphertexts. RSA used without padding may have some problems:

The values m = 0 or m = 1 always produce ciphertexts equal to 0 or 1 respectively, due to the properties of exponentiation.
When encrypting with small encryption exponents (e.g., e = 3) and small values of the m, the (non-modular) result of [math]\displaystyle{ m^e }[/math] may be strictly less than the modulus n. In this case, ciphertexts may be easily decrypted by taking the eth root of the ciphertext with no regard to the modulus.
RSA encryption is a deterministic encryption algorithm. It has no random component. Therefore, an attacker can successfully launch a chosen plaintext attack against the cryptosystem. They can make a dictionary by encrypting likely plaintexts under the public key, and storing the resulting ciphertexts. The attacker can then observe the communication channel. As soon as they see ciphertexts that match the ones in their dictionary, the attackers can then use this dictionary in order to learn the content of the message.

In practice, the first two problems can arise when short ASCII messages are sent. In such messages, m might be the concatenation of one or more ASCII-encoded character(s). A message consisting of a single ASCII NUL character (whose numeric value is 0) would be encoded as m = 0, which produces a ciphertext of 0 no matter which values of e and N are used. Likewise, a single ASCII SOH (whose numeric value is 1) would always produce a ciphertext of 1. For systems which conventionally use small values of e, such as 3, all single character ASCII messages encoded using this scheme would be insecure, since the largest m would have a value of 255, and 255³ is less than any reasonable modulus. Such plaintexts could be recovered by simply taking the cube root of the ciphertext.

To avoid these problems, practical RSA implementations typically embed some form of structured, randomized padding into the value m before encrypting it. This padding ensures that m does not fall into the range of insecure plaintexts, and that a given message, once padded, will encrypt to one of a large number of different possible ciphertexts. The latter property can increase the cost of a dictionary attack beyond the capabilities of a reasonable attacker.

Standards such as PKCS have been carefully designed to securely pad messages prior to RSA encryption. Because these schemes pad the plaintext m with some number of additional bits, the size of the un-padded message M must be somewhat smaller. RSA padding schemes must be carefully designed so as to prevent sophisticated attacks. This may be made easier by a predictable message structure. Early versions of the PKCS standard used constructions, which were later found vulnerable to a practical adaptive chosen ciphertext attack. Modern constructions use secure techniques such as optimal asymmetric encryption padding (OAEP) to protect messages while preventing these attacks. The PKCS standard also has processing schemes designed to provide additional security for RSA signatures, e.g., the Probabilistic Signature Scheme for RSA (RSA-PSS).

Signing messages

Suppose Alice uses Bob's public key to send him an encrypted message. In the message, she can claim to be Alice but Bob has no way of verifying that the message was actually from Alice since anyone can use Bob's public key to send him encrypted messages. So, in order to verify the origin of a message, RSA can also be used to sign a message.

Suppose Alice wishes to send a signed message to Bob. She produces a hash value of the message, raises it to the power of d mod n (just like when decrypting a message), and attaches it as a "signature" to the message. When Bob receives the signed message, he raises the signature to the power of e mod n (just like encrypting a message), and compares the resulting hash value with the message's actual hash value. If the two agree, he knows that the author of the message was in possession of Alice's secret key, and that the message has not been tampered with since.

Note that secure padding schemes such as RSA-PSS are as essential for the security of message signing as they are for message encryption, and that the same key should never be used for both encryption and signing purposes.

RSA Algorithm Media

Adi Shamir, co-inventor of RSA (the others are Ron Rivest and Leonard Adleman)

References

Other websites

The Original RSA Patent as filed with the U.S. Patent Office by Rivest; Ronald L. (Belmont, MA), Shamir; Adi (Cambridge, MA), Adleman; Leonard M. (Arlington, MA), December 14, 1977, U.S. Patent 4,405,829 .
PKCS #1: RSA Cryptography Standard (RSA Laboratories website)
- The PKCS #1 standard "provides recommendations for the implementation of public-key cryptography based on the RSA algorithm, covering the following aspects: cryptographic primitives; encryption schemes; signature schemes with appendix; ASN.1 syntax for representing keys and for identifying the schemes".
Explanation of RSA using colored lamps at YouTube
Thorough walk through of RSA
Prime Number Hide-And-Seek: How the RSA Cipher Works Archived 2013-07-25 at the Wayback Machine
Onur Aciicmez, Cetin Kaya Koc, Jean-Pierre Seifert: On the Power of Simple Branch Prediction Analysis
A New Vulnerability In RSA Cryptography, CAcert NEWS Blog
Example of an RSA implementation with PKCS#1 padding (GPL source code) Archived 2012-07-22 at the Wayback Machine
Kocher's article about timing attacks
An animated explanation of RSA with its mathematical background by CrypTool Archived 2018-09-28 at the Wayback Machine
An interactive walkthrough going through all stages to make small example RSA keys Archived 2020-07-07 at the Wayback Machine
Hacking Secret Ciphers with Python, Chapter 24, Public Key Cryptography and the RSA Cipher
Grime, James. "RSA Encryption". Numberphile. Brady Haran. Archived from the original on 2018-10-06. Retrieved 2019-06-20.
Prime Numbers, Factorization, and their Relationship with Encryption

Template:Cryptography public-key