RSA - Prime numbers

The RSA (Rivest-Shamir-Adleman) algorithm allows you to create a public and private key, and then use them to encrypt and decrypt information. The public key is widely known and anyone can use it to encrypt any message. Only the holder of the private key can decrypt the received message. The holder of the private key can also use it to encrypt data, and the holder of the public key can decrypt this data. The security of the RSA encryption algorithm is based on the factorization of large prime numbers mentioned below.

The public key consists of the modulus n (p*q) and the public exponent e, while the private key consists of the same modulus n and the private exponent d .

All the numbers associated with the private key should be kept secret: the number d and the next three p, q and φ(n) which can be used to compute d.

Security vulnerability in the RSA algorithm

So let’s start with the following formulas:

6y+1 where y > 0, y mod 10 ≠ 4, y mod 10 ≠ 9
^and
6x-1 where x > 0 , x mod 10 ≠ 1, x mod 10 ≠ 6

^{{(6y+1), (6x-1) | y > 0, y mod 10 ≠ 4, y mod 10 ≠ 9, x > 0 , x mod 10 ≠ 1, x mod 10 ≠ 6}}

Here are some initial numbers from the discussed formulas:
7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49, 53, 59, 61, 67, 71, 73, 77, 79, 83, 89, 91, 97…

These formulas create all prime numbers (except 2, 3 and intentionally 5) and near-prime numbers, i.e. products and squares of prime numbers. (For more on near-prime numbers, see NEAR-PRIME NUMBERS). The formulas used above eliminate all natural numbers that cannot be p, q and n values occurring in the RSA algorithm. As we will see, the product of p and q is always a near-prime number and creates the n value of the public key.

The advantage of these formulas, which will be appreciated by the IT industry, is the ability to generate a set of numbers in which as much as ~73.33% of all natural numbers that cannot be prime numbers are omitted. The downside to the existence of such patterns is the reason that RSA encryption can be subject to serious threats that can compromise the privacy and security of many IT systems.

As a curiosity, it should be added that 2/3 of the value of n public keys belongs to the formula 6y+1, and only 1/3 of the value of n belongs to the set 6x-1, because:

{(6x-1)*(6x-1)} ⊆ {6y+1}
{(6y+1)*(6y+1)} ⊆ {6y+1}

whereas

{(6x-1)*(6y+1)} ⊆ {6x-1}

The first factor

As we know, the n value of the public key always belongs to one of the sets 6x-1 or 6y+1. Using the formulas given above, we generate a set of numbers up to the square root of n. Thus, we will generate only ~26.66% of natural numbers smaller than √n. One of the generated numbers will always correspond to p or q.

Example

We have the value n=1037. We calculate √1037 = ~32.

We will now use numbers from these formulas that are less than 32. Here are the numbers generated from these formulas: 7, 11, 13, 17, 19, 23, 29, 31.

We know that one of these numbers is definitely n. Let’s quickly calculate that 17 is a divisor of 1037. So the second factor is 61, because:

1037 / 17 = 61

As we can see, when using this method, we only used 7, 11, 13 and 17, which are the first four numbers from the formulas 6x-1 and 6y+1. In theory, we were prepared to use the maximum of 8 generated numbers out of a possible 32 natural numbers to find one prime factor. In practice, we used the first 4 generated numbers, out of 17 possible natural numbers, to finally hit one of the p or q factors.

Or we could take the risk of using just one set of numbers. With luck, we’d use only ~13.33% of all integers less than √n and speed up the factorization process by half.

Database

Another threat lies in the fact that it is possible to generate a database with ready factorization of n. Using the 6x-1 and 6y+1 formulas, you can eliminate ~73.33% of all natural numbers that cannot be p and q values, which greatly facilitates the work of the potential attacker.

If the attacker writes a script that multiplies each number obtained from the formulas 6y+1 and 6x-1 by every other number from these formulas, including himself, then he will store these values in the database with information about which primes have produced a specific value of the key n public, it will have a ready database with factorization of n values from the RSA algorithm.

In order to optimize time and resources, the attacker will probably generate such a database from the range of numbers in which p and q values can only be found up to the range of n values. Calculating the value of φ(n) and using a simple modular reversal algorithm will allow the attacker to calculate the d value in an effective way private key.

The only issue is the computing power and time needed to generate such a database, but it is worth considering the fact that if such a database is generated even once, then the era of RSA encryption will end. One solution to this problem is to change the encryption method to another one that is not so vulnerable to attacks using such methods. Of course, as of today, without the widespread availability of quantum technology and computers, this is a serious threat if the attacker has the appropriate computing power and financial resources.

The article is also available in Polish, click here.