1. Introduction

In this tutorial, we’ll examine computational complexity issues within cryptographic algorithms.

The discussion won't focus on any concrete cryptographic algorithm; instead, we'll lay out the general principles that underlie them all.

2. Symmetric and Public Key Cryptosystems

Suppose two people, A and B, want to communicate secretly. A wants to send a message M so that only B can understand its content. To do this, they use a protocol, called a cryptosystem or cipher, which produces a cryptogram or ciphertext C of the original message via an encryption function e(\cdot):

    \[C=e(M)\]

B uses a decryption function d(\cdot), basically the inverse of e(\cdot), to recover the original message:

    \[d(C)=d(e(M))=M\]
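To make these definitions concrete, here is a minimal Python sketch of a toy cipher, a fixed Caesar shift with no secret key yet, illustrating the relationship d(e(M)) = M:

    # A toy cipher (fixed Caesar shift) illustrating d(e(M)) = M.
    # Purely didactic: anyone who knows the protocol can invert it,
    # which is exactly the weakness discussed below.

    SHIFT = 3  # part of the protocol itself, not a secret key

    def e(message: str) -> str:
        """Encryption: shift every character code forward by SHIFT."""
        return "".join(chr((ord(ch) + SHIFT) % 0x110000) for ch in message)

    def d(ciphertext: str) -> str:
        """Decryption: the inverse function of e."""
        return "".join(chr((ord(ch) - SHIFT) % 0x110000) for ch in ciphertext)

    M = "attack at dawn"
    assert d(e(M)) == M  # d(e(M)) = M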

Encryption and decryption can be considered components of the coding and decoding subsystems in a classic communication mechanism:

(Figure: cryptographic system)

Now suppose a third person, Z, is able to intercept the message that A sent to B. In the most unfavorable scenario, we must assume that Z knows the protocol used by A and B. At this point, Z can decrypt the message by simply applying d(C).

Therefore, we need an additional element to make communication secure. This element is a secret key, or simply a key, known to B and without which Z cannot decrypt the message despite knowing the communication protocol used.

But does A have to know B's secret key? Until the end of the twentieth century, most cryptographers would have answered in the affirmative. In this case, we talk about symmetric-key systems, whose security rests on the secrecy of the key.

In 1976, Diffie and Hellman published the article "New Directions in Cryptography", in which they proposed public-key cryptosystems. In these, A encrypts the message with a publicly known key, or public key, and B decrypts it with his private key, known only to him. Anyone can send an encrypted message to B, but only B can read its content. This eliminates the problem of key distribution, the weak point of symmetric cryptosystems.

3. Classical vs. Modern Cryptography

In the scenario where A and B suspect a potential attack by Z, we can ask ourselves two fundamental questions:

  1. What can Z do with the message-cryptogram pair it gets?
  2. Which result, from the point of view of security, is satisfactory for A and B?

Depending on how we answer these questions, we have two different approaches: classical vs. modern cryptography.

To deepen these issues, we recommend the excellent text by Talbot and Welsh, "Complexity and Cryptography: An Introduction".

3.1. Classical Cryptography

Classical cryptography is based on Information Theory and was largely developed by Shannon, which is why it is also known as the information-theoretic approach. Its basic assumption is as follows:

The cryptogram must not reveal any information about the message.

This assumption leads to the concept of perfect secrecy, which we can formalize as follows:

    \[\forall m\in M,\,\forall c\in C:\quad\Pr(M=m\mid C=c)=\Pr(M=m)\]

This formula says that, given a concrete message m from the set of possible messages M and a concrete cryptogram c from the set of possible cryptograms C, the probability of m is independent of c: even if Z has access to the cryptogram of the message, he cannot learn anything about its content.

One problem with this approach is that a perfect-secrecy system requires a key at least as long as any message that can be encrypted with it, making it unsuitable for modern communication systems, such as the Internet.
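The classic perfectly secret system is the one-time pad, in which the key is a truly random string as long as the message and is never reused. A minimal Python sketch:

    import secrets

    # One-time pad: XOR the message with a random key of the same length.
    # Perfect secrecy comes at the price noted above: |key| = |message|.

    def otp(data: bytes, key: bytes) -> bytes:
        assert len(key) == len(data)
        return bytes(b ^ k for b, k in zip(data, key))

    M = b"attack at dawn"
    key = secrets.token_bytes(len(M))  # truly random, used only once
    C = otp(M, key)                    # encryption
    assert otp(C, key) == M            # XOR is its own inverse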

3.2. Modern Cryptography

Modern cryptography takes a completely different approach. The basic assumption is:

It doesn’t matter if a cryptogram reveals information about the message. What matters is whether Z can efficiently extract this information.

If we assume that Z has an unlimited computational capacity, then the previous proposition does not hold. Hence, modern cryptography considers that:

Z has limited computational resources.

But if this is true for Z, it is also true for A and B, which leads to an additional assumption:

There are mathematical functions that are easy to compute but difficult to invert, called one-way functions.

In this scenario, A and B can encrypt and decrypt messages with few computational resources, while Z can extract information from the message only with enormous computational capabilities. This last assumption clarifies why the complexity of computational procedures, and therefore the ease or difficulty of carrying them out, matters so much.

Modern cryptography is essentially the approach used today in encrypted transactions and communications. However, any technology that delivers exponentially greater computational power, such as the quantum computer, potentially endangers it.

4. The Framework of Complexity Theory

Any computing system, including cryptographic ones, can only use computable functions. What matters is the ease or difficulty of performing the calculation.

Consider a specific problem, for example, sorting a set of numbers. We'll call this problem \Pi. Complexity Theory tries to answer questions like these:

  1. Is \Pi inherently easy or difficult to solve?
  2. Given \Pi_{1} and \Pi_{2}, which is easier to solve?

To give an answer, we classify algorithms into different complexity classes, which group computational procedures with common characteristics, according to the following ideas:

  1. We measure the execution time of an algorithm in terms of the number of elementary operations.
  2. The running time of an algorithm depends on the size of the input.

The O-notation establishes a symbolism to express these ideas. An algorithm of complexity O(n^{2}), for example, performs a number of elementary operations that grows, at most, like the square of the input size.
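As a quick illustration, this Python sketch counts the elementary operations (comparisons) performed by selection sort, a classic O(n^{2}) algorithm:

    # Selection sort performs n(n-1)/2 comparisons, i.e. O(n^2).

    def selection_sort(a: list) -> int:
        """Sort a in place and return the number of comparisons made."""
        comparisons = 0
        for i in range(len(a)):
            smallest = i
            for j in range(i + 1, len(a)):
                comparisons += 1
                if a[j] < a[smallest]:
                    smallest = j
            a[i], a[smallest] = a[smallest], a[i]
        return comparisons

    for n in (10, 100, 1000):
        print(n, selection_sort(list(range(n, 0, -1))))  # grows like n^2/2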

5. Complexity Classes

The O-notation allows us to define the complexity classes in the terms set out above. Further details can be found in our tutorial P, NP, NP-Complete and NP-Hard Problems in Computer Science.

The following figure illustrates the structure of the complexity classes discussed below:

(Figure: complexity classes)

5.1. P Class

It is the class of problems solvable in polynomial time: for some constant c, they admit algorithms that run in time O(n^{c}). We consider these algorithms feasible and efficient, and we can study them theoretically by reducing them to decision problems (problems with yes/no answers).

Polynomial problems with a very large c are in practice intractable, although formally they belong to class P. We consider problems not solvable in polynomial time to be, in general, infeasible.

As is known, the inverse functions d(\cdot) of many cryptographic algorithms rest on the difficulty of factoring products of very large primes. Factoring an n-bit integer by trial division takes time O(2^{n/2}), an exponential time, which makes the procedure infeasible even for n of the order of a few hundred.
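A minimal Python sketch of trial division makes this cost tangible: the loop examines up to \sqrt{N} \approx 2^{n/2} candidate divisors:

    # Factoring by trial division: about 2^(n/2) divisions for an
    # n-bit number, infeasible as soon as n reaches a few hundred bits.

    def trial_division(N: int) -> list[int]:
        factors = []
        d = 2
        while d * d <= N:          # up to sqrt(N) ~ 2^(n/2) candidates
            while N % d == 0:
                factors.append(d)
                N //= d
            d += 1
        if N > 1:
            factors.append(N)
        return factors

    print(trial_division(2**32 + 1))  # [641, 6700417]: fine only for small n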

While the primality test is in P, the most efficient known factorization algorithms run in time roughly O\left(2^{n^{1/3}}\right), far from polynomial. We believe that the factorization problem is not in P, but this is a conjecture, and there is no proof. Cryptosystems such as RSA and Rabin are based on this conjecture.

5.2. NP Class

It is the class of problems verifiable in polynomial time. The factorization problem is clearly in NP, although it is solvable only in exponential time: given a set of factors and an integer, we can verify in polynomial time whether the factorization is correct, simply by multiplying the factors.
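Verification, by contrast, takes one multiplication per factor, as this short Python sketch shows:

    from math import prod

    # Verifying a proposed factorization is easy: multiply the factors
    # back together and compare with the original integer.

    def verify_factorization(N: int, factors: list[int]) -> bool:
        return prod(factors) == N

    print(verify_factorization(2**32 + 1, [641, 6700417]))  # True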

Another well-known example is the traveling salesman problem. The salesman has to visit each city in a group exactly once, and the problem is to identify the sequence of cities leading to the shortest path.

If we approach the problem naively, trying to build all possible paths, it becomes intractable even for modest numbers of cities. Some didactic examples show that even if every atom on Earth were a computing device, each orders of magnitude faster than the most powerful existing computer, the computation would take on the order of the age of the universe for just a few hundred cities.
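The following Python sketch implements the naive approach; the point is less the code than the (n-1)! tours it would have to examine:

    from itertools import permutations
    from math import factorial

    # Brute-force TSP: enumerate all (n-1)! tours that start at city 0.

    def shortest_tour(dist: list[list[float]]) -> tuple[float, tuple]:
        n = len(dist)
        best_len, best_tour = float("inf"), None
        for perm in permutations(range(1, n)):
            tour = (0,) + perm
            length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
            if length < best_len:
                best_len, best_tour = length, tour
        return best_len, best_tour

    for n in (10, 20, 100):
        print(n, factorial(n - 1))  # the number of tours explodes factorially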

5.3. Reducibility

A problem \Pi_{1} reduces to a problem \Pi_{2} if an algorithm that solves \Pi_{2} allows us to solve \Pi_{1}.

Intuitively, reducibility provides an idea of equivalence between problems from a complexity point of view.

5.4. Class NPC and NP-completeness

A problem \Pi is said to be NP-complete (NPC) if \Pi \in NP and every problem in NP reduces to \Pi. These are the most difficult problems in NP and the ones most likely to be intractable.

Although NP-completeness is, in principle, a desirable property for building cryptographic systems, it is not a sufficient condition for this purpose.

An important consequence is that if we found a polynomial-time algorithm for a problem \Pi in NPC, then we could reduce every problem in NP to it and solve them all in polynomial time. The result would be that P = NP, which is considered unlikely.

The equality P = NP would invalidate many crypto algorithms, whose inverse functions are assumed not to be computable in polynomial time.

5.5. BPP Class

The class BPP (bounded-error probabilistic polynomial time) includes problems solvable by algorithms that are not strictly deterministic but make random choices during computation. We won't go into details, but the topic relates to some of the steps that make up cryptographic systems, such as the choice of the secret key.
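As an illustration, here is a Python sketch of the Miller-Rabin primality test, a classic randomized algorithm of this kind: it may declare a composite number prime with a small, bounded probability, and repeating it shrinks the error exponentially. Tests like this are used in practice to generate the large primes behind secret keys:

    import random

    # Miller-Rabin: a randomized primality test. A "False" answer is
    # always correct; a "True" answer errs with probability <= 4^(-rounds).

    def is_probable_prime(n: int, rounds: int = 40) -> bool:
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        d, r = n - 1, 0
        while d % 2 == 0:          # write n - 1 = d * 2^r with d odd
            d //= 2
            r += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)   # random choice each round
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(r - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False       # a witnesses that n is composite
        return True

    print(is_probable_prime(2**127 - 1))  # True: a Mersenne prime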

Complexity theorists conjecture that NP \nsubseteq BPP; since P \subseteq BPP, this conjecture would imply P \neq NP.

6. Complexity-Based Cryptography

From all the previous considerations, we can glimpse the parallelism between Complexity Theory and Cryptography: the first tries to identify problems not solvable in polynomial time, while the second tries to build protocols not breakable in polynomial time.

The current complexity-based cryptography addresses the issue according to some guiding principles:

  • Using a reductionist approach, relating the huge number of possible problems and their computational aspects to a few assumptions about their complexity
  • Identifying the basic properties of computational problems from which we can build cryptographic protocols

Thus, in a cryptographic system, the encryption function e(\cdot) runs in polynomial time, while for an attacker the decryption function d(\cdot) is only verifiable, not computable, in polynomial time. The combinatorial explosion of all the possibilities makes it infeasible to decipher the message by trial and error.

However, as we mentioned, the P \neq NP assumption is not sufficient for a cryptographic protocol. The reason is that this inequality refers to the worst case.

In more formal terms, it means that if a problem \Pi is not in P, then for every polynomial-time algorithm there are inputs on which it fails to solve \Pi. But these hard instances could be a minority, or could be very difficult to find. To build cryptography on a computationally intractable problem, we need systematic ways to generate such hard instances.

7. One-Way Functions

The previous objection can be addressed with the concept of a one-way function, which in turn rests on the concept of a negligible function.

A function \epsilon: \mathbb{N} \rightarrow [0,1] is negligible if for every constant \alpha there exists a \beta such that:

    \[\forall n\geq\beta\Rightarrow\epsilon(n)\leq\frac{1}{n^{\alpha}}\]

In other words, \epsilon decreases faster than the inverse of any polynomial.
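For example, \epsilon(n) = 2^{-n} is negligible, since the exponential 2^{n} eventually dominates every polynomial n^{\alpha}:

    \[\forall\alpha\;\exists\beta:\;\forall n\geq\beta\Rightarrow 2^{-n}\leq\frac{1}{n^{\alpha}}\]

On the contrary, \epsilon(n) = n^{-3} is not negligible, since the condition already fails for \alpha = 4.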

A one-way function satisfies the following conditions:

  • It is calculable in polynomial time
  • It is not invertible in polynomial time. Formally, given a random input x of length n, for every probabilistic polynomial-time algorithm \mathcal{A} there exists a negligible function \epsilon such that

    \[\Pr\left[\mathcal{A}(f(x))=x\right]\leq\epsilon(n)\]

The input length n is the equivalent of the key length in a cryptographic protocol.

It is possible to show that if NP \subseteq BPP, then there are no one-way functions.

7.1. Examples of One-Way Functions

Some well-known examples are:

  • Multiplication, f(p,q)=pq, with p and q prime numbers of equal length. Inverting f is the factorization problem which, as we've already seen, is considered infeasible
  • Subset sum, f(x_{1},x_{2},\ldots,x_{n},S)=\left(x_{1},x_{2},\ldots,x_{n},\sum_{i\in S}x_{i}\right), where each x_{i} is an n-bit integer and S\subseteq[n]. The inversion is the subset sum problem, used for example in knapsack cryptographic schemes. This problem is in NP, and although there is no certainty that f is one-way, we believe it is
  • The discrete log collection, f_{G,g}(x)=g^{x}, where G is a cyclic group, g is a generator of G, and x\in\left\{1,\ldots,|G|-1\right\}. The inversion, the discrete log problem, is considered infeasible (see the sketch after this list)
  • The RSA collection, f_{n,e}(x)=x^{e}\mod n, where n is the product of two primes of equal length
  • Rabin's collection, f_{n}(x)=x^{2}\mod n, used in the Rabin cryptosystem and in the Rabin digital signature scheme. Its inversion is at least as hard as the factorization of n
  • Hash functions and block ciphers
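As a taste of this asymmetry, here is a Python sketch of the discrete log collection over \mathbb{Z}_{p}^{*}; the prime p and the element g below are illustrative assumptions (g = 3 is taken as a generator for the sake of the example). Computing f is one fast modular exponentiation, while the naive inversion must search an exponentially large space:

    import random

    # Discrete log collection f_{G,g}(x) = g^x over Z_p*.
    # Forward: fast modular exponentiation. Backward: naive search.

    p = 2**61 - 1  # a Mersenne prime; Z_p* is cyclic
    g = 3          # assumed to be a generator, for illustration only

    x = random.randrange(1, p - 1)
    y = pow(g, x, p)               # easy direction: polynomial time

    def brute_force_dlog(y: int) -> int:
        """Invert f by exhaustive search: ~2^61 steps here. Don't run it."""
        acc = 1
        for candidate in range(1, p):
            acc = (acc * g) % p
            if acc == y:
                return candidate
        raise ValueError("no discrete log found")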

It can be shown that most cryptographic operations can be performed with one-way functions.

8. Trapdoor Functions

For many cryptographic systems, such as public-key cryptography, one-way functions are not enough: additional properties are needed, such as the trapdoor property. In this case, the function can be easily inverted given some extra information, the trapdoor.

A collection of functions F=\left\{f_{i}:D_{i}\rightarrow R_{i}\right\} is a collection of trapdoor functions if it satisfies the following properties:

  • There exists a probabilistic polynomial-time algorithm that, given an input n, generates a pair (i,t_{i}), where i is the index of one of the functions of the family and t_{i} is its trapdoor information
  • Given an input (i,x\in D_{i}), it is possible to compute f_{i}(x) in polynomial time
  • Given an input (i,f_{i}(x)), no probabilistic polynomial-time algorithm can recover x with non-negligible probability
  • Given (t_{i},f_{i}(x)), it is possible to compute x in polynomial time

The RSA and Rabin collections have the trapdoor property: they can be inverted in polynomial time given the factorization of the modulus n.
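A toy Python sketch of this mechanism for RSA, with deliberately tiny primes (didactic only, completely insecure): the public index is the pair (n, e), and the trapdoor information is the factorization of n, from which the inverse exponent d follows:

    from math import gcd

    # RSA as a trapdoor function: easy forward with (n, e),
    # easy backward only with the trapdoor (the factorization of n).

    p, q = 1009, 1013          # in practice: large random primes
    n, phi = p * q, (p - 1) * (q - 1)
    e = 65537
    assert gcd(e, phi) == 1
    d = pow(e, -1, phi)        # trapdoor: requires knowing p and q

    def f(x: int) -> int:
        """Easy direction, computable from the public index (n, e)."""
        return pow(x, e, n)

    x = 123456
    assert pow(f(x), d, n) == x  # inversion with the trapdoor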

In this context, a remarkable theorem establishes:

If trapdoor functions exist, then public-key encryption schemes exist.

This result makes trapdoor functions the natural tool of public-key cryptography.

9. Conclusion

We have treated the relationship between complexity and cryptography from an introductory point of view. The points of contact between the two disciplines are fundamental for understanding the reasons behind the choice of particular techniques or protocols.

After this general discussion, the natural next step we recommend is to study the individual algorithms in depth.
