Continued Fractions and Probability Estimations in Shor’s Algorithm: A Detailed and Self-Contained Treatise

Johanna Barzen; Frank Leymann

doi:10.3390/appliedmath2030023


Institute of Architecture of Application Systems, University of Stuttgart, Universitätsstr. 38, 70569 Stuttgart, Germany

^*

Author to whom correspondence should be addressed.

AppliedMath 2022, 2(3), 393-432; https://doi.org/10.3390/appliedmath2030023

This article belongs to the Special Issue Applications of Number Theory to the Sciences and Mathematics


Abstract

Shor’s algorithm for prime factorization is a hybrid algorithm consisting of a quantum part and a classical part. The main focus of the classical part is a continued fraction analysis. The presentation of this is often short, pointing to text books on number theory. In this contribution, we present the relevant results and proofs from the theory of continued fractions in detail (even in more detail than in text books), filling the gap to allow a complete comprehension of Shor’s algorithm. Similarly, we provide a detailed computation of the estimation of the probability that convergents will provide the period required for determining a prime factor.

Keywords:

quantum algorithms; quantum computing; continued fractions; hybrid quantum algorithms

1. Introduction

Shor’s algorithm [1] for prime factorization is generally considered as a major milestone and a breakthrough in quantum computing: it solves a practically very relevant problem (which is, e.g., an underpinning of cryptography) with an exponential speedup compared to classical methods.

The algorithm is based on the fact that determining a divisor and finally a prime factor of a natural number

n \in ℕ

can be reduced to finding the period p of the modular exponentiation function

f(x) = a^{x} \bmod n

for an a with

0 < a < n

(see Section 3.2.1).
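The reduction can be illustrated classically on a tiny number. The following is a minimal sketch, not the quantum procedure: it finds the period by brute force and then forms the candidate divisors gcd(a^{p/2} ± 1, n), which is the standard way an even period is exploited. The function names `period` and `divisors_from_period` and the example values n = 15, a = 7 are assumptions chosen for illustration.

```python
from math import gcd

def period(a: int, n: int) -> int:
    """Smallest p >= 1 with a**p % n == 1, by brute force (needs n > 1, gcd(a, n) == 1)."""
    if n <= 1 or gcd(a, n) != 1:
        raise ValueError("need n > 1 and a coprime to n")
    value, p = a % n, 1
    while value != 1:
        value = (value * a) % n
        p += 1
    return p

def divisors_from_period(a: int, n: int) -> tuple[int, int]:
    """Candidate divisors gcd(a**(p/2) - 1, n) and gcd(a**(p/2) + 1, n) for an even period p."""
    p = period(a, n)
    if p % 2 != 0:
        raise ValueError("period is odd; another a has to be chosen")
    half = pow(a, p // 2, n)  # a**(p/2) mod n
    return gcd(half - 1, n), gcd(half + 1, n)

print(period(7, 15))                # 4
print(divisors_from_period(7, 15))  # (3, 5)
```

The candidates may turn out to be trivial (1 or n) for an unlucky choice of a; in that case another a is tried.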

The overall algorithm is hybrid, consisting of classical computations and a quantum computation. The classical computations determine greatest common divisors with the Euclidean algorithm and perform a continued fraction analysis. A detailed discussion of the latter is one of the two foci of this contribution (see Section 2).

The quantum part mainly consists of: (i) creating an entangled state based on an oracle computing the modular exponentiation function f above, (ii) performing a quantum Fourier transform (QFT) on this state, and (iii) measuring it. The oracle produces the following state:

|a⟩|b⟩ = \frac{1}{\sqrt{N}} \sum_{x = 0}^{N - 1} |x⟩|f(x)⟩

(1)

After applying the quantum Fourier transform and a measurement, the first part (i.e., the |a⟩-part) of the quantum register is in state

\frac{1}{\sqrt{N A}} \sum_{j = 0}^{A - 1} ω_{N}^{j p y} |y⟩

(2)

In this state, the searched period p already appears in its amplitude. The measured value y can then be used with high probability (see Section 3.4, Theorem 16) to compute the period p of the modular exponentiation function f by analyzing the convergents of a continued fraction (see Section 3.4.1) and finally, based on the period, a prime factor (see Section 3.2.1). A detailed discussion on how this is achieved is the second focus of this contribution (see Section 3).

Structure of the Article

The article is structured as follows: in Section 2 we cover all details about continued fractions that are required to comprehend the corresponding aspect of Shor’s algorithm.

Section 2.1 defines the notion of a continued fraction, gives examples of how to compute the continued fraction representation of a rational number, and demonstrates how to compute the number that a continued fraction (and thus convergents) represents.

Convergents as the fundamental tool in the theory of continued fractions are detailed in Section 2.2: after defining the term, basic theorems about convergents such as the recursion theorem, two sign theorems, monotony properties, convergent comparison, nesting of a number by its convergents, and several distance estimations are proven.

Next, the brief Section 2.3 presents infinite regular continued fractions to represent non-rational numbers. A corresponding algorithm is provided to compute such continued fractions.

Section 2.4 gives several upper bounds and lower bounds for the difference between a number and its convergents. Exploiting one of these bounds, the convergence of the convergents of an infinite regular continued fraction of a number to this number is proven. Semiconvergents are defined and corresponding monotony properties are given.

Best approximations of a real number are introduced in Section 2.5. It is proven that best approximations of the second kind are convergents and vice versa (Lagrange’s theorem). Best approximations of the first kind are proven to be convergents or semiconvergents (another theorem by Lagrange). Finally, Legendre’s theorem is presented, which is the main result about continued fractions required by Shor’s algorithm: it allows the implication that a given fraction is a convergent of another number.

Section 3 is devoted to estimating the probability that convergents can be used to compute periods, i.e., that Legendre’s theorem can be applied.

At the beginning of Section 3, Section 3.1 proves a lower bound and an upper bound for the secant lengths of the unit circle. This estimation is central for estimating the aforementioned probability.

Section 3.2 contains many different estimations of parameters that appear in the measurement result of Shor’s algorithm. In Section 3.2.1, we recall the very basics of modular arithmetic, relate this to group theory, and use Lagrange’s theorem from group theory to prove that the period of the modular exponentiation function in Shor’s algorithm is less than the number to be factorized (Lemma 8). Intervals of consecutive multiples of the period are studied in Section 3.2.2: it is shown that multiples of N are sparsely scattered across these intervals (Note 12). This implies that measurement results are somehow centered around multiples of

N / p

(Corollary 9). The cardinality of arguments in the superposition that build the pre-image of a certain

f (x)

is estimated in Section 3.2.3. Section 3.2.4 proves bounds of phases of amplitudes relevant for computing the probability of measurement results as a geometric sum.

Finally, Section 3.3 computes this probability: it is proven that a measurement result is close to a multiple of

N / p

with probability of approximately

4 / π^{2}

(Lemma 10).

Section 3.4 shows that this measurement result fulfills the assumption of Legendre’s theorem (Theorem 15). Thus, by computing convergents, the period can be determined (Theorem 16 and Section 3.4.1).

Section 3.5 sketches how the main results contribute to the proof of Shor’s algorithm. Its purpose is to avoid getting lost in the huge amount of low-level details.

A brief conclusion and discussion of related work ends this contribution with Section 4.

2. Continued Fractions

2.1. Definition of Continued Fractions and Their Computation

We define the notion of continued fractions and give an example of how to compute them.

Definition 1.

An expression of the form

a_{0} + \frac{b_{1}}{a_{1} + \frac{b_{2}}{a_{2} + \frac{b_{3}}{⋱}}}

(3)

with

a_{i}, b_{i} \in ℂ

is called an infinite continued fraction.

If, in this expression, it is

b_{i} = 1

for all i,

a_{0} \in ℤ

, and

a_{i} \in ℕ

for i≥1, the expression is called a regular continued fraction.

A finite regular continued fraction (simply called a continued fraction) satisfies, in addition, the condition

\exists N \in ℕ \forall k \in ℕ : a_{N + k} = 0

(convention: “1/0 = 0”).

A continued fraction is, thus, the following expression:

[a_{0}; a_{1}, \dots, a_{N}] \overset{\text{def}}{=} a_{0} + \frac{1}{a_{1} + \frac{1}{⋱ + \frac{1}{a_{N - 1} + \frac{1}{a_{N}}}}}

(4)

A continued fraction of a rational number a/b is computed as follows: the integer part

⌊ a / b ⌋

becomes

a_{0} \in ℤ

, leaving the non-negative rational remainder

x_{1} / y_{1} \in ℚ

. The latter is now written as

1 / (y_{1} / x_{1})

, resulting in

a_{0} + \frac{1}{(\frac{y_{1}}{x_{1}})}

Next, the integer part

⌊ y_{1} / x_{1} ⌋

becomes

a_{1}

, leaving a rational remainder that is treated as before. This process continues until the rational remainder is zero. Figure 1 gives an example of the processing.

Figure 1. Example of a straightforward computation of a continued fraction.

Besides this straightforward procedure for computing continued fractions, the well-known Euclidean algorithm can be used for this purpose too. Figure 2 gives a corresponding example; it should be self-explanatory.

Figure 2. Using the Euclidean algorithm to compute a continued fraction.
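The Euclidean variant of the computation can be sketched in a few lines. This is a minimal illustration under the assumption of a positive denominator; the function name `continued_fraction` and the example fraction 415/93 are chosen here and are not taken from the figures.

```python
def continued_fraction(a: int, b: int) -> list[int]:
    """Return [a0, a1, ..., aN] with a/b = [a0; a1, ..., aN], for b > 0."""
    if b <= 0:
        raise ValueError("denominator must be positive")
    terms = []
    while b != 0:
        quotient, remainder = divmod(a, b)  # a = quotient * b + remainder, 0 <= remainder < b
        terms.append(quotient)              # integer part becomes the next element
        a, b = b, remainder                 # continue with the inverted remainder b / remainder
    return terms

print(continued_fraction(415, 93))  # [4, 2, 6, 7]
print(continued_fraction(-7, 3))    # [-3, 1, 2]
```

Each loop iteration is one division step of the Euclidean algorithm; the quotients are exactly the elements of the continued fraction.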

Formally, a continued fraction can always be reduced such that its last element is greater than or equal to 2.

Note 1.

Let

[a_{0}; a_{1}, \dots, a_{N}]

be a continued fraction. Then:

[a_{0}; a_{1}, \dots, a_{N}] = [a_{0}; a_{1}, \dots, a_{N - 1} + \frac{1}{a_{N}}]

(5)

Especially, it can always be achieved that a continued fraction

[a_{0}; a_{1}, \dots, a_{N}]

satisfies

a_{N} \geq 2

.

Proof.

The following simple computation proves the first claim:

\begin{matrix} [a_{0}; a_{1}, \dots, a_{N}] & = a_{0} + \frac{1}{a_{1} + \frac{1}{⋱ + \frac{1}{a_{N - 1} + \frac{1}{a_{N}}}}} \\ & = a_{0} + \frac{1}{a_{1} + \frac{1}{⋱ + \frac{1}{(a_{N - 1} + \frac{1}{a_{N}})}}} \\ & = [a_{0}; a_{1}, \dots, a_{N - 1} + \frac{1}{a_{N}}] \end{matrix}

Furthermore, if

a_{N} = 1

in

[a_{0}; a_{1}, \dots, a_{N}]

then

a_{N - 1} + 1 / a_{N}

≥ 2. This is because, by definition,

a_{k} \geq 1

for 1 ≤ k ≤ N. □

Equation (5) implies a straightforward way to compute the value represented by a continued fraction

[a_{0}; a_{1}, \dots, a_{N}]

: see Figure 3.

Figure 3. Computing the value of a continued fraction based on Equation (5).
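The evaluation based on Equation (5) amounts to repeatedly merging the last two elements. The following minimal sketch does this with exact rational arithmetic; the function name `value` and the example elements are assumptions for illustration.

```python
from fractions import Fraction

def value(terms: list[int]) -> Fraction:
    """Evaluate [a0; a1, ..., aN] exactly by merging elements from the right."""
    if not terms:
        raise ValueError("at least a0 is required")
    last = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        last = a + 1 / last  # Equation (5): the pair (a, last) collapses to a + 1/last
    return last

print(value([4, 2, 6, 7]))  # 415/93
```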

2.2. Convergents

Next, we define the “workhorses” of the theory of continued fractions.

Definition 2.

[a_{0}; a_{1}, \dots, a_{m}]

is called the m-th convergent of the continued fraction

[a_{0}; a_{1}, \dots, a_{N}]

for 0 ≤ m ≤ N, or the m-th convergent of the infinite regular continued fraction

[a_{0}; a_{1}, \dots]

.

Convergents can be computed recursively based on the following theorem:

Theorem 1.

(Recursion Theorem)

Define:

$p_{0} = a_{0}$ ;
$p_{1} = a_{1} a_{0} + 1$ ;
$p_{n} = a_{n} p_{n - 1} + p_{n - 2}$ for n ≥ 2;

and define:

$q_{0} = 1$ ;
$q_{1} = a_{1}$ ;
$q_{n} = a_{n} q_{n - 1} + q_{n - 2}$ for n ≥ 2.

Then, for every convergent

[a_{0}; a_{1}, \dots, a_{n}]

, it is:

[a_{0}; a_{1}, \dots, a_{n}] = \frac{p_{n}}{q_{n}}

(6)

Proof (by induction).

Let n = 0, 1: Then,

[a_{0}] = a_{0} = \frac{p_{0}}{q_{0}}

and

[a_{0}; a_{1}] = a_{0} + \frac{1}{a_{1}} = \frac{a_{0} a_{1} + 1}{a_{1}} = \frac{p_{1}}{q_{1}}

.

Induction hypothesis:

[a_{0}; a_{1}, \dots, a_{n}] = \frac{p_{n}}{q_{n}} = \frac{a_{n} p_{n - 1} + p_{n - 2}}{a_{n} q_{n - 1} + q_{n - 2}}

.

Induction step n → n + 1: According to Note 1, it is

[a_{0}; a_{1}, \dots, a_{n}, a_{n + 1}] = [a_{0}; a_{1}, \dots, a_{n} + \frac{1}{a_{n + 1}}]

and the last continued fraction ends at index n, i.e., the induction hypothesis applies:

\begin{matrix} [a_{0}; a_{1}, \dots, a_{n} + \frac{1}{a_{n + 1}}] & = \frac{(a_{n} + \frac{1}{a_{n + 1}}) p_{n - 1} + p_{n - 2}}{(a_{n} + \frac{1}{a_{n + 1}}) q_{n - 1} + q_{n - 2}} = \frac{\frac{a_{n} a_{n + 1} + 1}{a_{n + 1}} p_{n - 1} + p_{n - 2}}{\frac{a_{n} a_{n + 1} + 1}{a_{n + 1}} q_{n - 1} + q_{n - 2}} \\ & = \frac{(a_{n} a_{n + 1} + 1) p_{n - 1} + a_{n + 1} p_{n - 2}}{(a_{n} a_{n + 1} + 1) q_{n - 1} + a_{n + 1} q_{n - 2}} \\ & = \frac{a_{n + 1} (a_{n} p_{n - 1} + p_{n - 2}) + p_{n - 1}}{a_{n + 1} (a_{n} q_{n - 1} + q_{n - 2}) + q_{n - 1}} \overset{(A)}{=} \frac{a_{n + 1} p_{n} + p_{n - 1}}{a_{n + 1} q_{n} + q_{n - 1}} \\ & \overset{(B)}{=} \frac{p_{n + 1}}{q_{n + 1}} \end{matrix}

Here, (A) is valid because of the induction hypothesis, and (B) is the definition of

p_{n + 1}

and

q_{n + 1}

. □
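The recursion of Theorem 1 translates directly into a loop. The following is a minimal sketch; the function name `convergents` and the example elements [4; 2, 6, 7] are assumptions made for illustration.

```python
def convergents(terms: list[int]) -> list[tuple[int, int]]:
    """Return [(p0, q0), ..., (pN, qN)] computed with the recursion theorem."""
    p = [terms[0]]  # p0 = a0
    q = [1]         # q0 = 1
    if len(terms) > 1:
        p.append(terms[1] * terms[0] + 1)  # p1 = a1 * a0 + 1
        q.append(terms[1])                 # q1 = a1
    for n in range(2, len(terms)):
        p.append(terms[n] * p[n - 1] + p[n - 2])  # pn = an * p(n-1) + p(n-2)
        q.append(terms[n] * q[n - 1] + q[n - 2])  # qn = an * q(n-1) + q(n-2)
    return list(zip(p, q))

print(convergents([4, 2, 6, 7]))  # [(4, 1), (9, 2), (58, 13), (415, 93)]
```

The last pair reproduces the value 415/93 of the whole continued fraction, in line with Equation (6).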

The recursion theorem implies the following often-used corollary.

Corollary 1.

Numerators and denominators of convergents of a continued fraction

[a_{0}; a_{1}, \dots, a_{N}]

with

a_{0} \geq 0

are strictly monotonically increasing:

p_{n} > p_{n - 1} and q_{n} > q_{n - 1} for all n \in ℕ .

Proof (by induction).

Let n = 1: By definition,

p_{0} = a_{0}

,

p_{1} = a_{1} a_{0} + 1

. Because

a_{i} \geq 1

for i ≥ 1, and

a_{0} \geq 0

, it is

p_{1} > p_{0} \geq 0

. Similarly,

q_{1} > q_{0} > 0

. Now,

p_{n} = a_{n} p_{n - 1} + p_{n - 2}

and

q_{n} = a_{n} q_{n - 1} + q_{n - 2}

for n ≥ 2. With

a_{n} \geq 1

by definition, and

p_{n - 1} > p_{n - 2}

(≥1) as well as

q_{n - 1} > q_{n - 2}

(≥1) by induction hypothesis, the claim follows. □

The next theorem is about the sign of a combination of the numerators and denominators of consecutive convergents of a continued fraction.

Theorem 2.

(Sign Theorem)

For

[a_{0}; a_{1}, \dots, a_{n}] = \frac{p_{n}}{q_{n}}

, the following holds:

p_{n} q_{n - 1} - p_{n - 1} q_{n} = {(- 1)}^{n - 1}

(7)

Proof (by induction).

For n = 1, it is

p_{1} q_{0} - p_{0} q_{1} = (a_{1} a_{0} + 1) \cdot 1 - a_{0} \cdot a_{1} = 1 = {(- 1)}^{0}

Induction step n → n + 1:

\begin{matrix} p_{n + 1} q_{n} - p_{n} q_{n + 1} & = (a_{n + 1} p_{n} + p_{n - 1}) q_{n} - p_{n} (a_{n + 1} q_{n} + q_{n - 1}) \\ & = a_{n + 1} p_{n} q_{n} + p_{n - 1} q_{n} - p_{n} a_{n + 1} q_{n} - p_{n} q_{n - 1} \\ & = p_{n - 1} q_{n} - p_{n} q_{n - 1} = - (p_{n} q_{n - 1} - p_{n - 1} q_{n}) \\ & \overset{(A)}{=} - {(- 1)}^{n - 1} = {(- 1)}^{n} \end{matrix}

(A) uses the induction hypothesis. □

If the numerators and denominators stem from the n-th convergent and the (n − 2)-th convergent, the last element a_n of the n-th convergent becomes part of the equation.

Theorem 3.

(Second Sign Theorem)

For

[a_{0}; a_{1}, \dots, a_{n}] = \frac{p_{n}}{q_{n}}

, the following holds:

p_{n} q_{n - 2} - p_{n - 2} q_{n} = {(- 1)}^{n} a_{n}

(8)

Proof.

It is

p_{n} = a_{n} p_{n - 1} + p_{n - 2}

and

q_{n} = a_{n} q_{n - 1} + q_{n - 2}

.

Multiplying the first equation by

q_{n - 2}

and the second equation by

p_{n - 2}

results in

q_{n - 2} p_{n} = q_{n - 2} a_{n} p_{n - 1} + q_{n - 2} p_{n - 2}

and

p_{n - 2} q_{n} = p_{n - 2} a_{n} q_{n - 1} + p_{n - 2} q_{n - 2}

. Next, both equations are subtracted:

\begin{matrix} p_{n} q_{n - 2} - p_{n - 2} q_{n} & = q_{n - 2} a_{n} p_{n - 1} + q_{n - 2} p_{n - 2} - p_{n - 2} a_{n} q_{n - 1} - p_{n - 2} q_{n - 2} \\ & = q_{n - 2} a_{n} p_{n - 1} - p_{n - 2} a_{n} q_{n - 1} \\ & = a_{n} (p_{n - 1} q_{n - 2} - p_{n - 2} q_{n - 1}) \\ & \overset{(A)}{=} {(- 1)}^{n} a_{n} \end{matrix}

where (A) is implied by the sign theorem (Theorem 2) and considering

{(- 1)}^{n - 2} = {(- 1)}^{n}

. □

The sign theorem immediately yields the following important corollary.

Corollary 2.

Numerator and denominator of a convergent are co-prime.

Proof.

Let t be a divisor of

p_{n}

and

q_{n}

, i.e.,

t | p_{n}

and

t | q_{n}

. Then,

t | (p_{n} q_{n - 1} - p_{n - 1} q_{n})

, but

(p_{n} q_{n - 1} - p_{n - 1} q_{n}) = {(- 1)}^{n - 1}

according to the sign theorem. Thus, t = ±1. □
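Both sign theorems and the co-primality of numerator and denominator can be checked numerically. The sketch below is only a sanity check on one example, not a proof; the helper names and the example elements (the beginning of the well-known expansion of π) are assumptions for illustration.

```python
from math import gcd

def convergents(terms):
    """Lists p, q with p[n]/q[n] = [a0; ..., an], via the recursion theorem (Theorem 1)."""
    p, q = [terms[0]], [1]
    for n, a in enumerate(terms[1:], start=1):
        p.append(a * p[n - 1] + (p[n - 2] if n >= 2 else 1))
        q.append(a * q[n - 1] + (q[n - 2] if n >= 2 else 0))
    return p, q

def sign_theorems_hold(terms) -> bool:
    p, q = convergents(terms)
    last = len(terms)
    first = all(p[n] * q[n - 1] - p[n - 1] * q[n] == (-1) ** (n - 1)
                for n in range(1, last))                      # Equation (7)
    second = all(p[n] * q[n - 2] - p[n - 2] * q[n] == (-1) ** n * terms[n]
                 for n in range(2, last))                     # Equation (8)
    coprime = all(gcd(pn, qn) == 1 for pn, qn in zip(p, q))   # Corollary 2
    return first and second and coprime

print(sign_theorems_hold([3, 7, 15, 1, 292, 1, 1, 1, 2]))  # True
```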

Convergents can be represented as a sum of fractions with alternating sign and whose denominators consist of products of two consecutive denominators from the recursion theorem.

Theorem 4.

(Representation as a Sum)

Each convergent can be represented as a sum:

[a_{0}; a_{1}, \dots, a_{n}] = a_{0} + \frac{1}{q_{1} q_{0}} - \frac{1}{q_{2} q_{1}} + \dots + {(- 1)}^{n - 1} \frac{1}{q_{n} q_{n - 1}}

(9)

Proof.

Let

[a_{0}; a_{1}, \dots, a_{n}] = \frac{p_{n}}{q_{n}}

. Since

- \frac{p_{i}}{q_{i}} + \frac{p_{i}}{q_{i}} = 0

, we can write

[a_{0}; a_{1}, \dots, a_{n}] = \frac{p_{n}}{q_{n}} - \frac{p_{n - 1}}{q_{n - 1}} + \frac{p_{n - 1}}{q_{n - 1}} - \frac{p_{n - 2}}{q_{n - 2}} + \frac{p_{n - 2}}{q_{n - 2}} - \dots + \frac{p_{1}}{q_{1}} - \frac{p_{0}}{q_{0}} + \frac{p_{0}}{q_{0}}

Computing the differences results in

\begin{matrix} [a_{0}; a_{1}, \dots, a_{n}] & = \frac{p_{n} q_{n - 1} - q_{n} p_{n - 1}}{q_{n} q_{n - 1}} + \frac{p_{n - 1} q_{n - 2} - q_{n - 1} p_{n - 2}}{q_{n - 1} q_{n - 2}} + \dots + \frac{p_{1} q_{0} - q_{1} p_{0}}{q_{1} q_{0}} + \frac{p_{0}}{q_{0}} \\ & \overset{(A)}{=} \frac{{(- 1)}^{n - 1}}{q_{n} q_{n - 1}} + \frac{{(- 1)}^{n - 2}}{q_{n - 1} q_{n - 2}} + \dots + \frac{{(- 1)}^{0}}{q_{1} q_{0}} + a_{0} \end{matrix}

where the sign theorem is applied in (A) and the last term

a_{0} = p_{0} / q_{0}

follows from the definitions of p_0 and q_0 in the recursion theorem. □
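Equation (9) can be checked on a small example with exact rational arithmetic. The following is a minimal sketch; the names `convergents` and `alternating_sum` and the example elements are assumptions for illustration.

```python
from fractions import Fraction

def convergents(terms):
    """Lists p, q with p[n]/q[n] = [a0; ..., an], via the recursion theorem (Theorem 1)."""
    p, q = [terms[0]], [1]
    for n, a in enumerate(terms[1:], start=1):
        p.append(a * p[n - 1] + (p[n - 2] if n >= 2 else 1))
        q.append(a * q[n - 1] + (q[n - 2] if n >= 2 else 0))
    return p, q

def alternating_sum(terms, n: int) -> Fraction:
    """Right-hand side of Equation (9) for the n-th convergent."""
    _, q = convergents(terms)
    total = Fraction(terms[0])
    for k in range(1, n + 1):
        total += Fraction((-1) ** (k - 1), q[k] * q[k - 1])
    return total

terms = [4, 2, 6, 7]
p, q = convergents(terms)
for n in range(len(terms)):
    print(alternating_sum(terms, n), Fraction(p[n], q[n]))  # both columns agree
```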

The next theorem is key for many estimations in the domain of continued fractions.

Theorem 5.

(Monotony Theorem)

Let

x_{n} \overset{\text{def}}{=} \frac{p_{n}}{q_{n}} = [a_{0}; a_{1}, \dots, a_{n}]

denote the n-th convergent. Then:

x_{2 n} < x_{2 n + 2}

and

x_{2 n + 1} > x_{2 n + 3}

I.e., even convergents are strictly monotonically increasing, and odd convergents are strictly monotonically decreasing.

Proof.

We compute the following difference, where (A) again uses

- \frac{p_{i}}{q_{i}} + \frac{p_{i}}{q_{i}} = 0

:

\begin{matrix} x_{n} - x_{n - 2} & = \frac{p_{n}}{q_{n}} - \frac{p_{n - 2}}{q_{n - 2}} \overset{(A)}{=} \frac{p_{n}}{q_{n}} - \frac{p_{n - 1}}{q_{n - 1}} + \frac{p_{n - 1}}{q_{n - 1}} - \frac{p_{n - 2}}{q_{n - 2}} \\ & = \frac{p_{n} q_{n - 1} - q_{n} p_{n - 1}}{q_{n} q_{n - 1}} + \frac{p_{n - 1} q_{n - 2} - q_{n - 1} p_{n - 2}}{q_{n - 1} q_{n - 2}} \\ & \overset{(B)}{=} \frac{{(- 1)}^{n - 1}}{q_{n} q_{n - 1}} + \frac{{(- 1)}^{n - 2}}{q_{n - 1} q_{n - 2}} = \frac{{(- 1)}^{n - 1} q_{n - 2} + {(- 1)}^{n - 2} q_{n}}{q_{n} q_{n - 1} q_{n - 2}} \\ & = \frac{{(- 1)}^{n - 2} q_{n} - {(- 1)}^{n - 2} q_{n - 2}}{q_{n} q_{n - 1} q_{n - 2}} = \frac{{(- 1)}^{n - 2} (q_{n} - q_{n - 2})}{q_{n} q_{n - 1} q_{n - 2}} \\ & = \frac{{(- 1)}^{n} (q_{n} - q_{n - 2})}{q_{n} q_{n - 1} q_{n - 2}} \overset{(C)}{=} \frac{{(- 1)}^{n} a_{n} q_{n - 1}}{q_{n} q_{n - 1} q_{n - 2}} = \frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}} \end{matrix}

(B) is because of the sign theorem, and (C) follows from

q_{n} = a_{n} q_{n - 1} + q_{n - 2}

, i.e., the recursion theorem.

Now, because of

a_{n}, q_{n}, q_{n - 2} > 0

, it is

\frac{a_{n}}{q_{n} q_{n - 2}} > 0

. Thus,

\frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}} > 0

for n even and

\frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}} < 0

for n odd. This implies

x_{n} = \frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}} + x_{n - 2} > x_{n - 2}

for n even as well as

x_{n} = \frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}} + x_{n - 2} < x_{n - 2}

for n odd. □

While even convergents are increasing and odd convergents are decreasing, all even convergents are smaller than all odd convergents. This is the content of the next very important theorem.

Theorem 6.

(Convergents Comparison Theorem)

For

0 \leq 2 n, 2 m + 1 \leq N

, it is

x_{2 n} < x_{2 m + 1}

Proof.

As before, using the sign theorem in (A), we obtain

x_{n} - x_{n - 1} = \frac{p_{n}}{q_{n}} - \frac{p_{n - 1}}{q_{n - 1}} = \frac{p_{n} q_{n - 1} - q_{n} p_{n - 1}}{q_{n} q_{n - 1}} \overset{(A)}{=} \frac{{(- 1)}^{n - 1}}{q_{n} q_{n - 1}} = \frac{{(- 1)}^{n - 1}}{β_{n}}

with

β_{n} : = q_{n} q_{n - 1}

. Because

q_{n}, q_{n - 1} > 0

, it is β_n > 0, i.e., the sign of

\frac{{(- 1)}^{n - 1}}{β_{n}}

is in fact

{(- 1)}^{n - 1}

.

Thus,

x_{2 n + 1} - x_{2 n} = \frac{{(- 1)}^{2 n}}{β_{2 n + 1}} > 0

, and we get

x_{2 n + 1} = \frac{{(- 1)}^{2 n}}{β_{2 n + 1}} + x_{2 n} > x_{2 n}

. This shows that an even convergent

x_{2 n}

is strictly smaller than its immediate succeeding odd convergent

x_{2 n + 1}

.

But what about an arbitrary odd convergent

x_{2 m + 1}

? For n < m, the monotony theorem (Theorem 5) yields

x_{2 n} < x_{2 m}

and we showed before that

x_{2 m} < x_{2 m + 1}

; thus,

x_{2 n} < x_{2 m + 1}

.

For n > m, the monotony theorem yields

x_{2 m + 1} > x_{2 n + 1}

and with

x_{2 n + 1} > x_{2 n}

we see

x_{2 n} < x_{2 m + 1}

. □
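The ordering stated by the monotony theorem and the convergents comparison theorem can be observed on a concrete example. The following minimal sketch checks the three properties with exact fractions; the helper names and the example elements are assumptions for illustration.

```python
from fractions import Fraction

def convergents(terms):
    """Lists p, q with p[n]/q[n] = [a0; ..., an], via the recursion theorem (Theorem 1)."""
    p, q = [terms[0]], [1]
    for n, a in enumerate(terms[1:], start=1):
        p.append(a * p[n - 1] + (p[n - 2] if n >= 2 else 1))
        q.append(a * q[n - 1] + (q[n - 2] if n >= 2 else 0))
    return p, q

def ordering_holds(terms) -> bool:
    p, q = convergents(terms)
    x = [Fraction(pn, qn) for pn, qn in zip(p, q)]
    even, odd = x[0::2], x[1::2]
    even_increasing = all(a < b for a, b in zip(even, even[1:]))  # Theorem 5
    odd_decreasing = all(a > b for a, b in zip(odd, odd[1:]))     # Theorem 5
    even_below_odd = all(e < o for e in even for o in odd)        # Theorem 6
    return even_increasing and odd_decreasing and even_below_odd

print(ordering_holds([3, 7, 15, 1, 292, 1, 1, 1, 2]))  # True
```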

The following often-used corollary expresses the difference of two immediately succeeding convergents by means of the denominators of the convergents, while the difference of the n-th convergent and the (n − 2)-th convergent adds the n-th element of the n-th convergent as a factor.

Corollary 3.

\frac{p_{n}}{q_{n}} - \frac{p_{n - 1}}{q_{n - 1}} = \frac{{(- 1)}^{n - 1}}{q_{n} q_{n - 1}}

(10)

and

\frac{p_{n}}{q_{n}} - \frac{p_{n - 2}}{q_{n - 2}} = \frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}}

(11)

Proof.

Equation (10) is the first equation from the proof of Theorem 6. The second equation follows because of

\frac{p_{n}}{q_{n}} - \frac{p_{n - 2}}{q_{n - 2}} = \frac{p_{n} q_{n - 2} - p_{n - 2} q_{n}}{q_{n} q_{n - 2}} \overset{(A)}{=} \frac{{(- 1)}^{n} a_{n}}{q_{n} q_{n - 2}}

where (A) is because of the second sign theorem (Theorem 3). □

We already saw that the even convergents are strictly monotonically increasing, that the odd convergents are strictly monotonically decreasing, and that each even convergent is less than all odd convergents. According to the next theorem, the value of a continued fraction lies between the even convergents and the odd convergents, i.e., this value is larger than all even convergents and smaller than all odd convergents. The situation is depicted in Figure 4.

Figure 4. Nesting of the value of a continued fraction by its convergents.

Note that the notion of the value of a continued fraction is defined for finite continued fractions. In Section 2.4, this notion will also be defined for regular infinite continued fractions.

Theorem 7.

(Nesting Theorem)

Let x be the value of the continued fraction

[a_{0}; a_{1}, \dots, a_{N}]

and let

x_{k}

be its convergents. Then:

\forall m, n < N : x_{2 m} < x < x_{2 n + 1}

(12)

Proof.

The value of x is the convergent with the highest index N, i.e.,

x = x_{N} = [a_{0}; a_{1}, \dots, a_{N}]

.

Let N = 2k be even. Since even convergents are strictly monotonically increasing, we know that

\forall 2 m < N : x_{2 m} < x_{2 k} = x_{N} = x

, and according to the convergent comparison theorem (Theorem 6), we know

\forall 2 n + 1 : x = x_{N} = x_{2 k} < x_{2 n + 1}

.

Let N = 2k + 1 be odd. Since odd convergents are strictly monotonically decreasing, we know that

\forall 2 n + 1 < N : x_{2 n + 1} > x_{2 k + 1} = x_{N} = x

, and according to the convergent comparison theorem (Theorem 6), we know

\forall 2 m : x = x_{N} = x_{2 k + 1} > x_{2 m}

. □

Because the value of a continued fraction is nested within its even convergents and odd convergents, the distance of this value from any of its convergents can be estimated by the distance of two consecutive convergents:

Theorem 8.

(Distance Theorem)

Let

x = [a_{0}; a_{1}, \dots, a_{N}]

and let

x_{k}

be its convergents. Then:

\forall n : |x - x_{n}| < |x_{n - 1} - x_{n}|

(13)

and

\forall n : |x - x_{n}| < |x_{n + 1} - x_{n}|

(14)

Proof.

Let n be even. Then,

x_{n} < x < x_{n - 1}

, i.e.,

x - x_{n} < x_{n - 1} - x_{n}

. Additionally, it is

x - x_{n} > 0

and

x_{n - 1} - x_{n} > 0

. Thus,

|x - x_{n}| < |x_{n - 1} - x_{n}|

for n even.

Now, let n be odd. It is

x_{n - 1} < x < x_{n}

, which implies

x - x_{n} > x_{n - 1} - x_{n} \Leftrightarrow - (x_{n} - x) > - (x_{n} - x_{n - 1})

\Leftrightarrow x_{n} - x < x_{n} - x_{n - 1}

. Because of

x_{n} - x > 0

and

x_{n} - x_{n - 1} > 0

, it is

|x_{n} - x| < |x_{n} - x_{n - 1}|

⇔

|x - x_{n}| < |x_{n - 1} - x_{n}|

for n odd.

Together, this proves Equation (13). Equation (14) is proven similarly. □

Figure 5 shows the corresponding geometric situation for an even n.

Figure 5. The distance between two succeeding convergents is greater than the distance of a convergent and the value of its continued fraction.
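The nesting theorem and the distance theorem can be checked on a finite example. The sketch below is a sanity check under the following assumption: Equation (14) is tested only for n ≤ N − 2, because for n = N − 1 the successor convergent already equals x and the strict inequality becomes an equality. Helper names and example elements are chosen for illustration.

```python
from fractions import Fraction

def convergents(terms):
    """Lists p, q with p[n]/q[n] = [a0; ..., an], via the recursion theorem (Theorem 1)."""
    p, q = [terms[0]], [1]
    for n, a in enumerate(terms[1:], start=1):
        p.append(a * p[n - 1] + (p[n - 2] if n >= 2 else 1))
        q.append(a * q[n - 1] + (q[n - 2] if n >= 2 else 0))
    return p, q

def nesting_and_distance_hold(terms) -> bool:
    p, q = convergents(terms)
    x_k = [Fraction(pn, qn) for pn, qn in zip(p, q)]
    N = len(terms) - 1
    x = x_k[N]  # value of the finite continued fraction
    nested = all(x_k[k] < x if k % 2 == 0 else x < x_k[k]
                 for k in range(N))                                   # Equation (12)
    eq13 = all(abs(x - x_k[n]) < abs(x_k[n - 1] - x_k[n])
               for n in range(1, N + 1))                              # Equation (13)
    eq14 = all(abs(x - x_k[n]) < abs(x_k[n + 1] - x_k[n])
               for n in range(N - 1))                                 # Equation (14), n <= N - 2
    return nested and eq13 and eq14

print(nesting_and_distance_hold([4, 2, 6, 7]))  # True
```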

Similarly, the difference between any two arbitrary convergents can be estimated by the difference of the convergent with the smaller index and its immediate predecessor:

Theorem 9.

(Difference Theorem)

Let

x = [a_{0}; a_{1}, \dots, a_{N}]

and let

x_{k}

be its convergents. Then:

\forall m > n : |x_{m} - x_{n}| < |x_{n - 1} - x_{n}|

(15)

Proof.

Let n be even, e.g., n = 2k.

Let m = 2t be even. By Theorem 6, even convergents are smaller than all odd convergents, i.e.,

x_{2 t} < x_{2 k - 1}

for any

t \in ℕ

. Thus,

x_{m} - x_{n} = x_{2 t} - x_{2 k} < x_{2 k - 1} - x_{2 k} = x_{n - 1} - x_{n}

.

Let m = 2t − 1 be odd. By the monotony theorem (Theorem 5), odd convergents are strictly monotonically decreasing, i.e.,

x_{2 t - 1} < x_{2 k - 1}

for each t > k. Thus,

x_{m} - x_{n} = x_{2 t - 1} - x_{2 k} < x_{2 k - 1} - x_{2 k} = x_{n - 1} - x_{n}

.

For n odd, the proof is analogous. □

The geometry of the last theorem is depicted in Figure 6.

Figure 6. The distance between any two convergents is smaller than the distance between the convergent with the smaller index and its immediate predecessor.

In several calculations, the size of the denominator of a convergent must be estimated:

Lemma 1.

(Size of Denominators)

For the denominator

q_{n}

of a convergent

\frac{p_{n}}{q_{n}} = [a_{0}; a_{1}, \dots, a_{n}]

, the following holds:

\forall n : q_{n} \geq n

(16)

and

\forall n > 3 : q_{n} > n

(17)

Proof.

By definition,

q_{0} = 1 > 0

, and

q_{1} = a_{1} \geq 1

because

a_{i} \in ℕ

, and finally,

q_{2} \overset{(A)}{=} a_{2} q_{1} + q_{0} \overset{(B)}{=} a_{2} q_{1} + 1 \overset{(C)}{\geq} q_{1} + 1 \overset{(D)}{\geq} 2

(A) holds because of the recursion theorem (Theorem 1), (B) is by definition of

q_{0}

, (C) is because

a_{2} \in ℕ

, and (D) has been seen just before (i.e.,

q_{1} \geq 1

). This proves the lemma for

n \leq 2

.

The proof for n ≥ 3 is by induction. It is

q_{n} \overset{(A)}{=} a_{n} q_{n - 1} + q_{n - 2} \overset{(B)}{\geq} q_{n - 1} + q_{n - 2} \overset{(C)}{\geq} q_{n - 1} + (n - 2) \overset{(D)}{\geq} q_{n - 1} + 1 \overset{(E)}{\geq} n

where (A) is the recursion theorem, (B) is because of

a_{n} \in ℕ

, (C) is by induction hypothesis applied to

q_{n - 2}

, (D) is because n ≥ 3, and (E) is by induction hypothesis applied to

q_{n - 1}

. This proves Equation (16).

Equation (17) is proven by induction again. Let n > 3. The argumentation is exactly as before, with the exception of (D):

q_{n} \overset{(A)}{=} a_{n} q_{n - 1} + q_{n - 2} \overset{(B)}{\geq} q_{n - 1} + q_{n - 2} \overset{(C)}{\geq} q_{n - 1} + (n - 2) \overset{(D)}{>} q_{n - 1} + 1 \overset{(E)}{\geq} n

(D) holds because n > 3, i.e.,

n - 2 > 1

. □

In fact, denominators of convergents grow much faster than the inequality

q_{n} > n

may indicate:

Lemma 2.

(Geometric Growth of Denominators)

Let

q_{n}

(

n \geq 2

) be the denominator of the convergent

\frac{p_{n}}{q_{n}} = [a_{0}; a_{1}, \dots, a_{n}]

. Then:

q_{n} \geq 2^{\frac{n - 1}{2}}

(18)

Proof.

It is

q_{k} = a_{k} q_{k - 1} + q_{k - 2} \geq q_{k - 1} + q_{k - 2} \overset{(A)}{>} 2 q_{k - 2}

, where the first estimate uses a_k ≥ 1, and (A) holds because, according to Corollary 1, denominators are strictly monotonically increasing, i.e.,

q_{k - 1} > q_{k - 2}

.

By induction, it is

q_{2 k} \geq 2^{k} q_{0}

, and then

2^{k} q_{0} \overset{(A)}{=} 2^{k} \overset{(B)}{\geq} 2^{\frac{(2 k) - 1}{2}}

with (A) because

q_{0} = 1

, and (B) follows from

2^{k} = 2^{\frac{2 k}{2}} \geq \frac{1}{\sqrt{2}} 2^{\frac{2 k}{2}} = 2^{\frac{2 k}{2} - \frac{1}{2}} = 2^{\frac{2 k - 1}{2}}

Similarly, by induction, it is

q_{2 k + 1} \geq 2^{k} q_{1}

and then

2^{k} q_{1} \overset{(A)}{\geq} 2^{k} = 2^{\frac{(2 k + 1) - 1}{2}}

with (A) because of

q_{1} \in ℕ

.

With

n = 2 k

and

n = 2 k + 1

, respectively, Equation (18) is implied. □
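The bounds of Lemma 1 and Lemma 2 can be compared against the slowest-growing case, in which all elements equal 1 and the denominators are Fibonacci numbers. The following minimal sketch checks Equations (16) to (18); to stay within integer arithmetic, Equation (18) is tested in the squared form q_n² ≥ 2^{n−1}. Helper names are assumptions for illustration.

```python
def convergents(terms):
    """Lists p, q with p[n]/q[n] = [a0; ..., an], via the recursion theorem (Theorem 1)."""
    p, q = [terms[0]], [1]
    for n, a in enumerate(terms[1:], start=1):
        p.append(a * p[n - 1] + (p[n - 2] if n >= 2 else 1))
        q.append(a * q[n - 1] + (q[n - 2] if n >= 2 else 0))
    return p, q

def growth_bounds_hold(terms) -> bool:
    _, q = convergents(terms)
    linear = all(q[n] >= n for n in range(len(q)))                        # Equation (16)
    strict = all(q[n] > n for n in range(4, len(q)))                      # Equation (17)
    geometric = all(q[n] ** 2 >= 2 ** (n - 1) for n in range(2, len(q)))  # Equation (18), squared
    return linear and strict and geometric

_, q = convergents([1] * 12)
print(q)                             # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
print(growth_bounds_hold([1] * 12))  # True
```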

2.3. Convergence of Infinite Regular Continued Fractions

In Section 2.1, we presented an algorithm to compute the continued fraction representation of a rational number. Next, we show how to compute such a representation for a non-rational number (Algorithm 1).

Algorithm 1 Continued Fraction Representation of Non-Rational Number

1.

Let α \in ℝ ∖ ℚ

. Define:

$α_{0} := α$ and $b_{0} := ⌊ α_{0} ⌋$ ;
$α_{i} := \frac{1}{α_{i - 1} - b_{i - 1}}$ and $b_{i} := ⌊ α_{i} ⌋$ for i ≥ 1.

2.

Then,

[b_{0}; b_{1}, b_{2}, \dots]

is the continued fraction representation of

α . Each α_{i}

is called the i-th complete quotient of α.
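Algorithm 1 can be tried out numerically. Since a non-rational number cannot be stored exactly, the following minimal sketch works with a 50-digit decimal approximation and is therefore only reliable for the first few elements; the function name `regular_cf_terms`, the chosen precision, and the number of elements are assumptions for illustration.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR

def regular_cf_terms(alpha: Decimal, count: int) -> list[int]:
    """First `count` elements b_0, b_1, ... produced by Algorithm 1 for alpha."""
    terms = []
    for _ in range(count):
        b = alpha.to_integral_value(rounding=ROUND_FLOOR)  # b_i = floor(alpha_i)
        terms.append(int(b))
        alpha = 1 / (alpha - b)  # alpha_{i+1}; fails if alpha_i is an integer
    return terms

getcontext().prec = 50  # working precision; rounding errors grow with every step
print(regular_cf_terms(Decimal(2).sqrt(), 8))  # [1, 2, 2, 2, 2, 2, 2, 2]
```

For a rational input that is exactly representable, the difference α_i − b_i eventually becomes zero and the division raises an error, which mirrors the fact that the algorithm is intended for non-rational numbers only.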

The above algorithm does not terminate, i.e., the continued fraction representation of a non-rational number is infinite. This is the content of the following note:

Note 2.

In Algorithm 1, it is

α_{i} \notin ℤ

.

Proof (by induction).

n = 0: Then, by definition,

α_{0} = α \notin ℤ

.

Induction hypothesis:

α_{n} \notin ℤ

.

n → n + 1: Assume

α_{n} - b_{n} \in ℤ

⇒

(α_{n} - b_{n}) = k \in ℤ

⇒

α_{n} = k + b_{n} \in ℤ

, which is a contradiction to the hypothesis! Thus,

α_{n} - b_{n} \notin ℤ

⇒

α_{n + 1} : = \frac{1}{α_{n} - b_{n}} \notin ℤ

. □

Figure 7 gives the computation of the continued fraction representation of

\sqrt{2}

:

Figure 7. Computing the continued fraction of

\sqrt{2}

.

2.4. Bounds Expressed by Denominators of Convergents

In the following, we give upper bounds and lower bounds of the approximations of a number by the convergents of its continued fraction representation by means of the denominators of the convergents.

First, we start with estimations of upper bounds:

Lemma 3.

(Upper Bounds)

Let

p_{n} / q_{n}

be a convergent of the continued fraction representation of x. Then:

|x - \frac{p_{n}}{q_{n}}| < \frac{1}{q_{n} q_{n + 1}} < \frac{1}{q_{n}^{2}} \leq \frac{1}{n^{2}}

(19)

Proof.

With

x_{n} = p_{n} / q_{n}

, it is

|x - x_{n}| < |x_{n + 1} - x_{n}|

(see Theorem 8, Equation (14)). According to Corollary 3 (Equation (10)), it is

x_{n + 1} - x_{n} = \frac{p_{n + 1}}{q_{n + 1}} - \frac{p_{n}}{q_{n}} = \frac{{(- 1)}^{n}}{q_{n} q_{n + 1}}

Thus,

|x - x_{n}| < |x_{n + 1} - x_{n}| = |\frac{{(- 1)}^{n}}{q_{n} q_{n + 1}}| = \frac{1}{q_{n} q_{n + 1}} \overset{(A)}{<} \frac{1}{q_{n}^{2}} \overset{(B)}{\leq} \frac{1}{n^{2}}

where (A) holds because of

q_{n + 1} > q_{n}

(Corollary 1), and (B) is true because of

q_{n} \geq n

(Lemma 1). □

An immediate consequence of this lemma is the convergence of the sequence of the convergents of a continued fraction to the value of the continued fraction. This, by the way, is the origin of the name “convergents”.

Corollary 4.

The sequence

(p_{n} / q_{n})

of the convergents of the continued fraction representation of

x \in ℝ ∖ ℚ

converges to x:

\lim_{n \to \infty} \frac{p_{n}}{q_{n}} = x

Proof.

The claim follows immediately from

|x - \frac{p_{n}}{q_{n}}| < \frac{1}{n^{2}}

. □
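The chain of upper bounds in Equation (19) can be observed on a finite example. The sketch below makes two assumptions: it restricts n to 1 ≤ n ≤ N − 2 (the bound 1/n² needs n ≥ 1, and for n = N − 1 the first inequality becomes an equality because the next convergent already equals x), and it uses example elements chosen for illustration.

```python
from fractions import Fraction

def convergents(terms):
    """Lists p, q with p[n]/q[n] = [a0; ..., an], via the recursion theorem (Theorem 1)."""
    p, q = [terms[0]], [1]
    for n, a in enumerate(terms[1:], start=1):
        p.append(a * p[n - 1] + (p[n - 2] if n >= 2 else 1))
        q.append(a * q[n - 1] + (q[n - 2] if n >= 2 else 0))
    return p, q

def upper_bounds_hold(terms) -> bool:
    p, q = convergents(terms)
    N = len(terms) - 1
    x = Fraction(p[N], q[N])
    for n in range(1, N - 1):  # 1 <= n <= N - 2
        error = abs(x - Fraction(p[n], q[n]))
        if not (error < Fraction(1, q[n] * q[n + 1])
                      < Fraction(1, q[n] ** 2)
                      <= Fraction(1, n ** 2)):  # Equation (19)
            return False
    return True

print(upper_bounds_hold([3, 7, 15, 1, 292, 1, 1, 1, 2]))  # True
```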

Often, two fractions are compared by means of their mediant (“mediant” means “somewhere in between”).

Definition 3.

For

a / b, c / d \in ℚ

and b, d > 0, the term

\frac{a + c}{b + d}

is called the mediant of the two fractions.

The following simple inequality is often used.

Note 3.

(Mediant Property)

Let

a / b, c / d \in ℚ

and b, d > 0 and

\frac{a}{b} < \frac{c}{d}

.

Then:

\frac{a}{b} < \frac{a + c}{b + d} < \frac{c}{d}

(20)

Proof.

It is

\frac{a}{b} < \frac{c}{d} \Rightarrow a d < b c \Rightarrow b c - a d > 0

and

b, d > 0 \Rightarrow b (b + d) > 0

. This implies

\frac{a + c}{b + d} - \frac{a}{b} = \frac{b (a + c) - a (b + d)}{b (b + d)} = \frac{b c - a d}{b (b + d)} > 0

and thus

\frac{a}{b} < \frac{a + c}{b + d}

. The inequality

\frac{a + c}{b + d} < \frac{c}{d}

follows similarly. □
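The mediant property is easy to try out. The following minimal sketch uses Python's `Fraction`, which always stores a fraction in lowest terms with a positive denominator; the function name `mediant` and the example values are assumptions for illustration.

```python
from fractions import Fraction

def mediant(x: Fraction, y: Fraction) -> Fraction:
    """Mediant (a + c) / (b + d) of x = a/b and y = c/d, both taken in lowest terms."""
    return Fraction(x.numerator + y.numerator, x.denominator + y.denominator)

low, high = Fraction(1, 2), Fraction(2, 3)
middle = mediant(low, high)
print(middle)                # 3/5
print(low < middle < high)   # True
```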

Mediants of convergents that are weighted in a certain way are another important concept for computing bounds:

Definition 4.

The term

x_{n, t} = \frac{t p_{n + 1} + p_{n}}{t q_{n + 1} + q_{n}}

with

1 \leq t \leq a_{n + 2}

is called the (n,t)-th semiconvergent.

Semiconvergents of an even n are strictly monotonically increasing, and semiconvergents of an odd n are strictly monotonically decreasing. This is the content of the following lemma.

Lemma 4.

(Monotony of Semiconvergents)

Let n be even. Then,

x_{n, t} < x_{n, t + 1}

.

Let n be odd. Then,

x_{n, t} > x_{n, t + 1}

.

Proof.

A simple calculation and the use of the sign theorem (Theorem 2) results in

\begin{matrix} x_{n, t + 1} - x_{n, t} & = \frac{(t + 1) p_{n + 1} + p_{n}}{(t + 1) q_{n + 1} + q_{n}} - \frac{t p_{n + 1} + p_{n}}{t q_{n + 1} + q_{n}} \\ & = \frac{{(- 1)}^{n}}{((t + 1) q_{n + 1} + q_{n}) (t q_{n + 1} + q_{n})} \end{matrix}

The denominator of the last fraction is always positive. Thus, the last term is positive iff n is even (i.e.,

x_{n, t + 1} - x_{n, t} > 0

), and it is negative iff n is odd (i.e.,

x_{n, t + 1} - x_{n, t} < 0

). □
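The monotony stated in Lemma 4 can be observed on a small example. The sketch below is illustrative only: the function names `convergents` and `semiconvergents` and the sample continued fraction [0; 2, 3, 4, 5] are assumptions chosen for this example. It builds the convergents with the recursion theorem and then lists the semiconvergents for one even and one odd n.

```python
from fractions import Fraction

def convergents(cf):
    """Return [(p_0, q_0), (p_1, q_1), ...] for cf = [a_0, a_1, ...]."""
    p_prev, q_prev = 1, 0              # starting values p_{-1} = 1, q_{-1} = 0
    p, q = cf[0], 1                    # p_0 = a_0, q_0 = 1
    result = [(p, q)]
    for a in cf[1:]:
        p, p_prev = a * p + p_prev, p  # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = a * q + q_prev, q  # q_n = a_n q_{n-1} + q_{n-2}
        result.append((p, q))
    return result

def semiconvergents(cf, n):
    """Return [x_{n,1}, ..., x_{n,a_{n+2}}] as Fractions (for n >= 0)."""
    conv = convergents(cf)
    (p_n, q_n), (p_n1, q_n1) = conv[n], conv[n + 1]
    return [Fraction(t * p_n1 + p_n, t * q_n1 + q_n)
            for t in range(1, cf[n + 2] + 1)]

cf = [0, 2, 3, 4, 5]             # sample continued fraction (assumption)
even = semiconvergents(cf, 0)    # 1/3, 2/5, 3/7: strictly increasing
odd = semiconvergents(cf, 1)     # 4/9, 7/16, 10/23, 13/30: strictly decreasing
print(even, odd)
```

In both lists, the last entry (t = a_{n+2}) coincides with the convergent x_{n+2}, as the recursion theorem predicts.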

In order to simplify proofs in what follows, the following conventions are used:

p_{- 1} \overset{\text{def}}{=} 1 \text{ and } q_{- 1} \overset{\text{def}}{=} 0

(21)

With this,

x_{- 1, 1} = \frac{p_{0} + p_{- 1}}{q_{0} + q_{- 1}} = \frac{a_{0} + 1}{1 + 0} = a_{0} + 1

becomes a semiconvergent. Now,

x_{1} = \frac{p_{1}}{q_{1}} \overset{(A)}{=} \frac{a_{1} a_{0} + 1}{a_{1}} = a_{0} + \frac{1}{a_{1}} \overset{(B)}{\leq} a_{0} + 1 = x_{- 1, 1}

where (A) is the recursion theorem and (B) follows because

a_{1} \geq 1

; thus,

x_{1} \leq x_{- 1, 1}

.

Furthermore, it is

x_{- 1, t} = \frac{t p_{0} + p_{- 1}}{t q_{0} + q_{- 1}} = \frac{t a_{0} + 1}{t \cdot 1 + 0} = \frac{t a_{0} + 1}{t} = a_{0} + \frac{1}{t}

for

1 \leq t \leq a_{1}

.

Putting things together, it is

x_{- 1, 1} = a_{0} + 1 > a_{0} + \frac{1}{2} > \dots > a_{0} + \frac{1}{a_{1}} = x_{1}

(22)

Based on this, we can refine Figure 4, which depicts the nesting and ordering of convergents by including semiconvergents: Between two succeeding convergents (e.g.,

x_{n}

and

x_{n + 2}

in Figure 8), the corresponding semiconvergents ordered according to Lemma 4 are nested (in increasing order as shown for an even n in Figure 8). Furthermore, beyond

x_{1} = a_{0} + \frac{1}{a_{1}}

, the semiconvergents

x_{- 1, t}

are added.

Figure 8. Nesting of convergents and semiconvergents (n even).

Now, we are prepared to prove a lower bound of the approximation of a number by the convergents of its continued fraction representation by means of the denominators of the convergents.

Lemma 5.

(Lower Bounds)

Let

p_{n} / q_{n}

be a convergent of the continued fraction representation of x. Then:

|x - \frac{p_{n}}{q_{n}}| > \frac{1}{(q_{n} + q_{n + 1}) q_{n}}

(23)

Proof.

The proof is based on the following claims:

Claim 1.

n even ⇒

\frac{p_{n}}{q_{n}} < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} < x < \frac{p_{n + 1}}{q_{n + 1}}

.

Proof.

\frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}}

is the mediant of

\frac{p_{n + 1}}{q_{n + 1}}

and

\frac{p_{n}}{q_{n}}

. Thus, the mediant property (Note 3) shows that

\frac{p_{n}}{q_{n}} < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} < \frac{p_{n + 1}}{q_{n + 1}}

. Then:

\frac{p_{n}}{q_{n}} < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} \overset{(A)}{<} \frac{2 p_{n + 1} + p_{n}}{2 q_{n + 1} + q_{n}} < \dots < \frac{a_{n + 2} p_{n + 1} + p_{n}}{a_{n + 2} q_{n + 1} + q_{n}} \overset{(B)}{=} \frac{p_{n + 2}}{q_{n + 2}}

where (A) follows by the monotony of even semiconvergents (Lemma 4), and (B) is the recursion theorem. Because of Theorem 7 (note that n + 2 is even and n + 1 is odd), it is

\frac{p_{n + 2}}{q_{n + 2}} < x < \frac{p_{n + 1}}{q_{n + 1}}

. This proves Claim 1.

□_{(\text{claim 1})}

Claim 2.

n odd ⇒

\frac{p_{n - 1}}{q_{n - 1}} < x < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} < \frac{p_{n}}{q_{n}}

.

Proof.

As before,

\frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}}

is the mediant of

\frac{p_{n + 1}}{q_{n + 1}}

and

\frac{p_{n}}{q_{n}}

. Because n is odd, it is

\frac{p_{n + 1}}{q_{n + 1}} < \frac{p_{n}}{q_{n}}

(Theorem 7). Thus,

\frac{p_{n + 1}}{q_{n + 1}} < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} < \frac{p_{n}}{q_{n}}

because of the mediant property (Note 3). Then:

\frac{p_{n}}{q_{n}} > \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} \overset{(A)}{>} \frac{2 p_{n + 1} + p_{n}}{2 q_{n + 1} + q_{n}} > \dots > \frac{a_{n + 2} p_{n + 1} + p_{n}}{a_{n + 2} q_{n + 1} + q_{n}} \overset{(B)}{=} \frac{p_{n + 2}}{q_{n + 2}}

where (A) follows by the monotony of odd semiconvergents (Lemma 4), and (B) is the recursion theorem. Because of Theorem 7 (note that n − 1 is even and n + 2 is odd), it is

\frac{p_{n - 1}}{q_{n - 1}} < x < \frac{p_{n + 2}}{q_{n + 2}}

, and because n is odd, it is

\frac{p_{n + 2}}{q_{n + 2}} < \frac{p_{n}}{q_{n}}

. This proves Claim 2.

□_{(\text{claim 2})}

With Claim 1, for even n, it is

\frac{p_{n}}{q_{n}} < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} < x

⇒

x - \frac{p_{n}}{q_{n}} > \frac{p_{n} + p_{n + 1}}{q_{n} + q_{n + 1}} - \frac{p_{n}}{q_{n}}

.

With Claim 2, for n odd, it is

x < \frac{p_{n + 1} + p_{n}}{q_{n + 1} + q_{n}} < \frac{p_{n}}{q_{n}}

⇒

\frac{p_{n}}{q_{n}} - x > \frac{p_{n}}{q_{n}} - \frac{p_{n} + p_{n + 1}}{q_{n} + q_{n + 1}}

⇔

- (x - \frac{p_{n}}{q_{n}}) > - (\frac{p_{n} + p_{n + 1}}{q_{n} + q_{n + 1}} - \frac{p_{n}}{q_{n}})

.

Thus, for any k ∈ ℕ:

|x - \frac{p_{k}}{q_{k}}| > |\frac{p_{k + 1} + p_{k}}{q_{k + 1} + q_{k}} - \frac{p_{k}}{q_{k}}|

. Next, we compute

\begin{matrix} \frac{p_{k} + p_{k + 1}}{q_{k} + q_{k + 1}} - \frac{p_{k}}{q_{k}} & = \frac{(p_{k} + p_{k + 1}) q_{k} - (q_{k} + q_{k + 1}) p_{k}}{(q_{k} + q_{k + 1}) q_{k}} = \frac{p_{k + 1} q_{k} - p_{k} q_{k + 1}}{(q_{k} + q_{k + 1}) q_{k}} \\ \overset{(A)}{=} \frac{{(- 1)}^{k}}{(q_{k} + q_{k + 1}) q_{k}} \end{matrix}

where (A) is the sign theorem (Theorem 2).

This implies

|x - \frac{p_{k}}{q_{k}}| > |\frac{{(- 1)}^{k}}{(q_{k} + q_{k + 1}) q_{k}}| = \frac{1}{(q_{k} + q_{k + 1}) q_{k}}

. □

Because of

q_{k + 1} > q_{k}

(Corollary 1), it is

q_{k} + q_{k + 1} < 2 q_{k + 1} \Leftrightarrow \frac{1}{2 q_{k + 1}} < \frac{1}{q_{k} + q_{k + 1}} \Leftrightarrow \frac{1}{2 q_{k} q_{k + 1}} < \frac{1}{(q_{k} + q_{k + 1}) q_{k}}

Combining the last inequality with Lemma 5 (Lower Bounds) and Lemma 3 (Upper Bounds), we obtain the concluding theorem of this section:

Theorem 10.

(Bounds of Approximations by Convergents)

Let

p_{k} / q_{k}

be a convergent of the continued fraction representation of x. Then:

\frac{1}{2 q_{k} q_{k + 1}} < \frac{1}{(q_{k} + q_{k + 1}) q_{k}} < |x - \frac{p_{k}}{q_{k}}| < \frac{1}{q_{k} q_{k + 1}} < \frac{1}{q_{k}^{2}}

(24)

□
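The chain of inequalities (24) can be verified exactly on a rational example. The following sketch is an illustration under stated assumptions: the sample value x = 68/157 and the helper names are chosen here, and the check is restricted to k ≤ N − 2 so that q_{k+1} exists and x differs from the convergents involved.

```python
from fractions import Fraction

def continued_fraction(x):
    """Elements [a_0, a_1, ..., a_N] of a rational x (Euclidean algorithm)."""
    cf = []
    while True:
        a = x.numerator // x.denominator
        cf.append(a)
        x -= a
        if x == 0:
            return cf
        x = 1 / x

def convergents(cf):
    """Return [(p_0, q_0), ..., (p_N, q_N)] via the recursion theorem."""
    p_prev, q_prev, p, q = 1, 0, cf[0], 1
    result = [(p, q)]
    for a in cf[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

x = Fraction(68, 157)                  # sample value (assumption)
cf = continued_fraction(x)             # [0, 2, 3, 4, 5]
conv = convergents(cf)

checked = []
for k in range(len(cf) - 2):           # k = 0, ..., N - 2
    (p_k, q_k), (_, q_k1) = conv[k], conv[k + 1]
    chain = [Fraction(1, 2 * q_k * q_k1),
             Fraction(1, (q_k + q_k1) * q_k),
             abs(x - Fraction(p_k, q_k)),
             Fraction(1, q_k * q_k1),
             Fraction(1, q_k * q_k)]
    # Every term must be strictly smaller than its successor, as in (24).
    checked.append(all(a < b for a, b in zip(chain, chain[1:])))
print(checked)
```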

2.5. Best Approximations

Our goal is to approximate a real number by a rational number as well as possible while keeping the denominator of the rational number “small”. Keeping the denominator small is important because in practice, every real number can only be given up to a certain degree of precision, and this is achieved by means of a huge denominator and corresponding numerator. That is, approximating a real number by a rational number with a huge denominator is straightforward, whereas finding a good approximation with a small denominator is the actual problem.

This is captured by the following:

Definition 5.

A fraction

p / q \in ℚ

is called a best approximation (of the first kind) of

α \in ℝ

:⇔

\forall c / d \in ℚ : d \leq q \Rightarrow |α - \frac{c}{d}| > |α - \frac{p}{q}|

(assuming c/d ≠ p/q).

Often, the addition “of the first kind” is omitted. By definition, a best approximation of a real number can only be improved if the denominator of the given approximation is increased.

If p/q is a best approximation of α, then

|α - \frac{p}{q}| = \frac{1}{q} |q α - p|

is small and, thus,

|q α - p|

is small. Measuring the goodness of an approximation this way results in the following:

Definition 6.

A fraction

p / q \in ℚ

is called a best approximation of the second kind of

α \in ℝ

:⇔

\forall c / d \in ℚ : d \leq q \Rightarrow |d α - c| > |q α - p|

(assuming c/d ≠ p/q).

The question is whether every best approximation of the first kind is also a best approximation of the second kind. This is not the case: 1/3 is a best approximation of the first kind of 1/5 because the only fractions c/d ≠ 1/3 in [0, 1] with d ≤ 3 = q are 0, 1/2, 2/3, and 1 (fractions outside [0, 1] are farther away from 1/5 than 0 is), and these numbers satisfy

|\frac{1}{5} - \frac{c}{d}| > |\frac{1}{5} - \frac{1}{3}|

.

Next, we observe that

|1 \cdot \frac{1}{5} - 0| < |3 \cdot \frac{1}{5} - 1|

with 1 < 3. Thus, with

d = 1

and

q = 3

(i.e.,

d < q

) and

α = 1 / 5

, we found a fraction

c / d = 0 / 1

with

|d α - c| < |q α - p|

! As a consequence, although 1/3 is a best approximation of the first kind of 1/5, it is not a best approximation of the second kind.
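The example of 1/3 and 1/5 can be reproduced by brute force. The sketch below is illustrative: the function names are assumptions, and for each denominator d only the four numerators around d·α are tested, which suffices because every other c/d is farther away from α.

```python
from fractions import Fraction
from math import floor

def nearby(alpha, d):
    """Numerators c such that c/d is among the fractions closest to alpha."""
    base = floor(d * alpha)
    return range(base - 1, base + 3)

def is_best_first_kind(p, q, alpha):
    """Definition 5: |alpha - c/d| > |alpha - p/q| for all c/d != p/q, d <= q."""
    target = Fraction(p, q)
    return all(abs(alpha - Fraction(c, d)) > abs(alpha - target)
               for d in range(1, q + 1) for c in nearby(alpha, d)
               if Fraction(c, d) != target)

def is_best_second_kind(p, q, alpha):
    """Definition 6: |d alpha - c| > |q alpha - p| for all c/d != p/q, d <= q."""
    target = Fraction(p, q)
    return all(abs(d * alpha - c) > abs(q * alpha - p)
               for d in range(1, q + 1) for c in nearby(alpha, d)
               if Fraction(c, d) != target)

alpha = Fraction(1, 5)
first = is_best_first_kind(1, 3, alpha)     # True
second = is_best_second_kind(1, 3, alpha)   # False: 0/1 violates the condition
print(first, second)
```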

Thus, not all best approximations of the first kind are best approximations of the second kind. But the reverse holds true:

Lemma 6.

(Every 2nd Kind Best Approximation is a 1st Kind Best Approximation)

If

p / q \in ℚ

is a best approximation of the second kind of

α \in ℝ

, then

p / q

is also a best approximation of the first kind of α.

Proof (by contradiction).

Assume p/q is not a best approximation of the first kind. Then,

|α - \frac{c}{d}| \leq |α - \frac{p}{q}|

for a fraction c/d ≠ p/q with d ≤ q. Multiplying the left side by d and the right side by q ≥ d results in

d |α - \frac{c}{d}| \leq q |α - \frac{p}{q}|

⇔

|d α - c| \leq |q α - p|

, which is a contradiction because p/q is a best approximation of the second kind. □

The next simple estimation about the distance of two fractions by means of the product of their denominators is often used.

Note 4.

(Distance of Fractions)

Let

\frac{a}{b}, \frac{p}{q} \in ℚ

with

\frac{a}{b} \neq \frac{p}{q}

. Then:

|\frac{p}{q} - \frac{a}{b}| \geq \frac{1}{q b}

(25)

Proof.

With

a, p \in ℤ

and

b, q \in ℕ

, it is

p b - a q \in ℤ

. Also,

p b - a q \neq 0

because otherwise

p b = a q \Leftrightarrow \frac{p}{q} = \frac{a}{b}

which contradicts the premise. Thus,

|p b - a q| \in ℕ

, i.e.,

|p b - a q| \geq 1

. This implies

|\frac{p}{q} - \frac{a}{b}| = |\frac{p b - a q}{q b}| = \frac{|p b - a q|}{|q b|} \geq \frac{1}{q b}

where

|q b| = q b

because

b, q \in ℕ

. □

Next, we prove that every best approximation of the second kind is a convergent.

Theorem 11.

(2nd Kind Best Approximations are Convergents)

Let

a / b

be a best approximation of the second kind of

x \in ℝ

, and let

x = [a_{0}; a_{1}, \dots]

be the continued fraction representation of x.

Then

a / b

is a convergent of x.

Proof.

Being a best approximation of the second kind of x,

a / b

satisfies, by definition,

|d x - c| > |b x - a|

for d ≤ b.

Claim 1.

\frac{a}{b} \geq a_{0} = x_{0}

.

Proof (by contradiction).

Assume

\frac{a}{b} < a_{0} \Rightarrow - a_{0} < - \frac{a}{b} \Rightarrow x - a_{0} < x - \frac{a}{b}

; thus,

|x - a_{0}| < |x - \frac{a}{b}| \overset{(A)}{\leq} b |x - \frac{a}{b}| = |b x - a|

, where (A) holds because

b \in ℕ

, i.e., 1 ≤ b. This implies

|1 \cdot x - a_{0}| \leq |b x - a|

, which contradicts

|d x - c| > |b x - a|

for d ≤ b (with

d = 1 \leq b

and

c = a_{0}

). This means that

\frac{a}{b} \geq a_{0} = \frac{a_{0}}{1} \overset{(B)}{=} \frac{p_{0}}{q_{0}} = x_{0}

, where (B) holds because of the recursion theorem.

□_{(\text{claim 1})}

Thus, the geometric situation is as depicted in Figure 9, i.e.,

a / b

is in the grey shaded area being greater than or equal to the convergent

x_{0}

. This will be refined in what follows.

Figure 9. Any best approximation of the second kind is in the grey shaded area, i.e., greater than or equal to the convergent

x_{0}

.

Next, we proceed with a proof by contradiction assuming that

a / b

is not a convergent of x.

Assumption.

\frac{a}{b} \neq \frac{p_{k}}{q_{k}} = x_{k}

for

k \in ℕ

.

According to Claim 1,

\frac{a}{b} \geq a_{0} = x_{0}

. Thus, one of the following must hold:

(i) \frac{a}{b} \in \left] \frac{p_{k - 1}}{q_{k - 1}}, \frac{p_{k + 1}}{q_{k + 1}} \right[ \text{ for } k \geq 1

or

(ii) \frac{a}{b} > \frac{p_{1}}{q_{1}} = x_{1}

This situation is shown in Figure 10.

Figure 10. If a best approximation of the second kind is not a convergent, it is within the indicated grey shaded areas.

Case (1).

If (i) is true, then

|\frac{a}{b} - \frac{p_{k - 1}}{q_{k - 1}}| < |x - \frac{p_{k - 1}}{q_{k - 1}}| \overset{(T h 8)}{<} |\frac{p_{k}}{q_{k}} - \frac{p_{k - 1}}{q_{k - 1}}| \overset{(C)}{=} \frac{1}{q_{k} q_{k - 1}}

where (Th8) is Theorem 8, Equation (14), and (C) is from Corollary 3, Equation (10). Furthermore,

|\frac{a}{b} - \frac{p_{k - 1}}{q_{k - 1}}| \overset{(D)}{\geq} \frac{1}{b q_{k - 1}}

, with (D) because of Note 4 (Distance of Fractions).

Together,

\frac{1}{b q_{k - 1}} \leq |\frac{a}{b} - \frac{p_{k - 1}}{q_{k - 1}}| < \frac{1}{q_{k} q_{k - 1}}

⇒

\frac{1}{b} < \frac{1}{q_{k}}

⇒

b > q_{k}

(iii).

Also, if (i) is true, then

|x - \frac{a}{b}| \geq |\frac{p_{k + 1}}{q_{k + 1}} - \frac{a}{b}| \overset{(E)}{\geq} \frac{1}{b q_{k + 1}}

, where (E) is again using Note 4. This implies

b |x - \frac{a}{b}| \geq \frac{1}{q_{k + 1}}

⇒

|b x - a| \geq \frac{1}{q_{k + 1}}

(iv).

Lemma 3 (Upper Bounds) tells us that

|x - \frac{p_{k}}{q_{k}}| < \frac{1}{q_{k} q_{k + 1}}

which is equivalent to

q_{k} |x - \frac{p_{k}}{q_{k}}| < \frac{1}{q_{k + 1}}

⇔

|q_{k} x - p_{k}| < \frac{1}{q_{k + 1}}

⇒

|q_{k} x - p_{k}| < |b x - a|

(see (iv) just before). Since

q_{k} < b

(see (iii) above), this is a contradiction to

a / b

being a best approximation of the second kind of x. Thus, Case (1) does not occur.

Case (2).

This case is shown in Figure 11. Then,

|x - \frac{a}{b}| > |\frac{p_{1}}{q_{1}} - \frac{a}{b}| \overset{(F)}{\geq} \frac{1}{b q_{1}}

, where (F) again uses Note 4. This implies

|b x - a| > \frac{1}{q_{1}} \overset{(G)}{=} \frac{1}{a_{1}}

(v) with (G) using the recursion theorem.

Figure 11. Pictorial representation of Case (2).

Now,

x - a_{0} = \frac{1}{a_{1} + \frac{1}{a_{2} + ⋱}} \leq \frac{1}{a_{1}}

, where the last inequality holds because of

\frac{1}{a_{2} + ⋱} > 0

; thus,

|x - a_{0}| \leq \frac{1}{a_{1}} \overset{(H)}{<} |b x - a|

, (H) based on (v) before. This means that

|1 \cdot x - a_{0}| < |b x - a|

with

1 \leq b

, i.e.,

a / b

is not a best approximation of the second kind of x, which is a contradiction. Thus, Case (2) does not occur either.

Consequently, the assumption is wrong and there is a

k \in ℕ

with

\frac{a}{b} = \frac{p_{k}}{q_{k}} = x_{k}

, i.e.,

a / b

is a convergent. □

So, every best approximation of the second kind is a convergent. The next theorem proves the reverse, i.e., that every convergent is a best approximation of the second kind.

Theorem 12.

(Lagrange, 1798—Convergents are 2nd Kind Best Approximations)

Let

p_{n} / q_{n}

be a convergent of

x = [a_{0}; a_{1}, \dots, a_{N}]

,

x \neq a_{0} + \frac{1}{2}

, and

n \neq 0

. Then, for

d \leq q_{n}

and

\frac{c}{d} \neq \frac{p_{n}}{q_{n}}

it is

|d x - c| > |q_{n} x - p_{n}|

, i.e., the convergent is a best approximation of the second kind of x.

The cases

x = a_{0} + \frac{1}{2}

and

n = 0

are excluded because the convergent

\frac{p_{0}}{q_{0}} = \frac{a_{0}}{1}

is not a best approximation of the second kind of

x = a_{0} + \frac{1}{2}

: it is

|1 \cdot x - (a_{0} + 1)| = |a_{0} + \frac{1}{2} - a_{0} - 1| = \frac{1}{2}

and

|1 \cdot x - a_{0}| = |a_{0} + \frac{1}{2} - a_{0}| = \frac{1}{2}

, which implies

|1 \cdot x - (a_{0} + 1)| = |1 \cdot x - a_{0}|

. Setting

d : = 1 \leq q_{0}

,

c : = a_{0} + 1

results in

|d \cdot x - c|

=

|1 \cdot x - (a_{0} + 1)| = |1 \cdot x - a_{0}|

=

|q_{0} \cdot x - p_{0}|

. If

\frac{p_{0}}{q_{0}}

were a best approximation of the second kind of x, then

|1 \cdot x - (a_{0} + 1)|

>

|1 \cdot x - a_{0}|

would hold.

The proof of Lagrange’s theorem is very technical. First, the expression

|y_{0} x - z_{0}|

is analyzed to find the smallest integral numbers

y_{0}

and

z_{0}

such that the expression is minimized under the constraint

y_{0} \in \{q_{0}, \dots, q_{k}\}

, i.e.,

y_{0}

is a denominator of a convergent. It is shown both that

z_{0} / y_{0}

is a best approximation of the second kind of x, and that

z_{0} = p_{k}

and

y_{0} = q_{k}

.

Proof.

Let

k \in ℤ

and let

p_{k} / q_{k}

be a convergent. First, we are looking for the smallest numbers

y_{0}, z_{0} \in ℤ

with

y_{0} \in \{q_{0}, \dots, q_{k}\}

such that

|y_{0} x - z_{0}|

is minimal.

Step 1.

Pick an arbitrary

z \in ℤ

, and based on this we determine

y_{0} \in \{q_{0}, \dots, q_{k}\}

.

It is

\underset{y}{\min} |y x - z| = 0 \Leftrightarrow y = \frac{z}{x}

, but in general

y \notin ℤ

. Looking for a solution

y_{0} \in \{q_{0}, \dots, q_{k}\} \subseteq ℤ

that minimizes

|y_{0} x - z|

results in the following potential positions of

z / x

with respect to the denominators

q_{0}, \dots, q_{k}

(see Figure 12):

Figure 12. The potential positions of

z / x

with respect to the denominators

q_{0}, \dots, q_{k}

.

Case 1: $z / x > q_{k}$ . Then, $y_{0} = q_{k}$ is the solution;
Case 2: $z / x < q_{0}$ . Then, $y_{0} = q_{0}$ is the solution;
Let $q_{i} \leq z / x \leq q_{i + 1}$ for 0 ≤ i < k.
Case 3: For $|q_{i + 1} x - z| < |q_{i} x - z|$ (i.e., $z / x$ is closer to $q_{i + 1}$ than to $q_{i}$ ), $y_{0} = q_{i + 1}$ is the solution, and for $|q_{i + 1} x - z| > |q_{i} x - z|$ (i.e., $z / x$ is closer to $q_{i}$ than to $q_{i + 1}$ ), $y_{0} = q_{i}$ is the solution;
Case 4: For $|q_{i + 1} x - z| = |q_{i} x - z|$ (i.e., $z / x$ is exactly in the middle between $q_{i}$ and $q_{i + 1}$ ), $y_{0} = q_{i}$ is the solution because $q_{i} < q_{i + 1}$ , and we are looking for the smallest $y_{0}$ ; in particular, $y_{0} \geq q_{0} = 1$ . $□_{(\text{step 1})}$

Step 2.

Based on the

y_{0}

found, we determine

z_{0}

next. It is

\underset{z}{\min} |y_{0} x - z| = 0

\Leftrightarrow z = y_{0} x

, but in general,

z \notin ℤ

. In solving the minimization problem within

ℤ

(i.e.,

z_{0} : = \underset{z \in ℤ}{\operatorname{argmin}} |y_{0} x - z|

), the following cases can be distinguished (see Figure 13):

Figure 13. The potential positions of

y_{0} x

.

Case 0: It may happen that $y_{0} x \in ℤ$ . Then, choose $z_{0} = y_{0} x$ ;
Case 1: $y_{0} x$ is between two integral numbers s and t, i.e., $s < y_{0} x < t$ . For $|y_{0} x - s| > |y_{0} x - t|$ (i.e., $y_{0} x$ is closer to t than to s), $z_{0} = t$ is the solution; and for $|y_{0} x - s| < |y_{0} x - t|$ (i.e., $y_{0} x$ is closer to s than to t), $z_{0} = s$ is the solution;
Case 2: For $|y_{0} x - s| = |y_{0} x - t|$ (i.e., $y_{0} x$ is exactly in the middle between t and s), $z_{0} = s$ is the solution because $s < t$ , and we are looking for the smallest $z_{0}$ . $□_{(\text{step 2})}$

Claim 1.

z_{0}

is uniquely determined.

Proof (by contradiction).

Assume there exists a

{\tilde{z}}_{0} \in ℤ

with

{\tilde{z}}_{0} \neq z_{0}

and

|x - \frac{z_{0}}{y_{0}}| = |x - \frac{{\tilde{z}}_{0}}{y_{0}}|

. This can happen only if one term is positive and the other is negative, i.e., for example, if

x - \frac{z_{0}}{y_{0}} > 0

and

x - \frac{{\tilde{z}}_{0}}{y_{0}} < 0

, and then

x - \frac{z_{0}}{y_{0}} = \frac{{\tilde{z}}_{0}}{y_{0}} - x

, i.e.,

x = \frac{z_{0} + {\tilde{z}}_{0}}{2 y_{0}}

.

As an intermediate step we prove:

Claim 2.

z_{0} + {\tilde{z}}_{0}

and

2 y_{0}

are co-prime, i.e.,

\gcd (z_{0} + {\tilde{z}}_{0}, 2 y_{0}) = 1

Proof (by contradiction).

Let

{\tilde{z}}_{0} + z_{0} = L p

and

2 y_{0} = L q

with

L > 1

. Then,

x = \frac{z_{0} + {\tilde{z}}_{0}}{2 y_{0}} = \frac{L p}{L q} \Rightarrow x = \frac{p}{q}

and thus

(i) |q x - p| = |q \frac{p}{q} - p| = 0

Assume

L > 2

. Then, with

2 y_{0} = L q

and

L / 2 > 1

, it follows:

(ii) y_{0} = \frac{L}{2} q > q

Now,

y_{0}

has been determined in Step 1 to satisfy

y_{0} = \underset{y}{\operatorname{argmin}} |y x - z|

for a given z, especially for

z = p

, i.e.,

y_{0} = \underset{y}{\operatorname{argmin}} |y x - p|

. Because

0 = \underset{y}{\min} |y x - p|

and

|q x - p| = 0

, it must be

q = y_{0}

. This is a contradiction because

q < y_{0}

according to (ii) before. Thus,

1 < L \leq 2

, i.e., L = 2.

With

L = 2

and

2 y_{0} = L q

, we get

y_{0} = q

. By definition of

z_{0}

,

|q x - p| = |y_{0} x - p| > |y_{0} x - z_{0}|

. However,

|q x - p| = 0

(see (i) above); thus,

0 > |y_{0} x - z_{0}|

, which is a contradiction.

□_{(\text{claim 2})}

We continue the proof of Claim 1: It is

\frac{z_{0} + {\tilde{z}}_{0}}{2 y_{0}} = x

and also

x = \frac{p_{N}}{q_{N}}

, i.e.,

\frac{z_{0} + {\tilde{z}}_{0}}{2 y_{0}} = \frac{p_{N}}{q_{N}}

. Because

\gcd (z_{0} + {\tilde{z}}_{0}, 2 y_{0}) = 1

according to Claim 2, it follows that

p_{N} = z_{0} + {\tilde{z}}_{0}

and

q_{N} = 2 y_{0}

.

Now, let

N \geq 2

. Then, it is

2 y_{0} = q_{N} \overset{(A)}{=} a_{N} q_{N - 1} + q_{N - 2}

((A) uses the recursion theorem (Theorem 1)), and with Note 1, it is

a_{N} \geq 2

. Thus,

2 y_{0} \geq 2 q_{N - 1} + q_{N - 2}

\Rightarrow y_{0} \geq q_{N - 1} + \frac{q_{N - 2}}{2}

\Rightarrow q_{N - 1} \leq y_{0} - \frac{q_{N - 2}}{2} \overset{(B)}{<} y_{0}

((B) is because

q_{N - 2} > 0

). Now:

|q_{N - 1} x - p_{N - 1}| = |q_{N - 1} \frac{p_{N}}{q_{N}} - p_{N - 1}| = \frac{1}{q_{N}} |q_{N - 1} p_{N} - p_{N - 1} q_{N}| \overset{(C)}{=} \frac{1}{q_{N}} = \frac{1}{2 y_{0}} \overset{(D)}{\leq} \frac{1}{2}

where (C) holds because of the sign theorem and (D) because

y_{0} \geq 1

(see the end of the proof of Step 1).

Furthermore,

\begin{matrix} |y_{0} x - z_{0}| = |y_{0} \frac{z_{0} + {\tilde{z}}_{0}}{2 y_{0}} - z_{0}| & = |\frac{z_{0} + {\tilde{z}}_{0}}{2} - z_{0}| = \frac{1}{2} |z_{0} + {\tilde{z}}_{0} - 2 z_{0}| \\ & = \frac{1}{2} |{\tilde{z}}_{0} - z_{0}| \overset{(E)}{\geq} \frac{1}{2} \end{matrix}

(iii)

where (E) is true because

{\tilde{z}}_{0} \neq z_{0}

and, thus,

|{\tilde{z}}_{0} - z_{0}| \geq 1

for integral numbers

{\tilde{z}}_{0}

and

z_{0}

. Together, we obtained

|y_{0} x - z_{0}| \geq \frac{1}{2} \geq |q_{N - 1} x - p_{N - 1}|

, which is a contradiction to the choice of

y_{0}

and

z_{0}

! This proves Claim 1 for

N \geq 2

.

Now, let

N = 1

and choose

a_{1} = 2

(based on Note 1, the last element of a finite continued fraction is always greater than or equal to 2; thus,

a_{1} \geq 2

). Then

x = [a_{0}; a_{1}] = \frac{p_{1}}{q_{1}} \overset{(F)}{=} \frac{a_{1} a_{0} + 1}{a_{1}} = \frac{2 a_{0} + 1}{2} = a_{0} + \frac{1}{2}

((F) is the recursion theorem) which has been excluded from the theorem.

Thus, let

N = 1

and

a_{1} > 2

. Then

|1 \cdot x - a_{0}| \overset{(G)}{=} |q_{0} x - p_{0}| = |q_{0} \frac{p_{1}}{q_{1}} - p_{0}| = \frac{1}{q_{1}} |q_{0} p_{1} - q_{1} p_{0}| \overset{(H)}{=} \frac{1}{q_{1}} \overset{(G)}{=} \frac{1}{a_{1}} < \frac{1}{2}

where (G) applies the recursion theorem and (H) the sign theorem. Because of (iii), it is

|y_{0} x - z_{0}| \geq \frac{1}{2}

, i.e., together,

|q_{0} x - p_{0}| < |y_{0} x - z_{0}|

which contradicts the definition of

y_{0}

and

z_{0}

! This proves Claim 1 for

N = 1

.

□_{(\text{claim 1})}

Next, we observe

Claim 3.

\frac{z_{0}}{y_{0}}

is a best approximation of the second kind of x.

Otherwise:

|b x - a| \leq |y_{0} x - z_{0}|

for an

\frac{a}{b} \neq \frac{z_{0}}{y_{0}}

with

b \leq y_{0}

, which contradicts the definition of

y_{0}

and

z_{0}

!

□_{(\text{claim 3})}

According to Theorem 11,

\frac{z_{0}}{y_{0}}

is a convergent of x, i.e.,

\frac{z_{0}}{y_{0}} = \frac{p_{s}}{q_{s}}

for an

s \leq k

. If

s = k

, the proof is done. Thus, we assume

s < k

.

Claim 4.

For

s < k

, it is

\frac{1}{q_{s} + q_{s + 1}} \geq \frac{1}{q_{k} + q_{k - 1}}

.

Proof.

s < k

⇒

s \leq k - 1

⇒

q_{s} \leq q_{k - 1}

(Corollary 1: denominators are monotonically increasing). Similarly,

s < k

⇒

s + 1 \leq k

⇒

q_{s + 1} \leq q_{k}

. Together, this implies

q_{k} + q_{k - 1} \geq q_{s} + q_{s + 1}

.

□_{(\text{claim 4})}

Next, we get

|q_{s} x - p_{s}| = q_{s} |x - \frac{p_{s}}{q_{s}}| \overset{(I)}{>} q_{s} \frac{1}{(q_{s} + q_{s + 1}) q_{s}} = \frac{1}{q_{s} + q_{s + 1}} \overset{(J)}{\geq} \frac{1}{q_{k} + q_{k - 1}}

where (I) is Lemma 5 (Lower Bounds) and (J) is Claim 4.

Furthermore,

|q_{k} x - p_{k}| = q_{k} |x - \frac{p_{k}}{q_{k}}| \overset{(K)}{<} q_{k} \frac{1}{q_{k} q_{k + 1}} = \frac{1}{q_{k + 1}}

, where (K) holds because of Lemma 3 (Upper Bounds).

With

\frac{z_{0}}{y_{0}} = \frac{p_{s}}{q_{s}}

and the definition of

y_{0} (= q_{s})

and

z_{0} (= p_{s})

(i.e., the minimizing property), it is

|q_{s} x - p_{s}| = |y_{0} x - z_{0}| \leq |q_{k} x - p_{k}|

⇒

\frac{1}{q_{k} + q_{k - 1}} < \frac{1}{q_{k + 1}}

, which implies

q_{k + 1} < q_{k} + q_{k - 1}

. This is a contradiction; because of the recursion theorem, it is

q_{k + 1} = a_{k + 1} q_{k} + q_{k - 1} \overset{(L)}{\geq} q_{k} + q_{k - 1}

, where (L) holds with

a_{k + 1} \geq 1

. Thus,

s = k

which proves the overall theorem. □

Putting the last two theorems together yields:

Corollary 5.

a / b

is a best approximation of the second kind of x ⇔ a/b is a convergent of x. □
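Corollary 5 can be illustrated by exhaustive search on a small rational number. The sketch below is an assumption-laden illustration: the sample value x = 68/157 = [0; 2, 3, 4, 5] and all names are chosen here, only fractions a/b in lowest terms are considered, and integer arithmetic is used via Q·|d x − c| = |d P − c Q| for x = P/Q. For this particular x, the convergent 0/1 also happens to be a best approximation of the second kind.

```python
from math import gcd

P, Q = 68, 157     # sample x = P/Q (assumption); convergents 0/1, 1/2, 3/7, 13/30, 68/157

def scaled_error(c, d):
    """Q * |d x - c|, computed exactly with integers."""
    return abs(d * P - c * Q)

def nearby(d):
    """Numerators c close to d * x; all other numerators are farther away."""
    base = (d * P) // Q
    return range(base - 1, base + 3)

def is_best_second_kind(a, b):
    """Definition 6, tested against all c/d != a/b with d <= b."""
    return all(scaled_error(c, d) > scaled_error(a, b)
               for d in range(1, b + 1) for c in nearby(d)
               if c * b != a * d)          # skip c/d == a/b

best = [(a, b) for b in range(1, Q + 1) for a in nearby(b)
        if gcd(a, b) == 1 and is_best_second_kind(a, b)]
print(best)
```

The search returns exactly the numerator/denominator pairs of the convergents of x.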

According to Theorem 12, every convergent is a best approximation of the second kind, and each best approximation of the second kind is also a best approximation of the first kind (Lemma 6). We keep this observation as:

Note 5.

Every convergent is a best approximation of the first kind. □

But are best approximations of the first kind also always convergents? Not quite: the next theorem proves that a best approximation of the first kind is a convergent or a semiconvergent.

Theorem 13.

(Lagrange, 1798—1st Kind Best Approximations are Convergents or Semiconvergents)

Let

a / b

be a best approximation of the first kind of

x = [a_{0}; a_{1}, \dots, a_{N}]

. Then

a / b

is a convergent or a semiconvergent of x.

Proof.

By definition, it is

|x - \frac{c}{d}| > |x - \frac{a}{b}|

for

\frac{c}{d} \neq \frac{a}{b}

and

d \leq b

.

Claim 1.

a / b > a_{0}

.

Otherwise:

\frac{a}{b} \leq a_{0} = \frac{a_{0}}{1}

; thus,

x - a_{0} \leq x - \frac{a}{b}

. Now,

x - a_{0} = \frac{1}{a_{1} + ⋱} > 0

; thus,

0 < x - a_{0} \leq x - \frac{a}{b}

⇒

|x - \frac{a_{0}}{1}| \leq |x - \frac{a}{b}|

. Because

1 \leq b

, we obtained a contradiction since

a / b

is a best approximation of the first kind.

□_{(\text{claim 1})}

Claim 2.

a / b < a_{0} + 1

.

Otherwise:

\frac{a}{b} \geq a_{0} + 1

and based on the geometric situation depicted in Figure 8, it follows that

|x - \frac{a_{0} + 1}{1}| \leq |x - \frac{a}{b}|

with

1 \leq b

, which contradicts

a / b

being a best approximation of the first kind.

□_{(\text{claim 2})}

Consequently,

a / b

lies between

x_{0} = a_{0}

and

x_{- 1, 1} = a_{0} + 1

(see Equation (22)), i.e.,

x_{0} = a_{0} < \frac{a}{b} < a_{0} + 1 = x_{- 1, 1}

(26)

and is, thus, covered by the set of intervals defined by the convergents and semiconvergents of x (see Figure 8).

Assumption.

a / b

is neither a convergent nor a semiconvergent.

This results in the following cases:

Case 1: $a / b$ lies between two semiconvergents $x_{k - 1, r}$ and $x_{k - 1, r + 1}$ ;
Case 2: $a / b$ lies between two convergents $x_{k}$ and $x_{k + 2}$ ;
Case 3: $a / b$ lies between a convergent and a semiconvergent.

We will show that all three cases lead to a contradiction, i.e., the assumption must be false; thus, the theorem is proven.

Case 1.

a / b

lies between

x_{k - 1, r} = \frac{r p_{k} + p_{k - 1}}{r q_{k} + q_{k - 1}}

and

x_{k - 1, r + 1} = \frac{(r + 1) p_{k} + p_{k - 1}}{(r + 1) q_{k} + q_{k - 1}}

.

Then,

\begin{matrix} \left| \frac{a}{b} - \frac{r p_{k} + p_{k - 1}}{r q_{k} + q_{k - 1}} \right| & < \left| \frac{(r + 1) p_{k} + p_{k - 1}}{(r + 1) q_{k} + q_{k - 1}} - \frac{r p_{k} + p_{k - 1}}{r q_{k} + q_{k - 1}} \right| \\ & \overset{(A)}{=} \frac{1}{((r + 1) q_{k} + q_{k - 1}) (r q_{k} + q_{k - 1})} \end{matrix}

where (A) results from the same computation performed in the proof of Lemma 4.

Furthermore, it is

(i) |\frac{a}{b} - \frac{r p_{k} + p_{k - 1}}{r q_{k} + q_{k - 1}}| = \frac{|a (r q_{k} + q_{k - 1}) - b (r p_{k} + p_{k - 1})|}{b (r q_{k} + q_{k - 1})} \overset{(B)}{\geq} \frac{1}{b (r q_{k} + q_{k - 1})}

where (B) is seen to be valid as follows:

a (r q_{k} + q_{k - 1}) - b (r p_{k} + p_{k - 1}) \in ℤ

and, thus,

|a (r q_{k} + q_{k - 1}) - b (r p_{k} + p_{k - 1})| \in ℕ_{0}

; if it were zero, the first modulus in (i) would be zero, i.e.,

a / b = x_{k - 1, r}

, which contradicts the assumption that a/b is not a semiconvergent. Hence,

|a (r q_{k} + q_{k - 1}) - b (r p_{k} + p_{k - 1})| \geq 1

.

Together,

\frac{1}{b (r q_{k} + q_{k - 1})} < \frac{1}{((r + 1) q_{k} + q_{k - 1}) (r q_{k} + q_{k - 1})} \Rightarrow \frac{1}{b} < \frac{1}{(r + 1) q_{k} + q_{k - 1}},

thus,

(ii) b > (r + 1) q_{k} + q_{k - 1}

Because of the monotony of the sequence of semiconvergents

{(x_{s, t})}_{t}

(Lemma 4), it is for an odd

k

(i.e.,

k - 1

even)

x_{k - 1, r} < x_{k - 1, r + 1}

(see the geometric situation in Figure 14);

Figure 14. Distances within an interval of semiconvergents (k odd).

thus,

|x - \frac{a}{b}| > |x - \frac{(r + 1) p_{k} + p_{k - 1}}{(r + 1) q_{k} + q_{k - 1}}|

But with (ii), it is

(r + 1) q_{k} + q_{k - 1} < b

; thus,

a / b

is not a best approximation of the first kind to x, which is a contradiction.

k

even leads to a contradiction in the same way, i.e., Case 1 is not possible.

□_{(\text{case 1})}

Case 2.

a / b

lies between

x_{k}

and

x_{k + 2}

.

Then,

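|\frac{a}{b} - \frac{p_{k}}{q_{k}}| < |\frac{p_{k}}{q_{k}} - \frac{p_{k + 2}}{q_{k + 2}}| \overset{(C)}{=} \frac{a_{k + 2}}{q_{k} q_{k + 2}} \overset{(C')}{=} \frac{1}{q_{k} q_{k + 2}}

where (C) is Equation (11) from Corollary 3, and (C′) holds because

a_{k + 2} = 1

in this case (otherwise, a semiconvergent would lie between

x_{k}

and

x_{k + 2}

, and a/b would fall under Case 1 or Case 3). With Note 4, it is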

|\frac{a}{b} - \frac{p_{k}}{q_{k}}| \geq \frac{1}{b q_{k}}

.

Together,

\frac{1}{b q_{k}} < \frac{1}{q_{k} q_{k + 2}}

⇒

\frac{1}{b} < \frac{1}{q_{k + 2}}

⇒

b > q_{k + 2}

. Because of the geometric situation shown in Figure 15, it is

|x - \frac{a}{b}| > |x - \frac{p_{k + 2}}{q_{k + 2}}|

, which is a contradiction to

a / b

being a best approximation of the first kind to x and

b > q_{k + 2}

.

□_{(\text{case 2})}

Figure 15. Distances within an interval of convergents (k even).

Case 3.

a / b

lies between a convergent and a semiconvergent.

This implies that

a / b

lies between

x_{k}

and

x_{k, 1}

(see Figure 8), otherwise

a / b

would lie between two semiconvergents, which has already been covered in Case 1.

Thus,

|\frac{a}{b} - \frac{p_{k}}{q_{k}}| < |x_{k} - x_{k, 1}|

, but

\begin{matrix} |x_{k} - x_{k, 1}| & = |\frac{p_{k}}{q_{k}} - \frac{p_{k + 1} + p_{k}}{q_{k + 1} + q_{k}}| = |\frac{p_{k} (q_{k + 1} + q_{k}) - q_{k} (p_{k + 1} + p_{k})}{q_{k} (q_{k + 1} + q_{k})}| \\ = |\frac{p_{k} q_{k + 1} - q_{k} p_{k + 1}}{q_{k} (q_{k + 1} + q_{k})}| \overset{(D)}{=} \frac{1}{q_{k} (q_{k + 1} + q_{k})} \end{matrix}

where (D) is the sign theorem. I.e., it is

|\frac{a}{b} - \frac{p_{k}}{q_{k}}| < \frac{1}{q_{k} (q_{k + 1} + q_{k})}

. As before, with Note 4, it is

|\frac{a}{b} - \frac{p_{k}}{q_{k}}| \geq \frac{1}{b q_{k}}

⇒

\frac{1}{b q_{k}} < \frac{1}{q_{k} (q_{k + 1} + q_{k})}

⇒

b > q_{k + 1} + q_{k}

.

The geometric situation from Figure 16 reveals

|x - \frac{a}{b}| > |x - \frac{p_{k + 1} + p_{k}}{q_{k + 1} + q_{k}}|

, which is a contradiction to

a / b

being a best approximation of the first kind to x and

b > q_{k + 1} + q_{k}

.

□_{(\text{case 3})}

□

Figure 16. Situation in which

a / b

is between a convergent and its first semiconvergent (k even).
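Theorem 13 can also be checked by exhaustive search. The following sketch is illustrative and rests on assumptions made here: the sample value x = 68/157 = [0; 2, 3, 4, 5], the helper names, and the restriction to fractions in lowest terms. The comparison |x − c/d| > |x − a/b| is multiplied by b·d·Q > 0 so that only integers are involved, and the semiconvergents include x_{−1,t} based on the conventions (21).

```python
from math import gcd

P, Q = 68, 157               # sample x = P/Q (assumption)
CF = [0, 2, 3, 4, 5]         # its continued fraction elements a_0, ..., a_4

def nearby(d):
    """Numerators c close to d * x; all other numerators are farther away."""
    base = (d * P) // Q
    return range(base - 1, base + 3)

def is_best_first_kind(a, b):
    """|x - c/d| > |x - a/b| for all c/d != a/b with d <= b (Definition 5)."""
    return all(abs(d * P - c * Q) * b > abs(b * P - a * Q) * d
               for d in range(1, b + 1) for c in nearby(d)
               if c * b != a * d)

def convergents_and_semiconvergents(cf):
    pq = [(1, 0), (cf[0], 1)]            # pq[i] = (p_{i-1}, q_{i-1})
    for a in cf[1:]:
        pq.append((a * pq[-1][0] + pq[-2][0], a * pq[-1][1] + pq[-2][1]))
    conv = set(pq[1:])
    semi = set()
    for n in range(-1, len(cf) - 2):     # n = -1, ..., N - 2
        (p_n, q_n), (p_n1, q_n1) = pq[n + 1], pq[n + 2]
        for t in range(1, cf[n + 2] + 1):
            semi.add((t * p_n1 + p_n, t * q_n1 + q_n))
    return conv, semi

conv, semi = convergents_and_semiconvergents(CF)
best = {(a, b) for b in range(1, Q + 1) for a in nearby(b)
        if gcd(a, b) == 1 and is_best_first_kind(a, b)}
print(sorted(best, key=lambda f: f[1]))
```

For this x, the fraction 2/5 is a best approximation of the first kind that is a semiconvergent but not a convergent, which shows that the semiconvergents in Theorem 13 cannot be dropped.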

Finally, we give a simple criterion that allows us to prove that a given fraction is a convergent of another real number. This theorem is a cornerstone of computing a prime factor with Shor’s algorithm.

Theorem 14.

(Legendre, 1798—Convergent Criterion)

If

|x - \frac{a}{b}| < \frac{1}{2 b^{2}}

, then

a / b

is a convergent of x.

Proof.

We show that

a / b

is a best approximation of the second kind of x. Theorem 11 then proves the claim.

Let

|d x - c| \leq |b x - a|

for

\frac{a}{b} \neq \frac{c}{d}

and

d > 0

. We need to prove

d > b

.

Now,

|b x - a| = b |x - \frac{a}{b}| < b \frac{1}{2 b^{2}} = \frac{1}{2 b}

. This implies

|d x - c| < \frac{1}{2 b}

⇔

d |x - \frac{c}{d}| < \frac{1}{2 b}

⇔

|x - \frac{c}{d}| < \frac{1}{2 d b}

. Thus,

|\frac{c}{d} - \frac{a}{b}| = |\frac{c}{d} - x + x - \frac{a}{b}| \leq |\frac{c}{d} - x| + |x - \frac{a}{b}| < \frac{1}{2 d b} + \frac{1}{2 b^{2}} = \frac{b + d}{2 d b^{2}}

With Note 4 (Distance of Fractions), it is also

|\frac{c}{d} - \frac{a}{b}| \geq \frac{1}{d b}

. Together, it is

\frac{1}{d b} < \frac{b + d}{2 d b^{2}} \Leftrightarrow 1 < \frac{b + d}{2 b} \Leftrightarrow 2 b < b + d \Leftrightarrow d > b . □
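To indicate how the convergent criterion is used when searching for a period, the following sketch works through a toy instance. Everything specific in it is an assumption made for illustration: the modulus N = 21, the base a = 2 (whose period is 6), the register size Q = 512, and the hypothetical measurement value y = 85, which is close to Q/6. Since |y/Q − 1/6| < 1/(2·6²), Theorem 14 guarantees that 1/6 appears among the convergents of y/Q.

```python
from fractions import Fraction

def convergents(x):
    """Yield the convergents p_k/q_k of a rational x as Fractions."""
    p_prev, q_prev = 1, 0                   # p_{-1}, q_{-1}
    a = x.numerator // x.denominator
    p, q = a, 1                             # p_0, q_0
    yield Fraction(p, q)
    rest = x - a
    while rest != 0:
        x = 1 / rest
        a = x.numerator // x.denominator
        p, p_prev = a * p + p_prev, p       # recursion theorem
        q, q_prev = a * q + q_prev, q
        yield Fraction(p, q)
        rest = x - a

def legendre_holds(x, frac):
    """Premise of Theorem 14: |x - a/b| < 1 / (2 b^2)."""
    b = frac.denominator
    return abs(x - frac) < Fraction(1, 2 * b * b)

N, a = 21, 2             # hypothetical toy instance
y, Q = 85, 512           # hypothetical measurement and register size
x = Fraction(y, Q)

conv = list(convergents(x))                 # 0, 1/6, 42/253, 85/512
# Take the convergent with the largest denominator below N as candidate.
candidate = max((c for c in conv if c.denominator < N),
                key=lambda c: c.denominator)
r = candidate.denominator
print(candidate, legendre_holds(x, candidate), pow(a, r, N) == 1)
```

In this instance the candidate denominator is the period itself; in general, the denominator may only be a divisor of the period, so the final check with modular exponentiation is still needed.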

3. Probability of the Occurrence of Convergents

3.1. Estimating Secant Lengths

In this part, we use the main arguments of [2].

In order to estimate the probability of the occurrence of a certain state after having performed the quantum Fourier transform, we need the following estimation of a lower bound and an upper bound of the length of a secant of the unit circle:

Lemma 7.

(Secant Length Estimation)

If

φ \in [- π, π]

then

\frac{2 |φ|}{π} \leq |1 - e^{i φ}| \leq |φ|

.

Proof.

The upper bound follows from elementary geometry, namely that the length of a secant is less than or equal to the length of the corresponding arc of a circle (see Figure 17).

Figure 17. The length of a secant is smaller than the arc of the corresponding unit circle.

The length of the arc determined by the angle φ on a circle of radius r is

r φ

, i.e., if the circle is a unit circle, the length of the arc (green in Figure 17) is

φ

.

A secant of the unit circle (red in Figure 17) can be defined by the two complex numbers on the unit circle (black in Figure 17) that are the endpoints of the secant. Thus, the length of this secant is the modulus of the difference of these complex numbers. One of these points can always be chosen as 1 because a corresponding rotation is length-preserving; the other point is then

e^{i φ}

, where

φ

is the angle of the arc cut by the secant. The length of this secant is then

|1 - e^{i φ}|

.

This proves the inequality

|1 - e^{i φ}| \leq |φ|

.

□_{(u p p e r b o u n d)}

Next, we compute

\begin{matrix} |1 - e^{i φ}| & \overset{(A)}{=} |1 - c o s φ - i s i n φ| \overset{(B)}{=} \sqrt{{(1 - c o s φ)}^{2} + s i n^{2} φ} \\ = \sqrt{1 - 2 c o s φ + c o s^{2} φ + s i n^{2} φ} = \sqrt{2 - 2 c o s φ} \\ = \sqrt{2} \sqrt{1 - c o s φ} \overset{(C)}{=} \sqrt{2} \sqrt{2 s i n^{2} \frac{φ}{2}} \\ \overset{(D)}{=} 2 s i n \frac{φ}{2} \end{matrix}

where (A) uses Euler’s formula, (B) is the definition of the modulus of a complex number with

R e = 1 - c o s φ

and

I m = - s i n φ

, (C) is the double-angle formula, and (D) assumes that

s i n \frac{φ}{2} \geq 0

.

To estimate a lower bound for

s i n \frac{φ}{2}

, we analyze the function

f (x) = s i n x - \frac{2 x}{π}

. From elementary calculus, it is known that a function

ψ

is concave on

D \subseteq ℝ

if and only if its second derivative is not positive on D, i.e.,

ψ^{″} \leq 0

on D.

(Reminder:

ψ

is concave on D:⇔

\forall x, y \in D \forall t \in [0, 1] : ψ (t x + (1 - t) y) \geq t ψ (x) + (1 - t) ψ (y)

, i.e., for any two points on the graph of

ψ

, the secant between these points is below the graph, Figure 18).

Figure 18. The graphs of sin x and 2x/π.

It is

d^{2} s i n x / d x^{2} = - s i n x \leq 0

in particular for

x \in [0, π / 2]

, i.e.,

s i n x

is concave on

x \in [0, π / 2]

. Thus, the secant between

s i n 0 = 0

and

s i n \frac{π}{2} = 1

is below the graph of

s i n x

(orange in Figure 18). However, this secant is given by

g (x) = \frac{2}{π} x

(green in Figure 18). Thus, it is

\frac{2}{π} x \leq s i n x

for

x \in [0, π / 2]

, i.e., with

φ : = 2 x

we get

\frac{φ}{π} \leq s i n \frac{φ}{2}

for

φ \in [0, π]

, and this implies

2 \frac{φ}{π} \leq 2 s i n \frac{φ}{2}

for

φ \in [0, π]

.

Now,

|1 - e^{i φ}| = 2 s i n \frac{φ}{2}

(see the computation before) implies

|1 - e^{i φ}| \geq 2 \frac{φ}{π}

for

φ \in [0, π]

.

Furthermore,

s i n x

is convex on

x \in [- π / 2, 0]

; thus, an argument analogous to the above shows that

|1 - e^{i φ}| \geq 2 \frac{|φ|}{π}

for

φ \in [- π, π]

.

□_{(l o w e r b o u n d)}

□
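The two bounds of Lemma 7 can be checked numerically. The following Python sketch is only an illustration; the function name `secant_bounds_hold` and the tolerance parameter are assumptions introduced here to absorb floating-point rounding.

```python
import cmath
import math


def secant_bounds_hold(phi: float, tol: float = 1e-12) -> bool:
    """Check 2|phi|/pi <= |1 - e^{i phi}| <= |phi| for a given angle phi."""
    secant = abs(1 - cmath.exp(1j * phi))  # length of the secant from 1 to e^{i phi}
    lower = 2 * abs(phi) / math.pi
    upper = abs(phi)
    return lower <= secant + tol and secant <= upper + tol


# Sample the interval [-pi, pi] on an equidistant grid.
angles = [-math.pi + 2 * math.pi * i / 1000 for i in range(1001)]
all_hold = all(secant_bounds_hold(phi) for phi in angles)
```

At the endpoints φ = ±π the lower bound is attained (the secant is a diameter of length 2), which is why the comparison uses a small tolerance.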

3.2. Estimating Amplitude Parameters

As stated in the introduction, the quantum part of Shor’s algorithm produces in its final step the following quantum state via a measurement:

\frac{1}{\sqrt{N A}} \overset{A - 1}{\sum_{j = 0}} ω_{N}^{j p y} | y ⟩

(27)

Thus, according to the Born rule, the probability

P (y)

of this particular state |y⟩ is the square of the modulus of the amplitude of |y⟩, i.e.,

P (y) = {|\frac{1}{\sqrt{N A}} \overset{A - 1}{\sum_{j = 0}} ω_{N}^{j p y}|}^{2} = \frac{1}{N A} {|\sum_{j = 0}^{A - 1} ω_{N}^{j p y}|}^{2}

(28)

The argument of the modulus is a geometric sum

\sum q^{j}

with

q = ω_{N}^{p y} = e^{\frac{2 π i}{N} p y}

; thus, in case

q \neq 1

,

P (y) = \frac{1}{N A} {|\sum_{j = 0}^{A - 1} q^{j}|}^{2} = \frac{1}{N A} {|\frac{1 - q^{A}}{1 - q}|}^{2}

(29)

with

q^{A} = e^{\frac{2 π i}{N} A p y}

. In this section, in order to compute a lower bound for

P (y)

, we investigate some relations between the following parameters appearing in Equation (28):

n: the number to be factorized;
N: a power of 2 (e.g., $N = 2^{m}$ ) with $n^{2} < N < 2 n^{2}$ ;
- the choice of N effectively determines the domain of numbers that can be represented in the $| a ⟩$ -part of the quantum register (see Equation (1)).
p: the period of the modular exponentiation function $f (x) = a^{x} m o d n$ ;
A: the number of arguments mapped to a given value of $f$ .

We also estimate bounds of the argument

2 π \frac{A p y}{N}

of

q^{A} = e^{\frac{2 π i}{N} A p y}

.

3.2.1. Basics from Number Theory

For convenience, we state the definition of the modulo function.

Definition 7.

The modulo function is the following map:

\begin{matrix} m o d : ℕ_{0} \times ℕ & \to ℕ_{0} \\ (z, n) & \mapsto z - ⌊ \frac{z}{n} ⌋ n \overset{\underset{d e f}{}}{=} z m o d n \end{matrix}

(30)

z m o d n

is, thus, the residue left when dividing z by n. I.e., if

r = z m o d n

, then there is a number

k \in ℕ_{0}

such that

z = k n + r

with

0 \leq r < n

.

If

z m o d n = \tilde{z} m o d n = r

, we find numbers

k_{1}

and

k_{2}

such that

z = k_{1} n + r

and

\tilde{z} = k_{2} n + r

with

0 \leq r < n

. This implies that

z - \tilde{z} = (k_{1} - k_{2}) n = : k n

, i.e., n is a divisor of

z - \tilde{z}

(in symbols:

n | (z - \tilde{z})

). We also obtain that

z m o d n = \tilde{z} m o d n

implies that

\tilde{z} = z + k n

.

The equation

z m o d n = \tilde{z} m o d n

is abbreviated as

z \equiv \tilde{z} (m o d n)

; in words,

z

is congruent to

\tilde{z}

modulo n. As shown just before,

z \equiv \tilde{z} (m o d n)

is equivalent to

n | (z - \tilde{z})

and to

\tilde{z} = z + k n

. We keep this as

Note 6.

z \equiv \tilde{z} (m o d n)

⇔

n | (z - \tilde{z})

⇔

\tilde{z} = z + k n

. □

Furthermore, we state the definition of modular exponentiation which turns out to play a key role in finding factors.

Definition 8.

For

0 < a < n

, the modular exponentiation function is the following map:

\begin{matrix} f : ℕ_{0} & \to ℕ_{0} \\ x & \mapsto a^{x} m o d n \end{matrix}

(31)

The smallest positive number p that satisfies

f (x) = f (x + p)

for all x is called the period of f. Especially, with

x = 0

, we get

f (0) = f (p)

which means that

a^{p} m o d n = a^{0} m o d n = 1 m o d n

, i.e.,

a^{p} \equiv 1 (m o d n)

which in turn is equivalent to

n | (a^{p} - 1)

. Thus, we have proven:

Note 7.

f (x) = a^{x} m o d n

has period p ⇒

a^{p} \equiv 1 (m o d n)

⇔

n | (a^{p} - 1)

. □

Finding a factor of n can be achieved by finding the period p of the function

f (x) = a^{x} m o d n

. This is seen as follows: Let p be the period of

f

, then

n | (a^{p} - 1)

, i.e.,

(a^{p} - 1) = k n

. Assume p is even (if p is odd, Shor’s algorithm is repeated with a different a, until an even p is found). With such an even p, it is

(a^{p} - 1) = (a^{p / 2} - 1) (a^{p / 2} + 1) = k n

which implies that every prime factor of n divides

(a^{p / 2} - 1)

or

(a^{p / 2} + 1)

, i.e., at least one of these two numbers has a common divisor greater than 1 with n, which in turn means that

g c d (a^{p / 2} - 1, n)

or

g c d (a^{p / 2} + 1, n)

is a divisor of n greater than 1. Thus, if an even period has been determined, classically efficient calculations can be used to compute a factor of n (if the divisor found is n itself, the algorithm is repeated with a different a). If this factor is a prime number, we can finish. Otherwise, we continue determining a factor of the former factor, and so on, until we end up with a prime factor of n.
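The reduction from factoring to period finding can be sketched classically for small numbers. The following Python code is only an illustration under stated assumptions: the period is found by brute force (which is exactly the step the quantum part replaces), and the function names `period` and `factor_from_period` are hypothetical.

```python
from math import gcd


def period(a: int, n: int) -> int:
    """Smallest p > 0 with a**p ≡ 1 (mod n), found by brute force."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be co-prime")
    p, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        p += 1
    return p


def factor_from_period(a: int, n: int):
    """Return a nontrivial divisor of n derived from the period of a, or None."""
    p = period(a, n)
    if p % 2 == 1:
        return None  # odd period: try a different a
    half = pow(a, p // 2, n)  # a^(p/2) mod n
    for candidate in (gcd(half - 1, n), gcd(half + 1, n)):
        if 1 < candidate < n:
            return candidate
    return None  # only trivial divisors found: try a different a
```

For example, `factor_from_period(7, 15)` uses the period 4 of 7 modulo 15 and returns 3, whereas `factor_from_period(14, 15)` returns `None` because 14 ≡ -1 (mod 15) yields only trivial divisors.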

Next, we determine an upper bound of the period p of the modular exponentiation by using group theory. A simple calculation shows that “

\equiv

” is an equivalence relation on

ℤ

. The equivalence class of

z \in ℤ

is denoted as

[z]

and is referred to as the residue class of z modulo n. It is

[z] = \{\tilde{z} \in ℤ | \tilde{z} \equiv z (m o d n)\} = \{z + k n | k \in ℤ\}

(see Note 6), where the latter set is sometimes written as

z + n ℤ

. The set of all residue classes modulo n is denoted as

ℤ_{n}

, i.e.,

ℤ_{n} = \{[0], [1], \dots, [n - 1]\}

.

We can multiply two residue classes modulo n as follows:

[x] \cdot [y] = [x \cdot y]

. With this multiplication,

ℤ_{n}^{*} = \{[z] \in ℤ_{n} | g c d (z, n) = 1\}

becomes a group. Because

ℤ_{n}^{*} \subseteq ℤ_{n}

, it is

φ (n) : = c a r d ℤ_{n}^{*} \leq c a r d ℤ_{n} = n

. Since every integer is a divisor of itself, it is

g c d (n, n) = n \neq 1

(for

n \geq 2

), i.e., the cardinality of numbers co-prime to n is less than n:

n \geq 2 \Rightarrow φ (n) < n

.

The well-known Lagrange’s theorem from group theory states that for a group G with

c a r d G = m < \infty

and for each

x \in G

, it is

x^{m} = e

(e is the unit element of G)—see Lemma 3.2.5 in [3], for example. Thus, for

x \in ℤ_{n}^{*}

, it is

x^{φ (n)} = 1

, i.e.,

x^{φ (n)} \equiv 1 (m o d n)

. Since the period p is the smallest number with

x^{p} \equiv 1 (m o d n)

, it follows that

p \leq φ (n)

and, thus,

p < n

.

Now, the assumption of Shor’s algorithm is that

0 < a < n

and that

g c d (a, n) = 1

, which ensures that

[a] \in ℤ_{n}^{*}

; thus,

{[a]}^{p} \equiv 1 (m o d n)

and

p < n

.

Lemma 8.

Let p be the period of

f (x) = a^{x} m o d n

. Then,

p < n

. □
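The chain p ≤ φ(n) < n behind Lemma 8 can be verified for small n. The following Python sketch is only an illustration; the function names are hypothetical, and both the period and φ(n) are computed by brute force.

```python
from math import gcd


def multiplicative_period(a: int, n: int) -> int:
    """Smallest p > 0 with a**p ≡ 1 (mod n); requires gcd(a, n) == 1."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be co-prime")
    p, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        p += 1
    return p


def euler_phi(n: int) -> int:
    """Number of residues z in {1, ..., n} that are co-prime to n."""
    return sum(1 for z in range(1, n + 1) if gcd(z, n) == 1)


def lemma8_holds(n: int) -> bool:
    """Check p <= phi(n) < n for every a with 0 < a < n and gcd(a, n) == 1."""
    phi = euler_phi(n)
    return all(
        multiplicative_period(a, n) <= phi < n
        for a in range(1, n)
        if gcd(a, n) == 1
    )
```

For instance, for n = 21 it is φ(21) = 12, and the period of a = 2 is 6.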

3.2.2. Intervals of Consecutive Multiples of the Period

The relation between N and p is depicted in Figure 19; multiples of N are always contained in closed intervals defined by consecutive multiples of p, i.e., it may happen that a multiple of N coincides with a multiple of p.

Figure 19. Multiples of N are enclosed by immediately succeeding multiples of p.

Note 8.

\forall k \in ℕ \exists t \in ℕ : (t - 1) p \leq k N \leq t p .

Proof.

Pick an arbitrary

k \in ℕ

, i.e.,

k N \in ℕ

is also given. Then, we find a

\tilde{t} \in ℕ

such that

\tilde{t} p \geq k N

(trivial because

p > 0

). Let

t

be the smallest of such

\tilde{t}

, i.e.,

t \overset{\underset{d e f}{}}{=} m i n \{\tilde{t} | \tilde{t} p \geq k N\}

. Thus,

(t - 1) p \leq k N

because otherwise

(t - 1) p > k N

, which is a contradiction because t was chosen minimal.

Together,

(t - 1) p \leq k N \leq t p

. The claim follows because k was arbitrary. □

The situation we just discussed is shown in Figure 20.

Figure 20. Determining the interval of succeeding multiples of p enclosing a multiple of N.

Furthermore, two different multiples of N are in different intervals defined by succeeding multiples of p. Otherwise, the situation of Figure 21 would imply that

N \leq p

, which is a contradiction as shown by the proof of Note 9 below.

Figure 21. No two multiples of N can be enclosed by succeeding multiples of p.

We denote a

t

with

(t - 1) p \leq k N \leq t p

by

t_{k}

. This is justified by the next Note 9 which proves that such a

t

is uniquely determined by

k

. Especially, a multiple

k N

is contained in “its” interval:

\forall k \in ℕ \exists t_{k} \in ℕ : k N \in [(t_{k} - 1) p, t_{k} p]

(32)

Note 9.

Let

k \in ℕ

and

t_{k} \in ℕ

with

(t_{k} - 1) p \leq k N \leq t_{k} p

.

Then,

r \neq s \in \{0, \dots, p - 1\}

implies

t_{r} \neq t_{s}

.

Proof (by contradiction).

Assume

r \neq s

but

t_{r} = t_{s} \overset{\underset{d e f}{}}{=} t

with

(t - 1) p \leq r N \leq t p

and

(t - 1) p \leq s N \leq t p

(see Figure 20). W.l.o.g.

r < s

⇒

r + 1 \leq s

⇒

s N - r N \geq (r + 1) N - r N = N

. Further,

s N - r N \leq t p - (t - 1) p = p

. Together, it is

N \leq s N - r N \leq p

, i.e.,

N \leq p

.

According to Lemma 8, we know

p < n

⇒

N \leq p < n < n^{2}

. However, by selection of N (see the bullet list at the beginning of Section 3.2), it is

n^{2} < N

, which is a contradiction. □

The proof of Note 9 has shown especially:

Corollary 6.

r \neq s \in \{0, \dots, p - 1\} \Rightarrow r N \notin [(t_{s} - 1) p, t_{s} p] □

By Note 9, for

r \neq s \in \{0, \dots, p - 1\}

, the numbers

t_{r}, t_{s}

are different, i.e., for each

k \in \{0, \dots, p - 1\}

, such a unique

t_{k}

exists, i.e., the

p

numbers

t_{0}, t_{1}, \dots, t_{p - 1}

are different. Thus:

Corollary 7.

There exist p different numbers

t_{k}

,

0 \leq k \leq p - 1

, such that

(t_{k} - 1) p \leq k N \leq t_{k} p

. □

These different numbers are strictly monotonically increasing.

Note 10.

Let

k \in ℕ

and

t_{k} \in ℕ

with

(t_{k} - 1) p \leq k N \leq t_{k} p

. Then,

t_{k} < t_{k + 1}

. Thus,

t_{0} < t_{1} < \dots < t_{p - 1}

.

Proof (by contradiction).

Assume

t_{k + 1} \leq t_{k}

; thus,

t_{k + 1} - 1 \leq t_{k} - 1

, which implies

t_{k + 1} p \leq t_{k} p

and

(t_{k + 1} - 1) p \leq (t_{k} - 1) p

.

Now,

k N < (k + 1) N

,

(t_{k} - 1) p \leq k N \leq t_{k} p

, and

(t_{k + 1} - 1) p \leq (k + 1) N \leq t_{k + 1} p

. This implies

(k + 1) N \leq t_{k + 1} p \leq t_{k} p

and

(t_{k} - 1) p \leq k N < (k + 1) N

, which finally results in

(t_{k} - 1) p < (k + 1) N \leq t_{k} p

, which contradicts Corollary 6 because it would imply that

(k + 1) N \in [(t_{k} - 1) p, t_{k} p]

. □

Each multiple

k N

of N is “close” to a multiple

t p

in the sense that

k N

is at most

p / 2

apart from

(t_{k} - 1) p

or

t_{k} p

(see Figure 22).

Figure 22. A multiple of N is always “close” to a multiple of p.

More precisely:

Note 11.

\forall k \in ℕ \exists t \in ℕ : |(t - 1) p - k N| \leq \frac{p}{2} \lor |t p - k N| \leq \frac{p}{2} .

Proof.

It is

(t - 1) p \leq k N \leq t p

, i.e., by definition

k N \in [(t - 1) p, t p]

. This implies

k N - (t - 1) p \leq \frac{p}{2}

∨

t p - k N \leq \frac{p}{2}

(see Figure 22), otherwise:

k N - (t - 1) p > \frac{p}{2} \land t p - k N > \frac{p}{2} \Leftrightarrow - (t - 1) p > \frac{p}{2} - k N \land t p > \frac{p}{2} + k N

⇒

t p - (t - 1) p > \frac{p}{2} - k N + \frac{p}{2} + k N = p

, but

t p - (t - 1) p = p

, i.e.,

p > p

, which is a contradiction! This proves the claim

|(t - 1) p - k N| \leq \frac{p}{2} \lor |t p - k N| \leq \frac{p}{2}

. □

As before, from Note 9, it follows that for

k \in \{0, \dots, p - 1\}

, these numbers t are all different. Precisely:

Corollary 8.

Let

0 \leq k \leq p - 1

and

t_{k} \in ℕ

such that

|(t_{k} - 1) p - k N| \leq \frac{p}{2} \lor |t_{k} p - k N| \leq \frac{p}{2}

. If

r \neq s

, then

t_{r} \neq t_{s}

. □

The multiples of N are sparsely scattered across the intervals of consecutive multiples of p. More precisely, intervals of consecutive multiples of p, which contain a multiple of N, are not consecutive. This is the content of

Note 12.

Let

n > 2

,

k \in \{0, \dots, p - 1\}

, and

t_{k} \in ℕ

with

(t_{k} - 1) p \leq k N \leq t_{k} p

.

Then

t_{k + 1} > t_{k} + 1

as well as

t_{k - 1} < t_{k} - 1

.

Proof.

Because of Note 10, it is

t_{k + 1} > t_{k}

; thus,

t_{k + 1} \geq t_{k} + 1

.

Assumption.

t_{k + 1} = t_{k} + 1

.

By definition,

(k + 1) N \in [(t_{k + 1} - 1) p, t_{k + 1} p]

, and by assumption,

t_{k + 1} = t_{k} + 1

; thus, it is

(k + 1) N \in [t_{k} p, (t_{k} + 1) p]

. Furthermore, by definition,

k N \in [(t_{k} - 1) p, t_{k} p]

(see Figure 23).

Figure 23. Situation in case

k N

and

(k + 1) N

lie within two consecutive intervals of consecutive multiples of p.

Now,

[(t_{k} - 1) p, t_{k} p], [t_{k} p, (t_{k} + 1) p] \subseteq [(t_{k} - 1) p, (t_{k} + 1) p]

which implies

k N, (k + 1) N \in [(t_{k} - 1) p, (t_{k} + 1) p]

.

Thus,

N = (k + 1) N - k N \leq (t_{k} + 1) p - (t_{k} - 1) p = 2 p

, i.e.,

N \leq 2 p

(see Figure 23).

By Lemma 8, it is

p < n

⇒

2 p < 2 n

. With

n > 2

⇒

n^{2} > 2 n

and by definition of N, it is

n^{2} < N

; thus,

N > n^{2} > 2 n > 2 p

: a contradiction!

Thus, the assumption is false, which implies

t_{k + 1} > t_{k} + 1

. The claim

t_{k - 1} < t_{k} - 1

is proven similarly. □

The resulting geometric situation is depicted in Figure 24.

Figure 24. If

k N

is in an interval defined by two consecutive multiples of p, the preceding and succeeding intervals do not contain a multiple of N.

If

k N \in [(t_{k} - 1) p, t_{k} p]

, it is

t_{k} p - k N \leq p / 2

or

k N - (t_{k} - 1) p \leq p / 2

(see Figure 22 or Figure 24). In case

k N - (t_{k} - 1) p \leq p / 2

, we define

\hat{t} : = t_{k} - 1

and

k N - \hat{t} p \leq p / 2

results, and in case of

t_{k} p - k N \leq p / 2

, we define

\hat{t} : = t_{k}

implying

\hat{t} p - k N \leq p / 2

. According to Note 9, this

\hat{t}

is uniquely defined. This proves

Note 13.

\forall k \in ℕ \exists! \hat{t} \in ℕ : |\hat{t} p - k N| \leq \frac{p}{2}

, i.e.,

\hat{t}

is uniquely determined by

k

. □

This is next rewritten into a format more useful for what follows.

Note 14.

Let

k \in \{0, \dots, p - 1\}

and

t_{k} \in ℕ

with

t_{k} \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

.

If

r \neq s \in \{0, \dots, p - 1\}

, then

t_{r} \neq t_{s}

.

Proof.

It is

t_{k} \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

⇔

k \frac{N}{p} - \frac{1}{2} \leq t_{k} \leq k \frac{N}{p} + \frac{1}{2}

⇔

k N - \frac{p}{2} \leq p t_{k} \leq k N + \frac{p}{2}

⇔

- \frac{p}{2} \leq p t_{k} - k N \leq \frac{p}{2}

⇔

|p t_{k} - k N| \leq \frac{p}{2}

. Note 13 shows that

t_{k}

is uniquely determined by

k

. □

Finally, we can prove the following:

Corollary 9.

There exist p different numbers

t_{k}

,

0 \leq k \leq p - 1

, such that

t_{k} \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

Proof.

There exist p different numbers

t_{k}

,

0 \leq k \leq p - 1

, such that

(t_{k} - 1) p \leq k N \leq t_{k} p

(Corollary 7). The proof of Note 11 shows that this implies

|(t_{k} - 1) p - k N| \leq \frac{p}{2} \lor |t_{k} p - k N| \leq \frac{p}{2}

. The proof of Note 14 shows that this implies

t_{k} \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

. □
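Corollary 9 can be illustrated with concrete numbers. The following Python sketch is only an illustration; the function name `closest_integers` is hypothetical, and the example values n = 21, a = 2 (period p = 6) and N = 512 are chosen here so that n² < N < 2n² holds.

```python
def closest_integers(N: int, p: int) -> list:
    """For k = 0, ..., p-1 return the integer t_k nearest to k*N/p.

    Integer arithmetic only: floor((2kN + p) / (2p)) rounds k*N/p to the
    nearest integer (ties are rounded up).
    """
    return [(2 * k * N + p) // (2 * p) for k in range(p)]


def satisfies_corollary9(N: int, p: int) -> bool:
    """Check |p*t_k - k*N| <= p/2 for all k and that all t_k are different."""
    ts = closest_integers(N, p)
    close = all(2 * abs(p * t - k * N) <= p for k, t in enumerate(ts))
    return close and len(set(ts)) == p
```

With N = 512 and p = 6, the call `closest_integers(512, 6)` gives the six different numbers 0, 85, 171, 256, 341 and 427.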

3.2.3. Cardinality of Pre-Images

First, we show that the parameter A is greater than 1, i.e., at least two numbers available in the

| a ⟩

-part of the quantum register are mapped by f to the same value.

Note 15.

A > 1 .

Proof.

As recalled in the introduction, the quantum Fourier transform of Shor’s algorithm produces the following state:

|a ⟩| b ⟩ = \frac{1}{\sqrt{N}} \overset{N - 1}{\sum_{x = 0}} |x ⟩| f (x) ⟩

After measurement of the

| b ⟩

-part of the register, the

| a ⟩

-part is in the state

| a ⟩ = \frac{1}{\sqrt{A}} (|x ⟩ +| x + p ⟩ + |x + 2 p ⟩ + \dots +| x + (A - 1) p ⟩)

(33)

i.e.,

f^{- 1} (| x ⟩) = \{|x ⟩,| x + p ⟩, |x + 2 p ⟩, \dots,| x + (A - 1) p ⟩\}

.

Choose

x < p

—such an x exists because otherwise it would be

p = 0

, but a period p satisfies

p > 0

. With

p < n

(Lemma 8) and

n < n^{2} < N

(by choice of N), it is

x + p < 2 p < N

(see the proof of Note 12). Thus,

x + p

is in the domain of f (in the sense that it is a value in the

| a ⟩

-part of the quantum register available as an argument for f), i.e.,

f (x + p)

is available in the

| b ⟩

-part of the register.

A = 1

would imply that

f^{- 1} (| x ⟩) = \{| x ⟩\}

and, thus,

| x + p ⟩ \notin f^{- 1} (| x ⟩)

, i.e.,

f (| x ⟩) \neq f (| x + p ⟩)

for

p \neq 0

. Since

p > 0

, the function

f

would not be periodic. □

Next, we prove tighter bounds for the parameter A.

Note 16.

(A - 1) p < N < (A + 1) p .

Proof.

As in the proof of Note 15, we choose

x < p

. With

| a ⟩ = \frac{1}{\sqrt{A}} (|x ⟩ +| x + p ⟩ + |x + 2 p ⟩ + \dots +| x + (A - 1) p ⟩),

i.e.,

\{|x ⟩,| x + p ⟩, |x + 2 p ⟩, \dots,| x + (A - 1) p ⟩\}

are all values in the

| a ⟩

-part of the register being mapped to

f (x)

, i.e.,

A

is the largest number satisfying

x + (A - 1) p < N

. With

x \geq 0

, this implies

(A - 1) p < N

—which is the first part of the claim.

Furthermore,

(A + 1) p > N

. Otherwise,

(A + 1) p \leq N

and with

x < p

, it would be

x + A p < p + A p = (A + 1) p \leq N

, i.e.,

| x + A p ⟩

would also be in the

| a ⟩

-part of the register being mapped to

f (x)

, which is a contradiction to the definition of A. This proves the second part of the claim. □

The next estimation gives an approximation of

N

in terms of the product of

A

and

p

.

Note 17.

N \approx A p .

Proof.

Because of

(A - 1) p < N < (A + 1) p

, the geometric situation is as depicted in Figure 25, i.e.,

N \in [(A - 1) p, A p]

or

N \in [A p, (A + 1) p]

. Thus,

|N - A p| \leq p

.

Figure 25.

N

is embraced by

(A - 1) p

and

(A + 1) p

.

Now,

p < n

(Lemma 8) and

n < n^{2} < N

by choice of N. In practice, n is a large number, i.e.,

n^{2}

is huge compared to n:

n ≪ n^{2} < N

. Together:

p ≪ N

(34)

In this sense,

p

is a small number, i.e.,

|N - A p|

is small too:

N \approx A p

. □
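The bounds of Note 16 and the estimate of Note 17 can be checked for a concrete register size. The following Python sketch is only an illustration; the function names are hypothetical, and A is computed by directly counting the arguments x, x + p, x + 2p, … that are smaller than N.

```python
def preimage_size(x: int, p: int, N: int) -> int:
    """Number A of arguments x, x+p, ..., x+(A-1)p below N (with 0 <= x < p)."""
    return len(range(x, N, p))


def notes_16_and_17_hold(p: int, N: int) -> bool:
    """Check (A-1)p < N < (A+1)p and |N - A p| <= p for every offset x < p."""
    for x in range(p):
        A = preimage_size(x, p, N)
        if not ((A - 1) * p < N < (A + 1) * p):
            return False
        if abs(N - A * p) > p:
            return False
    return True
```

For p = 6 and N = 512, the offset x = 0 gives A = 86 and the offset x = 5 gives A = 85; in both cases N differs from A·p by less than p.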

3.2.4. Estimating Arguments of Amplitudes of Potential Measurement Results

The next Lemma is the main result of this section for what follows.

Lemma 9.

Let

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

and

k \in \{0, \dots, p - 1\}

. Then:

2 π \frac{y (A - 1) p}{N}, 2 π \frac{p y}{N} \in [- π, + π]

Proof.

It is

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

⇔

k \frac{N}{p} - \frac{1}{2} \leq y \leq k \frac{N}{p} + \frac{1}{2}

⇔ (multiply with p)

k N - \frac{p}{2} \leq p y \leq k N + \frac{p}{2}

⇔

- \frac{p}{2} \leq p y - k N \leq \frac{p}{2}

⇔

(i) y p - k N \leq \frac{p}{2} \land k N - y p \leq \frac{p}{2}

By Note 16, it is

(ii) (A - 1) p < N \Rightarrow \frac{(A - 1) p}{N} < 1

Furthermore:

(iii) \frac{(A - 1) p}{N} = \frac{A p}{N} - \frac{p}{N} \overset{(A)}{\approx} \frac{N}{N} - \frac{p}{N} = 1 - \frac{p}{N} \overset{(B)}{\approx} 1

where (A) is because of Note 17 (

N \approx A p

), and (B) is because of Equation (34) (

p ≪ N

).

Next, we compute the lower bound for the first fraction of the claim:

\begin{matrix} 2 π \frac{y (A - 1) p}{N} & = 2 π \frac{(A - 1)}{N} y p \\ \overset{(C)}{\geq} 2 π \frac{(A - 1)}{N} (k N - \frac{p}{2}) \\ = 2 π (A - 1) k - π \frac{(A - 1) p}{N} \\ \overset{(D)}{\geq} - π \frac{(A - 1) p}{N} \overset{(E)}{\approx} - π \end{matrix}

where (C) follows from the second inequation of (i) above, (D) is because of

2 π (A - 1) k \geq 0

, and (E) is implied by (iii) above.

The upper bound for the first fraction of the claim is computed next:

\begin{matrix} 2 π \frac{y (A - 1) p}{N} & = 2 π \frac{(A - 1)}{N} y p \overset{(F)}{\leq} 2 π \frac{(A - 1)}{N} (k N + \frac{p}{2}) \\ = 2 π (A - 1) k + π \frac{(A - 1) p}{N} \overset{(G)}{<} 2 π (A - 1) k + π \\ \overset{(H)}{<} 2 π (A - 1) p + π \overset{(I)}{<} 2 π N + π \overset{(J)}{\equiv} π \end{matrix}

where (F) is implied by the first inequation of (i) above, (G) is (ii) above, (H) follows from the prerequisite

k \in \{0, \dots, p - 1\}

, i.e.,

k < p

, and (I) is the first inequation of (ii) above. Finally, we will estimate

e^{i φ}

, and because of

e^{i 2 π N} = 1

, (J) is justified.

Together,

- π \leq 2 π \frac{y (A - 1) p}{N} \leq π

, which proves the first claim.

□_{(f i r s t f r a c t i o n)}

Next,

\begin{matrix} 2 π \frac{p y}{N} & = \frac{2 π}{N} p y \overset{(K)}{\leq} \frac{2 π}{N} (k N + \frac{p}{2}) \\ = 2 π k + π \frac{p}{N} \overset{(L)}{<} 2 π k + π \overset{(M)}{\equiv} π \end{matrix}

with (K) from the first inequation of (i) before, (L) because

p < N

, and (M) because we will estimate

e^{i φ}

. I.e., the upper bound of the second fraction is as claimed.

The correctness of the lower bound is seen as follows:

\begin{matrix} 2 π \frac{p y}{N} & = \frac{2 π}{N} p y \overset{(N)}{\geq} \frac{2 π}{N} (k N - \frac{p}{2}) \\ = 2 π k - π \frac{p}{N} \overset{(O)}{\geq} - π \frac{p}{N} \overset{(Q)}{>} - π \end{matrix}

with the second inequation of (i) before giving (N), (O) is because of

2 π k \geq 0

, and (Q) is true because

0 < p < N

; thus,

0 < p / N < 1

.

□_{(s e c o n d f r a c t i o n)}

□

3.3. Estimating Probabilities

We are now ready to compute the probability

P (y)

that the state |y⟩, which is prepared by the quantum part of Shor’s algorithm, is “close” (i.e., within a distance of

1 / 2

) to a multiple of

N / p

.

Lemma 10.

Assume

q = e^{i 2 π \frac{y p}{N}} \neq 1

and let P be the probability that

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

for a

k \in \{0, \dots, p - 1\}

. Then,

P \approx \frac{4}{π^{2}}

.

Proof.

According to Equation (29), the probability

P (y)

to measure a particular

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

is

P (y) = \frac{1}{N A} {|\frac{1 - q^{A}}{1 - q}|}^{2}

(35)

in case

q \neq 1

(which is the assumption) where

q = e^{i 2 π \frac{y p}{N}}

(the case

q = 1

will be treated separately in Note 18). Thus, with

q^{A} = e^{i 2 π \frac{y A p}{N}}

, it is

|\frac{1 - q^{A}}{1 - q}| = \frac{|1 - q^{A}|}{|1 - q|} = \frac{|1 - e^{i 2 π \frac{y A p}{N}}|}{|1 - e^{i 2 π \frac{y p}{N}}|}

(36)

The structure of the numerator and denominator suggests estimating both by means of Lemma 7 (Secant Length Estimation). However, applying Lemma 7 requires that

2 π \frac{y A p}{N}, 2 π \frac{y p}{N} \in [- π, + π]

. By Lemma 9, we know that under the prerequisite

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

, it is

2 π \frac{p y}{N} \in [- π, + π]

as well as

2 π \frac{y (A - 1) p}{N} \in [- π, + π]

, but Lemma 9 does not imply

2 π \frac{y A p}{N} \in [- π, + π]

.

Now, consider the following calculation:

|\frac{1 - q^{A}}{1 - q}| = |\frac{1 - q^{A - 1}}{1 - q} + q^{A - 1}| \overset{(A)}{\geq} |\frac{1 - q^{A - 1}}{1 - q}| - |q^{A - 1}| \overset{(B)}{=} |\frac{1 - q^{A - 1}}{1 - q}| - 1

(37)

where (A) holds because of

|a + b| \geq |a| - |b|

, and

|e^{i φ}| = 1

implies (B):

|q^{t}| = |{(e^{i 2 π \frac{y p}{N}})}^{t}| = |e^{i (2 π \frac{y p t}{N})}| = 1

.

Equation (37) allows us to apply the secant length estimation (Lemma 7) because in

|\frac{1 - q^{A - 1}}{1 - q}| = \frac{|1 - q^{A - 1}|}{|1 - q|} = \frac{|1 - e^{i 2 π \frac{y (A - 1) p}{N}}|}{|1 - e^{i 2 π \frac{y p}{N}}|}

(38)

it is now

2 π \frac{y (A - 1) p}{N}, 2 π \frac{y p}{N} \in [- π, + π]

according to Lemma 9.

First, we use the second inequation of

\frac{2 |φ|}{π} \leq |1 - e^{i φ}| \leq |φ|

from Lemma 7 with

φ = 2 π \frac{y p}{N} \in [- π, + π]

and obtain

|1 - e^{i 2 π \frac{y p}{N}}| \leq 2 π \frac{y p}{N}

(39)

Then, we use the first inequation of

\frac{2 |φ|}{π} \leq |1 - e^{i φ}| \leq |φ|

from Lemma 7 with

φ = 2 π \frac{y (A - 1) p}{N} \in [- π, + π]

and obtain

|1 - e^{i 2 π \frac{y (A - 1) p}{N}}| \geq \frac{2}{π} \cdot 2 π \frac{y (A - 1) p}{N}

(40)

Using Equations (39) and (40) in Equation (38) results in

|\frac{1 - q^{A - 1}}{1 - q}| = \frac{|1 - e^{i 2 π \frac{y (A - 1) p}{N}}|}{|1 - e^{i 2 π \frac{y p}{N}}|} \geq \frac{2}{π} \cdot 2 π y \frac{(A - 1) p}{N} \cdot \frac{N}{2 π y p} = \frac{2 (A - 1)}{π}

(41)

This result is now used in Equation (37) (step (C) below) and we obtain

\begin{matrix} |\frac{1 - q^{A}}{1 - q}| \geq |\frac{1 - q^{A - 1}}{1 - q}| - 1 & \overset{(C)}{\geq} \frac{2 (A - 1)}{π} - 1 \\ = \frac{2 A}{π} - \frac{2}{π} - 1 = \frac{2 A}{π} - (\frac{2}{π} + 1) \end{matrix}

(42)

Using Equation (42) in Equation (35) (step (D) below) results in

P (y) = \frac{1}{N A} {|\frac{1 - q^{A}}{1 - q}|}^{2} \overset{(D)}{\geq} \frac{1}{N A} {(\frac{2 A}{π} - (\frac{2}{π} + 1))}^{2}

= \frac{1}{N A} (\frac{4 A^{2}}{π^{2}} - \frac{4 A}{π} (\frac{2}{π} + 1) + {(\frac{2}{π} + 1)}^{2})

= \frac{1}{N A} (\frac{4 A^{2}}{π^{2}} - \frac{8 A}{π^{2}} - \frac{4 A}{π} + \frac{4}{π^{2}} + \frac{4}{π} + 1)

= \frac{4 A}{π^{2} N} - \frac{8}{π^{2} N} - \frac{4}{π N} + \frac{4}{π^{2} N A} + \frac{4}{π N A} + \frac{1}{N A}

\geq \frac{4 A}{π^{2} N} - \frac{8}{π^{2} N} - \frac{4}{π N} = \frac{4 A}{π^{2} N} - \frac{4}{π N} (1 + \frac{2}{π})

Thus,

P (y) \geq \frac{4 A}{π^{2} N} - \frac{4}{π N} (1 + \frac{2}{π})

(43)

According to Note 17, we know

N \approx A p

⇒

\frac{A}{N} \approx \frac{1}{p}

, i.e.,

\frac{4 A}{π^{2} N} \approx \frac{4}{π^{2} p}

(44)

Furthermore, since N is a “huge” number, we know that the following is “small”:

\frac{4}{π N} (1 + \frac{2}{π}) \overset{\underset{d e f}{}}{=} ε

(45)

Using Equations (44) and (45) in Equation (43) results in

P (y) \geq \frac{4}{π^{2}} \frac{1}{p} - ε

(46)

for each

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

. According to Corollary 9, there exist p different numbers

y_{k}

with

y_{k} \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

and for each of them

P (y_{k}) \geq \frac{4}{π^{2}} \frac{1}{p} - ε

. Since we are not interested in a particular

y_{k}

, but in any of them, we need to sum up all probabilities

P (y_{k})

to obtain the overall probability

P

:

P = \overset{p - 1}{\sum_{i = 0}} P (y_{i}) \geq \frac{4}{π^{2}} - p ε \approx \frac{4}{π^{2}}

This proves the claim. □
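The estimate of Lemma 10 can be compared with the exact probabilities from Equation (28) for a small example. The following Python sketch is only an illustration under stated assumptions: the example values p = 6 and N = 512 (for n = 21, a = 2) are chosen here, A is taken as the number of multiples of p below N (offset x = 0), and the function names are hypothetical.

```python
import cmath
import math


def probability(y: int, p: int, N: int, A: int) -> float:
    """P(y) = |sum_{j<A} omega_N^(j p y)|^2 / (N A), evaluated term by term."""
    q = cmath.exp(2j * math.pi * p * y / N)
    amplitude = sum(q ** j for j in range(A))  # also valid for q == 1
    return abs(amplitude) ** 2 / (N * A)


def total_probability(p: int, N: int) -> float:
    """Sum of P(y_k) over the integers y_k nearest to k*N/p, k = 0, ..., p-1."""
    A = len(range(0, N, p))  # assumption: offset x = 0
    ys = [(2 * k * N + p) // (2 * p) for k in range(p)]
    return sum(probability(y, p, N, A) for y in ys)


lower_bound = 4 / math.pi ** 2
total = total_probability(6, 512)
```

In this example, the summed probability clearly exceeds the bound 4/π² ≈ 0.405, which is consistent with Lemma 10 giving only a lower estimate.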

We still need to estimate the probability for the case

q = 1

.

Note 18.

Let

q = 1

. Then

P (y) = \frac{A}{N}

.

Proof.

In case

q = 1

, the probability is

P (y) = \frac{1}{N A} {|\sum_{j = 0}^{A - 1} q^{j}|}^{2} = \frac{1}{N A} {|\sum_{j = 0}^{A - 1} 1|}^{2} = \frac{1}{N A} A^{2} = \frac{A}{N} □

3.4. Computing the Period

Let y be the result of the measurement produced by Shor’s algorithm. Under the assumption that

q \neq 1

, the following holds:

Theorem 15.

With probability

P \approx \frac{4}{π^{2}}

, there exists a

k \in \{0, \dots, p - 1\}

, such that

|\frac{y}{N} - \frac{k}{p}| < \frac{1}{2 p^{2}}

Proof.

According to Lemma 10, the probability that

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

for a

k \in \{0, \dots, p - 1\}

is

\approx 4 / π^{2}

.

However,

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

⇔

- \frac{1}{2} \leq y - \frac{k N}{p} \leq + \frac{1}{2}

. Dividing the latter inequations by

N

yields:

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

⇔

- \frac{1}{2 N} \leq \frac{y}{N} - \frac{k}{p} \leq + \frac{1}{2 N}

.

Thus,

|\frac{y}{N} - \frac{k}{p}| \leq \frac{1}{2 N}

. By choice of

N

, it is

n^{2} < N

. Furthermore,

p < n

⇒

p^{2} < n^{2}

⇒

p^{2} < N

⇒

\frac{1}{N} < \frac{1}{p^{2}}

. This results in

|\frac{y}{N} - \frac{k}{p}| \leq \frac{1}{2 N} < \frac{1}{2 p^{2}}

. □
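The inequality of Theorem 15 can be checked with exact rational arithmetic. The following Python sketch is only an illustration; the function name `theorem15_holds` is hypothetical, and y is taken as the integer nearest to kN/p, under the assumption p² < N.

```python
from fractions import Fraction


def theorem15_holds(N: int, p: int) -> bool:
    """Check |y/N - k/p| < 1/(2 p^2) for the integer y nearest to k*N/p."""
    for k in range(p):
        y = (2 * k * N + p) // (2 * p)  # nearest integer to k*N/p
        distance = abs(Fraction(y, N) - Fraction(k, p))
        if distance >= Fraction(1, 2 * p * p):
            return False
    return True
```

For N = 512 and p = 6, the value k = 1 gives y = 85 and a distance of 1/1536, which is well below 1/72.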

Legendre’s Theorem (Theorem 14) proves immediately:

Theorem 16.

With probability

\approx 4 / π^{2}

,

k / p

is a convergent of

y / N

. □

3.4.1. Determining the Period by Convergents: $q \neq 1$

Algorithm 2 determines with probability of approximately

4 / π^{2}

the period p we are looking for; it is applicable in the case

q \neq 1

:

Algorithm 2 Determining with probability of approximately

4 / π^{2}

the period p we are looking for

Compute $\frac{y}{N} \in ℚ_{> 0}$ ;
- The result of the measurement is $y \in ℕ$ and $N \in ℕ$ has been chosen ⇒ $\frac{y}{N} \in ℚ_{> 0}$ can be computed.
Compute the continued fraction representation $[a_{0}; a_{1}, \dots, a_{m}]$ of $\frac{y}{N} \in ℚ$ ;
Compute the convergents $[a_{0}; a_{1}, \dots, a_{u}] = \frac{g_{u}}{h_{u}}$ , $1 \leq u \leq m$ ;
Determine the largest denominator $h_{ω}$ below n, i.e., $h_{ω} < n$ and $h_{ω} \geq h_{u}$ for all $1 \leq u \leq m$ with $h_{u} < n$
⇒ $\frac{g_{ω}}{h_{ω}}$ is a very good approximation of $\frac{k}{p}$ because $\frac{1}{2 h_{ω}^{2}} \leq \frac{1}{2 h_{u}^{2}}$ ;
Thus, $h_{ω}$ is a candidate for the period p;
Check whether $h_{ω}$ is in fact the period.
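The steps of Algorithm 2 can be sketched as follows. This Python code is only an illustration under stated assumptions: the function names `convergents`, `candidate_period` and `is_period` are hypothetical, the convergents are computed with the Euclidean algorithm, and the final check is a brute-force test that is feasible only for small n.

```python
from fractions import Fraction


def convergents(num: int, den: int) -> list:
    """Convergents g_u/h_u of num/den, computed with the Euclidean algorithm."""
    g_prev, h_prev = 0, 1  # g_{u-2}, h_{u-2}
    g_curr, h_curr = 1, 0  # g_{u-1}, h_{u-1}
    result = []
    while den != 0:
        a, remainder = divmod(num, den)  # next partial quotient
        g_prev, g_curr = g_curr, a * g_curr + g_prev
        h_prev, h_curr = h_curr, a * h_curr + h_prev
        result.append(Fraction(g_curr, h_curr))
        num, den = den, remainder
    return result


def candidate_period(y: int, N: int, n: int) -> int:
    """Largest convergent denominator h of y/N with h < n."""
    return max(c.denominator for c in convergents(y, N) if c.denominator < n)


def is_period(candidate: int, a: int, n: int) -> bool:
    """True if candidate is the smallest exponent e > 0 with a**e ≡ 1 (mod n)."""
    if pow(a, candidate, n) != 1:
        return False
    return all(pow(a, e, n) != 1 for e in range(1, candidate))
```

For n = 21, a = 2 and N = 512, the measurement result y = 85 yields the candidate 6, which is the period. The result y = 171 yields the candidate 3, which fails the check because 171/512 is close to 2/6 = 1/3; in such a case the quantum part has to be repeated.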

3.4.2. Determining the Period by Convergents: $q = 1$

In case

q = 1

, the above algorithm is not applicable. However,

q = 1

⇔

e^{\frac{2 π i}{N} p y} = 1

⇔

\frac{p y}{N} \in ℤ

⇔

p = k \frac{N}{y}

with

k \in ℤ

. Thus, the Algorithm 3 can be used:

Algorithm 3

q = 1

⇔

e^{\frac{2 π i}{N} p y} = 1

⇔

\frac{p y}{N} \in ℤ

⇔

p = k \frac{N}{y}

with k \in ℤ

1. Compute $\frac{N}{y} \in ℚ_{> 0}$ . The result of the measurement is $y \in ℕ$ and $N \in ℕ$ has been chosen ⇒ $\frac{N}{y} \in ℚ_{> 0}$ can be computed;
2. Select $k \in ℕ$ ;
3. Compute $k \frac{N}{y}$ ;
4. If $k \frac{N}{y} \notin ℕ$ , go back to step (2);
5. If $k \frac{N}{y} \geq n$ , go back to step (2);
6. $p = k \frac{N}{y}$ is a candidate for the period p;
7. Check whether p is in fact the period;
8. If p is not the period:
a. If some predefined termination criterion is met: stop;
b. Go back to step (2).

This may yield the period p but does not guarantee it.
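A possible realization of Algorithm 3 is sketched below in Python. Several choices are assumptions made for this illustration and are not prescribed by the text: k is tried in increasing order starting at 1, the search stops as soon as the candidate reaches n (this plays the role of the predefined termination criterion), and the measurement result y = 0 is rejected because N/y is undefined. The function names are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def is_period(a: int, p: int, n: int) -> bool:
    """True if p is the smallest positive exponent with a^p = 1 (mod n)."""
    return pow(a, p, n) == 1 and all(pow(a, d, n) != 1 for d in range(1, p))


def period_for_q_equal_one(y: int, N: int, n: int, a: int) -> Optional[int]:
    """Search k = 1, 2, ... for an integer k*N/y < n that is the period of a mod n."""
    if y <= 0:
        return None  # N/y is undefined for y = 0
    ratio = Fraction(N, y)
    k = 1
    while k * ratio < n:  # candidates grow with k, so stop once n is reached
        candidate = k * ratio
        if candidate.denominator == 1 and is_period(a, int(candidate), n):
            return int(candidate)
        k += 1
    return None
```

With the assumed sample values n = 15, a = 7, N = 256, the measurement results y = 64, 128, and 192 all satisfy q = 1 and all lead to the period 4, for k = 1, 2, and 3, respectively.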

3.5. How the Presented Results Relate

This contribution contains many low-level details. To avoid getting lost in them, this section sketches how the main results contribute to the proof of Shor’s algorithm. Figure 26 at the end of this section gives a schematic overview of these relations.

Figure 26. How the main results of the paper relate.

3.5.1. Applying the Results about Continued Fractions

Determining a divisor and finally a prime factor of a natural number

n \in ℕ

can be reduced to finding the period p of the modular exponentiation function

f (x) = a^{x} m o d n

for an a with

0 < a < n

—see Section 3.2.1 and Note 7.

The quantum part of Shor’s algorithm produces the state

\frac{1}{\sqrt{N A}} \sum_{j = 0}^{A - 1} ω_{N}^{j p y} | y ⟩

from Equation (2). Measuring this state results in a natural number

y \in ℕ

.

The natural number N in Equation (2) must be chosen in advance based on the number n to be factorized: it is chosen as

N = 2^{m}

with

n^{2} < N < 2 n^{2}

(see the introduction of Section 8). This ensures that the arguments needed to evaluate

f (x)

by the quantum part of Shor’s algorithm can be captured as quantum states.
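The choice of N can be made concrete with a few lines of Python. The sketch below is an illustration with a hypothetical function name; it relies on the fact that a power of two strictly between n^2 and 2n^2 exists whenever n^2 is not itself a power of two, which holds for every odd n > 1.

```python
def choose_register_size(n: int) -> tuple[int, int]:
    """Return (m, N) with N = 2**m and n**2 < N < 2 * n**2."""
    m = (n * n).bit_length()  # smallest m with n**2 < 2**m
    N = 1 << m
    if not (n * n < N < 2 * n * n):
        # Happens exactly when n**2 is a power of two (e.g., n = 16).
        raise ValueError("no power of two strictly between n^2 and 2 n^2")
    return m, N
```

For example, n = 15 yields m = 8 and N = 256, and n = 21 yields m = 9 and N = 512.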

Theorem 15 guarantees with probability

P \approx 4 / π^{2}

the existence of a

k \in \{0, \dots, p - 1\}

such that

|\frac{y}{N} - \frac{k}{p}| < \frac{1}{2 p^{2}}

.

Thus, according to the convergent criterion of Legendre’s Theorem (Theorem 14),

k / p

is a convergent of

y / N

.

The proof of Legendre’s convergent criterion (Theorem 14), in turn, is based on the fact that convergents are exactly the best approximations of the second kind (Theorem 11 and Lagrange’s theorem (Theorem 12)).

The proof of Lagrange’s theorem (Theorem 12—each convergent is a best approximation of the second kind) makes use of the recursion theorem (Theorem 1), the sign theorem (Theorem 2), the monotony property of denominators of convergents (Corollary 1), as well as the estimations of the upper bounds of convergents (Lemma 3) and their lower bounds (Lemma 5).

The proof of Theorem 11 (each best approximation of the second kind is a convergent) makes use of the recursion theorem (Theorem 1), the distance theorem (Theorem 8), the computation of the difference of convergents (Corollary 3), the distance of fractions (Note 4), and the estimation of the upper bounds of convergents (Lemma 3).

The estimations of the lower and upper bounds of convergents depend on the distance theorem (Theorem 8), the computation of the difference of convergents (Corollary 3), the monotony property of denominators of convergents (Corollary 1), and on estimations of the size of denominators of convergents (Lemma 1). The estimation of the lower bounds (Lemma 5) makes use of semiconvergents (Definition 4) and their monotony property (Lemma 4) as well as the nesting theorem (Theorem 7).

Remark: Theorem 13, which proves that best approximations of the first kind are convergents or semiconvergents, is not immediately relevant to Shor’s algorithm and may be ignored when focusing on Shor’s algorithm.

3.5.2. Applying Probability Estimations

According to Equation (29) (which is implied by the Born rule), the probability

P (y)

to measure a particular y is

P (y) = \frac{1}{N A} {|\frac{1 - q^{A}}{1 - q}|}^{2}

for

q = e^{i 2 π \frac{p y}{N}} \neq 1

.
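To get a feeling for this formula, the probabilities can be evaluated numerically. The following Python sketch is an illustration only and rests on assumptions that are not taken from the text: the sample values p = 6 and N = 512, the choice A = ⌈N/p⌉ (the number of arguments 0, p, 2p, ... below N), and the value P(y) = A/N for the case q = 1, in which the geometric sum simply equals A. The function names are hypothetical.

```python
import cmath
import math


def measurement_probability(y: int, p: int, N: int, A: int) -> float:
    """P(y) = |(1 - q^A) / (1 - q)|^2 / (N * A) with q = exp(2*pi*i*p*y/N)."""
    phase = (p * y) % N
    if phase == 0:
        return A / N  # q = 1: the sum of A ones has squared modulus A^2
    q = cmath.exp(2j * math.pi * phase / N)
    return abs((1 - q ** A) / (1 - q)) ** 2 / (N * A)


def nearest_integer(numerator: int, denominator: int) -> int:
    """Round numerator/denominator to the nearest integer (ties round up)."""
    return (2 * numerator + denominator) // (2 * denominator)


def probability_of_good_outcomes(p: int, N: int, A: int) -> float:
    """Total probability of the integers y closest to k*N/p for k = 0, ..., p-1."""
    good = {nearest_integer(k * N, p) for k in range(p)}
    return sum(measurement_probability(y, p, N, A) for y in good)


def total_probability(p: int, N: int, A: int) -> float:
    """Sum of P(y) over all y in {0, ..., N-1}; serves as a sanity check."""
    return sum(measurement_probability(y, p, N, A) for y in range(N))
```

For the sample values, `total_probability(6, 512, 86)` is 1 up to rounding error, and `probability_of_good_outcomes(6, 512, 86)` lies above 4/π² ≈ 0.405, in line with the lower bound discussed in this section.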

Equations (37) and (38) show that this probability can be estimated as

|\frac{1 - q^{A}}{1 - q}| \geq |\frac{1 - q^{A - 1}}{1 - q}| - 1 = \frac{|1 - e^{i 2 π \frac{y (A - 1) p}{N}}|}{|1 - e^{i 2 π \frac{y p}{N}}|} - 1

. The latter fraction, in turn, can be estimated by means of Lemma 7 (Secant Length Estimation) in case

2 π \frac{y (A - 1) p}{N}, 2 π \frac{p y}{N} \in [- π, + π]

.

Lemma 9 shows that the latter inclusion is satisfied in case of

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

and

k \in \{0, \dots, p - 1\}

.

Lemma 10 proves that with probability

P (y) \approx 4 / π^{2}

it is, in fact,

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

for

k \in \{0, \dots, p - 1\}

. The proof of this lemma is based on a proper estimation of N (Note 17) which in turn relies on Note 16. Further, it makes use of Corollary 9, which is the summary of the various results of Section 3.2.2.

A simple calculation in the proof of Theorem 15 finally shows that

y \in [k \frac{N}{p} - \frac{1}{2}, k \frac{N}{p} + \frac{1}{2}]

implies

|\frac{y}{N} - \frac{k}{p}| \leq \frac{1}{2 N} < \frac{1}{2 p^{2}}

. Thus, with probability

P (y) \approx 4 / π^{2}

, the convergent criterion of Legendre’s Theorem (Theorem 14) is satisfied.

4. Conclusions and Related Work

The literature analyzing, discussing, and refining Shor’s algorithm [1] is vast. Of course, most text books on quantum computing explain the algorithm too (e.g., [4,5]). All this literature puts a sharp focus on the quantum part of the algorithm and sketches its classical parts at various depths. However, the mathematical treatment of the classical aspects is sketchy, omitting most of the details and leaving them as an exercise for the reader with references to corresponding text books from mathematics such as [6] or [7]. The lecture notes by Preskill [2] go a bit deeper, especially on the estimation of probabilities, but still omit the low-level details; however, the authors of the contribution at hand benefited a lot from the treatment in [2]. It is noted that the authors’ treatment of probability estimations was inspired by unpublished, non-public work to which the authors had access several years ago.

In contrast, the contribution at hand treats in detail the estimation of the probability that Legendre’s Theorem can be applied in Shor’s algorithm. The authors are not aware of any other publication providing these low-level details.

Furthermore, the contribution at hand is a self-contained treatment of continued fractions up to Legendre’s Theorem. All background that is needed to understand this theorem is presented, including all proofs with low-level details step by step.

The authors hope to foster the comprehension of the classical aspects of Shor’s algorithm even at the level of beginners in quantum computing.

Author Contributions

Writing—original draft, F.L.; Writing—review & editing, J.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by the BMWK project PlanQK (01MK20005N).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Shor, P.W. Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer. SIAM J. Comput. 1997, 26, 1484–1509.
2. Preskill, J. Lecture on Quantum Information—Chapter 6. Quantum Algorithms; California Institute of Technology: Pasadena, CA, USA, 2020; Available online: http://theory.caltech.edu/~preskill/ph219/chap6_20_6A.pdf (accessed on 11 July 2022).
3. Shult, E.; Surowski, D. Algebra; Springer: Berlin/Heidelberg, Germany, 2015.
4. Nielsen, M.A.; Chuang, I.L. Quantum Computation and Quantum Information; Cambridge University Press: Cambridge, UK, 2016.
5. Rieffel, E.; Polak, W. Quantum Computing: A Gentle Introduction; The MIT Press: Cambridge, MA, USA, 2011.
6. Hardy, G.H.; Wright, E.M. An Introduction to the Theory of Numbers, 4th ed.; Oxford University Press: New York, NY, USA, 1975.
7. Khinchin, A.Y. Continued Fractions, 3rd ed.; The University of Chicago Press: Chicago, IL, USA, 1964.


Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
