Tutorial

On the Differential Topology of Expressivity of Parameterized Quantum Circuits

Institute of Architecture of Application Systems (IAAS), University of Stuttgart, Universitätsstr. 38, 70569 Stuttgart, Germany
*
Authors to whom correspondence should be addressed.
AppliedMath 2025, 5(3), 121; https://doi.org/10.3390/appliedmath5030121
Submission received: 24 July 2025 / Revised: 26 August 2025 / Accepted: 1 September 2025 / Published: 4 September 2025

Abstract

Parameterized quantum circuits play a key role in quantum computing, so a measure of how suitable such a circuit is for solving a class of problems is needed. One promising measure is the expressivity of a circuit, which is defined in two main variants. The variant in focus of this contribution is the so-called dimensional expressivity, which measures the dimension of the submanifold of states produced by the circuit. Understanding this measure requires considerable background from differential topology, which makes it hard to comprehend. In this article, we provide this background in a vivid as well as pedagogical manner. In particular, the article strives to be self-contained for understanding expressivity: the required mathematical foundations are provided, and examples are given. Also, the literature makes several statements about expressivity, the proofs of which are omitted or only indicated. In this article, we give proofs for key statements about dimensional expressivity, sometimes revealing limits for generalizing them, and we sketch how to proceed in practice to determine this measure.

1. Introduction

Quantum algorithms are unitary operations transforming an initial state into another state that either directly represents the solution or that is a state that is measured, and the measurement result leads to the solution of the problem [1]. Quantum algorithms are typically realized as quantum circuits: a composition of unitary operations at lower dimensions that represent the decomposition of the overall unitary transformation. These lower-dimensional unitary operations are typically the operations supported by a concrete quantum computer. Sometimes, the unitary operations are parameterized: for example, these parameters are angles by which a state must be rotated. By varying the parameters between different executions of the circuit, a solution is iteratively computed.
Such variational quantum algorithms (VQAs) [2] and especially their encompassed parameterized quantum circuits (PQC) (Section 2.1) gain a lot of attention not only because of their appropriateness for today’s available computers (which are noisy [3]) but also because parameterized quantum circuits represent applications in themselves (like quantum neural networks [4,5]).
Whether or not a variational quantum algorithm can compute a solution of the problem addressed depends on the “suitability” of the parameterized quantum circuit it encompasses. Thus, having a measure of “suitability” in this context is of utmost importance. Such a measure is called expressivity of the parameterized quantum circuit and has been defined in two major variants:
  • First, a parameterized quantum circuit associates with each parameter tuple a unitary matrix. The more of the unitary group is covered by the parameterized quantum circuit, the better. This is because if circuits that solve the problem (i.e., unitary matrices) exist, then the unitary matrices generated by the VQA should include them. And the likelihood that this is achieved is higher the more of the unitary group is covered. The coverage of the unitary group is suggested to be assessed by means of how close the generated matrices are to being Haar random [6]. This is what we call the “unitary approach”; it is described at a high level in Section 2.3.1 and in more detail in Section 4.8.
  • The focus of this contribution is on the second variant of expressivity. While the first variant is about capturing a solving circuit, the second variant is about the ability to provide a solution itself. For this purpose, the unitary matrices generated by the parameterized quantum circuit are immediately applied to a fixed initial state, which results in another state, i.e., an element of the unit sphere in some Euclidean space $\mathbb{R}^n$. Any solution of the problem to be solved by the variational quantum algorithm is an element of this unit sphere. Thus, if the set of elements of the unit sphere covered by this process is a significant subset of the unit sphere, the likelihood is high that a solution is reached. The amount of coverage of the unit sphere is assessed by means of the dimension of the submanifold generated by this process [7]. This is what we call the “state approach”; Section 2.3.2 introduces this approach and provides corresponding definitions, while Section 5 gives more details.
The issues with the “state approach” dealt with in this contribution are twofold. First, readers of the corresponding existing quantum computing literature require a lot of background from differential topology. We compile and explain this background in detail and in a vivid as well as pedagogical manner; precise references to the matching mathematical literature are given. Second, the corresponding quantum computing literature makes a number of statements and claims that are often not proven, or whose proofs are only indicated. This makes reading and comprehending the literature difficult. We give detailed proofs for claims made in the context of the “state approach”; in particular, we prove that the state approach produces in many situations locally immersed submanifolds and even embedded submanifolds (Section 5.2), so that speaking about “dimension” is meaningful at all. We also give a counter-example showing that these immersions and embeddings cannot, in general, be extended to global immersions and embeddings (Section 5.3). Finally, we provide a “recipe” for determining how the local submanifolds can be enlarged (Section 5.4).

Structure of the Article

The article is structured as follows: Section 2 motivates the need for a measure of expressivity (Section 2.2) based on the structure and use of variational quantum algorithms (Section 2.1). The two major definitions of expressivity are given and discussed (Section 2.3); other approaches are briefly reviewed (Section 2.3.3). Section 3 proves that parameterized quantum circuits are differentiable maps, recalls the format of such circuits in many practical cases, and comments on the structure of the parameter space in practical situations. These sections, as well as the next section, are a tutorial on the corresponding subjects.
Section 4 is the main pedagogical part: After proving a lemma on the existence of open chains in connected spaces (which is needed in Section 5.3), the concept of differentiable manifolds with boundaries is introduced, as well as some of their properties that are needed in the rest of this paper (Section 4.2). Section 4.3 discusses singularities that require special care in making claims or assumptions in our context. Next, differentiable maps and their differentials are discussed (Section 4.4 and Section 4.5). Properties of maps with constant rank are reviewed, and immersions as well as embeddings are discussed (Section 4.6). Submanifolds (both immersed as well as embedded) are the subject of Section 4.7. In Section 4.8, the notion of the volume of manifolds is motivated by describing the concept of both linear approximations of manifolds themselves (Section 4.8.1) and “classical” volumes of parallelepipeds (Section 4.8.2). Riemannian manifolds and their volume forms are introduced in Section 4.8.3. Curvature is revealed as the origin of non-uniform distribution of point sets on a manifold and the use of the Haar measure as a solution allowing uniformly random distributions (Section 4.8.6). This allows us in Section 4.8.7 to introduce a precise definition of expressivity in the unitary approach.
Section 5 provides proofs supporting claims from the literature about the state approach. This section contains mostly original contributions. It is proven in Section 5.1 that parameterized quantum circuits induce local immersions. This implies that slices of the parameter space are locally mapped to immersed and even embedded submanifolds the dimensions of which are determined by the local rank of the map induced by the parameterized quantum circuit (Section 5.2). A usual technique to expand local properties to global ones is sketched in Section 5.3, and it is shown that this technique fails in our case, i.e., that local embeddings cannot be extended onto connected components. But a high-level proceeding is roughly sketched in Section 5.4 (together with an example) on how to determine “large” areas of the parameter space that are mapped to an embedded submanifold by the parameterized quantum circuit. The contribution ends with a conclusion in Section 6.

2. The Notion of Expressivity

This chapter discusses the basic principles of variational quantum algorithms (Section 2.1), their origin (Section 2.2.1), and the need for a metric to assess the “success probability” of such algorithms (Section 2.2.2). Two approaches for such a metric—i.e., expressivity—are introduced in Section 2.3, and other approaches are briefly reviewed.

2.1. Variational Quantum Algorithms

A Variational Quantum Algorithm (VQA) is a pair consisting of a parameterized quantum circuit (also known as an ansatz) and an optimizer [2,8]. The ansatz prepares a state on a quantum computer that is measured [9]. The measurement result is passed to the optimizer that uses a problem-specific cost function to determine, based on the measured values, new parameters for the ansatz. The goal is to find parameters that produce a quantum state whose measurement result optimizes the cost function.
Figure 1 depicts the structure and main ingredients of a variational quantum algorithm. (1) The parameterized quantum circuit (PQC) $C(p_1, \ldots, p_k)$ is given by a unitary operator $C$ that depends on real parameters $p_1, \ldots, p_k \in \mathbb{R}$. (2) The measurement of the quantum state produced by the quantum circuit is performed by means of problem-specific observables $\{O_1, \ldots, O_m\}$. The result of the measurement is, thus, (3) a tuple of values $(\langle O_1 \rangle, \ldots, \langle O_m \rangle)$, which is exactly the value of a cost function $F$ in the parameters $p_1, \ldots, p_k$:
$$F(p_1, \ldots, p_k) = (\langle O_1 \rangle, \ldots, \langle O_m \rangle) \qquad (1)$$
Thus, a problem is a suitable candidate for being solved on a quantum computer if the cost function cannot be efficiently computed on a classical computer but can be on a quantum computer. Otherwise, a quantum computer would not be needed at all.
(4) Next, an optimizer is used: it computes (5) new parameters $p_1, \ldots, p_k$ that move the value of the cost function closer to an optimum. Any kind of optimizer can be used, e.g., gradient-based or derivative-free [10]. When the new parameters have been computed, they are fed into the parameterized quantum circuit $C(p_1, \ldots, p_k)$, and the loop starts over again (1). The optimizer uses some termination condition, like the number of iterations performed or the amount of improvement achieved by the new parameters. Once the termination criterion is satisfied, the processing stops with the optimal parameters. The quantum circuit is run with the optimal parameters, and measurement of the prepared state in the computational basis represents the solution of the problem at hand.
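The loop described above can be mimicked classically for a toy case. The following is only a minimal sketch under assumptions that are not part of the article: the ansatz is a single RY rotation applied to |0⟩, the only observable is Pauli-Z, the quantum part is simulated with plain arithmetic, and the optimizer is gradient descent with finite differences. The names `ansatz_state`, `cost`, and `optimize` are hypothetical.

```python
import math

def ansatz_state(p):
    """Toy one-parameter circuit (assumed): RY(p) applied to |0>."""
    return (math.cos(p / 2), math.sin(p / 2))

def cost(p):
    """Simulated measurement: expectation value of Pauli-Z in the prepared state."""
    a, b = ansatz_state(p)
    return a * a - b * b

def optimize(p, rate=0.4, eps=1e-4, max_iter=200, tol=1e-10):
    """Gradient descent; stops on iteration count or negligible improvement."""
    value = cost(p)
    for _ in range(max_iter):
        # central finite difference stands in for a gradient estimate
        grad = (cost(p + eps) - cost(p - eps)) / (2 * eps)
        p_new = p - rate * grad
        new_value = cost(p_new)
        if abs(value - new_value) < tol:
            break
        p, value = p_new, new_value
    return p, value

best_p, best_value = optimize(0.3)
```

In this toy setting the cost is cos(p), so the loop should drive the parameter towards π, where the prepared state is |1⟩. On a real device, `cost` would be replaced by repeated circuit executions and measurements.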
A Quantum Neural Network (QNN) is also a parameterized quantum circuit [4]. Its cost function depends on training data in addition to other parameters. Once the optimal parameters have been determined (i.e., the QNN has been trained), the parameterized quantum circuit is run with the optimal parameters on new data input to perform the task of the QNN (e.g., classification).
VQAs have been applied in several domains: Ref. [2] describes applications in multiple areas like finding ground states of molecules, simulating the dynamic evolution of quantum systems themselves, solving linear equation systems (see also [11]), performing classification tasks, or realizing autoencoders. Optimization problems can be addressed, e.g., for energy distribution [12], as can tasks from computational fluid dynamics [13].
Note that the overall processing of determining the optimal parameters iterates between a quantum part performed on a quantum computer and a classical optimization part performed on a classical computer. Such a kind of processing is called hybrid quantum-classical processing, and the corresponding algorithm is called a hybrid quantum-classical algorithm (or just “hybrid” for short). In fact, most quantum algorithms and applications exploiting quantum resources are hybrid [14,15].
Examples of variational quantum algorithms and their use can be found in [16] or [17], for example.

2.2. The Origin of Expressivity

This section provides details about how to assess the suitability of a variational quantum algorithm.

2.2.1. Suitability for NISQ

Variational quantum algorithms have a lot of properties that are of principal interest, e.g., they are universal under certain conditions [18]. Also, some problems are inherently solved by variational quantum algorithms like quantum neural networks [5]. But the main operational aim behind variational quantum algorithms is to achieve an advantage or a utility of quantum computing in the NISQ era (Noisy Intermediate Scale Quantum). This era is roughly defined as the years during which quantum computers are still noisy and have only up to a few hundred qubits [3]. “Noisy” means that the states of these qubits are stable only for a short period of time (“decoherence”), and the gates show small errors (“lack of fidelity”). In this era, implementing quantum applications has a lot of issues to consider [14].
One key consequence of being noisy is that an algorithm must use a QPU only for a short period of time, i.e., less than the decoherence time $T_D$. A variational quantum algorithm, thus, strives towards an ansatz $C(p_1, \ldots, p_k)$ and measurement $M$ via observables $\{O_m\}$ such that the time $T_C$ that the ansatz is executed plus the time $T_M$ that the measurement requires is significantly smaller than the decoherence time:
$$T_C + T_M \ll T_D \qquad (2)$$

2.2.2. Finding Solutions

The other fundamental aim is that the ansatz is capable, in principle, of finding a solution to the problem at hand. This capability can be assessed based on the following:
(i)
The plethora of unitaries that an ansatz can produce;
(ii)
The plethora of states that an ansatz can produce.
The first approach (that we call the “Unitary Approach” in what follows) considers the parameterized quantum circuit $C(p_1, \ldots, p_k)$ as a map $\Omega$ that maps the parameter space $P$ of all possible parameters $p_1, \ldots, p_k$ to the unitary group $U(n)$, based on the fact that each $(p_1, \ldots, p_k) \in P$ results in a unitary map $C(p_1, \ldots, p_k) \in U(n)$ (see [19]).
The second approach (called “State Approach” in the following) fixes an initial state $|s\rangle$ and uses for each $(p_1, \ldots, p_k) \in P$ the unitary map $C(p_1, \ldots, p_k) \in U(n)$ to obtain an element $C(p_1, \ldots, p_k)|s\rangle \in S^{n-1}$; this defines a map $\Lambda$ from the parameter space $P$ to the unit sphere $S^{n-1}$ (see [7]). Figure 2 depicts both approaches.
Then, a measure of the plethora of unitaries $\Omega(P) \subseteq U(n)$ or a measure of the plethora of states $\Lambda(P) \subseteq S^{n-1}$, respectively, is referred to as the expressivity of the ansatz (or of the parameterized quantum circuit, respectively). Thus, in addition to finding an ansatz that satisfies Equation (2), an ansatz should show high expressivity. Finding such an ansatz is considered to be quite difficult (see [9], for example).
Finally, it should be noted that the notion of “state” used here is an abstract one. Such an abstract state is not always related to a state that can be directly realized physically. A reader interested in this subject is referred to [20].

2.3. Defining Expressivity

Expressivity (sometimes also called “expressibility”) of an ansatz is measured, for example, by its ability to generate states that represent the Hilbert space well (see [6]), i.e., to uniformly explore the entire Hilbert space (see [2]). In this section, we define both approaches sketched in the section before more precisely. Note that other definitions exist, but they are often tailored toward specific questions (e.g., see [21]); Section 2.3.3 provides a corresponding brief overview.

2.3.1. The “Unitary Approach”

Let $P \subseteq \mathbb{R}^k$ be a parameter space and let $C$ be the ansatz that depends on the parameters $(p_1, \ldots, p_k) \in P$. The map $\Omega_C$ is defined by $\Omega_C: P \to U(n)$ with $(p_1, \ldots, p_k) \mapsto C(p_1, \ldots, p_k)$. Let $X$ be a problem to be solved by a quantum circuit, and let $S(X) \subseteq U(n)$ be the set of unitary operators that solve the problem $X$, i.e., that produce a state that approximately optimizes the corresponding cost function.
If $\Omega_C(P)$ contains an element of $S(X)$, the ansatz $C$ is called complete for the problem $X$. Part (a) of Figure 3 shows the solutions $S(A)$ and $S(B)$ of two problems $A$ and $B$; the ansatz $C$ is complete for problem $B$ but not complete for problem $A$.
But the solutions $S(X)$ of a problem $X$ are not known in advance. Thus, the completeness of an ansatz cannot be assessed a priori. Consequently, the more of the unitary group $U(n)$ an ansatz covers (intuitively, “the larger” $\Omega_C(P) \subseteq U(n)$ is), the higher the likelihood that it will intersect the set of solutions $S(X)$.
More precisely, based on the fact that the unitary group $U(n)$ is a Riemannian manifold (see Note 10), a means to measure volumes is available on $U(n)$. Roughly, for an ansatz $C$, the number $\mathrm{vol}(\Omega_C(P))$ (i.e., the volume of $\Omega_C(P)$) is called the expressivity of $C$. If $C$ and $\hat{C}$ are two ansatzes, ansatz $C$ is called more expressive than ansatz $\hat{C}$ if $\mathrm{vol}(\Omega_C(P)) > \mathrm{vol}(\Omega_{\hat{C}}(P))$.
Note that this definition is not the definition used in the literature, but it has the purpose of being descriptive: the exact definition of expressivity of an ansatz C based on the “unitary approach” compares its distribution of states from sampling with the collection of Haar-random states (see [4,6]). In Section 4.8, we provide more background on this.

2.3.2. The “State Approach”

This approach is the one we focus on mostly; thus, we define it precisely.
Definition 1.
Let $C$ be an ansatz depending on parameters $p_1, \ldots, p_k$. The set of all parameters $P \subseteq \mathbb{R}^k$ allowed by $C$ is called the parameter space of $C$. Let $|\iota\rangle \in S^{n-1}$ be a fixed chosen state, the so-called initial state. Then, the map $\Lambda_C$ defined by $\Lambda_C: P \to S^{n-1}$ with $(p_1, \ldots, p_k) \mapsto C(p_1, \ldots, p_k)|\iota\rangle$ is called the state map of the parameterized circuit $C$.
$\Lambda_C(P)$ is the set of all states reachable by means of the parameterized circuit $C$. As proven in Section 5.1, $\Lambda_C(P)$ is under practical conditions locally an immersed submanifold (Lemma 16) of $S^{n-1}$, and locally often even an embedded submanifold of $S^{n-1}$ (Lemma 17). The reader is referred to Section 4.2 and Section 4.7 (or to textbooks like [22] or [23], respectively) for the definition of the terms “manifold” and “submanifold” (often, “manifolds” are assumed to be smooth, i.e., of class $C^\infty$, in what follows, which is without loss of generality according to Theorem 1).
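To make the state map concrete, the following sketch evaluates it for a hypothetical one-qubit circuit C(p1, p2) = RZ(p2) RY(p1) with initial state |0⟩. The circuit, the helper names, and the identification of the complex amplitudes with a point of the unit sphere in R^4 (real and imaginary parts listed separately) are illustrative assumptions, not constructions taken from the article.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def ry(p):
    c, s = math.cos(p / 2), math.sin(p / 2)
    return [[c, -s], [s, c]]

def rz(p):
    return [[cmath.exp(-1j * p / 2), 0], [0, cmath.exp(1j * p / 2)]]

def unitary_map(p1, p2):
    """Parameters -> unitary matrix C(p1, p2) = RZ(p2) RY(p1)."""
    return matmul(rz(p2), ry(p1))

def state_map(p1, p2, initial=(1, 0)):
    """Parameters -> C(p1, p2)|iota>, written as a point of R^4."""
    u = unitary_map(p1, p2)
    amps = [u[i][0] * initial[0] + u[i][1] * initial[1] for i in range(2)]
    return [x for a in amps for x in (a.real, a.imag)]

point = state_map(0.7, 0.4)
norm = math.sqrt(sum(x * x for x in point))  # should be 1 up to rounding
```

Every parameter pair yields a point of norm one, so the image of the parameter space lies on the unit sphere, as the definition requires.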
Figure 4 depicts the images of the parameter space $P \subseteq \mathbb{R}^k$ in the unit sphere $S^{n-1}$ produced by two different ansatzes $C$ and $\tilde{C}$. In case a circuit produces a submanifold of $S^{n-1}$ (immersed or embedded; see Section 5), this submanifold is called a circuit manifold, i.e., the circuit manifold is the set of states reachable by means of the parameterized circuit. In the figure, the circuit manifold $\Lambda_C(P)$ has dimension 1, and the circuit manifold $\Lambda_{\tilde{C}}(P)$ is of dimension 2. Intuitively, the dimension of a circuit manifold is a measure of the expressivity of a parameterized circuit [7]: the intuition is that a higher-dimensional circuit manifold covers more of the enclosing manifold and, thus, has a higher chance of meeting the solution of a problem.
But this intuition is misleading: for example (see [23], example 4.20), there is an immersed submanifold $L$ of the torus $T^2$ which is dense in $T^2$, but $\dim L = 1$, i.e., $L$ is arbitrarily close to any point $p \notin L$. Thus, a circuit manifold of dimension 1 may approximate a solution with arbitrary precision (see Figure 5b for a rough indication), while a “small” higher-dimensional circuit manifold may stay “far away” from the solution (see Figure 5a). This means that, in general, dimensional expressivity is not a reliable indicator for the success of an ansatz: an ansatz with a circuit manifold with maximum dimensional expressivity might not solve the underlying problem, while an ansatz with a lower dimensional expressivity might be successful.
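The density phenomenon can be observed numerically. The sketch below assumes that the dense submanifold is the usual winding line with irrational slope on the torus, here the curve t ↦ (t mod 1, √2·t mod 1), and tracks how close its crossings of the circle x = 0 come to a chosen target point on that circle. The function name and the chosen target are hypothetical.

```python
import math

def closest_approach(target_y, n_windings, slope=math.sqrt(2)):
    """Smallest circular distance between target_y and the points where the
    line t -> (t mod 1, slope * t mod 1) crosses x = 0 for t = 1..n_windings."""
    best = 1.0
    for k in range(1, n_windings + 1):
        y = (k * slope) % 1.0
        gap = abs(y - target_y)
        best = min(best, gap, 1.0 - gap)  # distance measured on the circle
    return best

d_short = closest_approach(0.5, 1)
d_medium = closest_approach(0.5, 10)
d_long = closest_approach(0.5, 1000)
```

The distance can only shrink as more windings are included, which illustrates how a one-dimensional image can come arbitrarily close to a prescribed point without the dimension changing.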
However, the dimensional expressivity allows us to identify parameters of a parameterized circuit that are superfluous, i.e., that can be removed without affecting the capability of an ansatz to determine a solution (see [7] for the details). A reduced number of parameters has the advantage of a speedup of the overall processing, e.g., by less optimization work required.
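One way to see how a superfluous parameter shows up is to compute the rank of the Jacobian of the state map numerically. The sketch below is an illustration under our own assumptions and not the procedure of [7]: both circuits are hypothetical one-qubit examples, the Jacobian is approximated by central finite differences, and the rank is obtained by Gaussian elimination with a tolerance.

```python
import cmath
import math

def realify(amps):
    return [x for a in amps for x in (a.real, a.imag)]

def state_rz_ry(p):
    """RZ(p[1]) RY(p[0]) |0>: both parameters change the state independently."""
    c, s = math.cos(p[0] / 2), math.sin(p[0] / 2)
    return realify([cmath.exp(-1j * p[1] / 2) * c, cmath.exp(1j * p[1] / 2) * s])

def state_ry_ry(p):
    """RY(p[1]) RY(p[0]) |0> = RY(p[0] + p[1]) |0>: one parameter is redundant."""
    angle = (p[0] + p[1]) / 2
    return realify([complex(math.cos(angle)), complex(math.sin(angle))])

def jacobian(f, p, eps=1e-6):
    """Finite-difference Jacobian; rows = state components, columns = parameters."""
    cols = []
    for j in range(len(p)):
        up, down = list(p), list(p)
        up[j] += eps
        down[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(f(up), f(down))])
    return [list(row) for row in zip(*cols)]

def rank(rows, tol=1e-6):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    n_cols, rk = len(m[0]), 0
    for c in range(n_cols):
        pivot = max(range(rk, len(m)), key=lambda r: abs(m[r][c]), default=None)
        if pivot is None or abs(m[pivot][c]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][c] / m[rk][c]
            for k in range(c, n_cols):
                m[r][k] -= factor * m[rk][k]
        rk += 1
    return rk

rank_independent = rank(jacobian(state_rz_ry, [0.7, 0.4]))
rank_redundant = rank(jacobian(state_ry_ry, [0.7, 0.4]))
```

A rank below the number of parameters, as for the second circuit, indicates that at least one parameter can be dropped locally without shrinking the set of reachable states.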

2.3.3. Other Approaches and Relations

While the two approaches sketched before are generally applicable, other approaches to assess expressivity have been proposed, which are specific to certain domains. Also, the relation between expressivity and other aspects of a circuit has been investigated.
For example, ref. [24] introduces a measure of expressivity called the covering number, which is derived from statistical learning theory. This number is the minimum number of balls of a given radius that covers the hypothesis space of a parameterized quantum circuit. Here, the hypothesis space is the subset of unitaries made available by using all parameters. An estimation of the covering number is given that shows that this measure of expressivity depends on the number of qubits used by the circuit, its gates, its observables, as well as the error rate of the quantum computer used.
While the estimation of [24] provides an upper bound of the covering number of a parameterized quantum circuit, ref. [25] derives a lower bound of the covering number. This can be used to determine the appropriateness of a given circuit.
Another measure of expressivity is the effective rank introduced in [26]. The effective rank, i.e., the rank of the quantum Fisher information matrix [27] of a circuit, represents the number of independent parameters of a parameterized quantum circuit. It can be used to determine “superfluous” parameters that can be eliminated, thus, making optimization in the corresponding variational quantum algorithm more effective.
Expressivity can also be assessed by the ability of a parameterized quantum circuit to approximate certain classes of functions. For example, ref. [28] investigates quantum neural networks that approximate multivariate polynomials and Hölder smooth functions. Their notion of expressivity is how well a parameterized quantum circuit can approximate such functions. They explicitly construct such circuits and give approximation boundaries in terms of depth and width of the constructed circuit.
Ref. [29] focusses on parameterized quantum circuits that can be represented as a Fourier series. This is the case when the circuit not only depends on parameters that are iteratively modified but also on input data that is reflected in the circuit by matrix exponentials of Hamiltonian operators. The spectrum of the Fourier series then depends on the eigenvalues of these Hamiltonians, and the Fourier coefficients depend on the parameterized unitaries of the circuit. Ref. [29] uses the unitary approach sketched in Section 2.3.1 to measure expressivity but shows that such parameterized quantum circuits suffer from exponentially vanishing Fourier coefficients for growing numbers of qubits. This limits the expressivity of the corresponding circuits.
Using the unitary approach, ref. [30] studies the expressivity of a parameterized quantum circuit and its entangling power in relation to specific classes of gates used within the circuit. The main finding is that the use of so-called quantum switches in a circuit results in better expressivity when the depth of the circuit increases. Also, the corresponding circuits result in higher entanglement power when their depth increases.
The impact of the types of gates used in a parameterized quantum circuit on its expressivity (according to the unitary approach) is also studied in [31]. It turns out that CNOT gates have a decreasing effect on expressivity, while rotational RX, RY, and RZ gates strengthen it: RX has the strongest effect, followed by RY, followed by RZ.
Ref. [32] investigates the impact of the graph structure of a quantum computer—i.e., the graph consisting of the qubits of the quantum computer as nodes and the physical connections between the qubits as edges—on the expressivity of circuits executed on the machine. Based on two different ansatzes of parameterized quantum circuits, their finding is that a graph with a ring structure achieves the highest expressivity (measured via the unitary approach), followed by a linear graph, an any-to-any graph, and a star-shaped graph.
Variational quantum algorithms are subject to so-called barren plateaus: the graph of the cost function C may become flat such that the optimizer can no longer efficiently determine better parameters. This is a significant problem limiting the practical use of variational quantum algorithms. It turns out that high expressivity is one possible source of barren plateaus. Ref. [33] presents a Lie algebraic theory that can be used to assess the advent of barren plateaus. This allows us to determine whether the expressivity must be adapted.

3. Parameterized Quantum Circuits as Differentiable Maps

Expressivity depends on properties of parameterized circuits and their images in $U(n)$ or $S^{n-1}$. In the literature, the proofs of these properties are mostly only sketched. In what follows, these proofs are given in detail.

3.1. Proof of Differentiability

The most fundamental observation is that in many practical situations, a parameterized quantum circuit is a differentiable map (even smooth, i.e., of class $C^\infty$). This can be seen by noting that each unitary can be represented by a set of 1-qubit operations and CNOTs, and by proving that these ingredient operations are smooth.
To begin with, we need the following known fact from linear algebra:
Note 1. 
For each unitary matrix $A$ there exists a Hermitian matrix $H$ such that $A = e^{iH}$.
Proof. 
Each unitary is diagonalizable, i.e., there exists a $T \in U(n)$ such that $A = T\,\mathrm{diag}(e^{i\varphi_1}, \ldots, e^{i\varphi_n})\,T^*$ with real numbers $\varphi_1, \ldots, \varphi_n \in \mathbb{R}$ (see [34] Theorem 18.13). Define the real diagonal matrix $\Phi := \mathrm{diag}(\varphi_1, \ldots, \varphi_n)$ and set $H := T \Phi T^*$. Then, $H$ is Hermitian because $H^* = (T \Phi T^*)^* = (T^*)^* \Phi^* T^* \overset{(a)}{=} T \Phi T^* = H$, where (a) is valid because $\Phi^* = \Phi$ for a real diagonal matrix $\Phi$. It follows that
$$e^{iH} = e^{i T \Phi T^*} \overset{(1)}{=} e^{i T \Phi T^{-1}} \overset{(2)}{=} T e^{i\Phi} T^{-1} = T e^{\mathrm{diag}(i\varphi_1, \ldots, i\varphi_n)} T^{-1} \overset{(3)}{=} T\,\mathrm{diag}(e^{i\varphi_1}, \ldots, e^{i\varphi_n})\,T^{-1} = A$$
Hereby, (1) is valid because $T$ is unitary, i.e., $I = T T^*$; thus, $T^{-1} = T^*$. (2) is valid because $e^{S X S^{-1}} = S e^{X} S^{-1}$ for each $S \in GL(n)$, which is seen as follows (using $(S X S^{-1})^k = S X^k S^{-1}$):
$$e^{S X S^{-1}} = \sum_{k=0}^{\infty} \frac{(S X S^{-1})^k}{k!} = \sum_{k=0}^{\infty} \frac{S X^k S^{-1}}{k!} = S \left( \sum_{k=0}^{\infty} \frac{X^k}{k!} \right) S^{-1} = S e^{X} S^{-1}$$
Finally, (3) is valid because
$$e^{\mathrm{diag}(x_1, \ldots, x_n)} = \sum_{k=0}^{\infty} \frac{\mathrm{diag}(x_1, \ldots, x_n)^k}{k!} = \sum_{k=0}^{\infty} \frac{\mathrm{diag}(x_1^k, \ldots, x_n^k)}{k!} = \mathrm{diag}(e^{x_1}, \ldots, e^{x_n})$$
Thus, each unitary matrix is the matrix exponential of a Hermitian matrix. □
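The construction in the proof can be checked numerically for a concrete unitary. The sketch below uses the Hadamard gate as an example of our own choosing: its eigenvalues are 1 and -1, so the angles are 0 and π, and a diagonalizing matrix T is the rotation by π/8. The matrix exponential is evaluated by a truncated power series starting at k = 0; the helper names are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(x, terms=60):
    """Matrix exponential as the truncated power series sum_{k>=0} x^k / k!."""
    n = len(x)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]  # k = 0 term
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]  # x^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

c, s = math.cos(math.pi / 8), math.sin(math.pi / 8)
t = [[c, -s], [s, c]]                 # diagonalizing unitary T (real rotation)
t_star = [[c, s], [-s, c]]            # T is real, so T^* is its transpose
phi = [[0.0, 0.0], [0.0, math.pi]]    # real diagonal matrix Phi
h = matmul(matmul(t, phi), t_star)    # Hermitian H = T Phi T^*
a = exp_series([[1j * v for v in row] for row in h])  # e^{iH}

r = 1 / math.sqrt(2)
hadamard = [[r, r], [r, -r]]
max_error = max(abs(a[i][j] - hadamard[i][j]) for i in range(2) for j in range(2))
```

Up to rounding, `a` should coincide with the Hadamard matrix, and `h` is real symmetric, hence Hermitian, as the proof states.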
Note 2. 
Each parameterized 1-qubit operation $U: \mathbb{R} \supseteq I \to U(2)$ is smooth.
Proof. 
According to Note 1, each unitary matrix $U$ is a matrix exponential of a Hermitian matrix $H$: $U = e^{iH}$. Especially, for $I \subseteq \mathbb{R}$ and a parameterized 1-qubit operation $U: I \to U(2)$, $p \mapsto U(p)$, there exists a Hermitian matrix $H$ such that $U(p) = e^{ipH}$. Thus, $U(p)$ is a smooth operation. □
If the 1-qubit operation $U$ modifies the $i$-th qubit of an $n$-qubit quantum register (i.e., all other qubits $j \neq i$ are left unchanged), it can be represented as the following tensor product:
$$T(p) = I \otimes \cdots \otimes I \otimes \underbrace{U(p)}_{i} \otimes I \otimes \cdots \otimes I \qquad (3)$$
This is a smooth operation because the coefficients of the corresponding matrices are either constant (i.e., 0 or 1 in the case of the identity matrix $I$) or are the coefficients of the smooth 1-qubit operation $U(p)$. Finally, building the tensor product of matrices is a differentiable operation itself.
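The tensor product representation of a 1-qubit operation acting on one qubit of a register can be written out directly with Kronecker products. The sketch below assumes a 0-based qubit index with qubit 0 as the leftmost tensor factor and uses an RY rotation as the example gate; both choices and the helper names are illustrative.

```python
import math

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def embed(u, i, n):
    """I x ... x U x ... x I with the 2x2 matrix u at position i of n qubits."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    result = [[1.0]]
    for q in range(n):
        result = kron(result, u if q == i else identity)
    return result

def ry(p):
    c, s = math.cos(p / 2), math.sin(p / 2)
    return [[c, -s], [s, c]]

big = embed(ry(0.7), 1, 3)              # RY acting on the middle of 3 qubits
first_column = [row[0] for row in big]  # image of the basis state |000>
```

Every entry of the resulting 8 × 8 matrix is either 0 or an entry of the 2 × 2 gate, which is why smoothness of the small gate carries over to the embedded operator.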
The same argument is used to prove the following:
Note 3. 
The controlled NOT operator $CX[r,s]: \mathbb{H}^n \to \mathbb{H}^n$ with control qubit $r$ and target qubit $s$ on an $n$-qubit quantum register is smooth.
Proof. 
The matrix $CX[r,s]$ is a permutation matrix, i.e., its coefficients are constant: $(CX[r,s])_{ij} \in \{0,1\}$. Thus, the matrix is a smooth operator. □
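That the controlled NOT is a permutation matrix can be seen by building it explicitly. The sketch below assumes 0-based qubit indices with qubit 0 as the most significant bit of the basis state index; this convention and the function name are assumptions for illustration.

```python
def cnot(r, s, n):
    """Matrix of CX[r, s] on n qubits as a nested list of 0/1 entries."""
    dim = 2 ** n
    matrix = [[0] * dim for _ in range(dim)]
    for col in range(dim):
        control_bit = (col >> (n - 1 - r)) & 1
        # flip the target bit if and only if the control bit is 1
        row = col ^ (control_bit << (n - 1 - s))
        matrix[row][col] = 1
    return matrix

cx = cnot(0, 1, 2)
```

Each row and each column contains exactly one entry 1, and none of the entries depends on a parameter.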
A parameterized quantum circuit $C(p_1, \ldots, p_k)$ has a parameter space $P \subseteq \mathbb{R}^k$ (we will discuss properties of the topological structure of the parameter space in Section 3.3), and for each parameter tuple $(p_1, \ldots, p_k) \in P$ the resulting operator is a unitary: $C(p_1, \ldots, p_k) \in U(n)$. Thus, a parameterized quantum circuit is a map
$$C: P \to U(n), \quad (p_1, \ldots, p_k) \mapsto C(p_1, \ldots, p_k) \qquad (4)$$
As a unitary, $C(p_1, \ldots, p_k)$ is composed of (parameterized) 1-qubit operators (in the representation of Equation (3)) and CNOT operators $CX[r,s]$ (see [1], Section 4.5.2). This composition is performed by multiplying the corresponding matrices. But matrix multiplication is smooth: if $(a_{ij})$, $(b_{st})$ are two matrices, then their product is a matrix, the coefficients of which are sums of products of the coefficients of the factor matrices, i.e., $(a_{ij})(b_{st}) = \left(\sum_j a_{ij} b_{jt}\right)_{i,t}$; thus, multiplying matrices is smooth. If the factor matrices of the matrix product are smooth (i.e., their coefficients $a_{ij}$, $b_{st}$ are smooth functions), the result is smooth again. Because the factors of the matrix product resulting in $C(p_1, \ldots, p_k)$ are smooth parameterized 1-qubit operators and smooth CNOTs (see Note 2 and Note 3), $C(p_1, \ldots, p_k)$ is smooth.
Let $(U_{ij}(p_1, \ldots, p_k))_{i,j}$ be the matrix representation of $C(p_1, \ldots, p_k)$. We choose an arbitrary but fixed initial state $|\iota\rangle = |\iota_1, \ldots, \iota_n\rangle \in \mathbb{H}^n$. Then the product $U(p_1, \ldots, p_k)|\iota\rangle = \left(\sum_j U_{ij}(p_1, \ldots, p_k)\,\iota_j\right)_i$ is smooth, i.e., the components of the resulting vector are smooth. This finally proves:
Lemma 1. 
Let $C: P \to U(n)$ be a parameterized quantum circuit with matrix representation $(U_{ij}(p_1, \ldots, p_k))$ and let $|\iota\rangle \in S^{n-1}$ be a fixed initial state. Then, both maps $\Omega_C: P \to U(n)$ with $(p_1, \ldots, p_k) \mapsto C(p_1, \ldots, p_k)$ as well as $\Lambda: P \to S^{n-1}$ with $(p_1, \ldots, p_k) \mapsto U(p_1, \ldots, p_k)|\iota\rangle$ are smooth.
Figure 6 summarizes the ingredients of a parameterized quantum circuit: The circuit has an associated parameter space $P$. The parameterized quantum circuit $C$ itself assigns to each parameter tuple $(p_1, \ldots, p_k) \in P$ a unitary operator $C(p_1, \ldots, p_k) \in U(n)$ (we do not distinguish between the circuit $C$ and its matrix representation $U$). This unitary operator transforms an initial state $|\iota\rangle \in S^{n-1}$ into another state $C(p_1, \ldots, p_k)|\iota\rangle \in S^{n-1}$. Thus, a parameterized quantum circuit $C$ induces a smooth map $\Lambda: P \to S^{n-1}$ that maps its parameter space $P$ into the unit sphere $S^{n-1}$.

3.2. Differentiability in Practice

In addition, an argument from practice can be used to see that a typical parameterized quantum circuit of a variational quantum algorithm is smooth. This is because often (see [2,19]), a typical circuit has the format
$$C: P \to U(n), \qquad (p_1, \ldots, p_k) \mapsto \prod_{j=1}^{k} e^{i p_j H_j} W_j$$
with a set of fixed unitary operators $\{W_j\}$ and Hermitian operators $\{H_j\}$, i.e., $e^{i p_j H_j}$ is a rotation by the angle $p_j$ generated by $H_j$. The latter rotations are smooth, their product with a constant matrix is smooth, and, thus, the overall product is smooth. The parameter space $P$ in this case is a Cartesian product of connected intervals in $\mathbb{R}$, which are the domains of the corresponding angles $p_j$.
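The circuit format above can be tried out numerically. The following is a minimal sketch for a single qubit, not an implementation from the literature: all function names are hypothetical, and it assumes generators with $H_j^2 = I$ (such as Pauli matrices), for which $e^{i p H} = \cos(p)\, I + i \sin(p)\, H$. It builds the ordered product and applies it to a fixed initial state, so that the induced map from parameters to states can be evaluated.

```python
import math

# 2x2 matrices as nested tuples; every name below is illustrative only.
I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
S = 1 / math.sqrt(2)
HAD = ((S, S), (S, -S))  # Hadamard gate, used here as a fixed unitary W_j

def matmul(a, b):
    """Product of two 2x2 matrices: entries are sums of products of entries."""
    return tuple(
        tuple(sum(a[i][j] * b[j][t] for j in range(2)) for t in range(2))
        for i in range(2)
    )

def rotation(p, h):
    """exp(i p H), assuming H @ H = I: cos(p) I + i sin(p) H."""
    c, s = math.cos(p), math.sin(p)
    return tuple(
        tuple(c * I2[i][j] + 1j * s * h[i][j] for j in range(2))
        for i in range(2)
    )

def circuit(params, generators, fixed):
    """Ordered product over j of exp(i p_j H_j) W_j."""
    u = I2
    for p, h, w in zip(params, generators, fixed):
        u = matmul(u, matmul(rotation(p, h), w))
    return u

def apply(u, state):
    """Apply a 2x2 matrix to a state vector."""
    return tuple(sum(u[i][j] * state[j] for j in range(2)) for i in range(2))

def norm(state):
    return math.sqrt(sum(abs(c) ** 2 for c in state))

iota = (1, 0)  # fixed initial state
state = apply(circuit((0.3, 1.1), (X, Z), (HAD, I2)), iota)
print(norm(state))  # equals 1 up to rounding, since all factors are unitary
```

Each entry of the resulting state is a polynomial in $\cos p_j$ and $\sin p_j$, which is the concrete reason why the map is smooth in this special case.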

3.3. Differentiability on Parameter Spaces

So far, we have not commented on the structure of a parameter space $P$ (except for the special case of Section 3.2). Typically, a parameter space is not an arbitrary set in $\mathbb{R}^k$ but is assumed to have suitable properties. In particular, $P$ must support the notion of maps that are differentiable on $P$ as their domain (see Section 4.4). For example, for an arbitrary set $P \subseteq \mathbb{R}^k$ a map $f: P \to \mathbb{R}^n$ is differentiable iff a function $F: U \to \mathbb{R}^n$ with $P \subseteq U \subseteq_{\text{open}} \mathbb{R}^k$ exists that is differentiable in the ordinary sense and that fulfills $F|_P = f$, i.e., $F$ restricted to $P$ is identical to $f$ (in this case, $F$ is called a differentiable extension of $f$; see Definition 5).
As another example, if P is a differentiable manifold (with or without boundary), differentiability is defined in every point x P by means of an appropriate chart around x (see Section 4.4 and [23]).
In circuits like the ones in Equation (5), P is a Cartesian product of open intervals. This can slightly be generalized as follows (see Section 4.2): P = P 1 × × P k R k with P i M f R being a connected manifold (with or without boundary) of dimension one, d i m P i = 1 . Such a manifold P i is an open, closed, or semi-closed interval. Thus, P is a hypercube where some of its faces may belong to the hypercube; note that not all of the intervals P i may include boundary points to avoid singularities—see Section 4.3. Thus, in general, P is a manifold with boundary (Section 4.2).

4. Facts from (Differential) Topology

In this Section, we prove that two points in a connected topological space are connected by an open chain (Lemma 2). The definition of manifolds with boundaries is given, as well as the differentiability of maps. The concept of singularities, already referred to in Section 3.3, is presented (Section 4.3). Differentiability of maps, their differentials, and properties of maps of constant rank are summarized. Also, we recall the rank theorem, which implies that immersions are locally injective, and we recall a condition under which injective immersions are embeddings. Submanifolds are introduced, and their properties are discussed. Section 4.8 introduces the volume form on manifolds as well as the Haar measure; this allows us to present in Section 4.8.7 more details about the “unitary approach” (as sketched informally in Section 2.3.1).

4.1. A Property of Connected Spaces

In this Section, we recall the notion of connectedness of a topological space (refer to the textbook [35] for details). We prove a property of connected spaces (Lemma 2) that we need later and that we did not find in the English literature (but in the German textbook [36]).
Definition 2. 
A topological space $X$ is called connected if for any two non-empty open sets $O_1, O_2 \subseteq_{\text{open}} X$, $O_1, O_2 \neq \emptyset$, with $X = O_1 \cup O_2$ it follows that $O_1 \cap O_2 \neq \emptyset$.
In other words, a topological space is connected if it cannot be split into two disjoint, non-empty open sets. The following is well-known (e.g., [35], Proposition 9.8):
Note 4. 
Let $X$ be a connected topological space and let $Z \subseteq X$ be both open and closed. Then $Z = \emptyset$ or $Z = X$.
Definition 3. 
A family of open sets $\mathcal{U} = \{U_i\}_{i \in I}$, $U_i \subseteq_{\text{open}} X$, with $\bigcup_{i \in I} U_i = X$ is called an open cover of $X$.
The proof of the following lemma is from [36].
Lemma 2. 
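Let $X$ be a topological space. $X$ is connected $\Leftrightarrow$ for each open cover $\mathcal{U}$ of $X$ and two points $a, b \in X$ there are $\{U_1, \ldots, U_n\} \subseteq \mathcal{U}$ with:
(i) $a \in U_1$, $a \notin U_i$ for $i \neq 1$
(ii) $b \in U_n$, $b \notin U_i$ for $i \neq n$
(iii) $U_i \cap U_j \neq \emptyset \Leftrightarrow |i - j| \le 1$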
A family $\{U_1, \ldots, U_n\} \subseteq \mathcal{U}$ of sets from an open cover $\mathcal{U}$ with the properties (i), (ii), and (iii) is called an open chain connecting the points $a, b \in X$ (see Figure 7).
Proof. 
“$\Leftarrow$” Let $X$ be disconnected. We must show that there is an open cover of $X$ that does not satisfy conditions (i), (ii), and (iii). Because $X$ is not connected, we find open sets $O_1, O_2 \subseteq_{\text{open}} X$ with $O_1, O_2 \neq \emptyset$, $O_1 \cap O_2 = \emptyset$, and $O_1 \cup O_2 = X$. Choose $a \in O_1$ and $b \in O_2$. The open cover $\mathcal{U} = \{O_1, O_2\}$ satisfies (i) and (ii) but not (iii).
“$\Rightarrow$” Let $X$ be connected and let $\mathcal{U}$ be an open cover of $X$. Two points $a, b \in X$ are called $\mathcal{U}$-connected (in symbols $a \sim b$) if there exists $\{U_1, \ldots, U_n\} \subseteq \mathcal{U}$ with the properties (i), (ii), and (iii).
Then, “$\sim$” is an equivalence relation: reflexivity and symmetry are obvious. Transitivity is seen as follows: for $a, b, c \in X$ with $a \sim b$ and $b \sim c$ choose a chain of open sets $\{U_1, \ldots, U_n\} \subseteq \mathcal{U}$ connecting $a, b$ and choose a chain of open sets $\{V_1, \ldots, V_m\} \subseteq \mathcal{U}$ connecting $b, c$.
Set $r := \min\{ i \in \{1, \ldots, n\} \mid \exists\, 1 \le j \le m: U_i \cap V_j \neq \emptyset \}$ and $s := \max\{ j \in \{1, \ldots, m\} \mid U_r \cap V_j \neq \emptyset \}$. Then, $\{U_1, \ldots, U_r, V_s, \ldots, V_m\} \subseteq \mathcal{U}$ is a chain of open sets with properties (i), (ii), (iii) connecting $a, c$, i.e., $a \sim c$.
Let $[x] \subseteq X$ be a $\sim$-equivalence class. If we show that $[x] = X$ then we are done. For $y \in [x]$ and a chain of open sets $\{U_1, \ldots, U_n\} \subseteq \mathcal{U}$ connecting $x, y$ it is $U_n \subseteq_{\text{open}} X$ by definition, and it is $U_n \subseteq [x]$ because each $z \in U_n$ is $\mathcal{U}$-connected with $x$. Thus, $[x] \subseteq_{\text{open}} X$. This implies that $X \setminus [x] = \bigcup_{y \notin [x]} [y] \subseteq_{\text{open}} X$, thus $[x] \subseteq_{\text{closed}} X$.
Thus, $[x] \subseteq X$ is open and closed in $X$, but $[x] \neq \emptyset$. Note 4 implies $[x] = X$. □
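For a finite cover of a subset of the real line by open intervals, an open chain can also be found computationally. The sketch below does not follow the equivalence-class argument of the proof; it swaps in a breadth-first search on the "intersection graph" of the cover, started simultaneously from all sets containing $a$. A shortest path found this way satisfies (i), (ii), and (iii), because any violation would allow a shorter path. The function names and the representation of intervals as pairs `(lo, hi)` are assumptions made for illustration.

```python
from collections import deque

def intersects(u, v):
    """Two open intervals (lo, hi) intersect iff max of lows < min of highs."""
    return max(u[0], v[0]) < min(u[1], v[1])

def contains(u, x):
    """Membership of x in the open interval u = (lo, hi)."""
    return u[0] < x < u[1]

def open_chain(cover, a, b):
    """Return a shortest chain of cover sets from a to b, or None if none exists."""
    starts = [i for i, u in enumerate(cover) if contains(u, a)]
    parent = {i: None for i in starts}
    queue = deque(starts)
    while queue:
        i = queue.popleft()
        if contains(cover[i], b):
            chain = []
            while i is not None:  # walk back to a set containing a
                chain.append(cover[i])
                i = parent[i]
            return chain[::-1]
        for j, v in enumerate(cover):
            if j not in parent and intersects(cover[i], v):
                parent[j] = i
                queue.append(j)
    return None

# An open cover of the connected space ]0, 1[ (with one redundant interval).
cover = [(0.0, 0.3), (0.2, 0.5), (0.4, 0.7), (0.6, 1.0), (0.1, 0.25)]
print(open_chain(cover, 0.05, 0.9))

# A disconnected space ]0, 1[ u ]2, 3[ covered by its two components: no chain.
print(open_chain([(0.0, 1.0), (2.0, 3.0)], 0.5, 2.5))
```

The second call mirrors the first half of the proof: for a disconnected space there is a cover in which no chain connects a point of one component with a point of the other.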

4.2. Differentiable Manifolds

The definition of a manifold with boundary is given (the seminal textbook [37] provides many details and proofs, and [23] is a more modern treatment of the subject). For this purpose, we need the definition of the n-dimensional real half-space $\mathbb{R}^n_+ := \{x \in \mathbb{R}^n \mid x_n \ge 0\}$; its boundary is $\partial\mathbb{R}^n_+ = \{x \in \mathbb{R}^n \mid x_n = 0\} \cong \mathbb{R}^{n-1}$.
Note that in the context of differentiable manifolds, topological spaces are considered to be Hausdorff spaces and second countable. A Hausdorff space requires that any two different points of the space have disjoint neighborhoods; the set of all neighborhoods of a point $p$ is denoted by $\mathcal{U}_p$. A space is second countable if it has a countable basis, i.e., every open set of the space is the union of a subset of the basis. See [38] for the detailed definitions of these terms.
Definition 4. 
Let $M$ be a Hausdorff and second countable topological space. If for each point $x \in M$ there exists an open neighborhood $U \subseteq M$ and a homeomorphism $\varphi: U \to \varphi(U) \subseteq_{\text{open}} \mathbb{R}^n_+$, then $M$ is called a topological manifold with boundary of dimension $n$ (in symbols: $\dim M = n$). The pair $(U, \varphi)$ is called a chart of $M$ around $x$. A set of charts $\mathcal{A} = \{(U_i, \varphi_i) \mid i \in I\}$ with $\bigcup_{i \in I} U_i = M$ is called an atlas of $M$.
A point $x \in M$ with $\varphi(U) \subseteq_{\text{open}} \mathbb{R}^n$ is called an interior point, and a point $x \in M$ with $\varphi(x) \in \partial\mathbb{R}^n_+$ is called a boundary point. The set of boundary points of $M$ is called the boundary of $M$, denoted by $\partial M$.
For two intersecting charts $(U_i, \varphi_i), (U_j, \varphi_j) \in \mathcal{A}$, i.e., charts with $U_i \cap U_j \neq \emptyset$, the map $\varphi_i \circ \varphi_j^{-1}: \varphi_j(U_i \cap U_j) \to \varphi_i(U_i \cap U_j)$ is called the transition function between the charts.
$M$ is called a differentiable manifold of class $C^r$ ($C^r$-manifold for short) if all transition functions are differentiable of class $C^r$; the corresponding atlas $\mathcal{A}$ is called a $C^r$-atlas. For $r = \infty$, the manifold (and the atlas) is called smooth.
In Figure 8  p and q are interior points while r is a boundary point.
Textbooks on differential topology (e.g., [23]) or on differential geometry (e.g., [39]) contain many examples of differentiable manifolds as well as corresponding atlases. Standard examples include n-dimensional spheres in the (n + 1)-dimensional Euclidean space $\mathbb{R}^{n+1}$, a torus, and graphs of differentiable functions (see Lemma 9), which then include curves and surfaces in $\mathbb{R}^3$. In particular, the Euclidean space $\mathbb{R}^n$ is a differentiable manifold (see next).
A $C^r$-manifold may have several $C^r$-atlases. For example, both $\mathcal{A}_1 = \{(\mathbb{R}^k, id)\}$ as well as $\mathcal{A}_2 = \{(\mathbb{R}^k_{x_k > -1}, id), (\mathbb{R}^k_{x_k < 1}, id)\}$ are $C^r$-atlases (for any $r$) of $\mathbb{R}^k$, and so is $\mathcal{A}_1 \cup \mathcal{A}_2$. In general, when adding a chart $(U, \varphi)$ to a given $C^r$-atlas $\mathcal{A}$ and the resulting atlas $\mathcal{A} \cup \{(U, \varphi)\}$ is again a $C^r$-atlas, $(U, \varphi)$ is said to be compatible with $\mathcal{A}$. Adding all compatible charts to $\mathcal{A}$ results in the (unique) maximal atlas $\bar{\mathcal{A}}$ (that contains $\mathcal{A}$). A maximal $C^r$-atlas is called a $C^r$-differentiable structure of the corresponding manifold. Any $C^r$-atlas $\mathcal{A}$ determines a unique $C^r$-differentiable structure (see [23], Proposition 1.17). Thus, we can assume that the atlases of a manifold are differentiable structures.
The following is often used, and its proof can be found in [37].
Lemma 3. 
Let $M$ be a $C^r$-manifold with boundary, $\dim M = n$. Then, $\partial M \subseteq_{\text{closed}} M$ and $\partial M$ is a $C^r$-manifold without boundary, $\dim \partial M = n - 1$.
Also, it is well-known that any open subset of a manifold is again a manifold:
Note 5. 
(a) $\mathbb{R}^k$ is a smooth manifold without boundary.
(b) For a $C^r$-manifold $M$ any open subset $S \subseteq_{\text{open}} M$ is a $C^r$-manifold.
(c) The interior $M \setminus \partial M$ of a manifold with boundary is a manifold without boundary.
Proof. 
(a) $\mathcal{A} = \{(\mathbb{R}^k, id)\}$ is a smooth atlas that shows that $\mathbb{R}^k$ is a manifold without boundary.
(b) If $(U, \varphi)$ is a chart of $M$ around $x \in S$, then $x \in U \cap S$ and $\varphi|_{U \cap S}$ is a homeomorphism. Thus, $(U \cap S, \varphi|_{U \cap S})$ is a chart of $S$ around $x$. Also, restrictions of transition functions are of the same differentiability class as the original functions.
(c) It is $M \setminus \partial M \subseteq_{\text{open}} M$ and, thus, the claim follows from (b). □
There are several ways in which manifolds can be constructed from existing manifolds, e.g., by building their sums, quotients, and products (see [23]). In our context, the product of manifolds is of interest: if $\{M_1, \ldots, M_k\}$ are $C^r$-manifolds (without boundary) and $\dim M_i = n_i$, then their Cartesian product $M_1 \times \cdots \times M_k$ is another $C^r$-manifold (without boundary). For example,
  • The unit sphere $S^1 \subseteq \mathbb{R}^2$ is a smooth manifold. Thus, $T^n := S^1 \times \cdots \times S^1$ (the product of $n$ copies of $S^1$) is a new smooth manifold called an n-dimensional torus.
  • An open interval $]a_i, b_i[ \subseteq_{\text{open}} \mathbb{R}$ (for $1 \le i \le n$) is a smooth manifold. Thus, the open n-dimensional cuboid $Q^n := ]a_1, b_1[ \times \cdots \times ]a_n, b_n[ \subseteq \mathbb{R}^n$ is a smooth manifold.
For manifolds with boundaries, building their products is a bit less straightforward. In general, there is an extensive theory behind this (see [40]). In our context, the following facts suffice:
  • Let $\{M_1, \ldots, M_k\}$ be $C^r$-manifolds (without boundary), $\dim M_i = n_i$, and let $N$ be a $C^r$-manifold with boundary, $\dim N = n$. Then, $M_1 \times \cdots \times M_k \times N$ is a manifold with boundary, $\dim(M_1 \times \cdots \times M_k \times N) = n_1 + \cdots + n_k + n$, and $\partial(M_1 \times \cdots \times M_k \times N) = M_1 \times \cdots \times M_k \times \partial N$ (see [23], Proposition 1.45).
  • Thus, with $]a_i, b_i[ \subseteq_{\text{open}} \mathbb{R}$ for $1 \le i \le n$ and $[a, b] \subseteq_{\text{closed}} \mathbb{R}$, the product $Q := ]a_1, b_1[ \times \cdots \times ]a_n, b_n[ \times [a, b] \subseteq \mathbb{R}^{n+1}$ is a manifold with boundary $\partial Q = \left(]a_1, b_1[ \times \cdots \times ]a_n, b_n[ \times \{a\}\right) \cup \left(]a_1, b_1[ \times \cdots \times ]a_n, b_n[ \times \{b\}\right)$ (for an example, see Figure 9).

4.3. Singularities

In Figure 9, the boundary of the manifold Q consists of the left and right edges without their endpoints. This is an implication of the fact that a boundary of a manifold with boundary is a manifold without boundary (see Lemma 3). If the endpoints of the left or right edges were included, they would become the boundary of the manifolds consisting of the left or right edges: contradiction.
This is a general phenomenon: manifolds exclude singularities, i.e., non-differentiable structures. A vertex of a rectangle is an example of such a singularity: Part (a) in Figure 10 depicts the rectangle $Q$ and one of its vertices $x$. Part (b) focuses on a neighborhood of $x$ in $Q$. Finally, part (c) extracts the boundary of this neighborhood, moves $x$ to the origin, and rotates the boundary by 45°; movements and rotations are smooth maps, i.e., the resulting graph in (c) is diffeomorphic to the part of the boundary in (b). Obviously, the graph in (c) is the graph of the absolute value function. Assume that there is a chart $(U, \varphi)$ of the boundary around $x$ with $\varphi(x) = 0$. Then $\varphi$ is a diffeomorphism (see Note 6 below); thus, $\varphi^{-1}$ is differentiable, but $\varphi^{-1}$ is the absolute value function, which is known to be not differentiable at 0. This contradiction shows that $x$ is a singularity, i.e., vertices must not be part of a rectangular “shape” in order for it to be a manifold.
Many other kinds of singularities exist. For example (see Figure 11): Part (a) of the figure shows a “cusp”, which is a curve that is not differentiable at the point x . An intersection at point y in part (b) is not differentiable. Also, an edge of a cuboid (see part (c) of the figure) consists of nodes, each of which is a singularity.
The intersection point in part (b) is already a topological singularity; i.e., no reference to differentiability is needed to recognize this: Any point different from $y$ is contained in a neighborhood that is homeomorphic to an open interval in $\mathbb{R}$. Thus, if a chart $(U, \varphi)$ around $y$ existed, $U$ could be chosen to be connected. Then, $\varphi(U) \subseteq \mathbb{R}$ is connected. $U \setminus \{y\}$ consists of four connected components. Deleting a single point from a connected subset of the real line results in two connected components, i.e., $\varphi(U) \setminus \{\varphi(y)\} = \varphi(U \setminus \{y\})$ consists of two connected components. This is a contradiction because $\varphi(U \setminus \{y\})$ consists of four connected components.
Many of the results achieved in differential topology of manifolds are not valid in the presence of singularities. This is why special care must be taken when claiming that geometric objects are manifolds (with or without boundary): it must be proven that they are manifolds to avoid singularities. Singularities are extensively studied (see [41,42], or [43] for example).

4.4. Differentiable Maps

In Figure 8, χ ( W ) is not an open set in R n ; thus, the ordinary definition of differentiability (that is typically defined for open sets) does not apply. Consequently, the notion of differentiability is extended to functions with a domain that is an arbitrary subset of R n .
Definition 5. 
Let $A \subseteq \mathbb{R}^n$ be an arbitrary set and let $f: A \to \mathbb{R}^k$ be a map. $f$ is called differentiable of class $C^r$ iff a map $F: U \to \mathbb{R}^k$ with $A \subseteq U \subseteq_{\text{open}} \mathbb{R}^n$ exists that is differentiable of class $C^r$ in the ordinary sense and that fulfills $F|_A = f$, i.e., $F$ restricted to $A$ is identical to $f$. $F$ is called a differentiable $C^r$-extension of $f$.
For example, f : [ 0,1 ] R 2 , x x 2 is a C -map with F : R R 2 , x x 2 being a differentiable C -extension of f .
Based on this definition, we can define differentiable maps between manifolds (see also Figure 12):
Definition 6. 
Let $M$ and $N$ be two $C^r$-manifolds (with or without boundary) with $\dim M = n$ and $\dim N = k$. A map $f: M \to N$ is said to be differentiable of class $C^r$ iff for every $p \in M$ there exist a chart $(U, \varphi)$ of $M$ around $p$ and a chart $(V, \psi)$ of $N$ around $f(p)$ with $f(U) \subseteq V$, such that $\psi \circ f \circ \varphi^{-1}: \varphi(U) \to \psi(V)$ is differentiable of class $C^r$.
Recall that R k is a manifold with the atlas A = { ( R k , i d ) } . Thus, the definition before covers the definition of differentiability of maps f : M R k .
Maps that maintain the differentiable structure of manifolds are of special interest:
Definition 7. 
Let  M  and  N  be two  C r -manifolds. A map  f : M N  is said to be a  C r -diffeomorphism iff  f  is differentiable of class  C r , is bijective, and has an inverse that is also differentiable of class  C r . Furthermore, the manifolds  M  and  N  are called  C r -diffeomorphic.
For example, $f: \mathbb{R} \to \mathbb{R}^2$, $x \mapsto (x, x^3)$ is a $C^\infty$-map. According to Lemma 9, $N = \{(x, y) \mid y = x^3\}$ is a $C^\infty$-manifold and so is $\mathbb{R}$ as a Euclidean space. $f: \mathbb{R} \to N$ is bijective with $g: N \to \mathbb{R}$, $(x, y) \mapsto x$ as inverse map, and $g$ is also of class $C^\infty$. Thus, $f: \mathbb{R} \to N$ is a $C^\infty$-diffeomorphism.
Diffeomorphic manifolds cannot be distinguished based on differential topological properties. A diffeomorphism in differential topology plays the same role as homeomorphisms in general topology or isomorphisms in algebra.
Next, we show that a chart is a diffeomorphism; this is a well-known fact but the proof is instructive:
Note 6. 
Let $M$ be a $C^r$-manifold (with or without boundary) with atlas $\mathcal{A}$ and let $(U, \chi) \in \mathcal{A}$ be a chart. Then, $\chi: U \to \chi(U)$ is a $C^r$-diffeomorphism.
Proof. 
First, we have to show that $\chi$ is of class $C^r$; i.e., we have to show that $\psi \circ \chi \circ \varphi^{-1}: \varphi(W) \to \psi(V)$ is of class $C^r$ for a chart $(W, \varphi)$ of $M$ and a chart $(V, \psi)$ of $\mathbb{R}^k$. Choose $(W, \varphi) = (U, \chi)$ as a chart of $M$ and $(V, \psi) = (\mathbb{R}^k, id)$ as a chart of $\mathbb{R}^k$. Then $\psi \circ \chi \circ \varphi^{-1} = id \circ \chi \circ \chi^{-1} = id: \chi(U) \to \mathbb{R}^k$, which is of class $C^r$.
$\chi$ is a homeomorphism, i.e., $\chi$ is bijective. Similar to before, it is seen that $\chi^{-1}$ is of class $C^r$. □
The general question is about the relevance of the differentiability class $C^r$. A famous theorem by Whitney [44] proves that studying $C^\infty$-manifolds suffices (a detailed proof of this theorem can be found in [37], 2–2.10). Because of this theorem, the restriction to smooth manifolds (i.e., $C^\infty$-manifolds) is justified:
Theorem 1. 
Let $1 \le r \le \infty$. Then: Every $C^r$-manifold is $C^r$-diffeomorphic to a $C^\infty$-manifold.

4.5. Differential of a Map

A point p M of a differentiable manifold M (with or without boundary) is associated with its tangent space  T p M (see Figure 13). This tangent space can be imagined as the set of all tangent vectors to M through p ; the precise definition is much more subtle and complex (see [23], chapter 3) but for our purpose this descriptive idea suffices.
$T_pM$ is a vector space, and if $M$ has dimension $m$ it is $\dim T_pM = m$ ([23] Proposition 3.12). The disjoint union of the tangent spaces of all points $p \in M$ is referred to as the tangent bundle $TM$ of $M$: $TM = \bigsqcup_{p \in M} T_pM$. With $\dim M = m$, $TM$ is a differentiable manifold with $\dim TM = 2m$ ([23] Proposition 3.18).
If f : M N is a differentiable map between two differentiable manifolds, then the differential  d f p of f at p M is a linear map d f p : T p M T f ( p ) N (see Figure 13). As before, the precise definition is quite complex ([23], chapter 3), but, again, a vague intuition suffices for our purpose, especially because the differential d f p corresponds to the Jacobian matrix of f ([23], p. 61 ff). We will need the latter representation of the differential of a map because it allows us to compute the rank of the differential as the rank of the Jacobian matrix.
In case $M = \mathbb{R}^m$ and $N = \mathbb{R}^n$ are “just” Euclidean spaces, $M$ and $N$ are smooth manifolds (see Note 5a), and their tangent spaces are the very same Euclidean spaces, i.e., $T_x\mathbb{R}^m = \mathbb{R}^m$ and $T_y\mathbb{R}^n = \mathbb{R}^n$ for any points $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$ ([23] Proposition 3.13). The differential $df_p: T_pM \to T_{f(p)}N$ of a map $f: M \to N$ becomes the total derivative $Df(p) = \left(\partial f_i(p) / \partial x_j\right)$ ([23] Proposition C.3); thus, for $v \in T_pM$ it is $df_p(v) = Df(p)(v) = D_v f(p)$, where $D_v f(p)$ is the directional derivative of $f$ in the direction $v$.
For example, $f: \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x^2, y^3)$ is a $C^\infty$-map. Its differential $df_{(x,y)}: \mathbb{R}^2 \to \mathbb{R}^2$ is $df_{(x,y)} = \begin{pmatrix} 2x & 0 \\ 0 & 3y^2 \end{pmatrix}$. For $(0, 1) \in T_{(x,y)}\mathbb{R}^2$ it is $df_{(x,y)} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 2x & 0 \\ 0 & 3y^2 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 3y^2 \end{pmatrix}$.
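The identification of the differential with the directional derivative can be checked numerically for this example. The following sketch (function names are illustrative, and the central-difference step `h` is an arbitrary choice) compares the Jacobian applied to $v = (0, 1)$ with a finite-difference approximation of $D_v f(p)$ at the point $p = (1.5, 2)$.

```python
def f(x, y):
    """The example map (x, y) -> (x^2, y^3)."""
    return (x ** 2, y ** 3)

def jacobian(x, y):
    """Matrix of partial derivatives of f at (x, y)."""
    return ((2 * x, 0.0), (0.0, 3 * y ** 2))

def apply_differential(x, y, v):
    """d f_(x,y)(v) computed as Jacobian times v."""
    j = jacobian(x, y)
    return tuple(j[i][0] * v[0] + j[i][1] * v[1] for i in range(2))

def directional_derivative(x, y, v, h=1e-6):
    """Central-difference approximation of D_v f at (x, y)."""
    fp = f(x + h * v[0], y + h * v[1])
    fm = f(x - h * v[0], y - h * v[1])
    return tuple((fp[i] - fm[i]) / (2 * h) for i in range(2))

exact = apply_differential(1.5, 2.0, (0.0, 1.0))       # (0, 3 * y^2) at y = 2
approx = directional_derivative(1.5, 2.0, (0.0, 1.0))  # agrees up to O(h^2)
print(exact, approx)
```

The finite-difference value is only an approximation; the Jacobian form is the one used later to compute ranks.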

4.6. Maps of Constant Rank

For each linear map $L: V \to W$ between vector spaces $V$ and $W$ the rank of $L$ is defined as the dimension of the image of $L$, i.e., $\mathrm{rank}\, L := \dim \mathrm{img}\, L$. Note that in practice, the rank of a linear map is determined as the number of linearly independent columns of the matrix representing the map $L$ (see [34], Section 5.3, for more details). Since the image of $L$ is a subspace of $W$, it is always $\mathrm{rank}\, L \le \dim W$. According to the Dimension Formula of Linear Algebra (see [34] Theorem 10.9) it is $\mathrm{rank}\, L + \dim \ker L = \dim V$; thus, $\mathrm{rank}\, L = \dim V - \dim \ker L \le \dim V$. This proves the following:
Note 7. 
Let $L: V \to W$ be a linear map. Then: $\mathrm{rank}\, L \le \min\{\dim V, \dim W\}$.
The rank of differentiable maps f : M N in p M is the rank of the linear map d f p . It is key to the study of local and global properties of differentiable functions f .
Definition 8. 
Let  f : M N  be a differentiable map between the two differentiable manifolds M and N. The rank of  f  at  p M  is the rank of its differential  d f p : T p M T f ( p ) N , in symbols  r a n k p f . If  f  has the same rank  r  at every point  p M  then  f  is said to have constant rank, in symbols  r a n k f = r .
According to the end of Section 4.5, the map $f: \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x^2, y^3)$ has the differential $df_{(x,y)} = \begin{pmatrix} 2x & 0 \\ 0 & 3y^2 \end{pmatrix}$. Hence, $df_{(0,0)} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, i.e., it is $\mathrm{rank}_{(0,0)} f = 0$. For $x \neq 0$ and $y \neq 0$ it is $2x \neq 0$ and $3y^2 \neq 0$, i.e., $\mathrm{rank}_{(x,y)} f = 2$. For $x = 0$ but $y \neq 0$ it is $2x = 0$ and $3y^2 \neq 0$, i.e., $\mathrm{rank}_{(0,y)} f = 1$ (and, symmetrically, $\mathrm{rank}_{(x,0)} f = 1$ for $x \neq 0$). For $(x, y) \in U := \{(x, y) \mid x > 0 \wedge y > 0\} \subseteq_{\text{open}} \mathbb{R}^2$ it is $\mathrm{rank}_{(x,y)} f = 2$, i.e., $f$ has constant rank on $U$.
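The case distinction for the rank of this map can be reproduced with a few lines of code. The sketch below is restricted to $2 \times 2$ matrices and uses a hypothetical helper `rank_2x2` with an arbitrary tolerance; it is meant as an illustration of the definition, not as a general rank algorithm.

```python
def jacobian(x, y):
    """Jacobian of (x, y) -> (x^2, y^3)."""
    return ((2 * x, 0.0), (0.0, 3 * y ** 2))

def rank_2x2(m, tol=1e-12):
    """Rank of a 2x2 matrix: 2 if det != 0, else 1 if any entry != 0, else 0."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) > tol:
        return 2
    if any(abs(entry) > tol for row in m for entry in row):
        return 1
    return 0

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ranks = [rank_2x2(jacobian(x, y)) for x, y in points]
print(ranks)  # [0, 1, 1, 2]
```

The rank drops exactly on the coordinate axes, which is why the map has constant rank only on regions such as the open quadrant $U$.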
Since $df_p: T_pM \to T_{f(p)}N$ is linear, it is $\mathrm{rank}_p f \le \min\{\dim M, \dim N\}$ (see Note 7 before). In case $\mathrm{rank}_p f = \min\{\dim M, \dim N\}$, $f$ is said to have full rank at $p$, and just full rank if it has full rank at every point of $M$. Maps of full rank have special names:
Definition 9. 
Let  f : M N  be a differentiable map between the two differentiable manifolds M and N.  f  is called an immersion in case  r a n k f = d i m M , i.e.,  d f p  is injective.  f  is called a submersion in case  r a n k f = d i m N , i.e.,  d f p  is surjective.
In our context, immersions are of special interest. Thus, we give some geometric intuition of an immersion, which is helpful in what follows. Remember that the differentiability of a map $f$ at a point $p$ means that it can be locally approximated by a linear map, i.e., by its differential $df_p$: for points $x$ close to $p$ it is $f(x) \approx f(p) + df_p(x - p)$. Thus, the properties of the differential locally approximate properties of the map. Injectivity of the linear map $df_p$ means that it maintains the independence of “all directions” in $T_pM$ when mapping to $T_{f(p)}N$: no two different vectors are “smashed” together. For the corresponding map $f$ this translates into the fact that $f$ does not “fold” or “collapse” parts of a neighborhood of the point $p$. Since an immersion $f$ has constant rank, $f$ transforms a neighborhood of each point $p$ somehow “faithfully”, not introducing any “crushing”.
An often-used well-known property is that the rank of a map can locally not decrease:
Lemma 4. 
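Let $f: M \to N$, $p \in M$ and $\mathrm{rank}_p f = k$. Then there is a neighborhood $U \in \mathcal{U}_p$ such that $\mathrm{rank}_x f \ge k$ for each $x \in U$.
Proof. 
It is $\mathrm{rank}\, df_p = k$; thus, there is a $k \times k$-submatrix $A(p)$ of $df_p$ with $\det A(p) \neq 0$. W.l.o.g. this submatrix is
$$A(p) = \left( \frac{\partial f_i}{\partial x_j}(p) \right)_{1 \le i,j \le k}$$
Next, define the map
$$\Delta: M \to \mathbb{R}, \quad x \mapsto \det \left( \frac{\partial f_i}{\partial x_j}(x) \right)_{1 \le i,j \le k}.$$
According to the Leibniz formula from linear algebra it is
$$\Delta(x) = \det A(x) = \sum_{\sigma \in S_k} \mathrm{sgn}(\sigma) \prod_{i=1}^{k} \frac{\partial f_i}{\partial x_{\sigma(i)}}(x),$$
where $A(x)$ denotes the same submatrix evaluated at $x$, i.e., $\Delta$ is continuous (even differentiable) because $f$ is differentiable and the Leibniz formula is a polynomial. Thus, with $\Delta(p) = \det A(p) \neq 0$ there is a $U \in \mathcal{U}_p$ such that $\Delta(x) \neq 0$ for each $x \in U$. Consequently, $\mathrm{rank}_x f \ge k$ for each $x \in U$. □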
Although the rank of a map cannot decrease locally, it may increase. For example, for $f(x, y) := (y, x^2 + y)$ it is $df_{(x,y)} = \begin{pmatrix} 0 & 1 \\ 2x & 1 \end{pmatrix}$. Thus, $df_{(0,0)} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$ and $\mathrm{rank}_{(0,0)} f = 1$. Arbitrarily close to $(0, 0)$ it is $df_{(\varepsilon,\varepsilon)} = \begin{pmatrix} 0 & 1 \\ 2\varepsilon & 1 \end{pmatrix}$ with determinant $-2\varepsilon$, i.e., for $\varepsilon > 0$ it is $\mathrm{rank}_{(\varepsilon,\varepsilon)} f = 2$. However, if the rank is already maximal, it is locally constant since it cannot decrease locally; this proves:
Lemma 5. 
Let  f : M N  be of full rank at  p M . Then it is of full rank in a neighborhood  U U p .
This has an important well-known implication:
Corollary 1. 
Let $M, N$ be two differentiable manifolds and $f: M \to N$ a differentiable map. Then:
a. If $df_p$ is surjective, then there exists a $U \in \mathcal{U}_p$ such that $f|_U$ is a submersion.
b. If $df_p$ is injective, then there exists a $U \in \mathcal{U}_p$ such that $f|_U$ is an immersion.
Proof. 
(a) By assumption $df_p$ is surjective, i.e., by definition $\mathrm{rank}_p f = \dim N$. But $\mathrm{rank}_p f \le \min\{\dim M, \dim N\}$ implies that $\dim N \le \min\{\dim M, \dim N\}$. Thus, $\dim N = \min\{\dim M, \dim N\}$ because $\dim N < \min\{\dim M, \dim N\}$ would be a contradiction. Consequently, $\mathrm{rank}_p f = \min\{\dim M, \dim N\}$, which shows that $f$ is of full rank at $p$. The lemma before proves the claim. (b) is proven the same way. □
The next theorem (whose proof can be found in [23], Theorem 4.5) shows that in case the differential is bijective at a point of a manifold without boundary, the map is a local diffeomorphism around that point. Note that the precondition that the manifold must have no boundary is essential here: the inclusion $\iota: \mathbb{R}^n_+ \hookrightarrow \mathbb{R}^n$ is a corresponding counterexample.
Theorem 2 (Inverse Function Theorem). 
Let $M, N$ be two differentiable manifolds without boundary and $f: M \to N$ a differentiable map. If $df_p$ is bijective, there are $U \in \mathcal{U}_p$ and $V \in \mathcal{U}_{f(p)}$ such that $f|_U: U \to V$ is a diffeomorphism.
In the theorem before, the condition that both manifolds must have no boundary can be weakened: the codomain $N$ may have a boundary, but the image of $f$ must be in the interior of the codomain, i.e., $f(M) \subseteq N \setminus \partial N$; the reason is that $N \setminus \partial N \subseteq_{\text{open}} N$ is a submanifold without boundary (see Note 5c), i.e., the original inverse function theorem applies.
Special kinds of immersions play an important role:
Definition 10.
Let  f : M N  be a differentiable map between the two differentiable manifolds M and N.  f  is called an embedding iff  f  is an immersion and  f : M f ( M )  is a homeomorphism onto  f ( M ) N  in the subspace topology.
The map f : R R 2 , x ( x 3 , 0 ) is a homeomorphism, f is smooth, but because of d f 0 = 0 , f is not an immersion, thus, no embedding. Thus, not every smooth homeomorphism is automatically an embedding. However, under the following condition injective immersions are already embeddings (the proof can be found in [23], Proposition 4.22):
Lemma 6. 
Let  f : M N  be an injective immersion,  M  and  N  be manifolds with or without boundary. If any of the following holds,  f  is an embedding:
a. 
M is compact.
b. 
M has no boundary, and  d i m M = d i m N .
The following theorem is the basis of many other theorems in differential topology (see [45], Theorem 3.7.5):
Theorem 3 (Rank Theorem). 
Let $M, N$ be two differentiable manifolds and $f: M \to N$ a differentiable map. Let $p \in M$, $q := f(p) \in N$ and let $U \in \mathcal{U}_p$ be a neighborhood of $p$ such that for each $x \in U$ it is $\mathrm{rank}_x f = r$ (i.e., $f$ has constant rank in $U$). Then there is a chart for $M$ around $p$ and a chart for $N$ around $q$, such that $f$ has in these charts the form $f(x_1, \ldots, x_r, x_{r+1}, \ldots, x_m) = (x_1, \ldots, x_r, 0, \ldots, 0)$.
Several key well-known properties of maps are inherited by their composition:
Note 8. 
Let $L, M, N$ be differentiable manifolds, and let $f: L \to M$ and $g: M \to N$ be maps. Then:
a. If $f$ and $g$ are injective or surjective or bijective then $g \circ f$ is injective or surjective or bijective.
b. If $f$ and $g$ are immersions then $g \circ f$ is an immersion.
c. If $f$ and $g$ are continuous then $g \circ f$ is continuous.
d. If $f$ and $g$ are homeomorphisms then $g \circ f$ is a homeomorphism.
e. If $f$ and $g$ are embeddings then $g \circ f$ is an embedding.
Proof. 
(a) $g(f(x)) = g(f(y))$ implies $f(x) = f(y)$ because $g$ is injective. Next, injectivity of $f$ implies $x = y$. This proves the injectivity of $g \circ f$.
If $f$ and $g$ are surjective, then $f(L) = M$ and $g(M) = N$; thus, $(g \circ f)(L) = N$. This proves the surjectivity of $g \circ f$. Together, this proves the bijectivity of $g \circ f$.
(b) $df$ and $dg$ are injective. According to the chain rule it is $d(g \circ f) = dg \circ df$, i.e., part (a) shows that $d(g \circ f)$ is injective. Thus, $g \circ f$ is an immersion.
(c) Let $O \subseteq N$ be open. Because $g$ is continuous, $g^{-1}(O) \subseteq M$ is open. Because $f$ is continuous, $f^{-1}(g^{-1}(O)) \subseteq L$ is open. With $f^{-1} \circ g^{-1} = (g \circ f)^{-1}$ it follows that $(g \circ f)^{-1}(O) \subseteq L$ is open. Thus, $g \circ f$ is continuous.
(d) $f$ and $g$ are bijective, so is $g \circ f$ (see part (a)). $f$ and $g$ are continuous, so is $g \circ f$ (see part (c)). $f^{-1}$ and $g^{-1}$ are continuous, so is $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$ (see part (c)). Thus, $g \circ f$ is a homeomorphism.
(e) $f$ and $g$ are embeddings, i.e., both are immersions as well as homeomorphisms. Because of part (b) $g \circ f$ is an immersion, and because of part (d) $g \circ f$ is a homeomorphism. Thus, $g \circ f$ is an embedding. □

4.7. Submanifolds

It is important to note that the definition of a manifold is completely independent of any surrounding space like a Euclidean space. In this sense, manifolds are abstract entities. Their concept was introduced by Bernhard Riemann in 1854 (published 1868 [46]). It generalizes objects like curves and surfaces that had been studied before that time. The latter are entities within a Euclidean space. Thus, it is natural to ask whether any (abstract) manifold is “equivalent” (i.e., diffeomorphic) to a corresponding entity in a Euclidean space: this was proven by Hassler Whitney in 1936 [44]. In this section, we summarize the corresponding concepts and results as relevant in our context.
A manifold may be contained in another manifold (see Figure 14).
Definition 11. 
Let $M$ be a differentiable $C^r$-manifold, $\dim M = n$, and let $S \subseteq M$ be a subset of $M$. $S$ is called an embedded submanifold of $M$ of dimension $k$ (or of codimension $n - k$, respectively) and class $C^r$ if for any point $p \in S$ there is a chart $(U, \varphi)$ of $M$ around $p$ such that $\varphi(U \cap S) = \varphi(U) \cap (\mathbb{R}^k \times \{0\})$.
Sometimes, embedded submanifolds are also called regular submanifolds. Figure 14 depicts the situation. If $\mathcal{A} = \{(U_i, \varphi_i) \mid i \in I\}$ is an atlas of $M$ then $\mathcal{A}|_S := \{(U_i \cap S, \varphi_i|_{U_i \cap S}) \mid i \in I\}$ is an atlas of $S$. Note that the latter assumes that $S$ is a topological subspace of $M$, i.e., the topology of $S$ is the subspace topology. This will become important soon.
Especially, since A | S is an atlas of S , S is a C r -manifold in itself:
Note 9. 
Every embedded submanifold  S  of class  C r  and dimension  k  is a  C r -manifold with  d i m S = k .
The following lemma motivates the name “embedded” submanifold (see [47], Theorem 11.14 for a proof):
Lemma 7. 
Let  M  be differentiable  C r -manifold, and let  S M  be an embedded submanifold of  M . Then, the inclusion  ι : S M  is a  C r -embedding (and, thus, by definition an immersion).
Vice versa, the name “embedding” of a map is justified by the following lemma (see [47], Theorem 11.13 for a proof):
Lemma 8. 
Let  M  and  S  be differentiable  C r -manifolds, and let  f : S M  be a  C r -embedding. Then, the image  f ( S ) M  is an embedded submanifold of  M .
A two-dimensional surface is an example of a manifold embedded in the three-dimensional Euclidean space. In general, graphs of differentiable functions are examples of such surfaces. The following lemma and definition generalize the corresponding situation (see [23], Proposition 5.4):
Lemma and Definition 9. 
Let $M$ and $N$ be $C^r$-manifolds ($N$ with or without boundary), $\dim M = m$ and $\dim N = n$, $U \subseteq_{\text{open}} M$ and $f: U \to N$ be of class $C^r$. Then the graph $\Gamma(f) \subseteq M \times N$ is an embedded $C^r$-manifold of dimension $m$ without boundary.
Here, the graph of $f$ is defined as $\Gamma(f) := \{(x, y) \in M \times N \mid x \in U \wedge y = f(x)\}$.
The Euclidean space $\mathbb{R}^n$ is a manifold, so its embedded submanifolds are of interest. The following is an often-used mechanism to produce embedded submanifolds in $\mathbb{R}^n$: For $U \subseteq_{\text{open}} \mathbb{R}^k$ and a $C^r$-differentiable map $f: U \to \mathbb{R}^m$ the graph $\Gamma(f) \subseteq \mathbb{R}^k \times \mathbb{R}^m$ is an embedded $C^r$-manifold of dimension $k$ without boundary (according to the lemma just before). If $f: U \subseteq \mathbb{R}^{n-1} \to \mathbb{R}$ is of class $C^r$, the graph $\Gamma(f)$ is a hypersurface in $\mathbb{R}^n$, i.e., a $C^r$-manifold of dimension $n - 1$ (see Figure 15); obviously, this generalizes the notion of a surface in $\mathbb{R}^3$.
Another important means to obtain embedded submanifolds is via so-called level-sets and regular values:
Definition 12. 
Let $f : M \to N$ be a $C^r$-map.
a. For any point $q \in N$ the set $f^{-1}(q)$ is called a level-set of $f$.
b. A point $p \in M$ is called a regular point if $df_p$ is surjective ($p$ is called a critical point otherwise).
c. $q \in N$ is called a regular value if each point of $f^{-1}(q)$ is a regular point ($q$ is called a critical value otherwise).
d. If $q \in N$ is a regular value, the level-set $f^{-1}(q)$ is called a regular level-set.
Obviously, each point of M is a critical point if d i m M < d i m N ( d f p cannot be surjective in this case). Any regular level set is a C r -manifold (see [23], Corollary 5.14):
Lemma 10. 
Let $M$ and $N$ be $C^r$-manifolds, and $f : M \to N$ be of class $C^r$. Then, any regular level-set $f^{-1}(q) \subseteq M$ is an embedded submanifold of $M$ with $\dim f^{-1}(q) = \dim M - \dim N$.
In Figure 16, f 1 ( c 1 ) , f 1 ( c 2 ) , and f 1 ( c 3 ) are three level-sets of the differentiable map f . Assuming that both c 1 and c 2 are regular values, the two level sets f 1 ( c 1 ) and f 1 ( c 2 ) are embedded submanifolds (the first one diffeomorphic to a circle, the second one diffeomorphic to two disjoint circles) of M of dimension 1 because of d i m M = 2 and d i m N = 1 . But f 1 ( c 3 ) is not an embedded submanifold because it has the shape of an “8”, i.e., it contains a singularity in the form of a self-intersection (see Section 4.3). This implies that c 3 is a critical value.
The lemma before allows us to prove that the set of unitary transformations of a complex vector space is a manifold:
Lemma 11. 
U ( n )  is a smooth compact connected manifold (even an embedded submanifold) with  d i m R U ( n ) = n 2 . U ( n )  is even a Lie-group.
Proof. 
It is $U(n) = \{ A \in GL(n,\mathbb{C}) \mid A^*A = I \}$, and $GL(n,\mathbb{C}) = \mathbb{C}^{n^2} = \mathbb{R}^{2n^2}$; thus, it is $U(n) \subseteq \mathbb{R}^{2n^2}$. Define $f : \mathbb{R}^{2n^2} = GL(n,\mathbb{C}) \to GL(n,\mathbb{C}) = \mathbb{R}^{2n^2}$ by $A \mapsto A^*A$; $f$ is differentiable because building the conjugate transpose of a matrix is differentiable, and the multiplication of two matrices is differentiable too. The differential of $f$ is $df_A(V) = A^*V + V^*A$ (see Appendix A).
Let $H(n) = \{ A \in GL(n,\mathbb{C}) \mid A = A^* \}$ be the set of all Hermitian matrices. $H(n)$ is an $\mathbb{R}$-vector space of dimension $n^2$ ([34] Lemma 13.15). Thus, $H(n)$ is a smooth manifold of dimension $n^2$.
Because $f(B)^* = (B^*B)^* = B^*(B^*)^* = B^*B = f(B)$ for each $B \in GL(n,\mathbb{C})$, it is $f(GL(n,\mathbb{C})) \subseteq H(n)$, i.e., $f$ is, in fact, a map $f : GL(n,\mathbb{C}) \to H(n)$ (and $f$ is differentiable).
Next, we show that any unitary map $A \in U(n)$ is a regular point of $f : GL(n,\mathbb{C}) \to H(n)$, i.e., that the differential $df_A$ is surjective for each $A \in U(n)$:
Choose an arbitrary $B \in H(n)$ and define $W := \tfrac{1}{2}AB \in GL(n,\mathbb{C})$. Then, $df_A(W) = \tfrac{1}{2}A^*AB + \tfrac{1}{2}B^*A^*A \overset{(1)}{=} \tfrac{1}{2}IB + \tfrac{1}{2}B^*I = \tfrac{1}{2}B + \tfrac{1}{2}B^* \overset{(2)}{=} \tfrac{1}{2}B + \tfrac{1}{2}B = B$ (hereby, (1) is because $A^*A = I$ and (2) because $B^* = B$).
Now it is $f^{-1}(I) = U(n)$; thus, $I \in H(n)$ is a regular value and $U(n) = f^{-1}(I)$ is a regular level set. Consequently, $U(n)$ is an embedded submanifold (Lemma 10), and, thus, a manifold itself (Note 9); furthermore, $\dim U(n) = \dim f^{-1}(I) = \dim GL(n,\mathbb{C}) - \dim H(n) = 2n^2 - n^2 = n^2$.
Next, we prove compactness.
(a) $f : GL(n,\mathbb{C}) \to GL(n,\mathbb{C})$ with $A \mapsto A^*A$ is especially continuous. Now, $U(n) = f^{-1}(I)$, $\{I\} \subseteq GL(n,\mathbb{C})$ is a closed set, and pre-images of closed sets under continuous maps are closed, i.e., $U(n)$ is closed in $\mathbb{R}^{2n^2}$.
(b) Next, $A^*A = I$ means especially that the columns of $A = (a_{ij})$ are unit vectors, i.e., $\sum_i |a_{ij}|^2 = 1$ for each $1 \le j \le n$. Thus, $|a_{ij}|^2 \le 1$ for all $1 \le i, j \le n$. This implies that $\|A\|^2 = \sum_{i,j=1}^{n} |a_{ij}|^2 \le n^2$ for each $A \in U(n)$, i.e., $U(n)$ is bounded in $\mathbb{R}^{2n^2}$. Thus, according to the theorem of Heine-Borel, $U(n)$ is compact in $\mathbb{R}^{2n^2} = GL(n,\mathbb{C})$.
Finally, we prove connectedness.
Let $A \in U(n)$. Then, $A$ is diagonalizable, i.e., there exists a unitary matrix $T$ such that $A = T\,\mathrm{diag}(e^{i\varphi_1}, \dots, e^{i\varphi_n})\,T^*$ ([34] Theorem 18.13). Let $0 \le t \le 1$; then, $A_t := T\,\mathrm{diag}(e^{it\varphi_1}, \dots, e^{it\varphi_n})\,T^* \in U(n)$ (because $A, B \in U(n) \Rightarrow AB \in U(n)$). It is $A_0 = I$ and $A_1 = A$, i.e., there is a path from the identity matrix $I$ to $A$. Thus, any two unitary matrices can be connected by a path (e.g., via $I$): $U(n)$ is path-connected. Since every path-connected topological space is connected ([35] Proposition 9.26), $U(n)$ is connected.
$U(n)$ is a Lie group because matrix multiplication is a differentiable map $\cdot : U(n) \times U(n) \to U(n)$, $(a_{ij}) \cdot (b_{st}) = \left( \sum_j a_{ij} b_{jt} \right)_{it}$, and for two unitary matrices $A, B \in U(n)$ it is $(AB)(AB)^* = ABB^*A^* = AIA^* = AA^* = I$, i.e., $AB \in U(n)$. □
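The surjectivity step of the proof can be checked numerically. The following is a minimal sketch using only the Python standard library; the concrete matrices `A` (a rotation multiplied by a phase) and `B` (Hermitian) are hypothetical test data, and the helper names are our own. It verifies that $W = \tfrac{1}{2}AB$ satisfies $df_A(W) = A^*W + W^*A = B$ up to rounding errors.

```python
import cmath
import math

def dagger(m):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def max_abs_diff(a, b):
    """Largest entry-wise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def differential(a, v):
    """df_A(V) = A*V + V*A for the map f(A) = A*A."""
    return add(matmul(dagger(a), v), matmul(dagger(v), a))

# Hypothetical test data: a unitary A and a Hermitian B.
t = 0.7
phase = cmath.exp(0.3j)
A = [[phase * math.cos(t), -phase * math.sin(t)],
     [phase * math.sin(t), phase * math.cos(t)]]
B = [[2.0 + 0j, 1.0 - 3.0j],
     [1.0 + 3.0j, -1.0 + 0j]]

identity = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
unitarity_error = max_abs_diff(matmul(dagger(A), A), identity)

W = scale(0.5, matmul(A, B))  # candidate preimage W = (1/2) A B
surjectivity_error = max_abs_diff(differential(A, W), B)

print(unitarity_error, surjectivity_error)  # both should be at rounding level
```

Such a check covers a single pair $(A, B)$ only; it illustrates the computation in the proof but does not replace it.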
Note that the condition “regular” is key in Lemma 10. Without that condition, any closed subset of a manifold can be made the level set of a differentiable function (see [23], Theorem 2.29 for a proof):
Lemma 12. 
Let $M$ be a differentiable $C^r$-manifold, and let $K \subseteq M$ be a closed subset of $M$. Then there exists a $C^r$-function $f : M \to \mathbb{R}$ with $f^{-1}(0) = K$.
In many practical situations, both $M$ and $N$ are Euclidean spaces. For example, with $M = \mathbb{R}^{n+1}$ and $N = \mathbb{R}$, the map $f : \mathbb{R}^{n+1} \to \mathbb{R}$, $x \mapsto |x|^2$ has the differential $df_x = 2\,(x_1 \; \cdots \; x_{n+1}) \neq 0$ for $x \neq 0$, i.e., $\operatorname{rank}_x f = 1$ for $x \neq 0$; thus, $df_x$ is surjective, except at the origin. This implies that each $c \neq 0$ in $\mathbb{R}$ is a regular value; according to the lemma before, $f^{-1}(c)$ is a regular level-set for each $c \neq 0$. Thus, for $c > 0$, $f^{-1}(c) = \{ x \in \mathbb{R}^{n+1} \mid |x|^2 = c \}$ is a sphere of dimension $n$ (for $c < 0$ the level-set is empty); especially, the unit sphere $S^n = f^{-1}(1)$ is an $n$-dimensional embedded submanifold of $\mathbb{R}^{n+1}$.
There are important situations (e.g., in the context of Lie groups—see [48]) in which the notion of a submanifold has to be generalized. Very roughly, any differentiable manifold that is a subset of another differentiable manifold and is “properly situated” there is considered a certain kind of a submanifold. More precisely,
Definition 13. 
Let $M$ and $N$ be $C^r$-manifolds and $f : N \to M$ be an injective immersion of class $C^r$. Then, the image $f(N) \subseteq M$ is called an immersed submanifold of $M$.
For example, let $f : \,]-\pi, +\pi[\, \to \mathbb{R}^2$, $x \mapsto (\sin 2x, \sin x)$. Then, $f$ is injective on the open interval $]-\pi, +\pi[$ (see Appendix B for more details). Also, $f$ is an immersion because $df_x = (2\cos 2x, \cos x) \neq 0$, i.e., $\operatorname{rank}_x f = 1$ for $x \in\, ]-\pi, +\pi[$. Thus, $f$ is an injective $C^r$-immersion, i.e., $S := f(]-\pi, +\pi[) \subseteq \mathbb{R}^2$ is an immersed submanifold (see Figure 17). Furthermore, $f : \,]-\pi, +\pi[\, \to S$ is also surjective and, thus, bijective, but $f$ is not a homeomorphism: the image $S \subseteq \mathbb{R}^2$ is compact in the subspace topology (it is closed and bounded), whereas $]-\pi, +\pi[\, \subseteq \mathbb{R}$ is not compact, and compactness is a topological invariant. Together, $S \subseteq \mathbb{R}^2$ is an immersed submanifold, but $S$ is not an embedded submanifold of the manifold $\mathbb{R}^2$. Another argument supporting the latter: any neighborhood of the origin in $S$ contains a singularity, namely a shape like in Figure 11b.
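The behavior of this curve can be explored numerically. The sketch below (standard-library Python; the sample size and the offset `1e-6` are arbitrary choices of ours) samples the interval, confirms that the length of $df_x$ stays away from zero, and shows that parameters close to the end of the interval are mapped close to $f(0) = (0, 0)$. The latter is the reason why the inverse map from $S$ back to the interval cannot be continuous at the origin.

```python
import math

def f(x):
    """The curve x -> (sin 2x, sin x)."""
    return (math.sin(2 * x), math.sin(x))

def df(x):
    """Its derivative x -> (2 cos 2x, cos x)."""
    return (2 * math.cos(2 * x), math.cos(x))

# Midpoint samples of the open interval ]-pi, +pi[
N = 20000
xs = [-math.pi + (i + 0.5) * (2 * math.pi / N) for i in range(N)]

# Immersion: the derivative never vanishes on the sampled points.
min_speed = min(math.hypot(*df(x)) for x in xs)

# A parameter close to the right end of the interval ...
near_end = math.pi - 1e-6
# ... is far from 0 in the domain, but its image is close to f(0) = (0, 0).
gap_in_domain = abs(near_end - 0.0)
gap_in_image = math.hypot(*f(near_end))

print(min_speed, gap_in_domain, gap_in_image)
```

The sampled minimum of $|df_x|$ is roughly $0.7$, while the image of `near_end` lies within about $10^{-5}$ of the origin although `near_end` itself is more than $3$ away from $0$.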
According to Lemma 7, for any embedded submanifold S M the inclusion ι : S M is an embedding and, thus, an injective immersion, i.e., ι ( S ) = S M is an immersed submanifold of M . This proves the following:
Lemma 13. 
Let  M  be a  C r -manifold (with or without a boundary),  S M  an embedded submanifold. Then,  S  is an immersed submanifold.
The opposite is not true: the example before shows that an immersed submanifold is, in general, not an embedded submanifold. However, the following lemma ([23], Proposition 5.21) gives two situations in which an immersed submanifold is already an embedded submanifold:
Lemma 14. 
Let  M  be a  C r -manifold (with or without a boundary),  S M  an immersed submanifold. If  d i m S = d i m M  or if  S c o m p a c t M  then  S  is embedded.
Finally, we answer the question posed at the beginning of this section: Any compact (abstract) manifold is diffeomorphic to an embedded submanifold of a Euclidean space of high enough dimension—more precisely,
Theorem 4. 
Let $r \ge 1$ and let $M$ be a compact $C^r$-manifold of dimension $m$ with boundary. $M$ can be embedded into $\mathbb{R}^{2m+1}_+$ with $\partial M \subseteq \partial\mathbb{R}^{2m+1}_+ = \{ x \in \mathbb{R}^{2m+1} \mid x_{2m+1} = 0 \}$.
Again, this theorem is by Whitney [44], and a detailed proof can be found in [37] (Theorem 1–4.3).
In summary, this section provided the necessary background of dimensional expressivity as needed in Section 5. Before presenting the corresponding details, the next section finally focuses on the unitary approach, giving a precise definition of the corresponding notion of expressivity. This requires explaining first how volumes on manifolds can be measured.

4.8. Volumes of Manifolds and the “Uniform Approach”

Section 2.3.1 motivated defining the expressivity of a variational quantum circuit by the “volume” of $\Omega_C(P)$ in the unitary group $U(n)$: intuitively, the larger this volume, the higher the likelihood that $\Omega_C(P)$ hits the solutions $S(X)$ of a given problem $X$.

4.8.1. Linear Approximations

Here, we provide more details about the notion of “the volume” of a differentiable manifold and especially how the notion of “volume” is related to the unitary group. For this purpose, we first recall that a function $f$ is called differentiable at a point $x$ if it can be approximated locally by a linear function:
$$f(x + \xi) = f(x) + L\xi + o(\xi)$$
Here, $\xi$ is a small displacement, i.e., $x + \xi$ is a point in a small neighborhood of $x$, $L$ is a linear function, and $o(\xi)$ is the small error made when considering $f(x) + L\xi$ as the value of $f(x + \xi)$: in a small neighborhood, the differentiable function $f$ is nearly the linear function $L$. For a differentiable function $f : \mathbb{R} \to \mathbb{R}$ this means that $L$ is a real number and $\xi \mapsto f(x) + L\xi$ is the tangent at $x$ to the graph of $f$; this tangent locally approximates the function $f$. Thus, the graph of $f$, i.e., the manifold $\Gamma(f)$ (see Lemma 9), is approximated by this tangent around $(x, f(x))$.
The idea of linear approximation can be used in arbitrary dimensions: in part (a) of Figure 18, a basis v 1 , v 2 has been chosen for the tangent space T p M of the manifold M at point p . The basis spans a parallelepiped Q : = { a 1 v 1 + a 2 v 2 | 1 a 1 , a 2 1 } . If we take small vectors v 1 ,   v 2 the parallelepiped approximates the manifold M around p with a small error, i.e., the manifold looks locally like a very small parallelepiped of the tangent space.
Applying this linear approximation to enough points p 1 , , p k of the manifold, i.e., if the manifold is “covered” by parallelepipeds, the manifold is turned into a linear approximation of the whole manifold: part (b) of Figure 18 depicts this for the upper part of a manifold which looks like small parallelepipeds glued together (in the direction of the tangent spaces).

4.8.2. Approximating Volumes

Such a linear approximation allows us to compute the approximate volume $\mathrm{vol}(M)$ of the manifold by computing the volumes $\mathrm{vol}(Q_{p_i})$ of the parallelepipeds and summing them up:
$$\mathrm{vol}(M) \approx \sum_{i=1}^{k} \mathrm{vol}(Q_{p_i}) \tag{7}$$
The volume of a parallelepiped is computed as follows: let $v_1, \dots, v_n$ be linearly independent vectors, and let $Q := \{ \sum_i a_i v_i \mid 0 \le a_1, a_2, \dots, a_n \le 1 \}$ be the parallelepiped spanned by these vectors. Then, from linear algebra, it is known that the volume $\mathrm{vol}(Q)$ of the parallelepiped is
$$\mathrm{vol}(Q) = \left| \det \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} \right| \tag{8}$$
Define $V := \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}$ to be the matrix with columns $v_1, v_2, \dots, v_n$. Then:
$$\mathrm{vol}(Q)^2 = |\det(V)|^2 = \det(V)\det(V) = \det(V^T)\det(V) = \det(V^T V) = \det\left( \langle v_i, v_j \rangle \right)_{1 \le i,j \le n}$$
Thus,
$$\mathrm{vol}(Q) = \sqrt{\det\left( \langle v_i, v_j \rangle \right)_{1 \le i,j \le n}} \tag{9}$$
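Both expressions for the volume of a parallelepiped can be compared on a small example. The following minimal sketch uses only the Python standard library; the three vectors are hypothetical sample data, and the helper functions `det` and `dot` are our own. It computes the volume once as $|\det(V)|$ and once as the square root of the determinant of the matrix of scalar products.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def dot(u, v):
    """Standard scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical linearly independent vectors v_1, v_2, v_3 in R^3
vectors = [[1.0, 0.0, 0.0], [1.0, 2.0, 0.0], [1.0, 1.0, 3.0]]

# Matrix V with the vectors as columns
V = [[vectors[j][i] for j in range(3)] for i in range(3)]
# Gram matrix of all scalar products <v_i, v_j>
gram = [[dot(u, v) for v in vectors] for u in vectors]

vol_det = abs(det(V))            # |det V|
vol_gram = math.sqrt(det(gram))  # sqrt(det(<v_i, v_j>))

print(vol_det, vol_gram)  # both evaluate to 6 (up to rounding)
```

The second variant only needs scalar products of the spanning vectors, which is the property used when the standard scalar product is replaced by a scalar product on a tangent space.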

4.8.3. Riemannian Manifolds and Volume Forms

Equation (9) reveals that computing the volume of a parallelepiped depends on a scalar product. Consequently, we need a scalar product for every tangent space of the manifold M .
Definition 14. 
Let $M$ be a differentiable manifold, and let $g$ be a function that associates with each $p \in M$ a scalar product $g_p : T_pM \times T_pM \to \mathbb{R}$ “in a differentiable manner”. Then, $g$ is called a Riemannian metric on $M$, and $(M, g)$ is called a Riemannian manifold.
Note that the phrase “in a differentiable manner” is left vague: a precise definition would require defining differentiable vector fields, which we do not need in this paper. Also, the differentiability of $g$ is not relevant in our context.
Consequently, we assume that $M$ is a Riemannian manifold; in fact, every differentiable manifold can be equipped with a Riemannian metric ([39], Proposition 2.4). Then, the volume of a parallelepiped $Q_p$ in $T_pM$ is as follows:
$$\mathrm{vol}(Q_p) = \sqrt{\det\left( g_p(v_i, v_j) \right)_{1 \le i,j \le n}} \tag{10}$$
With Equations (7) and (10) we can approximate the volume of a Riemannian manifold $M$ as follows:
$$\mathrm{vol}(M) \approx \sum_{i=1}^{k} \mathrm{vol}(Q_{p_i}) = \sum_{i=1}^{k} \sqrt{\det\left( g_p(v_i, v_j) \right)_{1 \le i,j \le n}} \tag{11}$$
By choosing infinitesimally small parallelepipeds and correspondingly more and more points from the manifold, we perform a limit process. In analogy to the limit process that defines the Riemann integral, we write very informally (and only conceptually) with $G = \left( g_p(v_i, v_j) \right)_{i,j}$:
$$\mathrm{vol}(M) = \int_M \sqrt{\det(G)}\, dx \tag{12}$$
The precise definitions behind this notion need a lot more concepts and machinery. The most fundamental concept needed is that of a volume form $dV$ that abstracts our informal notation $\sqrt{\det(G)}\, dx$. In our context, the unitary group admits such a volume form:
Note 10. 
The unitary group  U ( n )  admits a (unique) volume form  d V .
Proof. 
Every differentiable manifold is a Riemannian manifold ([39], Proposition 2.4). If a Riemannian manifold $M$ is oriented, then it admits a (unique) volume form $dV$ ([39], Proposition 2.41). According to Lemma 11, $U(n)$ is a differentiable manifold; thus, it is also a Riemannian manifold. Any Lie group is orientable ([49], Lemma 6). Since $U(n)$ is a Lie group (Lemma 11), it admits a (unique) volume form. □
The volume form $dV$ of a manifold is (in local coordinates) $dV = \sqrt{\det(G)}\, dx$; it allows us (as Equation (12) indicates) to compute the volume of $M$, namely $\mathrm{vol}(M) = \int_M dV$. It also allows us to compute the integral of functions $f$, i.e., $\int_M f\, dV$ (see [39], the discussion following Proposition 2.41).

4.8.4. Haar Measure

Computing volumes is tied to differentiable manifolds and is not applicable to other “spaces”. For such spaces, the concept of a measure is introduced (see [50] for details) that is more abstract than a volume but mimics its properties. Luckily, in our context, both concepts are the same. This is roughly seen as follows:
Each open set $W$ of a differentiable manifold $M$ is a differentiable manifold by itself (Note 5). Thus, the inclusion map $\iota : W \hookrightarrow M$ induces a Riemannian metric on $W$, and $\mathrm{vol}(W) = \int_W dV$ is defined. This, in turn, defines a measure on the Borel sets $\mathcal{B}(M)$ of $M$, where $\mathcal{B}(M)$ is the smallest $\sigma$-algebra containing all open sets of $M$ (refer to [50] for details about Borel sets, measure spaces, and measures). This turns the manifold $M$ into a measure space $(M, \mathcal{B}(M), \mathrm{vol})$ ([51], Section 1.7). A measure is more general than computing volumes via integrals.
As a Lie group, $U(n)$ admits a left-invariant measure, the so-called Haar measure. This measure is unique up to multiplication by a positive constant ([48], Theorem 3.1). A compact Lie group admits a bi-invariant Riemannian metric ([52], Proposition 2.17), which induces a bi-invariant volume form. Since the Haar measure is unique up to such a constant, we obtain the following:
Note 11. 
The volume form  d V  on a compact Lie group is the Haar measure.
Especially, since $U(n)$ is a compact Lie group (Lemma 11), instead of abstractly speaking of the measure of a subset of a manifold, we can speak about its volume.

4.8.5. Volume of Submanifold

Having the ability to determine the volume of manifolds, the question of volumes of submanifolds comes up.
Note 12. 
Let  M  be a manifold with volume element  d V , and let  N M  be an (embedded or immersed) submanifold. If  d i m N < d i m M  then  v o l ( N ) = 0 .
Proof. 
Let M , N be differentiable manifolds and let f : N M be differentiable map. If d i m N < d i m M then f ( N ) M has measure zero ([23], Corollary 6.11). For an (embedded or immersed) submanifold N of M the inclusion map ι : N M is differentiable (Lemma 7), i.e., if d i m N < d i m M then N has measure zero in M . □
This implies that any submanifold S of the unitary group U ( n ) with d i m S < d i m U ( n ) has measure zero. Since measure and volume coincide it is v o l ( S ) = 0 for d i m S < d i m U ( n ) :
Note 13. 
Let  S U ( n )  be a submanifold,  d i m S < d i m U ( n ) . Then  v o l ( S ) = 0 .
Thus, v o l ( Ω C ( P ) ) = 0 in case d i m Ω C ( P ) < d i m U ( n ) (if Ω C ( P ) is a submanifold at all—see next chapter). Consequently, the illustrative motivation about measuring expressivity in the “Unitary Approach” (Section 2.3.1) requires refinement.

4.8.6. Uniform Distribution

Let $A = \,]0, \varepsilon[\, \times \,]0, \varepsilon[\, \subseteq \mathbb{R}^2$ be a small open square in the Euclidean plane. If a point has to be picked randomly from $A$, the probability for any two points $x, y \in A$ being picked is the same: picking is “uniformly random”. This means that the probability of a certain point being picked is identical to the probability of any other point in $A$ being picked: the corresponding distribution of probabilities is uniform.
Obviously, the probability of a point being picked from $A$ is related to the volume of $A$: the smaller $A$, the higher the likelihood of a point within $A$ to be picked. If we move the square around in the Euclidean plane, e.g., translating it by a vector $v \in \mathbb{R}^2$ to $A_v = v + A := \{ v + t \mid t \in\, ]0, \varepsilon[\, \times \,]0, \varepsilon[ \} \subseteq \mathbb{R}^2$, the probability distribution remains the same because the volume is unchanged. This is based on the left-invariance of the volume in Euclidean space: adding $v \in \mathbb{R}^2$ from the left to each vector in $A$ does not change the volume, i.e., $\mathrm{vol}(A) = \mathrm{vol}(A_v) = \mathrm{vol}(v + A)$.
This is different for arbitrary manifolds: the Euclidean space is flat, but picking points from areas on curved manifolds may behave differently. For example, let $S^2$ be the unit sphere. Then
$$f : \,]0, \pi[\, \times \,]0, 2\pi[\, \to S^2, \quad (\vartheta, \varphi) \mapsto (\sin\vartheta \cos\varphi, \; \sin\vartheta \sin\varphi, \; \cos\vartheta) \tag{13}$$
is a chart of S 2 (leaving out both poles as well as the meridian of the sphere—otherwise f would not be a diffeomorphism). Figure 19 depicts the images under f of different parts of the domain:
  • Part (a) shows the image of the whole domain, i.e., the sphere S 2 ;
  • Part (b) is the image of $]\pi/2 - \varepsilon, \pi/2 + \varepsilon[\, \times \,]0, 2\pi[$ ($\varepsilon$ is a small positive number), resulting in a belt around the equator;
  • Part (c) is the image of $]\pi - \varepsilon, \pi[\, \times \,]0, 2\pi[$, resulting in a cap of the north pole;
  • Part (d) is the image of $]0, \varepsilon[\, \times \,]0, 2\pi[$, resulting in a cap of the south pole.
Figure 19. Volumes indicating non-uniform distributions.
The figure indicates that the volume of the belt is larger than the volume of a cap. But a larger volume of an area means a smaller probability of a point of the area being randomly picked. Thus, points on the sphere with values of ϑ close to π / 2 (i.e., points from the belt) have a smaller probability of being randomly picked than points with values of ϑ close to π (i.e., points from the cap of the north pole) or close to 0 (i.e., points from the cap of the south pole): by picking a random latitude, points close to the poles have a higher probability to be picked than points near the equator. In this sense, points on the sphere are somehow “concentrated” towards the poles. As a consequence, the corresponding probability distribution is not uniform.
The volume that represents the probability distribution of randomly picking a point must reflect this effect of concentration to become a uniform distribution. As the figure indicates, shifting the belt to a pole does change its volume, i.e., the “usual” volume on the sphere is not translation invariant. But the Haar measure is translation invariant by definition. Thus, if we take the Haar measure to compute the probability distribution (i.e., the volume of areas of the sphere), the distribution becomes uniform: every area of the sphere of the same Haar measure has equal probability of containing a particular chosen point.
Consequently, the difference between the Haar measure of an area and its “usual” volume is an indicator of how much a distribution based on the usual volume deviates from being uniform.
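The “usual” volumes of the belt and of a cap can be computed with the informal formula $\mathrm{vol} = \int \sqrt{\det(G)}\, dx$. The sketch below is an illustration under an assumption of ours: the sphere carries the metric induced from $\mathbb{R}^3$, for which the chart $f$ gives $G = \mathrm{diag}(1, \sin^2\vartheta)$ and, thus, $\sqrt{\det(G)} = \sin\vartheta$. The function name `area`, the value of `eps`, and the number of integration steps are hypothetical choices.

```python
import math

def area(theta_lo, theta_hi, steps=20000):
    """Area of the image of ]theta_lo, theta_hi[ x ]0, 2*pi[ under the chart.

    Uses sqrt(det G) = sin(theta); the integrand does not depend on phi,
    so the phi-integral contributes the factor 2*pi. The theta-integral is
    approximated by the midpoint rule.
    """
    h = (theta_hi - theta_lo) / steps
    s = sum(math.sin(theta_lo + (i + 0.5) * h) for i in range(steps))
    return 2 * math.pi * s * h

eps = 0.1
belt = area(math.pi / 2 - eps, math.pi / 2 + eps)  # belt around the equator
cap = area(0.0, eps)                               # cap around one pole
whole = area(0.0, math.pi)                         # the whole sphere

print(belt, cap, whole)
```

For this choice of `eps`, the belt has an area of about $1.25$ and the cap of about $0.03$, although the corresponding rectangles in the parameter domain differ in size only by a factor of two; the area of the whole sphere comes out as $4\pi$.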

4.8.7. Measuring Expressivity

“Deviation” can be assessed by various means. Especially, whether “volume” is computed directly or indirectly may differ. An example of an “indirect approach” is described next and is based on [6].
Whenever $A \in U(n)$ has been uniformly randomly chosen (often called “Haar random”), $A|0\rangle$ is a uniformly random state. Thus, if $S \subseteq U(n)$ is a set of Haar random unitary matrices, then $S|0\rangle := \{ A|0\rangle \mid A \in S \}$ is a Haar random set of states. But $\Omega_C(P) \subseteq U(n)$ is not necessarily Haar random; thus, the set $\Omega_C(P)|0\rangle$ is not necessarily a set of uniformly random states.
Different approaches have been defined to compare $S|0\rangle$ and $\Omega_C(P)|0\rangle$. One approach (see [4,6]) is as follows: first, the elements of these sets are turned into matrices, i.e., for $|\psi\rangle \in S|0\rangle$ and $|\varphi\rangle \in \Omega_C(P)|0\rangle$ the density matrices $|\psi\rangle\langle\psi|$ and $|\varphi\rangle\langle\varphi|$ are taken. Note that, in fact, $|\varphi\rangle = |\varphi_p\rangle$ depends on parameters $p \in P$. For $S = U(n)$, the matrix $\int_{U(n)} |\psi\rangle\langle\psi|\, d\mu$ is computed as well as the matrix $\int_P |\varphi_p\rangle\langle\varphi_p|\, dx$. Finally, the deviation between these two matrices is assessed and taken as the expressivity of the ansatz $C$. For example, based on a matrix norm, the expressivity $\eta(C)$ becomes
$$\eta(C) := \left\| \int_{U(n)} |\psi\rangle\langle\psi|\, d\mu - \int_P |\varphi_p\rangle\langle\varphi_p|\, dx \right\| \tag{14}$$
The smaller $\eta(C)$, the closer $\Omega_C(P) \subseteq U(n)$ becomes to being Haar random. If $C$ and $\hat{C}$ are two ansatzes, ansatz $C$ is called more expressive than ansatz $\hat{C}$ iff $\eta(C) < \eta(\hat{C})$.
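To make the definition of $\eta(C)$ tangible, the following sketch evaluates it for a single qubit. Several assumptions of ours enter and are not taken from [4,6]: both integrals are normalized to probability measures, the Frobenius norm is used as matrix norm, the Haar average of $|\psi\rangle\langle\psi|$ over $U(2)$ is taken to be $I/2$, and the ansatz is a hypothetical one-parameter circuit producing the states $|\varphi_\theta\rangle = (\cos(\theta/2), \sin(\theta/2))$. All function names are hypothetical.

```python
import math

def averaged_density(theta_max, steps=4000):
    """Midpoint-rule average of |phi><phi| over theta in [0, theta_max].

    phi(theta) = (cos(theta/2), sin(theta/2)) has real amplitudes, so the
    averaged density matrix is a real 2x2 matrix.
    """
    h = theta_max / steps
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(steps):
        theta = (i + 0.5) * h
        phi = (math.cos(theta / 2), math.sin(theta / 2))
        for r in range(2):
            for c in range(2):
                rho[r][c] += phi[r] * phi[c] / steps
    return rho

def eta(rho):
    """Frobenius distance between rho and the Haar average I/2."""
    haar = [[0.5, 0.0], [0.0, 0.5]]
    return math.sqrt(sum((rho[r][c] - haar[r][c]) ** 2
                         for r in range(2) for c in range(2)))

eta_half = eta(averaged_density(math.pi))      # parameter space [0, pi]
eta_full = eta(averaged_density(2 * math.pi))  # parameter space [0, 2*pi]

print(eta_half, eta_full)
```

Under these assumptions, the restricted parameter space yields $\eta \approx \sqrt{2}/\pi \approx 0.45$, while the full period yields $\eta \approx 0$. In this sketch, the latter happens although the states only trace out a circle of real-amplitude states, which suggests that a comparison of averaged density matrices alone is a rather coarse indicator.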
Reference [31] gives a procedure for experimentally determining the expressivity of a specific ansatz.

5. Circuit Manifolds

In this section we continue the discussion of dimensional expressivity that was interrupted by the section before giving the precise definition of expressivity according to the unitary approach. We provide a detailed proof that a state map (i.e., the map defined by a parameterized quantum circuit according to Definition 1) maps a properly chosen parameter space into a submanifold of S n 1 .

5.1. State Maps Induce Local Immersions

As introduced in Section 2.3.2, let $P \subseteq \mathbb{R}^k$ be a parameter space and $C$ be an ansatz depending on the parameters $(p_1, \dots, p_k) \in P$, and let $\Lambda_C$ be the map $\Lambda_C : P \to S^{n-1}$ with $(p_1, \dots, p_k) \mapsto C(p_1, \dots, p_k)|\iota\rangle$ with a chosen fixed initial state $|\iota\rangle$. Thus, $\Lambda_C(P) \subseteq S^{n-1}$ is the set of all states reachable by the parameterized circuit $C$. Lemma 1 shows that $\Lambda_C$ is smooth, i.e., of class $C^s$ for any $s \in \mathbb{N}$.
Assumption 1. 
From now on, we assume that the parameter space  P R k  is an embedded submanifold of  R k  of dimension  k . As discussed in Section 3.2 and Section 3.3, this is often the case in practical situations.
Let $q \in P$ and let $\operatorname{rank}_q \Lambda_C = r$. Thus, there is an $r \times r$ submatrix of the differential $d(\Lambda_C)_q$ whose determinant is not $0$; w.l.o.g. with $d(\Lambda_C)_q =: (\lambda_{ij})_{1 \le i,j \le k}$ this submatrix is $(\lambda_{ij})_{1 \le i,j \le r}$. According to Lemma 4, there exists a neighborhood $W \in \mathcal{U}_q$ open in $\mathbb{R}^k$ such that $\operatorname{rank}_x \Lambda_C \ge r$ for each $x \in W$; w.l.o.g. $W$ is an open ball centered around $q$ contained in $P$. Next, let $\mathbb{R}^r_q := \{ x \in \mathbb{R}^k \mid x_j = q_j, \; j > r \}$ be the hyperplane of $\mathbb{R}^k$ parallel to $\mathbb{R}^r$ through $q$, and let $\hat{P} := \mathbb{R}^r_q \cap W$ (see Figure 20); i.e., $\hat{P}$ is the intersection of $W$ with $\mathbb{R}^r_q$. If $W \subseteq P$ is chosen to be small enough, it is contained in a chart $(U, \varphi)$ of $P$ around $q$, i.e., $W \subseteq U$. With $\hat{P} = \hat{P} \cap W$ open in $\hat{P}$ in the subspace topology, $\{ (\hat{P}, \varphi|_{\hat{P}}) \}$ is an atlas of $\hat{P}$. This shows the following:
Note 14. 
P ̂ P  is an embedded submanifold of  P  with  d i m P ̂ = r .
It is $\Lambda_C(x_1, \dots, x_r, \dots, x_j, \dots, x_k) = \Lambda_C(x_1, \dots, x_r, \dots, \bar{x}_j, \dots, x_k)$ for $j > r$ and for corresponding points in $\hat{P}$ (i.e., $x_j = q_j = \bar{x}_j$); thus, $\Lambda_C$ is constant in each of the $x_j$ directions. Consequently, $\partial \Lambda_C / \partial x_j = 0$ for $j > r$. Together with $\operatorname{rank}_x \Lambda_C \ge r$ this implies $\operatorname{rank}_x \Lambda_C = r = \dim \hat{P}$ for $x \in \hat{P}$. This proves the following:
Lemma 15. 
$\hat{\Lambda}_C := \Lambda_C|_{\hat{P}} : \hat{P} \to S^{n-1}$ is an immersion.
Figure 20. Restricting the Parameter Space.
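The effect of restricting the parameter space can be illustrated with a toy state map. The sketch below is a hypothetical example of ours: the ansatz $R_y(p_2) R_y(p_1)$ applied to $|0\rangle$ yields the real state $(\cos((p_1+p_2)/2), \sin((p_1+p_2)/2))$, so both parameters act in the same direction. The Jacobian is approximated by central finite differences, and its rank is determined by Gaussian elimination with a tolerance; step size and tolerance are arbitrary choices.

```python
import math

def state_map(p):
    """Hypothetical toy ansatz R_y(p2) R_y(p1)|0> with real amplitudes."""
    a = (p[0] + p[1]) / 2
    return [math.cos(a), math.sin(a)]

def jacobian(fun, p, h=1e-6):
    """Central finite-difference Jacobian; rows = outputs, columns = parameters."""
    n_out = len(fun(p))
    cols = []
    for j in range(len(p)):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(fun(hi), fun(lo))])
    return [[cols[j][i] for j in range(len(p))] for i in range(n_out)]

def rank(m, tol=1e-6):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    rows = [list(r) for r in m]
    rk = 0
    for c in range(len(rows[0])):
        pivot = max(range(rk, len(rows)), key=lambda i: abs(rows[i][c]),
                    default=None)
        if pivot is None or abs(rows[pivot][c]) < tol:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][c] / rows[rk][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

q = [0.4, 1.1]
rank_full = rank(jacobian(state_map, q))  # two parameters, but rank 1

def restricted(x):
    """State map restricted to the line p2 = q2 through q."""
    return state_map([x[0], q[1]])

rank_restricted = rank(jacobian(restricted, [q[0]]))  # rank 1 = dimension

print(rank_full, rank_restricted)  # 1 1
```

On the full two-dimensional parameter space the rank (1) is smaller than the dimension (2), so the map is not an immersion there; on the restricted one-dimensional parameter space rank and dimension coincide.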

5.2. Locally Embedded Circuit Manifolds

According to the Rank Theorem (Theorem 3), there exist charts around $q$ and $\hat{\Lambda}_C(q)$ such that $\hat{\Lambda}_C$ has the form $\hat{\Lambda}_C(x_1, \dots, x_k) = (x_1, \dots, x_r, 0, \dots, 0)$. That is, $\hat{\Lambda}_C$ is injective in a neighborhood of $q \in \hat{P}$. Thus, for $W$ chosen properly (i.e., by possibly shrinking it), $\hat{\Lambda}_C$ is injective, and with $\hat{\Lambda}_C$ being also an immersion, we achieve the following (Definition 13):
Lemma 16. 
Λ ̂ C = Λ C | P ̂  is an injective immersion, and  Λ ̂ C ( P ̂ ) S n 1  is an immersed submanifold.
According to Note 14, P ̂ P is an embedded submanifold of P with d i m P ̂ = r . By Assumption 1, P R k is an embedded submanifold of R k . Thus (Lemma 7), both inclusions P ̂ P as well as P R k are embeddings, which implies that the composed inclusion ι : P ̂ R k is an embedding (Note 8e). According to Lemma 8, the image of the inclusion ι ( P ̂ ) = P ̂ R k is an embedded submanifold. This proves the following:
Note 15. 
P ̂ R k  is an embedded submanifold with  d i m P ̂ = r .
With $\hat{P} \subseteq \mathbb{R}^k$ being an embedded submanifold, choose a chart $(U, \varphi)$ of $\hat{P}$ around $q$, i.e., $\varphi(U) \subseteq \mathbb{R}^r$ is open. Next, choose a compact ball $\overline{B}_\varepsilon(\varphi(q)) \subseteq \varphi(U)$ with center $\varphi(q)$. Now, $\overline{B}_\varepsilon(\varphi(q))$ is a compact manifold (with boundary) of dimension $r$, and $\varphi^{-1}$ is a diffeomorphism; thus, $M := \varphi^{-1}(\overline{B}_\varepsilon(\varphi(q))) \subseteq \hat{P}$ is a compact submanifold of dimension $r$. According to Lemma 16, $\hat{\Lambda}_C$ is an injective immersion, which implies that especially $\hat{\Lambda}_C|_M : M \to S^{n-1}$ is an injective immersion. Then, according to Lemma 6a, $\hat{\Lambda}_C|_M$ is an embedding, and Lemma 8 finally proves the following:
Lemma 17. 
$\hat{\Lambda}_C(M) \subseteq S^{n-1}$ is an embedded submanifold of dimension $\dim \hat{\Lambda}_C(M) = \operatorname{rank}_q \Lambda_C$.
If, in the construction before, the open ball $B_\varepsilon(\varphi(q))$ is chosen instead of the compact ball $\overline{B}_\varepsilon(\varphi(q))$, then $U := \varphi^{-1}(B_\varepsilon(\varphi(q)))$ is open in $M$, i.e., $U \in \mathcal{U}_q$ is a neighborhood of $q$ open in $\hat{P}$, i.e., $q \in U \subseteq M$ with $U$ open. $\hat{\Lambda}_C|_M$ is an embedding, especially a homeomorphism onto the image, i.e., $\hat{\Lambda}_C(U)$ is open in $\hat{\Lambda}_C(M)$. The lemma before showed that $\hat{\Lambda}_C(M) \subseteq S^{n-1}$ is an embedded submanifold.
Moreover, an open set O open N of a manifold N is again a manifold with d i m O = d i m N (Note 5b). The inclusion ι : O N is the identity in proper charts ( U , φ ) of N and corresponding charts ( O , φ | O ) of O (w.l.o.g O U ), and thus, it is an immersion. With O open N , the inclusion is also a homeomorphism onto O = ι ( O ) . In summary, ι : O N is an embedding.
With Λ ̂ C ( U ) o p e n Λ ̂ C ( M ) , the latter proves that Λ ̂ C ( U ) is an embedded submanifold of the embedded submanifold Λ ̂ C M S n 1 (Lemma 8). As a consequence, Λ ̂ C ( U ) is an embedded submanifold of S n 1 (Note 8e). Furthermore, d i m Λ ̂ C ( U ) = d i m Λ ̂ C ( M ) = r a n k q Λ ̂ C . Since the overall construction is valid for each p P ̂ the following has been proven:
Note 16. 
For each  p P ̂  there exists an open neighborhood of  p , i.e., p U o p e n P ̂ , such that  Λ ̂ C | U  is an embedding, and  Λ ̂ C U S n 1  is an embedded submanifold with  d i m Λ ̂ C ( U ) = r a n k q Λ ̂ C .
Both, U and Λ ̂ C ( U ) are differentiable manifolds without boundary of the same dimension. Moreover, by construction Λ ̂ C | U : U Λ ̂ C ( U ) is differentiable with bijective differential d Λ ̂ C | U p . The Inverse Function Theorem (Theorem 2) then implies the following:
Note 17. 
For each  p P ̂  there exists an open neighborhood  U p  of  p , i.e., p U p o p e n P ̂ , such that  Λ ̂ C | U p : U p Λ ̂ C ( U p )  is a diffeomorphism.

5.3. An Attempt to Extend Locally Embedded Circuit Manifolds

Many local topological properties, i.e., properties valid in a neighborhood of a point, can be extended to connected components. For example, a function that is locally constant, is constant on each connected component. Central for the corresponding proofs is Lemma 2. In the case of locally constant functions, its use is as follows:
Let $X$ be a connected topological space, and let $f : X \to \mathbb{R}$ be locally constant, i.e., each point $x \in X$ has an open neighborhood $x \in U_x \subseteq X$ and a number $c_x \in \mathbb{R}$ such that $f|_{U_x} = c_x$. Furthermore, if two such neighborhoods $U_x, U_y$ intersect, the corresponding numbers $c_x, c_y$ are equal: for $z \in U_x \cap U_y$ it is $f(z) = c_x$ and $f(z) = c_y$, i.e., $c_x = c_y$ and, thus, $f|_{U_x} = f|_{U_y}$. Let $\mathcal{U} = \{ U_x \mid x \in X \}$ be an open cover consisting of such neighborhoods, and choose two arbitrary points $a, b \in X$. According to Lemma 2, there is an open chain $\{ U_1, \dots, U_n \} \subseteq \mathcal{U}$ connecting $a$ and $b$. Because $U_i \cap U_{i+1} \neq \emptyset$ it is $f|_{U_i} = f|_{U_{i+1}}$, which implies that $f|_{U_i} = f|_{U_j}$ for any $1 \le i, j \le n$. Thus, $f|_{U_1} = f|_{U_n}$, i.e., $f(a) = f(b)$. Because $a, b$ are arbitrary points from $X$, $f$ is constant.
In our context, the question at hand is whether the rank of a differentiable map (which can locally only increase according to Lemma 4) can globally only increase on a connected component. In order to mimic the proof before, we make the following assumption on the parameter space.
Assumption 2. 
For each $q \in P$ and each $1 \le r \le k$ let $\bar{P}_{q,r} := \mathbb{R}^r_q \cap P$ be connected; especially, $q \in \bar{P}_{q,r}$.
More precisely, it suffices that P ¯ q , r has a finite number of connected components, and for the construction that follows, the connected component containing q is chosen. Thus, w.l.o.g., we can assume that P ¯ q , r is connected. This is directly the case in many practical situations, e.g., if P is an appropriate (see the discussion at the end of Section 4.2) semi-open hypercube (see Figure 21). Similarly, a k -dimensional torus is an example of a parameter space P where P ¯ q , r consists of more than one but a finite number of connected components.
Let q P ¯ q , r q with r q : = r a n k q Λ C . According to the construction of Section 5.1, there exists an open neighborhood of q with q U q o p e n P ¯ q , r q such that Λ C | U q is an immersion (Lemma 15), i.e., Λ C | U q has constant rank r q in U q .
For x , y P ¯ q , r q choose U x , U y P as before, such that Λ C | U x has constant rank r x in U x and that Λ C | U y has constant rank r y in U y . Note that x U x o p e n P ¯ x , r x but not necessarily U x o p e n P ¯ q , r q or U x o p e n P , similar for U y . Figure 22 depicts this situation where P ¯ q , r q is two-dimensional, U x one-dimensional, and U y three-dimensional. Thus, it cannot be guaranteed that an open cover of P ¯ q , r q exists for which the rank of Λ C | U is constant for a member U of the open cover. Consequently, even if a P ¯ q , r is connected, it cannot be guaranteed via an argument involving Lemma 2 that the rank is constant on all of P ¯ q , r .
There is even a counterexample:
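Let $f : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x^2, y^2)$. Then $df_{(x,y)} = \begin{pmatrix} 2x & 0 \\ 0 & 2y \end{pmatrix}$. The rank of $df_{(x,y)}$ is as follows:
  • Let $(x, y) = (0, 0)$. Then $df_{(0,0)} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \Rightarrow \operatorname{rank}_{(0,0)} f = 0$
  • Let $y = 0 \wedge x \neq 0$. Then $df_{(x,0)} = \begin{pmatrix} 2x & 0 \\ 0 & 0 \end{pmatrix} \Rightarrow \operatorname{rank}_{(x,0)} f = 1$
  • Let $y \neq 0 \wedge x = 0$. Then $df_{(0,y)} = \begin{pmatrix} 0 & 0 \\ 0 & 2y \end{pmatrix} \Rightarrow \operatorname{rank}_{(0,y)} f = 1$
  • Let $y \neq 0 \wedge x \neq 0$. Then $df_{(x,y)} = \begin{pmatrix} 2x & 0 \\ 0 & 2y \end{pmatrix} \Rightarrow \operatorname{rank}_{(x,y)} f = 2$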
Figure 23 shows the landscape of the ranks of this map $f$.
f has rank 0 at the origin (red dot), and rank 1 at the x-axis and the y-axis (the green and blue lines without the origin). Each point of the rest of the plane (gray shaded area) has rank 2. For the point q shown, it is r a n k q f = 2 = r q . With a properly chosen parameter space P (e.g., the dark gray shaded rectangle), the set P ¯ q , 2 is as depicted. P ¯ q , 2 is a manifold without boundary, P ¯ q , 2 is connected, and d i m P ¯ q , 2 = 2 ; i.e., P ¯ q , 2 satisfies Assumptions 1 and 2. However, the rank of f is not constant on P ¯ q , 2 .
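The rank landscape of this counterexample is easy to reproduce. In the minimal sketch below, the function name `rank_df` and the sample points are our own choices; since $df_{(x,y)}$ is diagonal, its rank is simply the number of non-zero diagonal entries.

```python
def rank_df(x, y):
    """Rank of df_(x,y) = diag(2x, 2y): number of non-zero diagonal entries."""
    return int(x != 0) + int(y != 0)

# One sample point for each of the four cases
landscape = {point: rank_df(*point)
             for point in [(0, 0), (1, 0), (0, -2), (1, 3)]}

# Ranks along the straight segment from (1, 1) to (-1, 1), which crosses
# the y-axis at (0, 1)
segment = [rank_df(1 - 0.5 * i, 1) for i in range(5)]

print(landscape)  # {(0, 0): 0, (1, 0): 1, (0, -2): 1, (1, 3): 2}
print(segment)    # [2, 2, 1, 2, 2]
```

The segment lies in a connected set, yet the rank drops from 2 to 1 where it crosses an axis, which is the behavior the counterexample is meant to exhibit.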

5.4. Determining “Large” Circuit Manifolds

The counterexample before (and Figure 23) indicates that within a set $\bar{P}_{q,r_q}$ “large” areas $A$ exist in which the rank $\operatorname{rank} \Lambda_C|_A = r_q$ of $\Lambda_C|_A$ is constant, i.e., $\Lambda_C|_A$ is an immersion. For example, with $\mathcal{U} = \{ U_x \subseteq_{\mathrm{open}} \bar{P}_{q,r_q} \mid \operatorname{rank} \Lambda_C|_{U_x} = r_q \}$ define $A := \bigcup_{U \in \mathcal{U}} U$; $A$ is then such a “large” area.
Locally, $\Lambda_C|_A$ is also injective (see the proof of Lemma 16). Thus, $\Lambda_C(A)$ is locally an immersed submanifold of $S^{n-1}$. The domain of injectivity can be extended in concrete situations by analyzing $\Lambda_C(A)$. Here is an example of such an analysis:
Example:
Let $P = \mathbb{R}$ and let $C(x) = e^{ixH}$ for a Hermitian matrix $H$. A set $A \subseteq \mathbb{R}$ has to be determined such that $C(x)$ is injective on $A$. Since every Hermitian matrix is normal by definition ([34] Definition 18.1) and each normal matrix is diagonalizable ([34] Theorem 18.2), $H = T \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\, T^*$ for a unitary matrix $T$ and the real eigenvalues $\lambda_1, \ldots, \lambda_n$ of $H$. Consequently, $e^{ixH} = T \operatorname{diag}(e^{ix\lambda_1}, \ldots, e^{ix\lambda_n})\, T^*$. Finding the set of parameters for which $C(x) = e^{ixH}$ is injective means determining when $e^{ixH} = e^{iyH}$ implies $x = y$. Thus, we need to find $X \subseteq \mathbb{R}$ such that $e^{ixH} = e^{iyH} \Rightarrow x = y$ for $x, y \in X$.
$\operatorname{diag}(e^{ix\lambda_1}, \ldots, e^{ix\lambda_n})$, and thus $e^{ixH}$, may be periodic, which implies non-injectivity: if $\lambda_1, \ldots, \lambda_n \in \mathbb{Q}$, then there exists a $T > 0$ (here a period, not the diagonalizing matrix) such that $e^{iHx} = e^{iH(x+T)}$ for each $x$.
Assume $x \neq y$ with $e^{ixH} = e^{iyH}$. Then, $e^{i(x-y)H} = I$. Thus, the spectrum of $(x-y)H$ must be in $2\pi\mathbb{Z}$, i.e., $\lambda_j (x-y) \in 2\pi\mathbb{Z}$ for each eigenvalue $\lambda_j$ of $H$. If the spectrum of $H$ fulfills
$$\nexists\, T > 0 : \sigma(H) = \{\lambda_1, \ldots, \lambda_n\} \subseteq \frac{2\pi}{T}\,\mathbb{Z} \tag{15}$$
then $e^{i(x-y)H} \neq I$, i.e., $e^{ixH} \neq e^{iyH}$: $C(x)$ is injective. Thus, if $H$ fulfills condition (15), $C$ and, thus, $\Lambda_C$ are injective on all of $\mathbb{R} = P$.
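The role of condition (15) can be illustrated numerically. The sketch below assumes NumPy and Python 3.9 or later. The helpers `circuit` and `period_of_rational_spectrum` and both example matrices are hypothetical. The rational eigenvalues are passed as exact `Fraction` values, because rationality cannot be decided reliably from floating-point numbers.

```python
from fractions import Fraction
from math import gcd, lcm, pi

import numpy as np


def circuit(x: float, H: np.ndarray) -> np.ndarray:
    """C(x) = exp(i x H), computed from the decomposition H = T diag(lambda) T*."""
    eigenvalues, T = np.linalg.eigh(H)
    return (T * np.exp(1j * x * eigenvalues)) @ T.conj().T


def period_of_rational_spectrum(eigenvalues: list[Fraction]) -> float:
    """Smallest T > 0 with lambda_j * T in 2*pi*Z for all (rational) lambda_j."""
    nonzero = [lam for lam in eigenvalues if lam != 0]
    if not nonzero:
        raise ValueError("H = 0 gives a constant circuit; every T is a period")
    common_den = lcm(*(lam.denominator for lam in nonzero))
    numerators = [int(lam * common_den) for lam in nonzero]
    spacing = Fraction(gcd(*numerators), common_den)  # spectrum lies in spacing * Z
    return 2 * pi / float(spacing)


# Rational spectrum {1, 2}: condition (15) is violated and C is periodic.
H_periodic = np.array([[1.5, 0.5], [0.5, 1.5]])  # eigenvalues 1 and 2
T_period = period_of_rational_spectrum([Fraction(1), Fraction(2)])
x = 0.3
same_after_period = np.allclose(circuit(x, H_periodic),
                                circuit(x + T_period, H_periodic))

# Spectrum {1, sqrt(2)}: the eigenvalue 1 forces x - y into 2*pi*Z, so only
# these differences are candidates for a collision; none of them gives identity.
H_aperiodic = np.diag([1.0, np.sqrt(2.0)])
collisions = [k for k in range(1, 21)
              if np.allclose(circuit(2 * pi * k, H_aperiodic), np.eye(2))]
print(T_period, same_after_period, collisions)
```

The finite scan in the second part is only a sanity check. Injectivity for the spectrum $\{1, \sqrt{2}\}$ follows from the argument above, not from the sampled values.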
This example can be abstracted to a high-level procedure for determining “large” circuit manifolds. First, the matrix $\Lambda_C$ has to be computed based on the ansatz $C$; this matrix is used in step (2). For an ansatz like in Equation (5), often used in practice, this is straightforward; otherwise, the steps that proved Lemma 1 in Section 3.1 must be followed. Then:
  1. Fix a point $q \in P$.
  2. Determine $r_q := \operatorname{rank}_q \Lambda_C$.
  3. Determine the slice $\bar{P}_{q,r_q}$.
  4. Determine $\mathcal{U} = \{ U \subseteq_{\mathrm{open}} \bar{P}_{q,r_q} \mid \operatorname{rank} \Lambda_C|_U = r_q \}$.
  5. Build $A = \bigcup_{U \in \mathcal{U}} U \subseteq_{\mathrm{open}} \bar{P}_{q,r_q}$.
  6. Determine the maximal subset $G \subseteq A$ such that $\Lambda_C|_G$ is injective.
  7. Determine a maximal subset $K \subseteq_{\mathrm{compact}} G$.
  8. $\Lambda_C(K) \subseteq S^{n-1}$ is an embedded submanifold.
Steps (4) and (5) determine a maximal open set $A \subseteq_{\mathrm{open}} \bar{P}_{q,r_q}$ such that $\Lambda_C|_A$ has constant rank $r_q = \dim \bar{P}_{q,r_q}$, i.e., $\Lambda_C|_A$ is an immersion. Since $\Lambda_C|_A$ is also locally injective (proof of Lemma 16), step (6) succeeds but is highly dependent on the ansatz $C$ (see the example before). After step (6) we know that $\Lambda_C|_G$ is an injective immersion, i.e., $\Lambda_C(G)$ is an immersed submanifold of $S^{n-1}$ (Definition 13). After step (7) we know that $\Lambda_C(K) \subseteq S^{n-1}$ is an embedded submanifold (Lemma 6, Lemma 8).
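The rank-related steps (1), (2), (4) and (5) can be approximated numerically on a grid. The sketch below assumes NumPy and uses a hypothetical one-qubit ansatz $RZ(\theta_3)\,RY(\theta_2)\,RZ(\theta_1)|0\rangle$, written as a real vector in $\mathbb{R}^4$, as a stand-in for $\Lambda_C$. The finite-difference Jacobian, the tolerance, and the grid are illustrative choices. The code does not construct the slice $\bar{P}_{q,r_q}$ itself and does not attempt the ansatz-dependent steps (6) and (7).

```python
import numpy as np


def rz(t: float) -> np.ndarray:
    return np.array([[np.exp(-0.5j * t), 0.0], [0.0, np.exp(0.5j * t)]])


def ry(t: float) -> np.ndarray:
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def lambda_c(theta: np.ndarray) -> np.ndarray:
    """Hypothetical ansatz RZ(t3) RY(t2) RZ(t1)|0>, as a real vector in R^4."""
    zero = np.array([1.0, 0.0], dtype=complex)
    psi = rz(theta[2]) @ ry(theta[1]) @ rz(theta[0]) @ zero
    return np.concatenate([psi.real, psi.imag])


def jacobian(f, theta: np.ndarray, h: float = 1e-6) -> np.ndarray:
    """Central-difference Jacobian of f at theta (columns = partial derivatives)."""
    theta = np.asarray(theta, dtype=float)
    columns = []
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = h
        columns.append((f(theta + step) - f(theta - step)) / (2 * h))
    return np.stack(columns, axis=1)


def rank_at(theta: np.ndarray, tol: float = 1e-6) -> int:
    """Numerical rank of the Jacobian of lambda_c at theta."""
    return int(np.linalg.matrix_rank(jacobian(lambda_c, theta), tol=tol))


# Steps (1) and (2): fix a point q and determine r_q.
q = np.array([0.3, 1.0, 0.7])
r_q = rank_at(q)

# Steps (4) and (5), discretized: scan theta_2 on a grid through q and keep
# the grid points at which the rank equals r_q.
grid_t2 = np.linspace(0.0, np.pi, 9)
area = [t2 for t2 in grid_t2 if rank_at(np.array([0.3, t2, 0.7])) == r_q]
singular = [t2 for t2 in grid_t2 if t2 not in area]
print(r_q, len(area), singular)
```

For this toy ansatz the rank drops at $\theta_2 = 0$ and $\theta_2 = \pi$, where $\theta_1$ and $\theta_3$ change the state in the same or the opposite direction. These grid points are excluded from the approximated area.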
In [53], an open-source package called QMetric is introduced that computes several metrics (especially expressivity) of a parameterized quantum circuit. A description of how expressivity, according to the unitary approach, is measured is given in [31].

6. Conclusions and Future Work

The literature about the expressivity of parameterized quantum circuits requires a lot of background in differential topology, which makes it hard to comprehend. This contribution provides the corresponding background in a single place with detailed references to textbooks and seminal papers, which may be consulted for a much deeper dive into the domain. Similarly, statements about properties of parameterized quantum circuits are often not proved in the literature, or a proof is only indicated. This contribution provides proofs of statements about dimensional expressivity (to be precise: proofs of local versions of such properties). Also, counterexamples are given, and limits are pointed out (e.g., by highlighting the importance of singularities).
Section 5 clearly reveals the need for future work: conditions under which local embeddings can be extended to global embeddings are needed for parameterized quantum circuits, especially for circuits of the practically relevant form of Equation (5). Furthermore, the practical applicability of the procedure sketched in Section 5.4 must be evaluated.

Author Contributions

Writing—original draft, F.L.; Writing—review and editing, J.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

At the end of Section 4.5 it is shown that $df_p(v) = Df(p)(v) = D_v f(p)$, i.e., the differential applied to $v$ is the directional derivative of $f$ in the direction of $v$. By definition, the directional derivative is $D_v f(p) := \lim_{t \to 0} \frac{f(p + tv) - f(p)}{t}$, i.e., $df_p(v) = D_v f(p) = \lim_{t \to 0} \frac{f(p + tv) - f(p)}{t}$.
Thus, for $f(X) := X^* X$ we obtain the following:
$$\begin{aligned}
df_A(V) &= \lim_{t \to 0} \frac{f(A + tV) - f(A)}{t} \\
&= \lim_{t \to 0} \frac{(A + tV)^* (A + tV) - A^* A}{t} \\
&= \lim_{t \to 0} \frac{(A^* + tV^*)(A + tV) - A^* A}{t} \\
&= \lim_{t \to 0} \frac{A^* A + t A^* V + t V^* A + t^2 V^* V - A^* A}{t} \\
&= \lim_{t \to 0} \frac{t\,(A^* V + V^* A + t V^* V)}{t} \\
&= \lim_{t \to 0} \left( A^* V + V^* A + t V^* V \right) \\
&= A^* V + V^* A
\end{aligned}$$
This proves the claim $df_A(V) = A^* V + V^* A$.
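The formula can be cross-checked with a difference quotient. The following is a minimal sketch, assuming NumPy; the random matrices `A` and `V`, the seed, and the step size `t` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)


def random_complex(n: int) -> np.ndarray:
    """Random complex n x n matrix (illustrative test data)."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


def f(X: np.ndarray) -> np.ndarray:
    """f(X) = X* X, with X* the conjugate transpose."""
    return X.conj().T @ X


A = random_complex(3)
V = random_complex(3)

t = 1e-6
difference_quotient = (f(A + t * V) - f(A)) / t
closed_form = A.conj().T @ V + V.conj().T @ A

# By the derivation above, the two differ exactly by the remainder t * V* V.
max_error = np.max(np.abs(difference_quotient - closed_form))
print(max_error)
```

The deviation is of the order of `t`, in line with the remainder term $t V^* V$ that vanishes in the limit.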

Appendix B

We sketch a proof that $f: \,]-\pi, +\pi[\, \to \mathbb{R}^2$, $x \mapsto (\sin 2x, \sin x)$ is injective. Assume $s, t \in \,]-\pi, +\pi[$ and $f(s) = f(t)$. Thus:
$$\begin{aligned}
& \sin 2s = \sin 2t \;\wedge\; \sin s = \sin t \\
\overset{(A)}{\Rightarrow}\quad & 2 \sin s \cos s = 2 \sin t \cos t \;\wedge\; \sin s = \sin t \\
\overset{(B)}{\Rightarrow}\quad & \sin s \cos s = \sin t \cos t \;\wedge\; \sin s = \sin t \\
\overset{(C)}{\Rightarrow}\quad & \cos s = \cos t \;\wedge\; \sin s = \sin t
\end{aligned}$$
(A) holds because of $\sin 2x = 2 \sin x \cos x$, (B) is the division by 2, and (C) holds because of $\sin s = \sin t$ and division by $\sin s$, where w.l.o.g. $\sin s \neq 0$ (otherwise $s = t = 0$ because of $s, t \in \,]-\pi, +\pi[$, which implies injectivity).
Next, the injectivity of both $\sin: \,]-\pi/2, +\pi/2[\, \to \mathbb{R}$ and $\cos: \,]0, +\pi[\, \to \mathbb{R}$ is used. Suppose $s \neq t$. Then $\cos s = \cos t$ for $t \in \,]0, \pi[$ implies that $s \in \,]-\pi, 0]$; and vice versa, $t \in \,]-\pi, 0[$ implies that $s \in [0, \pi[$. Likewise, $\sin s = \sin t$ for $t \in \,]-\pi/2, +\pi/2[$ implies $s \leq -\pi/2$ or $s \geq \pi/2$.
Now, a case distinction is performed. Case 1: $t \in \,]0, \pi/2[$ implies both $s \in \,]-\pi, 0]$ and $s \leq -\pi/2$. But for $t \in \,]0, \pi/2[$ and $s \in \,]-\pi, -\pi/2]$ we have $\sin t > 0 > \sin s$, hence $\sin t \neq \sin s$, which contradicts $\sin s = \sin t$. The other cases follow analogously.
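A numerical sanity check, which does not replace the proof, is to sample the open interval and confirm that distinct sample points have distinct images. The sketch below assumes NumPy; the margin `eps` and the number of sample points are arbitrary illustrative choices.

```python
import numpy as np


def f(x):
    """The curve x -> (sin 2x, sin x)."""
    return np.stack([np.sin(2 * x), np.sin(x)], axis=-1)


eps = 0.01
xs = np.linspace(-np.pi + eps, np.pi - eps, 629)
points = f(xs)  # shape (629, 2)

# Pairwise Euclidean distances between all sampled image points.
diff = points[:, None, :] - points[None, :, :]
dist = np.linalg.norm(diff, axis=-1)
np.fill_diagonal(dist, np.inf)  # ignore the distance of a point to itself
min_distance = dist.min()

# Image points near the ends of the interval approach f(0) = (0, 0).
gap_to_origin = np.linalg.norm(f(np.pi - eps) - f(0.0))
print(min_distance, gap_to_origin)
```

No two sampled points collide, but the images of parameters close to $\pm\pi$ come arbitrarily close to $f(0) = (0, 0)$ as `eps` shrinks, since $f(x) \to (0, 0)$ for $x \to \pm\pi$.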

References

  1. Nielsen, M.A.; Chuang, I.L. Quantum Computation and Quantum Information; Cambridge University Press: Cambridge, UK, 2016.
  2. Cerezo, M.; Arrasmith, A.; Babbush, R.; Benjamin, S.C.; Endo, S.; Fujii, K.; McClean, J.R.; Mitarai, K.; Yuan, X.; Cincio, L.; et al. Variational Quantum Algorithms. Nat. Rev. Phys. 2021, 3, 625–644.
  3. Preskill, J. Quantum Computing in the NISQ era and beyond. Quantum 2018, 2, 79.
  4. Huang, H.-Y.; Broughton, M.; Mohseni, M.; Babbush, R.; Boixo, S.; Neven, H.; McClean, J.R. Power of data in quantum machine learning. Nat. Commun. 2021, 12, 2631.
  5. Beer, K.; Bondarenko, D.; Farrelly, T.; Osborne, T.J.; Salzmann, R.; Wolf, R. Efficient Learning for Deep Quantum Neural Networks. Nat. Commun. 2020, 11, 808.
  6. Sim, S.; Johnson, P.D.; Aspuru-Guzik, A. Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms. Adv. Quantum Technol. 2019, 2, 1900070.
  7. Funke, L.; Hartung, T.; Jansen, K.; Kühn, S.; Stornati, P. Dimensional Expressivity Analysis of Parametric Quantum Circuits. Quantum 2021, 5, 422.
  8. McClean, J.R.; Romero, J.; Babbush, R.; Aspuru-Guzik, A. The theory of variational hybrid quantum-classical algorithms. New J. Phys. 2016, 18, 023023.
  9. Blekos, K.; Brand, D.; Ceschini, A.; Chou, C.-H.; Li, R.-H.; Pandya, K.; Summer, A. A Review on Quantum Approximate Optimization Algorithm and its Variants. Phys. Rep. 2024, 1068, 1–66.
  10. Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: Berlin/Heidelberg, Germany, 2006.
  11. Xu, W.; Zhou, R.-G.; Li, Y.; Zhang, X. Towards an efficient variational quantum algorithm for solving linear equations. Commun. Theor. Phys. 2024, 76, 115103.
  12. Rodríguez, G.; Alonso, M.; García-Santiago, X.; Gil, G.B. Energy Distribution Optimization Using Variational Quantum Algorithms; Galicia Institute of Technology (ITG): A Coruña, Spain, 2024.
  13. Jaschk, D.; Jaksch, D.; Givi, P.; Daley, A.J.; Rung, T. Variational Quantum Algorithms for Computational Fluid Dynamics. AIAA J. 2023, 61, 1885–1894.
  14. Leymann, F.; Barzen, J. The Bitter Truth About Gate-Based Quantum Algorithms in the NISQ Era; Quantum Science and Technology; IOP Publishing Ltd.: Bristol, UK, 2020.
  15. Weder, B.; Barzen, J.; Leymann, F.; Zimmermann, M. Hybrid Quantum Applications Need Two Orchestrations in Superposition: A Software Architecture Perspective. In Proceedings of the 18th IEEE International Conference on Web Services, Madrid, Spain, 20–24 April 2009; Association for Computing Machinery: New York, NY, USA, 2021.
  16. Truger, F.; Barzen, J.; Leymann, F.; Obst, J. Warm-Starting the VQE with Approximate Complex Amplitude Encoding. In Proceedings of the 1st International Conference on Quantum Software (IQSOFT 2025), Bilbao, Spain, 10–12 June 2025.
  17. Macaluso, A.; Clissa, L.; Lodi, S.; Sartori, C. A Variational Algorithm for Quantum Neural Networks. Comput. Sci.-ICCS 2020 2020, 12142, 591–604.
  18. Morales, M.E.S.; Biamonte, J.D.; Zimborás, Z. On the Universality of the Quantum Approximate Optimization Algorithm. Quantum Inf. Process. 2020, 19, 291.
  19. Holmes, Z.; Sharma, K.; Cerezo, M.; Coles, P.J. Connecting ansatz expressibility to gradient magnitudes and barren plateaus. PRX Quantum 2022, 3, 010313.
  20. Bengtsson, I.; Zyczkowski, K. Geometry of Quantum States, 2nd ed.; Cambridge University Press: Cambridge, UK, 2017.
  21. Gil-Fuster, E.; Eisert, J.; Dunjko, V. On the expressivity of embedding quantum kernels. Mach. Learn. Sci. Technol. 2024, 5, 025003.
  22. Gorodski, C. Smooth Manifolds; Springer: Berlin/Heidelberg, Germany, 2020.
  23. Lee, J.M. Introduction to Smooth Manifolds, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2013.
  24. Du, Y.; Tu, Z.; Yuan, X.; Tao, D. Efficient Measure for the Expressivity of Variational Quantum Algorithms. Phys. Rev. Lett. 2022, 128, 080506.
  25. Ghosh, T.; Mandal, A.; Banerjee, S.; Mukherjee, N.; Panigrahi, P.K. Lower bound of the expressibility of ansatzes for Variational Quantum Algorithms. arXiv 2025, arXiv:2311.01330v2.
  26. Yao, J. Learning to Maximize Quantum Neural Network Expressivity via Effective Rank. arXiv 2025, arXiv:2506.15375v2.
  27. Liu, J.; Yuan, H.; Lu, X.-M.; Wang, X. Quantum Fisher information matrix and multiparameter estimation. J. Phys. A Math. Theor. 2020, 53, 023001.
  28. Yu, Z.; Chen, Q.; Jiao, Y.; Li, Y.; Lu, X.; Wang, X.; Yang, J.Z. Non-asymptotic Approximation Error Bounds of Parameterized Quantum Circuits. arXiv 2024, arXiv:2310.07528v2.
  29. Mhiri, H.; Monbroussou, L.; Herrero-Gonzalez, M.; Thabet, S.; Kashefi, E.; Landman, J. Constrained and Vanishing Expressivity of Quantum Fourier Models. arXiv 2024, arXiv:2403.09417v1.
  30. Azado, P.C.; Correr, G.I.; Drinko, A.; Medina, I.; Canabarro, A.; Soares-Pinto, D.O. Expressibility, Entangling Power and Quantum Average Causal Effect for Causally Indefinite Circuits. Phys. Rev. A 2025, 111, 042620.
  31. Liu, Y.; Kaneko, K.; Baba, K.; Koyama, J.; Kimura, K.; Takeda, N. Analysis of Parameterized Quantum Circuits: On the Connection Between Expressibility and Types of Quantum Gates. arXiv 2024, arXiv:2408.01036v2.
  32. Correr, G.I.; Medina, I.; Azado, P.C.; Drinko, A.; Soares-Pinto, D.O. Characterizing Randomness in Parameterized Quantum Circuits Through Expressibility. arXiv 2025, arXiv:2405.02265v2.
  33. Ragone, M.; Bakalov, B.N.; Sauvage, F.; Kemper, A.F.; Marrero, C.O.; Larocca, M.; Cerezo, M. A Lie Algebraic Theory of Barren Plateaus for Deep Parameterized Quantum Circuits. Nat. Commun. 2024, 15, 7172.
  34. Liesen, J.; Mehrmann, V. Linear Algebra; Springer: Berlin/Heidelberg, Germany, 2015.
  35. Parthasarathy, K. Topology; Springer: Berlin/Heidelberg, Germany, 2022.
  36. von Querenburg, B. Mengentheoretische Topologie; Springer: Berlin/Heidelberg, Germany, 2001. (In German)
  37. Hirsch, M.W. Differential Topology; Springer: Berlin/Heidelberg, Germany, 1976.
  38. Lopez, R. Point-Set Topology; Springer: Berlin/Heidelberg, Germany, 2024.
  39. Lee, J.M. Introduction to Riemannian Manifolds, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2018; p. S21.
  40. Lueck, W.; Macko, T. Surgery Theory; Springer: Berlin/Heidelberg, Germany, 2024.
  41. Arnold, V.I. Catastrophe Theory; Springer: Berlin/Heidelberg, Germany, 1984.
  42. Bröcker, T.; Lander, L. Differentiable Germs and Catastrophes; Cambridge University Press: Cambridge, UK, 1975.
  43. Joyce, D. A generalization of manifolds with corners. Adv. Math. 2016, 299, 760–862.
  44. Whitney, H. Differentiable Manifolds. Ann. Math. 1936, 37, 3.
  45. Adhikari, A.; Adhikari, M.R. Basic Topology 2; Springer: Berlin/Heidelberg, Germany, 2022.
  46. Riemann, B. Ueber die Hypothesen, welche der Geometrie zu Grunde liegen. In Abhandlungen der Königlichen Gesellschaft der Wissenschaften zu Göttingen; Dieterichsche Buchhandlung: Mainz, Germany, 1868; Volume 13, Available online: https://www.emis.de/classics/Riemann/Geom.pdf (accessed on 31 August 2025). English Translation: On the Hypotheses Which Lie at the Bases of Geometry. Available online: https://www.maths.tcd.ie/pub/HistMath/People/Riemann/Geom/WKCGeom.html (accessed on 31 August 2025).
  47. Tu, L.W. An Introduction to Manifolds, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2011.
  48. San Martin, L.A.B. Lie Groups; Springer: Berlin/Heidelberg, Germany, 2021.
  49. Koch, R. Compact Lie Groups; University of Oregon: Eugene, OR, USA, 2022; p. S21.
  50. Axler, S. Measure, Integration & Real Analysis; Springer: Berlin/Heidelberg, Germany, 2020.
  51. Grigoryan, A. Analysis on Manifolds; University of Bielefeld: Bielefeld, Germany, 2024.
  52. Alexandrino, M.M.; Bettiol, R.G. Introduction to Lie groups, isometric and adjoint actions and some generalizations. arXiv 2010, arXiv:0901.2374v3.
  53. Illésová, S.; Rybotycki, T.; Beseda, M. QMetric: Benchmarking Quantum Neural Networks Across Circuits, Features, and Training Dimensions. arXiv 2025, arXiv:2506.23765v2.
Figure 1. Structure of a VQA.
Figure 2. Two Approaches to Assess Expressivity.
Figure 3. (a) Completeness and (b) Expressiveness of an Ansatz: The Unitary Approach.
Figure 4. Expressiveness of an Ansatz: The State Approach.
Figure 5. Misleading Intuition of Dimensional Expressivity.
Figure 6. Parameterized Quantum Circuit as a Smooth Map.
Figure 7. Chain of Open Sets Connecting Two Points.
Figure 8. Charts and Transition Functions of a Differentiable Manifold With Boundary.
Figure 9. A Product Manifold Q With Boundary.
Figure 10. Vertices in Rectangles are Singularities.
Figure 11. More Singularities.
Figure 12. Differentiability of a Map Between Differentiable Manifolds.
Figure 13. Tangent space and Differential.
Figure 14. A Chart of an Embedded Submanifold.
Figure 15. The Graph of a Map as Manifold.
Figure 16. Level-Sets of a Differentiable Map.
Figure 17. An Injective Immersion That is No Topological Embedding.
Figure 18. Linear Approximation of a Differentiable Manifold.
Figure 21. Slicing the Parameter Space.
Figure 22. Neighborhoods of Constant Rank of $\Lambda_C$.
Figure 23. Landscape of Ranks of $f(x,y) = (x^2, y^2)$.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Barzen, J.; Leymann, F. On the Differential Topology of Expressivity of Parameterized Quantum Circuits. AppliedMath 2025, 5, 121. https://doi.org/10.3390/appliedmath5030121
