Deriving Vocal Fold Oscillation Information from Recorded Voice Signals Using Models of Phonation

During phonation, the vocal folds exhibit a self-sustained oscillatory motion, which is influenced by the physical properties of the speaker’s vocal folds and driven by the balance of bio-mechanical and aerodynamic forces across the glottis. Subtle changes in the speaker’s physical state can affect voice production and alter these oscillatory patterns. Measuring these can be valuable in developing computational tools that analyze voice to infer the speaker’s state. Traditionally, vocal fold oscillations (VFOs) are measured directly using physical devices in clinical settings. In this paper, we propose a novel analysis-by-synthesis approach that allows us to infer the VFOs directly from recorded speech signals on an individualized, speaker-by-speaker basis. The approach, called the ADLES-VFT algorithm, is proposed in the context of a joint model that combines a phonation model (with a glottal flow waveform as the output) and a vocal tract acoustic wave propagation model such that the output of the joint model is an estimated waveform. The ADLES-VFT algorithm is a forward-backward algorithm which minimizes the error between the recorded waveform and the output of this joint model to estimate its parameters. Once estimated, these parameter values are used in conjunction with a phonation model to obtain its solutions. Since the parameters correlate with the physical properties of the vocal folds of the speaker, model solutions obtained using them represent the individualized VFOs for each speaker. The approach is flexible and can be applied to various phonation models. In addition to presenting the methodology, we show how the VFOs can be quantified from a dynamical systems perspective for classification purposes. Mathematical derivations are provided in an appendix for better readability.


Introduction
Phonation is a complex bio-mechanical process wherein the glottal airflow, mediated by the muscles in the larynx and driven by an intricate balance of aerodynamic and mechanical forces across the glottis, maintains the vocal folds in a state of self-sustained vibration [1,2]. During this process, depending on the physical state of the vocal folds, their eigenmodes of vibration synchronize, or strive to do so. This self-sustained motion of the vocal folds during phonation is highly sensitive to perturbations caused by the many possible factors that may affect the speaker during speech production. In recent years, there has been a surge of interest in building voice-based diagnostic aids: computational models based on artificial intelligence and machine learning that can infer the speaker's state (and thereby the factors that are affecting the speaker) from voice. Such applications can benefit greatly from being able to deduce the fine-level nuances in the motion of the vocal folds, and to measure the response to various perturbing factors in order to infer their nature. However, doing this using traditional methods on an individual basis for each speaker is very difficult. Traditional methods to observe and record vocal fold oscillations (VFOs) are based on actual physical measurements taken using various instruments in clinical settings, e.g., [3][4][5].
The primary focus of this paper is to derive the VFOs for a speaker directly from recorded voice signals, alleviating the need for taking physical measurements. The solution we propose is an analysis-by-synthesis approach, based on physical models of phonation. We propose a methodology to deduce the parameters of a chosen phonation model from a speaker's voice recording, which can then be substituted into a VFO model to obtain speaker-specific solutions, which represent the vocal fold oscillations of the speaker.
In the paragraphs below, we first review some relevant facts about the process of phonation. In the sections that follow, we present our proposed methodology in two stages: in the first, we show how we can infer the parameters of a physics-based model of phonation from measurements of glottal excitation. Since the parameters of such models represent the physical properties of the speaker's vocal folds, using them to obtain solutions to the models gives us an estimate of the speaker's VFOs. We subsequently show how to extend the model to include the physics of mucosal wave propagation in the vocal tract, and propose a forward-backward algorithm to estimate the parameters of the joint model. These can then be used in the corresponding phonation model to obtain its solutions.
Later on, we also explain how the solutions of these models (which are the deduced VFOs) can be characterized from a dynamical systems perspective to derive discriminative information in the form of features that are useful for classification tasks.
In this context, it is important to note at the outset that the models we use to demonstrate our methodology are simple and well-established models in the literature. It is not the goal of this paper to propose new models of phonation, but rather to propose a novel methodology to derive their parameters from speech signals, so that they can be solved to yield VFOs on a speaker-by-speaker basis. The main contribution of this paper lies in the derivation of the parameters of these models, and their individualized VFO solutions. The viability of these solutions is demonstrated experimentally using classification experiments.

The Bio-Mechanical Process of Phonation
By the myoelastic-aerodynamic theory of phonation, the forces in the laryngeal region that initiate and maintain phonation relate to (a) pressure balances and airflow dynamics within the supra-glottal and sub-glottal regions and (b) muscular control within the glottis and the larynx. The balance of forces necessary to cause self-sustained vibrations during phonation is created by physical phenomena such as the Bernoulli effect and the Coandă effect [6][7][8]. Figure 1 illustrates the interaction between these effects that is largely thought to drive the oscillations of the vocal folds.
The process of phonation begins with the closing of the glottis. This closure is voluntary and facilitated by the laryngeal muscles. Once the glottis is closed, the muscles no longer play an active role in sustaining the vibrations. Glottal closure is followed by a contraction of the lungs which pushes out air and causes an increase in pressure just below the glottis. When this subglottal pressure crosses a threshold, the vocal folds are pushed apart, and air rushes out of the narrow glottal opening into the much wider supra-glottal region, creating negative intra-glottal pressure (with reference to atmospheric air pressure) [9].
The exact physics of the airflow through the glottis during phonation is well studied, e.g., [2,[10][11][12][13][14]. The current understanding from these, from the airflow perspective, is that the glottis forms a flow separation plane. The air expansion in this region, together with the low pressure created in the vicinity of the glottis through Coandă-effect-induced entrainment, lowers the pressure close to the glottis and exerts a net downward force on it. At the same time, lowered pressure in the glottal region due to the Bernoulli effect, which ensues from the high-velocity air volume flow through the glottis, exerts a negative force on the glottis. The negative Bernoulli pressure causes elastic recoil, and the glottis begins to close again. The closing reduces the volume flow through the glottis, diminishing the downward forces acting on it. Increased pressure build-up in the sub-glottal region then causes the glottis to open again. This chain of oscillations continues in a self-sustained fashion throughout phonation, until voluntary muscle control intervenes to alter or stop it, or until the respiratory volume of air in the lungs is exhausted.

Figure 1. Schematic of the balance of some key forces through one cycle of the self-sustained vibrations of the vocal folds. Sequential "snapshots" of the cycle are numbered 1-5. Arrows depict the direction and type of forces in the glottal area. The color codes for the arrows depict net forces due to the following: Pink: muscular; Green: Bernoulli effect; Yellow: Coandă effect; Blue: vocal fold elasticity and other factors; Black and Red: air pressure. Lighter shades of each color depict weaker forces. Figure from [9] with permission.
For modeling purposes, we note that phonation is not the only source of excitation of the vocal tract in producing speech sounds, which comprise both voiced and unvoiced sounds. However, phonation is indeed the primary source of excitation of the vocal tract in the production of voiced sounds, wherein the oscillation of the vocal folds modulates the pressure of the airflow to produce a (quasi-)periodic glottal flow wave at a fundamental frequency (the pitch), which in turn results in the occurrence of higher order harmonics. The resultant glottal flow further excites the vocal tract, which comprises the laryngeal cavity, the pharynx, and the oral and nasal cavities, to produce individual sounds. The vocal tract serves as a resonance chamber that produces formants. The identities of the different sounds produced within it are derived from these resonances, which in turn are largely dependent on the configurations of the vocal tract, specified by the time-varying cross-sectional area.
From this perspective, phonation modeling has typically involved the modeling of two sub-processes: the self-sustained vibration of the vocal folds, and the propagation of the resultant pressure wave through the vocal tract [21]. Each sub-process model has associated parameters that determine the model output, given an input.
Following the above division of the process, we identify the two following model types, each modeling one of the sub-processes: (i) Vocal fold oscillation (VFO) models (also called vocal folds models, or oscillation models), and (ii) vocal tract (VT) models.
Each of these has proven to be useful in different contexts. The VT models describe the interaction of the glottal pressure wave with the vocal tract, which in turn has been described in the literature by varied models, such as statistical models, e.g., [25], geometric models, e.g., [26], and biomechanical models, e.g., [27].
In addition, different models are also applied to describe the aero-acoustic interaction of the glottal airflow and the vocal tract. Some examples of these are reflection-type line analog models and transmission line circuit analog models, e.g., [28], hybrid time-frequency domain models, e.g., [29] and finite-element models, e.g., [30].

The Problem of Parameter Estimation
Each of the models mentioned in Section 1.2 includes a set of parameters that determine its state and output, given an input. For instance, given the parameters for a VFO model, the glottal flow waveform can be determined; given the glottal flow waveform as an input, and the parameters for a VT model, the acoustic signal can be determined.
The problem we tackle in this paper is the inverse problem: given a model output, we must estimate the model parameters from it.
This can be of great practical use. For example, with speaker-specific parameter setting, the output of these models can be used as a proxy for the actual vocal fold motion of the corresponding speaker. To obtain the parameters of such models, the traditional solution has been to take actual physical measurements of the vocal fold oscillations, or of the glottal flow using techniques such as high-speed videostroboscopy, as mentioned in Section 1. This is not always feasible.
On the other hand, the inverse problem of parameter estimation of phonation models is quite difficult to solve through purely computational means. For example, in order to estimate the parameters of a vocal tract (VT) model, one must take into account the vocal tract coupling, the effect of the lossy medium that comprises the walls of the vocal tract, lip radiation, etc. Without serious approximations, the inverse problem in this case eventually becomes intractable. Some approaches simplify the solution by discretizing the vocal tract as a sequence of consecutive tubes of varying cross-sectional area, or with a mesh-grid. However, these approximations invariably increase the estimation error. This paper proposes a methodology for solving the inverse problem of phonation models through purely computational means. As mentioned earlier, the methodology follows an analysis-by-synthesis approach. We explain this by first reviewing our previously proposed Adjoint Least Squares Estimation (ADLES) algorithm [31], which estimates the parameters of a VFO model by minimizing the error between a reference glottal flow waveform and the signal generated by the physical VFO model. We then describe our proposed ADLES-VFT algorithm to estimate the parameters of a joint VFO and VT model (also called a body-cover model). Instead of comparing the model-generated excitation at the glottis to a reference glottal flow, the flow is propagated through the vocal tract to generate a signal at the lips, which is compared to a recorded voice signal used as a reference. The proposed algorithm iteratively re-estimates the model parameters by minimizing the error between the reference voice sample and this generated signal.
Once estimated, these parameters are used with the VFO model to generate the speaker's VFOs.

Vocal Folds, Vocal Tract and Joint Models
In this section we describe the VFO and VT models that we use as examples in this paper. We also explain the formulation of the joint model that we ultimately use to solve the inverse problem of parameter estimation.

The VFO Model
A schematic illustration of a general mass-spring oscillator model for the vocal folds is shown in Figure 2. This is used to model the phonation process as described below. Airflow from the lungs, driven by the subglottal pressure P_s, passes through the glottis, and the vocal folds are set into a state of self-sustained vibration, producing the glottal flow u_g, a quasi-periodic waveform. The vibration of the vocal folds is analogous to that of a pair of mass-spring-damper oscillators. Further, the glottal flow resonates in the speaker's vocal and nasal tracts and produces voiced sound.
One-mass models describe the vibration of the vocal folds as that of a single mass-damper-spring oscillator:

    M ẍ + B ẋ + K x = f,

where x is the lateral displacement of a mass M, B and K are the damping and stiffness coefficients, respectively, f is the driving force, and t is time [2]. The driving force is velocity-dependent and can be estimated from Bernoulli's energy law:

    P_g = P_s − (1/2) ρ v²,

where P_g is the mean glottal pressure, P_s is the sub-glottal pressure, ρ is the air density, and v is the air particle velocity. The kinetic pressure in the supra-glottal region is neglected [2]. Other models, namely two-mass, multi-mass and finite element models, can also be used as the basis for the VFO model, and are described briefly in Appendix A for reference.
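As a concrete illustration, the free response of such a one-mass oscillator can be simulated in a few lines. The parameter values below are arbitrary illustrative choices, not physiological constants:

```python
import numpy as np

def simulate_one_mass(M=1.0, B=0.2, K=40.0, x0=0.1, v0=0.0,
                      dt=1e-3, n_steps=10000, force=lambda t: 0.0):
    # Semi-implicit Euler integration of M x'' + B x' + K x = f(t).
    # All values are illustrative, not physiological measurements.
    x, v = x0, v0
    xs = np.empty(n_steps)
    for i in range(n_steps):
        t = i * dt
        v += dt * (force(t) - B * v - K * x) / M   # update velocity first
        x += dt * v                                # then position (symplectic)
        xs[i] = x
    return xs

xs = simulate_one_mass()   # free response: a decaying oscillation
```

With B > 0 and f = 0 the trajectory is a damped sinusoid; a velocity-dependent driving force, as in the Bernoulli-driven case, is what allows energy input and self-sustained oscillation.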
For our paper, we adopt the version of the one-mass model of the vocal folds proposed in [24], illustrated in Figure 3. This is an asymmetric body-cover model which models the left and right vocal folds individually as one-mass components of a coupled dynamical system. It incorporates an asymmetry parameter, which can emulate the asymmetry in the vibratory motions of left and right vocal folds, and hence is also ideally suited to modeling pathological or atypical phonation [32].
The key assumptions made in formulating this model are: (a) the degree of asymmetry is independent of the oscillation frequency; (b) the glottal flow is stationary, frictionless, and incompressible; (c) all subglottal and supraglottal loads are neglected, eliminating the effect of source-vocal tract interaction; (d) there is no glottal closure and hence no vocal fold collision during the oscillation cycle; (e) the small-amplitude body-cover assumption is applicable (see definition below).

Assumption 1 (Body-cover assumption). The body-cover assumption posits that a glottal flow-induced mucosal wave travels upwards within the transglottal region, causing a small displacement of the mucosal tissue, which attenuates within a few millimeters into the tissue as an energy exchange happens between the airstream and the tissue [2].
This assumption allows us to represent the mucosal wave as a one-dimensional surface wave on the mucosal surface (the cover) and treat the remainder of the vocal folds (the body) as a single mass or safely neglect it. As a result, the oscillation model can be linearized, and the oscillatory conditions are much simplified while maintaining the model's accuracy.
In the one-mass asymmetric model proposed in [24], with reference to Figure 3, the center-line of the glottis is denoted as the z-axis. At the midpoint (z = 0) of the thickness of the vocal folds, the left and right vocal folds oscillate with lateral displacements ξ_l and ξ_r, resulting in a pair of coupled Van der Pol oscillators:

    ξ̈_l + β(1 + ξ_l²)ξ̇_l + ξ_l + (∆/2)ξ_l = α(ξ̇_l + ξ̇_r)
    ξ̈_r + β(1 + ξ_r²)ξ̇_r + ξ_r − (∆/2)ξ_r = α(ξ̇_l + ξ̇_r)    (3)

where β is the coefficient incorporating the mass, spring and damping coefficients, α is the glottal pressure coupling coefficient, and ∆ is the asymmetry coefficient. For a male adult with a normal voice, the reference values for the model parameters (from clinical measurements) are usually set to approximately α = 0.5, β = 0.32 and ∆ = 0.
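The self-sustained oscillation of this coupled system is easy to verify numerically. The sketch below assumes the commonly used form in which the left and right restoring terms are scaled by (1 + ∆/2) and (1 − ∆/2), respectively, with coupling α(ξ̇_l + ξ̇_r), and uses the reference values α = 0.5, β = 0.32, ∆ = 0 quoted above:

```python
import numpy as np

def vfo_rhs(state, alpha=0.5, beta=0.32, delta=0.0):
    # Coupled Van der Pol pair; sign convention for Delta is an assumption.
    xl, vl, xr, vr = state
    coupling = alpha * (vl + vr)
    al = coupling - beta * (1 + xl ** 2) * vl - (1 + delta / 2) * xl
    ar = coupling - beta * (1 + xr ** 2) * vr - (1 - delta / 2) * xr
    return np.array([vl, al, vr, ar])

def simulate(T=200.0, dt=0.01, state=(0.1, 0.0, 0.1, 0.0), **kw):
    state = np.array(state, dtype=float)
    traj = []
    for _ in range(int(T / dt)):
        # classic fourth-order Runge-Kutta step
        k1 = vfo_rhs(state, **kw)
        k2 = vfo_rhs(state + 0.5 * dt * k1, **kw)
        k3 = vfo_rhs(state + 0.5 * dt * k2, **kw)
        k4 = vfo_rhs(state + dt * k3, **kw)
        state = state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(state.copy())
    return np.array(traj)

traj = simulate()   # small initial displacement grows into a limit cycle
```

With these reference values the effective linear damping of the symmetric mode is β − 2α < 0, so a small perturbation grows into a stable limit cycle, which is the self-sustained vibration regime.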

The VT Model
The literature describes a number of different approaches to modeling the vocal tract, including bio-mechanical models, statistical models, and geometric models. For reference, they are described briefly in Appendix B.
In our work we use an acoustic wave propagation model for the vocal tract, described by PDEs. The vocal tract itself is represented as a tube of length L, beginning at the glottis and ending at the lips. Representing the distance along the central axis of the vocal tract as x, x ∈ (0, L) (where x = 0 at the glottis and x = L at the lips), and the time-varying volume velocity of air at any position x along the vocal tract as u(x, t), it can be shown that the PDE that describes u(x, t) is given by

    ∂²u/∂t² − (c²/A(x)) ∂/∂x ( A(x) ∂u/∂x ) = f(x, t),    (4)

where c is the speed of sound, A(x) is the cross-sectional area of the vocal tract at position x, and f(x, t) represents the vocal tract profile, which models the characteristics of the vocal tract, including the effect of the nonuniform yielding wall on the acoustic flow dynamics, the effect of vocal tract coupling, lip radiation, etc., and must also be estimated by our algorithm. The derivation of Equation (4) is given in Appendix B. Note that if the vocal tract is assumed to be a rigid tube, f(x, t) = 0 and Equation (4) reduces to the well-known Webster-Horn equation [33]. In deriving Equation (4) we have assumed a static vocal tract, i.e., that the cross-sectional area A(x) of the vocal tract at any position x is constant over time. This assumption is valid during phonation, in particular during the steady state of sustained phonation; however, our solution can also be extended to consider time-varying vocal tracts A(x, t), although we have not done so in this paper. The oscillation of the vocal folds results in the movement of air with a time-varying volume velocity u_0(t) = u(0, t) at the glottis. The vocal tract modulates this to result in the volume velocity u_L(t) = u(L, t) at the lips and the corresponding pressure wave p_L(t), where [34]

    p_L(t) = (ρc/A(L)) u_L(t),

where A(L) is the opening area at the lips, c is the speed of sound, and ρ is the ambient air density.
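For intuition, the rigid-tube (f(x, t) = 0, i.e., Webster-Horn) case can be integrated with a simple explicit finite-difference scheme; the paper itself uses a finite-element solver (Appendix E), and the tube geometry, drive signal and SI units below are arbitrary illustrative choices:

```python
import numpy as np

def propagate(u0, A, L=0.17, c=350.0, dt=1e-5):
    # Explicit leapfrog scheme for u_tt = (c^2/A(x)) d/dx (A(x) u_x),
    # driven by the volume velocity u0 at the glottal end.
    n = len(A)
    dx = L / (n - 1)
    assert c * dt / dx < 1.0, "CFL stability condition violated"
    u_prev = np.zeros(n)
    u_curr = np.zeros(n)
    lip = []
    for g in u0:
        u_curr[0] = g                            # glottal drive: u(0, t) = u0(t)
        a_face = 0.5 * (A[1:] + A[:-1])          # area at cell faces
        flux = a_face * np.diff(u_curr) / dx     # A * du/dx
        lap = np.diff(flux) / dx / A[1:-1]       # (1/A) d/dx (A du/dx)
        u_next = np.empty(n)
        u_next[1:-1] = 2 * u_curr[1:-1] - u_prev[1:-1] + (c * dt) ** 2 * lap
        u_next[0] = g
        u_next[-1] = u_next[-2]                  # crude zero-gradient open end
        u_prev, u_curr = u_curr, u_next
        lip.append(u_curr[-1])
    return np.array(lip)

# Drive a uniform 17 cm, 3 cm^2 tube with a short pulse; the pulse should
# reach the lip end after roughly L/c, i.e., about 49 time steps here.
t = np.arange(140) * 1e-5
pulse = np.exp(-0.5 * ((t - 2e-4) / 5e-5) ** 2)
area = np.full(35, 3e-4)
lip_signal = propagate(pulse, area)
```

The CFL check is essential for an explicit scheme of this kind: with c·dt/dx above 1, the integration blows up regardless of the tube shape.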

Estimation of Model Parameters: Solving the Inverse Problem
Our objective is to derive vocal fold oscillations during phonation directly from the speech signal. In order to do so, we will utilize the VFO and VT models.
The VFO model represents the actual dynamics of the vocal folds. Given the model parameters, which are α, β and ∆ for the coupled one-mass model of Equation (3), it can be used to compute the vocal fold oscillations and the volume velocity of air at the glottis. The VT model represents the dynamics of the vocal tract. Given an excitation (at the glottis), it can be used to compute the pressure wave at the lips, which manifests as sound.
Ours is the inverse problem: given the pressure wave at the output of the vocal tract (i.e., the recorded speech signal), we must estimate the VFO parameters that could generate it. This is an analysis-by-synthesis problem: in order to analyze the given voice signal, we must identify the model parameters that synthesize the closest (in a metric sense) facsimile of it.
We present the solution in a two-step manner. In the first, given the actual excitation of the vocal tract to produce the voice signal, the parameters of the VFO model are estimated to minimize the error between its output and the reference excitation signal. We refer to this as the backward approach (and the corresponding estimation as the "backward" problem), since the reference excitation signal itself must first be derived by passing the voice signal through an inverse model of the vocal tract, i.e., "backward" through the vocal tract. We have previously described our solution to the backward problem in [31], and restate it in Section 4.1 for completeness.
The second, more complete solution considers the joint model, i.e., both the motion of the vocal folds and the propagation of the resulting air flow (the excitation) through the vocal tract. The model parameters are estimated by comparing the signal produced at the lips by the joint model to the recorded voice signal. We refer to this as the "forward-backward" approach, since it requires forward propagating the output of the VFO model through the VT model prior to applying the backward approach. The solution to this problem is the primary contribution of this paper.
The two approaches are illustrated in Figure 4.
Once estimated, the VFO model parameters can be used to compute the oscillations of the vocal folds and their phase-space trajectories. The VT model models the transformation of the glottal signal generated by the vocal folds to the final voice signal. The joint VFO-VT model combines the two, using the output of the VFO model as the input to the VT model. ADLES compares the glottal signal u 0 (t) generated by the VFO model to a reference glottal signal u g (t) to estimate VFO parameters. ADLES-VFT compares the output of the joint model, u L (t), to a reference signal u m (t) obtained from an actual voice recording, to estimate both VFO and VT parameters. The output of the VFO model is the desired vocal-fold oscillation.

Estimating VFO Parameters from the Excitation Signal: The Backward Approach (ADLES)
As the first step, we describe the backward problem: how to derive the VFO model parameters that best explain the glottal excitation for a given phonated signal. We use the approach proposed in [31]-we estimate the VFO model parameters to minimize the error between the volume velocity of air predicted by the model and a reference signal representing the actual glottal excitation to the vocal tract. If the VFO model were to be considered in isolation, this reference could be obtained through actual physical measurements, e.g., through photography [5], physical or numerical simulations [7,17], or by inverse filtering the speech signal using a technique such as [35] (the approach used in [31]).
For the purpose of our discussion in this section, however, we do not specify where this reference excitation is obtained from, since the estimation of VFO model parameters from a given glottal excitation is only a waypoint towards estimation from the joint model that includes both the VFO and VT components. As we will see in Section 4.2, this does not in fact require explicit knowledge of the reference signal at the glottis.
Let u_g(t) be the reference signal representing the true air-volume velocity at the glottis that excites the vocal tract. The volume velocity of air u_0(t) (we remind the reader that u_0(t) = u(0, t)) at the glottis can also be computed from the vocal fold opening at the glottis as

    u_0(t) = c̄ d (2ξ_0 + ξ_l(t) + ξ_r(t)),

where ξ_0 is the half glottal width at rest, 2ξ_0 + ξ_l(t) + ξ_r(t) is the complete glottal opening, d is the length of the vocal fold, and c̄ is the air particle velocity at the midpoint of the vocal folds. We assume that the movement of the vocal folds follows the VFO model of Section 3.1. Correspondingly, ξ_l(t) and ξ_r(t) must obey Equation (3), subject to boundary conditions. The model parameters α, β and ∆ can hence be computed to minimize the difference between the air volume velocity u_0(t) predicted by the model and the reference u_g(t). We define the instantaneous residual R as the error between u_0(t) and u_g(t):

    R(t) = u_0(t) − u_g(t).

The overall ℓ2 error between u_0(t) and u_g(t) is given by the integral

    E(ϑ) = (1/2) ∫₀ᵀ R(t)² dt,    (8)

where ϑ = [α, β, ∆] represents the parameters of the VFO model, and T represents the complete length of the reference signal.
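Given trajectories ξ_l(t) and ξ_r(t), the glottal flow and the squared-error objective are straightforward to compute. The sketch below assumes the reconstruction u_0(t) = c̄ d (2ξ_0 + ξ_l + ξ_r), the typical values ξ_0 = 0.1 cm, d = 1.75 cm and c̄ = 5000 cm/s quoted in the text, and a trapezoidal approximation of the integral:

```python
import numpy as np

def glottal_flow(xi_l, xi_r, xi0=0.1, d=1.75, c_bar=5000.0):
    # u_0(t) = c_bar * d * (2*xi0 + xi_l(t) + xi_r(t)); cgs units (cm, cm/s)
    return c_bar * d * (2 * xi0 + xi_l + xi_r)

def l2_error(u0, ug, dt):
    # E = (1/2) * integral of (u0 - ug)^2 dt, via the trapezoidal rule
    r2 = (np.asarray(u0) - np.asarray(ug)) ** 2
    return 0.5 * dt * (0.5 * r2[0] + r2[1:-1].sum() + 0.5 * r2[-1])

u0_rest = glottal_flow(np.zeros(100), np.zeros(100))   # folds at rest
```

At rest (ξ_l = ξ_r = 0) this gives the constant baseline flow c̄ d · 2ξ_0, and the error of a signal against itself is exactly zero, which is a useful sanity check for any implementation.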
The actual estimation can now be stated as the constrained minimization of E(ϑ) over ϑ,

    min over ϑ of E(ϑ)

subject to

    ξ̈_l + β(1 + ξ_l²)ξ̇_l + ξ_l + (∆/2)ξ_l = α(ξ̇_l + ξ̇_r)
    ξ̈_r + β(1 + ξ_r²)ξ̇_r + ξ_r − (∆/2)ξ_r = α(ξ̇_l + ξ̇_r)
    ξ_l(0) = C_l,  ξ_r(0) = C_r,  ξ̇_l(0) = 0,  ξ̇_r(0) = 0,

where C_r and C_l are constants representing the quiescent positions of the vocal folds, and the folds are assumed to be at rest prior to the onset of phonation. For the computation we set ξ_0 to a typical value of 0.1 cm. The length of the vocal folds d may be set to 17.5 mm (which is within the range of normal lengths for both male and female subjects), and the air particle velocity c̄ to 5000 cm/s [2]. Note that given α, β and ∆, the differential equations of model Equations (11)-(15) (the constraints) can be solved by any ODE solver to obtain ξ_l(t) and ξ_r(t). So, in principle, we could solve the constrained optimization problem of Equations (9)-(15) by a grid search over (α, β, ∆) to identify the specific values that minimize the squared error of Equation (8). This would, however, be highly inefficient.
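The grid search mentioned above can be sketched generically. In the sketch below, `error_fn` is a hypothetical stand-in for solving the constrained ODEs and evaluating the squared error of Equation (8), and the toy quadratic surface is a stand-in with a known minimum; a real run would require one full ODE solve per grid point, which is exactly why the approach is inefficient:

```python
import itertools

def grid_search(error_fn, alphas, betas, deltas):
    # Exhaustively scan (alpha, beta, delta); each error_fn call stands in
    # for one ODE solve plus evaluation of the squared-error objective.
    best_err, best_params = float("inf"), None
    for a, b, d in itertools.product(alphas, betas, deltas):
        err = error_fn(a, b, d)
        if err < best_err:
            best_err, best_params = err, (a, b, d)
    return best_err, best_params

def toy_error(a, b, d):
    # illustrative error surface with its minimum at (0.5, 0.32, 0.0)
    return (a - 0.5) ** 2 + (b - 0.32) ** 2 + d ** 2

err, params = grid_search(toy_error,
                          [0.3, 0.5, 0.7], [0.2, 0.32, 0.4], [-0.1, 0.0, 0.1])
```

Even this coarse 3 x 3 x 3 grid costs 27 evaluations; refining each axis multiplies the cost, which motivates the gradient-based ADLES algorithm below.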
Instead, we propose the ADLES ("ADjoint LEast Squares") algorithm, which restates the constraints (11)-(15) as Lagrangian terms on the objective, and derives a gradient descent solution. The detailed derivation of ADLES is given in Appendix C. We summarize the key steps below.
Incorporating constraints (11)-(15) into the objective, we define the Lagrangian:

    L(ϑ) = E(ϑ) + ∫₀ᵀ λ [ξ̈_l + β(1 + ξ_l²)ξ̇_l + ξ_l + (∆/2)ξ_l − α(ξ̇_l + ξ̇_r)] dt
         + ∫₀ᵀ η [ξ̈_r + β(1 + ξ_r²)ξ̇_r + ξ_r − (∆/2)ξ_r − α(ξ̇_l + ξ̇_r)] dt
         + µ_l (ξ_l(0) − C_l) + µ_r (ξ_r(0) − C_r) + ν_l ξ̇_l(0) + ν_r ξ̇_r(0),

where λ, η, µ_l, µ_r, ν_l and ν_r are Lagrangian multipliers. Note that λ and η are also functions of time (we have not explicitly shown the "(t)" above for brevity of notation). We obtain the Lagrangian parameters λ and η as the solution to the following equations. For 0 < t < T:

    λ̈ − β(1 + ξ_l²)λ̇ + (1 + ∆/2)λ + α(λ̇ + η̇) + c̄ d R = 0
    η̈ − β(1 + ξ_r²)η̇ + (1 − ∆/2)η + α(λ̇ + η̇) + c̄ d R = 0

with initial conditions at t = T:

    λ(T) = 0,  λ̇(T) = 0,  η(T) = 0,  η̇(T) = 0.

Note that given α, β, ∆, ξ_r and ξ_l, Equations (17)-(24) represent a differential-algebraic system of equations and can be solved by any DAE solver to obtain λ and η.

Estimating VFO Parameters from the Speech Signal: The Forward-Backward Approach (ADLES-VFT)
The backward approach, solved by the ADLES algorithm in Section 4.1, derives the VFO parameters by minimizing the error between the output of the VFO model u 0 (t) and the glottal excitation u g (t). However, in general, u g (t) is not available, and this error cannot actually be computed.
Instead, in the forward-backward approach, we further propagate the generated excitation u 0 (t) through the vocal tract, represented by the VT model of Equation (4) to obtain a signal u L (t) = u(L, t) at the lips. This is the output of the joint VFO and VT models. We compute the error between the generated signal u L (t) and the air velocity measurement derived from the recorded voice signal, which is available, and propagate this error backward through the vocal tract, to obtain the error at the glottis. The VT and VFO model parameters are estimated to minimize this error. Thus, the algorithm itself proceeds through iterations of two steps: a forward step in which the VFO-generated excitation is propagated through the VT model to generate an output signal, and a backward step in which the error between the generated signal and the recorded speech is propagated backward through the VT to adjust the model parameters. We explain the entire procedure below.
The recorded voice signal is, in fact, a pressure wave and records the pressure wave emitted at the lips. Let p_m(t) be the measured acoustic pressure at the lips. The corresponding volume velocity is given by [34]

    u_m(t) = (A(L)/(ρc)) p_m(t).

u_m(t) is now our reference signal at the lips, to which u_L(t) must be matched in order to estimate model parameters.
The propagation of u_0(t) = u(0, t) through the vocal tract is assumed to follow the dynamics of Equation (4). Let H_f be the nonlinear operator representing acoustic wave propagation through the vocal tract from the glottis to the lips. The subscript f in H_f represents the vocal-tract profile f(x, t) in Equation (4) and indicates the dependence of H_f on f(x, t). Thus the vocal-tract output u_L(t) is given by

    u_L(t) = H_f [u_0(t)].

Our objective is to minimize the difference between the measured volume velocity u_m(t) and the predicted volume velocity u_L(t) near the lips, subject to the constraint that the dynamics of the vocal folds must follow the VFO model of Equation (3). Note that the parameters of the joint model include the VFO model parameters α, β and ∆, and the vocal tract profile f(x, t) required by the VT model. Although we only require the VFO model parameters to determine vocal fold oscillation, the minimization must be performed against all of these. Thus the estimation problem becomes

    min over α, β, ∆, f of (1/2) ∫₀ᵀ (u_L(t) − u_m(t))² dt    (29)

subject to

    ξ̈_l + β(1 + ξ_l²)ξ̇_l + ξ_l + (∆/2)ξ_l = α(ξ̇_l + ξ̇_r)    (30)
    ξ̈_r + β(1 + ξ_r²)ξ̇_r + ξ_r − (∆/2)ξ_r = α(ξ̇_l + ξ̇_r)    (31)
    I.C.: ξ_l(0) = C_l    (32)
    I.C.: ξ_r(0) = C_r    (33)
    I.C.: ξ̇_l(0) = 0    (34)
    I.C.: ξ̇_r(0) = 0    (35)

where, as before, (30) and (31) represent the asymmetric vocal fold displacement model (3), I.C. stands for initial condition, and the Cs are constants. The minimization is performed against the complete set of parameters of the joint VFO-VT model, i.e., α, β, ∆ and f(x, t). Unlike in Equations (11)-(15), this cannot be solved, even in principle, by simply scanning for the optimal α, β and ∆, since H_f is characterized by f(x, t), which is also unknown and must be determined.
To solve the optimization problem of (29)-(35), we derive an efficient gradient-descent solution which we term the ADLES-VFT algorithm. The essential steps of the solution are given below. The details of the derivation are in Appendix D.

Forward Pass
First, note that, as before, the constraint Equations (30)-(35) are ordinary differential equations with initial conditions that, given α, β and ∆, can be solved by any ODE solver. The solution will give us the VFO model generated glottal excitation u 0 (t).
Next, we propagate the generated excitation u_0(t) through the VT model. For this, we must solve the wave propagation system of Equations (36)-(40): the PDE of Equation (4), subject to boundary conditions (B.C.) and initial conditions (I.C.). The vocal tract is assumed to be circular at the glottis and the lips. Here, n_Γ is the outward unit normal to the vocal tract boundary Γ at the glottis. Equations (36)-(40) represent a set of partial differential equations. The boundary conditions relate to the air volume velocity at the glottal end of the vocal tract. The initial conditions relate to the air volume velocity at the initial time, t = 0, when the generated glottal signal u_0(t) enters the vocal tract. Given u_0(t) and f(x, t), Equations (36)-(40) can be solved using a PDE solver. In our work we use the finite-element method described in Appendix E.

Backward Pass
The backward pass updates all model parameters including the VT term f (x, t), and VFO parameters based on the error at the output.
Updating f: We denote the estimation residual as

    R(t) = u_L(t) − u_m(t).    (41)

We must propagate this residual backward through the VT model. To do so, we use a time reversal technique [36] and backpropagate the difference (41) into the vocal tract, which yields the backward propagation model of Equations (42)-(45), where z is the time reversal of u. Note that the boundary conditions and initial conditions in (42)-(45) are now defined at the lips, and the equation itself propagates backward through the vocal tract.
As before, Equations (42)-(45) can be solved by the finite-element method of Appendix E to give us z(x, t). The gradient update rule for f(x, t) is then obtained from z(x, t) as a gradient descent step with learning rate ι (see Appendix D).
Updating α, β, ∆: As in the case of the backward approach of Section 4.1, we define a residual

    R(t) = u_L(t) − u_m(t).    (47)

Note that unlike in Section 4.1, the residual in Equation (47) is defined at the lips, rather than at the glottis. As before, we can define the total squared residual error as

    F(ϑ) = (1/2) ∫₀ᵀ R(t)² dt,    (48)

where ϑ = [α, β, ∆] are the parameters of the vocal folds model (3). F must be minimized with respect to ϑ, subject to the constraints imposed by the VFO model. Once again, as in Section 4.1, we fold the constraints into the objective through Lagrangian multipliers as

    L(ϑ) = F(ϑ) + ∫₀ᵀ λ [ξ̈_l + β(1 + ξ_l²)ξ̇_l + ξ_l + (∆/2)ξ_l − α(ξ̇_l + ξ̇_r)] dt
         + ∫₀ᵀ η [ξ̈_r + β(1 + ξ_r²)ξ̇_r + ξ_r − (∆/2)ξ_r − α(ξ̇_l + ξ̇_r)] dt
         + µ_l (ξ_l(0) − C_l) + µ_r (ξ_r(0) − C_r) + ν_l ξ̇_l(0) + ν_r ξ̇_r(0),    (49)

where λ, η, µ and ν are multipliers, and λ and η are themselves functions of time. Optimization requires minimization of L(ϑ) with respect to α, β and ∆. This leads us (Appendix D) to a set of adjoint equations for λ and η, analogous to those of Section 4.1, with initial conditions at t = T. Given R(t), u_0(t) and u_L(t), Equations (50)-(57) represent a set of differential-algebraic equations and can be solved with a DAE solver.
We finally obtain the derivatives of F w.r.t. α, β and ∆ (denoted F α , F β and F ∆ , respectively), from which the gradient descent update rules for the VFO model parameters follow.
In this solution, we have adopted the simple gradient descent method. However, other gradient-based optimization approaches, such as the conjugate gradient method, can also be used.
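The update loop can be sketched generically. In the sketch below, the adjoint-derived gradients F α , F β and F ∆ are replaced by central differences on a stand-in objective, purely to illustrate the adaptive-step gradient descent; the 0.01/max(·) step rule matches the implementation setting described in this paper, but the objective itself is a placeholder:

```python
import numpy as np

def numgrad(F, theta, eps=1e-6):
    """Central-difference gradient; a stand-in for the adjoint-derived
    gradients F_alpha, F_beta, F_delta."""
    g = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (F(theta + e) - F(theta - e)) / (2 * eps)
    return g

def descend(F, theta0, iters=400):
    """Simple gradient descent on theta = [alpha, beta, delta] with the
    adaptive step tau = 0.01 / max |gradient|."""
    theta = np.asarray(theta0, float)
    for _ in range(iters):
        g = numgrad(F, theta)
        tau = 0.01 / max(np.max(np.abs(g)), 1e-12)
        theta = theta - tau * g
    return theta
```

Conjugate-gradient or quasi-Newton updates would slot into the same loop by replacing the step computation.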
We note here that Algorithm 2 requires several terms to be specified. In our implementation, the quiescent positions of the vocal folds, C r and C l , were set to 0. We initialize [α, β, ∆] = [0.8, 0.32, 1.0]; these values were empirically found to work best. f (x, t) is initialized to 0, which effectively initializes Equation (4) to the Webster horn equation. The step sizes τ α , τ β and τ ∆ are all adaptively set to 0.01/max(F α , F β , F ∆ ), and ι is set to 1. The actual objective to minimize, Equation (29), requires scaling p m (t) by A(L)/ρc prior to comparison with u L (t). In practice, since p m (t) is derived from a voice signal recorded at a distance from the lips, the unknown transmission loss between the lips and the microphone must also be considered. To deal with this, we simply normalize both sequences to zero mean and unit variance, and do not apply any additional scaling.

Each iteration of Algorithm 2 then proceeds as follows: solve (30)-(31) with initial conditions (32)-(35) using an ODE solver and the current estimates of α, β and ∆, obtaining ξ r , ξ l , their velocities, and u 0 (t); then, using the current estimates of f (x, t) and u 0 (t), solve the forward propagation model (36)-(40) for u L (t) with a PDE solver, e.g., the finite-element method of Appendix E; finally, perform the backward pass and parameter updates, repeating until convergence.
In the next section, we demonstrate the usefulness of the ADLES and ADLES-VFT algorithms experimentally.

Experimental Results and Interpretation
Unfortunately, at the time of writing this paper, we could not obtain actual electroglottographic measurements or video recordings of vocal fold motion against which to compare our derived VFOs. However, from a computational perspective, the proposed algorithms can still be validated in several ways, which we explain below.

Validation 1
Our first validation approach is to use the proxy of showing that the solutions obtained are indeed discriminative of fine-level changes in glottal flow dynamics of the phonation process.
Having recovered the model parameters by our backward or forward approach, we can solve the models to obtain the time-series corresponding to the oscillations of each vocal fold, as estimated from recorded speech samples. We note that the models we have discussed in this paper are essentially dynamical systems represented by coupled nonlinear equations that may not have closed-form solutions, but can be numerically solved.
To interpret these in discriminative settings, we can utilize some well-established methods for characterizing dynamical systems, borrowing them from chaos theory and other areas of applied mathematics (e.g., flow, orbit, attractor, stability, Poincaré map, bifurcation, Lyapunov exponents, etc.). These are described in Appendix F.
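For concreteness, the pipeline of numerically solving a coupled vocal fold model and collecting its phase-space time series can be sketched as below. The right-hand side here is a van-der-Pol-style stand-in that reuses the state layout (ξ r , ξ̇ r , ξ l , ξ̇ l ) and the parameter names α, β, ∆; it is illustrative only and is not the paper's Equation (3):

```python
import numpy as np

def asymmetric_folds(y, alpha=0.8, beta=0.32, delta=1.0):
    """Coupled-oscillator stand-in for the asymmetric VFO model: nonlinear
    damping beta*(1 + x^2), asymmetry delta between folds, coupling alpha."""
    xr, vr, xl, vl = y
    dvr = -beta * (1 + xr**2) * vr - (1 + 0.5 * delta) * xr + alpha * (vr + vl)
    dvl = -beta * (1 + xl**2) * vl - (1 - 0.5 * delta) * xl + alpha * (vr + vl)
    return np.array([vr, dvr, vl, dvl])

def rk4_trajectory(y0, dt=0.01, steps=10000):
    """Integrate with classic Runge-Kutta 4 and return the phase-space series."""
    y = np.array(y0, float)
    out = np.empty((steps, 4))
    for i in range(steps):
        k1 = asymmetric_folds(y)
        k2 = asymmetric_folds(y + 0.5 * dt * k1)
        k3 = asymmetric_folds(y + 0.5 * dt * k2)
        k4 = asymmetric_folds(y + dt * k3)
        y = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        out[i] = y
    return out

traj = rk4_trajectory([0.1, 0.0, 0.1, 0.0])
xi_r, dxi_r, xi_l, dxi_l = traj.T   # the four phase-space coordinates
```

Plotting (xi_r, dxi_r) against (xi_l, dxi_l) from such a series gives exactly the kind of phase portraits analyzed in this section.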

Interpreting a System's Phase Portraits Using Its Bifurcation Map
Appendix F describes the concepts and tools used to study the behaviors (e.g., flow, orbit, attractor, stability, Poincaré map, bifurcation) of nonlinear dynamical systems such as Equation (3). The phase space of the system in Equation (3) (representing vocal fold motion) is four-dimensional, with states (ξ r , ξ̇ r , ξ l , ξ̇ l ). For this nonlinear system, attractors such as limit cycles or limit tori are expected to appear in the phase space. Such phenomena are consequences of specific parameter settings. Specifically, the parameter β determines the periodicity of the oscillations, while the parameters α and ∆ quantify the asymmetry of the displacements of the left and right vocal folds and the degree to which one vocal fold is out of phase with the other [24,37]. We can visualize these behaviors by plotting the left and right displacements and the phase space portrait.
The coupling of right and left oscillators is described by their entrainment; they are in n:m entrainment if their phases θ r , θ l satisfy |nθ r − mθ l | < C where n, m are integers, and C is a constant [24]. Such entrainment can be revealed by the Poincaré map, where the number of trajectory crossings of the right or left oscillators within the Poincaré section indicates the periodicity of its limit cycles. Therefore, their ratio represents the entrainment. We can use the bifurcation diagram to visualize how the entrainment changes with parameters. An example of such a bifurcation diagram is shown in Figure 5 [15,37].
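Counting trajectory crossings of a Poincaré section, as described above, reduces to a sign-change test on sampled phase-space coordinates. The section {ξ = 0, ξ̇ > 0} used below is one convenient, assumed choice:

```python
import numpy as np

def poincare_crossings(x, v):
    """Count upward crossings of the section {x = 0, v > 0} for one
    oscillator, given sampled displacement x and velocity v."""
    x, v = np.asarray(x), np.asarray(v)
    cross = (x[:-1] < 0) & (x[1:] >= 0) & (v[1:] > 0)
    return int(cross.sum())

# The n:m entrainment ratio of the two folds can then be estimated as
# poincare_crossings(xi_r, dxi_r) / poincare_crossings(xi_l, dxi_l)
# over a common time window.
```

Sweeping a model parameter and recording these crossing counts is one way to trace out a bifurcation diagram like that of Figure 5.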
As we will see later (and as indicated in Figure 5), model parameters can characterize voice pathologies, and these characteristics can be visually evident in phase portraits and bifurcation plots.
We use the backward algorithm to estimate the asymmetric model parameters for clinically acquired pathological speech data. The data comprise speech samples collected from subjects suffering from three different vocal pathologies. Our goal is to demonstrate that the individualized phase space trajectories of the asymmetric vocal fold model are discriminative of these disorders.
The data used in our experiments is the FEMH database [38]. It comprises 200 recordings of the sustained vowel /a:/. The data were obtained from a voice clinic in a tertiary teaching hospital, and the complete database includes 50 normal voice samples (control set) and 150 samples that represent common voice pathologies. Specifically, the set contains 40/60/50 samples for glottis neoplasm, phonotrauma (including vocal nodules, polyps, and cysts), and unilateral vocal paralysis, respectively. Figure 6 shows some phase portraits depicting the coupling of the right and left vocal folds obtained using the ADLES solution; in each portrait, the horizontal axis is the displacement of a vocal fold, and the vertical axis is its velocity. The portraits show (a) normal speech: 1 limit cycle; (b) neoplasm: 1 limit cycle; (c) phonotrauma: 2 limit cycles; (d) vocal palsy: a limit torus. The convergence trajectory is also shown, and the limit cycles can be observed as the emergent geometries in these plots. We observe that the attractor behaviors are typical and even visually differentiable for different types of pathologies. Table 1 shows the results of deducing voice pathologies by simple thresholding of parameter ranges. Specifically, the ranges of model parameters in each row of Table 1 correspond to regions in the bifurcation diagram in Figure 5. Each region has distinctive attractors and phase entrainment, representing distinct vocal fold behaviors and thereby indicating different voice pathologies. By extracting the phase trajectories for the speech signal and, thereby, the underlying system parameters, the ADLES algorithm can place the vocal fold oscillations in voice production on the bifurcation diagram and thus deduce the pathology.
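The thresholding scheme of Table 1 amounts to a range lookup over (α, β, ∆). The sketch below shows the pattern; the ranges used here are placeholders, not the actual regions of the bifurcation diagram in Figure 5:

```python
def classify_by_ranges(alpha, beta, delta, ranges):
    """Range-based rule in the spirit of Table 1. `ranges` maps each label
    to ((a_lo, a_hi), (b_lo, b_hi), (d_lo, d_hi)); the first region that
    contains the estimated (alpha, beta, delta) wins. Placeholder ranges
    only: the paper's actual regions are not reproduced here."""
    for label, ((a0, a1), (b0, b1), (d0, d1)) in ranges.items():
        if a0 <= alpha <= a1 and b0 <= beta <= b1 and d0 <= delta <= d1:
            return label
    return "unclassified"
```

In practice, the ranges would be read off from the regions of distinct attractor behavior in the bifurcation diagram.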

Validation 2
Our second validation approach is to compare the excitation signal obtained through inverse filtering with the glottal flow signal (VFO) obtained through the backward or forward-backward algorithm. The rationale behind this is that, within reasonable bounds of error, the glottal flow signal obtained through our model is expected to conform to the oscillation patterns seen in the excitation signal for each speaker. Figure 7 shows the glottal flow obtained by inverse filtering and that obtained by the asymmetric model with the parameters estimated by our ADLES method. We observe consistent matches, showing that the ADLES algorithm does achieve its objectives in individualizing the asymmetric model to each speaker instance.
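This comparison can be quantified directly; z-normalizing both signals before comparison mirrors the scaling-free matching used for the objective, since the amplitude scales of the two signals differ:

```python
import numpy as np

def flow_agreement(u_model, u_invfilt):
    """Compare a model-derived glottal flow with an inverse-filtered
    excitation after normalizing both to zero mean and unit variance.
    Returns (mean absolute error, Pearson correlation)."""
    a = (np.asarray(u_model, float) - np.mean(u_model)) / np.std(u_model)
    b = (np.asarray(u_invfilt, float) - np.mean(u_invfilt)) / np.std(u_invfilt)
    mae = float(np.mean(np.abs(a - b)))
    corr = float(np.corrcoef(a, b)[0, 1])
    return mae, corr
```

Because both signals are z-normalized, the measure is invariant to the unknown gain between the glottal flow and the recorded waveform.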

Validation 3
Our third validation approach is to compare the estimation precision of the backward approach and the forward-backward approach. Table 2 shows the mean absolute error (MAE) of calculating glottal flows and parameters for four voice types (normal, neoplasm, phonotrauma, vocal palsy) obtained by backward (ADLES) and forward-backward (ADLES-VFT) algorithms. The glottal flows obtained by inverse filtering the speech signals are treated as ground truths. Since there is no ground truth for model parameters, we treat the parameters obtained by ADLES as ground truth. These results suggest that our forward-backward algorithm can effectively recover the vocal tract profile, glottal flow, and model parameters.

Validation 4
Our fourth validation comes indirectly from prior studies. Information from a dynamical systems perspective can give insights into the underlying mechanisms and principles that govern vocal fold dynamics. Examples of features in this category are recurrence analysis features, Lyapunov exponents, Hurst exponents, etc. These are mentioned in Appendix F. Some of these features have been used in real-world applications and proven to be effective. For example, in [39], the authors hypothesize that since COVID-19 impairs the respiratory system, effects on the phonation process could be expected, and signatures of COVID-19 could manifest in the vibration patterns of the vocal folds. In that work, features were derived from a signal processing perspective.
That study used the ADLES method to estimate the asymmetric vocal fold model parameters. It further used the parameters and estimation residuals as features for binary classifiers such as logistic regression, support vector machines, decision trees, and random forests, achieving around 0.8 ROC-AUC (area under the ROC curve) in discriminating positive COVID-19 cases from negative instances on clinically collected and curated data. The data used contained recordings of extended vowel sounds from affected speakers and control subjects. The authors also discovered that COVID-19-positive individuals display different phase space behaviors from negative individuals: the phase space trajectories for negative individuals were found to be more regular and symmetric across the two vocal folds, while the trajectories for positive patients were more chaotic, implying a lack of synchronization and a higher degree of asymmetry in the vibrations of the left and right vocal folds.
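The ROC-AUC figure of merit quoted above has a simple rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal, library-free sketch:

```python
import numpy as np

def roc_auc(scores, labels):
    """Rank-based ROC-AUC: the fraction of (positive, negative) pairs in
    which the positive case outscores the negative one (ties count half)."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))
```

Any of the classifiers mentioned (logistic regression, SVM, decision tree, random forest) produces scores that can be evaluated this way.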
In a companion study [40], the authors used the ADLES-estimated glottal flows as features for CNN-based two-step attention neural networks. The neural model detects differences between the estimated and actual glottal flows and predicts two classes corresponding to COVID-19-positive and -negative cases. This achieved 0.9 ROC-AUC (normalized) on clinically collected vowel sounds. Yet another study used higher-order statistics derived from the model parameters, together with Lyapunov and Hurst exponents derived from the phase space trajectories of the individualized asymmetric models, to detect Amyotrophic Lateral Sclerosis (ALS) from voice with high accuracy (normalized ROC-AUC of 0.82 to 0.99) [41].

Conclusions and Future Directions
In this paper, we have presented a dynamical systems perspective for the physical process modeling and phase space characterization of phonation, and proposed a framework wherein these can be derived for individual speakers from recorded speech samples. The oscillatory dynamics of the vocal folds provide a tool to analyze different phonation phenomena in many real-world task settings. We have proposed a backward approach for modeling vocal fold dynamics, and an efficient algorithm (the ADLES algorithm) to solve the inverse problem of estimating model parameters from speech observations. Further, we have integrated the vocal tract and vocal fold models, and have presented a forward-backward paradigm (the ADLES-VFT algorithm) for effectively solving the inverse problem for the coupled vocal fold-tract model. Extensions of these approaches can use other physical models of voice production, and can address other physical processes besides phonation.
We have shown that the parameters estimated by these algorithms allow the models to closely emulate the vocal fold motion of individual speakers. Features and statistics derived from the model dynamics are (at least) discriminative enough for use in standard machine-learning-based classification algorithms to accurately identify various voice pathologies from recorded speech samples. In the future, these approaches are expected to be helpful in deducing many other underlying influences on the speaker's vocal production mechanism.
The phase space characterization presented in this paper is based on phase space trajectories (a topological perspective): the left and right vocal fold oscillations, velocities, or accelerations. Measurements can also be performed from statistical, signal processing, information-theoretic, and other perspectives. Another suggested direction of research is characterizing the phase space from algebraic perspectives. We can recast the study of the topological structures of the phase space as the study of its algebraic constructs, such as homotopy groups and homology/cohomology groups, which are easier to classify. For example, algebraic invariants can characterize the homeomorphisms between phase spaces (e.g., evolution maps, Poincaré maps) and reveal large-scale structures and global properties (e.g., the existence and structure of orbits). We can also build upon the deep connection between dynamical systems and deep neural models: we can study deep learning approaches for solving and analyzing dynamical systems, and explore the integration of dynamical systems with deep neural models to analyze and interpret the behaviors of the vocal folds. We defer these explorations to future work.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The following is a brief description of various VFO models: • Two-mass models: Two-mass models describe vocal fold motion as two coupled mass-damper-spring oscillators: M i ẍ i + B i ẋ i + R i + K(x i − x j ) = F i (i, j ∈ {1, 2}, i ≠ j), where x i , M i , B i are the i-th oscillator's displacement, mass, and viscous damping coefficient, K is the coupling stiffness between the two masses, F i is the driving force, and R i is the elastic restoring force [15]. This model assumes (1) small air inertia and quasi-steady glottal flow, (2) negligible supra-glottal pressure, and (3) that the nonlinearity induced by vocal fold collision is small. These assumptions lead to small-amplitude oscillations and model simplification.

Multi-mass models:
Multi-mass models have a greater number of degrees of freedom and hence can model vocal fold motion with high precision. They are based on mass-spring-damper dynamics, which are widely used in multiple problem settings (e.g., [42]). For the i-th mass component, the equation of motion is M i ẍ i = F A i + F V i + F L i + F C i + F D i , where x i is the three-dimensional displacement, M i is the mass, F A i is the anchor force associated with the anchor spring and damper, F V i and F L i are the vertical and longitudinal coupling forces associated with springs and damping, F C i is the collision restoring force, and F D i is the driving force from glottal pressure [18]. In [18], 50 masses are used.
• Finite element models: Finite element models discretize the vocal fold motion in space and time-the geometry of the vocal fold is discretized into small elements (cells). In each cell, the applicable differential equation governed by the law of physics is solved. These models can handle complex geometries, continuous deformation, and complex driving forces [17].
Consider a cube element with six stress and strain components. By the principles of elasticity in mechanics, we have σ = Sε, where σ is the stress tensor, ε is the strain tensor, and S is the stiffness matrix consisting of Young's modulus, shear modulus, and Poisson's ratio [17]. The relation between stress and displacement is governed by equations in the shear stress τ, where u and w are the lateral and vertical components of the displacement vector, µ and µ′ are shear moduli, and C 1 and C 2 are constants [17]. This system of partial differential equations can be efficiently solved by finite element methods.
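The two-mass form described above can be simulated directly. In this sketch the elastic restoring force is assumed linear, R i = k i x i , and all coefficient values are illustrative placeholders rather than values from [15]:

```python
import numpy as np

def simulate_two_mass(T=50.0, dt=0.001, M=(0.125, 0.025), B=(0.02, 0.02),
                      k=(0.08, 0.008), K=0.025, F=(0.01, 0.0)):
    """Integrate M_i x''_i + B_i x'_i + k_i x_i + K (x_i - x_j) = F_i with
    semi-implicit Euler. A linear spring k_i * x_i stands in for the elastic
    restoring force R_i; all coefficients are illustrative."""
    n = int(T / dt)
    x = np.zeros((n, 2))
    v = np.zeros(2)
    xc = np.zeros(2)
    for i in range(n):
        a0 = (F[0] - B[0] * v[0] - k[0] * xc[0] - K * (xc[0] - xc[1])) / M[0]
        a1 = (F[1] - B[1] * v[1] - k[1] * xc[1] - K * (xc[1] - xc[0])) / M[1]
        v = v + dt * np.array([a0, a1])   # update velocities first
        xc = xc + dt * v                  # then positions (semi-implicit)
        x[i] = xc
    return x

disp = simulate_two_mass()   # displacement history of both masses
```

A pressure-driven, collision-aware F i would turn this damped system into a self-oscillating one, which is where the full models of [15,18] come in.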
The following is a brief description of various VT models:

Bio-mechanical models:
Bio-mechanical models simulate the geometry and articulatory movements of the vocal tract using displacement-based finite element methods and take into account the continuous tissue deformation and the variation of the physiological, bio-mechanical, and visco-elastic properties of muscles [27]. They are more scalable and accurate, and allow for more fine-grained modeling of muscular forces, articulator positions, and movements. • Statistical models: These model the vocal tract as statistical factors or components. For instance, factor analysis describes the vocal tract profile as a sum of articulatory components and analyzes the relationship between individual, or a combination of, components and vocal tract parameters [25]. • Geometric models: These attempt to depict the shape and geometric configurations of the vocal tract. They specify the articulatory state with vocal tract parameters that define the position and shape of the tongue, lips, jaw, larynx, etc. [26]. However, such models are not scalable because they do not account for the continuous variations of the anatomy and articulatory state, require clinical measurements such as from magnetic resonance imaging, and are not amenable to coupling with vocal fold models.

Appendix B
Appendix B.1. Modeling Wave Propagation in the Vocal Tract The vocal tract can be viewed as a compact, orientable, differentiable manifold M embedded in R 3 . Its boundary ∂M includes the wall of the vocal tract. Consider the tangent bundle TM. Denote the set of all vector fields on TM as Γ(TM), which is a C ∞ (M)-module [43]. A vector field is a smooth section of TM, X : M → TM, X ∈ Γ(TM). It associates each point p ∈ M with a tangent vector v(p) := X| p : C ∞ (M) → R [43]. Let γ(t) : R ⊇ I → M be a maximal integral curve [43] through p at t 0 , i.e., a solution to γ̇(t) = X| γ(t) with γ(t 0 ) = p. The curve γ(t) generates a one-parameter group; acting on M, it gives the flow Φ : R × M → M, Φ t (p) = γ(t). The particle velocity at p at time t is denoted v(p, t). The planar motion of the pressure wave in the vocal tract is governed by the equations [36]: ∂ p̃/∂t + ρc 2 div v = 0 (A11) and ρ ∂v/∂t + grad p̃ = 0 (A12), where p̃(p, t) is the acoustic pressure, div is the divergence operator, grad is the gradient operator, ρ is the ambient air density, and c is the speed of sound. Equation (A11) describes the conservation of mass, and (A12) describes the conservation of momentum [36].
For notational convenience, we use cylindrical coordinates p = (r, θ, x), where the x direction aligns with the central axis of the vocal tract. Denote the inner surface of the vocal tract as Σ, and the shape function of the inner surface as r = R(θ, x). The cross-sectional area of the vocal tract, the average acoustic pressure, and the volume velocity (where v x is the x component of v) are then defined as integrals over the cross section. Integrating (A11) over the volume of the vocal tract bounded by the cross sections at x 0 and x, substituting Equations (A17) and (A18) into (A14) and (A15), and applying Stokes' theorem [28,36], yields an expression in which n v is the component of v normal and outward to the inner surface Σ.
The element of area dσ is given by (A19) [28,36], where S dθ dx is a top 2-form on Σ [43]. Substituting (A19) into (A18) and differentiating w.r.t. x yields (A20). Following similar steps, integrating the x component of (A12) over the cross section at x yields (A21), where p w is the pressure acting on the wall of the vocal tract.

Appendix B.2. The Integrated Vocal Tract Model
To simplify our problem, we combine the wave Equations (A20) and (A21) into a single vocal tract model. Differentiating (A20) w.r.t. x and (A21) w.r.t. t, and canceling out the pressure term, gives a single wave equation in which the vocal tract profile is absorbed into a single term f (x, t). This term represents the characteristics of the vocal tract, i.e., the effect of the nonuniform yielding wall on the acoustic flow dynamics, and is what our algorithm must estimate.

Appendix C
In this appendix, we derive the ADLES algorithm to estimate VFO model parameters from a given glottal excitation signal.
Let u g (t) be the actual air volume velocity at the glottis. We can also derive u 0 (t) from the displacement of the vocal folds, where ξ 0 is the half glottal width at rest, d is the length of the vocal fold, and c̄ is the air particle velocity at the midpoint of the vocal fold. We assume the movements of the vocal folds to follow the dynamics specified by Equation (3) in Section 3.1. Our objective is then to estimate the parameters of the model to minimize the difference (A26), such that the movements of the vocal folds follow the VFO model, where C r and C l are constants representing the quiescent positions of the vocal folds.

The Adjoint Least Squares (ADLES) Solution
To solve the functional least squares problem in (A26), we require the gradients of (A26) w.r.t. the model parameters α, β and ∆. Subsequently, we can adopt any gradient-based (local or global) method to obtain the solution.
Considering the residual, and incorporating the constraints into the objective using Lagrangian multipliers, we define the Lagrangian, where λ, η, µ and ν are Lagrangian multipliers. Taking the derivative of the Lagrangian w.r.t. the model parameter α, integrating the term λ ∂ α ξ̈ r twice by parts, applying the same to η ∂ α ξ̈ l , substituting into (A37), and simplifying the final expression, we obtain an expression for the gradient. Since the partial derivative of the model output ξ w.r.t. the model parameter α is difficult to compute, we eliminate the related terms by requiring λ and η to satisfy adjoint differential equations for 0 < t < T, with initial conditions at t = T. As a result, we obtain the derivative of F w.r.t. α; the derivatives of F w.r.t. β and ∆ are similarly obtained. Note that both H f and F are bounded. Our objective is to estimate the model parameters. Correspondingly, the adjoint operator [44][45][46] is defined, and the problem is equivalent to solving (A75). It is simple to obtain the solution to (A75), where F ( f k ) * is the adjoint operator. However, it is difficult to compute F ( f k )F ( f k ) * ; we use its positive-definiteness to approximate it by γI, where I is the identity matrix.
We denote the estimation residual as before. Now consider the wave Equation (A65). Let u + δu be a solution with variation f + δ f . Substituting into (A65) and subtracting (A65) yields a relation for δu. From (A94) to (A95) we apply boundary condition (A80); from (A95) to (A96) we use boundary condition (A84); from (A96) to (A97) we use definition (A73); from (A97) to (A98) we use definition (A74) and the duality property. We then define the Lagrangian, where λ, η, µ and ν are multipliers. Taking the derivative of the Lagrangian w.r.t. the model parameter α, integrating the term λ ∂ α ξ̈ r by parts twice, defining the estimation residual R := H f (u 0 ) − (A(L)/ρc) p m (t), applying the same to η ∂ α ξ̈ l , and substituting into (A112), after simplification we obtain the gradient expression, where the term H f | u 0 ≈ u L /u 0 by linearization. Since the partial derivatives of the displacement ξ w.r.t. the model parameter α are difficult to compute, we cancel out the related terms by requiring λ and η to satisfy adjoint equations for 0 < t < T, with initial conditions at t = T. Assuming the boundary conditions on ξ l and ξ r to hold (an assumption we empirically find to be valid for C r = C l = 0), we obtain the derivative of F w.r.t. α; the derivatives of F w.r.t. β and ∆ follow similarly. For the reverse problem represented by (A131)-(A134), applying the boundary condition (A132) and substituting (A141) back into (A140) yields a similar variational problem (A146) for each time step. By the Lax-Milgram Lemma [49], solving (A146) is equivalent to solving a functional minimization problem: by the calculus of variations, taking the variation of the functional recovers (A146), hence the name variational form [48,49].

Finite Element Approximation
For each time step, we solve (A146) with the finite element method. We discretize the domain Ω with a mesh of uniformly spaced triangular cells. We take the P 2 elements as the basis function space, which contains piece-wise, second-order Lagrange polynomials defined over a cell. Each two-dimensional cell carries 6 degrees of freedom (DoF) [48,51].
Each element is associated with a coordinate map that transforms local coordinates to global coordinates and a DoF map that maps local DoF to global DoF [48,51]. Each cell is essentially a simplex and can be continuously transformed into the physical domain. We use the following two previously established points in this context, and proceed as explained thereafter.

1.
Existence of Unique Solution: The solution to the variational problem (A146) exists and is unique [51].

2.
Approximation Error: Galerkin's method gives the solution u e with error bounded by O(h 3 ∥D 2 u e ∥ W 3,2 ), where h is the cell size and D is the bounded derivative operator [49,51].
Assume a solution u = B + c j ψ j (using the Einstein summation convention) with basis functions ψ j ∈ P 2 and coefficients c j . Here the ψ j are piece-wise second-order Lagrange polynomials [48]. The mesh of cells, the P 2 basis function elements, the coordinate map, the DoF map, and the assembly of the linear systems are implemented and solved with the FEniCS software [52].
The function B(x) incorporates the boundary condition and, as an example, can take the form B(x) = u g + (u m − u g ) x p /L p , p > 0 (A148). We also project B(x) onto the P 2 basis functions and express it as B(x) = b j ψ j . As a result, we obtain a unified expression u = U j ψ j , with U j incorporating b j and c j . Similarly, we have f n = F j n ψ j , u n−1 = U j n−1 ψ j , u n−2 = U j n−2 ψ j . Set the test function as v = ψ̂ i (where ψ̂ i is the i-th Lagrange polynomial basis in a test cell). Substitution into (A144) yields (A149) and (A150), where ψ̂ i ′ is the derivative of ψ̂ i w.r.t. x. Setting M i,j = ∫ Ω ψ̂ i ψ j dx and K i,j = ∫ Ω ψ̂ i ′ ψ j ′ dx, and collecting (A149) and (A150) into matrix-vector form, we obtain (A151): A U n = b, where A = M + ∆t 2 c 2 K and b = ∆t 2 M F n + 2M U n−1 − M U n−2 . Hence, we have reduced the problem of (A146) to solving the linear system (A151). Furthermore, the matrices M (known as the mass matrix) and K (known as the stiffness matrix) can be pre-calculated for efficiency.
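The mass/stiffness assembly described above follows a standard pattern. The sketch below assembles M and K for piecewise-linear (P1) elements on a uniform 1-D mesh, rather than the paper's P2 triangles, so that the local element matrices have a simple closed form:

```python
import numpy as np

def assemble_1d_p1(nx, L=1.0):
    """Assemble the mass matrix M and stiffness matrix K for P1 (piecewise
    linear) elements on a uniform 1-D mesh of nx nodes over [0, L]. This is
    a simplified stand-in for the P2 assembly done by FEniCS in the paper."""
    h = L / (nx - 1)
    M = np.zeros((nx, nx))
    K = np.zeros((nx, nx))
    Me = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])    # local mass matrix
    Ke = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # local stiffness
    for e in range(nx - 1):
        idx = [e, e + 1]
        M[np.ix_(idx, idx)] += Me   # scatter local contributions
        K[np.ix_(idx, idx)] += Ke   # into the global matrices
    return M, K
```

One implicit time step of an (A151)-style system then reduces to solving (M + ∆t² c² K) U n = b with any linear solver; M and K are assembled once and reused.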
Definition A3 (Flow, orbit, invariance). The map Φ x : I(x) → M is the flow or trajectory through x. The set of all flows γ x := {Φ x | t ∈ I(x)} is the orbit through x. Particularly, a subset S ⊆ M is called Φ-invariant if Φ(t, x) ∈ S for all x ∈ S and t ∈ T.
Appendix F.3. Phase Space Behavior: Attractor
The behaviors of flows can be described by their attractor/attraction sets.
Definition A4 (Attractor). An attractor set A ⊆ M in the phase space is a closed subset with the following property: for some initial point x, there exists a t 0 such that Φ x (t) ∈ A for all t > t 0 .
Namely, the orbit γ x is "trapped" in the interior of A. A dynamical system can have more than one attractor set, depending on the choice of initial points (or the choice of parameters, as we will see later). Locally, we can speak of a basin of attraction B(A): a neighborhood of A such that for any initial point x ∈ B(A), the orbit is eventually trapped in A.
There are different types of attractor sets. Some are shown in Figure A1. The simplest is a fixed point or equilibrium point, to which a trajectory in phase space converges regardless of the initial settings of the variables (i.e., its starting point). To study vocal fold behaviors, we are particularly interested in those attractors revealing the periodic motion of the flow in phase space. Such attractors include the limit cycle and the limit torus, which are isolated periodic or toroidal orbits, respectively. Some attractor sets have a fractal structure resulting from a chaotic state of the dynamical system [20,54], and are called strange attractors.

Appendix F.4. Chaos and Exponential Divergence of Phase Space Trajectories
Chaos is a characteristic state of a nonlinear dynamical system. There are different definitions of chaos; a simple one is as follows: Definition A5 (Chaos). Equip a distance metric d on the phase space M. Then C ⊆ M is referred to as a chaotic set of Φ if, for any x, y ∈ C, x ≠ y, we have lim inf n→∞ d(Φ n (x), Φ n (y)) = 0 (A152) and lim sup n→∞ d(Φ n (x), Φ n (y)) > 0 (A153). Thus, by definition, chaos is a state characterized by extreme sensitivity to initial conditions: trajectories starting from any two arbitrarily close initial conditions diverge exponentially. The Lyapunov exponent is used to quantify this divergence; it measures the sensitivity of the evolution of the dynamical system to initial conditions.
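The exponential divergence just described can be estimated numerically by the classic two-trajectory (Benettin-style) procedure. This sketch uses the logistic map at r = 4, whose largest Lyapunov exponent is known to be ln 2 ≈ 0.693, as a simple stand-in for the vocal fold dynamics:

```python
import numpy as np

def largest_lyapunov(step, x0, eps=1e-9, n=2000):
    """Benettin-style estimate of the largest Lyapunov exponent of a map:
    evolve two nearby states, accumulate the log-divergence per step, and
    renormalize the separation back to eps after each step."""
    x = np.array(x0, float)
    y = x + eps
    total = 0.0
    for _ in range(n):
        x, y = step(x), step(y)
        d = float(np.linalg.norm(y - x)) + 1e-300   # guard against exact collapse
        total += np.log(d / eps)
        y = x + (y - x) * (eps / d)                 # renormalize the separation
    return total / n

# Chaotic example: the logistic map at r = 4.
lam = largest_lyapunov(lambda x: 4.0 * x * (1.0 - x), [0.3])
```

Applied to the sampled phase-space trajectories of the vocal fold models, a positive estimate indicates the chaotic, desynchronized regimes discussed in the validation sections.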