Dynamical Field Inference and Supersymmetry

Knowledge of evolving physical fields is of paramount importance in science, technology, and economics. Dynamical field inference (DFI) addresses the problem of reconstructing a stochastically driven, dynamically evolving field from finite data. It relies on information field theory (IFT), the information theory for fields. Here, the relations of DFI, IFT, and the recently developed supersymmetric theory of stochastics (STS) are established in a pedagogical discussion. In IFT, field expectation values can be calculated from the partition function of the full space-time inference problem. The partition function of the inference problem invokes a functional Dirac delta function to guarantee the dynamics, as well as a field-dependent functional determinant to establish proper normalization; both impede the necessary evaluation of the path integral over all field configurations. STS replaces these problematic expressions via the introduction of fermionic ghost and bosonic Lagrange fields, respectively. The action of these fields has a supersymmetry, which means there exists an exchange operation between bosons and fermions that leaves the system invariant. In contrast to this, measurements of the dynamical fields do not adhere to this supersymmetry. The supersymmetry can also be broken spontaneously, in which case the system evolves chaotically. This affects the predictability of the system and thereby makes DFI more challenging. We investigate the interplay of measurement constraints with the non-linear chaotic dynamics of a simplified, illustrative system with the help of Feynman diagrams and show that the fermionic corrections are essential to obtain the correct posterior statistics over system trajectories.


Introduction
Stochastic differential equations (SDEs) appear in many disciplines like astrophysics [1], biology [2], chemistry [3], and economics [4,5]. In contrast to traditional differential equations, the dynamics of a system following an SDE are influenced by initial and boundary conditions but not entirely determined by them. The uncertainty in the dynamics can reflect intrinsically stochastic behavior [6] or simply imperfections in the model [7] that describes the dynamical system (DS).
In addition to the uncertainty introduced by the stochastic process driving the evolution of the system, any observation of it is noise-afflicted and incomplete. This complicates the inference of the system's state further. In previous studies, linear SDEs [8], especially the Langevin SDE [9], were already investigated extensively. Besides this, many numerical methods to solve partial differential equations have been given probabilistic interpretations, and the propagation of uncertainty for these problems has been studied [10,11]. Here, we consider arbitrary SDEs and introduce dynamical field inference (DFI) as a Bayesian framework to estimate the state and evolution of a field following an SDE from finite, incomplete, and noise-afflicted data. DFI rests on information field theory (IFT), which is information theory for fields. IFT [12,13] was developed in order to reconstruct an infinite dimensional signal field from finite data.

Information Field Theory
In many areas of science, technology, and economics, the difficult task of interpreting incomplete and noisy data sets and computing the uncertainty of the results arises [23,24]. If the quantity of interest is a field, for example, a spatially extended component of our Galaxy [25,26], or of the atmosphere [27,28], which are mostly continuous functions over a physical space, the problem becomes virtually infinite dimensional, as any point in space-time carries one or several degrees of freedom. For such problems, which are called field inference problems, IFT was developed. IFT can be considered as a combination of information theory for distributed quantities and statistical field theory.

Notation
Usually, only certain aspects of our system ψ are relevant. These aspects are called the signal, ϕ. Physical degrees of freedom that are contained in ψ but not in ϕ, and which still influence the data, are called noise n. If ϕ is a physical field ϕ : Ω → R, it is a function that assigns a value to each point in time and u-dimensional position space. Let us denote a space-time location by x = (x⃗, t) ∈ Ω = R^u × R_0^+, u ∈ N, where space and time will initially be handled in the same manner, as in [29,30]. We let the time axis start at t_0 = 0 for definiteness.
The field ϕ = ϕ(x) has an infinite number of degrees of freedom, and integrations over the phase space of the field are represented by path integrals over the integration measure Dϕ = ∏_{x∈Ω} dϕ_x [31], with ϕ_x = ϕ(x) being a more compact notation. In the following, these space-time coordinate dependent fields are denoted as abstract vectors in Hilbert space. The scalar product between two fields ϕ(x) and γ(x) can be written in short notation as γ†ϕ := ∫_Ω dx γ*(x) ϕ(x), where γ* is the complex conjugate of γ, which here will play no role, as we deal only with real valued fields.

Bayesian Updating
In order to get to know a field ϕ, one has to measure it. Bayes theorem states how to update any existing knowledge given a finite number of constraints from measurements that resulted in the data vector d. Clearly, it is not possible to reconstruct the infinite dimensional field configuration of ϕ perfectly from a finite number of measurements. This is where the probabilistic description used in IFT comes into play. In probabilistic logic, knowledge states are described by probability distributions.
After the measurement of data d, the knowledge according to Bayes theorem [13] is given by the posterior probability distribution P(ϕ|d) = P(d|ϕ) P(ϕ) / P(d). This posterior is proportional to the likelihood P(d|ϕ) of the measured data given the signal field multiplied by the prior probability distribution P(ϕ). The normalization of the posterior is given by the so-called evidence, P(d) = ∫Dϕ P(d|ϕ) P(ϕ). Bayes theorem describes the update of knowledge states: the prior P(ϕ) turns into the posterior P(ϕ|d) given some data d. To construct the posterior, we need the prior and the likelihood; the evidence and the posterior follow from these.
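As a minimal numerical illustration of this update (a single field value on a grid; the data value and noise level are illustrative choices of our own), Bayes theorem can be evaluated directly:

```python
import numpy as np

# Knowledge state over a single field value phi, discretized on a grid.
phi = np.linspace(-5.0, 5.0, 1001)
dphi = phi[1] - phi[0]

prior = np.exp(-0.5 * phi**2)                          # P(phi): unit Gaussian prior
d, sigma_n = 1.5, 0.5                                  # data and noise level (assumed)
likelihood = np.exp(-0.5 * (d - phi)**2 / sigma_n**2)  # P(d|phi): direct noisy measurement

evidence = np.sum(likelihood * prior) * dphi           # P(d): the normalization
posterior = likelihood * prior / evidence              # Bayes theorem
mean = np.sum(phi * posterior) * dphi                  # posterior mean

# For this Gaussian case the analytic posterior mean is d / (1 + sigma_n^2) = 1.2.
print(mean)
```

The grid evidence normalizes the posterior exactly, and the posterior mean reproduces the analytic Gaussian result up to discretization error.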

Prior Knowledge
The prior probability of ϕ, P(ϕ), specifies the knowledge on the signal before any measurement was performed. Formally, the prior on ϕ can be written in terms of the system prior [12] as P(ϕ) = ∫Dψ δ(ϕ − ϕ(ψ)) P(ψ), where ϕ(ψ) is the function that specifies the field ϕ given the system state ψ. Due to the integration over ψ, the underlying system becomes partly invisible in the probability densities and only the field of interest, the signal field ϕ, remains. Nevertheless, the properties of the original system will still be present in the field prior P(ϕ). For example, let us consider a situation close to what will be relevant later on. We consider a system comprised of two interacting fields constituting the system ψ = (ϕ, η), which are related via the invertible functional G[ϕ] = η. This implies the conditional probability P(η|ϕ) = δ(η − G[ϕ]), which can be considered as a first-class constraint in Dirac's sense [32]. Then, assuming that there exists a unique solution ϕ to the equation G[ϕ] = η, we have P(ϕ|η) = δ(G[ϕ] − η) ||δG[ϕ]/δϕ||. (6) We cast P(ϕ|η) into a form that only requires access to G, but not to G⁻¹. As G is one to one, P(ϕ|η) = δ(ϕ − G⁻¹(η)) would be our preferred quantity to work with. However, in DFI of non-linear systems, we rarely have G⁻¹ available as an explicit expression and therefore have to resort to Equation (6). Now, we assume that we know the prior statistics P(η) and find the following implications for P(ϕ): P(ϕ) = ∫Dη P(ϕ|η) P(η) = ||δG[ϕ]/δϕ|| P(η)|_{η=G[ϕ]}. This shows that the field of interest ϕ inherits the statistics of the related field η, however, with a modification by the functional determinant ||δG/δϕ|| that is sensitive to non-linearities in the field relation. Here, the probability P(ϕ|η) already contains the two elements that will lead to SUSY in DFI: the delta function, which will be represented with bosonic Lagrange fields, and the functional determinant, for which fermionic fields are introduced. Since both terms contain the functional G, it is plausible that bosons and fermions might be connected via a symmetry.

Likelihood
Let us now turn to the measurement and its likelihood. The measurement process of the data can always be written as d = R[ϕ] + n, if we define the signal response to be R[ϕ] = ⟨d⟩_(d|ϕ) := ∫Dd P(d|ϕ) d and the noise as the residual vector in data space between data and signal response, n := d − R[ϕ]. In measurement practice, the response converts a continuous signal into a discrete data set. The statistics of the noise, which can be signal dependent, then determines the likelihood, P(d|ϕ) = P(n = d − R[ϕ] | ϕ). Note, however, that we might want to specify initial conditions of a dynamical field via data as well. Let ϕ_0 = ϕ(·, t_0) be the initial field configuration at initial time t_0. Then, we specify the initial data to be exactly this initial field configuration, d_0 = ϕ_0, the corresponding response as R_0[ϕ] = ϕ(·, t_0), and the noise to vanish, P(n) = δ(n). Now, the initial condition is represented via the likelihood P(d_0|ϕ) := P(d|ϕ, d_0 = ϕ(·, t_0)) = δ(ϕ(·, t_0) − ϕ_0). This initial data likelihood can be combined with any other data on the later evolution, d_l, via P(d|ϕ) = P(d_0|ϕ) P(d_l|ϕ), where d = (d_0, d_l) is the combined data vector.

Information
Bayes theorem, Equation (2), can be rewritten in the language of statistical mechanics by defining an information Hamiltonian, or in short the information, H(d, ϕ) := −ln P(d, ϕ), which contains all the information needed for inference, and the partition function, Z(d) := ∫Dϕ e^{−H(d,ϕ)} = P(d), which serves as a normalization factor, so that P(ϕ|d) = e^{−H(d,ϕ)}/Z(d). Note that these formal definitions of information Hamiltonian and partition function hold in the absence of a thermodynamic equilibrium. This formulation of field inference in terms of a statistical field theory permits the usage of the well-developed apparatus of field theory, as we briefly show in the following.

Partition Function
There is an infinite number of possible signal field realizations that meet the constraints given by a finite number of measurements as encoded in the field posterior P(ϕ|d). For practical purposes, for example to have a figure in a publication showing what is known about a field, one has to extract lower dimensional views of this very high dimensional posterior function. These can be obtained by calculating posterior expectation values of the signal field, like its posterior mean m = ⟨ϕ⟩_(ϕ|d) = ∫Dϕ P(ϕ|d) ϕ or its uncertainty dispersion D = ⟨(ϕ − m)(ϕ − m)†⟩_(ϕ|d). Thus, we want to be able to calculate posterior field moments.
Given some data on a signal field ϕ, the posterior n-point function is ⟨ϕ(x_1) ⋯ ϕ(x_n)⟩_(ϕ|d) = ∫Dϕ P(ϕ|d) ϕ(x_1) ⋯ ϕ(x_n). The involved integral can be calculated exactly in case the posterior P(ϕ|d) is a Gaussian. Otherwise, the posterior may be expanded around a Gaussian.
With the help of the moment generating function Z_d[J] := ∫Dϕ P(ϕ|d) e^{J†ϕ}, which incorporates a moment generating source term J†ϕ = ∫dx J*(x)ϕ(x), the moments can be calculated via functional differentiation with respect to J as ⟨ϕ(x_1) ⋯ ϕ(x_n)⟩_(ϕ|d) = [δⁿ Z_d[J] / δJ(x_1) ⋯ δJ(x_n)]_{J=0}. Likewise, the connected correlation functions, also called cumulants, are defined as ⟨ϕ(x_1) ⋯ ϕ(x_n)⟩_c := [δⁿ ln Z_d[J] / δJ(x_1) ⋯ δJ(x_n)]_{J=0}. Particularly, the cumulants of the first and second order are of importance, as they describe the posterior mean and uncertainty dispersion, m = ⟨ϕ⟩_c = ⟨ϕ⟩_(ϕ|d) and D = ⟨ϕϕ†⟩_c, respectively. Thus, the ultimate goal of any field inference is to obtain the moment generating partition function Z_d[J], as any desired n-point correlation function can be calculated from it. For this reason, this partition function will be the focus of our investigations.
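The role of Z_d[J] as a cumulant generator can be checked in a finite-dimensional sketch. For a two-pixel Gaussian posterior with mean m and uncertainty D (illustrative numbers of our own), the log-partition function is ln Z_d[J] = J†m + ½ J†DJ, and finite differences with respect to J recover m and D as the first and second cumulants:

```python
import numpy as np

# A two-pixel Gaussian posterior with mean m and uncertainty dispersion D (illustrative).
m = np.array([1.0, -0.5])
D = np.array([[2.0, 0.3],
              [0.3, 1.0]])

def logZ(J):
    """log of the moment generating function of a Gaussian posterior."""
    return J @ m + 0.5 * J @ D @ J

# First and second cumulants via central finite differences in J.
eps = 1e-5
grad = np.zeros(2)
hess = np.zeros((2, 2))
for i in range(2):
    e = np.zeros(2); e[i] = eps
    grad[i] = (logZ(e) - logZ(-e)) / (2 * eps)
    for j in range(2):
        f = np.zeros(2); f[j] = eps
        hess[i, j] = (logZ(e + f) - logZ(e - f) - logZ(-e + f) + logZ(-e - f)) / (4 * eps**2)

print(grad)  # recovers the mean m
print(hess)  # recovers the dispersion D
```

Because ln Z is quadratic in J, the central differences are exact up to floating-point rounding; for non-Gaussian posteriors, higher derivatives would yield the non-vanishing higher cumulants.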

Free Theory
An illustrative example for the signal reconstruction and the simplest scenario in IFT is given by the free theory. The underlying initial assumptions of the free theory lead to a theory without non-linear field interactions. In other words, the information H(d, ϕ) includes no terms of an order higher than quadratic in the signal field ϕ.
The free theory emerges in practice under the following conditions: (i) a Gaussian zero-centered prior, P(ϕ) = G(ϕ, Φ), with known covariance Φ = ⟨ϕϕ†⟩_(ϕ); (ii) a linear measurement, d = Rϕ + n, with known linear response R and additive noise; (iii) a signal-independent Gaussian noise, P(n|ϕ) = G(n, N), with known covariance N = ⟨nn†⟩_(n). The information H(d, ϕ) is then calculated via the data likelihood and the signal prior, H(d, ϕ) = −ln P(d|ϕ) − ln P(ϕ). With the assumptions of the free theory and Equation (9), the likelihood is P(d|ϕ) = G(d − Rϕ, N). Thus, the information for the free theory is given by H(d, ϕ) = ½ ϕ†D⁻¹ϕ − j†ϕ + H_0. Here, the so-called information source j, the information propagator D, and H_0 were introduced. The latter contains all the terms of the information that are constant in ϕ. The others are j = R†N⁻¹d and D = (Φ⁻¹ + R†N⁻¹R)⁻¹ = Φ − ΦR†(RΦR† + N)⁻¹RΦ. The second form of the information propagator D can be verified via explicit calculation and also holds in the limit N → 0 of a noise-less measurement. The information can be expressed in terms of the field m = Dj by completing the square in Equation (21), H(d, ϕ) = ½ (ϕ − m)†D⁻¹(ϕ − m) + const.; here m is also known as the generalized Wiener filter solution [33]. This can also be written in a form that permits a noiseless measurement limit, m = ΦR†(RΦR† + N)⁻¹d, as can be verified with a very analogous calculation.
Only terms that depend on the signal field ϕ need to be considered, and therefore the symbol "≐" is introduced to mark equality up to an additive constant. We therefore have H(d, ϕ) ≐ ½ ϕ†D⁻¹ϕ − j†ϕ. Knowing the information, the moment generating function of the free theory, Z_G[J], is constructed in the next step on the way to calculating the best fit reconstruction of the signal by means of expectation values.
All higher order (n > 2) cumulants vanish, and the non-vanishing cumulants are ⟨ϕ⟩_c = m = Dj and ⟨ϕϕ†⟩_c = D. As higher-order cumulants vanish, the posterior distribution can be written as a Gaussian with mean m and uncertainty covariance D, P(ϕ|d) = G(ϕ − m, D). Hence, computations in the free theory are simple, as the Gaussian posterior can be treated analytically. The usage of the same symbol D for the information propagator, the inverse of the kernel of the quadratic term in the information, and the posterior uncertainty dispersion is justified, as they coincide in the free theory, but only there.
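A discretized sketch of the free theory can exercise these formulas directly (NumPy; the Ornstein-Uhlenbeck prior covariance, the point-sampling response, and all numbers are our own assumptions). It computes the Wiener filter mean m = Dj and checks that both forms of the propagator D agree:

```python
import numpy as np

rng = np.random.default_rng(42)
npix, ndata = 100, 20

# Prior covariance Phi: an assumed Ornstein-Uhlenbeck kernel on a 1-d grid.
x = np.linspace(0.0, 1.0, npix)
Phi = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.1)

# Draw a signal from the prior and measure it at random pixels with Gaussian noise.
phi = np.linalg.cholesky(Phi) @ rng.normal(size=npix)
R = np.zeros((ndata, npix))
R[np.arange(ndata), rng.choice(npix, size=ndata, replace=False)] = 1.0
N = 0.01 * np.eye(ndata)
d = R @ phi + 0.1 * rng.normal(size=ndata)

# Information source and propagator: j = R^T N^-1 d, D = (Phi^-1 + R^T N^-1 R)^-1.
j = R.T @ np.linalg.solve(N, d)
D1 = np.linalg.inv(np.linalg.inv(Phi) + R.T @ np.linalg.solve(N, R))
# Second form of the propagator, which also permits the noiseless limit N -> 0.
D2 = Phi - Phi @ R.T @ np.linalg.solve(R @ Phi @ R.T + N, R @ Phi)

m = D1 @ j  # generalized Wiener filter mean
print(np.max(np.abs(D1 - D2)))  # both forms agree up to rounding
```

In the noiseless limit the first form of D is ill-defined (N⁻¹ diverges), while the second form remains finite, which is why the text emphasizes it.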
In other cases, when the signal or the noise is non-Gaussian, the response is non-linear, or the noise is signal dependent, the theory becomes interacting in the sense that H(d, ϕ) contains terms of higher than quadratic order. Thus, the information of this non-free, interacting theory incorporates not only the propagator and source terms of the free theory but also interaction terms between more than two signal field values. We will encounter such situations for a field with non-linear dynamics.

Field Prior
In the previous section, we saw how to infer a signal field from measurement data d with some measurement noise n, particularly in the case of a free theory. Now, we consider a DS for which the time evolution of the signal field is described by an SDE, ∂_t ϕ(x) = F[ϕ](x) + ξ(x). (34) We want to see how this knowledge can be incorporated into a prior for the field for DFI. The first part of the SDE in Equation (34), ∂_t ϕ(x) = F[ϕ](x), describes the deterministic dynamics of the field. The excitation field ξ turns the deterministic evolution into an SDE and mirrors the influence of external factors on the dynamics. DFI aims to infer a signal in such a DS using the tools from IFT. Thus, in DFI, next to the observational noise n, which contaminates the measurement through nuisance influences, the excitation field ξ of the SDE has to be considered during inference.
Care has to be taken, as the domains of the fields ϕ and ξ differ. While ϕ(x) is defined for all x ∈ Ω = R^u × R_0^+, the fields ∂_t ϕ and ξ live only over Ω′ = R^u × R^+, from which the initial time slice at t_0 = 0 is removed. Equation (34) therefore makes statements only about fields on Ω′, although it also depends on the initial condition ϕ_0 = ϕ(·, t_0). As this requires specification, an initial condition prior P(ϕ_0) is required. We further introduce the notation ϕ′ = ϕ(·, t ≠ t_0) for all field degrees of freedom except the ones fixed by the initial condition ϕ_0, so that we have ϕ = (ϕ_0, ϕ′).
The SDE in Equation (34) can be condensed and generalized by a differential operator G[ϕ], G : C^{n,1}(Ω) → C(Ω′), which contains all the time and space derivatives of the SDE up to order n in space. In other words, the operator G acts on the space C^{n,1}, the class of all functions that have continuous first derivatives in time and continuous n-th derivatives in space.
Within the framework of this study, we will assume that the excitation of the SDE has prior Gaussian statistics, P(ξ) = G(ξ, Ξ), with known covariance Ξ. For a general G, ξ in its present form does not fully specify ϕ; for this, additional initial conditions ϕ_0 at time t_0 have to be specified. We fix this by augmenting ξ with the initial field configuration, η := (ϕ_0, ξ), and by extending G to G′[ϕ] := (ϕ(·, t_0), G[ϕ]), with G′ : C^{n,1}(Ω) → C(Ω) such that G′[ϕ] = η and G′⁻¹[η] = ϕ hold and are both uniquely defined. Then, the prior probability for the signal field follows according to Equation (6), and the functional determinant becomes ||δG′[ϕ]/δϕ|| = ||δG[ϕ]/δϕ′||, where we note that δG/δϕ : C^{n,1}(Ω) × C(Ω′) → C(Ω′) and therefore, after evaluation of this for a specific field configuration ϕ, δG[ϕ]/δϕ′ : C(Ω′) → C(Ω′) is a linear operator, which actually is an isomorphism. Thus, we finally get P(ϕ) = G(G[ϕ], Ξ) ||δG[ϕ]/δϕ′|| P(ϕ_0). If we want to have the initial conditions unconstrained, we could set P(ϕ_0) = const. This is possible, as we could specify initial or later time conditions via additional data on the field, as explained before.
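A time-discretized toy version makes the role of the functional determinant concrete. Assuming a scalar field, a forward-Euler discretization, and a cubic drift F(ϕ) = −ϕ³ of our own choosing, the discretized operator G and its Jacobian can be written down explicitly; the Jacobian is triangular, so its determinant is the product of the diagonal entries and, for this explicit scheme, is independent of ϕ:

```python
import numpy as np

def G(phi, phi0, F, dt):
    """Forward-Euler discretization of the SDE operator:
    eta_k = (phi_k - phi_{k-1}) / dt - F(phi_{k-1}), with phi0 the initial condition."""
    prev = np.concatenate([[phi0], phi[:-1]])
    return (phi - prev) / dt - F(prev)

# Assumed toy drift F(phi) = -phi^3 and its derivative.
F = lambda p: -p**3
dF = lambda p: -3.0 * p**2

n, dt, phi0 = 5, 0.1, 0.3
phi = np.random.default_rng(1).normal(size=n)

# Jacobian dG/dphi: lower bidiagonal with 1/dt on the diagonal.
J = np.zeros((n, n))
for k in range(n):
    J[k, k] = 1.0 / dt
    if k > 0:
        J[k, k - 1] = -1.0 / dt - dF(phi[k - 1])

# Triangular => determinant is the product of the diagonal entries, phi-independent here.
print(np.linalg.det(J), (1.0 / dt)**n)
```

For implicit or midpoint discretizations, the diagonal acquires F′-dependent terms and the determinant becomes ϕ-dependent, which is precisely the situation in which a path-integral representation of the determinant becomes useful.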

Partition Function
DFI builds on P(d, ϕ) = P(d|ϕ) P(ϕ), the joint probability of data and field, to obtain field expectation values by investigating the moment generating partition function Z(d, J) = ∫Dϕ G(d − R[ϕ], N) P(ϕ) e^{J†ϕ}. Here, we used that the measurement noise exhibits Gaussian statistics with known covariance N. We observe that the generating field J is not strictly needed, as we could equally well take derivatives with respect to j in order to generate moments.
Central to this partition function is the field prior, P(ϕ) = B(ϕ) J(ϕ) P(ϕ_0). (44) This contains a signal-dependent term B(ϕ) from the excitation statistics as well as another one, J(ϕ), from the functional determinant. In particular, the calculation of this determinant remains a computational problem. The aim of the next section is to represent the Jacobian determinant J by a path integral over fermionic fields for the data-free partition function Z = ∫Dϕ B(ϕ) J(ϕ) P(ϕ_0).

Grassmann Fields
Grassmann numbers {χ_1, χ̄_1, . . . , χ_N, χ̄_N} are independent elements which anticommute among each other [34-36] and thus follow the Pauli principle, χ_i χ_j = −χ_j χ_i, and in particular χ_i² = 0. Consequently, a function depending on the Grassmann numbers χ and χ̄ can be Taylor expanded to f(χ, χ̄) = a + b χ + b̄ χ̄ + c χ̄χ, as all higher powers vanish. A special feature of Grassmann numbers is that integration and differentiation with respect to them are the same. As a consequence, one can write down the following Grassmann integrals: ∫dχ 1 = 0 and ∫dχ χ = 1. In order to represent the infinite dimensional Jacobian by a path integral, we need to pass from Grassmann variables to Grassmann fields with infinitely many degrees of freedom. This leads us to path integrals over Grassmann fields, with the corresponding integration rules applied at every point, where χ̄† is the adjoint of the anti-commuting field χ̄. The scalar product χ̄†χ = ∫_{Ω′} dx χ̄(x)χ(x) will here be taken only over the domain Ω′ without the initial time slice, as the Grassmann fields are introduced to represent the functional determinant J(ϕ), which is also defined only over this domain. In the following, we abbreviate the notation by writing ∫dx for ∫_{Ω′} dx.

Path Integral Representation of Determinants and δ-Functions
By means of the Grassmann fields, we derive the path integral representation for J, the absolute value of the determinant of the Jacobian δG[ϕ]/δϕ [37]. For this purpose, we take two unitary transformations U and V with the property that M = V (δG[ϕ]/δϕ) U becomes diagonal with positive and real entries. These are then used to transform the Grassmann fields, which leads to a weighting of the path integral differentials by the determinants of U and V. Here we used the identity of integration and differentiation for Grassmann variables, dχ = ∂/∂χ = (∂χ′/∂χ)(∂/∂χ′) = |U|⁻¹ dχ′, to transform their differentials. The determinant of the operator M is given by the product of the determinants of the three operators, from which we can infer the Jacobian determinant, since |U| and |V| have unit modulus. As the operator M is diagonal with eigenvalues {m_i} on the diagonal, we can write its determinant as a product of the N eigenvalues, taking the limit of infinite dimensions N → ∞, by means of Equations (47)-(49).
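The determinant representation can be verified explicitly in finite dimensions. The following sketch implements a minimal Grassmann algebra (all helper functions are our own, not from any standard library) and checks the Berezin-integral identity ∫Dχ̄Dχ exp(−χ̄†Mχ) = det M for a random 3×3 matrix M:

```python
import numpy as np
from math import factorial

def mono_mul(a, b):
    """Multiply two Grassmann monomials (sorted index tuples); returns (sign, monomial)."""
    seq = list(a) + list(b)
    if len(set(seq)) != len(seq):
        return 0, None  # repeated generator: chi^2 = 0 (Pauli principle)
    sign = 1
    for i in range(len(seq)):          # bubble sort, tracking the permutation parity
        for j in range(len(seq) - 1):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def gmul(f, g):
    """Product of two Grassmann-algebra elements (dicts: monomial -> coefficient)."""
    out = {}
    for ma, ca in f.items():
        for mb, cb in g.items():
            s, mm = mono_mul(ma, mb)
            if s:
                out[mm] = out.get(mm, 0.0) + s * ca * cb
    return out

def gexp(x, nmax):
    """exp of a nilpotent even element x via its (finite) Taylor series."""
    result, power = {(): 1.0}, {(): 1.0}
    for k in range(1, nmax + 1):
        power = gmul(power, x)
        for mm, c in power.items():
            result[mm] = result.get(mm, 0.0) + c / factorial(k)
    return result

def berezin(f, idx):
    """Left Berezin integral over generator idx (equals left differentiation)."""
    out = {}
    for mm, c in f.items():
        if idx in mm:
            p = mm.index(idx)
            key = mm[:p] + mm[p + 1:]
            out[key] = out.get(key, 0.0) + ((-1)**p) * c
    return out

# Generators: chi_i -> index i, chibar_i -> index N + i.
N = 3
M = np.random.default_rng(0).normal(size=(N, N))
action = {}
for i in range(N):
    for j in range(N):
        s, mm = mono_mul((N + i,), (j,))
        action[mm] = action.get(mm, 0.0) - s * M[i, j]  # -chibar_i M_ij chi_j

Z = gexp(action, N)
for i in range(N):       # integrate d chibar_i d chi_i pairwise
    Z = berezin(Z, i)
    Z = berezin(Z, N + i)
print(Z.get((), 0.0), np.linalg.det(M))  # both equal det M
```

Since the quadratic form χ̄†Mχ is an even, nilpotent element of the algebra, its exponential is a finite polynomial, and integrating out all generators leaves exactly the determinant.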
The insertion of the result for the determinant of the diagonal matrix M into the definition of the Jacobian in Equation (57), using Equation (55), yields the representation of the Jacobian in terms of an integral over independent Grassmann fields, J(ϕ) = ∫Dχ̄Dχ exp(−χ̄†(δG[ϕ]/δϕ)χ). We note that an equivalent expression is obtained by rescaling the fermionic variables with −i, as the factor −i cancels out in taking the absolute value. In the following, we will not track such multiplicative factors of unit absolute value for probabilities, as these can be fixed at the end of the calculation. The other term in P(ϕ) = B(ϕ) J(ϕ) P(ϕ_0), as expressed by Equation (44), is B(ϕ) = G(G[ϕ], Ξ). Here, it is useful to step back to the initial form including the excitation field, B(ϕ) = ∫Dξ δ(G[ϕ] − ξ) e^{−H(ξ)} with H(ξ) = −ln G(ξ, Ξ) = ½ ξ†Ξ⁻¹ξ + ½ ln |2πΞ|, and to replace the δ-function by means of a path integral. In order to do so, the representation of the δ-function as an integral over Fourier modes is recalled, δ(y) = (1/2π) ∫dk e^{iky}. The migration of this to a path-integral representation is achieved by the introduction of a Lagrange multiplier field β(x), δ[G[ϕ] − ξ] = ∫Dβ exp(iβ†(G[ϕ] − ξ)). With this, the field prior reads P(ϕ) = ∫Dξ Dβ Dχ̄Dχ exp(iβ†(G[ϕ] − ξ) − H(ξ) − χ̄†(δG[ϕ]/δϕ)χ − H(ϕ_0)), with H(ϕ_0) = −ln P(ϕ_0) the information on the initial conditions.

Ghost Field Path Integrals in DFI
With the introduction of the fields β, χ, and χ̄, the DFI partition function is now given by path integrals over the excitations and over two fermionic and two bosonic degrees of freedom, which are summarized in a tuple of fields ψ = (ϕ, β, χ, χ̄). (Note that the ψ defined here differs from the initially introduced system state, also denoted by ψ. As the latter will not be used any more in this work, the reuse of the symbol is hopefully acceptable.)
Next, the exponent of the partition function in Equation (66) is reshaped in order to be Q-exact. This means that the exponent shall depend on the fields only through the introduced functional {Q, X} for a suitable X. For this, we investigate the two ghost and Lagrange field dependent terms in Equation (66) separately.
The fermionic ghost field dependent exponent and the bosonic Lagrange field dependent exponent can each be brought into such a form, so that the whole ghost and Lagrange field dependent exponent can be written as a Q-exact expression using Equations (68) and (69). According to these auxiliary calculations, the partition function in Equation (66) takes a form in which the integration over the excitation fields creates a partition function that only contains the fields of the set ψ = (ϕ, β, χ, χ̄). With the aid of a Gaussian integral relation for a bosonic field y(x) that is independent of ϕ, the integration over the excitation field can be performed for a Gaussian excitation field (H(ξ) = ½ ξ†ξ) by means of Equation (72). Now, we define the odd functional Q for reasons of clarity. Besides, we revive the statistical mechanics formalism for the definition of the partition function from Equation (12), as well as the corresponding ghost and Lagrange field dependent information H(ψ). Here, ≐ indicates equality up to a constant term due to the untracked absolute phase of our expressions. By comparison, we find a relation between the prior information Hamiltonian of the signal field H(ϕ) from Equation (12) and H(ψ). Let us now emphasize the first time derivative in the SDE by taking the definition of the SDE from Equation (34), so that the θ-functional and, with it, the functional Q(ψ) on the set of fields ψ can be introduced. Evaluating the information for this θ-functional using Equation (77), one gets Equation (83). The fermionic field χ was only defined over Ω′, the field domain without the initial time slice, in order to represent the determinant of the Jacobian of G[ϕ] with respect to ϕ′. One can extend the support of χ to Ω, including the initial time slice, by introducing a split notation for this extended field, χ = (χ_0, χ′)†, with χ′ denoting the original fermionic field over Ω′.
We then find that the ghost field has to vanish at the initial time step t_0, i.e., χ = (0, χ′)†, in order to assure that the expression abbreviated by A in Equation (85) does not diverge. The crucial insight is given by Equation (85): if χ_0 ≠ 0, the expression A would diverge and Equation (83) would not hold. In order to reestablish a compact notation in Equation (86), we note that any finite assignment of ∂_t χ_0 ≠ 0 would only make a vanishing contribution to the integral, as it lives on an infinitesimally small support.
The information Hamiltonian of Equation (83) has two parts. We call the left part, which contains the time derivatives of the fermionic and bosonic fields, the dynamic information. The right part, which is described by the Poisson bracket, is referred to as the static information. The derivation of Poisson brackets in a system with fermionic and bosonic fields is described in [38,39].
This yields the partition function of Equation (90). So far, we have represented the partition function in terms of the signal field ϕ and the three fields β, χ, and χ̄.
In the case of a white excitation field ξ, the partition function of DFI can be derived using the Markov property. For this, we start with the IFT partition function for a bosonic field ϕ and a fermionic field χ and decompose it in terms of time-ordered conditional probabilities, where ϕ_0 = ϕ(·, t_0) is the field at initial time t_0 = 0, while there is no χ_0 = χ(·, t_0). The conditional probabilities can then be represented as QFT transition amplitudes [40,41] between states of the system, denoted in Dirac notation. At this stage, these are formal definitions, with the time-localized states |ϕ_k, χ_k⟩ defined at some unspecified time t. Here, j and k label time-slice field configurations, like ϕ(·, t) = ϕ_j and ϕ(·, t) = ϕ_k, and their associated times are t = t_j and t = t_k. The first line does not contain a usual scalar product between states, as the variables first have to be brought to a common time. This is done in the second line by the transfer operator M(t_k, t_j), which describes the mapping of states at time t_j to such at t_k. In [19], it is shown that a representation of these state vectors is given by the exterior algebra over the field configuration space.
By assigning field operators to the fermionic and bosonic fields, χ and ϕ, as well as to their momenta, ν and ω, respectively, the partition function in Equation (93) can be rewritten in terms of the generalized Fokker-Planck operator of the states, Ĥ, according to [31,40-42]. Ĥ is not to be confused with the information Hamiltonian H(ψ|ϕ_0). The precise relation of these will be established in the following.
As mentioned in [18-21], the time evolution operator Ĥ is not Hermitian, and thus the time evolution is not described by the Schrödinger equation but by the generalized Fokker-Planck equation instead. These and the following equations define the properties of Ĥ. The conditional probabilities for the fields ϕ_k and χ_k, given the fields at the previous time step, ϕ_{k−1} and χ_{k−1}, are given by the transition amplitudes between the corresponding states and are defined via the time evolution. At this point, we insert a resolution of unity, where the |ω_k, ν_k⟩ are momentum eigenstates of the fields that obey an orthogonality relation on equal time slices. If we choose infinitesimally small time steps, we can evaluate the time-evolution operator on the momentum eigenstates, which leads to an explicit expression for the conditional probability. The formal definition of H(ϕ_k, χ_k, ω_k, ν_k) is obtained in the limit ∆t → 0. With this in mind, the conditional transition probability distributions can be written in terms of the function H. In the next step, these are inserted into the partition function in Equation (93). Taking the limit ∆t → 0, N → ∞ leads to the path-integral form of Equation (103). In the end, the partition function in Equation (90) needs to be equal to the partition function in Equation (103) in order to guarantee consistency of the theory. This permits the identification ∫dt H(ψ_t) = i{Q(ψ), Q̄(ψ)}.
To sum up, it was shown that the auxiliary fields χ̄ and β are simply the momenta of the ghost field χ and the signal field ϕ, respectively. For the moment, the more important finding is that the time evolution is governed by the Q-exact static information, i.e., ∫dt H(t) = i{Q, Q̄}. Comparing Equation (89) to Equation (106), we find that this enters directly the information Hamiltonian, which can be regarded, in combination with Equation (80), as the central connection between STS and IFT, relating the information Hamiltonian H(ψ|ϕ_0) for the full system trajectory to the Fokker-Planck evolution operators H(ψ_t) on individual time-slices. H(ψ|ϕ_0) is a dimensionless quantity, whereas H(ψ_t) has the units of a rate. In [19] it is shown that {Q, ·} is the path-integral version of the exterior derivative d̂ in the exterior algebra. This recognition allows us to identify the time evolution in Equation (106) as the path-integral version of the time-evolution operator in the Fokker-Planck equation. Moreover, it is demonstrated there that this time-evolution operator is d̂-exact and, since the exterior derivative is nilpotent, the exterior derivative commutes with the time evolution. The conclusion is that this corresponds to a topological supersymmetry. First, d̂, as the operator representative of {Q, ·}, interchanges fermions and bosons, since it replaces one bosonic field variable by a fermionic one. Second, a physical system is symmetric with respect to an operator if the operator commutes with the time-evolution operator. As this is the case for d̂ and Ĥ, the field dynamics is supersymmetric.
Here, it should be recalled that the ghost fields are scalar with fermionic statistics. In this sense, the symmetry generated by the charge Q can be considered a Becchi-Rouet-Stora-Tyutin (BRST) symmetry [43] in the context of this paper. Still, for further investigations of STS in [18,19], the formulation of the generated symmetry as a topological supersymmetry according to [44] is crucial. For this reason, we speak of a topological supersymmetry in this paper.

Spontaneous SUSY Breaking and Field Inference
The supersymmetry of a dynamical field can be spontaneously broken [18-21]. This coincides with the appearance of dynamical chaos, as characterized by positive Lyapunov exponents for the growth of the difference of nearby system trajectories. It is intuitively clear that the occurrence of chaos reduces the predictability of the system and therefore makes field inference from measurements more difficult. We hope that the here established connection of DFI and STS will permit quantifying the impact of chaos on field inference in future research. For the time being, we investigate the reverse impact, that of measurements on the supersymmetry of the field knowledge as encoded in the partition function.

Abstract Considerations
In Section 2.6, we introduced the moment generating function in IFT in order to calculate field expectation values after measurement data d became available. For a dynamical field, this can now be written with the help of STS according to Equation (29). Note that we removed the −i factor from the fermionic variables that was introduced in Equation (61) in order to connect to the conventions of the STS literature. Doing so alleviates the necessity to take the absolute value of the corresponding term. From Equation (108), we see that the combined information representing the knowledge from measurement data d and about the dynamics, as expressed by the θ-function from Equation (74), consists of several parts. The first part, −χ̄†∂_tχ − iβ†∂_tϕ + {Q(ψ), Q̄(ψ)}, describes the dynamics of the field ϕ and that of the ghost fields χ and χ̄ for times after the initial moment by a Q-exact term, meaning that supersymmetry would be conserved if only this term affected the fields for non-initial times t > t_0. The last term, H(ϕ_0) = −ln P(ϕ_0), describes our knowledge of the initial conditions and not of the evolving field. The middle term, H(d|ϕ) = −ln P(d|ϕ), describes the knowledge gain by the measurement. If it addresses non-initial times, it is in general not Q-exact. Thus, if one took the perspective of including the measurement constraints into the system dynamics, as was done with the noise excitation, the thereby extended system would not be Q-exact any more. The reason for this is that "external forces" need to be introduced into the system description to guide its evolution through the constraints set by the measurement, and these are not stationary and Gaussian as the excitation noise is. More precisely, the knowledge state on the excitation field ξ is in general no longer a zero-centered Gaussian prior with a stationary correlation structure, but a posterior P(ξ|d) with explicitly time-dependent mean and correlation structure in ξ.

Idealized Linear Dynamics
In order to illustrate the impact of chaos on the predictability of a system, we analyze a simplified, but instructive, scenario. Our starting point is the information Hamiltonian for all fields, Equation (109), integrated over the β field. The resulting information Hamiltonian contains, in this order, terms that represent the excitation noise statistics G(ξ, Ξ) (as ξ = G[ϕ]), the functional determinant of the dynamics (represented with the help of fermionic fields), the measurement information H(d|ϕ), and the information on the initial condition H(ϕ_0). We assume the system ϕ to start at ϕ(·, 0) = ϕ_0 at t = 0 and to obey Equation (34) afterwards with ξ ← G(ξ, 1), i.e., Ξ = 1. We can then define a classical field ϕ_cl that obeys the excitation-free dynamics, and a deviation ε := ϕ − ϕ_cl from this, which starts at ε(·, 0) = 0 and evolves according to Equation (112). Here, we performed a first-order expansion in the deviation field. Furthermore, we assume that only a sufficiently short period after t = 0 is considered, such that second-order effects in ε as well as any time dependence of A can be ignored. For this period, we have the solution ε_t = ∫_0^t dt′ e^{A(t−t′)} ξ_{t′}. Further, we imagine that a system measurement at time t = t_o perfectly probes a normalized eigendirection b of A, i.e., that we get noiseless data via a linear measurement operator. Here, b fulfills the eigenvalue relation A b = λ_b b, with λ_b the corresponding eigenvalue, and † denoting the adjoint with respect to spatial coordinates only. λ_b is also the Lyapunov exponent of the dynamical mode b, which is stable for λ_b < 0 and unstable for λ_b > 0. The latter is a prerequisite for chaos. Finally, to exclude any further complications, we assume that A can be fully expressed in terms of a set of such orthonormal eigenmodes. Now, we are in a convenient position to work out our knowledge of ε for all times for which our idealizing assumptions hold.
A priori, the deviation evolves with a vanishing average and a dispersion most conveniently expressed in the eigenbasis of A. We introduced here with f_a(t, t′) := ⟨a† ε_t ε†_{t′} a⟩_(ξ) the a priori temporal correlation function of a field eigenmode a. Since both the dynamics as well as the measurement keep the eigenmodes separate in our illustrative example, we only obtain additional information on the mode b from our measurement. This is given according to Equation (33) by the posterior, with posterior mean and posterior uncertainty that follow respectively from Equations (27) and (23) in the limit of vanishing noise covariance N. Expressing these in the eigenbasis of A gives Equation (124) and its counterpart for the mean. Figure 1 shows the mean and uncertainty dispersion of the measured mode for various values of λ_b. The correlation between different modes a ≠ a′ vanishes, and therefore any mode a ≠ b behaves like a prior mode, shown in grey in Figure 1. For the measured mode b, the propagator is in general non-zero, but vanishes for times separated by the observation, e.g., D_{(b,t)(b,t′)} = 0 for t < t_o < t′, as one can easily verify. Thus, the perfect measurement introduces a so-called Markov blanket, which separates the periods before and after it from each other. Knowledge about times earlier than t_o does not inform about later times, as the measurement at t_o provides the only relevant constraint for the later period. The equal-time uncertainty of the measured mode is given by Equation (126). Figure 1 shows this for a number of instructive values of λ_b. The impact of the Lyapunov exponent on the predictability of the system is clearly visible: the larger the Lyapunov exponent, the faster the uncertainties grow. This can be seen by comparison of the top panels or by inspection of the bottom middle panel of Figure 1. Thus, chaos, which implies the existence of positive Lyapunov exponents, makes field inference more difficult. This, however, is only true on an absolute scale.
If one considers relative uncertainties, as also displayed in Figure 1 on the bottom right, then it turns out that these grow slowest for the more unstable modes. This is the memory effect of chaotic systems, which can remember small initial disturbances for long, if not infinite, times.
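The scaling behind these two statements can be checked numerically. Assuming unit-variance white-noise excitation and a noiseless measurement d = 1 at t_o (the function names and normalization below are ours, for illustration only), the mode variance a time Δt after the measurement is (e^{2λΔt} − 1)/(2λ), while the predicted mean evolves as d e^{λΔt}:

```python
import math

def abs_var(lam, dt):
    """Variance of a mode a time dt after a perfect measurement,
    assuming unit-variance white-noise excitation."""
    if lam == 0.0:
        return dt  # Wiener-process limit of (e^{2 lam dt} - 1)/(2 lam)
    return (math.exp(2 * lam * dt) - 1) / (2 * lam)

def rel_var(lam, dt, d=1.0):
    """Squared relative uncertainty: variance over squared mean d e^{lam dt}."""
    return abs_var(lam, dt) / (d * math.exp(lam * dt)) ** 2

lams = [-1.0, 0.0, 1.0]   # Ornstein-Uhlenbeck, Wiener, chaotic
dt = 1.0
absolute = [abs_var(l, dt) for l in lams]
relative = [rel_var(l, dt) for l in lams]

# Absolute uncertainty grows with the Lyapunov exponent ...
assert absolute[0] < absolute[1] < absolute[2]
# ... while the relative uncertainty shrinks with it.
assert relative[0] > relative[1] > relative[2]
```

The absolute uncertainty indeed grows fastest for the largest Lyapunov exponent, while the relative uncertainty grows slowest for it, in line with the memory effect described above.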
To simplify the system further, we concentrate first on the case λ_b = 0, which corresponds to a Wiener process. For this, we get a posterior mean and an information propagator that provide the equal-time uncertainty for our measured mode b, which is also shown in Figure 1 in both middle panels. The Wiener process sits on the boundary between the stable Ornstein-Uhlenbeck process with λ_b < 0 and the instability of chaos with λ_b > 0. This marginally stable case will now be taken into the non-linear regime.
Figure 1. Illustration of the knowledge on a measured system mode b. Top row: a priori (gray) and a posteriori (cyan) field mean (lines) and one-sigma uncertainty (shaded) for an Ornstein-Uhlenbeck process (left, λ_b = −1), a Wiener process (middle, λ_b = 0), and a chaotic process (right, λ_b = 1) of a system eigenmode b after one perfect measurement at t_o = 1. Bottom row: the same, but on logarithmic scales and for Lyapunov exponents λ_b = −3, −2, −1, 0, 1, 2, and 3, displayed in colors ranging from light to dark gray in this order (i.e., strongest chaos is shown in black). Left: posterior mean. Middle: uncertainty of prior (dotted) and posterior (dashed). Right: relative posterior uncertainty.
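For λ_b = 0, the posterior described above is the classical Brownian bridge, whose equal-time variance t(t_o − t)/t_o can be cross-checked by Monte Carlo (a sketch under the assumptions of unit-variance excitation and measured value d = 1; the bridge construction W_t − (t/t_o) W_{t_o} + (t/t_o) d is standard):

```python
import numpy as np

rng = np.random.default_rng(0)
t_o, n_steps, n_paths = 1.0, 100, 20_000
dt = t_o / n_steps
t = np.linspace(0.0, t_o, n_steps + 1)

# Unconditioned Wiener paths driven by unit-variance white noise.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# Condition on the perfect measurement eps_b(t_o) = d via the bridge construction.
d = 1.0
bridge = W - (t / t_o) * W[:, -1:] + (t / t_o) * d

mid = n_steps // 2
emp_mean = bridge[:, mid].mean()
emp_var = bridge[:, mid].var()

# Posterior mean d * t/t_o and bridge variance t (t_o - t)/t_o at t = 0.5:
assert abs(emp_mean - 0.5) < 0.02
assert abs(emp_var - 0.25) < 0.02
```

This reproduces the vanishing posterior uncertainty at both the initial time and the measurement time, with the maximum in between, as seen in the middle panels of Figure 1.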

Idealized Non-Linear Dynamics
We saw that the posterior uncertainty is a good indicator of the difficulty of predicting the field at locations or times where or when it was not measured. This holds, modulo some corrections, also in the case of non-linear dynamics, which introduces non-Gaussianities into the field statistics.
In order to investigate such a non-Gaussian example, we extend the previous case with λ_b = 0 to the next order in ε, while still assuming that all modes are dynamically decoupled (up to that order), such that we only need to concentrate on the dynamics of ε_b := b†ε, where again † denotes an integration in position space only. This mode will exhibit an infinite posterior mean for times larger than t_o. To understand why, let us first investigate the noise-free solution of ∂_t ε_b = (1/2) μ_b ε_b² for some finite starting value ε_b(t_i) = ε_i at t_i > t_o. This might have been created by an excitation fluctuation during the period [t_o, t_i], for which always a potentially tiny, but finite, probability exists. The free solution after t_i is given by ε_b(t) = ε_i / (1 − ε_i μ_b (t − t_i)/2), which develops a singularity for ε_i μ_b > 0 within the finite period τ = 2/(ε_i μ_b). Thus, there is a finite probability that at time t_s = t_i + τ the system is at infinity, and this also lets the expectation value of ε_b diverge at t_s. This moment, when the expectation value has diverged, can be made arbitrarily close to t_o, as the Gaussian fluctuations in ξ permit reaching any necessary ε_i = 2/(μ_b τ) at, say, t_i − t_o = (t_s − t_o)/2 = τ with a small, but finite, probability. For times t ∈ [0, t_o], in between the moments when the two data points were measured, the posterior mean should stay finite. The reason is that any a priori possible trajectory diverging to (plus) infinity (for μ_b > 0) during this period is excluded a posteriori by the data point (t, ε_b) = (1, 1). Such trajectories could not have taken place, as the dynamics does not permit trajectories to return from (positive) infinite values to finite ones, since that would require an infinitely large (negative) excitation, which has probability zero.
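The finite-time blow-up of the noise-free solution can be checked numerically (a sketch; ε_i = 2 and the step size are arbitrary choices of ours, with μ_b = 0.3 as in the text):

```python
import math

mu_b, eps_i, t_i = 0.3, 2.0, 0.0
tau = 2.0 / (eps_i * mu_b)   # predicted blow-up period: t_s = t_i + tau

def eps_free(t):
    """Analytic excitation-free solution of d eps/dt = (mu_b / 2) eps^2."""
    return eps_i / (1.0 - 0.5 * eps_i * mu_b * (t - t_i))

# Forward-Euler integration up to just before the singularity.
dt, t, eps = 1e-4, t_i, eps_i
while t < t_i + 0.95 * tau:
    eps += 0.5 * mu_b * eps**2 * dt
    t += dt

# Euler tracks the analytic solution until close to the blow-up ...
assert abs(eps - eps_free(t)) / eps_free(t) < 0.01
# ... and halfway to t_s the amplitude has exactly doubled.
assert abs(eps_free(t_i + 0.5 * tau) - 2 * eps_i) < 1e-9
```

The growth is slow at first and then explosive, which is why the diverging trajectories can be excluded by a single later measurement at a finite value.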
Let us assume that for the period t ∈ [0, t_o], the second-order approximation of the dynamical equation holds. Inserting this into Equation (110) yields an information Hamiltonian that splits into a free and an interacting part. The free information Hamiltonian H_free(d, ϕ, χ, χ̄) defines the Wiener-process field inference problem we addressed before; its classical field as well as its bosonic and fermionic propagators are given accordingly, and here we also introduced their Feynman diagram representations. The fermionic propagator is the inverse of δ(t − t′)∂_t, as is easily verified. The interaction Hamiltonian H_int(d, ϕ, χ, χ̄) provides the interaction vertices. The integration over the time axis in Feynman diagrams can be restricted to the interval [0, t_o], as the propagator vanishes if (exactly) one of the times is larger than t_o; see Equation (125).
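The verification of the fermionic propagator can be sketched as follows (assuming retarded boundary conditions, i.e., a propagator θ(t − t′), consistent with the ghost field vanishing on the initial time slice):

```latex
\int \mathrm{d}t''\, \delta(t - t'')\, \partial_{t''}\, \theta(t'' - t')
  \;=\; \partial_{t}\, \theta(t - t')
  \;=\; \delta(t - t') .
```

The Heaviside step function θ(t − t′) thus inverts δ(t − t′)∂_t on the space of functions vanishing at the initial time.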
To first order in μ_b, the posterior mean and uncertainty dispersion for 0 ≤ t, t′ ≤ t_o are then given by Feynman diagrams (see Appendix A). It turns out that all first-order diagrams (in μ_b) with a bosonic three-vertex are zero. The reason for this lies in the fact that these are all of a similar form, with g(t_1, t_2) = μ_b m_{t_1} m_{t_2}, (1/2) μ_b D^b_{t_1 t_2}, and μ_b m_{t_1} D^b_{t_2 t}, respectively. All these diagrams vanish because D^b_{t_o t_o} = D^b_{00} = 0. Thus, to first order in μ_b, only a correction due to the fermionic loop is necessary. This is negative (for positive μ_b), as from the sum over trajectories that go through the initial data (t_i, ε_{b,i}) = (0, 0) as well as through the later observed data (t_o, ε_{b,o}) = (1, 1), all the trajectories that diverge prematurely (within t ∈ [0, t_o]) are excluded.
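The qualitative statement, that conditioning on both data points excludes the prematurely diverging trajectories, can be cross-checked with a brute-force rejection sampler (our construction, not the diagrammatic computation of the text; μ_b is exaggerated here relative to the 0.3 used above, so that premature divergence becomes visible in a modest sample):

```python
import numpy as np

rng = np.random.default_rng(1)
mu_b, t_o, n_steps, n_paths = 3.0, 1.0, 200, 20_000
dt = t_o / n_steps
tol = 0.1  # acceptance window around the measured value eps_b(t_o) = 1

paths = np.zeros((n_paths, n_steps + 1))
alive = np.ones(n_paths, dtype=bool)
eps = np.zeros(n_paths)
for k in range(n_steps):
    # Euler-Maruyama step for d eps = (mu_b / 2) eps^2 dt + dW
    eps = eps + 0.5 * mu_b * eps**2 * dt + rng.normal(0.0, np.sqrt(dt), n_paths)
    alive &= np.abs(eps) < 1e6  # flag prematurely diverging trajectories
    paths[:, k + 1] = eps

accepted = alive & (np.abs(paths[:, -1] - 1.0) < tol)
cond_mean = paths[accepted].mean(axis=0)

assert (~alive).sum() > 50    # some a priori trajectories diverge prematurely
assert accepted.sum() > 100   # ...but the measured value remains reachable
assert np.all(np.isfinite(cond_mean))
assert 0.0 < cond_mean[n_steps // 2] < 1.0  # conditional mean stays moderate
```

The accepted ensemble is finite everywhere, and its mean between the two measurements stays away from the high values from which trajectories easily diverge, in line with the role of the fermionic-loop correction discussed above.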
The posterior mean and uncertainty of the scenario with λ_b = 0 and μ_b = 0.3 are displayed for t ∈ [0, t_o] in the middle panel of Figure 2 in red, in comparison to those for λ_b = 0 and μ_b = 0 in cyan. There it can be observed that the exclusion of the diverging trajectories by the observation has made the ensemble of remaining trajectories stay away from high values, from which they would more easily diverge. Furthermore, this effect is solely represented by the fermionic Feynman diagram, as all bosonic corrections vanish (for λ_b = 0) up to the considered linear order in μ_b. Thus, taking the functional determinant into account, for which the fermionic fields were introduced, is important in order to arrive at the correct posterior statistics. This effect arises naturally in the Stratonovich formalism of stochastic systems used here, and is less obvious in Itô's formalism.
Figure 2. As Figure 1, but for the non-linear system defined by Equation (147) within the period t ∈ [0, 1]: with first-order bosonic and fermionic perturbation corrections for μ_b = 0.3 in red, as in Figure 1 without such non-linear corrections in cyan, and with only bosonic corrections in blue (dotted, displayed without uncertainty). The three panels display the cases λ_b = −1 (left), λ_b = 0 (middle), and λ_b = 1 (right). Note that the a priori mean and uncertainty dispersion are both infinite for any time t > 0, as without the measurement, trajectories reaching positive infinity within finite times are not excluded from the ensemble of permitted possibilities.
The fermionic propagator for λ_b = 0 is easily verified, as noted above. Interestingly, the interplay of this non-linear dynamics with the constraint provided by the measurement leads to a reduced a posteriori uncertainty for unstable systems (λ_b > 0) for times prior to the measurement. This is not in contradiction to the notion of chaotic systems being harder to predict. Here, we are looking at trajectories that could have led, starting from some known value, to the observed situation at a later time. Thanks to the stronger divergence of trajectories of chaotic systems, the variety of trajectories that pass through both the initial condition and the later observed situation is smaller than for a non-chaotic system. Thus, the measurement provides more information for this period in the chaotic regime, but less for the period after the measurement.

Conclusions and Outlook
We brought dynamical field inference (DFI), based on information field theory (IFT), into contact with the supersymmetric theory of stochastics (STS). To this end, we showed that the DFI partition function becomes the STS one if the excitation of the field is white Gaussian noise and no measurements constrain the field evolution. In this case, the dynamical system has a supersymmetry. We note that neither STS nor DFI is limited to the white-noise case.
For chaotic systems, this supersymmetry is broken spontaneously. As the presence of chaos limits the ability to predict a system, DFI for systems with broken supersymmetry should become more difficult. We hope that the connection of STS and DFI established here will allow this to be investigated quantitatively.
While re-deriving basic elements of STS within the framework of IFT, we carefully investigated the domains on which the different fields and operators live and act, respectively, using the perspective that the continuous time description of the system should be the limiting case of a discrete time representation for vanishing time steps. Thereby, we showed, for example, that the fermionic ghost field has to vanish on the initial time slice for the theory to be consistent.
Furthermore, we showed that most measurements of the field during its evolution phase do not obey the system's supersymmetry, i.e., they are not Q-exact. Nevertheless, the formalism of STS is still applicable and might help to develop advanced DFI schemes. For example, two of the challenges DFI faces are the representation of the dynamics-enforcing delta function and of a Jacobian in the path integral of the DFI partition function. For these, STS introduces bosonic Lagrange and fermionic ghost fields, respectively. Using those in perturbative calculations, for example via Feynman diagrams, might allow the development of DFI schemes able to cope with non-linear dynamical systems.
In order to illustrate how such non-linear dynamics inference would look, we investigated a simplified situation, in which the deviation of a system driven by stochastic external excitation from the classical (unperturbed) evolution is measured at an initial and a later time. The simplifications we impose are that (i) the measurement probes exactly one eigenmode of the linear part of the evolution operator for these deviations, that (ii) the evolution operator stays stationary during the considered period (thus different modes do not mix), and that the non-linear part of the evolution is also (iii) stationary, (iv) second order in the observed eigenmode, and (v) keeps that mode separate from the other modes (no non-linear mode mixing). Under these particular conditions (i)-(v), the field inference problem becomes a one-dimensional problem for the measured mode as a function of time, which can be treated exactly for a vanishing non-linearity and perturbatively, with the help of Feynman diagrams, in the case of a non-vanishing non-linearity. Thereby, it turns out that the fermionic contributions, which implement the effect of the functional determinant, are key to obtaining the correct a posteriori mean of the system.
The investigation of the illustrative example shows a few things. First, predicting the future evolution of a more chaotic system from measurements is harder than for a less chaotic one, as the absolute uncertainty of the measured mode increases faster in the former situation. This is not very surprising, but the second insight might be: the relative uncertainty (uncertainty standard deviation over absolute value of the deviation) grows more slowly for a chaotic system. This is an echo of the known memory effect of chaotic systems, which remember small perturbations in unstable modes for a longer time thanks to their rapid amplification. Third, non-linear dynamics, which can lead to even more drastic divergence of system trajectories (even to infinity in finite times), makes prediction of the future even harder, but enhances the amount of information measurements provide for periods between them. Due to the larger sensitivity of the system to perturbations, the measurements now exclude more trajectories that were possible a priori.
Thus, the interplay of measurements and non-linear chaotic dynamics is complex, and more interesting phenomena should become visible as soon as the simplifying assumptions (i)-(v) made in our illustrative example are dropped. For those cases, the inclusion of the fermionic part of the information field theory of stochastic systems will be as essential to obtaining the correct statistics of the system trajectories as it is in our idealized illustrative example. We believe that insights provided by the supersymmetric theory of stochastics will continue to pay off in investigations of more complex systems, which we leave for future research.

Acknowledgments:
We acknowledge insightful discussions with Reimar Leike and Jens Jasche. This work was supported in part by the Excellence Cluster Universe.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: