Transport Coefficients from Large Deviation Functions

We describe a method for computing transport coefficients from the direct evaluation of large deviation functions. The method is general, relying only on equilibrium fluctuations, and is statistically efficient, employing trajectory-based importance sampling. Equilibrium fluctuations of molecular currents are characterized by their large deviation functions, which are scaled cumulant generating functions analogous to free energies. A diffusion Monte Carlo algorithm is used to evaluate the large deviation functions, from which arbitrary transport coefficients are derivable. We find significant statistical improvement over traditional Green-Kubo based calculations. The systematic and statistical errors of this method are analyzed in the context of specific transport coefficient calculations, including the shear viscosity, interfacial friction coefficient, and thermal conductivity.

The evaluation of transport coefficients from molecular dynamics simulations is a standard practice throughout physics and chemistry. Despite significant interest and much study, such calculations remain computationally demanding. Traditional methods exploit Green-Kubo relationships [1,2] and rely on integrating equilibrium time correlation functions [3][4][5][6]. While general, depending only on identifying a relevant molecular current, these methods often suffer from large statistical errors due to finite time averaging, making them cumbersome to converge [7,8]. Alternative methods exist that directly drive a current through the system by the application of specific boundary conditions [9,10] or by altering the equations of motion [11][12][13][14]. These direct methods typically mitigate sampling difficulties by requiring that only the current is averaged rather than its time correlation function. However, such methods are generally not transferable to different transport processes. Moreover, as a nonequilibrium simulation, the details of how the current is generated can affect their convergence [15] and fidelity [16,17]. Here we propose a new way to compute transport coefficients that utilizes only equilibrium fluctuations, as in Green-Kubo calculations, but is evaluated by averaging a current, as in direct methods. Rather than applying a physical field to drive a current, we apply a statistical bias to the system's dynamics and measure the resultant response. The response is codified in the relative probability of a given current fluctuation, so this calculation is identical to the evaluation of a free energy, albeit in a path ensemble [18]. Such path ensemble free energies are known as large deviation functions [19], and with trajectory based importance sampling methods to aid in their calculation, we arrive at a method to evaluate transport coefficients that is both general and quickly convergent.
Large deviation theory has emerged as a useful formalism for considering the fluctuations of time integrated observables [20]. In fact, large deviation theory underpins many recent developments in nonequilibrium statistical mechanics, including generalized fluctuation theorems [21,22] and thermodynamic uncertainty principles [23,24]. The large deviation function is a scaled cumulant generating function and, like its equilibrium counterpart, the free energy, it codifies the stability and response of dynamical systems. While these advances in nonequilibrium statistical mechanics have yielded important relationships for systems far from equilibrium, they have also brought new insight into near-equilibrium phenomena. Andrieux and Gaspard have illustrated this especially clearly, resolving how Onsager's reciprocal relations and their generalizations beyond linear response follow from the large deviation function for the total entropy production and its symmetry provided by the fluctuation theorem [25]. They have shown the connection between the moments of a large deviation function for a time integrated current and phenomenological transport coefficients within both linear and nonlinear response regimes [26,27]. We use this insight, that large deviation functions can encode the dynamical response of a system driven away from equilibrium, to construct an efficient method for the evaluation of transport coefficients from molecular dynamics simulations.
To compute a large deviation function for a time integrated molecular current, we employ a trajectory based importance sampling procedure. Beginning with transition path sampling [28], Monte Carlo algorithms have been derived to uniformly sample path space for systems evolving with detailed balanced dynamics. These methods have found application in computing rate constants and finding reaction pathways for complex condensed phase processes spanning autoionization to viral capsid assembly [29][30][31][32][33][34][35][36]. Indeed, it was identified in early work that a reaction rate constant could be computed from a thermodynamic-like integration along path space, resulting in a relative free energy in trajectory space [18,28]. This observation was never generalized to other dynamical responses, like phenomenological transport coefficients. With the development of diffusion Monte Carlo algorithms like the cloning algorithm that directly target large deviation functions [37,38], such generalization is possible. Recent extensions of diffusion Monte Carlo algorithms that incorporate importance sampling using an iterative feedback approach [39], cumulant expansions [40], or approximate auxiliary processes [41] have improved the efficiency of these algorithms enough to make calculations for complex, high dimensional systems possible. In this way, we proceed numerically by computing directly an effective thermodynamic potential like the one Onsager identified when he first formulated his thermodynamic theory of linear response [42], with the large deviation function serving to characterize this potential. This approach, conceptually distinct from traditional methodologies, provides new ways of thinking about transport processes that are amenable to the kinds of analysis typically reserved for static equilibrium observables, such as their dependence on ensemble and generalization to nonlinear regimes [43].
While we are restricted to linear response coefficients in this article, generalization to nonlinear response regimes is straightforward [27].
The rest of the paper is organized in the following manner. In Section I, we summarize important results of large deviation theory and illustrate its connection to phenomenological transport coefficients. We also outline the simulation methodology used to compute large deviation functions. In Section II, we test our method by comparing it with calculations using the Green-Kubo formalism. We study three specific cases: the shear viscosity of TIP4P/2005 water [44], the interfacial friction coefficient between a Lennard-Jones fluid and a Lennard-Jones wall, and the thermal conductivity of a Weeks-Chandler-Andersen [45] solid. We use these models to frame a discussion of the relative systematic and statistical errors associated with our new methodology in comparison to traditional Green-Kubo calculations. We provide some final remarks in the concluding section.

I. THEORY AND METHODOLOGY
We consider systems evolving according to a Markovian stochastic dynamics, though generalization to deterministic dynamics is straightforward. In the absence of an external stimulus, these dynamics obey microscopic reversibility and thus will sample a Boltzmann distribution [46]. Under a bias, applied either at the boundaries of the system or through an external field, a current is expected to arise. If the bias is small, the response of the system can be linearized and a transport coefficient, L, is defined through

⟨J⟩ = LX,

where J is a current and X is its conjugate generalized force, which could be proportional to a temperature or concentration gradient. The entropy production rate for this process is equal to the product of the force and the current, Ṡ = ⟨J⟩X [47]. The transport coefficient, L, is thus a response function relating the applied force to the generated current, L = d⟨J⟩/dX, in the limit that X → 0. This is the object we aim to compute.

A. Transport coefficients from large deviation functions
To compute the response coefficient, L, we must identify a corresponding dynamic variable whose fluctuations will report on the system's response to the bias. Specifically, we define a time averaged current as

J = (1/t_N) ∫_0^{t_N} dt j(c_t),

where t_N is some observation time, and j(c_t) is a fluctuating variable computable from the molecular configuration, c_t, at time t. If j is correlated over a finite amount of time, then the fluctuations of J can be studied by computing its cumulant generating function,

ψ(λ) = lim_{t_N → ∞} (1/t_N) ln ⟨exp(λ t_N J)⟩,

where ψ(λ) is known as the large deviation function and λ is a statistical field conjugate to J [19]. Here, the average ⟨· · ·⟩ is taken within an ensemble of paths of length t_N, denoted as a vector of all the configurations visited over that time, or C(t_N) = {c_0, c_1, · · · , c_{t_N}}. The probability of observing such a path is given by

P[C(t_N)] = ρ[c_0] ∏_i ω[c_{i−1} → c_i],

where ρ[c_0] represents the distribution of initial conditions, and ω[· · ·] are the transition probabilities between time-adjacent configurations. As a cumulant generating function, the derivatives of ψ(λ) report on the fluctuations of the current J. For example, the first two derivatives yield

∂ψ/∂λ |_{λ=0} = ⟨J⟩ ,   ∂²ψ/∂λ² |_{λ=0} = t_N ⟨δJ²⟩,

where ⟨J⟩ is the average current and δJ = J − ⟨J⟩ is its deviation from the mean, whose squared average yields the variance of J. Because the dynamics we consider obey microscopic reversibility, ψ(λ) obeys a generalized fluctuation theorem, ψ(λ) = ψ(X − λ), where X is the generalized force as in Eq. 1. This symmetry relates the likelihood of a current to its time-reversed conjugate [26,48], and implies a fluctuation-dissipation relationship, or a relation between the second derivative of the large deviation function and the transport coefficient, L. Differentiating once gives

∂ψ/∂λ = ⟨J⟩_λ ,

where ⟨· · ·⟩_λ denotes the average in the biased path ensemble. In the limit of X → 0,

L = (1/2) ∂²ψ/∂λ² |_{λ=0} ,

where, as previously observed [26], the curvature of the large deviation function around λ = 0 is equal to the response function L up to a factor of 2.
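As a concrete illustration of these definitions, the scaled cumulant generating function and its curvature can be estimated numerically for a synthetic current. The sketch below is not the molecular dynamics calculation of the text: it uses an AR(1) process as a stand-in for a correlated current j(c_t), with block averages playing the role of J, and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# AR(1) surrogate for a correlated molecular current j(c_t) (unit timestep).
# Each block of length t_N yields one sample of the time-averaged current J.
n_paths, t_N, a = 4000, 200, 0.9
j = np.zeros(n_paths * t_N)
noise = rng.normal(size=j.size)
for t in range(1, j.size):
    j[t] = a * j[t - 1] + noise[t]
J = j.reshape(n_paths, t_N).mean(axis=1)  # time-averaged current per path

def psi(lam):
    # Finite-sample estimate of psi(lam) = (1/t_N) ln <exp(lam t_N J)>
    return np.log(np.mean(np.exp(lam * t_N * J))) / t_N

# The curvature psi''(0) equals the second cumulant t_N <(dJ)^2>.
lams = np.linspace(-0.005, 0.005, 11)
curvature = 2.0 * np.polyfit(lams, [psi(l) for l in lams], 2)[0]
print(curvature, t_N * J.var())  # should agree within sampling error
```

The fit range of λ is kept small so the exponential average stays well sampled; widening it makes the estimate exponentially noisier, which anticipates the importance sampling discussed below.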
Analogously, higher order derivatives can be related to nonlinear transport coefficients. For small values of λ, the large deviation function can be expanded as

ψ(λ) ≈ L λ²,

which is parabolic, and completely determined by L. This implies the distribution of J is Gaussian, with a variance of 2L/t_N. This correspondence is a direct reflection of Onsager's notion of an effective thermodynamic potential, where the probability of a current is given by the exponential of the entropy production.
The connection between the large deviation result and the Green-Kubo formalism can be understood by expanding the definition of J in Eq. 5. Without loss of generality, within Green-Kubo theory a transport coefficient can be computed from

L(t_M) = ∫_0^{t_M} dt ⟨j(c_0) j(c_t)⟩,

where L(t_M) is an integral over the time correlation function of j(c_t), and in the long time limit is equal to L [47]. As ⟨J⟩ = 0 for an equilibrium system, where X = 0, it is straightforward to relate the second derivative of the large deviation function with respect to λ, evaluated at λ = 0, to L as

(1/2) ∂²ψ/∂λ² |_{λ=0} = (t_N/2) ⟨δJ²⟩ = ∫_0^{t_N} dt (1 − t/t_N) ⟨j(c_0) j(c_t)⟩ → L,

where we have made use of the time-translational invariance of the equilibrium averaged time correlation function, and assumed that ⟨j(c_0) j(c_t)⟩ decays faster than 1/t. This equation is known as the Einstein-Helfand relation and is well known to yield an equivalent expression for transport coefficients [49]. Provided an estimate of ψ(λ) accurate enough to compute ψ''(0), we thus have a means of evaluating L.
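The Einstein-Helfand identity between the variance of the time-averaged current and the triangular-windowed Green-Kubo integral can be checked directly on a toy stationary series. As above, an AR(1) process with a unit timestep stands in for the current; both sides are estimated from the same data, so agreement is limited only by statistics.

```python
import numpy as np

rng = np.random.default_rng(2)

# AR(1) surrogate current (unit timestep, zero mean).
n_steps, t_N, a = 500_000, 400, 0.8
j = np.zeros(n_steps)
noise = rng.normal(size=n_steps)
for t in range(1, n_steps):
    j[t] = a * j[t - 1] + noise[t]
j -= j.mean()

# Left side: Einstein-Helfand, (t_N/2) <(dJ)^2> from block averages.
J = j[: (n_steps // t_N) * t_N].reshape(-1, t_N).mean(axis=1)
einstein_helfand = 0.5 * t_N * J.var()

# Right side: Green-Kubo with the triangular finite-time window,
# C(0)/2 + sum_{k>0} (1 - k/t_N) C(k), using the empirical autocorrelation.
C = np.array([np.mean(j[:-k] * j[k:]) if k else np.mean(j * j)
              for k in range(t_N)])
green_kubo = 0.5 * C[0] + np.sum((1.0 - np.arange(1, t_N) / t_N) * C[1:])

print(einstein_helfand, green_kubo)
```

Because the identity holds exactly in expectation for any stationary process, the two numbers differ only by correlated sampling noise.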

B. Calculation of large deviation functions
To evaluate the large deviation function for J, we use a variant of path sampling known as the cloning algorithm [37,38]. The cloning algorithm is based on a diffusion Monte Carlo procedure [50] where an ensemble of trajectories is integrated in parallel. Each individual trajectory is known as a walker, and collectively the walkers undergo a population dynamics whereby short trajectory segments are augmented with a branching process that results in walkers being pruned or duplicated in proportion to a weight. This algorithm has been used extensively in the study of driven lattice gases [51] and models of glasses [52,53]. Alternative methods for importance sampling trajectories, such as transition path sampling [31] or forward flux sampling [54], could be used similarly.
Generally, to importance sample large deviation functions, the original trajectory ensemble, P[C(t_N)], can be biased to the form [55]

P_λ[C(t_N)] = P[C(t_N)] exp(λ t_N J[C(t_N)]) / exp[ψ(λ) t_N],

where the large deviation function ψ(λ) is the normalization constant computable as in Eq. 3. Ensemble averages of an arbitrary observable, O, within the unbiased distribution and the biased one are related by

⟨O⟩_λ = ⟨O exp(λ t_N J)⟩ / ⟨exp(λ t_N J)⟩,

where the denominator is exp[ψ(λ)t_N] in the limit of large t_N. Taking O = δ(J − J[C(t_N)]) in Eq. 12, we find a familiar relationship between biased ensembles,

p_λ(J) = p(J) exp[λ t_N J − ψ(λ) t_N],

where p_λ(J) is the probability of observing a given value of the current J in the biased ensemble, and p(J) is that in the unbiased ensemble. This demonstrates that ψ(λ) is computable as a change in normalization through histogram reweighting [55]. In order to arrive at a robust estimate for ψ(λ), the two distributions, p_λ(J) and p(J), must have significant overlap. However, for large systems or long observation times, each distribution narrows, and sampling p_λ(J) by brute force is exponentially difficult. To evaluate the large deviation function, the cloning algorithm samples P_λ[C(t_N)] by noting that it can be expanded to

P_λ[C(t_N)] ∝ ρ[c_0] ∏_i ω[c_{i−1} → c_i] exp[λ δt j(c_i)],

where we have discretized the integral for J over a time δt. The argument of the product is the transition probability times a bias factor that is local in time. This combination of terms cannot be lumped together into a physical dynamics, as it is unnormalized. However, it can be interpreted as a population dynamics where the nonconservative part proportional to the bias is represented by adding and removing walkers. In particular, in the cloning algorithm, trajectories are propagated in two steps. First, N_w walkers are integrated according to the normalized dynamics specified by ω[c_{i−1} → c_i] for a trajectory of length nδt. Over this time, a bias is accumulated according to

W_i(t, nδt) = exp[ λ δt Σ_k j(c_k) ],

where the sum runs over the n configurations visited between t and t + nδt and, due to the multiplicative structure of the Markov chain, the bias is simply summed in the exponential.
After the trajectory integration, n_i(t) identical copies of the ith trajectory are generated in proportion to W_i(t, nδt),

n_i(t) = ⌊ N_w W_i(t, nδt) / Σ_{j=1}^{N_w} W_j(t, nδt) + ξ ⌋,

where ξ is a uniform random number between 0 and 1 and ⌊· · ·⌋ is the floor function. This process will generally result in a different number of walkers, and thus walkers in the new population are copied or deleted uniformly at random until N_w are left. With this algorithm, the large deviation function can be evaluated after each branching step as the change in the normalization, which is an exponential average over the bias factors of each walker,

ψ(λ) ≈ (1/nδt) ln [ (1/N_w) Σ_{i=1}^{N_w} W_i(t, nδt) ].

In the limit of a large number of walkers, this estimate is unbiased [56]. The local estimate can be improved by averaging over the observation time, which upon repeated cycles of integration and population dynamics yields a statistically converged estimate of ψ(λ). Alternatively, ψ(λ) can be computed from histogram reweighting using Eq. 13 from the distribution of Js generated from each walker. In what follows, all calculations are integrated with LAMMPS [57] and, where specified, combined with a diffusion Monte Carlo code called the Cloning Algorithm for Nonequilibrium Stationary States (CANSS) [41]. A detailed description of convergence criteria for this algorithm can be found in Reference [58].
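A minimal, self-contained sketch of the cloning procedure is given below for a toy problem where the answer is known exactly: an Ornstein-Uhlenbeck process biased on its time-averaged position, for which ψ(λ) = Dλ²/γ_x². The multinomial resampling used here is a simplified variant of the floor-plus-random-number branching rule described above, and all parameters (N_w, δt, n) are illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy dynamics: dx = -gamma_x x dt + sqrt(2D) dW, current j = x.
# Exact large deviation function: psi(lam) = D lam^2 / gamma_x^2.
gamma_x, D, lam = 1.0, 1.0, 0.5
dt, n_branch, n_cycles, n_walkers = 0.01, 10, 4000, 500

x = rng.normal(size=n_walkers)        # walker positions
log_psi_sum, burn_in = 0.0, 500
for cycle in range(n_cycles):
    w = np.zeros(n_walkers)           # accumulated log-bias per walker
    for _ in range(n_branch):
        x += -gamma_x * x * dt + np.sqrt(2 * D * dt) * rng.normal(size=n_walkers)
        w += lam * x * dt             # log W_i picks up lam * j(c_t) * dt
    W = np.exp(w)
    if cycle >= burn_in:
        # local estimate of psi from the change in normalization
        log_psi_sum += np.log(W.mean()) / (n_branch * dt)
    # branching: resample walkers in proportion to their weights
    x = x[rng.choice(n_walkers, size=n_walkers, p=W / W.sum())]

psi_est = log_psi_sum / (n_cycles - burn_in)
print(psi_est, D * lam**2 / gamma_x**2)
```

The estimate carries small systematic errors from the finite timestep and finite walker population, in line with the convergence discussion referenced above.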

II. RESULTS AND DISCUSSION
To illustrate the utility of our method, we have tested it on three model transport processes. In Table 1, we list all the transport coefficients considered in this section, along with their corresponding Green-Kubo relations and the dynamical variables whose large deviation function we compute. For all of the models studied, we generate trajectories by integrating a Langevin equation of motion. A Markovian, stochastic equation is needed for the calculation of the large deviation function using the method presented in the previous section. For the position of particle i, denoted r_i = {x_i, y_i, z_i}, this equation has the form

m_i r̈_i = −∇_{r_i} U(r^N) − m_i γ ṙ_i + R_i,

where the dots denote time derivatives, U(r^N) is the total intermolecular potential from all N particles at positions r^N, m_i is the particle's mass, γ is the friction coefficient, and R_i is a random force. The statistics of the random force are determined by the fluctuation-dissipation theorem, which for each component reads

⟨R_i(t) R_j(t')⟩ = 2 m_i γ k_B T δ(t − t') δ_ij,

where δ(t − t') is Dirac's delta function and δ_ij is the Kronecker delta. For all our calculations, we have chosen γ carefully so that the thermostat has little effect on the transport properties of the system and we are able to recover response coefficients consistent with calculations done using Newtonian trajectories, when possible.
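The sketch below integrates a one-dimensional analogue of this Langevin equation with the BAOAB splitting (an assumed choice for illustration; the text does not specify the integrator used by LAMMPS), with the noise amplitude fixed by the fluctuation-dissipation theorem, and checks equipartition in a harmonic well.

```python
import numpy as np

rng = np.random.default_rng(4)

# One particle in a harmonic well U(x) = k x^2 / 2, BAOAB Langevin scheme.
# The O-step is the exact Ornstein-Uhlenbeck velocity update whose noise
# amplitude follows from the fluctuation-dissipation theorem.
m, k, kT, gamma_f, dt = 1.0, 1.0, 1.0, 1.0, 0.05
c1 = np.exp(-gamma_f * dt)
c2 = np.sqrt((1.0 - c1 * c1) * kT / m)
force = lambda x: -k * x

n_steps = 200_000
noise = rng.normal(size=n_steps)
x, v = 0.0, 0.0
xs, vs = [], []
for step in range(n_steps):
    v += 0.5 * dt * force(x) / m       # B: half kick
    x += 0.5 * dt * v                  # A: half drift
    v = c1 * v + c2 * noise[step]      # O: exact friction + noise
    x += 0.5 * dt * v                  # A: half drift
    v += 0.5 * dt * force(x) / m       # B: half kick
    if step > 10_000:                  # discard equilibration
        xs.append(x)
        vs.append(v)

xs, vs = np.array(xs), np.array(vs)
# Equipartition: <v^2> = k_B T / m and <x^2> = k_B T / k
print(np.mean(vs**2), np.mean(xs**2))
```

With a small γ, as used below for the thermal conductivity, the same scheme perturbs the conservative dynamics only weakly while still sampling the canonical distribution.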

A. Validation of methodology: shear viscosity
To illustrate our methodology, we first consider the evaluation of the shear viscosity, η, which is typically easy to compute with traditional methods. The phenomenological law that defines the shear viscosity is Newton's law of viscosity, which relates the shear stress of a fluid to an imposed shear rate,

σ_xy = η (∂v_x/∂y),

where σ_xy is the xy-component of the stress tensor, and (∂v_x/∂y) is the gradient of the x component of the velocity in the y direction. The relevant molecular current for this process is the momentum flux, which is equivalent to σ_xy. The stress tensor is computable as

σ_xy = (1/V) Σ_{i=1}^{N} [ m_i v_xi v_yi + x_i f_yi ],

where V is the constant volume of the system, and v_ki and f_ki are the velocity and force exerted on particle i in the k direction, respectively. Given this identification of the current, its associated thermodynamic force is X = (V/k_B T)(∂v_x/∂y). From Eqs. 1 and 27 we can identify the relation between the shear viscosity and L as η = L(V/k_B T). We compute the shear viscosity for the TIP4P/2005 model of water [44], which has been reported previously using Green-Kubo theory [59]. Our simulation system consists of 216 water molecules at density ρ = 1 g/cm³ and temperature T = 298 K, integrated with the Langevin equation in Eq. 25 with γ = 1 ps⁻¹. The simulation is thus done in an ensemble of constant number of molecules N, volume V, and temperature T. We have verified that for γ = 1 ps⁻¹, the computed shear viscosity is the same as that from an ensemble with constant energy or an NVT ensemble using a Nosé-Hoover thermostat [60]. The molecules are held rigid with the SHAKE algorithm [61] and we employ a timestep of 1 fs. For all of the calculations, we first equilibrate the simulation for 20 ns.
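The per-atom virial form of σ_xy can be sanity-checked on a two-particle Lennard-Jones system: because the pair forces sum to zero, the per-atom sum Σ_i x_i f_yi reduces exactly to the pair form x_12 f_y,12. All coordinates, velocities, and the box volume below are arbitrary illustrative values.

```python
import numpy as np

# Two LJ particles (epsilon = sigma = 1, reduced units).
def lj_force(r_vec):
    # Force on the first particle of the pair; F = -dU/dr along r_vec,
    # with U(r) = 4 (r^-12 - r^-6).
    r2 = np.dot(r_vec, r_vec)
    inv6 = 1.0 / r2**3
    return 24.0 * (2.0 * inv6 * inv6 - inv6) / r2 * r_vec

r_a = np.array([0.3, 0.2, 0.0])
r_b = np.array([1.4, 0.9, 0.1])
f_a = lj_force(r_a - r_b)   # force on particle a from b
f_b = -f_a                  # Newton's third law

V, m = 10.0**3, 1.0
v_a = np.array([0.1, -0.2, 0.0])
v_b = np.array([-0.3, 0.4, 0.2])

# per-atom form: sigma_xy = (1/V) sum_i (m v_xi v_yi + x_i f_yi)
per_atom = (m * v_a[0] * v_a[1] + m * v_b[0] * v_b[1]
            + r_a[0] * f_a[1] + r_b[0] * f_b[1]) / V
# pair form: kinetic part + x_12 * f_y,12
pair = (m * v_a[0] * v_a[1] + m * v_b[0] * v_b[1]
        + (r_a[0] - r_b[0]) * f_a[1]) / V
print(per_atom, pair)
```

The equality is an algebraic identity for any pairwise force law, which is why either form may be accumulated during a simulation.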
First, we compute η using the Green-Kubo formula in Eq. 14. Note that other elements of the stress tensor can be averaged over to achieve better statistics. In both the Green-Kubo method and our newly proposed calculation, the statistical benefit would be identical, so for notational clarity we will consider only the xy component. We average the stress-stress time correlation function over 20 ns, and this function is shown in Figure 1a. The time correlation function is oscillatory due to the inertial recoil of the dense fluid, and has largely decayed within 1 ps, though there is a slow component to the decay from the approximate conservation of momentum for times shorter than the timescale of the Langevin thermostat. From Green-Kubo theory, the viscosity is the integral of this function. Shown in the main part of Figure 1a is η(t_M) as a function of the upper limit of the integral, as in Eq. 9, which has plateaued by t_M = 10 ps. Also shown are the associated statistical errors, which grow with t_M. The calculated shear viscosity from 5 independent simulations and a cutoff time of 20 ns is 0.876 ± 0.015 mPa·s. This value is in good agreement with that previously reported [59].
Alternatively, we can compute the shear viscosity from the large deviation function for Σ_xy, defined in Eq. 15. As the stress correlations decay quickly for this model, importance sampling is unnecessary, so we illustrate the basic principle by brute force reweighting. Specifically, we generate an estimate of p[Σ_xy] with t_N = 80 ps from a 20 ns long equilibrium trajectory. Then, we reweight the distribution to compute p_λ[Σ_xy] according to Eq. 13. Examples of the equilibrium and biased distributions are shown in the inset of Figure 1(b). The added bias shifts the distribution to a different mean, and the overlap between these two distributions determines the efficiency of our sampling. The large deviation function ψ(λ), shown in the main panel of Figure 1(b), is evaluated via Eq. 3 for different λ's. The parabolic form of ψ(λ) is in agreement with the Gaussian distribution of the fluctuations in Σ_xy in the linear response regime. Given that ψ(λ) is a parabola centered at the origin, it is straightforward to compute η by fitting the curve in Figure 1(b) to Eq. 10 over a range |λ| ≤ 1.5 × 10⁻⁴ atm⁻¹ ps⁻¹. From this, we obtain an estimate of the viscosity η = 0.882 ± 0.017 mPa·s, which is in agreement with our Green-Kubo result. Both errors reported are the standard error of the mean.
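The reweighting step can be sketched with synthetic Gaussian samples of a time-averaged current: for a Gaussian p(J), biasing by exp(λ t_N J) shifts the mean by λ t_N ⟨δJ²⟩ and gives ψ(λ) = λ² t_N ⟨δJ²⟩/2, both of which a brute-force estimate should reproduce. The numbers below are toy values, not the water simulation parameters.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic equilibrium samples of the time-averaged current J.
t_N = 80.0
sigma2 = 0.05                  # toy equilibrium variance of J (~ 2L/t_N)
J = rng.normal(0.0, np.sqrt(sigma2), size=200_000)

lam = 0.02
w = np.exp(lam * t_N * J)      # bias factor per sample, Eq. 13 flavor
J_biased = np.sum(J * w) / np.sum(w)   # mean of the reweighted distribution

# psi(lam) as a change in normalization, here lam^2 t_N sigma2 / 2.
psi = np.log(np.mean(w)) / t_N
print(J_biased, lam * t_N * sigma2, psi, 0.5 * lam**2 * t_N * sigma2)
```

As in the text, the method works only while the biased and unbiased histograms overlap; pushing λ further shifts the weight onto poorly sampled tails.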

B. Analysis of systematic error: interfacial friction coefficient
Having validated the basic methodology, we next focus on the systematic errors determining its convergence. As a case study, we consider computing the friction coefficient at a liquid-solid interface. This friction coefficient is defined by the linear relationship

f_x / A = −µ v_s,

where f_x is the total lateral force exerted on the solid wall in the x direction, A is the lateral area of the interface, and v_s is the tangential velocity of the fluid relative to the solid. As before, we can identify a relevant molecular current as the momentum flux along the wall, in this case proportional to the sum of the x components of the forces of all N_l liquid particles on the N_c wall particles,

f_x = −Σ_{i=1}^{N_l} Σ_{j=1}^{N_c} ∂u_ls(r_ij)/∂x_j,

where the force is given by the gradient of the liquid-solid interaction potential, u_ls. Given this current, we can identify its conjugate force as X = (A/k_B T)v_s, and consequently, the friction coefficient is given by µ = L(A/k_B T). The system is modeled as a fluid of monatomic particles confined between two stationary atomistic walls parallel to the xy plane. The fluid particles interact through a Lennard-Jones (LJ) potential with characteristic length scale d, energy scale ε, and time scale τ = (md²/ε)^{1/2}, with m the mass of a fluid particle; the potential is truncated at 2.5d. Reduced units will be used throughout this and the following section, and we set k_B = 1. The walls are separated by a distance H_z = 18.17d along the z axis. Periodic boundary conditions are imposed along the x and y directions, with the lateral dimensions of the simulation domain H_x = H_y = 15.90d. Each wall is constructed from 1568 atoms distributed as (111) planes of a face-centered-cubic lattice with density ρ_w = 2.73d⁻³, while the fluid density is ρ_f = 0.786d⁻³. The wall atoms do not interact with each other, but are allowed to oscillate about their equilibrium lattice sites under the harmonic potential u_h(r) = kr²/2, with a spring constant k = 600 ε/d².
The mass of the wall atoms is chosen to be m_c = 4m. The interaction between the wall and the fluid atoms is also modeled by a LJ potential with the same length scale d and truncation, but a slightly smaller energy scale, ε_wf = 0.9ε, to model the solvophobicity of the wall [62]. Previous studies have recognized that µ is difficult to compute due to the confinement of the corresponding hydrodynamic fluctuations [63,64], which results in a large systematic error. This difficulty has led some to question the reliability and applicability of Green-Kubo calculations, such as the one derived in [65] and shown in Eq. 16, for computing µ. Indeed, we have found that the details of the simulation, such as the ensemble, system geometry, and the γ used in the Langevin thermostat, all have an important influence on the calculation of µ. This sensitivity arises because the fluctuations that determine the friction are largely confined to two spatial dimensions, which is well known to result in correlations that have hydrodynamic long time tails, whose integral may be divergent [66]. However, both our large deviation function method and the Green-Kubo calculations are based on equilibrium fluctuations. Provided the same simulation geometry, equation of motion, and ensemble, the system samples the exact same trajectories, so we expect the friction coefficient computed in both ways to agree. Shown in the inset of Figure 2(a) is the Green-Kubo correlation function, which includes a very slow decay extending to at least 100τ, following short time oscillatory behavior from the layered density near the liquid-solid interface. The main panel of Figure 2(a) shows µ computed with increasing integration time, t_M. Averaging over 4 independent samples with a cutoff t_M = 1000τ, our estimate of the friction coefficient is µ = 0.109 ± 0.019 τ/d². The interfacial friction coefficient is also computed from the large deviation function, with t_N = 400τ, using the time integrated force, Eq. 17, as our dynamical observable. The large deviation function and the average time integrated force, ⟨F_x⟩_λ, are shown in the main panel and inset of Figure 2(b), respectively, demonstrating that within the range of λ we consider the system still responds linearly. With λ = 10⁻³ d/ετ and t_N = 4000τ, importance sampling gives an estimate of the friction coefficient of µ = 0.121 ± 0.002 τ/d², in reasonable agreement with the Green-Kubo estimate and with previous reports [63].
In both the Green-Kubo and the large deviation function calculations, the main source of systematic error is the finite time. This error is especially pronounced in this example, where the time correlation function decays very slowly. We consider the systematic errors in the estimate of µ by defining a relative error as

∆µ(t)/µ = |µ(t) − µ| / µ,

where µ(t) is the finite time value of the friction coefficient, and µ its asymptotic value as t → ∞. The form of the time dependent systematic error is different in the Green-Kubo method compared to the large deviation estimate. In the Green-Kubo method, systematic errors come from truncating the integral before the correlation function has decayed, and we denote this time t_M, the cutoff time in the integral of the correlation function. In the large deviation calculation, systematic errors come both from truncating the integral and from sub-time-extensive contributions to the exponential expectation value, which are more analogous to finite size effects in normal free energy calculations. These contributions are both determined by the path length t_N. The relative systematic error is shown in Figure 3 for both methods. For this case, the Green-Kubo method always has a smaller error than the large deviation function method, though their magnitudes are comparable.
In the Green-Kubo method, it follows that if we know the analytical form of the correlation function, we can determine the scaling of the relative error. In the case of interfacial friction, Barrat and Bocquet have proposed, using hydrodynamic arguments, that for a cylindrical geometry where the dimension in the confined direction is much smaller than the other two, the force autocorrelation function should decay asymptotically as ∼ 1/t² [65]. This is a direct consequence of the fact that the velocity autocorrelation function decays as ∼ 1/t in a 2-dimensional system [66], neglecting the self-consistent mode coupling correction that adds an imperceptible √ln t correction [67,68]. This is confirmed by our simulation result in Figure 3, where the integral of the force correlation function decays as ∼ 1/t (orange line, a fit to y = a/t; the red line is a fit to y = a ln(bt)/t).
Since the large deviation function has a Gaussian form, we can analyze the form of the finite time correction exactly as

ψ̃(λ, t_N) = λ² ∫_0^{t_N} dt (1 − t/t_N) ⟨j(c_0) j(c_t)⟩,

where ψ(λ) is the long time limit of the large deviation function, and ψ̃(λ, t_N) is its finite time estimate. This follows from a fluctuation correction about a saddle point integration. Physically, this correction arises from a t_N that is too short, such that ψ(λ) is not the dominant contribution to the tilted propagator, but rather includes temporal boundary terms from the overlap of the distribution of initial conditions and the steady state distribution generated under finite λ [56]. If we expand the integrand, we arrive at

ψ̃(λ, t_N) = λ² L(t_N) − (λ²/t_N) ∫_0^{t_N} dt t ⟨j(c_0) j(c_t)⟩,

which consists of the term included in the Green-Kubo expression, as well as an additional term modulated by a factor of 1/t_N. Given that the correlation function decays as ∼ 1/t², the first term on the right hand side converges as ∼ 1/t_N, as in the Green-Kubo method, while the second term scales as ∼ ln t_N/t_N. This form is shown in Figure 3 and agrees very well with our data. These additional terms explain why the magnitude of the systematic error is larger for the large deviation function. In cases where the Green-Kubo correlation function decays faster than 1/t², we expect that the dominant contribution to the error will come from the last term in Eq. 31.
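These scalings can be verified with a model correlation function C(t) = 1/(1+t)², for which L = 1 exactly: the truncated Green-Kubo integral converges as 1/t, while the triangular window adds a deficit growing as ln t/t.

```python
import numpy as np

def trap(y, x):
    # simple trapezoidal rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

ts = [10.0, 100.0, 1000.0]
err_gk, err_ld = [], []
for t in ts:
    s = np.linspace(0.0, t, 200_001)
    C = 1.0 / (1.0 + s) ** 2                      # model correlation, L = 1
    err_gk.append(1.0 - trap(C, s))               # Green-Kubo truncation error
    err_ld.append(1.0 - trap((1.0 - s / t) * C, s))  # LDF finite-t_N error
print(err_gk)
print(err_ld)
```

The difference between the two errors is the 1/t_N-modulated term of the expansion above; dividing it by ln(t)/t shows it approaching a constant as t grows.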
C. Analysis of statistical error: thermal conductivity

We finally discuss the statistical error of our method by studying the thermal conductivity, κ, of a solid system with particles that interact via the Weeks-Chandler-Andersen potential [45]. The thermal conductivity is defined through Fourier's law,

e = −κ ∇T,

where e is the energy current per unit area, and ∇T is the temperature gradient. From the expression for entropy production, the thermodynamic force is given by X = −(1/k_B T²)∇T, and so the thermal conductivity is κ = L/(V k_B T²). As the relevant molecular current, we study the fluctuations of the heat flux q given by

q = (1/V) [ Σ_i e_i v_i + (1/2) Σ_i Σ_{k≠i} (f_ik · v_i) r_ik ],

where e_i is the per-particle energy, v_i is the particle velocity, f_ik is the force on atom i due to its neighbor k from the pair potential, and r_ik is the coordinate vector between the two particles. We use a system size of 10³ unit cells, with lattice spacing 1.49d. A Langevin thermostat with γ = 0.01τ⁻¹ maintains the system at the state point T = 1.0 ε/k_B, ρ = 1.2d⁻³, which yields identical results for κ as an NVE calculation. We focus on the diagonal component, κ_xx, of the thermal conductivity tensor. Within Green-Kubo theory, the thermal conductivity can be computed by integrating the autocorrelation function of the x component of the heat flux, q_x, as in Eq. 18. The inset of Figure 4(a) shows the decay of this autocorrelation function, which comprises a fast decay from the high-frequency vibrational modes, followed by a slower decay that contributes most to the thermal conductivity and arises from the low frequency acoustic modes [69]. To compute κ from the integral, as shown in the main part of Figure 4(a), the upper time limit is chosen as t_M = 1500τ, though the relaxation of the correlation function extends only to around 5τ. To compute κ from the large deviation function, we study fluctuations in the time averaged heat flux, Q_x, defined in Eq. 19. The transport coefficient, κ, is again calculated using Eq.
10 by assuming the large deviation function ψ(λ) to be a parabola, which is justified in Figure 4. The thermal conductivity from the Green-Kubo method using a long trajectory of 1.5×10⁶τ is κ = 34.3 ± 2.2 1/τd, while the estimate from the large deviation function using N_w = 1000 walkers and λ = 10⁻⁴ is κ = 34.01 ± 0.78 1/τd.
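For intuition about the Green-Kubo route, the following sketch integrates the autocorrelation of an Ornstein-Uhlenbeck surrogate current, for which the integral is known analytically. This is only a toy stand-in for the heat flux q_x: the prefactors of V, k_B, and T² in Eq. 18 are omitted, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ornstein-Uhlenbeck surrogate for one component of the heat flux,
# with autocorrelation C(t) = var * exp(-t/tau_c); the Green-Kubo
# integral of C(t) is then var * tau_c.
dt, tau_c, var, n = 0.01, 0.5, 1.0, 400_000
a = np.exp(-dt / tau_c)
s = np.sqrt(var * (1.0 - a * a))
q = np.empty(n)
q[0] = 0.0
for i in range(1, n):
    q[i] = a * q[i - 1] + s * rng.standard_normal()

def green_kubo(q, dt, t_max):
    """Trapezoidal integral of the autocorrelation of q up to t_max."""
    m = int(t_max / dt)
    q0 = q - q.mean()
    c = np.array([q0[:len(q0) - k] @ q0[k:] / (len(q0) - k)
                  for k in range(m)])
    return dt * (c.sum() - 0.5 * (c[0] + c[-1]))

# Integrate well past the correlation time, as with t_M >> 5 tau above.
est = green_kubo(q, dt, t_max=5.0 * tau_c)  # analytic value: var * tau_c = 0.5
```

The slow statistical convergence of this integral with total trajectory length is exactly the difficulty the large deviation approach is designed to mitigate.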
While the average values of κ agree between the two methods, the statistical convergence differs significantly. To make a fair comparison, we set the total observation time of the trajectories equal to the upper limit of the Green-Kubo integral, i.e. t_N = t_M = 1500τ, which is much longer than the characteristic decay of the current autocorrelation function. To compensate for the computational overhead of propagating N_w trajectories in parallel in the cloning algorithm, the total averaging time of the Green-Kubo method is chosen as t_tot = t_M × N_a, with N_a equal to the walker number, N_w, so that the two methods require approximately the same computational effort. Both N_a and N_w will be denoted N_s, reflecting the number of independent samples of each fluctuating quantity. We measure the statistical error by the relative error, which is plotted in Figure 5 for both methods. As usual, the statistical error depends on both the relative size of the observable fluctuations and the number of independent samples. We find that, while the standard deviations of both methods scale as 1/√N_s as expected, our importance sampling clearly suppresses the statistical error compared to the Green-Kubo method at similar computational effort, decreasing the magnitude of the error by an order of magnitude at fixed N_s. Even though the bias must be chosen small enough to guarantee a linear response, we find that a larger bias yields more statistically reliable results.
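The resampling step at the heart of the cloning (diffusion Monte Carlo) scheme can be sketched as follows. This is a minimal toy version in which each walker is a single number and the bias weight is e^{λx}; it is not the paper's trajectory-level implementation, only an illustration of how a population is reweighted at fixed size.

```python
import numpy as np

rng = np.random.default_rng(2)

def cloning_step(walkers, weights, rng):
    """One population-control step: resample the walkers in proportion
    to their weights, keeping the population size N_w fixed."""
    p = weights / weights.sum()
    idx = rng.choice(len(walkers), size=len(walkers), p=p)
    return walkers[idx]

# Toy illustration: walkers are scalars x ~ N(0, 1) and the bias weight
# is e^{lam * x}; resampling tilts the population toward the biased
# distribution, whose mean is exactly lam for a standard normal.
n_w, lam = 10_000, 1.5
x = rng.standard_normal(n_w)
x = cloning_step(x, np.exp(lam * x), rng)
```

In the full algorithm the weights come from the time-integrated current of each trajectory segment, and the logarithm of the mean weight accumulates into the estimate of ψ(λ).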
Jones and Mandadapu have performed a rigorous error analysis on estimates of Green-Kubo transport coefficients under the assumption that the current fluctuations follow a Gaussian process [6]. They found that the variance of κ is a monotonically increasing function of t_M, and arrived at an upper bound for the relative error that depends only on the number of trajectory segments of length t_M. As a consequence, the statistics become worse when the system has longer correlation times, and there is no way to control the intrinsic variance of the observable. In the large deviation method, by contrast, the relative error in the large deviation function depends not only on the number of samples, in this case N_w, but also on λ and L. In general, as λ increases, the walkers become more correlated. However, within the regime of linear response, or to first order in λ, the number of uncorrelated walkers should be N_w. Because the large deviation function, ψ(λ), scales as λ² while its second derivative, ψ''(λ), has no dependence on λ, the relative size of the fluctuations can be tuned by moving λ away from 0. This is verified in Figure 5, where increased λ generates an order of magnitude reduction in the statistical error relative to the Green-Kubo calculation. This decrease in the statistical error is also confirmed for a series of λ values. The tunability afforded by the large deviation function calculation is the same advantage afforded by direct simulation of transport processes, where the relative size of fluctuations is set by the average current produced by driving the system away from equilibrium. Instead of evaluating κ from the large deviation function directly, we could have derived it from the change in the average current produced at a given λ. In that case, however, the relative error would scale only as |λ| rather than λ².
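The λ-tunability of the relative error can be illustrated with a minimal numerical experiment on Gaussian surrogate fluctuations: the signal ψ(λ) grows as λ² while the noise in a naive exponential-average estimate grows only as λ, so the relative error falls roughly as 1/λ. All parameters here are illustrative, and the toy uses no importance sampling, so it is only meaningful while λ²σ²t_N remains small (which is where the cloning algorithm becomes necessary).

```python
import numpy as np

rng = np.random.default_rng(4)

def rel_err_of_psi(lam, t, sigma2, n_samples, n_trials, rng):
    """Empirical relative error of a finite-sample estimate of
    psi(lam) = 0.5 * sigma2 * lam**2, for Gaussian fluctuations of the
    time-averaged current Q with Var(Q) = sigma2 / t."""
    psi_true = 0.5 * sigma2 * lam * lam
    ests = np.empty(n_trials)
    for j in range(n_trials):
        Q = rng.normal(0.0, np.sqrt(sigma2 / t), size=n_samples)
        ests[j] = np.log(np.mean(np.exp(lam * t * Q))) / t
    return ests.std() / psi_true

t, sigma2 = 100.0, 1.0
e_small = rel_err_of_psi(0.01, t, sigma2, 2000, 200, rng)
e_large = rel_err_of_psi(0.04, t, sigma2, 2000, 200, rng)
# Signal grows as lam**2 while noise grows only as lam, so the relative
# error shrinks roughly as 1/lam, here by about a factor of 4.
```

This is the same mechanism as in a driven simulation: a larger bias produces a larger mean signal against roughly fixed fluctuations.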

III. CONCLUSIONS
In this paper, we have explored the possibility of calculating transport coefficients from a large deviation function, or path ensemble free energy. The robustness of our method is tested on a variety of model systems ranging in composition and complexity of molecular interactions. Our method is general, and we expect the addition of importance sampling to be beneficial in instances where statistical errors are dominant. More precisely, our analysis shows that the systematic errors of the Green-Kubo calculation and the large deviation calculation are asymptotically the same if the time correlation function decays faster than 1/t². If the correlation function decays more slowly, then there will be a larger systematic error in the large deviation function calculation that must be converged at large t_N. In such cases, the form of this error follows from Eq. 31 and scales as (ln t_N)/t_N. Such slow decay is expected for low-dimensional systems where the current includes hydrodynamic modes. Our analysis of the relative statistical errors of the Green-Kubo and large deviation function calculations shows that our method generically requires fewer statistically uncorrelated samples for comparable statistical accuracy. This is a consequence of the importance sampling employed. The magnitude of this statistical efficiency, defined as the ratio of the numbers of independent samples needed for a given error (N_w/N_a), increases linearly with the size of the transport coefficient, L, and increases rapidly with increasing bias, as λ⁴.
While we have considered only linear response coefficients, our method can be easily extended to the nonlinear regime or to off-diagonal entries in the Onsager matrix, where Green-Kubo formulas are even more cumbersome to evaluate and few direct methods exist or can be formulated. These extensions are possible since the diffusion Monte Carlo algorithm is capable of sampling rare fluctuations in the non-Gaussian tails of the distribution. Moreover, it is also possible to probe the response around nonequilibrium steady states, as the method presented here does not rely on an underlying Boltzmann distribution.