InfPolyn, a Nonparametric Bayesian Characterization for Composition-Dependent Interdiffusion Coefficients

Composition-dependent interdiffusion coefficients are key parameters in many physical processes. However, finding such coefficients for a system with more than two components is challenging due to the underdetermination of the governing diffusion equations, the scarcity of data in practice, and the unknown parametric form of the interdiffusion coefficients. In this work, we propose InfPolyn (Infinite Polynomial), a novel statistical framework to characterize composition-dependent interdiffusion coefficients. Our model generalizes the commonly used polynomial fitting method with extended model capacity and flexibility, and it is combined with the numerical-inversion-based Boltzmann–Matano method for interdiffusion coefficient estimation. We assess InfPolyn on ternary and quaternary systems with predefined polynomial, exponential, and sinusoidal interdiffusion coefficients. The experiments show that InfPolyn outperforms its competitors, the state-of-the-art (SOTA) numerical-inversion-based Boltzmann–Matano methods, by a large margin in terms of relative error (10× more accurate). Its performance is also consistent and stable, while the number of samples required remains small.


Introduction
In many industrial processes that involve diffusion, e.g., alloy solidification, heat treatment, coating, and electronic packaging, the characterization of composition-dependent interdiffusion coefficients is a crucial task, as it quantifies a diffusion process precisely. The classic approach is based on the Boltzmann–Matano analysis [1,2], which transforms the diffusion system into a linear system of equations. However, the Boltzmann–Matano analysis is only applicable to a binary system and becomes problematic for a system with three or more components, as it generates an underdetermined system of equations that mathematically does not yield a unique solution. To address this challenge, a number of methods have been developed over the years. Kirkaldy et al. [3] introduced the Kirkaldy–Matano method, which supplies extra equations to the linear system by adding M additional diffusion paths with intersection points. Although the results have been shown to be accurate, this method does not generalize well to a multi-component system because the difficulty of experimentally generating intersection points grows drastically with the number of components M + 1 [4]. Alternatively, methods based on a single diffusion couple were proposed. Dayananda and Sohn [5] suggested integrating over certain composition ranges along the diffusion path to evaluate an average interdiffusion coefficient. Cermak and Rothova [6] later extended this method by choosing an infinitesimally small integration interval. Nevertheless, as pointed out by Cheng et al. [7], the integration approach can lead to ill-conditioned problems. A pseudobinary approach was introduced by considering only two components diffusing into the diffusion zone. This method takes advantage of the time independence of its first-order linear equations and is thus very efficient when the pseudobinary condition is strictly satisfied in experiments.
In practice, such experimental conditions may be difficult to meet; in addition, for a multi-component system with a limited number of experimental samples, the linear equations are not capable of eliminating the extra solutions [8,9]. Separately, Zhang and Zhao [10] suggested a forward-simulation approach that iteratively optimizes the interdiffusion coefficients with repeated forward simulations, similar to the classic inference approach for inverse problems. Although this method is shown to be accurate and stable, it incurs an overwhelming computational cost because each iteration requires a complete diffusion simulation on a fine spatiotemporal grid.
Another branch of the one-diffusion-couple methods assumes a polynomial functional form for the interdiffusivities. Ideally, with a proper design of the polynomial function, one can compute the coefficients of the polynomial functions to estimate the interdiffusion using a numerical inverse method [11]. This numerical inverse approach was adopted by Chen et al. [4], who included the atomic mobility [12] to study diffusion in the solution phase of a multicomponent system. To improve the efficiency of the numerical inverse method, Cheng et al. [13] recast the original parabolic inverse problem [11] as a linear multi-objective optimization, improving computational efficiency while maintaining similar accuracy. The optimization algorithm places weak requirements on the experimental samples and has been applied to the interdiffusivities of solid solutions as well as various alloy systems [14,15]. This approach was recently improved by Qin et al. [16], who suggested solving the underdetermined linear system using compressed sensing, a popular regularization technique, to increase stability with high-order polynomial functions. However, the L1 penalty imposed by compressed sensing may introduce inappropriate prior assumptions, leading to inferior overall performance. We examine this issue in detail in the experiment section.
Despite the notable performance and popularity of the polynomial functional interdiffusion coefficient approaches, they share a fatal issue: how does one design the polynomial functions? Considering a quaternary system (M = 3), we have 3 × 3 polynomial functions requiring careful design; modifying one function in a row affects the results of the other two. The challenge grows quadratically with M. Without proper design and repeated validation, the polynomial approach leads to overfitting or underfitting, making it infeasible in practice.
One way to resolve this challenge is to use a sufficiently flexible model with many polynomial terms and utilize classic Bayesian inference techniques [17] to estimate the posterior of the polynomial coefficients. In particular, Girolami [18] proposed a Markov chain Monte Carlo (MCMC) method for nonlinear and complex differential equations where fully analytic expressions for the posterior distribution do not exist, which is similar to our problem. Despite its elegance and great accuracy, an MCMC approach often suffers from slow convergence and poor mixing, making it less practical for complex applications. To improve inference efficiency, approximate Bayesian computation (ABC) and its variations, e.g., MCMC ABC and sequential Monte Carlo ABC (SMC ABC), were put forth by Alahmadi et al. [19]. However, despite being accurate and easy to implement, these sampling methods do not scale well with the number of parameters to be inferred. With unknown polynomials, the large number of parameters makes such methods impractical even with the latest accelerated variations [20,21].
Recently, the Gaussian process (GP) [22] has been utilized to deal with data generated from systems of differential equations. As a black-box regression model, the GP has been proposed for fast posterior estimation of parameters using the derivative information of the differential equations, even with partially observed data [23]. The explicit derivative information is further utilized to improve a general GP's performance on data generated from differential equations [24]. The derivatives in a given system of differential equations can also be harnessed through a constraint manifold such that the derivatives of the Gaussian process must match an ordinary differential equation (ODE) [25]. Despite their success, these works generally require explicitly known differential equations to work. Thus, they cannot be directly applied to our problem.
A closely related work is [26], where a GP is used as a generalization of a parametric function for binary images. However, their work cannot be directly applied to our problem because our system of equations leads to a mixture of GPs that are augmented by the derivatives of the concentrations, whereas most of the previous works estimate only one GP [23][24][25][26][27].
To address the challenge of stable characterization of the interdiffusion coefficients, we introduce InfPolyn (Infinite Polynomial), a nonparametric Bayesian framework for the characterization of composition-dependent interdiffusion coefficients.
In particular, we first extend the general polynomial fitting method to an infinite number of polynomial terms. We then integrate out the polynomial coefficients under a Gaussian prior to derive a nonparametric functional form for the interdiffusion coefficients.
To further improve our model with prior assumptions about an interdiffusion system, we introduce a diagonal-dominant prior for the interdiffusion coefficient functions. Unlike in most Bayesian fitting problems, the interdiffusion coefficients are not known/observable to us. Thus, we introduce latent variables, the virtual ghost interdiffusion coefficients, to address this issue. Finally, we derive a tractable joint likelihood function for model training. We compare InfPolyn with the state-of-the-art Matano-based numerical inversion methods and their variations. In ternary and quaternary systems with polynomial, exponential, and sinusoidal interdiffusion coefficients, InfPolyn shows a significant improvement over the competitors in terms of relative error. In most of the experiments, our model performs excellently with only 40 EPMA measurements, which is very desirable for practical interdiffusion coefficient estimation.
Essentially, InfPolyn is a functional estimation method tailored to the characterization of interdiffusion coefficients, combining a mixture of SOTA nonparametric models, GPs, with particular prior knowledge. Unlike the classic Bayesian inference approaches [18,19], InfPolyn does not require a time-consuming sampling process and is thus much more efficient. The highlights of this work for interdiffusion coefficient characterization are as follows:
• InfPolyn does not require assumptions about the particular functional form of the interdiffusion coefficients; it is robust against overfitting and underfitting.
• InfPolyn does not require a large amount of training data.
• Prior knowledge of the interdiffusion system can be added easily within the InfPolyn framework.
We hope the success of this nonparametric Bayesian framework can inspire further applications in other interdiffusion coefficient estimation methods, e.g., the forward-simulation approach [10], in the materials community. Thus, we publish our code and will maintain it as an open-source toolbox on GitHub (https://github.com/wayXing/InfPolyn, accessed on 26 June 2021).
The rest of this paper is organized as follows. The interdiffusion coefficient estimation problem is introduced in Section 2, followed by a brief summary of the Boltzmann–Matano numerical inverse method with polynomial functions in Section 3. Our method is presented in Section 4, including the derivation, prior knowledge assumptions, and model training. Comparisons with the other SOTA methods on ternary and quaternary systems are presented in Section 5. Finally, Section 6 summarizes our work.

Statement of the Problem
We first formulate our problem mathematically as the foundation of this work. Consider a general one-dimensional diffusion system with (M + 1) components. According to Fick's second law [28], the diffusion process is fully characterized by

∂c_i/∂t = ∇ · ( Σ_{j=1}^{M} D_ij ∇c_j ),   i = 1, …, M,   (1)

where ∇ is the partial derivative operator, c_i is the concentration of component i (note that the concentration is a function of space and time, c_i(t, x)), and D_ij is the interdiffusion coefficient w.r.t. the concentration gradient of component j. In many textbook examples, D_ij is assumed constant, but in practice, D_ij depends on the concentrations of all components c = (c_1, …, c_M)^T. Our goal is to find D_ij(c) for all i, j = 1, …, M from, ideally, a concentration profile C = (c(t_e, x_1)^T, …, c(t_e, x_N)^T)^T ∈ R^{N×M} at some terminal time t_e and spatial locations {x_n}_{n=1}^{N}, where N is the number of sampling points at different locations. To avoid clutter, we denote c_n = c(t_e, x_n). One may notice that an important factor, temperature, is not considered in this formulation. This is due to the general procedure of the experiment. To conduct the experiment and obtain the concentration profile, one first bonds two blocks of materials together and holds them at a certain temperature to activate interdiffusion at the initial interface. The annealing procedure may last from hours to days, depending on how quickly an interdiffusion zone wide enough for analysis forms. The temperature remains constant during the long-lasting annealing process except for the beginning and ending stages, which take a short time. Thus, the temperature is considered constant for the interdiffusion coefficient characterization. To fabricate just one diffusion couple, around 50-100 sample points are typically selected along a line parallel to the direction of element diffusion within the interdiffusion zone.
Each sample point is analyzed through electron probe micro-analysis (EPMA), which requires several minutes of equipment time to detect the concentrations. As a result, the experiment is time-consuming, and only a small number of samples, i.e., small N, can be provided.
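To make the data setting concrete, the sketch below simulates a binary (M = 1) diffusion couple with a constant coefficient and subsamples N = 40 points from the terminal profile, mimicking sparse EPMA measurements. This is a toy illustration only, not the paper's solver; all names (`simulate_profile`, the grid sizes, and the constant `D`) are our assumptions.

```python
import numpy as np

def simulate_profile(D=1e-3, nx=401, nt=2000, dt=2e-4):
    """Explicit finite-difference simulation of 1D Fick's second law
    with a constant diffusion coefficient (binary system, M = 1)."""
    x = np.linspace(0.0, 1.0, nx)
    dx = x[1] - x[0]
    c = np.where(x < 0.5, 1.0, 0.0)        # diffusion-couple step initial condition
    assert D * dt / dx**2 < 0.5            # explicit-scheme stability check
    for _ in range(nt):
        lap = np.zeros_like(c)
        lap[1:-1] = (c[2:] - 2 * c[1:-1] + c[:-2]) / dx**2  # central difference
        c = c + dt * D * lap               # forward-Euler time step
    return x, c

x, c = simulate_profile()
idx = np.linspace(0, len(x) - 1, 40).astype(int)  # 40 equally spaced "EPMA" samples
C = c[idx]                                        # terminal concentration profile
```

In practice the interdiffusion zone is narrow relative to the sample, which is why the paper later restricts sampling to the central region of the domain.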

Boltzmann-Matano Polynomial Interdiffusion Coefficients
We follow the original work of the Boltzmann–Matano method [2], which is widely used to extract composition-dependent interdiffusion coefficients {D_ij} from experimental concentration profiles. The Boltzmann–Matano method first integrates Fick's law of diffusion (1) in time to obtain the following system,

(1/(2t)) ∫ (x − x_0) dc_i = − Σ_{j=1}^{M} D_ij ∇c_j,   (2)

where c_i denotes the terminal concentration of component i, ∇c_j is the concentration gradient, and x_0 is the known Matano plane, defined by

∫_{c_i^-}^{c_i^+} (x − x_0) dc_i = 0.   (3)

For a binary system, i.e., M + 1 = 2, there is only one composition-dependent interdiffusion coefficient D_11 to determine with one diffusion couple. Based on Equation (2), we can directly compute D_11(c_n) for n = 1, …, N and then use any curve-fitting method to characterize the function D_11(c). For a ternary system, i.e., M = 2, we need to determine D_ij(c) for i ∈ {1, 2} and j ∈ {1, 2}. For each sample c_n, we can write only two equations, whereas there are four unknown parameters. This is an underdetermined system of equations and will lead to multiple solutions. An effective and efficient remedy is to assume a continuous function of interdiffusivity in a polynomial form, e.g., an independent quadratic form,

D_ij(c) = Σ_k w_ij^k φ_k(c),   (4)

where the w_ij^k are the weight coefficients of the polynomial function and φ_k are the polynomial basis functions. Denote the flux of component i by u_i(x_n) = (1/(2t)) ∫ (x − x_0) dc_i, evaluated at x_n. The estimates of D_ij for j = 1, …, M can then be computed by solving the system of equations

u_i(x_n) = − Σ_{k=1}^{M} D_ik(c_n) ∇c_k(x_n),   n = 1, …, N,   (5)

where D_ik(c_n) is the polynomial function fully determined by its weight coefficients given a particular functional form and c_n. All weight coefficients W = {w_ij^k} of the polynomial functions can be computed by solving the optimization problem

min_W Σ_{n=1}^{N} ‖ u_n + Σ_k D_ik(c_n) ∇c_k(x_n) ‖_2^2,   (6)

where ‖·‖_2 denotes the L2 norm, which can be replaced with other norms.
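The binary branch of this analysis can be sanity-checked numerically: for a constant D, the terminal profile is the known erfc solution, and the Boltzmann–Matano integral should recover that constant everywhere. A minimal sketch (the function names and grids are ours, not from the paper):

```python
import numpy as np
from math import erfc

def trapz(y, s):
    """Trapezoidal integral of y over (possibly decreasing) coordinates s."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s)))

def matano_plane(x, c):
    """Matano plane x0, defined by  integral of (x - x0) dc = 0  over the profile."""
    return trapz(x, c) / (c[-1] - c[0])

def boltzmann_matano(x, c, t):
    """Binary Boltzmann-Matano: D(c(x)) from one profile at annealing time t."""
    x0 = matano_plane(x, c)
    dcdx = np.gradient(c, x)
    g = (x - x0) * dcdx                               # integrand (x - x0) dc/dx
    seg = 0.5 * (g[1:] + g[:-1]) * np.diff(x)
    F = np.append(np.cumsum(seg[::-1])[::-1], 0.0)    # F[k] = int_{x_k}^{x_end} (x - x0) dc
    return F / (2.0 * t * dcdx)

# analytic check: for constant D the profile is c = 0.5 * erfc(x / (2 sqrt(Dt)))
D_true, t = 0.01, 1.0
x = np.linspace(-1.0, 1.0, 2001)
c = np.array([0.5 * erfc(xi / (2.0 * (D_true * t) ** 0.5)) for xi in x])
D_est = boltzmann_matano(x, c, t)
mid = np.abs(x) < 0.15            # avoid the flat tails where dc/dx -> 0
```

Note that the estimate degrades near the profile tails, where the gradient vanishes; this is the same boundary issue that motivates restricting EPMA samples to the central region in the experiments.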

Remark 1.
Since the estimation of D_ij(c) for each i = 1, …, M depends only on u_i and is computed independently, we omit the index i and reformulate the Boltzmann–Matano method with polynomial interdiffusion coefficients to avoid clutter:

min Σ_{n=1}^{N} ( u_n + Σ_{j=1}^{M} d_j(c_n) ∇c_j^n )²,   (7)

where u_n is the flux for an arbitrary component and ∇c_j^n is the concentration gradient of component j, both of which are computed from the profile C; d_j(c_n) is the j-th column of the arbitrary row of D_ij(c) that matches the chosen flux at concentration c_n; d_n = (d_1(c_n), …, d_M(c_n))^T is their collection. We aim to recover d_j(c) for j = 1, …, M.

Optimization for Polynomial Fitting
Equation (7) is a convex optimization problem provided that we have N ≥ 3(K + 1)M EPMA samples when using a K-order polynomial of the form of Equation (4) for all d_j(c); the closed-form solution is presented in Appendix A. This requirement is certainly impractical for large M and/or K. In that case, regularization techniques, e.g., L2-norm minimization or compressed sensing, can be implemented to solve such an underdetermined system. The polynomial fitting approach with regularization is efficient in terms of computational time, space complexity, and implementation simplicity, thanks to many excellent software solutions, e.g., l1-magic, SPGL1, and SeDuMi [29][30][31].

InfPolyn for Interdiffusion Coefficients
The challenge of the discussed polynomial-based approach is the lack of guidelines on how to build the model, i.e., how to select the order and form of the polynomial. It is unclear how many polynomial terms are needed for each diffusion coefficient so that the model neither overfits nor underfits. Although regularization techniques [16] can be implemented, the underlying assumptions of the regularization are unclear, which can lead to unexpected performance. We need a systematic way to specify the diffusion coefficients with correct prior knowledge in order to achieve better results.
To this end, we propose a nonparametric Bayesian approach that is flexible enough to capture the complex nonlinear relations while guarding against overfitting the data by integrating over all possible solutions.

Infinite Order Polynomial Model
To start, we write the polynomial regression, e.g., Equation (4), in a compact form,

d_j(c) = w_j^T φ_j(c),   (8)

where φ_j(·) is the predefined feature mapping that encodes the polynomial functional form. Essentially, we project the concentration c onto an r-dimensional feature space using an arbitrary mapping φ_j(c) ∈ R^r. Note that the constant term can be absorbed into the feature mapping by setting its first element to 1. In the linear-model case, the feature mapping is simply φ_j(c) = (1, c^T)^T. Obviously, this polynomial approach is only accurate and stable when we roughly know the functional form of φ_j(c). Furthermore, it requires a large number of parameters {w_j, β_j} to be estimated. Rather than estimating the weight parameters, we place a Gaussian prior on the weight vector w_j,

w_j ∼ N(0, Ω_j),   (9)

where Ω_j ∈ R^{r×r} indicates the correlation between the weight components. We then integrate out the weights and work directly with the marginal, which admits a closed-form solution,

d_j(c) ∼ GP(0, φ_j(c)^T Ω_j φ_j(c')).   (10)

This is also known as a Gaussian process (GP) [22]. If we use a countably infinite feature space, i.e., r → ∞, we formally define a sum over infinitely many polynomial terms. Thus, we call our model InfPolyn, infinite polynomial. Our model is now a nonparametric model that contains no explicit parameters w_j. The model parameters are instead encoded in φ_j(c)^T Ω_j φ_j(c'), which is an inner product in the feature space spanned by φ_j(c).
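The weight-integration step can be verified numerically on a finite feature map: with polynomial features scaled by binomial coefficients and a unit-variance Gaussian prior on the weights (i.e., Ω = I), the induced covariance φ(c)^T φ(c') collapses to a closed-form polynomial kernel by the binomial theorem. A small sketch with a scalar input (our notation, illustrative only):

```python
import numpy as np
from math import comb

K = 5  # polynomial order

def phi(c):
    """Finite polynomial feature map, scaled so that Omega = I suffices:
    phi_k(c) = sqrt(C(K, k)) * c^k for k = 0..K."""
    return np.array([comb(K, k) ** 0.5 * c**k for k in range(K + 1)])

def poly_kernel(c, cp):
    """Closed form of phi(c)^T phi(cp):
    sum_k C(K,k) (c*cp)^k = (1 + c*cp)^K by the binomial theorem."""
    return (1.0 + c * cp) ** K
```

The kernel on the right involves no explicit weights, which is exactly the point of working with the marginal (10): the prior over weights becomes a prior over functions.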

Kernel Formulation
Note that Ω_j is p.s.d. by definition, so we can encode the inner product using a kernel function. This is known as the kernel trick, which replaces the explicit feature mapping and covariance with a kernel function k_j(c, c') that represents an inner product in the feature space. Different kernels capture different functional features. For instance, a periodic kernel can capture periodic functions such as sinusoids. If we do not know the explicit form of the kernel function, which is true in most cases, the automatic relevance determination (ARD) kernel,

k_j(c, c') = θ_j0 exp( −(1/2) (c − c')^T (θ̃_j ∘ I) (c − c') ),   (11)

is commonly adopted, as it generally provides good performance, especially in regression problems [22]. Here, ∘ denotes the Hadamard product, I is an identity matrix, θ_j0 is the scaling factor of the kernel function, and θ̃_j ∈ R^{M×1} is a vector of scaling factors for the input components, i.e., the concentrations of the different elements. We denote θ_j = (θ_j0, θ̃_j^T)^T for clarity. These parameters θ_j are known as hyperparameters because they control the random process (10) statistically rather than deterministically (unlike the aforementioned polynomial fitting). In this work, we use the ARD kernel throughout unless stated otherwise.
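A direct implementation of an ARD squared-exponential kernel of this form might look as follows (a sketch; the argument names and test values are ours):

```python
import numpy as np

def ard_kernel(C1, C2, theta0, theta):
    """ARD kernel with one inverse squared length-scale per concentration
    component: theta0 * exp(-0.5 * sum_m theta_m * (c_m - c'_m)^2)."""
    diff = C1[:, None, :] - C2[None, :, :]           # shape (n1, n2, M)
    sq = np.einsum('ijm,m->ij', diff**2, theta)      # weighted squared distance
    return theta0 * np.exp(-0.5 * sq)

rng = np.random.default_rng(1)
C = rng.random((30, 2))                              # 30 concentration points, M = 2
K = ard_kernel(C, C, theta0=1.0, theta=np.array([5.0, 0.5]))
```

A large θ_m makes the function vary quickly along component m; a small θ_m effectively switches that component off, which is where the name "automatic relevance determination" comes from.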

Ghost Interdiffusion Coefficients
Given that d_j(c) is a Gaussian process as stated in Equation (10), any number of observations forms a joint Gaussian distribution, from which a closed-form likelihood can be easily calculated. Unfortunately, unlike in classic regression problems, we do not have any direct observations of d_j(c), and we cannot directly obtain the optimized hyperparameters {θ_j, β_j}. To resolve this problem, we borrow the pseudo/inducing-point idea [32] and introduce a set of virtual ghost interdiffusion coefficients, {h_jg = d_j(z_jg)}_{g=1}^{G}, sampled from the function d_j(c) at virtual concentrations {z_jg}_{g=1}^{G}. These latent variables form a joint Gaussian distribution (because they are sampled from the Gaussian process (10)),

p(h_j | Z_j) = N(h_j | μ_j 1, K_j),   (12)

where h_j = (h_j1, …, h_jG)^T is the collection of ghost interdiffusion coefficients and [K_j]_{gg'} = k_j(z_jg, z_jg') is the covariance matrix computed through the kernel function at the latent locations Z_j = {z_jg}_{g=1}^{G}. Normally, h_j and Z_j are latent variables that need to be integrated out during model training and prediction.
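Drawing ghost coefficients from such a joint Gaussian can be sketched with a jittered Cholesky factor, a standard trick for keeping the kernel matrix numerically p.s.d. All names, the constant mean, and the toy one-dimensional kernel below are assumptions for illustration:

```python
import numpy as np

def sample_ghosts(Z, mu, kernel, n_samples=1, jitter=1e-9, seed=0):
    """Draw ghost coefficients h ~ N(mu * 1, K(Z, Z)) via a Cholesky
    factor of the jittered kernel matrix."""
    G = len(Z)
    Kzz = kernel(Z[:, None], Z[None, :]) + jitter * np.eye(G)
    L = np.linalg.cholesky(Kzz)                       # Kzz = L @ L.T
    eps = np.random.default_rng(seed).standard_normal((G, n_samples))
    return mu + L @ eps                               # correlated Gaussian draws

se = lambda a, b: np.exp(-0.5 * (a - b) ** 2 / 0.1**2)  # toy 1-D kernel
Z = np.linspace(0.0, 1.0, 8)                             # virtual concentrations
h = sample_ghosts(Z, mu=2.0, kernel=se, n_samples=3)
```

In the actual model the ghosts are not sampled but treated as latent variables and optimized or integrated out; the sketch only shows what distribution Equation (12) assigns to them.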

Diagonal-Dominating Prior
Following Occam's razor, if the dominant diagonal diffusion coefficients D_ii(c) for i = 1, …, M can fully explain the diffusion process, it is reasonable to suppress the non-diagonal diffusion coefficients D_ij(c) for i ≠ j to encourage a simpler model. To inject this preference into the model, we design a special Laplace prior for the mean μ_j of each Gaussian process of (12),

p(μ_j) ∝ exp( −|μ_j − δ(i, j) μ̄| / b ),   (13)

where δ(·, ·) is the delta function (δ(i, j) = 1 if i = j and 0 otherwise) and i is the row that matches the choice of d_j(c). We use a Laplace prior rather than a Gaussian prior to encourage sparsity of the diffusion coefficients at non-diagonal locations. The particular prior parameters may be adjusted for a different system to reflect our prior knowledge.
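A minimal sketch of such a diagonal-dominant Laplace log-prior. The target mean `mu_bar` and the scale are illustrative hyperparameters of our own choosing, not values from the paper:

```python
def log_laplace_prior(mu, i, j, scale=0.1, mu_bar=1.0):
    """Diagonal-dominant Laplace log-prior (up to an additive constant):
    off-diagonal GP means (i != j) are pulled toward 0, diagonal ones
    toward the illustrative value mu_bar."""
    target = mu_bar if i == j else 0.0
    return -abs(mu - target) / scale

# a nonzero off-diagonal mean pays a linear (sparsity-inducing) penalty
lp_zero_offdiag = log_laplace_prior(0.0, i=1, j=2)
lp_large_offdiag = log_laplace_prior(0.5, i=1, j=2)
```

The linear (rather than quadratic) penalty is what pushes off-diagonal means exactly toward zero, mirroring the sparsity behavior of L1 regularization but applied only where the physics suggests it.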

Joint Model Training
With each interdiffusion coefficient d_j(c) fully specified above, the observed flux u_n can be recovered by

u_n = f_n + ε_n,   f_n = − Σ_{j=1}^{M} ∇c_j^n d_j(c_n),   (14)

where the noise term ε_n ∼ N(0, σ²) captures model inadequacy, uncertainty, and noise in the observed flux, and f_n denotes the unknown true flux. The last piece of this work is the maximum a posteriori (MAP) estimation of all model parameters,

max_{Θ, B, Z, H, σ} L(Θ, B, Z, H, σ),   (15)

where L(Θ, B, Z, H, σ) = log p(u) is the log likelihood of our model, computed by comparing the predicted flux f with the observed flux u. More specifically, the log marginal likelihood can be computed by

log p(u) = log ∫ p(u | f) p(f | H, Z) df.   (16)

To complete the integration in Equation (16), we first note that

p(u | f) = N(u | f, σ² I)   (17)

is simply a Gaussian; p(f) is a mixture of M Gaussians, which is also Gaussian,

p(f | H, Z) = N(f | m, Q),   Q = Σ_{j=1}^{M} diag(∇c^j) Q_j diag(∇c^j),   (18)

where Q_j and the mean m are assembled from the GP conditionals of d_j given the ghost coefficients h_j. Substituting (17) and (18) into Equation (16), we obtain the joint log likelihood

L = −(1/2)(u − m)^T (Q + σ² I)^{−1} (u − m) − (1/2) log |Q + σ² I| − (N/2) log 2π + log p(H | Z) + log p(μ).   (19)

We can now use any optimization technique, e.g., gradient descent, to solve the MAP problem. Although the fully independent training conditional (FITC) approximation [33] can be used to force Q_j to be diagonal and thus enable fast computations [32], due to the multiplier ∇c_j, Q is generally non-diagonal, and this computational acceleration does not apply in our case. The main computations for the joint likelihood (19) are the inverse of the joint covariance matrix, (Q + σ² I)^{−1}, and its log determinant, log |Q + σ² I|. Using an LU decomposition trick [22], we can compute both at O(N³) time complexity and O(N²) space complexity. For the interdiffusion problem, we typically have N ≤ 100 EPMA samples, making our method practically efficient.
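The core O(N³) computation, the quadratic form and log determinant of Q + σ²I, is usually done with a single triangular factorization; the paper cites an LU trick [22], and the Cholesky version below is the standard symmetric special case. A sketch with an arbitrary p.s.d. covariance standing in for Q:

```python
import numpy as np

def gp_log_marginal(u, Q, sigma2):
    """Gaussian log marginal likelihood log N(u | 0, Q + sigma2 * I),
    via one Cholesky factorization: O(N^3) time, O(N^2) space."""
    N = len(u)
    L = np.linalg.cholesky(Q + sigma2 * np.eye(N))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, u))   # (Q + s2 I)^{-1} u
    logdet = 2.0 * np.sum(np.log(np.diag(L)))             # log |Q + s2 I|
    return -0.5 * u @ alpha - 0.5 * logdet - 0.5 * N * np.log(2.0 * np.pi)

rng = np.random.default_rng(2)
B = rng.standard_normal((25, 25))
Q = B @ B.T                       # an arbitrary p.s.d. covariance for the demo
u = rng.standard_normal(25)
ll = gp_log_marginal(u, Q, sigma2=0.1)
```

For N ≤ 100, as in the EPMA setting, this factorization is essentially instantaneous, which is why no sparse approximation is needed.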

Interdiffusion Coefficients Predictions
With all model parameters optimized, we can derive the posterior of the diffusion coefficients at any concentration c* as

p(d_j(c*) | h_j) = N( μ_j + (k*_j)^T K_j^{−1} (h_j − μ_j 1),  [K_j]_{**} − (k*_j)^T K_j^{−1} k*_j ),   (20)

where k*_j = (k_j(c*, z_j1), …, k_j(c*, z_jG))^T is the covariance between c* and the ghost coefficient locations z_jg, and [K_j]_{**} = k_j(c*, c*). The derivation details are given in Appendix A for clarity.
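A sketch of this posterior rule for a GP with a constant mean, treating the ghost coefficients as noise-free observations. The toy kernel, the sine-based ghost values, and all names are assumptions for illustration:

```python
import numpy as np

def gp_posterior(c_star, Z, h, mu, kernel, jitter=1e-10):
    """Posterior mean and variance of d(c*) given ghost values h at
    locations Z, for a GP with constant mean mu."""
    Kzz = kernel(Z[:, None], Z[None, :]) + jitter * np.eye(len(Z))
    k_star = kernel(np.full(len(Z), c_star), Z)     # vector of k(c*, z_g)
    w = np.linalg.solve(Kzz, k_star)                # K^{-1} k*
    mean = mu + w @ (h - mu)
    var = kernel(c_star, c_star) - w @ k_star
    return mean, var

se = lambda a, b: np.exp(-0.5 * (a - b) ** 2 / 0.2**2)  # toy 1-D kernel
Z = np.linspace(0.0, 1.0, 6)
h = np.sin(3.0 * Z)                                     # stand-in ghost values
m, v = gp_posterior(0.5, Z, h, mu=0.0, kernel=se)
```

At a ghost location the posterior interpolates the ghost value with near-zero variance, while far from all ghosts it reverts to the prior mean and variance; this is what makes the predicted coefficient functions smooth yet flexible.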

Results
In practical experiments, the interdiffusion coefficients are unknown and uncontrollable, which makes unbiased evaluation difficult. Thus, we first assess InfPolyn on numerical examples of ternary (M = 2) and quaternary (M = 3) systems. To imitate a real system without losing generality, we use polynomial and exponential functions to construct the interdiffusion coefficient functions. As an example, a fourth-order polynomial interdiffusion coefficient in a two-component system takes the form

D_ij(c) = Σ_m Σ_{r=0}^{4} a_ij^{m,r} c_m^r,

where, for each coefficient a_ij^{m,r}, the superscript r represents the polynomial degree, and the values are generated independently from uniform distributions U(0, 1). We put constraints on the high-order terms to prevent the diffusion coefficients from increasing/decreasing drastically with the concentrations c; the diffusion matrix is made symmetric to ensure numerical stability of the diffusion simulations. Note that this symmetric-structure prior information is not injected into InfPolyn or any competing model. For the ternary system, the initial conditions for the forward simulation are step profiles expressed with the Heaviside step function 1(z), which equals 0 when z < 0 and 1 when z ≥ 0; the initial conditions for the quaternary system are defined similarly. With the defined initial conditions and interdiffusion coefficient functions, we use a finite-difference (FD) forward solver to simulate the diffusion process until the terminal time and obtain the terminal concentration profile C. To remain numerically stable and accurate, we use a second-order central difference in space and a fourth-order Runge-Kutta scheme in time. The forward solver uses a spatial step ∆x = 0.000625, i.e., 1601 grid points on the spatial domain [0, 1]; the terminal time is set to 10^4 ∆t.
We then take equally spaced samples from the terminal concentration profile to mimic the EPMA process and provide the terminal concentration profile C. Unless stated otherwise, the terminal concentration profile consists of 40 samples. Since we are concerned with the central area where the diffusion process is significant, the EPMA samples are limited to the range [0.44, 0.56] to avoid numerical error close to the boundaries for all Boltzmann–Matano methods. All variables are considered dimensionless in the experiments. To evaluate the performance of the different methods, we follow Cheng et al. [13] and use the relative error (RE),

RE_ij(c) = |D̂_ij(c) − D_ij(c)| / |D_ij(c)|,

where D̂_ij(c) and D_ij(c) are the predicted and true interdiffusion coefficients at concentration c, respectively. As a Boltzmann–Matano numerical-inversion-based method, InfPolyn is compared with the other SOTA Boltzmann–Matano numerical-inversion-based methods, i.e., the polynomial interdiffusion methods [13] with 3rd- and 4th-order polynomials, the compressed sensing approach [16] combined with a 4th-order polynomial (a model of high enough order to capture subtle changes), and the L2 regularization approach, which replaces the L1 penalty term in the work of Qin et al. [16] with an L2 penalty term, combined with a 4th-order polynomial function.
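The RE metric itself is a one-liner; for completeness:

```python
import numpy as np

def relative_error(D_hat, D_true):
    """Pointwise relative error |D_hat - D_true| / |D_true|.
    Works elementwise on scalars or NumPy arrays."""
    return np.abs(D_hat - D_true) / np.abs(D_true)
```

Note that dividing by the true coefficient inflates the RE wherever D_ij is small, which is relevant when comparing errors on the small off-diagonal coefficients below.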

Case Study 1: Polynomial Diffusion Coefficients
In this case study, we assess InfPolyn on a ternary and a quaternary system with 4th-order polynomial interdiffusion coefficients,

D_ij(c) = Σ_m Σ_{r=0}^{4} a_ij^{m,r} c_m^r,

where each coefficient a_ij^{m,r} is randomly generated from independent uniform distributions. To ensure the symmetric structure of the matrices A^{m,r}, we enforce a_ij^{m,r} = a_ji^{m,r} by taking their average. In a general interdiffusion process, the interdiffusion coefficients are expected to be smooth and close to constant, which also prevents instability in the numerical forward solver. To encode this prior knowledge, we constrain the polynomial coefficients via a_ij^{m,r} ∼ U(0, 1) × 10^{r−5}. The exact values used are given in Appendix A. The REs for x ∈ [0.4, 0.6] for the ternary and quaternary systems are shown in Figures 1 and 2. We omit the areas outside [0.4, 0.6] because the REs there are just extended flat lines without interesting information. As expected, the 4th-order polynomial method has a strong model capacity and can thus achieve the lowest REs in a few places, as shown in some panels of Figures 1 and 2. However, over the whole area of interest, its overall performance is the worst among all methods. In particular, due to overfitting, the 4th-order polynomial method shows highly fluctuating performance, which is highly undesirable for real applications. It is not surprising that the 3rd-order polynomial approach shows slightly fewer fluctuations but also fewer lowest REs. This is precisely the aforementioned model-selection dilemma of the polynomial-based methods. Similar to the results in [16], adding an L1 regularization term eases the overfitting issue and largely overcomes the performance fluctuations in both Figures 1 and 2. Unfortunately, the improvement comes at the price of low model capacity, leading to a rather flat fit and correspondingly flat REs. The 4th-order polynomial method combined with an L2 regularization term shows a similar improvement.
It is, however, difficult to tell which regularization term is better. The L1 regularization works better on the ternary system in Figure 1, whereas the L2 approach outperforms the L1 by a large margin in most cases of Figure 2. The inconsistent performance of the L1 and L2 regularization approaches certainly hinders their application to practical problems. In contrast, guided by the correct priors and benefiting from its nonparametric nature, InfPolyn shows a consistent and accurate fit and outperforms the competitors by a significant margin. Thanks to its model flexibility, InfPolyn can capture the dramatic changes in the center while maintaining a good fit in the other, flat areas. In all cases, InfPolyn not only remains stable (indicated by a smooth RE curve) but also achieves the lowest REs in most areas. Furthermore, note that the diagonal interdiffusion coefficients generally show a lower relative error. This is because, in the simulation setting, the diagonal interdiffusion coefficients play a dominant role in the diffusion process. For the non-diagonal interdiffusion coefficients, the REs are amplified by division by the smaller true interdiffusion coefficients.
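The coefficient-generation recipe of this case study (uniform draws, 10^{r−5} scaling, and symmetrization by averaging with the transpose) can be sketched as follows; the array layout and function names are our assumptions:

```python
import numpy as np

def sample_poly_coeffs(M=2, R=4, seed=0):
    """Coefficients a_ij^{m,r} ~ U(0,1) * 10^(r-5), with each matrix
    A^{m,r} symmetrized by averaging with its transpose."""
    rng = np.random.default_rng(seed)
    A = rng.random((M, R + 1, M, M))                 # axes: m, r, i, j
    A = 0.5 * (A + A.transpose(0, 1, 3, 2))         # enforce a_ij = a_ji
    return A * 10.0 ** (np.arange(R + 1) - 5)[None, :, None, None]

def eval_D(A, c):
    """Evaluate D_ij(c) = sum_m sum_r a_ij^{m,r} * c_m^r."""
    M, R1 = A.shape[0], A.shape[1]
    powers = np.array([[c[m] ** r for r in range(R1)] for m in range(M)])
    return np.einsum('mrij,mr->ij', A, powers)

A = sample_poly_coeffs()                             # ternary case: M = 2
D = eval_D(A, np.array([0.3, 0.7]))
```

Since every A^{m,r} is symmetric and the scaling acts only on the degree index, the resulting D(c) is symmetric for every concentration, which is the property the forward solver relies on.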

Case Study 2: Exponential Diffusion Coefficients
In general, the diffusion coefficients can be so complex that they are not of polynomial form. To imitate such challenging situations, in this case study we assess InfPolyn on ternary and quaternary systems with interdiffusion coefficients that combine an exponential term and a sinusoidal term,

D_ij(c) = a_ij^0 + Σ_m a_ij^{m,1} exp(c_m) + Σ_m a_ij^{m,2} sin(c_m),

where the functional coefficients are sampled from uniform distributions, i.e., a_ij^0 ∼ U(0, 1) × 10^{−5}, a_ij^{m,1} ∼ U(0, 1) × 10^{−6}, and a_ij^{m,2} ∼ U(0, 1) × 10^{−6}. As before, to ensure the stability of the forward diffusion simulation, we use the previous approach to enforce the symmetric structure of the matrices A^0, A^{m,1}, and A^{m,2}. The exact values of the functional coefficients are given in Appendix A. The model performances measured by RE are shown in Figures 3 and 4. In this case study, the 3rd-order polynomial slightly outperforms the 4th-order polynomial approach in most cases in both Figures 3 and 4. Nevertheless, the performance of both the 3rd- and 4th-order polynomial approaches is degraded by fluctuations across the domain. Furthermore, note that the REs of the polynomial approaches in Figure 4 are flat and smooth, indicating that a rich model capacity does not necessarily lead to performance fluctuations in all cases. The L1 and L2 regularizations combined with the 4th-order polynomial degrade the model performance rather than improve it in many cases in Figure 4. This is evidence that the inappropriate implicit prior assumptions introduced by the L1 and L2 regularization terms can hurt model performance. It might be possible to circumvent this issue by adjusting the penalty weight. However, this creates the new issue of how to properly choose the penalty weight, taking us back to the model-selection dilemma.
In contrast, InfPolyn shows consistent and accurate performance; it outperforms the competitors by a large margin in all cases except for $\tilde{D}_{31}$ of the quaternary system in the left area of Figure 4. We would also like to point out that many methods actually fail on the quaternary system in Figure 4, as their REs are larger than 1, meaning a total prediction failure.

Case Study 3: Uncertainty Quantification Analysis
Finally, to assess the consistency of InfPolyn, we repeated the ternary-system experiment of Case Study 1 with five distinct random sets of polynomial coefficients, which assemble five different diffusion coefficients, and report the performance statistics. To also investigate the influence of the number of EPMA samples, we ran each experiment with {20, 30, 40, 50} EPMA samples. The minimum number of EPMA samples was 20 because the 4th-order polynomial has 18 coefficients and thus requires at least 18 EPMA samples to work. For each experiment with a given number of EPMA samples, model performance was evaluated by the average relative error (ARE), i.e., the relative error averaged over the whole spatial domain X. We show the statistics of $\mathrm{ARE}_{11}$ and $\mathrm{ARE}_{22}$ over the five different diffusion coefficients in Figure 5 using Tukey box plots. The fact we immediately see is the superiority of InfPolyn over the competitors in terms of accuracy and consistency. We then notice that performance does not improve gradually with an increasing number of EPMA samples for any method except the 4th-order polynomial. We believe that each method can already reach reasonable diffusion coefficients (by minimizing the loss function) with only 20 EPMA samples; in this case, more samples bring no improvement, whereas performance can fluctuate with different EPMA concentration profiles. Comparing the fluctuations, InfPolyn shows a modest level of change, whereas the most unstable method is the 4th-order polynomial with L2 regularization. The most stable method for both $\tilde{D}_{11}$ and $\tilde{D}_{22}$ is the 3rd-order polynomial, which can indicate a lack of model capacity, i.e., an underfitting issue. The only exception showing performance improvement is the 4th-order polynomial, which improves with more EPMA samples. This is a clear sign of overfitting, which can be mitigated by introducing more training data, and it explains the overfitting phenomena we encountered in Case Studies 1 and 2.
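The ARE used above is simply the pointwise relative error averaged over the spatial grid; a minimal sketch (the array inputs are hypothetical):

```python
import numpy as np

def average_relative_error(d_true, d_est):
    """ARE: relative error |D_est - D_true| / |D_true| averaged over the domain X."""
    d_true = np.asarray(d_true, dtype=float)
    d_est = np.asarray(d_est, dtype=float)
    return float(np.mean(np.abs(d_est - d_true) / np.abs(d_true)))

# Toy check on a grid of 100 spatial points.
x = np.linspace(0.0, 1.0, 100)
d_true = 1e-5 * (1.0 + x)   # hypothetical "true" D11 along the domain
d_est = 1.05 * d_true       # estimate with a uniform 5% error
print(average_relative_error(d_true, d_est))  # ≈ 0.05
```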
Will the performance keep improving and eventually outperform InfPolyn, following the trend shown in Figure 5? It might, with more than 200 EPMA samples, but that is infeasible in practice. Furthermore, the decreasing trend should level off at some point, which is already happening for $\tilde{D}_{22}$.
It is also noticeable that the L1 and L2 regularization techniques can indeed improve the performance of the 4th-order polynomial by a large margin in all cases with different numbers of EPMA samples, which is consistent with the finding in [16].

Case Study 4: Experiment Verification
To demonstrate the practical applicability of InfPolyn, we apply it to the reproduction of the interdiffusion flux from experimental data of the Mg-Al, Mg-Al-Zn, and Mg-Al-Zn-Cu systems collected from the previous literature [13]. These experimental data include composition profiles of annealed diffusion couples: Mg-Al at 781 K for 36,960 s, Mg-Al-Zn at 868 K for 5400 s, and Mg-Al-Zn-Cu at 755 K for 75,530 s. Since the experimental measurements are taken non-uniformly over the spatial domain for all components, they are reprocessed with local polynomial interpolation to provide values on a uniform grid, which is the common preprocessing step for Matano-based approaches. The derivative and integral terms in the Matano equation are then obtained. Given the preprocessed data as inputs, we randomly take all of the samples, half of the samples, and a quarter of the samples from the diffusion systems to test the robustness of the methods. As shown in Figure 6, the curves for all three cases computed by InfPolyn fit the experimental data well, which lie within the 95% confidence regions, indicating good uncertainty quantification for the predictions. For the half-size and quarter-size training data, oscillations appear in some intervals in the left areas; however, InfPolyn still captures the major tendency of the fluxes with only slightly increased uncertainty.
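As a sketch of this preprocessing pipeline, the code below resamples a non-uniform profile onto a uniform grid and computes the derivative and Matano-integral terms. Linear interpolation is used here as a simple stand-in for the local polynomial interpolation used in the paper, and a toy sigmoidal profile stands in for the experimental Mg-system data.

```python
import numpy as np

def preprocess_profile(x_raw, c_raw, n_grid=200):
    """Resample a non-uniformly measured concentration profile onto a uniform grid.
    Linear interpolation stands in for local polynomial interpolation."""
    x = np.linspace(x_raw.min(), x_raw.max(), n_grid)
    return x, np.interp(x, x_raw, c_raw)

def matano_terms(x, c):
    """Matano plane x_M, derivative dc/dx, and cumulative integral of (x - x_M) dc."""
    # Matano plane: choose x_M so that the total integral of (x - x_M) dc vanishes.
    x_m = np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(c)) / (c[-1] - c[0])
    dcdx = np.gradient(c, x)
    g = (x - x_m) * dcdx  # integrand of the Matano integral, written over x
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(x))))
    return x_m, dcdx, integral

# Toy sigmoidal diffusion couple "measured" at 40 random locations.
rng = np.random.default_rng(2)
x_raw = np.sort(rng.uniform(-1.0, 1.0, 40))
c_raw = 0.5 * (1.0 + np.tanh(3.0 * x_raw))
x, c = preprocess_profile(x_raw, c_raw)
x_m, dcdx, integral = matano_terms(x, c)
```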

Conclusions
In this paper, we proposed InfPolyn, a novel nonparametric Bayesian framework to estimate interdiffusion coefficients, and demonstrated its superiority in terms of accuracy and consistency by combining it with the numerical-inversion Boltzmann-Matano method [13]. This combination is also a limitation of InfPolyn, because the numerical-inversion Boltzmann-Matano method has its own restrictions; for instance, it cannot generalize to a wide variety of complicated 2D diffusion processes or complex engineering interdiffusion scenarios, e.g., in semiconductors. Nevertheless, InfPolyn can be combined with other methods (such as the forward-simulation approach [10]) to fulfill its potential in the estimation of interdiffusion coefficients. This is outside the scope of this paper, and we leave it for future work.
The main novelty of our work is the nonparametric Bayesian framework that allows automatic model selection (resistant to overfitting and underfitting) and meaningful injection of prior knowledge. The problem of recovering interdiffusion coefficients from concentrations is an ill-posed inverse problem; thus, injecting proper priors is necessary to recover the ground-truth diffusion coefficients. Unlike methods such as [16] that impose nonphysical priors, our method provides an easy way to inject intuitive priors, e.g., that the diagonal diffusion coefficients normally play a dominant role in an interdiffusion process.

Conflicts of Interest:
The authors declare no conflict of interest.

Solving a Ternary System Using the Boltzmann-Matano Inverse Method
Consider a ternary system in which we take EPMA samples at four random locations, $[c_1(x_n), c_2(x_n)]$, $n = 1, \ldots, 4$, on the "true" concentration-distance curve at a terminal time. Substituting them into the Boltzmann-Matano equations gives, at each location $x_n$,
$$-\frac{1}{2t}\int_{c_i(-\infty)}^{c_i(x_n)} (x - x_M)\,\mathrm{d}c_i = \sum_{j=1}^{2} \tilde{D}_{ij}\,\frac{\partial c_j}{\partial x}\Big|_{x = x_n}, \quad i = 1, 2,$$
where $x_M$ denotes the Matano plane. We assume second-order polynomials for the functional relationships between the diffusion coefficients and the concentrations,
$$\tilde{D}_{ij}(c_1, c_2) = \alpha_{ij}^{0} + \alpha_{ij}^{1} c_1 + \alpha_{ij}^{2} c_2 + \cdots,$$
with terms up to second order in $c_1$ and $c_2$. We can now simply solve the linear system of equations to obtain the polynomial coefficients (and thus the diffusion coefficients as functions of the polynomials). The procedure is the same with more EPMA samples or higher orders of polynomials; the only difference is that we may solve an overdetermined system under the criterion of minimizing the L2 loss.
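A sketch of this assembly step is given below. The stacking order of the unknowns and the helper names are our own; we assume the full bivariate second-order expansion (six features per coefficient), and the Matano integrals `I1`, `I2` are assumed to be precomputed from the concentration profiles.

```python
import numpy as np

def phi(c1, c2):
    """Full second-order polynomial features of the two independent concentrations."""
    return np.array([1.0, c1, c2, c1**2, c1 * c2, c2**2])

def assemble_system(c1, c2, dc1dx, dc2dx, I1, I2):
    """Stack the Boltzmann-Matano equations into a linear system A @ alpha = b.

    alpha concatenates the 6 polynomial coefficients of each of
    D11, D12, D21, D22 (24 unknowns in total); I1[k] and I2[k] are the
    precomputed Matano integrals at sample location k.
    """
    n = len(c1)
    A = np.zeros((2 * n, 24))
    b = np.zeros(2 * n)
    for k in range(n):
        f = phi(c1[k], c2[k])
        # Equation for component 1: D11 dc1/dx + D12 dc2/dx = I1
        A[2 * k, 0:6] = f * dc1dx[k]
        A[2 * k, 6:12] = f * dc2dx[k]
        b[2 * k] = I1[k]
        # Equation for component 2: D21 dc1/dx + D22 dc2/dx = I2
        A[2 * k + 1, 12:18] = f * dc1dx[k]
        A[2 * k + 1, 18:24] = f * dc2dx[k]
        b[2 * k + 1] = I2[k]
    return A, b
```

With enough EPMA samples the system becomes overdetermined, and `np.linalg.lstsq(A, b, rcond=None)` returns the coefficients minimizing the L2 loss, exactly as described above.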