Parameter Estimation of the Stochastic Primary Nucleation Kinetics by Stochastic Integrals Using Focused-Beam Reﬂectance Measurements

: The kinetic parameters of stochastic primary nucleation were estimated for the batch-cooling crystallization of L-arginine. It is di ﬃ cult for process analytical tools to detect the ﬁrst nucleus. In this study, the latent period for the total number of crystals to be increased to a predetermined threshold was repeatedly measured with focused-beam reﬂectance measurements. Consequently, the latent periods were di ﬀ erent in each measurement due to the stochastic behavior of both primary and secondary nucleation. Therefore, at ﬁrst, the distribution of the latent periods was estimated by a Monte Carlo simulation for some combinations of the kinetic parameters of primary nucleation. In the simulation, stochastic integrals of the population and mass balance equations were solved. Then, the parameters of the distribution of latent periods were estimated and correlated with the kinetic parameters of primary nucleation. The resulting correlation was represented by a mapping. Finally, the parameters of the actual distribution were input into the inverse mapping, and the kinetic parameters were estimated as the outputs. The estimated kinetic parameters were validated using statistical techniques, which implied that the observed distribution function of the latent periods for the thresholds used in the estimation coincided reasonably with the simulated one based on the estimated parameters.


Introduction
The nucleation and growth kinetics of crystallization strongly affects the crystal quality, especially the crystal size distribution (CSD) [1][2][3] and, hence, needs to be described in crystallization processes. However, nucleation occurs stochastically rather than deterministically [4][5][6], which may make it difficult to control the crystallization processes. This stochastic nature of nucleation has been reported to be caused by a rarity of the nucleation event [7,8]. Nucleation can be classified into primary nucleation and secondary nucleation. When the system adequately contains the crystals, secondary nucleation occurs so frequently that it can be regarded as deterministic [9,10]. On the other hand, primary nucleation is often regarded as the stochastic process, and this stochastic behavior has been studied by many researchers [11][12][13].
Some difficulties arise when the primary nucleation kinetics is investigated in the unseeded batch-cooling crystallizer by using process analytical tools (PATs). First, it is impossible for many PATs to detect the very first nucleus in a large crystallizer [14]. The instruments can finally detect nucleation when the solution comes to suspend a certain number of crystals, which may include grown secondary nuclei. Secondly, the ends of the latent periods, when the PATs detect a predetermined state of the solution, can differ from one another in several measurements due to the stochastic behavior of nucleation. In order to overcome these difficulties, the population and mass balance equations, including stochastic primary nucleation, needs to be solved [9]. These differential equations include the terms that are the stochastic processes, and hence, the solutions of the equations are also the stochastic processes. The distribution of the latent periods may be reproduced by the stochastic integrals.
In addition, several difficulties in the stochastic integrals remain unsolved yet. At first, the stochastic behavior of secondary nucleation was ignored in many reports on the stochastic integrals of the population balance equations. However, secondary nucleation may not occur frequently in the early stage of unseeded or partially seeded crystallization and can behave in a stochastic manner. In fact, the stochastic behavior of secondary nucleation was studied in [15,16]. Then, many algorithms developed for solving the stochastic integrals specialized in stochastic primary nucleation. It might be difficult for these algorithms to handle other stochastic processes such as stochastic secondary nucleation.
This work aims to estimate the kinetic parameters of stochastic primary nucleation in cooling crystallization of L-arginine (Arg). First, the kinetic parameters of secondary nucleation and growth are determined by preliminary experiments based on our previous studies [17,18]. Secondly, the parameters of the distribution of latent periods for the total number of crystals to be increased to a predetermined threshold are estimated and correlated with the kinetic parameters of stochastic primary nucleation by means of a Monte Carlo simulation. In this simulation, the stochastic integrals of the population and mass balance equations are repeatedly solved with the method of moments. A reasonable approximation is introduced into the stochastic integrals, which enables stochastic secondary nucleation to be considered. The parameters of the distribution are estimated at several points in the space of the kinetic parameters, and the resulting correlation is represented by a mapping. Thirdly, the latent period is repeatedly measured with focused beam reflectance measurement (FBRM) and Fourier transform infrared-attenuated total reflection spectroscopy (FT-IR-ATR), from which the actual distribution of the latent periods is obtained. Fourthly, the parameters of the obtained distribution of the latent periods are input into the inverse mapping, and the kinetic parameters of stochastic primary nucleation are estimated as the outputs. Finally, the estimated kinetic parameters and the consideration to stochastic secondary nucleation are validated using the stochastic integrals and the statistical techniques.

Substances and Devices
An aqueous solution of L(+)-arginine (C 6 H 14 N 4 O 2 , abbreviated to Arg) was utilized as a target substance in this study. Arg (>98.0%) was purchased from FUJIFILM Wako Pure Chemical Corporation (Osaka, Japan). Arg is a basic amino acid, which nourishing tonics often contain, and there are few studies on the crystallization of Arg. In our previous work [17,18], the characteristics of growth and secondary nucleation for the crystallization of Arg were investigated as a typical hydrophilic amino acid, and hence, Arg was selected also in this study. The physical properties of the substance are listed in Table 1. The in-line measurements of the solution temperature, the chord length distribution (CLD), and the IR spectrum were conducted simultaneously. The solution temperature was measured with a platinum electrode (model CENTER376 made by Sato Shoji Corporation, Tokyo, Japan) and was controlled in a jacketed beaker with a programmable controller (model AD-4820, made by A&D Company Limited, Tokyo, Japan).
The CLD was measured with focused beam reflectance measurement (FBRM, made by Mettler Toledo International Inc., Columbus, OH, USA). The FBRM technique is often applied to obtain the size distribution [19][20][21]. However, the CLD obtained with FBRM is different from CSD, and hence, the CLD needs to be converted into the CSD [22,23]. In our previous work [18,24], the method for the conversion was investigated for the Arg crystals and is represented as follows: where S is mapping matrix, and N is frequency-length or frequency-size distribution. Each element of S corresponds with the probability that a crystal with a certain size will be measured as a certain chord length [18]. In this study, S was computed with a Monte Carlo analysis in MATLAB (version R2019a made by The MathWorks, Inc., Natick, MA, USA) [25,26], in which crystals with a predetermined shape were rotated and translated randomly [24,27]. Settings of the Monte Carlo analysis and the instruments for FBRM are listed in Table 2. A number-size distribution was deduced from the frequency-size distribution and the total mass of crystals based on the mass balance. The IR spectrum was measured with Fourier transform infrared-attenuated total reflection spectroscopy (FT-IR-ATR, made by Mettler Toledo International Inc., Columbus, OH, USA) and was utilized for obtaining the Arg concentration. The peak intensity of the FT-IR-ATR has been reported to be affected by the solution temperature [28,29]. The calibration equation used in this study is represented as follows: where A is peak area to the two-point baseline; C is the concentration or mass of solute per unit mass of solvent; T is the Celsius temperature of the solution; and α, β, and γ are regression coefficients. The regression coefficients were estimated in each measurement. An example of the estimation of the regression coefficients is illustrated in Figure 1, and the measurement conditions of the FT-IR-ATR are listed in Table 3. In Figure 1, the dotted line of concentration denotes that the concentration was unknown. The range of data marked by the gray arrows was utilized to obtain the calibration equation of A = 6.58 C − 0.00696 T + 0.471.

Experimental Procedure
A saturated amount of Arg anhydrate was dissolved in 300 mL of water at an initial temperature higher than the saturated temperature. Then, the solution was linearly cooled down to the final temperature and stirred with a four-blade agitator (φ40) at the rotation speed of 400 rpm. During cooling in the preliminary experiments, sieved seed crystals with the size range of 44-74 μm were added when the solution temperature reached the saturated temperature. After cooling, the solution temperature was kept at the final temperature for a while. The conditions of the preliminary experiments and the main experiments are listed in Table 4. The preliminary experiments were performed for the estimation of the secondary nucleation and growth kinetics, and the main ones for the estimation of the primary nucleation kinetics.

Mathematical Model
Growth is regarded as deterministic, and this rate is represented as follows [30,31]: where kg and g are the growth rate coefficient and order, respectively, and S is the supersaturation ratio. Saturation and supersaturation are expressed as follows: sat = sat ( ) = α sat 2 + β sat + γ sat ,

Experimental Procedure
A saturated amount of Arg anhydrate was dissolved in 300 mL of water at an initial temperature higher than the saturated temperature. Then, the solution was linearly cooled down to the final temperature and stirred with a four-blade agitator (ϕ40) at the rotation speed of 400 rpm. During cooling in the preliminary experiments, sieved seed crystals with the size range of 44-74 µm were added when the solution temperature reached the saturated temperature. After cooling, the solution temperature was kept at the final temperature for a while. The conditions of the preliminary experiments and the main experiments are listed in Table 4. The preliminary experiments were performed for the estimation of the secondary nucleation and growth kinetics, and the main ones for the estimation of the primary nucleation kinetics.

Mathematical Model
Growth is regarded as deterministic, and this rate is represented as follows [30,31]: Crystals 2020, 10, 380 5 of 17 where k g and g are the growth rate coefficient and order, respectively, and S is the supersaturation ratio. Saturation and supersaturation are expressed as follows: where C sat is the solubility at temperature T, T sat is the saturation temperature at concentration C, the subscript "sat" denotes saturation, and regression coefficients of the solubility curve are also listed in Table 1. Primary nucleation and secondary nucleation are regarded as stochastic, and expected values of these rates are represented as follows, respectively [31][32][33]: where k b1 and b1 are the primary nucleation rate coefficient and order, k b2 and b2 are the secondary nucleation rate coefficient and order, the superscript "*" emphasizes that the value is a stochastic variable, and µ 3 is the third moment of the CSD. The i-th moment of the CSD is defined as follows: Here, n is the population density or crystal number per unit mass of solvent per unit crystal size, depending on time t and crystal size L. The total nucleation rate and its expectations are represented as follows, respectively, due to the linearity of the expectations: Then, the probability that the number of crystals newly nucleated from t to t + τ will equal k is denoted as follows: where the concentration begins to exceed the solubility at t = 0, W is the mass, and the subscript "s" denotes the solvent. At most one nucleus is assumed to appear during an infinitesimal time, and hence, the following equation is established: The following equations are derived by solving Equation (13) [9,14]: where Equation (15) means that the number of crystals newly nucleated from t to t + τ follows a Poisson distribution with the parameter in the right side. Here, τ is substituted by a reasonably short time ∆t to derive approximately the following equations: where ∆t should be so short a time that the change in the expectation of the nucleation rate and the increase in the crystal size can be ignored during ∆t. In this study, ∆t was determined as follows: At first, the deterministic population and mass balance equations were solved with a variable step solver. In solving the equations, the solver based on the Runge-Kutta (4,5) formula was utilized in MATLAB [34,35]. Then, the shortest time step that had a predetermined acceptable error was detected when the change in the solution of the equations was sharpest, and this time step was adopted as ∆t.
Finally, the stochastic population and mass balance equations were solved with the step of the solver fixed to ∆t, where random numbers from the Poisson distribution were generated in MATLAB [36]. The stochastic population and mass balance equations are represented as follows [37,38]: where δ is the Dirac delta function, the subscripts "h" and "a" denote the hydrated crystal and anhydrated solute, and all solutions of these equations are also the stochastic processes due to stochastic nucleation. The method of moments is utilized to derive the following equation [38]:

Parameter Estimation
In the preliminary experiments, seed crystals are assumed to have been added so sufficiently that secondary nucleation could be regarded as deterministic and the primary nucleation rate to have been negligible. First, in order to estimate the kinetic parameters of secondary nucleation, the data of so-called metastable zone widths were acquired, and the regression analysis was carried out based on the following approximate equation: Here, R is the cooling rate, the subscript "seed" denotes the seed crystals, and ∆T m and r are defined as follows: where the subscript "m" means that the number of crystals reaches a predetermined threshold in this study. Equation (21) can be derived from Equation (20) as follows [17]: First, it is assumed from t = 0 to the threshold that C and W s are constant, µ 3 is proportional to µ 0 , and that µ 0,seed is negligibly small compared with rµ 0,seed . Secondly, B 2 is substituted for B* into Equation (20) at i = 0, and this equation is integrated from t = 0 to the threshold. Thirdly, the same equation is integrated with µ 3 fixed to the average value. Fourthly, the derivation of the second step is substituted into that of the third step, and the Crystals 2020, 10, 380 7 of 17 relation between the µ 3,seed and the averaged µ 3 is derived. Finally, the derivation of the fourth step is substituted into Equation (20) at i = 0, and this equation is integrated with t substituted by ∆T/R to derive Equation (21). Here, ∆T is a decrease in the solution temperature. Several thresholds can be set in one measurement only if the errors between the initial concentration and actual ones satisfy the following restriction: where ε was set to 0.2 in this study. Equation (24) should be satisfied, because the decrease in concentration is ignored in Equation (21). An example of the data acquisition of metastable zone widths is illustrated in Figure 2. At the points marked by the circles, the errors satisfied Equation (24), and the data were accepted and added to the dataset. The µ 0,seed was obtained as the average value in the preliminary measurements.
Crystals 2020, 10, x FOR PEER REVIEW 7 of 16 where ε was set to 0.2 in this study. Equation (24) should be satisfied, because the decrease in concentration is ignored in Equation (21). An example of the data acquisition of metastable zone widths is illustrated in Figure 2. At the points marked by the circles, the errors satisfied Equation (24), and the data were accepted and added to the dataset. The μ0,seed was obtained as the average value in the preliminary measurements. Secondly, in order to estimate the kinetic parameters of growth, the growth rates were approximately computed by the following equation derived from Equation (19): where Δt is reasonably small measuring interval. The data of G were acquired, and the regression analysis was carried out based on the following equation: Several datapoints can be obtained in one measurement only if the errors between the actual changes in the mass of crystals and contributions of growth satisfy the following restriction: where the numerator in the left side is the contribution of the secondary nucleation, and ε was set to 0.2 in this study. Equation (27) should be satisfied, because the contribution of the secondary nucleation is ignored in Equation (25). In the measuring interval of Δt, the nuclei may grow to have an average size of L1, which should be represented as follows: Thirdly, in order to estimate the kinetic parameters of stochastic primary nucleation, the data of the so-called latent periods were acquired from the main experiments. Several thresholds can be set in one measurement, and the latent period for the j'-th threshold satisfies the following equation: Secondly, in order to estimate the kinetic parameters of growth, the growth rates were approximately computed by the following equation derived from Equation (19): where ∆t is reasonably small measuring interval. The data of G were acquired, and the regression analysis was carried out based on the following equation: Several datapoints can be obtained in one measurement only if the errors between the actual changes in the mass of crystals and contributions of growth satisfy the following restriction: where the numerator in the left side is the contribution of the secondary nucleation, and ε was set to 0.2 in this study. Equation (27) should be satisfied, because the contribution of the secondary nucleation is ignored in Equation (25). In the measuring interval of ∆t, the nuclei may grow to have an average size of L 1 , which should be represented as follows: Thirdly, in order to estimate the kinetic parameters of stochastic primary nucleation, the data of the so-called latent periods were acquired from the main experiments. Several thresholds can be set in one measurement, and the latent period for the j'-th threshold satisfies the following equation: where the subscript "m" means that the total number of crystals reaches a predetermined threshold, and the latent period is also a stochastic variable. An example of the data acquisition of latent periods is illustrated in Figure 3, where the maximum number of crystals was obtained as the approximate average value in the main measurements.  The acquired data gave the parameters of the distribution of the latent periods, such as a mean and a standard deviation. On the other hand, a certain combination of the kinetic parameters of the stochastic primary nucleation may give the corresponding parameters of the distribution of latent periods by solving Equations (19) and (20) repeatedly. In this study, five hundred latent periods were simulated for each threshold and for each combination. Then, the parameters of the distribution were estimated at several points in the space of the kinetic parameters, and the resulting correlation was represented by a mapping as follows: where j is the number of thresholds, θ is the parameter vector, F is the mapping, which contains interpolants computed in MATLAB [39], the subscripts "d" and "B1" denote the distribution and kinetics of the primary nucleation, μ and σ are the mean and standard deviation, and θB1 was defined as follows in this study: Conversely, the parameters of the distribution of the observed latent periods are input into the inverse mapping to estimate the kinetic parameters of the stochastic primary nucleation as follows: where that emphasizes that the value is an unbiased estimator obtained from actual measurements, and the minimization was conducted with the trust-region-reflective algorithm in MATLAB [40][41][42]. The acquired data gave the parameters of the distribution of the latent periods, such as a mean and a standard deviation. On the other hand, a certain combination of the kinetic parameters of the stochastic primary nucleation may give the corresponding parameters of the distribution of latent periods by solving Equations (19) and (20) repeatedly. In this study, five hundred latent periods were simulated for each threshold and for each combination. Then, the parameters of the distribution were estimated at several points in the space of the kinetic parameters, and the resulting correlation was represented by a mapping as follows: where j is the number of thresholds, θ is the parameter vector, F is the mapping, which contains interpolants computed in MATLAB [39], the subscripts "d" and "B1" denote the distribution and kinetics of the primary nucleation, µ and σ are the mean and standard deviation, and θ B1 was defined as follows in this study: θ B1 = log k b1 b1 .
Crystals 2020, 10, 380 9 of 17 Conversely, the parameters of the distribution of the observed latent periods are input into the inverse mapping to estimate the kinetic parameters of the stochastic primary nucleation as follows: where that emphasizes that the value is an unbiased estimator obtained from actual measurements, and the minimization was conducted with the trust-region-reflective algorithm in MATLAB [40][41][42].

Validation
At first, the estimated parameters were substituted into the rate equations, and latent periods were simulated five hundred times. Then, the simulated cumulative distribution function of the latent periods was compared to the observed one with 99% confidence bounds. For calculating the confidence bounds, the Kaplan-Meier method was utilized as follows [43]: where the left side is the probability that the observed latent period will take a value less than or equal to a certain latent period, and this probability in itself is assumed to follow a normal distribution with the parameters in the right side. Here, the subscript "obsd" denotes an observed value, N is the normal distribution, and V is the variance, which is estimated as follows [43]: Here, ν l' is the number of samples within their own latent period at time t l' when some sample finishes its latent period, and it is assumed that no two samples finish their latent period at the same time. In addition, the test statistics of the two-sample Kolmogorov-Smirnov test were computed to obtain the p-value. The test statistic is defined as follows: where the subscript "simd" denotes a simulated value. The p-value is approximately computed with the following equation [44,45]: where ϑ is the Jacobi theta function, and l are defined as follows: where l simd or l obsd is the total number of samples, and l was about 23.8 in this study. The p-value is computed by substituting the test statistic for x. The small p-value implies the difference between the simulated distribution and the observed one. Finally, the deterministic secondary nucleation rate was substituted for the stochastic one into the population and mass balance equations, and the validation was carried out in the same way.

Parameter Estimation
In the preliminary experiments, the kinetic parameters of the secondary nucleation and growth were estimated based on the regression analyses expressed in Equations (21) and (26). The resulting regression lines are illustrated in Figures 4 and 5, respectively, and the estimated parameters are listed in Table 5. In Figure 4, the unit of R is K/s.   In the main experiments, at first, the latent periods for the four thresholds were measured twenty-five times, and the mean and the standard deviations of the latent periods were estimated for each threshold. The means and the standard deviations are listed in Table 6. At j' = 0 in Table 6, the standard deviation is large, which is attributed to a low sensitivity of the instruments for the small number of crystals. Moreover, at j' = 3, the standard deviation is also large, which might be attributed to the agglomeration, breakage, and heat of crystallization. Therefore, the data for these thresholds were rejected in this study.   In the main experiments, at first, the latent periods for the four thresholds were measured twenty-five times, and the mean and the standard deviations of the latent periods were estimated for each threshold. The means and the standard deviations are listed in Table 6. At j' = 0 in Table 6, the standard deviation is large, which is attributed to a low sensitivity of the instruments for the small number of crystals. Moreover, at j' = 3, the standard deviation is also large, which might be attributed to the agglomeration, breakage, and heat of crystallization. Therefore, the data for these thresholds were rejected in this study. In the main experiments, at first, the latent periods for the four thresholds were measured twenty-five times, and the mean and the standard deviations of the latent periods were estimated for each threshold. The means and the standard deviations are listed in Table 6. At j' = 0 in Table 6, the standard deviation is large, which is attributed to a low sensitivity of the instruments for the small number of crystals. Moreover, at j' = 3, the standard deviation is also large, which might be attributed to the agglomeration, breakage, and heat of crystallization. Therefore, the data for these thresholds were rejected in this study. Table 6. The statistics about the latent periods for each threshold.
Threshold (10 5  Then, the means and the standard deviations for the accepted thresholds were estimated at several points in the space of the kinetic parameters, and the resulting correlation was represented by a mapping. The mapping is outlined in Figure 6. In Figure 6, the parameters of the distribution were estimated at 7 × 5 points in the space of the kinetic parameters, and the bold lines correspond to the unbiased estimators obtained from the actual measurements. In short, the resulting kinetic parameters should be reasonably close to all the bold lines. The estimation of the kinetic parameters of the stochastic primary nucleation is outlined in Figure 7, and the estimated parameters are also listed in Table 5. When the dimensions of the parameters of the distribution are less than that of the kinetics, the estimators are undetermined, as is marked by the lines in Figure 7. When they are equal, the estimators are determined uniquely, as is marked by the circle and the square. When those of the distribution are more than that of the kinetics, the estimators are overdetermined, and approximate solutions are obtained, as is marked by the triangle. In this study, the kinetic parameters were defined by Equation (31) and the parameters of the distribution by Equation (30). These parameters can be changed, removed, or added. For example, in order to add the activation energy of the primary nucleation to the kinetic parameters, other parameters of the distribution with a low degree of multicollinearity should be added to the current parameters of the distribution, such as the ones obtained under the operating conditions with other cooling rates or with other initial temperatures. In addition, the confidence intervals of the kinetic parameters can be estimated based on those of the parameters of the distribution. of the stochastic primary nucleation is outlined in Figure 7, and the estimated parameters are also listed in Table 5. When the dimensions of the parameters of the distribution are less than that of the kinetics, the estimators are undetermined, as is marked by the lines in Figure 7. When they are equal, the estimators are determined uniquely, as is marked by the circle and the square. When those of the distribution are more than that of the kinetics, the estimators are overdetermined, and approximate solutions are obtained, as is marked by the triangle.   of the stochastic primary nucleation is outlined in Figure 7, and the estimated parameters are also listed in Table 5. When the dimensions of the parameters of the distribution are less than that of the kinetics, the estimators are undetermined, as is marked by the lines in Figure 7. When they are equal, the estimators are determined uniquely, as is marked by the circle and the square. When those of the distribution are more than that of the kinetics, the estimators are overdetermined, and approximate solutions are obtained, as is marked by the triangle.

Validation
The simulated and observed cumulative distribution functions of latent periods are illustrated in Figure 8, where the secondary nucleation is regarded as stochastic in the simulations, and in Figure 9, where the secondary nucleation as stochastic. In Figures 8 and 9, the LCB and UCB are the lower and upper confidence bounds of an observed function, respectively. In Figure 8b,c, the simulated and observed ones are assumed to coincide reasonably, which might be attributed to the trueness of the estimated parameters. On the other hand, In Figure 8a,d, the differences between the simulated ones and the observed ones are significant, which might be attributed to the large standard deviations mentioned in the previous subsection. Moreover, there is no significant difference between Figures 8  and 9. The p-values of the two-sample Kolmogorov-Smirnov test are also listed in Table 6, which implies that the ignorance of the stochastic behavior of the secondary nucleation makes the larger difference between the simulated distribution and the observed one for the accepted thresholds. distribution by Equation (30). These parameters can be changed, removed, or added. For example, in order to add the activation energy of the primary nucleation to the kinetic parameters, other parameters of the distribution with a low degree of multicollinearity should be added to the current parameters of the distribution, such as the ones obtained under the operating conditions with other cooling rates or with other initial temperatures. In addition, the confidence intervals of the kinetic parameters can be estimated based on those of the parameters of the distribution.

Validation
The simulated and observed cumulative distribution functions of latent periods are illustrated in Figure 8, where the secondary nucleation is regarded as stochastic in the simulations, and in Figure  9, where the secondary nucleation as stochastic. In Figures 8 and 9, the LCB and UCB are the lower and upper confidence bounds of an observed function, respectively. In Figure 8b,c, the simulated and observed ones are assumed to coincide reasonably, which might be attributed to the trueness of the estimated parameters. On the other hand, In Figure 8a,d, the differences between the simulated ones and the observed ones are significant, which might be attributed to the large standard deviations mentioned in the previous subsection. Moreover, there is no significant difference between Figures 8  and 9. The p-values of the two-sample Kolmogorov-Smirnov test are also listed in Table 6, which implies that the ignorance of the stochastic behavior of the secondary nucleation makes the larger difference between the simulated distribution and the observed one for the accepted thresholds.  In this study, the agglomeration, breakage, and the heat of crystallization were not considered in the mathematical model. However, the agglomeration and breakage were reported to be significant at a high magma density [46,47], and the heat of the crystallization is assumed to have greatly affected the latent period at j' = 3 (10% of max.) in Figure 3. In order to improve the accuracy of the estimation, these factors should be added to the mathematical model in future works.

Conclusions
The kinetic parameters of the stochastic primary nucleation were estimated for Arg crystallization. First, the kinetic parameters of the secondary nucleation and growth were determined by preliminary experiments, where the secondary nucleation and growth were regarded as deterministic. Secondly, in order to estimate the remaining kinetic parameters of the primary nucleation, the mean and standard deviation of the latent periods for the total number of crystals to be increased to a predetermined threshold were estimated and correlated with the kinetic parameters of the primary nucleation by means of a Monte Carlo simulation, where the primary nucleation and secondary nucleation were regarded as stochastic. Thirdly, in order to obtain the actual distribution of the latent periods, the latent period was repeatedly measured with FBRM and FT-IR-ATR. Fourthly, the means and the standard deviations of the obtained latent periods for the two thresholds were input into the inverse mapping, and the kinetic parameters of the stochastic primary nucleation were estimated as the outputs. Fifth, the estimated kinetic parameters were validated using the stochastic integrals and the cumulative distribution functions. Consequently, the simulated and observed functions were assumed to coincide reasonably for the two thresholds used in the estimation. However, the differences between the simulated ones and the observed ones were significant for the thresholds that were not used in the estimation, which might be attributed to the agglomeration, breakage, heat of crystallization, and a low sensitivity of the instruments for the small In this study, the agglomeration, breakage, and the heat of crystallization were not considered in the mathematical model. However, the agglomeration and breakage were reported to be significant at a high magma density [46,47], and the heat of the crystallization is assumed to have greatly affected the latent period at j' = 3 (10% of max.) in Figure 3. In order to improve the accuracy of the estimation, these factors should be added to the mathematical model in future works.

Conclusions
The kinetic parameters of the stochastic primary nucleation were estimated for Arg crystallization. First, the kinetic parameters of the secondary nucleation and growth were determined by preliminary experiments, where the secondary nucleation and growth were regarded as deterministic. Secondly, in order to estimate the remaining kinetic parameters of the primary nucleation, the mean and standard deviation of the latent periods for the total number of crystals to be increased to a predetermined threshold were estimated and correlated with the kinetic parameters of the primary nucleation by means of a Monte Carlo simulation, where the primary nucleation and secondary nucleation were regarded as stochastic. Thirdly, in order to obtain the actual distribution of the latent periods, the latent period was repeatedly measured with FBRM and FT-IR-ATR. Fourthly, the means and the standard deviations of the obtained latent periods for the two thresholds were input into the inverse mapping, and the kinetic parameters of the stochastic primary nucleation were estimated as the outputs. Fifth, the estimated kinetic parameters were validated using the stochastic integrals and the cumulative distribution functions. Consequently, the simulated and observed functions were assumed to coincide reasonably for the two thresholds used in the estimation. However, the differences between the simulated ones and the observed ones were significant for the thresholds that were not used in the estimation, which might be attributed to the agglomeration, breakage, heat of crystallization, and a low sensitivity of the instruments for the small number of crystals. Finally, the considerations to stochastic secondary nucleation were validated using the stochastic integrals and the two-sample Kolmogorov-Smirnov test. The resulting p-values calculated from stochastic secondary nucleation for the two thresholds used in the estimation were relatively large compared to those calculated from deterministic secondary nucleation, which implied that the ignorance of the stochastic behavior of secondary nucleation made the larger difference between the simulated distribution and the observed one.
In this study, the primary nucleation coefficient and order were estimated in the main experiments. These parameters can be changed, removed, or added, and then, the changed parameters can be estimated in the same way as in this work. For example, the activation energy of the primary nucleation will be added to the kinetic parameters and estimated in a future work. Moreover, in order to improve the accuracy of the estimation, the agglomeration, breakage, and the heat of crystallization should be added to the mathematical model in a future work.