Likelihood Function through the Delta Approximation in Mixed SDE Models

: Stochastic differential equations (SDE) appropriately describe a variety of phenomena occurring in random environments, such as the growth dynamics of individual animals. Using appropriate weight transformations and a variant of the Ornstein–Uhlenbeck model, one obtains a general model for the evolution of cattle weight. The model parameters are α , the average transformed weight at maturity, β , a growth parameter, and σ , a measure of environmental ﬂuctuations intensity. We brieﬂy review our previous work on estimation and prediction issues for this model and some generalizations, considering ﬁxed parameters. In order to incorporate individual characteristics of the animals, we now consider that the parameters α and β are Gaussian random variables varying from animal to animal, which results in SDE mixed models. We estimate parameters by maximum likelihood, but, since a closed-form expression for the likelihood function is usually not possible, we approximate it using our proposed delta approximation method. Using simulated data, we estimate the model parameters and compare them with existing methodologies, showing that the proposed method is a good alternative. It also overcomes the existing methodologies requirement of having all animals weighed at the same ages; thus, we apply it to real data, where such a requirement fails.


Introduction
In previous works [1][2][3], we described the individual growth of animals subject to random fluctuations in the environment and study estimation, prediction, and optimization problems with applications to cattle weight data. Considering M animals, we used the following general SDE model: where Y i (t) = h(X i (t)) is the modified weight by a transformation function h, a known monotonous continuously differentiable function of the real weight X i (t) of the animal i at age t, and y i,0 = h(x i,0 ), where x i,0 is the assumed known size of animal i at an initial age of observation t i,0 . The parameter β > 0 is the growth coefficient, and α is the mean asymptotic modified size towards which the mean modified size converges as t → +∞; we denote the corresponding real asymptotic size by A = h −1 (α). The intensity of the effect of environmental random fluctuations on growth is measured by the parameter σ > 0, being W i (t) (i = 1, . . . , M) independent realizations of the standard Wiener process. A transformation of the size leading to more general models was also suggested by Reference [4] with applications to tree growth. Adequate choices of the h function lead to stochastic versions of well-known growth models. For instance, the monomolecular model corresponds to h(x) = x, the Bertallanfy-Richards model corresponds to h(x) = x c (with c > 0), the Gompertz model corresponds to h(x) = ln x, and the logistic model corresponds to h(x) = 1/x, but the theoretical treatment is valid for any monotonous C 1 function h.
Here, for comparison purposes, we will illustrate with h(x) = ln x, corresponding to the stochastic Gompertz model. It is natural to think that the model parameters may vary from animal to animal, and so SDE models with fixed parameters as the one presented in (1) may not be suitable models for these applications. Parameter estimation for models where the parameters are considered random, known as mixed models or mixed-effects models, is presented in References [1,[5][6][7][8][9][10]. A review on the asymptotic inference of SDE mixed models can be found in Reference [11].
For example, in References [1,7], the mixed model considers that different individuals may have different values of A and, consequently, different values of α = h(A), i.e., the case where the average asymptotic weight varies randomly from animal to animal has been considered. In this particular case, in Reference [1], it was considered that α was a random variable, independent of W t , with a Gaussian distribution with mean µ and variance θ 2 , and the maximum likelihood estimation method was applied to estimate the model parameters. In this case, the likelihood function can be explicitly obtained, but it is extremely difficult or impossible to obtain a closed-form expression for the likelihood function in other cases.
More recently, we considered either α or β random, following a Gaussian distribution, and, for these cases, we solved the integral that appears in the likelihood function through approximation methodologies, such as the Laplace and the delta approximation methods (to appear in a forthcoming paper [12]). In References [8,10], for the general case where it is not possible to obtain a closed-form expression for the likelihood function, as is the case of random β, a numerical approximation based on an Hermite expansion was applied, whereas, in Reference [9], in addition to an Hermite expansion, a Gauss-Hermite quadrature was also applied, and the parameters of the SDE mixed model were estimated by the maximum likelihood method. In References [5,6], for mixed-models with linear drift term, when a closed-form expression for the likelihood function is not possible, a different approximation technique is used, based on a discretized version of the continuous-time data likelihood function.
In this work, we consider that both parameters α and β are random variables, and our main contribution is to extend our delta approximation method to this mixed model with two random parameters and compare its performance with other previously proposed estimation methods. The delta approximation method is inspired on the classical statistical delta method, which is properly adapted to serve a quite different purpose. The delta approximation method allows us to approximate the parameter estimates, through numerical maximization of the approximate likelihood function. This approximation methodology, to the best of our knowledge, is the first method to derive simple closed-form expressions for the approximated likelihood function when both α and β are random, allowing anyone to use this method. Notice also that the existing methods for SDE mixed models [5,6,8], when it comes to their numerical implementations, assume that the age vector of the observations is the same for all trajectories and, in some cases, even require equidistant ages of observation. When using real data of cattle weights, these assumptions are not adequate, since the animals are not weighed at the same time instants (ages), nor even with the same elapsed times between weighings. Our proposed methodology allows parameter estimation in scenarios, quite common in applications, where such restrictions on the data structure are not satisfied.
In order to compare our method with existing ones, particularly with the one provided by the MsdeParEst R package (see References [5,6,13]), we worked with simulated cattle weight data with 50, 500, and 5000 animals with the same age vector and with consecutive weights taken at equidistant intervals. We also estimated the model parameters with our method using a real dataset of 16,029 animals with very heterogeneous ages of observations. This paper is organized as follows. In Section 2, we present the SDE models with fixed parameters, their main properties, and the respective likelihood function, proceeding to the extension to mixed stochastic differential equations models. In Section 3, we develop the delta approximation applied to the likelihood function for the case where both parameters α and β are independent Gaussian distributed random variables. In Section 4, we present the application for both simulated and real datasets and, whenever possible, compare the results with the existing methods. Based on those results, we present practical recommendations on how to deal with real datasets on Section 5 and end with the main conclusions in Section 6.

Stochastic Differential Equations Models
Considering data from M individuals, we will denote the size (some measure of weight, volume, height, length, etc.) at age t of the ith individual (i = 1, . . . , M) by X i (t). If the individual is growing in a randomly fluctuating environment, working with a modified size Y i (t) = h(X i (t)), where the transformation h is a monotonous continuously differentiable function, we can describe the evolution of individual growth through an SDE of the form (1). This is a variant of the Ornstein-Uhlenbeck model, also called the Vasicek model in the context of interest rate dynamics [14]. The model solution Y i (t) is a homogeneous diffusion process with drift coefficient a(y) = β(α − y) and diffusion coefficient b(y) = σ 2 , given by (see, for instance, Reference [15]). Let t i,j (i = 1, . . . , M, j = 1, . . . , n i ) be the age of the jth observation of individual number i and let Y i,j = Y i (t i,j ) = h(X i (t i,j )) be the corresponding modified weight according to model (1). For each individual i (i = 1, . . . , M), denote its age vector of observations (which may differ from individual to individual) by t i = t i,0 , t i,1 , . . . , t i,n i , the corresponding vector of modified sizes by Y i = Y i,0 , Y i,1 , . . . , Y i,n i , and the observed value of Y i by y i = y i,0 , y i,1 , . . . , y i,n i . We assume t i,j−1 < t i,j and make E i,j = e −(t i,j −t i,j−1 ) . We see that, for Y i,j conditioned on Y i,j−1 = y i,j−1 , the transition distribution for animal i is Gaussian: In Reference [1], we applied the maximum likelihood estimation method to estimate the parameter vector p = (α, β, σ). From (3), using the fact that Y i (t) is a Markov process, we know that, given Y i,0 = y i,0 (assumed known), the Y i joint probability density function for individual i is given by the product of the transition densities between consecutive observation times of animal i; thus, it takes the form By independence among individuals, we obtain the likelihood function for the M animals: The maximum likelihood estimate of the parameter vector p is obtained by maximization of (5) or of the log-likelihood function LL Y (α, β, σ) = ln L(α, β, σ).
The maximum likelihood estimators are asymptotically Gaussian with mean vector p and variance-covariance matrix V = F −1 , where F is the Fisher information matrix with elements given by F i,j = −E ∂ 2 L(p)/∂p i ∂p j . We can estimate V by the inverse of the empirical information matrixF, with elementsF i,j = −∂ 2 L(p)/∂p i ∂p j . From these values, we can obtain the approximate confidence bands for the parameters. For these type of models, in terms of estimation methods, we also developed and applied parametric and non-parametric bootstrap methods [1,3]. Since the asymptotic confidence intervals obtained from the empirical Fisher information matrix may be quite unreliable for small sample sizes, the bootstrap methods can be used in such cases.
In References [1,16], non-parametric estimation methods were developed in order to estimate the drift and diffusion coefficients of a stochastic differential equation model for the case of non-equidistant data. For our application on cattle weight data, we had been working with models with specific functional forms for the drift and the diffusion coefficients, for example, in the case of model (1), with a drift coefficient of linear form a(y) = β(α − y) and a constant diffusion coefficient b(y) = σ 2 . These non-parametric methods are useful to assess whether our specific choice of functional forms is appropriate for our data or whether some alternative functional forms for these coefficients are suggested.
Recently, weighted maximum likelihood estimation methods were studied and adapted to overcome one very common limitation in the cattle weight data applications, related to the fact that animals are usually not weighed very frequently, and a scarce number of weight observations exists for older ages. In the weighted maximum likelihood estimation method, the weights are built such that the times elapsed between consecutive observations are considered in the likelihood function [3].
We described the general SDE model (1) for the complete growth curve of the animals where the model's parameters α, β, and σ are assumed common to all individuals. Here, we consider a different generalization to account for the fact that it is natural to think that different animals, due to their specific genetics and other characteristics, may have different values of the parameters. So, in this paper, we will consider the situation where different individuals may have different randomly assigned parameters.
In References [1,5,6,9,10,17], it has been shown that, to consider at least one of the two parameters of the drift term, α or β, as random variables, the likelihood function can be obtained from the transition density function conditioned on the respective random parameter.
We now briefly review how to do it in a more general setting. Let b be the ddimensional vector of parameters that vary randomly among animals and assume that the distribution of b among animals has probability density function (p.d.f.) p B (b|Ψ), where Ψ is the parameter vector that characterizes this distribution and needs to be estimated. Assuming independence among the animals, the M parameter vectors b i of the different animals i (i = 1, . . . , M) are independent identically distributed random variables with common p.d.f. p B and assume the b i (i = 1, . . . , M) are also independent of the Wiener processes that characterize the environmental conditions under which the animals are growing. Let Λ be the vector of the remaining model parameters (the ones not involved in p B ), assumed to be common to all animals. The likelihood function for M trajectories (animals) is given by The case of a single random parameter b i = (α i ) has already been studied for when α i ∼ N(µ, θ 2 ). When we have the special situation of a time vector of observations t i ≡ t = (t 0 , t 1 , . . . , t n ), i = 1, . . . , M common to all animals, we can see it in Reference [8] (for the particular situation of µ = 0 and uniform time spacing t j − t j−1 ≡ δ) and in References [6,7]. The general situation, with no such restrictions, can be seen in Reference [1]. In this case, it is possible to explicitly compute the integral in the likelihood function, resulting in a final closed-form expression for this function. This is shown in Reference [1], where the log-likelihood function for all animals LL Y (µ, θ, β, σ) = ln L(µ, θ, β, σ) is given by However, despite the existence of a closed-form expression for the likelihood function for this particular case, we also applied approximation methods (Laplace and delta approximations-to appear in a forthcoming paper [12]), showing that the approximation methods also provide very good results when compared with the exact method. Unfortunately, the integral in (7) does not always have a closed-form expression, for example, when the random parameter is β, corresponding to b i = (β i ), and, in such cases, the approximation methods are good alternatives. In the forthcoming paper [12]), we used the Laplace and delta approximations to obtain closed-form approximate expressions of the likelihood function for this case, also with very good results.
In the following section, we present the study of the case where both α and β are considered random variables and propose the application of the delta approximation method to the integral in (7).

Mixed Model for Two Random Effects
The mixed model where both α and β are random variables can be written as where we assume that the 2M random variables , the fixed effects parameter vector becomes the 1 × 1 vector Λ = (σ), the random effects parameter vector is Ψ = (µ, λ, θ, ω), and the p.d.f. of the random effects p B is a bivariate Gaussian distribution.
To obtain the likelihood function in (7) for this situation of two random parameters, notice that, due to the assumed independence structure, with To determine p Y i (y i |α i , β i , Λ) in (10), notice that, due to the Markov property of each individual trajectory i when conditioned on α i and β i , we have with the transition densities between consecutive observation ages of the animal i trajectory given by We will apply the delta approximation to the integral in (7), which, in this case, is given by (10) and does not have an explicit solution. This approximation method allows us to obtain simpler and closed-form expressions for the likelihood function. The method is also adapted to general age vectors of observations, allowing for estimation of the parameters of the models even when the different animals have their weights observed at different ages, and those ages may be non-equidistant, which, as far as we are aware, is theoretically possible but is not currently available in the numerical implementations of other existing approximation methods.
Let us denote expression (14) Taking into account the previous expressions, the likelihood function (7) can be writen as an expectation w.r.t. the bivariate Gaussian distribution with density p B : The mathematical expectation of u i (α i , β i ) in (16) can be approximately obtained by applying the delta approximation, obtained by adapting the delta method. The delta approximation consists of using the second-order Taylor series expansion about the point (µ, λ) We now apply mathematical expectations to this expression. Noticing that and, due to independence between α i and Finally, the log-likelihood function, LL Y (µ, θ, λ, ω, σ) = ln L(µ, θ, λ, ω, σ), can be obtained as By the chain rule, the second-order derivative of where the first and second-order derivatives with respect to α i are given by Similarly, the second-order derivative of u i (α i , β i ) with respect to β i is given by where the first two derivatives of g i (α i , β i ) with respect to β i are given by and Replacing these derivatives in the expression (19), we obtain an approximate expression for the log-likelihood function LL Y (µ, θ, λ, ω, σ, ρ) = ln L(µ, θ, λ, ω, σ, ρ) for the case of two random Gaussian parameters. Using it, in the next section, we are going to apply this approach to obtain approximate maximum likelihood parameter estimates for the mixed stochastic Gompertz model with two random parameters using cattle weight data.

Results
The main interest with the methodology proposed in the previous sections is to have a reliable estimation method that can be applied to real animal weight data, where the animals' weights are often not taken at the same age instants.
In our application, we worked with real cattle weight data provided by the Associação de Criadores de Bovinos Mertolengos (ACBM), which performs the growing and finishing phases of young Mertolengo males, and by associated breeders, from the Alentejo region in Portugal. The available dataset contains a total of 96,204 observations of the weight (in kg) of 16,029 Mertolengo cattle males, where each animal has several observations with a minimum of 3 and a maximum of 33 weights at ages varying from birth until a maximum age that ranges between 0.2 and 16 years old. This is a case where, indeed, each animal has its weight measurements taken at different (and even non-equidistant) age instants, varying from animal to animal.
We obtained the estimation results implementing the proposed delta approximation method (DA) in the software R project [18]. However, since it is a new method, we first will compare the results obtained with this DA method with the estimation methods available in the R package MsdeParEst described in Reference [13], which we will call MPE methods. The package includes, when closed-form likelihood function expressions are not possible, techniques based on a discretized version of the continuous-time data maximum likelihood developed in References [5,6]. The main drawback of this package (and, to the best of our knowledge, of other similar available R packages [19,20]) is that all observations must be measured at the same age instants for all animals, and, in some cases, the age instants are even required to be equidistant. Therefore, in order to compare the performance of the proposed approximation method DA with the methods available in the literature (in particular, the MPE method used here), we would also need datasets where weights are observed at the same equidistant age instants for all the animals. Whenever possible, we will also compare the DA and MPE methods, which consider approximations for the likelihood function, with existing estimation methods that use exact closed-form expressions for the likelihood function, in particular, when the likelihood function assumes both parameters fixed or Non-Mixed (which we will call the NMSDE method, presented in Reference [1]) or, if appropriate, when the likelihood function assumes just the parameter α as random (which we will call the Exact(α) method, presented in Reference [1]).
For this purpose, we worked with simulated datasets of equidistant monthly weights for M animals, since birth until four years of age, totaling M*49 observations. The animal's weights were simulated based on the stochastic Gompertz model (Y(t) = h(X(t)) = ln(X(t))). We simulated four datasets: The advantage of using simulated data, besides the possibility of comparing the different methods due to its common age vector of observations, is the knowledge of the true parameter values, which allows us to compare the performances of the different methods when they can be applied. We first consider simulated datasets for weights of M = 5000 animals to evaluate the behavior of the methods for a large number of trajectories close to an asymptotic regimen. However, in applications, we usually do not have datasets with so many animals available, and it is important to evaluate the performance of the different methods for smaller datasets. For this reason, we also present a comparison of the main methods, considering datasets with M = 500 and M = 50 animals. To distinguish them, we index the simulated datasets by the number of animals M.
Tables 1-4 presents the results for each of the datasets, D 5000 (α, β), D 5000 (α), D 5000 (β), and D 5000 , simulated under the four stochastic Gompertz mixed and fixed models. In each table, we present the results obtained using the different DA methods, DA(α, β) (the one that assumes both parameters α and β random and is described in Section 3), and DA(α) and DA(β) (the ones that assume just α random or just β random as described in Reference [12]), as well as the estimates obtained by the different MPE methods, MPE(α, β), MPE(α), and MPE(β) (again, when we consider random both or just one of the parameters). We also present, for comparison purposes, the results obtained when using the NMSDE estimation method (which assumes both parameters to be fixed).
In this way, we can assess what happens when an appropriate estimation method is applied (for instance, when applied to the D 5000 (α) dataset, the estimation method MPE(α) and the estimation method DA(α, β) are appropriate) and when an inappropriate estimation method is applied (for instance, when applied to the D 5000 (α, β) dataset, the MPE(α) method is inappropriate since it cannot estimate the parameters involved in the random β); this may be important since, in real non-simulated data, we may not know which parameters are random or not. Of course, DA(α, β) and MPE(α, β) are appropriate for all datasets, but, when the data has one or both parameters as non-random, they are overparametrized and, therefore, may not be as accurate as non-overparametrized methods.
In Tables 1-4, we have also underlined the headings of the estimation methods that are appropriate to analyze the corresponding dataset, in the sense of being able to estimate all the parameters of the model used to simulate the dataset. Appropriate methods include the ones that use, in the likelihood function or its approximation, the same random parameters that were used in the dataset generation (these would be the most appropriate methods when we know beforehand which parameters are random) but include, as well, overparametrized methods that also use additional random parameters in the likelihood function or its approximation.
The maximum likelihood estimates and the approximate 95% confidence bands based on the inverse of the empirical Fisher information matrix are presented only for the DA methods and the NMSDE method but not for the MPE methods since the confidence intervals of the estimates are not provided by the R package MsdeParEst. Instead of presenting the results for the overall mean of the modified asymptotic weight µ (which is =α when α is a fixed parameter), we present them in terms of the corresponding actual weight value h −1 (µ).
The tables also display the log-likelihood function values computed at the estimated parameter values. We display the log-likelihood values LL X in terms of the actual weights X, which can be easily obtained by a simple conversion using the function h from the corresponding log-likelihood values LL Y (expressed in terms of the modified weights Y we have been working with so far in all computations). However, although the displayed log-likelihood values use the exact log-likelihood known expressions for the NMSDE and the Exact(α) methods, for the DA methods, they use the approximate log-likelihood expressions of those methods, and the same is possibly happening with the values delivered by the MsdeParEst Package for the MPE methods. Since these approximations involve some incomparable errors, the displayed log-likelihood values LL X should, with rare exceptions, not be used for comparisons in which an approximate method is involved.
When considering, in Table 1, the D 5000 (α, β) dataset, the appropriate estimation methods are the ones assuming α and β random, the DA(α, β) method we have proposed, and the MPE(α, β) method. We can observe that the DA(α, β) method provides a lower bias for all parameters than the MPE(α, β) method, noticing that, in the MPE(α, β) method, the estimate of ω is extremely biased, close to zero. The use of an inappropriate method should obviously be avoided, but, when using real data, we often do not know beforehand exactly which are the random parameters and may make a wrong assumption, leading to the use of an inappropriate estimation method. These simulated datasets, in which we know exactly which parameters are random, give us the opportunity to assess the consequences of a wrong assumption on the random parameters when dealing with real data. Let us look at this issue, beyond the obvious fact that an inappropriate method cannot estimate some of the parameters. When comparing the appropriate DA(α, β) method with the inappropriate (for this dataset) NMSDE method, the former method leads to less biased estimates. When comparing the appropriate DA(α, β) with the inappropriate (for this dataset) DA(α) and DA(β) methods, the inappropriate methods estimate the standard deviation of their own random parameter (respectively, the standard deviation θ of the random α and the standard deviation ω of the random β) better than the appropriate method, which underestimates both standard deviations. On the estimation of the noise intensity parameter σ, the appropriate DA(α, β) method shows the better performance, and it also outperforms the inappropriate DA(β) method on the estimation of the means h −1 (µ) and λ, but these means are curiously better estimated by the inappropriate DA(α) method. As for the inappropriate MPE(α) and MPE(β) methods, besides the obvious impossibility of estimating one parameter, they have a behavior that is not too different from the MPE(α, β) appropriate method. Table 1. Results for the simulated dataset of 5000 animals when both α and β are random-D 5000 (α, β). Besides the true parameter values used in the simulations, the table shows their maximum likelihood estimates and, when available, approximate 95% confidence bands for the different estimation methods: delta approximation methods, DA(α, β), DA(α), and DA(β); NMSDE method (which wrongly assumes α and β fixed and works with the corresponding exact likelihood); and MPE methods, MPE(α, β), MPE(α), and MPE(β). For each method, the indicated parameters are the ones taken as random in its likelihoods or approximate likelihoods. Appropriate methods (i.e., those that allow the estimation of all the dataset parameters) are underlined. Notice that the LL X values of approximate methods may differ from the true values of the log-likelihood function at the parameter estimated values.  Table 2 shows the results for the D 5000 (α) dataset. For this case, we included the Exact(α) method, where the maximum likelihood is obtained exactly by the closed-form expression (8), and that is certainly the best choice method for this dataset. The methods DA(α, β) and DA(α) are both appropriate for this dataset, but the first is overparametrized, while the latter is the most appropriate of the two, and we expect it to be more accurate. Still, it is interesting to notice that these two methods provide exactly the same parameter estimates, except, naturally, for ω = 0 ("parameter" obviously out of the DA(α) method).
Comparing the MPE(α, β) with the MPE(α) methods, they do not give exactly the same parameter estimates, but their estimates are very close. As expected, the most appropriate method for this dataset, the Exact(α) method, provides slightly better estimates than the DA(α) and the DA(α, β) methods, and than the MPE(α) and the MPE(α, β) methods. The estimates for h −1 (µ) and θ are better when using the appropriate MPE methods than the ones obtained with the corresponding DA methods, but the reverse happens for the estimates of the mean λ and the standard deviations ω = 0 of parameter β. The inappropriate methods DA(β) and MPE(β) give worse estimates than the appropriate methods of the same type. The use of the inappropriate (for this data) NMSDE method performs quite badly in the estimation of h −1 (µ). Notice that the dataset was simulated for fixed β, i.e, ω = 0, and this is better captured by the DA method. Table 2. Results for the simulated dataset of 5000 animals when α is random, but β is fixed (ω = 0) -D 5000 (α). Besides the true parameter values used in the simulations, the table shows their maximum likelihood estimates and, when available, approximate 95% confidence bands, for the different estimation methods: DA(α, β), DA(α), DA(β), NMSDE, MPE(α, β), MPE(α), MPE(β), and Exact(α) (the most appropriate since it uses the exact log-likelihood function when only α is random). Appropriate methods are underlined. Notice that the LL X values of approximate methods may differ from the true values of the log-likelihood function at the parameter estimated values. Since the LL X values for the approximate methods are only approximations, we cannot perform a likelihood ratio test for the randomness of the β parameter (null hypothesis ω = 0 corresponding to a non-random β). However, if one knows (or assumes) that β is fixed, one can perform a likelihood-ratio test on whether α is fixed or random (null hypothesis θ = 0 corresponding to a fixed α) by using the log-likelihood LL X values of the NMSDE method and the Exact(α) method, which are exact values; the result is the rejection of the null hypothesis at the usual significance levels (p-value < 0.001).
For the case of the D 5000 (β) dataset, Table 3 shows that, again, the appropriate (for this dataset) DA(α, β) (which is overparametrized) and DA(β) methods provide very similar parameter estimates. Curiously, the inappropriate NMSDE model performs surprisingly well. In this case, the MPE methods, both appropriate and inappropriate ones, present very poor estimates. We also highlight that the estimates of the standard deviations θ = 0 and ω = 0.30 using the appropriate MPE methods severely fail, suggesting even that α is random and β is not when the reverse is the true situation, while the DA(α, β) method very well captures the non-random nature of parameter α, and both DA(α, β) and DA(β) methods capture the random nature of β, although somewhat underestimating its standard deviation ω. Table 3. Results for the simulated dataset of 5000 animals when β is random, but α is fixed (θ = 0) -D 5000 (β). Besides the true parameter values used in the simulations, the table shows their maximum likelihood estimates and, when available, approximate 95% confidence bands, for the different estimation methods: DA(α, β), DA(α), DA(β), NMSDE, MPE(α, β), MPE(α), and MPE(β). Appropriate methods are underlined. Notice that the LL X values of approximate methods may differ from the true values of the log-likelihood function at the parameter estimated values. In Table 4, the estimates using the D 5000 dataset, with both parameters α and β fixed, i.e., with θ = 0 and ω = 0, are presented. Now, all the methods are appropriate. However, with the exception of the NMSDE method, which, in this case, is the most appropriate one, they are overparametrized. It is quite interesting to note that the different DA methods give practically coincidental parameter estimates among them and are practically coincidental with the parameter estimates of the NMSDE model. This reveals robustness of the proposed DA method, which also captures the non-random character of α and β very well. The MPE methods give reasonable results, but they are not so good.
In this case, not only the NMSDE method and the DA methods give almost coincidental parameter estimates, but the LL X value of the the NMSDE method (which is the exact maximum of the exact log-likelihood function) is also almost coincidental with the LL X values for the DA methods; so, these LL X DA values, which are approximations, are likely to be good approximations. If these values were exact instead of just approximations, a likelihood ratio test for θ = 0 (i.e, for the randomness of α) and/or the significance of ω = 0 (i.e., for the randomness of β) would obtain non-significant results. Table 4. Results for the simulated dataset of 5000 animals when α and β are both fixed (θ = ω = 0) -D 5000 . Besides the true parameter values used in the simulations, the table shows their maximum likelihood estimates and, when available, approximate 95% confidence bands, for the different estimation methods: DA(α, β), DA(α), DA(β), NMSDE, MPE(α, β), MPE(α), and MPE(β). Appropriate methods are underlined. Notice that the LL X values of approximate methods may differ from the true values of the log-likelihood function at the parameter estimated values. Since the typical situation when dealing with real data is not knowing beforehand whether α and/or β are random or fixed, and, since we have seen that, among the approximate methods DA, the method DA(α, β) performs as well or almost as well as other appropriate DA methods (in what concerns the estimation of the common parameters), even when it is overparametrized with respect to the dataset, we will consider from now on, among the three delta approximation methods, only the DA(α, β) method. For similar reasons, among the MPE methods, we will consider from now on just the MPE(α, β) method. For comparison purposes, we also consider the usual NMSDE method, which is the most appropriate method for the fixed-effects model considered in most applications, but it is, of course, inappropriate when our datasets have α random and/or β random.
With the purpose of analyzing the influence of smaller sample sizes on the estimates, Tables 5-8 present the results for the four mixed-and fixed-effects stochastic Gompertz models for the datasets with smaller samples sizes, 50 and 500 animals. We will use the DA(α, β), the MPE(α, β), and the NMSDE estimation methods. For comparison purposes, for the datasets D 50 (α) and D 500 (α), we also present the results of the Exact(α) method.
We can conclude that the same main characteristic features observed in the estimates for the large datasets with 5000 animals still hold. The DA(α, β) method is able to accurately identify the fixed and the random effects, while the MPE(α, β) method usually fails (except, and not clearly, when both effects are fixed).
As expected, when comparing the confidence intervals of the parameter estimates for different number of animals in the datasets (see Tables 5-8 for 50 and 500 animals, and  Tables 1-4 for 5000 animals), their amplitudes decrease as the number of animals in the dataset increases. In general, the MPE(α, β) method have worst estimates than the DA(α, β) method, both for small and large samples, except in estimating the standard deviation θ of the parameter α when this parameter is random.
For the datasets D 50 (α) and D 500 (α), the results obtained with the DA(α, β) method are not exactly the same as the ones obtained by the Exact(α) method, but they are quite close, even very close for the 500 animals dataset D 500 (α).
For the fixed-effects datasets D 50 and D 500 , the estimates of the DA(α, β) method replicate the same results as the NMSDE method, which, for these datasets, is the most appropriate method. The LL X values of the NMSDE method are exact maximum loglikelihood values, and they are again practically equal to the LL X approximate values of the DA(α, β) method, again suggesting that, for fixed-effects datasets, these are good approximations, in which case a likelihood ratio test of θ = ω = 0 would accept this null hypothesis. Table 5. Results for the simulated datasets of 50 and 500 animals when both α and β are random-D 50 (α, β) and D 500 (α, β). Besides the true parameter values used in the simulations, the table shows their maximum likelihood estimates and, when available, approximate 95% confidence bands, for the appropriate estimation methods DA(α, β) and MPE(α, β) and the inappropriate estimation method NMSDE. Notice that the LL X values of approximate methods may differ from the true values of the log-likelihood function at the parameter estimated values.  Table 6. Results for the simulated dataset of 50 and 500 animals when α is random, but β is fixed (ω = 0) -D 50 (α) and D 500 (α  Table 7. Results for the simulated datasets of 50 and 500 animals when β is random, but α is fixed (θ = 0) -D 50 (β) and D 500 (β Finally, in Table 9, we used the real cattle weight dataset of 16,029 Mertolengo cattle males (totaling 96,204 observations) and present the estimates obtained by the DA(α, β) method and by the NMSDE method. The true parameter values are not known, and we cannot use the R package MsdeParEst since the real weight data from the 16,029 animals presents a different age vector of observations for each animal. Analyzing the results of Table 9, we can conclude that, despite the database being large and heterogeneous, the two estimation methods provide very similar estimates. According to our findings obtained above, it is interesting to note that the DA(α, β) method presents clear evidence of random effects on the α parameter and of strong random effects on the β parameter. Although we cannot perform a likelihood ratio test to reject the null hypothesis θ = ω = 0 because the displayed LL X value for the DA(α, β) method is an approximation, the difference of LL X values between this method and the NMSDE method (which displays an exact LL X value) is striking enough to leave doubts about the random nature of α and β. Table 9. Results for the real cattle weight dataset. The maximum likelihood estimates and approximate 95% confidence bands are shown when assuming a mixed model with both α and β random and using the DA(α, β) estimation method and when assuming a non-mixed model (α and β fixed) and using the NMSDE method (which is the classical exact maximum likelihood method under that assumption). Notice that the LL X values of the approximate method DA(α, β) method may differ from the true value of the log-likelihood function at the parameter estimated values.

Recommendations
Taking into account all the analyzed scenarios, it makes sense to present a summary of the advantages and disadvantages of each method and recommendations for its use with a real dataset depending on the number of animals. If one has no information or strong reasons to consider a given parameter as random, one should apply the DA(α, β) method, which, for all datasets considered here, was able to correctly identify which parameters are fixed and which are random.
If one concludes from this preliminary analysis that none of the parameters are fixed (by not having small estimates of their standard deviations θ and ω, complemented by the information given by the approximate confidence intervals of these parameters), the DA(α, β) method is recommended to estimate the mean and standard deviation of the random β, whereas the MPE(α, β) method is recommended to estimate the mean and standard deviation of the random α, either for small and large samples. However, a good compromise will be achieved if one just uses the DA(α, β) method.
In the case the preliminary analysis indicates that ω = 0 (by having a very small estimate of this parameter), we should then apply the Exact(α) method [1], which is designed for the case of α being the only random parameter and uses the exact log-likelihood function (8), to get better estimates of all parameters.
If the preliminary analysis indicates θ = 0 (by having a very small estimate of this parameter), we better remain with the DA(α, β) method (since its estimates are very similar with those obtained under the DA(β) method) and provides better estimates of the mean and standard deviation for the β random effect and also of the fixed effect α than the MPE method or the NMSDE method.
Finally, in the case where the preliminary analysis indicates ω = 0 and θ = 0 (i.e., the dataset only has fixed parameters), we can apply either the most appropriate and classical NMSDE method [1], designed for the fixed-effects model and using the exact loglikelihood function (6), or the DA(α, β) method (which gives basically the same estimates).
Furthermore, the DA method has the big advantage of being usable for all real datasets, with or without equidistant observations and whether or not the animal's sizes have been obtained at the same age instants for all animals.

Conclusions
To describe the individual growth of animals in a randomly varying environment, a general stochastic differential equation (SDE) model was used, and, in order to take into account that the model parameters may vary from animal to animal, which, for instance, occurs due to different individual genetics and other characteristics of the animals, we considered SDE mixed models. Here, we studied the SDE mixed model where both parameters included in the drift term, the parameter α (asymptotic modified weight) and the growth parameter β, were assumed to be Gaussian distributed.
We applied the maximum likelihood estimation method to obtain the estimates for the parameters of the SDE mixed models. In most cases, for this type of model, it is difficult or impossible to obtain a closed-form expression for the likelihood function and approximation methods are used to cope with this issue. In this paper, we propose to use the delta approximation method (DA method) with both α and β random, in which we adapted the delta method, a classic in Statistics, to obtain approximate closed-form expressions for the likelihood function when both parameters α and β are random. Of course, the DA method can also be applied when only one of the two parameters is assumed random.
To evaluate the performance of the proposed method on parameter estimation, we used simulated datasets from different mixed-effects models (with only α random, with only β random, and with both parameters random) and from a fixed-effects model (both α and β fixed), with different numbers of animals (5000, 500, and 50), datasets in which all animals were weighed at the same ages so that we could use all the estimation methods considered here and compare them. Since, in real life, we usually do not know which are the random parameters, wrong assumptions may be made on that issue. In order to evaluate the consequences of wrong assumptions on what parameters are random, we included in the comparisons also the methods designed under such assumptions. We compared the DA method (in its variants of assuming both or just one parameter random) with an existing method specifically designed for mixed-effects models (also with the same variants), referred to here as the MPE method, which is provided by the R package MsdeParEst. We also included in the comparisons the estimation methods for which loglikelihood expressions in closed-form are available, namely the NMSDE method and, in some cases, the Exact(α) method, which are designed, respectively, for the fixed-effects model and the mixed-effects model with only α random.
The results of these comparisons show a very good performance of the proposed DA method with both α and β random, being globally the best method for all the simulated scenarios. This method, unlike the MPE methods, was able to correctly identify, in each of the settings, the fixed and the random parameters. It gives generally better parameter estimates than the MPE method and the estimates are quite close to the true parameter values, with the exception of the standard deviations of random effects, which were somewhat underestimated. The performance of the proposed DA method was confirmed when using a simulated dataset with both parameters fixed, since it provided the same results as the ones obtained when using the exact NMSDE method.
For this type of SDE mixed models, it is usual to find in the literature estimation methods developed under the assumption of having a unique (often also evenly spaced) age vector of observations common to all individuals, and this is also required when using available R packages, such as the MsdeParEst package used in the MPE method. The delta approximation (DA) method has the advantage of not requiring such restrictions, so it can be used in real situations where such restrictions usually do not hold. In our real data application, we are precisely in this situation, and we applied the estimation methods to real weights of Mertolengo cattle males from a large and heterogeneous dataset, reaching the conclusion that the proposed DA method identifies both parameters α and (more strikingly) β as being random.
This approach revealed to be a very interesting alternative to the available estimation methods for SDE mixed models.
As future work, we are undergoing the study of the case where we will incorporate the genetic factors of the animals into the model to explain part of the variation in the random parameters, and we intend to implement the current methods in an R package.