Combining Bayesian Calibration and Copula Models for Age Estimation

Accurately estimating and predicting chronological age from some anthropometric characteristics of an individual without an identity document can be crucial in the context of a growing number of forced migrants. In the related literature, the prediction of chronological age mostly relies upon the use of a single predictor, which is usually represented by a dental/skeletal maturity index, or multiple independent ordinal predictor (stage of maturation). This paper is the first attempt to combine a robust method to predict chronological age, such as Bayesian calibration, and the use of multiple continuous indices as predictors. The combination of these two aspects becomes possible due to the implementation of a complex statistical tool as the copula. Comparing the forecasts from our copula-based method with predictions from an independent model and two single predictor models, we showed that the accuracy increased.


Introduction
Age estimation for living individuals is a common problem in legal medicine. It has a considerable importance in the context of immigration, particularly for undocumented individuals seeking asylum in European countries.
As forced migration continues toward European countries, meaning more unaccompanied young people without documents, the necessity to determine the age of young people has increased. In fact, the assessment of age in children and adolescents is critical for children both to be protected appropriately, and to receive the social and health interventions they need and are entitled to. Therefore, the age estimation should be as accurate as possible. The methods proposed to estimate chronological age from certain maturity indices are mainly based on dental changes or skeletal characteristics of the subjects assessed by radiographs [1][2][3]. In most of the studies devoted to age estimation procedure, the age is estimated using a linear regression model. This approach leads to biased estimates of chronological age [4,5], which is systematically overestimated in younger subjects and underestimated in older ones [6]. As proposed by Lucy and Pollard [5], the use of classical calibration (or "inverse regression") helps to entirely remove such bias. However, in the "inverse regression", low correlation between the dependent variable and the covariate may lead to an inaccurate age estimation.
To avoid this issue, Ferrante [7] introduced a Bayesian calibration method with dental maturation as a unique predictor of chronological age. More recently, Bucci [8] followed the same approach using a segmented function to model the relationship between age and dental maturity to consider the changes in the maturation process in juveniles.
A major limitation of the former approaches with Bayesian calibration is that only a single predictor is allowed. However, the use of multiple predictors may be helpful to better predict the chronological age of an individual. Kumagai [9] combined multiple ordinal predictors for age estimation by applying Bayes' rule to a multivariate continuation ratio model. De Tobel [10] extended this approach, allowing for continuous predictors but assuming a linear relationship between predictors and age. Nevertheless, this methodology relies on the strong assumption of independence between the predictors. We extended the use of the Bayesian calibration approach to multiple continuous predictors, allowing for nonlinear relationships between predictors and age, and combining it with the copula instrument without making any assumption on the dependence structure. The copula function is recognized as a multivariate tool for constructing high flexible joint distribution [11]. For this reason, copulas have been widely used in all fields of research [12][13][14]. This article combines, for the first time, two continuous maturity indices via copula with a Bayesian calibration method and provides an unbiased way to use multiple predictors in the age estimation process. In particular, we used: a dental maturity index, S [2], calculated as the sum of normalized open apices in teeth, and a hand-wrist maturation index, W [1], defined as the ratio between the total area of the eight carpal bones and the carpal area. The relationship between age and the dental maturity index S is inverse and nonlinear: S decreases with advanced age in a nonlinear way. On the contrary, the hand-wrist maturation index W increases with age.

Sample
A sample of 235 orthopantomographies and wrist X-rays from healthy Italian subjects (114 males and 121 females), where the chronological age of the subjects was known, was used.
The sample included healthy individuals aged 5 to 15 years, with all seven permanent left mandibular teeth, free from malnutrition, wrist fractures, bone diseases, growth and other systemic disorders.
Each patient's ID number, chronological age and sex were recorded at the beginning of the study. The chronological age of the living subjects was calculated as decimal age by subtracting the date of birth from the date of taking the orthopantomography and wrist X-ray.
The data used in this study were collected and published in a previous study [15] in which ethical approval was obtained.
Ethical review and approval were waived for this study. It was conducted in conformity with the regulations on data management of the Italian law on privacy (Legislation Decree 196/2003 amended by Legislation Decree 101/2018).

Measurements
All orthopantomography and wrist x-ray images were in digital format. The projections were used to determine dental and skeletal maturation, assessed respectively by the sum of open apices of the seven left permanent mandibular teeth (S) [2] and the ratio between the total area of the eight carpal bones and the carpal area for wrist (W) [1]. These methods were chosen because the images were obtained in a simple radiographic position and with a low level of radiation. Two measures, sex and chronological age, were recorded in an Excel file for use as possible predictive variables for age estimation in later statistical analysis.

Statistical Analysis
The relationship between chronological age and the hand-wrist maturation index was almost linear, as shown by Cameriere [1]. For that reason, we assumed that the location parameter of distribution of W followed a simple linear model. On the contrary, the relationship between chronological age and dental maturation, in early stages of life, is not linear (Ferrante [7]) and it can present one or more breakpoints, i.e., points where the relationship changes abruptly. This kind of relationship may be specified in several ways; we modelled S as a segmented function with one breakpoint (Bucci [8]). To this end, the authors investigated a broken-line relationship between S and age (y), following the approach proposed by Muggeo [16].
Consistent with Ferrante [7], a Bayesian calibration method was used (Appendix A.1). Firstly, we randomly subdivided the dataset into training (about 70% of the total number of observations, N tr = 160 subjects) and validation (30% of the total number of observations, N te = 75 subjects) samples, used to develop the models and to measure their predictive performance, respectively. The Bayesian approach to the age estimation problem consists of determining the a posteriori distribution of age, conditioned to the value of the two maturity indices (W, S). In this context, a priori distribution of model parameters and a priori distribution of age are needed: an uninformative uniform prior distribution was used for both, because no a priori information is available.
Considering that W and S were collected for the same individuals, these two maturity indices could not be assumed to be independent. Every joint distribution function contains information about the marginal behaviour of each variable and their join behaviour (dependence structure). For this reason, to construct the join probability model of W and S, we referred to copula theory (Appendix A.2). Copulas enable us to isolate and capture the full structure of dependence in a multivariate distribution, and help with understanding this dependence at a deeper level than simple correlation. The copula theory was introduced by Sklar [11], who proved that for a given set of n random variables, their joint distribution function can be decomposed into the product of marginal distribution functions and a copula. The usefulness of Sklar's theory stems from the fact that the set of parametric distributions which can be fitted increases substantially, because it is possible to link any n univariate distributions (not necessarily from the same family) with any copula to obtain an unusual but valid n-variate distribution. This allows the researcher to combine any kind of predictors without making any assumptions about their join behaviour. In the literature, several kinds of copula have been implemented. In this work, we focused on three of them: Clayton, Gumbel and Gaussian copulas, that catch respectively lower, upper and no tail dependence (Appendix A.2). Rotated versions at 90 • , 180 • , and 270 • of Gumbel and Clayton copulas were also considered [14,16,17]. The version of the copula that better describes the dependence structure of our data was selected using the Bayesian Information Criteria (BIC).
According to Sklar's theory, the bivariate probability model can be characterized by the marginal distributions of S and W and a copula density function.
Within a Bayesian context, we used four different probability models for W and S as follows: where ALD is the Asymmetric Laplace Distribution with skew parameters τ S , τ W ∈ (0, 1), location parameters µ W (y, α), µ S (y, β) and constant variances σ 2 W and σ 2 S . It should be noted that the joint distribution constructed with the Gaussian copula in the scenario (A) above corresponds to the bivariate normal distribution (Nelsen [18]) of W and S. For each model based on Bayesian calibration, the mode of calibrating distribution was used as the point estimate.
The precision and accuracy of the age estimation models were assessed by: the Inter-Quartile Range of error distribution (IQR ERR ); • the mean of the quantile-based 95% Bayesian confidence interval (MCI 95% ) of the calibrating distribution.

Results
The methodology here presented was applied to a dataset of 235 orthopantomographies and wrist X-rays from healthy Italian subjects (114 males and 121 females), where the chronological age of the subjects was known (in years). In this sample, y ranged between 5 and 15 years, S between 0.04 and 4.51, and W between 0.27 and 0.96. We verified that there was no difference in the distribution of age between the training and validation samples by performing a Kolmogorov-Smirnov test, which did not reject the null hypothesis of no differences between the two distributions. Figure 1 shows the relationship between age and the predictors in both the training and validation samples. The estimated segmented parameter was ψ = 11.31 years. This means that, in our training sample, the dental maturation slope abruptly changed in proximity to this value. According to the Bayesian Information Criteria (BIC), the most suitable versions of the Gumbel and Clayton copula selected were the Rotated Gumbel (270 • ) and the Rotated Clayton (90 • ) ( Table 1), both able to capture an asymmetric dependency structure between W and S (a high value of W corresponds to a low value of S). The analysis of the predictive performance of the considered models is reported in Table 2.
Linear (W single predictor)

Copula BIC
Clayton: Abbreviations: BIC, Bayesian Information Criteria; R, Rotated. The skewness parameters of the marginal distributions were τ = 0.43 for S and τ = 0.51 for W, showing that the distribution of S was quite asymmetric while the distribution of W could be considered symmetric. The results in Table 2 show the good predictive performance of the copula instrument with a different distribution for age predictors. In fact, the presented copula models outperformed the two single-predictor models and the independent one in each of the measures considered. None of the copula models showed any significant bias in the estimated residuals on chronological age. Comparing the outcomes among panels, it may be noticed that they are mostly comparable across the copula models, while the assumption of independent age predictors seems not to provide accurate estimates of chronological age. The models with a single predictor exhibited less accurate predictions, underlining that the use of a larger set of anthropometric characteristics of the subject can lead to a better estimate of the chronological age. Figure 2 reports the calibration distributions constructed using the Rotated Clayton copula model with S for different values of S and W. This helps us to understand how distribution of age varies in relation to the anthropometric indices and highlights two relevant aspects: age distribution is non-normal and exhibits heteroskedasticity.

Discussion
Accurately estimating chronological age assumes a crucial role in forensic science and legal medicine for solving a variety of legal issues concerning criminal liability, majority status and the identification of both living and dead individuals. We introduced a new method that permits the practitioner to use more than one chronological age predictor and completely remove the bias that exists when linear regression is used. To the best of our knowledge, this is the first study that bases the age estimation process on two different anthropometric measures allowing for nonlinear relations between age and predictors without making any assumptions about the dependence structure between predictors. De Tobel [10] highlighted the necessity of an appropriate statistical approach for handling the dependence between predictors. Our study seeks to capture the joint dependence of two maturity indices, the sum of open apices in teeth, and the ratio between the total area of the eight carpal bones and the carpal area, via the combination of the flexible copula function and the Bayesian calibration. Although implemented fairly widely in other fields, copulas have not yet been used in the process of estimating chronological age. This approach is the most flexible way to combine multiple predictors, and the only one producing unbiased estimates.
In our application on real data, the copula models outperformed the models with a single predictor and the model in which the age predictors were assumed to be independent. In fact, the copula models appeared to be more accurate and robust than the independent assumption model when multiple predictors are taken into account, because they exhibited the lowest MAE, RMSE, IQR ERR and MCI 95% . The results obtained in our study also highlighted that the performance of the approach here presented was not affected by the choice of the marginal distribution, meaning that it predicts with greater accurately than the more simplistic hypothesis of independence, regardless of the distribution assumptions made on the involved variables.
Our study found that the best model for estimating chronological age had an average accuracy of ±1.9 years, using a combination of dental and wrist bone maturation as indicators of development. This level of accuracy is similar to or better than that of other Bayesian models that have been developed for this purpose. For example, Rynkiewicz et al. [19] reported an average accuracy of ±3.5 years using a wrist bone maturation method, while Chen et al. [20] reported an average accuracy of ±3.7 years using a combination of dental and wrist bone maturation. Finally, this paper extends to two indices the methodology proposed in Bucci et al. [8], where the best-performing method exhibited an accuracy of ±2.8 years. These findings suggest that the model developed in our study is a promising tool for accurately estimating chronological age in children and adolescents.
A limitation of this study is that the findings may be not representative of different populations, and the specific characteristics of the sample may have influenced the main conclusions. However, the general results from our analysis are consistent throughout the simulation with different settings and the empirical application (as described in Appendix A.3), which suggests that the model can be easily generalized to populations with different characteristics.
Furthermore, it should be mentioned that it may not be common to obtain both tooth and wrist maturity indices in practice. However, if this information is available, both indices should be used in combination to achieve a higher level of accuracy in age estimation.

Conclusions
In conclusion, the use of a larger set of anthropometric characteristics of the subject can lead to a better estimate of the subject's chronological age than using a single measure.
Possible future developments relate to the use of additional individual characteristics as predictors, such as sex or other anthropometric measurements, which could lead to an improved accuracy in age estimation. Notwithstanding that the use of more than two predictors would imply an augmented computational effort, the whole procedure could be easily applied in such a scenario by using the Pair Copula Construction technique. Institutional Review Board Statement: Ethical review and approval were waived for this study. It was conducted in conformity with the regulations on data management of the Italian law on privacy (Legislation Decree 196/2003 amended by Legislation Decree 101/2018). The data used was previously. The data used in this study were collected and published in a previous study in which ethical approval was obtained.

Informed Consent Statement: Patient consent was waived for this study.
Data Availability Statement: Software in the form of R code, together with a sample input data set and complete documentation is available on request from the corresponding author (r.gesuita@staff.univpm.it).

Conflicts of Interest:
The authors declare that they have no conflict of interest.

Appendix A Appendix A.1. Bayesian Calibration
Let us consider a sample of n subjects with known ages, y = {y 1 , . . . , y n }, with observed values of dental maturity index, s = {s 1 , . . . , s n }, and hand-wrist maturity index, w = {w 1 , . . . , w n }. Our purpose was defining the joint distribution of predictors and using it to estimate the unknown chronological age, y n+1 , of a new subject, given her/his maturity indices (s n+1 , w n+1 ). Let us define the probability model, p(s i , w i ∨ y i , θ), as the joint density function of two variables, S and W, conditional to age, and the vector of parameters, θ. Assuming that observations are conditionally independent and identically distributed (i.i.d.), the likelihood can be factorized as follows: Given the unknown chronological age y n+1 and the vector of parameters θ, we assume that the new observations s n+1 , and w n+1 are independent of the observed data and follow the same probability model. Based on these assumptions, the posterior distribution of θ may be written as: where h(θ) is the uninformative prior distribution of parameters. A uniform distribution was chosen. Then, the calibrating distribution is: f (y n+1 ∨ s n+1 , w n+1 , y, s, w) = p(y n+1 )φ(s n+1 , w n+1 ∨ y n+1 , y, s, w) ∞ 0 p(y n+1 )φ(s n+1 , w n+1 ∨ y n+1 , y, s, w)dy n+1 where φ(s n+1 , w n+1 ∨ y n+1 , y, s, w) is the joint predictive distribution: φ(s n+1 , w n+1 ∨ y n+1 , y, s, w) = θ p(s n+1 , w n+1 ∨ y n+1 , θ)h(θ ∨ y, s, w)dθ and the prior distribution of age, p(y n+1 ), is assumed to be uniform, because no a priori information is available.
The predictive distribution φ(s n+1 , w n+1 ∨ y n+1 , y, s, w) cannot be analytically solved but it can be approximated, accordingly to Markov Chain Monte Carlo (MCMC) integration, by the sample mean of the conditional densities: where θ m = 1, . . . , M, where M = 500, are posterior draws from h(θ ∨ y, s, w).

Appendix A.2. Copula Theory
Every joint distribution function contains information about the marginal behaviour of each factor and their dependence structure. An n dimensional copula is a multivariate distribution function over a hypercube [0, 1] n with uniform marginals. Copulas enable us to isolate and capture the full structure of dependence in a multivariate distribution, particularly focusing on tail dependence, which helps dependence to be understood at a deeper level than simple correlation. The copula theory was introduced by Sklar [11], who proved that every joint distribution function can be decomposed into the product of marginal distribution functions and a copula: Theorem A1. Let F be a joint distribution function with margins F 1 , . . . , F n . Then, there exists a copula C : [0, 1] n → [0, 1] such that for all x 1 , . . . , x n in R = (−∞, +∞) we have: If the margins are continuous, then C is unique; otherwise C is determined on Ra(F 1 ) × . . . × Ra(F n ), where Ra denotes the range of F i . Conversely, if C is a copula and F 1 , . . . , F n are univariate distribution functions, then the function F defined above is a joint distribution function with margins F 1 , . . . , F n .
In a bivariate case, for example, Sklar's theorem implies that: with density function: where f 1 and f 2 are the density functions of F 1 and F 2 respectively and c(F 1 (x 1 ), F 2 (x 2 )) represents the density function of copula evaluated in F 1 (x 1 ), F 2 (x 2 ).

Types of Copula
Let 0 ≤ u 1 , u 2 ≤ 1, the copula density functions for bivariate distribution are: Normal Copula: where Φ −1 is the inverse cdf of a standardized Gaussian univariate distribution and ρ is the linear correlation coefficient of u 1 and u 2 . Gumbel Copula: where θ ∈ [1, +∞). Clayton Copula: In addition, we used the rotated versions of the Gumbel and Clayton copulas. When they are rotated by 180 • , survival copulas are obtained (Nelsen [18]), while rotation by 90 • and 270 • allows us to consider a negative dependence that it is not possible to model with the standard nonrotated versions. The distribution functions of the copula C rotated by 90 • , 180 • and 270 • are given as follows: where ∈ [−1, +∞).
In addition, we used the rotated versions of the Gumbel and Clayton copulas. When they are rotated by 180°, survival copulas are obtained (Nelsen [18]), while rotation by 90° and 270° allows us to consider a negative dependence that it is not possible to model with the standard nonrotated versions. The distribution functions of the copula C rotated by 90°, 180° and 270° are given as follows:

Appendix A.3. Simulation Study
In order to assess the predictive ability of the copulas combined with Bayesian calibration, we simulated data from known models and distributions. The predictive performance of the methodology here presented was compared with that of a single predictor model and a model in which the predictors were considered independent. Given the information found in the existing literature ( [1,7,8]), we assumed a segmented relationship with a single breakpoint for the dental maturity index, S, by using a segmented function including one breakpoint [8], and a linear relationship for the hand-wrist maturation index, W. The simulation process is: where, taking into account that hand-wrist and dental maturation are completed before 16 years of age, y is sampled from a uniform distribution y ∼ U(4, 15).
The errors, ε s and ε w , are sampled from the following multivariate distributions: The parameters used in the simulation process reflect the real relationship between chronological age and the maturity indices. In this simulation, age ranges from 4 to 15 years because we are accounting for the fact that hand-wrist and dental maturation are completed before 16 years of age. We simulated N = 500 observations with a breakpoint located at ψ = 9.5 years, and randomly selected N tr = 350 observations as a training sample to estimate the parameters and N te = 150 observations as a validation sample used to verify the predictive accuracy.
The accuracy and precision of the calibrating distribution of data simulated from Equation (A1) with the marginal probability models (A-D) are reported in Tables A1 and A2. According to the Bayesian Information Criteria (BIC), the most suitable versions of the Gumbel and Clayton copulas selected were the Rotated Gumbel (270 • ) and the Rotated Clayton (90 • ). All proposed copula models outperformed the independent one with errors distributed by both Equations (A2) and (A3). In Table A1, the Rotated Clayton copula (90 • ) obtained the best results in terms of MAE and RMSE in Panels A, C and D, while the Gaussian copula was more accurate in Panel B and showed lower MCI 95% in Panels A and D. When considering error distributed as a MVSN, the Rotated Clayton copula (90 • ) exhibited the best results for MAE and RMSE in all panels (Table A2), while the Gaussian copula showed lower MCI 95% in Panel A and B.