Eigenfemora: Age-at-Death Estimation in the Proximal Femur through an Image Processing Approach

Abstract: Estimating age at death is essential to establish biological profiles from human skeletal remains in both forensic and archeological settings. Imaging studies of skeletal age changes in adults have described the metamorphosis of trabecular bone structure and bone loss in the proximal femur as well as changes in morphology during different stages of life. This study aims to assess the utility of a digital representation of conventional X-ray films of the proximal femur for the estimation of age at death in a sample of 91 adult individuals (47 females and 44 males) of the Coimbra Identified Skeletal Collection. The proposed approach showed a root mean squared error (RMSE) of 17.32 years (and a mean absolute error of 13.47 years) for females and an RMSE of 14.06 years (mean absolute error of 11.08 years) for males. The main advantage of this approach is consistency in feature detection and extraction, as X-ray images projected on the femora space will always produce the same set of features to be analyzed for age estimation, while more traditional methods rely heavily on operator experience, which can lead to inconsistent age estimates among experts.


Introduction
The assessment of age at death is an essential precondition to establish a biological profile and a necessary empirical procedure for the identification of individual skeletal remains in both forensic and archeological contexts [1–3]. In a forensic setting, age at death is one of the crucial biological profile parameters (i.e., biological sex, age at death, population affinities and stature) present in the initial description of an individual, allowing investigators to narrow the pool of missing persons by excluding individuals who do not share the same physical attributes [4]. In archeological contexts, estimating age is important not only to characterize individuals but also to ascertain the demographic profile of past populations. Physiological aging exhibits a substantial amount of variation within and between populations, which hampers the estimation of age at death in skeletal remains, particularly in adults [3,4]. No single age predictor perfectly replicates the range of factors that influence biological age; it is therefore appropriate to use as many indicators of age and age estimation techniques as possible to evaluate age at death in adult skeletons [1,3,5–7].
The most common approaches to age-at-death estimation, including changes in the pubic symphysis, the auricular surface, sternal rib ends and cranial sutures, are based on a qualitative description of the skeletal remains, an approach that is highly dependent on an expert's previous experience [1,5,7–9]. Although this is not yet a decisive issue in past population studies (efforts are nonetheless being made to overcome it, especially because methodological insufficiencies may be biasing knowledge of past populations), the distinctive scientific and legal demands in forensic anthropology have prompted some attempts to replace this approach with quantitative methods, or at least to reinforce the expertise by applying both analytical frameworks. Due to Daubert's standards of admissibility, the need to quantify results (i.e., to rephrase the expertise in terms of error rates, probabilities, or hypothesis testing) is growing [4,10,11]. While Daubert's criteria (US federal standards specified in 1993 in Daubert vs. Merrell Dow Pharmaceuticals) apply only in US federal courts, their influence extends well beyond those bounds. This creates the need to develop quantitative methods that decrease subjectivity: the errors associated with different methods must be known, and it must be possible for different investigation teams and/or experts to replicate the results of studies and to comply with those and other admissibility criteria [4,12]. Of the two general methodological approaches commonly used in forensic anthropology or bioarcheology to estimate age at death (excluding molecular methods), quantitative methods have the greatest potential to be developed and improved to meet these admissibility criteria. As is commonly recognized, morphological methods tend to be more subjective and their applicability tends to rely more on an expert's previous experience [10].
Proximal femur bone loss has also been used to systematize the standards for age-at-death estimation in adult skeletons [24,25,35,40–42], including methods that use both medical imaging and machine-learning algorithms [9,15]. These studies suggest that the proximal femur is reliable for estimating age at death, with results ranging from average to excellent. The internal structure of the proximal femur above the lesser trochanter comprises a thin layer of cortical bone and a dense network of cancellous bone that shapes the internal weight-bearing system of the head, neck and greater trochanter [43]. There are five groups of compressive and tensile trabeculae, whose spatial distribution is oriented according to an optimized biomechanical pattern [25]. The trabecular architecture of the proximal femur is exposed to a significant age-related involution with lacunae formation at the trochanter, neck and head [24,25,35,36,41]. However, qualitative visual readings of trabecular bone involution in the proximal femur are prone to intra- and inter-observer errors, being substantially affected by the previous experience of researchers [36,37]. As such, in the present exploratory study, a digital representation of conventional X-ray films of the proximal femur is produced, aiming to assess the forensic value of an image processing approach for automatic feature extraction and age-at-death estimation of adult individuals.

Sample
Within the research landscape of biological anthropology, Portugal is famous for its numerous identified skeletal collections [44]. One of the oldest and most widely recognized is the Coimbra Identified Skeletal Collection (CISC), which comprises 505 skeletons (housed in the Department of Life Sciences at the University of Coimbra, Portugal). All individuals from this collection were born between 1817 and 1924 and died between 1904 and 1936. The majority were exhumed from the main Coimbra municipal cemetery, Cemitério Municipal da Conchada, during the first half of the 20th century. All the individuals were interred in shallow graves for at least 5 years, after which they were usually exhumed. Available ante mortem data includes, for each individual, age at death, biological sex, cause of death, occupation, marital status and year of death, among others [45]. In addition to the ante mortem information, this collection also has an assemblage of complementary examinations that have been carried out and curated throughout its existence, including a collection of conventional radiographs. In terms of bone representation and preservation, almost all skeletons in the collection are complete and well preserved. For the present study, all adult individuals with conventional radiographs were initially included. Subsequently, within this group, all individuals with taphonomic modifications that greatly affected the proximal femur and/or with visible pathological conditions were excluded.
After the initial selection, the studied material included, in total, 91 plain radiographs of the left femora. This study sample was balanced for sex and age at death (both parameters retrieved from documentary data pertaining to the CISC). Female individuals comprised 51.6% (47/91) of the sample, with recorded ages at death between 22 and 89 years (mean = 51.61; SD = 18.50). The 44 males (48.4%) died at ages between 21 and 86 years (mean = 48.91; SD = 16.20). Conventional radiographs were produced using a clinical X-ray unit (GE Medical Systems®, Chicago, IL, USA) at the Medical Image Service in the Coimbra University Hospitals. At a focal distance of 1 m, the exposure was mAseg = 80-50 at kVp = 30-35, according to individual bone weight and size. All bones were placed in a standard anteroposterior view.

Acquisition and Segmentation
The first step to assess the utility of an image processing approach for automatic feature extraction and age estimation from the proximal femur is to produce a digital representation of the conventional X-ray films. All X-ray films were mounted on a light box and digitally acquired with a tripod-mounted DSLR camera (shutter speed: 6, aperture: f/5, focal length: 55 mm, automatic white balance, resolution: w [width]-4272 by h [height]-2848).
After image acquisition, all images were manually segmented with ImageJ software (National Institutes of Health and the Laboratory for Optical and Computational Instrumentation, University of Wisconsin, Madison, WI, USA), as illustrated in Figure 1. For a more precise and robust segmentation, a polygon selection tool was used, and the region of interest was saved in a TIFF file of variable resolution. All subsequent analyses were conducted in the segmented regions only.
Previous approaches to age estimation from bone images or radiological images relied on macroscopic observation or the calculation of first and second order statistics of pixel distribution [46]. The approach presented here is based on the construction of an appearance-based model to code information from proximal femur images using principal components analysis (PCA). PCA is frequently employed as a technique for reducing the dimensionality of extensive datasets. It achieves this by converting a large number of variables into a more concise set that retains the majority of the relevant information found in the original dataset. The specific technique employed in this study was initially developed in the field of computer vision and pattern recognition for the purpose of face detection and recognition [47].
This approach uses PCA to encode the information in a database of face images through the eigenvectors of the covariance matrix of the image database. The principal component analysis projects the face images on a feature subspace called face space, which is composed of the eigenvectors of a set of face images that encode the information from each face. The eigenvectors can be thought of as features extracted from the images and can be used to compare faces in a pair-wise fashion to perform face recognition. The eigenvectors, or eigenfaces as Turk and Pentland [47] called them, encode local and global features of the face, but these features may not correspond to the nose, eyes, or mouth as perceived by humans. This approach is computationally simple and allows for the efficient learning of features of high-dimensional inputs (images) in an unsupervised manner. By analogy, the feature subspace was labeled "femora space" and the extracted features "eigenfemora".
In order to apply Turk and Pentland's PCA methodology to images, several steps were first performed to transform the segmented proximal femur images to a representation suitable for a principal component analysis. First, each segmented image was re-dimensioned using a bi-spline interpolation algorithm so that all images shared the same resolution of w-by-h, where w and h were set to half of the mean width and mean height of all images. Subsequently, all images were transformed to a grayscale color map, and their histogram was normalized using the contrast limited adaptive histogram equalization (CLAHE) technique to enhance the details of bones. Finally, each image was reshaped from the w-by-h matrix to a wh-by-1 vector. All image vectors were stored in a matrix, D (defined as m-by-n, with m = wh and n the number of images), composing the database.
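The last two steps of this preparation (choosing the common size and reshaping each image into a column of D) can be sketched as follows. This is only an illustration in Python with NumPy, not the MATLAB code used in the study; the function names `target_size` and `stack_images` are hypothetical, and the resizing, grayscale conversion and CLAHE steps are assumed to have been done beforehand with an image library.

```python
import numpy as np

def target_size(shapes):
    """shapes: list of (height, width) pairs of the segmented images.

    Returns half of the mean height and half of the mean width, rounded,
    mirroring the common resolution described in the text.
    """
    heights, widths = zip(*shapes)
    half_h = int(round(sum(heights) / len(heights) / 2))
    half_w = int(round(sum(widths) / len(widths) / 2))
    return half_h, half_w

def stack_images(images):
    """images: equally sized 2-D grayscale arrays (already resized and equalized).

    Returns the database D with one flattened image per column (m-by-n, m = w*h).
    """
    columns = [np.asarray(img, dtype=float).ravel() for img in images]
    return np.column_stack(columns)
```

Flattening here is row-major (NumPy's default); any fixed ordering works as long as every image is flattened the same way.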
To produce the femora space and corresponding eigenfemora, the mean configuration of the proximal femora database was first subtracted from each femur image for purposes of normalization. Then, a principal components analysis of the covariance matrix of the database D was performed, with the matrix defined as $L = D^{T}D$ instead of $L = DD^{T}$ (see Turk and Pentland [47] for a detailed mathematical explanation). The eigenvectors of the covariance matrix L were sorted in descending order of their eigenvalues and stored in a matrix, W. The femora space, a matrix S, and the corresponding eigenfemora are obtained by $S = DW^{T}$. The femora space contains exactly as many eigenfemora as there are femora in the database D, but it can be represented by a smaller number of eigenfemora that account for a specific percentage of the variation of the femora in the database D. In this case, the subspace is represented by the number of eigenfemora that account for 95% of the variation of the dataset.
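The reason the small n-by-n matrix can stand in for the large m-by-m one can be written in a single step. This is the standard eigenface argument restated with the symbols used here; the names $v$ and $\lambda$ are introduced only for this sketch.

```latex
% D is m-by-n (m = wh pixels, n images) and already mean-centered.
% Let v be an eigenvector of the small matrix D^T D with eigenvalue \lambda.
\[
  D^{T} D\, v = \lambda v
  \quad\Longrightarrow\quad
  D D^{T} (D v) = D\,(D^{T} D\, v) = \lambda\, (D v)
\]
% Hence Dv is an eigenvector of the large matrix D D^T with the same eigenvalue,
% so only an n-by-n eigenproblem has to be solved.
```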
The final step of this algorithmic approach is to obtain variables from each proximal femur image that can be used in the subsequent statistical analysis. The features or variables that characterize each femur image are obtained by projecting the image into the femora space. The projection is performed by the matrix operation $S^{T}D$, which results in a new matrix, X, that is composed of features encoded by the eigenfemora for each proximal femur image projected into the femora space.
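The subspace construction and projection can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the authors' MATLAB implementation; the function name `build_femora_space` is hypothetical, eigenvectors are stored as columns (a convention assumed here), and scaling each eigenfemur to unit length is an added assumption that the text does not specify.

```python
import numpy as np

def build_femora_space(D, variance_kept=0.95):
    """D: m-by-n array with one flattened image per column.

    Returns (mean, S, X): the mean femur (m-by-1), the eigenfemora as
    columns of S (m-by-k) and the projected features X (k-by-n).
    """
    mean = D.mean(axis=1, keepdims=True)       # mean femur image
    A = D - mean                               # mean-centered database
    L = A.T @ A                                # small n-by-n matrix (D^T D trick)
    eigvals, V = np.linalg.eigh(L)             # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # re-sort in descending order
    eigvals, V = eigvals[order], V[:, order]
    eigvals = np.clip(eigvals, 0.0, None)      # remove tiny negative round-off
    ratio = np.cumsum(eigvals) / eigvals.sum() # cumulative share of variation
    k = int(np.searchsorted(ratio, variance_kept)) + 1  # smallest k reaching the share
    S = A @ V[:, :k]                           # map back to image space
    S /= np.linalg.norm(S, axis=0)             # assumption: unit-length eigenfemora
    X = S.T @ A                                # features of every image
    return mean, S, X
```

A new radiograph would be projected in the same way: flatten it, subtract `mean`, and multiply by `S.T`.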

Age Estimation from the Femora Space with K-Nearest Neighbor Search
After the feature extraction phase, the variables contained in matrix X were statistically analyzed to evaluate their relationship with age at death, and the respective Spearman's rho correlation coefficient was computed. To perform age estimation, a simple k-nearest neighbor (k-nn) search procedure using a kd-tree [48] was implemented. By using this simple technique, a model-free, non-parametric and non-linear regression approach to age estimation was applied [49]. The only assumption made by this kind of approach is that age can be estimated using information from previous examples that are similar to the one the forensic expert has in hand. The age estimation procedure is simple and consists of finding, in the femora space, the k proximal femora most similar to the one being analyzed. Once the most similar femora are found, the age prediction for an unidentified set of human remains is given by calculating simple statistics such as percentiles (2.5, 50 and 97.5) of the age distribution of the most similar cases. In this manner, a 95% prediction interval and a point estimate of age at death (the 50th percentile) are produced. In this case, weighted percentiles were calculated to ensure that the most similar cases have a higher contribution to the final age estimate. The weights are set as $e^{-d}$, where d is the Euclidean distance of the similar example to the target proximal femur.
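The estimation step can be illustrated with a short sketch in plain Python. Two simplifications are assumed and should be read as such: neighbors are found by brute-force search rather than with a kd-tree, and the weighted percentile is taken as the smallest age whose cumulative weight share reaches the requested level, which is only one of several possible definitions. The function names are hypothetical.

```python
import math

def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q percent."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q / 100.0:
            return value
    return pairs[-1][0]

def knn_age_estimate(target, features, ages, k):
    """Return (lower, point, upper) ages from the k most similar reference femora.

    target: feature vector of the unknown femur; features: reference vectors;
    ages: documented ages of the reference femora.
    """
    # Brute-force Euclidean search (assumption: stands in for the kd-tree).
    nearest = sorted((math.dist(target, f), age) for f, age in zip(features, ages))[:k]
    neighbour_ages = [age for _, age in nearest]
    weights = [math.exp(-d) for d, _ in nearest]   # e^{-d}, closer cases weigh more
    return tuple(weighted_percentile(neighbour_ages, weights, q) for q in (2.5, 50, 97.5))
```

With only a handful of neighbors, the interval bounds necessarily coincide with observed neighbor ages, which is one reason the resulting intervals can be wide.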
The performance of this approach was assessed using global indicators of goodness of fit [50]: root mean squared error (RMSE), mean absolute error (MAE), bias, mean width of the prediction interval (MW) and coverage of the prediction interval. The root mean squared error is the square root of the average squared loss, while the mean absolute error is estimated in a comparable way using the absolute instead of the squared difference between predicted and documented age. Bias, or mean bias error, is defined as the mean difference between the predicted age and the documented age. The prediction interval is a range of values that is likely to contain the value of a single new observation with a certain level of confidence; its coverage is the proportion of cases in which the interval contains the documented age. For unbiased and realistic estimates of these parameters, the utility of this procedure was evaluated using a leave-one-out resampling strategy (also known as a jackknife cross-validation). The jackknife cross-validation is a resampling technique that assesses the bias and standard error of an estimate, recomputing the estimate from subsamples (n − 1) of the total sample. The optimal value of k, the number of retrieved femora used to estimate age at death, was also determined with the aforementioned validation strategy.
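The indicators and the leave-one-out loop are simple enough to sketch in plain Python. The names `evaluate` and `leave_one_out` are hypothetical, and the predictor is passed in as a function so that the sketch does not presume any particular estimator.

```python
import math

def evaluate(predictions, true_ages):
    """predictions: list of (lower, point, upper); true_ages: documented ages."""
    n = len(true_ages)
    errors = [point - age for (_, point, _), age in zip(predictions, true_ages)]
    covered = [lo <= age <= hi for (lo, _, hi), age in zip(predictions, true_ages)]
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,                        # predicted minus documented
        "mean_width": sum(hi - lo for lo, _, hi in predictions) / n,
        "coverage": sum(covered) / n,                   # share of intervals containing the age
    }

def leave_one_out(features, ages, predict):
    """Predict each case from all remaining cases.

    predict(target, reference_features, reference_ages) must return
    a (lower, point, upper) tuple.
    """
    results = []
    for i in range(len(ages)):
        rest_features = features[:i] + features[i + 1:]
        rest_ages = ages[:i] + ages[i + 1:]
        results.append(predict(features[i], rest_features, rest_ages))
    return results
```

Choosing k then amounts to repeating `leave_one_out` for each candidate value and comparing the resulting indicators.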
All computational procedures were carried out in the MATLAB® programming language and scientific computing environment (MathWorks, Inc., Natick, MA, USA).

Results
One of the advantages of the technique analyzed in this paper is that every element of the femora space, the eigenfemora, can be represented as an image itself. Figures 2 and 3 illustrate some of the elements of the femora space generated for the female and male individuals present in the dataset. Each individual eigenfemur resembles a ghostly image of a proximal femur, composed of superimposed femora images. This is because each eigenfemur encodes specific information from each image of the database D. Figure 3 depicts a jet color map representation of the first eight eigenfemora for the male sample, where the red color represents the region of the image that is encoded by the respective eigenfemur.
During the process of subspace generation, 95% of the variation of the proximal femora was encoded in 40 eigenfemora for female individuals and in 39 eigenfemora for male individuals. In the female sample, only the information captured by the 8th eigenfemur contained variation related to age at death (Spearman's rho = −0.4074, p-value = 0.0045). In male individuals, the 1st, 4th, 5th and 6th eigenfemora revealed statistically significant relationships with age at death (1st: Spearman's rho = 0.3510, p-value = 0.0195; 4th: rho = −0.3127, p-value = 0.038; 5th: rho = −0.3724, p-value = 0.0128; 6th: rho = 0.3092, p-value = 0.0411).
In the implemented k-nearest neighbor search procedure for age estimation, only the eigenfemora that showed statistical relationships with age at death were considered. Despite the advantages of this procedure as a non-parametric regression technique, k-nearest neighbor algorithms are very sensitive to redundant features, and since the target variable is age at death, it makes sense to only use features that are related to it.
The proposed approach shows modest success in age-at-death prediction for female individuals, with a root mean squared error (RMSE) of 17.32 years, a mean absolute error (MAE) of 13.47 years and a bias of 0.78. The generated prediction intervals contained the known age in only 83% of the cases, although they are remarkably wide (MW = 53.04 years on average). In the male sample, the performance was slightly better, with an RMSE of 14.06 years, an MAE of 11.08 years and a bias of 0.68. The mean width of the prediction intervals was smaller, 43.59 years, and the intervals contained the real age in 96% of the cases. The reported values are associated with k = 11 for females and k = 9 for males. In both sexes, age at death was systematically overestimated for young individuals and underestimated for older individuals.
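The screening of eigenfemora against age can be sketched in plain Python as follows. Note the substitution: the study selected features by statistical significance, whereas this sketch, to stay dependency-free, computes Spearman's rho only and keeps features above a hypothetical absolute-rho cutoff (`min_abs_rho`); a p-value would normally come from a statistics library. All function names are illustrative.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            result[idx] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def select_features(X, ages, min_abs_rho):
    """X: one row of projected values per eigenfemur. Returns indices of rows kept."""
    return [i for i, row in enumerate(X) if abs(spearman_rho(row, ages)) >= min_abs_rho]
```

Only the selected rows of X would then be passed to the nearest neighbor search.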

Discussion
Estimating the age of an individual at the time of death holds crucial importance in the fields of forensic anthropology and human bioarcheology. Thus, enhancing the accuracy of age estimation by using different skeletal areas, medical imaging and innovative statistical frameworks (including statistics within the machine-learning paradigm) is pivotal to improve the chances of identifying anonymous skeletal remains [3,9,28]. Involutional bone loss is an extended, gradual and permanent process that occurs in both sexes. Significant trabecular bone loss starts in young adults of both sexes at different skeletal sites and endures throughout life, with an increase in rate around menopause in females [18]. The metamorphosis of trabecular bone structure and bone loss evaluated through radiographs has been used to develop age-at-death estimation methods [25,35–39]. The majority of these studies are based on the proximal femur trabecular structure and architecture. In fact, due to its morphology and composition, the femur is usually recovered in forensic cases.
As the largest and one of the most robust bones of the human skeleton, its preservation is often better than that of other parts of the skeleton [51]. In fact, the femur, and particularly the proximal region, is of paramount significance for the assessment of different parameters of the biological profile [52–56], including age at death [9,15,40,42,57].
Previous methods for the estimation of age through X-rays of the proximal femur were based on the visual assessment of trabecular involution, e.g., [24,35,36], and were procedural frameworks where age assignation was inherently more subjective and dependent on the prior experience of the observer(s). Thus, the key advantage of the eigenfemora approach relates to the consistency in feature detection and extraction, as X-ray images projected on the femora space will always produce the same set of features to be analyzed for age estimation, while more traditional methods rely heavily on observer experience, which can lead to inconsistent age estimates among experts [6,8,11].
In general, age-at-death estimation methods in both forensic and archeological contexts are still prone to substantial rates of error, being inaccurate, biased and unreliable. Against this background, the results are promising but not optimal, and still distant from a superlative assessment of age at death, especially if applied to a forensic case where a narrower age-at-death interval is required to generate the biological profile. The age estimates produced by the implemented eigenfemora approach show an error rate comparable to that of the visual analysis of the proximal femur trabecular structure and architecture [36] and of other age estimation techniques that rely on the macroscopic analysis of the pubic symphysis and auricular surface [58–60]. The latter comparison is even more relevant since the pelvis is not always recovered in good preservation conditions, unlike the femur, which, as the heaviest and strongest bone, is frequently recovered in both forensic and archeological contexts [61].
Nevertheless, the method presented here performs slightly worse (performance comparisons through RMSE and MAE) than recent methodological routines that rely on advanced mathematical approaches, e.g., [3,62], including those that are also based on medical imaging of the proximal femur [9,15]. Interestingly, and contrary to what was observed in densitometric studies of the proximal femur, the eigenfemora approach is more accurate in the age estimation of male individuals and less accurate among females. This highlights a potential benefit for age estimation in male individuals through eigenfemora models when compared with age prediction through densitometry, e.g., [9,15]. In fact, bone decline with age at the proximal femur is more obvious in female individuals, which seems to be a general trend in different populations [15,22,23,40,42]. Trabecular age-related changes are also influenced by biological sex: both sexes appear to lose a comparable volume of trabecular bone with females tending to lose more trabeculae, while males suffer more trabecular thinning with the loss of trabecular elements [63,64].
The eigenfemora models show low prediction bias, but there is a systematic overestimation for young individuals and an underestimation for older individuals. This tendency was also observed in other works that evaluated age-related changes in the proximal femur [36,57], and has long been identified as a major feature of prediction inaccuracy in different age estimation methods [65]. Claude Masset devised the term "attraction of the middle" to describe this phenomenon and associated it with the specific age distribution of the reference sample employed in creating any age estimation method. However, it has also been proposed that this error is, to some extent, a consequence of the statistical techniques, particularly linear regression, employed to predict chronological age from biological age predictors [57].
As with other age estimation techniques, biological variation inherent to the structures under analysis limits the information extracted [2]. Bone loss and trabecular structure, particularly, are influenced by a plethora of individual and population factors that can hinder age assessment, including physical activity, diet, genetics or hormonal status [9,16,57,63]. The characteristic individual and population variability in the expression of bone loss influences the accuracy and bias of methods that exploit the relationship between it and age [9]. Also, plain skeletal radiographs can only be used in a limited fashion to quantify bone density in relation to age, as demineralization only becomes visually apparent after a loss of 30–40% or more of bone density [66]. This technical constraint thwarts any radiological assessment of the initial stages of bone loss.
The eigenfemora approach is based on principal components analysis (PCA), which is a well-established technique in the field of computer vision and image processing. Such an approach has a strong theoretical foundation and has been rigorously tested in other contexts. While deep learning methods have shown promise in various applications, they may require large amounts of data and computational resources to train and optimize models [3,67]. In contrast, the eigenfemora approach is based on a relatively small dataset of proximal femur images. This may make it more practical and cost-effective for certain applications, particularly in forensic anthropology and archeology where resources may be limited. The eigenimage analysis offers a well-established and consistent method for image analysis in forensic age estimation and should be seen as a link to more complex methods, such as deep learning techniques, that become viable as the available data expand.
Limitations of this exploratory research study include the use of conventional radiographs of the proximal femur (although the eigenfemora routine of analysis might be easily applied to digital radiographs) and of a relatively small and population-homogeneous sample of identified skeletal remains for the survey of age-related changes in the proximal femur.

Final Remarks
The age estimation of adult individuals (notably older people) persists as one of the most challenging tasks in the workflow of forensic anthropology expertise, as it is particularly influenced by the subjective analyses of age-dependent changes in the skeleton. The aim of the present exploratory study was to improve age-at-death assessment using a bone imaging method that is more objective than a mere visual/macroscopic inspection. The results are promising, especially in male individuals, but warrant further investigation in larger and more heterogeneous samples, particularly in terms of age distribution and with different population origins. For example, it would be particularly interesting to evaluate the performance of the method in samples of the following: elderly individuals; individuals with bone loss and osteoporosis; and individuals with pathologies that cause marked bone growth, such as diffuse idiopathic skeletal hyperostosis (DISH), or bone fragility, such as osteomalacia and Paget's disease.

Figure 1. Feature extraction with principal components analysis: a femora space and eigenfemora approach.

Forensic Sci. 2024, 4
Figure 2. First eight eigenfemora of the female sample (ordered from top left to bottom right).


Figure 3. Jet color map representation of the first eight eigenfemora of the male sample (ordered from top left to bottom right).
