Cervical Vertebral Maturation Method: Reproducibility and Efficiency of Chronological Age Estimation

The aim of this study was to investigate the reproducibility of the Cervical Vertebral Maturation (CVM) method and the potential for chronological age estimation using this method. The sample consisted of 474 lateral cephalometric radiographs from orthodontic patients aged 6.4–22.4 years. Six raters were trained in the CVM method (Baccetti). All images were assessed twice. Intra- and inter-rater agreement were assessed by Cohen's weighted kappa and the intraclass correlation coefficient, respectively. Analysis of variance was performed to investigate the correlation between cervical maturation stages and chronological age. The age prediction potential of the method was tested by general linear model regression analysis. Intra-rater reliability ranged from 0.857 to 0.931. Intra-rater absolute agreement ranged from 77% to 87%; however, inter-rater absolute agreement was lower than 50%. Inter-rater reliability was higher than 0.9. The 3rd Cervical Maturation Stage (CS3) showed the lowest reproducibility. The mean age differences among the 6 CS stages were statistically significant and increased as the CS increased. CS and gender explained roughly 60% (adjusted R² = 0.61) of the age variance of the sample. This CVM method showed high reliability; however, it cannot accurately predict the pubertal growth spurt. A direct correlation was found between cervical stages and chronological age. This method provides a broad estimation of chronological age.


Introduction
Skeletal maturation is important in orthodontics in order to predict growth peak and completion as well as growth acceleration and deceleration periods. This information is useful mainly in cases of skeletal discrepancies and for clinical decisions regarding timing for orthognathic surgery or treatment initiation time [1][2][3][4][5][6]. Additionally, skeletal maturation may be useful for age estimation in living individuals or human cadavers. Identification of an unknown deceased person is important in individual cases and in mass disasters (Disaster Victim Identification, DVI). In these cases, it is crucial to reconstruct the biological profile of a specimen (by gathering information such as age and gender), even when identification is impossible. In both cases, age estimation is based on dental and skeletal maturity [7][8][9][10][11].
Several biological indices have been proposed in order to assess skeletal maturity, including stature changes, chronological and dental age, appearance of secondary gender characteristics, and ossification of hand and wrist as well as maturation of the cervical vertebrae (CVM) [12][13][14][15][16][17][18]. Hand-wrist and cone beam computed tomographic images may be helpful to assess skeletal maturity [5,19]. However, most studies have been conducted with lateral cephalometric radiographs [1,5,18,20,21]. The latter are often included in the initial diagnostic records required for orthodontic diagnosis and treatment planning. The CVM estimation from lateral cephalometric radiographs shows high reliability in comparison to the methods based on the growth increments of the mandible and the ossification of hand and wrist [5,18,21–23]. Two studies stand out as the most influential in this research field [1,21]. Both evaluate the shape of the cervical vertebrae and the morphology of their lower borders in order to assess the maturation stage. A recent systematic review reported that both these methods are reliable enough to replace the hand-wrist radiograph in predicting the pubertal growth spurt [24].
Unfortunately, no scientific consensus has been reached regarding the method's reproducibility and accuracy. Some studies reported high reproducibility and accuracy; however, others observed discordance in the assessments of the maturation stage among the examiners [2,4,25–28]. Recent systematic reviews have identified several studies that present methodological weaknesses related to study design (blinding, randomization, sample size), statistical analysis and interpretation of the results [5,24]. Consequently, further research concerning the reproducibility of the CVM method is needed [2,5].
The aims of this study were to investigate (1) the reproducibility of the Cervical Vertebral Maturation (CVM) method according to Baccetti et al. [1] and (2) the potential for chronological age estimation using this method.

Materials and Methods
The radiographs were selected from the archives of the postgraduate clinic of the Department of Orthodontics, School of Dentistry, National and Kapodistrian University of Athens, Greece. These radiographs were required for diagnosis and treatment planning. Only gender and age were recorded for each subject.
The sample consisted of 474 radiographs that met the following inclusion criteria: good image quality for visibility of the 2nd, 3rd and 4th cervical vertebrae (C2-C4), belonging to children, adolescents or young adults with unremarkable medical history, having no inherited or acquired craniofacial deformities, and having no previous craniofacial trauma and no previous orthodontic treatment. Radiographs from subjects on long-term medication or with syndromes, congenital anomalies (such as absence of pedicle [29], cleft lip and/or palate), metabolic or developmental diseases or nutritional problems that may have an impact on craniofacial and vertebrae development were excluded.
All cephalometric radiographs were acquired with the same X-ray setup [Planmeca ProMax (Helsinki, Finland), max KV 84, 26 mm Al filter, 1700 VA] using identical source-subject-film distances. Image distortion was not considered since linear measurements were not performed.
The skeletal maturation was assessed according to Baccetti's method [1]. This 6-stage method (CS1-CS6) is based on the assessment of anatomical changes of the 2nd, 3rd and 4th cervical vertebrae (C2, C3 and C4). The method evaluates two parameters: (a) the presence or absence of concavity of the lower border of the vertebrae and (b) their shapes. Four distinct shapes are recognized, i.e., square, trapezoid, rectangular vertical and rectangular horizontal (Figure 1).
All radiographs were cropped, in order to depict only the cervical vertebrae area, and randomized. A Microsoft Excel database was formed and distributed to the raters, who were blinded to name, sex and age.
The images were evaluated independently by one medical radiologist (MR), one dentomaxillofacial radiologist (DR), two orthodontists (OR1 and OR2), one dentist (D) and one 3rd year orthodontic postgraduate student (OPS). The calibration procedure consisted of two sessions: (a) a theoretical session, which included a presentation of the CVM method, and (b) a clinical session. During the latter, the raters evaluated 40 images (not included in the sample). The results were analyzed, and the disagreements discussed [1,30]. In case of uncertainty, the earlier maturation stage between two consecutive stages was chosen. The rating sessions were performed no later than 5 days after the calibration session and under the same conditions on a diagnostic monitor. A time limit was not imposed on the duration of a rating session; however, a maximum of 30 images was allowed in each session in order to minimize errors due to rater fatigue. Each examiner re-evaluated all images after one month. The order of the images was randomly changed for this second rating.
Statistical analysis was performed using the STATISTICA 10 software package (TIBCO Software Inc., Palo Alto, CA, USA) and MedCalc software (v. 14.10.2) for MS Windows (Microsoft, Washington, DC, USA). p-values ≤ 0.05 were considered statistically significant. Inter-rater agreement was measured using the intraclass correlation coefficient and intra-rater agreement was assessed by Cohen's weighted kappa. The method's reliability was evaluated based on intra- and inter-rater agreement. A factorial (2 × 6) analysis of variance was used to investigate the correlation between the CVM stages and the chronological age, using mean Cervical Maturation Stage (CS; across the 6 ratings) and gender as grouping factors and chronological age as the dependent variable.
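As an illustration of the intra-rater statistic named above, the following is a minimal sketch of Cohen's weighted kappa for two rating sessions by the same rater. The study does not state which weighting scheme was used, so linear weights are assumed by default (quadratic weights are available through the hypothetical `power` argument). The function name and the example ratings are made up for illustration and are not taken from the study data.

```python
from collections import Counter


def weighted_kappa(first, second, n_stages=6, power=1):
    """Cohen's weighted kappa for two sets of stage ratings coded 1..n_stages.

    power=1 gives linear weights, power=2 quadratic weights (both are
    assumptions; the study does not report its weighting scheme).
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)

    def disagreement(i, j):
        # 0 for identical stages, 1 for the largest possible stage gap
        return (abs(i - j) / (n_stages - 1)) ** power

    pairs = Counter(zip(first, second))
    row_totals = Counter(first)
    col_totals = Counter(second)

    # Weighted disagreement actually observed between the two sessions
    observed = sum(disagreement(i, j) * c for (i, j), c in pairs.items()) / n
    # Weighted disagreement expected by chance from the marginal totals
    expected = sum(
        disagreement(i, j) * row_totals[i] * col_totals[j]
        for i in row_totals
        for j in col_totals
    ) / (n * n)
    if expected == 0:
        return float("nan")  # undefined when both sessions use one single stage
    return 1.0 - observed / expected


# Made-up example: one rater, ten images, two sessions one month apart
session_1 = [1, 2, 3, 3, 4, 4, 5, 5, 6, 6]
session_2 = [1, 2, 2, 3, 4, 5, 5, 5, 6, 6]
print(round(weighted_kappa(session_1, session_2), 3))
```

Weighting penalizes a CS2-versus-CS5 disagreement more than a CS2-versus-CS3 disagreement, which is why a weighted kappa can be high even when absolute agreement is modest.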
Furthermore, the potential of this CVM method in chronological age assessment was evaluated by general linear model regression analysis using age as the dependent variable and gender and mean CS as the predictor variables.
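The regression step described above can be sketched as an ordinary least-squares fit of age on a gender indicator and the mean CS. This is only an assumed reading of the model: the study does not report whether CS entered as a continuous or categorical predictor, and the records below are invented for illustration. The helper names `fit_ols` and `predict_age` are hypothetical.

```python
def fit_ols(design, response):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(r[i] * r[j] for r in design) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(design, response))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * z for x, z in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


def predict_age(coefs, male, mean_cs):
    """Predicted age from intercept, gender coefficient and CS coefficient."""
    intercept, b_male, b_cs = coefs
    return intercept + b_male * male + b_cs * mean_cs


# Invented records: (male indicator, mean CS, age in years)
records = [(0, 1.0, 8.9), (1, 1.5, 10.1), (0, 2.5, 11.0), (1, 3.0, 12.6),
           (0, 4.0, 13.7), (1, 4.5, 15.2), (0, 5.5, 16.4), (1, 6.0, 18.0)]
design = [[1.0, male, cs] for male, cs, _ in records]
ages = [age for _, _, age in records]
coefs = fit_ols(design, ages)
print([round(c, 2) for c in coefs])
```

In such a main-effects model the gender coefficient is the constant age offset between boys and girls at the same CS.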

Results
The sample included digital lateral cephalometric radiographs from 474 Caucasian subjects (mean age 13 Intra-rater and inter-rater agreement were assessed among all the examiners after the calibration procedure. Intra-rater agreement was high, ranging from 0.857 to 0.931, whereas intra-rater absolute agreement ranged from 77% (367 out of 474) to 87.3% (414 out of 474) (Table 1). Intra-rater absolute agreement by stage (CS) was calculated and CS3 was the least reproducible stage. Inter-rater reliability was high (0.90). Inter-rater agreement, ranging from absolute agreement to a 5-stage disagreement, was also assessed. Inter-rater absolute agreement was lower than 50% (Table 1). A mean age for each maturation stage was calculated for each gender separately as well as for the total sample. Mean chronological age increased as CVM stages progressed, and males reached CS3 and CS4 later than females (Table 2, Figure 2). The age confidence intervals among CS1, CS2, CS3, CS5 and CS6 appeared to overlap in each gender group as well as in the total sample. However, a clear distinction among CS3, CS4 and CS5 was observed in the total sample as well as in each gender group. Females aged ≤11.21 years and males aged ≤1.75 years belong to the first three CS. Females older than 15.72 years and males older than 16.17 years belong to CS5 or CS6.
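For readers unfamiliar with the inter-rater statistic, the sketch below computes one common form of the intraclass correlation coefficient, ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater). The study does not state which ICC form was used, so this choice is an assumption, and the small ratings matrix is invented.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per image, one column per rater.
    """
    n = len(ratings)        # number of images
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented stages for five images rated by six raters
example = [
    [1, 1, 2, 1, 1, 2],
    [3, 2, 3, 3, 4, 3],
    [4, 4, 4, 5, 4, 4],
    [5, 6, 5, 5, 5, 6],
    [6, 6, 6, 6, 5, 6],
]
print(round(icc_2_1(example), 3))
```

Because the ICC compares between-image variance with rater disagreement, it can remain high while absolute agreement is low, provided that disagreements are mostly between neighbouring stages.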
The data were analyzed by means of analysis of variance (gender F: 9.15, p: 0.002629, CS F: 159.43, p < 0.001, gender × CS F: 0.1, p: 0.992637). The LSD (Least Significant Difference) test showed that the differences in mean ages among the 6 CS stages were statistically significant and increased as the CS increased. The only exception was the pair CS2 and CS3, which did not show a statistically significant difference, although the difference was in the expected direction (Table 3). Linear model regression analysis was based on highly significant relationships (Table 4). The adjusted coefficient of determination (adjusted R² = 0.61) reveals that the regression line fits the data relatively well; gender and CS could explain 61% of the age variance in the sample. The accuracy of age prediction was rather high, with only 27 out of the 474 subjects (6%) appearing as outliers using the ±2SD approach. The majority of the 27 outliers (22/27, i.e., 81%) were ≥20 years old. The residuals were mostly positive, indicating a tendency to underestimate chronological age in the subjects.
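The two summary quantities used above, the adjusted coefficient of determination and the ±2SD outlier rule, can be sketched as follows. The function names are hypothetical, and the exact outlier procedure of the statistical package is not described in the text, so flagging residuals more than two standard deviations from their mean is an assumed reading. Only the counts quoted in the text (27, 474, 22) come from the study.

```python
from statistics import mean, stdev


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def flag_outliers(residuals, n_sd=2.0):
    """Indices of residuals lying more than n_sd standard deviations from their mean.

    Residuals are taken as observed age minus predicted age, so a positive
    residual means the model underestimated the subject's age.
    """
    centre = mean(residuals)
    spread = stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r - centre) > n_sd * spread]


# Shares quoted in the text, recomputed from the reported counts
outlier_share = 27 / 474   # about 0.057, reported as 6%
older_share = 22 / 27      # about 0.815, reported as 81%
print(round(100 * outlier_share, 1), round(100 * older_share, 1))
```

With 474 observations and only a few predictors, the adjustment is small, so the adjusted and unadjusted R² are nearly identical.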

Discussion
In 1886, Giacomini was the first to investigate the anatomy of cervical vertebrae and document Os odontoideum (OO), an anatomic anomaly of the dens of C2 [30]. Recently, the CVM method has gained popularity in orthodontics since the cervical vertebrae are often visible in the lateral cephalometric radiograph, which is part of the standard documentation used in orthodontic diagnosis [31][32][33][34][35][36][37][38]. The CVM method according to Baccetti was selected since it is a simple, widely used visual method, which focuses on the 2nd, 3rd and 4th cervical vertebrae [1–4,17,31–35]. However, the method's reproducibility has been called into question [2,26,28,36,37,39–42]. Some authors concluded that CVM is a reproducible and reliable method [42]; however, others concluded that the lower scores of CVM reproducibility may render this method too variable to be used as a strict clinical guideline [2]. In a recent systematic review, the reproducibility was rather high (98.6%), although another review stresses the need for further research on the reproducibility of this method [24,43].
The present study addressed the reproducibility issue and avoided common methodological flaws encountered in the literature. The sample was homogeneous, selected with strict inclusion and exclusion criteria. The rater team consisted of six raters with different educational backgrounds and clinical experience. Inter- and intra-rater agreement were both high. Raters were in absolute agreement or in only 1-stage-apart disagreement in 85% of the assessments. Additionally, at the re-evaluation each rater agreed with her/his previous assessment at least 3 out of 4 times. These findings point to the reproducibility of the CVM method. Consequently, the reliability of this CVM method appears to be high, since cervical staging assessment can be repeated with similar results, in agreement with previous studies [28,31,33,35,44].
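The agreement shares discussed above can be illustrated with a short sketch that tabulates the stage gap between two sets of ratings and the share of rater-pair comparisons within one stage. How the 85% figure was actually computed is not described, so the pairwise approach here is an assumption; the function names are hypothetical, and only the count 367 out of 474 comes from the Results.

```python
from collections import Counter
from itertools import combinations


def stage_gap_shares(ratings_a, ratings_b, n_stages=6):
    """Share of images at each absolute stage gap (index 0 = absolute agreement)."""
    gaps = Counter(abs(a - b) for a, b in zip(ratings_a, ratings_b))
    n = len(ratings_a)
    return [gaps.get(g, 0) / n for g in range(n_stages)]


def pairwise_within_one(ratings_by_rater):
    """Share of all rater-pair comparisons that differ by at most one stage."""
    close = total = 0
    for a, b in combinations(ratings_by_rater, 2):
        for x, y in zip(a, b):
            total += 1
            close += abs(x - y) <= 1
    return close / total


# Lowest intra-rater absolute agreement reported in the Results
lowest_intra = 367 / 474
print(round(100 * lowest_intra, 1))  # consistent with "at least 3 out of 4 times"
```

Reporting the full gap distribution, rather than a single percentage, makes it visible whether disagreements cluster between neighbouring stages.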
Absolute agreement among the examiners was lower than 50%, and the reproducibility of CS3, the stage that marks the beginning of pubertal growth, was the lowest among all stages [1]. These findings suggest that this CVM method may be unreliable for predicting the pubertal spurt [2,45]. A range of ±1 CS might be quite wide in orthodontics, but could be tolerated in other fields (e.g., forensics).
A recent morphometric study concluded that cervical vertebrae shape alone could not predict skeletal maturation better than chronologic age [20]. Growth and bone maturation are complex phenomena, influenced by several environmental and genetic factors [43]. Recently, a Y-shaped trabecular structure of the odontoid process has been identified. This structure appears to be a biomechanical response to the dynamic loading at the CV1-2 level [46]. Moreover, the posterior aspect of the cervical endplate possesses higher stiffness and yield load than the anterior endplate [47]. Different study designs with heterogeneous groups may yield different results [39,44,45]. Additionally, a slow and gradual transition between stages is expected and there can be morphological vertebrae variations that do not correspond to any of the six stages described above [48] (Figure 3).
Inherent limitations of the CVM method, such as subjectivity and the experience of raters, may further complicate the correct interpretation. The criteria for the identification of the concavity at the lower border of the vertebrae, as well as for the shape assessment, are not strict [28,41]. Subjectivity is mainly influenced by training, while experience seems to have little to no effect [41]. Recently, special software applications have been developed based on Deep Learning and Artificial Intelligence. These tools apply objective criteria for the assessment of the degree of CVM [49][50][51].
Age estimation is of major importance in forensic science [11,52,53]. Several methods have been proposed, even a prediction model for saliva samples [54]. The CVM method is not widely used in this field [55] since it is useful for age prediction only for younger age groups, mainly for CS1-CS4. Radiographs at CS5/CS6 may belong to adults, aged well beyond the predicted values of this study's equation. Despite this inherent limitation, an approximate age estimate could be obtained for subjects of both genders at CS1-CS4.
Based on the present regression equation, the predicted age for CS4 is 13.85 years and 14.49 years for girls and boys, respectively. Consequently, subjects with skeletal maturity up to and including CS4 will most likely be underage. This finding may provide a practical guide for the forensic identification or the determination of legal responsibility.
The thyroid gland is an extremely radiosensitive organ and the patient's exposure dose should be kept as low as reasonably achievable (ALARA). From this perspective, a collimated beam or a thyroid collar is more beneficial than any information gained. Thus, in cases where a skeletal maturity assessment is required, a hand-wrist radiograph might be preferable, in addition to an ALARA-compliant cephalometric radiograph [1,6,21,31–33,35,36]. Hand-wrist radiographs have presented several limitations; nevertheless, they have been used as a gold standard in the assessment of skeletal maturation for many decades [49].
During rating sessions and in cases of uncertainty, the earlier maturation stage between two consecutive stages was chosen. This represents a limitation of the present study and may have influenced the intra- and inter-rater agreement. Additionally, some radiographs were excluded due to incomplete visualization of C2-C4. This fact may limit the clinical application of the CVM method.

Conclusions
This CVM method presented high reliability in the present sample. The method's subjectivity is evident, since intra-rater absolute agreement was 77-87% while inter-rater absolute agreement was a little less than 50%. This method is unreliable for recognizing the pubertal spurt. Subjects with skeletal maturity up to and including CS4 will most likely be underage. The method provides a broad estimation of chronological age and should be used in combination with other indices if a more accurate estimation is required.