Accuracy of Artificial Intelligence for Cervical Vertebral Maturation Assessment—A Systematic Review
Abstract
1. Introduction
2. Materials and Methods
2.1. Search Strategy and Eligibility Criteria
2.2. Data Extraction and Quality Assessment
3. Results
3.1. Search Results
3.2. Risk of Bias
3.3. Methods of CVM Assessment and Reference Standards
3.4. AI Models
3.5. Diagnostic Accuracy
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chino, Y. AI in Medicine: Creating a Safe and Equitable Future. Lancet 2023, 402, 503.
- McNabb, N.K.; Christensen, E.W.; Rula, E.Y.; Coombs, L.; Dreyer, K.; Wald, C.; Treml, C. Projected Growth in FDA-Approved Artificial Intelligence Products Given Venture Capital Funding. J. Am. Coll. Radiol. 2024, 21, 617–623.
- Kunz, F.; Stellzig-Eisenhauer, A.; Boldt, J. Applications of Artificial Intelligence in Orthodontics—An Overview and Perspective Based on the Current State of the Art. Appl. Sci. 2023, 13, 3850.
- Abesi, F.; Jamali, A.S.; Zamani, M. Accuracy of Artificial Intelligence in the Detection and Segmentation of Oral and Maxillofacial Structures Using Cone-Beam Computed Tomography Images: A Systematic Review and Meta-Analysis. Pol. J. Radiol. 2023, 88, e256.
- Monill-González, A.; Rovira-Calatayud, L.; D’oliveira, N.G.; Ustrell-Torrent, J.M. Artificial intelligence in orthodontics: Where are we now? A scoping review. Orthod. Craniofacial Res. 2021, 24, 6–15.
- Nishimoto, S. Locating Cephalometric Landmarks with Multi-Phase Deep Learning. J. Dent. Health Oral Res. 2023, 4, 1–13.
- Singh, S.; Singh, M.; Saini, A.; Misra, V.; Sharma, V.; Singh, G. Timing of Myofunctional Appliance Therapy. J. Clin. Pediatr. Dent. 2010, 35, 233–240.
- Flores-Mir, C.; Nebbe, B.; Major, P.W. Use of skeletal maturation based on hand-wrist radiographic analysis as a predictor of facial growth: A systematic review. Angle Orthod. 2004, 74, 118–124.
- Baccetti, T.; Franchi, L.; McNamara, J.A. The Cervical Vertebral Maturation (CVM) Method for the Assessment of Optimal Treatment Timing in Dentofacial Orthopedics. Semin. Orthod. 2005, 11, 119–129.
- McNamara, J.A.; Bookstein, F.L.; Shaughnessy, T.G. Skeletal and dental changes following functional regulator therapy on class II patients. Am. J. Orthod. 1985, 88, 91–110.
- Khanagar, S.B.; Al-Ehaideb, A.; Vishwanathaiah, S.; Maganur, P.C.; Patil, S.; Naik, S.; Baeshen, H.A.; Sarode, S.S. Scope and performance of artificial intelligence technology in orthodontic diagnosis, treatment planning, and clinical decision-making—A systematic review. J. Dent. Sci. 2021, 16, 482–492.
- Kim, D.; Kim, J.; Kim, T.; Kim, T.; Kim, Y.; Song, I.; Ahn, B.; Choo, J.; Lee, D. Prediction of hand-wrist maturation stages based on cervical vertebrae images using artificial intelligence. Orthod. Craniofacial Res. 2021, 24, 68–75.
- Uysal, T.; Sari, Z.; Ramoglu, S.I.; Basciftci, F.A. Relationships between dental and skeletal maturity in Turkish subjects. Angle Orthod. 2004, 74, 657–664.
- Jourieh, A.; Khan, H.; Mheissen, S.; Assali, M.; Alam, M.K. The Correlation between Dental Stages and Skeletal Maturity Stages. BioMed Res. Int. 2021, 2021, 9986498.
- Morris, J.M.; Park, J.H. Correlation of Dental Maturity with Skeletal Maturity from Radiographic Assessment. J. Clin. Pediatr. Dent. 2012, 36, 309–314.
- Szemraj, A.; Wojtaszek-Słomińska, A.; Racka-Pilszak, B. Is the cervical vertebral maturation (CVM) method effective enough to replace the hand-wrist maturation (HWM) method in determining skeletal maturation?—A systematic review. Eur. J. Radiol. 2018, 102, 125–128.
- Kapetanović, A.; Oosterkamp, B.C.M.; Lamberts, A.A.; Schols, J.G.J.H. Orthodontic radiology: Development of a clinical practice guideline. Radiol. Medica 2021, 126, 72–82.
- Mituś-Kenig, M. Bone age assessment using cephalometric photographs. Pol. J. Radiol. 2013, 78, 19–25.
- Hassel, B.; Farman, A.G. Skeletal maturation evaluation using cervical vertebrae. Am. J. Orthod. Dentofac. Orthop. 1995, 107, 58–66.
- Baccetti, T.; Franchi, L.; McNamara, J.A., Jr. An improved version of the cervical vertebral maturation (CVM) method for the assessment of mandibular growth. Angle Orthod. 2002, 72, 316–323.
- Gray, S.; Bennani, H.; Farella, M. Authors’ response. Am. J. Orthod. Dentofac. Orthop. 2016, 150, 7–8.
- Nestman, T.S.; Marshall, S.D.; Qian, F.; Holton, N.; Franciscus, R.G.; Southard, T.E. Cervical vertebrae maturation method morphologic criteria: Poor reproducibility. Am. J. Orthod. Dentofac. Orthop. 2011, 140, 182–188.
- Sorantin, E.; Grasser, M.G.; Hemmelmayr, A.; Tschauner, S.; Hrzic, F.; Weiss, V.; Lacekova, J.; Holzinger, A. The augmented radiologist: Artificial intelligence in the practice of radiology. Pediatr. Radiol. 2022, 52, 2074–2086.
- Shaffer, K. Deep Learning and Lung Cancer: AI to Extract Information Hidden in Routine CT Scans. Radiology 2020, 296, 225–226.
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
- Higgins, J.P.T.; Thomas, J.; Chandler, J.; Cumpston, M.; Li, T.; Page, M.J.; Welch, V.A. Cochrane Handbook for Systematic Reviews of Interventions; Wiley: Hoboken, NJ, USA, 2019.
- Amir-Behghadami, M.; Janati, A. Population, Intervention, Comparison, Outcomes and Study (PICOS) design as a framework to formulate eligibility criteria in systematic reviews. Emerg. Med. J. 2020, 37, 387.
- Whiting, P.F.; Rutjes, A.W.S.; Westwood, M.E.; Mallett, S.; Deeks, J.J.; Reitsma, J.B.; Leeflang, M.M.G.; Sterne, J.A.C.; Bossuyt, P.M.M.; QUADAS-2 Group. QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies. Ann. Intern. Med. 2011, 155, 529–536.
- Akay, G.; Akcayol, M.A.; Özdem, K.; Güngör, K. Deep convolutional neural network—The evaluation of cervical vertebrae maturation. Oral Radiol. 2023, 39, 629–638.
- Amasya, H.; Yildirim, D.; Aydogan, T.; Kemaloglu, N.; Orhan, K. Cervical vertebral maturation assessment on lateral cephalometric radiographs using artificial intelligence: Comparison of machine learning classifier models. Dentomaxillofacial Radiol. 2020, 49, 20190441.
- Amasya, H.; Cesur, E.; Yıldırım, D.; Orhan, K. Validation of cervical vertebral maturation stages: Artificial intelligence vs human observer visual analysis. Am. J. Orthod. Dentofac. Orthop. 2020, 158, e173–e179.
- Atici, S.F.; Ansari, R.; Allareddy, V.; Suhaym, O.; Cetin, A.E.; Elnagar, M.H. Fully automated determination of the cervical vertebrae maturation stages using deep learning with directional filters. PLoS ONE 2022, 17, e0269198.
- Atici, S.F.; Ansari, R.; Allareddy, V.; Suhaym, O.; Cetin, A.E.; Elnagar, M.H. AggregateNet: A deep learning model for automated classification of cervical vertebrae maturation stages. Orthod. Craniofacial Res. 2023, 26, 111–117.
- Khazaei, M.; Mollabashi, V.; Khotanlou, H.; Farhadian, M. Automatic determination of pubertal growth spurts based on the cervical vertebral maturation staging using deep convolutional neural networks. J. World Fed. Orthod. 2023, 12, 56–63.
- Kim, E.-G.; Oh, I.-S.; So, J.-E.; Kang, J.; Le, V.N.T.; Tak, M.-K.; Lee, D.-W. Estimating Cervical Vertebral Maturation with a Lateral Cephalogram Using the Convolutional Neural Network. J. Clin. Med. 2021, 10, 5400.
- Kök, H.; Acilar, A.M.; İzgi, M.S. Usage and comparison of artificial intelligence algorithms for determination of growth and development by cervical vertebrae stages in orthodontics. Prog. Orthod. 2019, 20, 41.
- Kök, H.; Izgi, M.S.; Acilar, A.M. Evaluation of the Artificial Neural Network and Naive Bayes Models Trained with Vertebra Ratios for Growth and Development Determination. Turk. J. Orthod. 2021, 34, 2–9.
- Kök, H.; Izgi, M.S.; Acilar, A.M. Determination of growth and development periods in orthodontics with artificial neural network. Orthod. Craniofacial Res. 2021, 24, 76–83.
- Li, H.; Chen, Y.; Wang, Q.; Gong, X.; Lei, Y.; Tian, J.; Gao, X. Convolutional neural network-based automatic cervical vertebral maturation classification method. Dentomaxillofacial Radiol. 2022, 51, 20220070.
- Li, H.; Li, H.; Yuan, L.; Liu, C.; Xiao, S.; Liu, Z.; Zhou, G.; Dong, T.; Ouyang, N.; Liu, L.; et al. The psc-CVM assessment system: A three-stage type system for CVM assessment based on deep learning. BMC Oral Health 2023, 23, 557.
- Makaremi, M.; Lacaule, C.; Mohammad-Djafari, A. Deep Learning and Artificial Intelligence for the Determination of the Cervical Vertebra Maturation Degree from Lateral Radiography. Entropy 2019, 21, 1222.
- Mohammad-Rahimi, H.; Motamadian, S.R.; Nadimi, M.; Hassanzadeh-Samani, S.; Minabi, M.A.S.; Mahmoudinia, E.; Lee, V.Y.; Rohban, M.H. Deep learning for the classification of cervical maturation degree and pubertal growth spurts: A pilot study. Korean J. Orthod. 2022, 52, 112–122.
- Radwan, M.T.; Sin, Ç.; Akkaya, N.; Vahdettin, L. Artificial intelligence-based algorithm for cervical vertebrae maturation stage assessment. Orthod. Craniofacial Res. 2023, 26, 349–355.
- Seo, H.; Hwang, J.; Jeong, T.; Shin, J. Comparison of Deep Learning Models for Cervical Vertebral Maturation Stage Classification on Lateral Cephalometric Radiographs. J. Clin. Med. 2021, 10, 3591.
- Seo, H.; Hwang, J.; Jung, Y.-H.; Lee, E.; Nam, O.H.; Shin, J. Deep focus approach for accurate bone age estimation from lateral cephalogram. J. Dent. Sci. 2023, 18, 34–43.
- Zhou, J.; Zhou, H.; Pu, L.; Gao, Y.; Tang, Z.; Yang, Y.; You, M.; Yang, Z.; Lai, W.; Long, H. Development of an Artificial Intelligence System for the Automatic Evaluation of Cervical Vertebral Maturation Status. Diagnostics 2021, 11, 2200.
- Perinetti, G.; Caprioglio, A.; Contardo, L. Visual assessment of the cervical vertebral maturation stages: A study of diagnostic accuracy and repeatability. Angle Orthod. 2014, 84, 951–956.
- Khanagar, S.B.; Al-Ehaideb, A.; Maganur, P.C.; Vishwanathaiah, S.; Patil, S.; Baeshen, H.A.; Sarode, S.C.; Bhandi, S. Developments, application, and performance of artificial intelligence in dentistry—A systematic review. J. Dent. Sci. 2021, 16, 508–522.
- Lee, B.-D.; Lee, M.S. Automated Bone Age Assessment Using Artificial Intelligence: The Future of Bone Age Assessment. Korean J. Radiol. 2021, 22, 792–800.
- Nguyen, T.; Hermann, A.-L.; Ventre, J.; Ducarouge, A.; Pourchot, A.; Marty, V.; Regnard, N.-E.; Guermazi, A. High performance for bone age estimation with an artificial intelligence solution. Diagn. Interv. Imaging 2023, 104, 330–336.
- Rana, S.S.; Nath, B.; Chaudhari, P.K.; Vichare, S. Cervical Vertebral Maturation Assessment using various Machine Learning techniques on Lateral cephalogram: A systematic literature review. J. Oral Biol. Craniofacial Res. 2023, 13, 642–651.
- Mathew, R.; Palatinus, S.; Padala, S.; Alshehri, A.; Awadh, W.; Bhandi, S.; Thomas, J.; Patil, S. Neural networks for classification of cervical vertebrae maturation: A systematic review. Angle Orthod. 2022, 92, 796–804.
- Han, X.; Zhang, Z.; Ding, N.; Gu, Y.; Liu, X.; Huo, Y.; Qiu, J.; Yao, Y.; Zhang, A.; Zhang, L.; et al. Pre-trained models: Past, present and future. AI Open 2021, 2, 225–250.
- Schoretsaniti, L.; Mitsea, A.; Karayianni, K.; Sifakakis, I. Cervical Vertebral Maturation Method: Reproducibility and Efficiency of Chronological Age Estimation. Appl. Sci. 2021, 11, 3160.
- Perinetti, G.; Contardo, L. Reliability of Growth Indicators and Efficiency of Functional Treatment for Skeletal Class II Malocclusion: Current Evidence and Controversies. Biomed. Res. Int. 2017, 2017, 1367691.
- Gabriel, D.B.; Southard, K.A.; Qian, F.; Marshall, S.D.; Franciscus, R.G.; Southard, T.E. Cervical vertebrae maturation method: Poor reproducibility. Am. J. Orthod. Dentofac. Orthop. 2009, 136, 478.e1–478.e7.
- Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56.
- Obuchowski, N.A.; Bullen, J. Multireader Diagnostic Accuracy Imaging Studies: Fundamentals of Design and Analysis. Radiology 2022, 303, 26–34.
- Obuchowski, N.A.; Rockette, H.E. Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: An ANOVA approach with dependent observations. Commun. Stat. Simul. Comput. 1995, 24, 285–308.
- Bajjad, A.A.; Gupta, S.; Agarwal, S.; Pawar, R.A.; Kothawade, M.U.; Singh, G. Use of artificial intelligence in determination of bone age of the healthy individuals: A scoping review. J. World Fed. Orthod. 2023, 13, 95–102.
- Cao, L.; He, H.; Hua, F. Current neural networks demonstrate potential in automated cervical vertebral maturation stage classification based on lateral cephalograms. J. Évid. Based Dent. Pr. 2024, 24, 101928.
- Mohammad-Rahimi, H.; Nadimi, M.; Rohban, M.H.; Shamsoddin, E.; Lee, V.Y.; Motamedian, S.R. Machine learning and orthodontics, current trends and the future opportunities: A scoping review. Am. J. Orthod. Dentofac. Orthop. 2021, 160, 170–192.e4.
- Caloro, E.; Cè, M.; Gibelli, D.; Palamenghi, A.; Martinenghi, C.; Oliva, G.; Cellina, M. Artificial Intelligence (AI)-Based Systems for Automatic Skeletal Maturity Assessment through Bone and Teeth Analysis: A Revolution in the Radiological Workflow? Appl. Sci. 2023, 13, 3860.
- Goedmakers, C.; Pereboom, L.; Schoones, J.; de Leeuw den Bouter, M.; Remis, R.; Staring, M.; Vleggeert-Lankamp, C. Machine learning for image analysis in the cervical spine: Systematic review of the available models and methods. Brain Spine 2022, 2, 101666.
- Kim, J.; Seo, H.; Park, S.; Lee, E.; Jeong, T.; Nam, O.H.; Choi, S.; Shin, J. Utilization of an Artificial Intelligence Program Using the Greulich-Pyle Method to Evaluate Bone Age in the Skeletal Maturation Stage. J. Korean Acad. Pediatr. Dent. 2023, 50, 89–103.
- Xie, L.; Tang, W.; Izadikhah, I.; Zhao, Z.; Zhao, Y.; Li, H.; Yan, B. Development of a multi-stage model for intelligent and quantitative appraising of skeletal maturity using cervical vertebras cone-beam CT images of Chinese girls. Int. J. Comput. Assist. Radiol. Surg. 2022, 17, 761–773.
- Moztarzadeh, O.; Jamshidi, M.; Sargolzaei, S.; Keikhaee, F.; Jamshidi, A.; Shadroo, S.; Hauer, L. Metaverse and Medical Diagnosis: A Blockchain-Based Digital Twinning Approach Based on MobileNetV2 Algorithm for Cervical Vertebral Maturation. Diagnostics 2023, 13, 1485.
- Serrador, L.; Villani, F.P.; Moccia, S.; Santos, C.P. Knowledge distillation on individual vertebrae segmentation exploiting 3D U-Net. Comput. Med Imaging Graph. 2024, 113, 102350.
- Gonca, M.; Sert, M.F.; Gunacar, D.N.; Kose, T.E.; Beser, B. Determination of growth and developmental stages in hand–wrist radiographs. J. Orofac. Orthop. Fortschritte Kieferorthopädie 2024, 1–15.
- Uys, A.; Steyn, M.; Botha, D. Decision tree analysis for age estimation in living individuals: Integrating cervical and dental radiographic evaluations within a South African population. Int. J. Leg. Med. 2024, 138, 951–959.
- Alfawzan, A. Assessment of Skeletal Maturity in a Sample of the Saudi Population Using Cervical Vertebrae and Frontal Sinus Index: A Cephalometric Study Using Artificial Intelligence. Cureus J. Med. Sci. 2023, 15, e41811.
- Cameriere, R.; Palacio, L.A.V.; Nakaš, E.; Galić, I.; Brkić, H.; Govorko, D.K.; Jerković, D.; Jara, L.; Ferrante, L. The Fourth Cervical Vertebra Anterior and Posterior Body Height Projections (Vba) for the Assessment of Pubertal Growth Spurt. Appl. Sci. 2023, 13, 1819.
- Gulsahi, A.; Çehreli, S.B.; Galić, I.; Ferrante, L.; Cameriere, R. Age estimation in Turkish children and young adolescents using fourth cervical vertebra. Int. J. Leg. Med. 2020, 134, 1823–1829.
- Liao, N.; Dai, J.; Tang, Y.; Zhong, Q.; Mo, S. iCVM: An Interpretable Deep Learning Model for CVM Assessment Under Label Uncertainty. IEEE J. Biomed. Health Inform. 2022, 26, 4325–4334.
- Perinetti, G.; Perillo, L.; Franchi, L.; Di Lenarda, R.; Contardo, L. Maturation of the middle phalanx of the third finger and cervical vertebrae: A comparative and diagnostic agreement study. Orthod. Craniofacial Res. 2014, 17, 270–279.
- Suri, A.; Jones, B.C.; Ng, G.; Anabaraonye, N.; Beyrer, P.; Domi, A.; Choi, G.; Tang, S.; Terry, A.; Leichner, T.; et al. A deep learning system for automated, multi-modality 2D segmentation of vertebral bodies and intervertebral discs. Bone 2021, 149, 115972.
Study No | Author, Year | Country | Sample Size; (Training/Test Ratio) [%] | Tested AI Model | Reference Standard | CVM Method Used | Outcome |
---|---|---|---|---|---|---|---|
1 | Akay G. et al., 2023 [29] | Turkey | 588; (60/40) | SL; CNNs, newly trained models | Two radiologists | Hassel-Farman | After 40 epochs of training, the model reached 58% training accuracy and 57% test accuracy, so test performance was very close to training performance. Precision and F1-score were highest for CVM Stage 1, and recall was highest for CVM Stage 2. |
2 | Amasya et al., 2020 [30] | Turkey | 647; (80/20) | ANN, decision tree, logistic regression, RF, and SVM | Two experts | Baccetti et al. | The results of interobserver agreement assessment between AI and ANN showed CVM stage classifier models with substantial to almost perfect agreement (weighted kappa 0.76–0.92). |
3 | Amasya et al., 2020 [31] | Turkey | 647 + 72 (90/10) | Clinical decision support system (CDSS), ANN | Four observers | Baccetti et al. | Intraobserver agreement ranges were as follows: weighted kappa (wk) = 0.92–0.98, Cohen’s kappa (ck) = 0.65–0.85, and 70.8–87.5%. Interobserver agreement ranges were as follows: wk = 0.76–0.92, ck = 0.4–0.65, and 50–72.2%. Agreement between the ANN model and observers 1, 2, 3, and 4 was as follows: wk = 0.85 (ck = 0.52, 59.7%), wk = 0.8 (ck = 0.4, 50%), wk = 0.87 (ck = 0.55, 62.5%), and wk = 0.91 (ck = 0.53, 61.1%), respectively (p = 0.001). An average of 58.3% agreement was observed between the ANN model and the human observers. |
4 | Atici S. et al., 2022 [32] | USA | 1018; (70/30) | Unsupervised learning; Label distribution learning, DL; newly trained model | One orthodontist | McNamara, Franchi & Baccetti | The proposed CNN model preceded by a layer of tunable directional filters achieved a validation accuracy of 84.63% in CVM stage classification into five classes, exceeding the accuracy achieved with the other DL models investigated. The custom-designed CNN method also achieved 75.11% in six-class CVM stage classification. The effectiveness of the directional filters is reflected in the improved performance attained in the results. |
5 | Atici S. et al., 2023 [33] | USA | 1018; (80/20) | SL; CNNs, newly trained models | Two orthodontists | Baccetti et al. | The proposed innovative model, which uses a parallel structured network preceded by a preprocessing layer of edge enhancement filters, achieved a validation accuracy of 82.35% in CVM stage classification on female subjects, and 75.0% in CVM stage classification on male subjects, exceeding the accuracy achieved with the other DL models investigated. The effectiveness of the directional filters is reflected in the improved performance attained in the results. If AggregateNet is used without directional filters, the test accuracy decreases to 80.0% on female subjects and to 74.03% on male subjects. |
6 | Khazaei M. et al., 2023 [34] | Iran | 1846; (80/20) | SL; CNNs, newly trained models | One orthodontist, twice in one-month interval | Baccetti et al. | The CNN based on the ConvNeXtBase-296 architecture had the highest accuracy for automatically assessing pubertal growth spurts based on CVM staging in both three-class (82% accuracy) and two-class (93% accuracy) scenarios. Given the limited amount of data available for training the target networks for most of the architectures in use, transfer learning improves predictive performance. |
7 | Kim E.G. et al., 2021 [35] | Korea | 600; (80/20) | SL; CNNs, pretrained and newly trained models | Two specialists | McNamara & Franchi | The combination of the CNN with a region-of-interest detector and segmentor module was significantly more accurate (62.5%) than without them. |
8 | Kök H. et al., 2019 [36] | Turkey | 300; (80/20) | SL; k-NN, NB, decision tree, ANN, SVM, RF and logistic regression; pretrained models | One orthodontist, twice in one-month interval | Hassel & Farman | ANN had the second highest and most stable accuracy values in CVM assessment (68.8–93% for stages 1–4 and 6), except for CVS5 (47.4%). |
9 | Kök H. et al., 2021 [37] | Turkey | 360; (80/20 and 70/30) | SL; NNM and NBM; newly trained models | One orthodontist, twice at 15-day interval | Hassel & Farman | The highest determination success rate was obtained in NNM 3 (0.95) and the lowest in NBM 4 (0.50). The determination success of NBM 1 and NBM 3 was almost similar (0.60). The success of NNM 2 did not differ much from that of NNM 1 (0.94). The determination success of stage 5 was relatively lower than the others in NNM 1 and NNM 2 (0.83). The NNMs were more successful than the NBMs in our developed models. It is important to determine the effective ratio and/or measurements that will be useful for differentiation. |
10 | Kök H. et al., 2021 [38] | Turkey | 419; (70/30) | SL; ANN; newly trained models | One orthodontist, twice at 20-day interval | Hassel & Farman | Significantly positive correlations between hand-wrist maturation level, CVS and ages. ANN-7 model accuracy value was 0.9427. The highest model accuracy of 0.8687 with least linear measurements was obtained by drawing 13 linear measurements, using vertical measurements and indents. The growth development periods and gender were determined from CVM using ANN successfully. |
11 | Li H. et al., 2022 [39] | China | 6079; (70/30) | SL; CNNs, newly trained models | Two orthodontists | McNamara | The final classification accuracy ranking was ResNet152 > DenseNet161 > GoogLeNet > VGG16, as evaluated on the test set. ResNet152 proved to be the best model among the four models for CVM classification with a weighted κ of 0.826, an average AUC of 0.933 and total accuracy of 67.06%. The F1 score rank for each subgroup was: CS6 > CS1 > CS4 > CS5 > CS3 > CS2. The areas of the third (C3) and fourth (C4) cervical vertebrae were activated when CNNs were assessing the images. |
12 | Li H. et al., 2023 [40] | China | 10,200; (70/30) | SL; CNNs, newly trained models | Three orthodontists | Baccetti et al. | The system has achieved good performance for CVM assessment with an average AUC (the area under the curve) of 0.94 and total accuracy of 70.42%, as evaluated on the test set. The Cohen’s kappa between the system and the expert panel is 0.645. The weighted kappa between the system and the expert panel is 0.844. The overall ICC between the psc-CVM assessment system and the expert panel was 0.946. The F1 score rank for the psc-CVM assessment system was: CVS (cervical vertebral maturation stage) 6 > CVS1 > CVS4 > CVS5 > CVS3 > CVS2. |
13 | Makaremi M. et al., 2019 [41] | France | 1870; (80/20) | SL; CNN; pretrained and newly trained models | One radiographer | McNamara & Franchi | The results show the performances of the proposed method with different numbers of images for training, evaluation and testing and different preprocessing of the datasets. The highest accuracy (0.967–1.0) was achieved with 1870 images used for training and entropic filtering. |
14 | Mohammad-Rahimi H. et al., 2022 [42] | Iran | 890; (70/30) | SL; Transfer learning models; pretrained and newly trained for two datasets. | Two orthodontists | McNamara & Franchi | ResNet101 showed best performance. Six-class CVM diagnosis in ResNet101 model showed validation and test accuracy of 62.63% and 61.62%, respectively. With three-class classification, the model’s validation and test accuracy were 75.76% and 82.83%, respectively. |
15 | Radwan M.T. et al., 2023 [43] | Turkey | 1501; (80/20) | SL; CNNs, newly trained models | One orthodontist | Baccetti et al. (3 stages) | The ICC was 0.973 and the weighted Cohen’s kappa (± standard error) was 0.870 ± 0.027, which shows high reliability of the observers and an excellent level of agreement between them. The segmentation network achieved a global accuracy of 0.99, and the average Dice score over all images was 0.93. The classification network achieved an accuracy of 0.802, with per-class sensitivity of 0.78 (prepubertal), 0.45 (pubertal), and 0.98 (postpubertal), and per-class specificity of 0.94 (prepubertal), 0.94 (pubertal), and 0.75 (postpubertal). |
16 | Seo H. et al., 2021 [44] | Korea | 600; (80/20) | SL; CNNs, pretrained and newly trained models | One radiologist | Baccetti et al. | Of all the tested AI models, a pretrained network, Inception-ResNet-v2, had the highest accuracy of 0.941. It also had the highest recall and precision scores among all pretrained models tested. |
17 | Seo H. et al., 2023 [45] | Korea | 600; (80/20) | SL; CNNs, newly trained models | Not mentioned | Baccetti et al. | All deep learning models demonstrated more than 90% accuracy, with Inception-ResNet-v2 performing relatively the best. In addition, Grad-CAM visualization showed that each deep learning model focused primarily on the cervical vertebrae and surrounding structures. |
18 | Zhou J. et al., 2021 [46] | China | 1080; (90/10) | SL; CNNs, newly trained models | Two examiners; disagreements resolved by third expert | Baccetti et al. | In general, the agreement between AI results and the gold standard was good, with the intraclass correlation coefficient (ICC) value being up to 98%. Moreover, the accuracy of CVM staging was 71%. In terms of F1 score, the CS6 stage ranked highest (85%). |
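
Several outcomes in the table above pair a high weighted kappa (0.76–0.92) with a much lower unweighted Cohen's kappa (0.4–0.65) for the same ratings. The minimal Python sketch below illustrates why such a gap is expected on an ordinal six-stage scale: weighted kappa gives partial credit when two raters differ by only one stage. The count matrix `hypothetical_counts` is invented for illustration, and the weighting scheme used by each study is not stated here, so both the linear and quadratic variants are shown as assumptions.

```python
from typing import Sequence


def cohen_kappa(counts: Sequence[Sequence[int]], weighting: str = "none") -> float:
    """Cohen's kappa from a square count matrix (rows: rater A, columns: rater B).

    weighting is "none" (unweighted), "linear", or "quadratic".
    """
    if weighting not in ("none", "linear", "quadratic"):
        raise ValueError("weighting must be 'none', 'linear', or 'quadratic'")
    k = len(counts)
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def penalty(i: int, j: int) -> float:
        # Disagreement weight: 0 on the diagonal, growing with stage distance.
        if weighting == "none":
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weighting == "linear" else distance ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(penalty(i, j) * counts[i][j] for i, j in cells) / total
    expected = sum(penalty(i, j) * row_totals[i] * col_totals[j] for i, j in cells) / total ** 2
    if expected == 0:
        return 1.0  # Degenerate case: chance alone predicts no disagreement.
    return 1.0 - observed / expected


# Hypothetical ratings of 72 radiographs on six stages (not data from any study).
# Every disagreement is between adjacent stages.
hypothetical_counts = [
    [8, 4, 0, 0, 0, 0],
    [3, 6, 3, 0, 0, 0],
    [0, 3, 6, 3, 0, 0],
    [0, 0, 3, 6, 3, 0],
    [0, 0, 0, 3, 6, 3],
    [0, 0, 0, 0, 4, 8],
]

for scheme in ("none", "linear", "quadratic"):
    print(f"{scheme:>9}: kappa = {cohen_kappa(hypothetical_counts, scheme):.3f}")
```

With this made-up matrix the unweighted kappa is about 0.47, while the linear and quadratic weighted values are about 0.77 and 0.92, so the two statistics should not be compared directly across studies.
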
Authors/Year | Risk of Bias: Patient Selection | Risk of Bias: Index Test | Risk of Bias: Reference Standard | Risk of Bias: Flow and Timing | Applicability Concerns: Patient Selection | Applicability Concerns: Index Test | Applicability Concerns: Reference Standard |
---|---|---|---|---|---|---|---|
Akay G. et al., 2023 [29] | Unclear | Low | Low | Low | Unclear | Low | Low |
Amasya et al., 2020 [31] | Unclear | Low | Low | Low | Unclear | Low | Low |
Amasya et al., 2020 [30] | Unclear | Low | Low | Low | Unclear | Low | Low |
Atici S. et al., 2022 [32] | Low | Low | Low | Low | Low | Low | Low |
Atici S. et al., 2023 [33] | Low | Low | Low | Low | Low | Low | Low |
Khazaei M. et al., 2023 [34] | Unclear | Unclear | Low | Low | Unclear | Low | Low |
Kim E.G. et al., 2021 [35] | Unclear | Low | Low | Low | Unclear | Low | Low |
Kök H. et al., 2019 [36] | Unclear | Unclear | Low | Low | Unclear | Low | Low |
Kök H. et al., 2021 [37] | Low | Unclear | Low | Low | Low | Low | Low |
Kök H. et al., 2021 [38] | Unclear | Unclear | Low | Low | Unclear | Low | Low |
Li H. et al., 2022 [39] | Unclear | Unclear | Low | Low | Unclear | Low | Low |
Li H. et al., 2023 [40] | Low | Unclear | Low | Low | Low | Low | Low |
Makaremi M. et al., 2019 [41] | High | Unclear | Low | Unclear | High | Unclear | Low |
Mohammad-Rahimi H. et al., 2022 [42] | Unclear | Low | Low | Low | Unclear | Low | Low |
Radwan M.T. et al., 2023 [43] | Low | Low | Low | Low | Low | Low | Low |
Seo H. et al., 2021 [44] | Low | Unclear | Low | Low | Low | Low | Low |
Seo H. et al., 2023 [45] | Unclear | Unclear | High | Low | Unclear | High | Low |
Zhou J. et al., 2021 [46] | Unclear | Low | Low | Low | Unclear | Low | Low |
Study No | Author, Year | Tested AI Model | Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 | Stage 6 | Pooled Accuracy |
---|---|---|---|---|---|---|---|---|---|
1 | Akay G. et al., 2023 [29] | CNN (40 epochs) | Precision 0.82; Recall 0.7; F1-score 0.76 | Precision 0.47; Recall 0.74; F1-score 0.57 | Precision 0.64; Recall 0.58; F1-score 0.61 | Precision 0.52; Recall 0.54; F1-score 0.53 | Precision 0.55; Recall 0.37; F1-score 0.44 | Precision 0.52; Recall 0.60; F1-score 0.56 | 0.57 |
2 | Atici S. et al., 2022 [32] | CNN, images prefiltered | Precision 0.599; Recall 0.528; F1-score 0.561 | Precision 0.55; Recall 0.562; F1-score 0.556 | Precision 0.671; Recall 0.774; F1-score 0.719 | Precision 0.724; Recall 0.758; F1-score 0.741 | Precision 0.765; Recall 0.685; F1-score 0.723 | Precision 0.789; Recall 0.747; F1-score 0.767 | 0.8463 |
3 | Atici S. et al., 2023 [33] | AggregateNet with a set of tunable directional edge enhancers, CNN model | | | | | | | Female 0.824, Male 0.75 |
4 | Kim E.G. et al., 2021 [35] | Model-3, CNN | | | | | | | 0.625 |
5 | Kök H. et al., 2019 [36] | Decision tree | Accuracy 0.97; Precision 0.93; Recall 0.97; F1-score 0.97 | Accuracy 0.96; Precision 0.89; Recall 0.83; F1-score 0.86 | Accuracy 0.9; Precision 0.68; Recall 0.71; F1-score 0.7 | Accuracy 0.85; Precision 0.55; Recall 0.51; F1-score 0.53 | Accuracy 0.87; Precision 0.47; Recall 0.5; F1-score 0.48 | Accuracy 0.91; Precision 0.78; Recall 0.78; F1-score 0.78 | NA |
6 | Kök H. et al., 2021 [37] | NNM 3 (70–30%) | Precision 1.0; Recall 1.0; F1-score 1.0 | Precision 0.95; Recall 0.95; F1-score 0.95 | Precision 0.93; Recall 0.93; F1-score 0.93 | Precision 0.95; Recall 1.0; F1-score 0.98 | Precision 0.83; Recall 0.83; F1-score 0.83 | Precision 0.95; Recall 0.90; F1-score 0.92 | 0.95 |
7 | Kök H. et al., 2021 [38] | ANN-7 model | Specificity 0.954; Sensitivity (Recall) 0.914; F1-score 0.8533 | Specificity 0.957; Sensitivity (Recall) 0.7; F1-score 0.7313 | Specificity 0.9628; Sensitivity (Recall) 0.8695; F1-score 0.845 | Specificity 0.9628; Sensitivity (Recall) 0.7428; F1-score 0.7703 | Specificity 0.9140; Sensitivity (Recall) 0.6571; F1-score 0.6301 | Specificity 0.9512; Sensitivity (Recall) 0.6285; F1-score 0.6717 | 0.9427 |
8 | Li H. et al., 2022 [39] | ResNet152 | Precision 0.74; Recall 0.79; F1-score 0.77 | Precision 0.52; Recall 0.52; F1-score 0.52 | Precision 0.59; Recall 0.56; F1-score 0.58 | Precision 0.73; Recall 0.66; F1-score 0.69 | Precision 0.66; Recall 0.64; F1-score 0.65 | Precision 0.77; Recall 0.84; F1-score 0.81 | 0.6706 |
9 | Li H. et al., 2023 [40] | Psc-CVM | Precision 0.8559; Recall 0.7509; F1-score 0.8000 | Precision 0.5704; Recall 0.6335; F1-score 0.6003 | Precision 0.6067; Recall 0.6639; F1-score 0.6340 | Precision 0.7510; Recall 0.6592; F1-score 0.7021 | Precision 0.6760; Recall 0.7137; F1-score 0.6943 | Precision 0.8185; Recall 0.8117; F1-score 0.8151 | 0.704 |
10 | Makaremi M. et al., 2019 [41] | NN, 900 images, 7 layers | Accuracy 0.93; Precision 0.99; Recall 0.67; F1-score 0.8 | Accuracy 0.939; Precision 0.94; Recall 0.73; F1-score 0.82 | Accuracy 0.952; Precision 0.94; Recall 0.81; F1-score 0.87 | Accuracy 0.924; Precision 0.59; Recall 0.99; F1-score 0.74 | Accuracy 0.966; Precision 0.84; Recall 0.93; F1-score 0.88 | Accuracy 0.969; Precision 0.97; Recall 0.88; F1-score 0.92 | NA |
11 | Mohammad-Rahimi H. et al., 2022 [42] | ResNet-101 (test set) | Precision 0.6; Recall 0.6; F1-score 0.6 | Precision 0.64; Recall 0.70; F1-score 0.67 | Precision 0.25; Recall 0.33; F1-score 0.29 | Precision 0.52; Recall 0.60; F1-score 0.56 | Precision 0.67; Recall 0.57; F1-score 0.61 | Precision 0.88; Recall 0.78; F1-score 0.82 | 0.6162 |
12 | Seo H. et al., 2021 [44] | Inception-ResNet-v2 | 0.941 | | | | | | |
13 | Seo H. et al., 2023 [45] | Inception-ResNet-v2 | 0.956 | | | | | | |
14 | Zhou J. et al., 2021 [46] | CNN | Precision 0.67; Recall 0.92; F1-score 0.77 | Precision 1.0; Recall 0.36; F1-score 0.53 | Precision 0.25; Recall 0.4; F1-score 0.31 | Precision 0.83; Recall 0.63; F1-score 0.71 | Precision 0.46; Recall 1.0; F1-score 0.63 | Precision 1.0; Recall 0.74; F1-score 0.85 | 0.71 |
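
The per-stage cells above report precision, recall, and F1-score together. As a reading aid, the following minimal sketch shows the standard relationship between the three, with F1 as the harmonic mean of precision and recall. The function name `f1_score` and the `reported` dictionary are illustrative assumptions, not part of any included study; the numbers are copied from the first per-stage column of three rows in the table. Because the tabulated values are rounded, a recomputed F1 can be expected to match only to within about 0.01.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) taken from the first per-stage column
# of three table rows; the dictionary itself is a hypothetical helper.
reported = {
    "Akay G. et al., 2023 [29]": (0.82, 0.70, 0.76),
    "Kök H. et al., 2021 [37], NNM 3": (1.0, 1.0, 1.0),
    "Mohammad-Rahimi H. et al., 2022 [42]": (0.6, 0.6, 0.6),
}

for study, (precision, recall, f1_reported) in reported.items():
    f1_computed = f1_score(precision, recall)
    # Inputs are rounded to two decimals, so allow a small tolerance.
    consistent = abs(f1_computed - f1_reported) <= 0.01
    print(f"{study}: computed F1 = {f1_computed:.3f}, "
          f"reported = {f1_reported}, consistent = {consistent}")
```

The same check could be run over any row that lists both precision and recall. Rows reporting sensitivity and specificity instead would need the underlying confusion matrix to recover precision, which the table does not provide.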
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kazimierczak, W.; Jedliński, M.; Issa, J.; Kazimierczak, N.; Janiszewska-Olszowska, J.; Dyszkiewicz-Konwińska, M.; Różyło-Kalinowska, I.; Serafin, Z.; Orhan, K. Accuracy of Artificial Intelligence for Cervical Vertebral Maturation Assessment—A Systematic Review. J. Clin. Med. 2024, 13, 4047. https://doi.org/10.3390/jcm13144047