Accuracy of Interproximal Space Measurement Across Different Orthodontic Tooth-Segmentation Programs: A Comparative Clinical Study

Choi, Tae-Hyun; Kim, So-Yeon; Lee, Nam-Ki

doi:10.3390/app16115497


by Tae-Hyun Choi ^*, So-Yeon Kim and Nam-Ki Lee

Department of Orthodontics, Section of Dentistry, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea

^* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(11), 5497; https://doi.org/10.3390/app16115497

Submission received: 25 March 2026 / Revised: 13 May 2026 / Accepted: 27 May 2026 / Published: 1 June 2026

(This article belongs to the Special Issue Advances in Orthodontics and Dentofacial Orthopedics)


Abstract

This study evaluated the reliability and accuracy of interproximal space measurement in two different orthodontic tooth-segmentation programs in comparison with clinical space measurement and assessed the minimum distance threshold ensuring accuracy. From 6586 digital dental models, eligible digital dental images (DDIs) were selected based on anterior spacing, intra-oral scans, and clinical space (CS) measurement in vivo. Interproximal spaces were virtually measured using two programs: semi-automatic (VS_S; Orthoanalyzer^®) and full-automatic (VS_F; DentOne^®). Accuracy was analyzed against CS as the gold standard, and the minimum distance threshold for reliable measurement was explored. In 85 interproximal spaces from 22 adult patients, both programs showed excellent repeatability (VS_S, ICC = 0.922; VS_F, ICC = 0.948). Agreement between VS_S and CS was good (ICC = 0.785 to 0.882), while agreement between VS_F and CS was consistently good (ICC ≈ 0.884). Mean VS values (VS_S, 0.11 mm; VS_F, 0.13 mm) were lower than CS (0.21 mm, p < 0.001). The mean difference was −0.101 to −0.113 mm in VS_S and −0.078 to −0.082 mm in VS_F. The 95% limits of agreement (LoA) for VS_F were narrower than those for VS_S, with a higher proportion of measurements falling within the LoA (95.3% vs. 91.8–92.9%). A significant association between space size and measurement accuracy was found in discrete CS intervals, with higher relative measurement accuracy for larger interproximal spaces (all, p < 0.001). This suggests that tooth-segmentation programs showed reliable accuracy in measuring interproximal spaces > 0.20 mm, though with slight underestimation compared to clinical measurements. Clinical assessment could be supplemented to enhance virtual orthodontic planning, particularly for narrow interproximal spaces.

Keywords: accuracy; interproximal space; tooth-segmentation; digital dentistry

1. Introduction

Orthodontic setup is crucial for precise orthodontic treatment planning to decide extraction, interproximal reduction, and treatment mechanics [1]. In the past, tooth segmentation in orthodontic setup involved manually separating each tooth from the base of plaster models [2]. As digital impressions and models rapidly replace plaster models, related virtual orthodontic setup programs have also been developed and widely used. These programs generate virtual tooth segmentation to facilitate orthodontic diagnosis as well as the fabrication of orthodontic appliances such as clear aligners [3], indirect bonding jigs [4], and custom brackets and arch wires [5]. More importantly, along with these tools, a 3-dimensional printer enables even in-house fabrication of clear aligners [6].

Given the established reliability of digital models and intraoral scans [7,8,9], digital model analyses involving orthodontic parameters—such as tooth dimensions, arch width and length, and Bolton ratio—have demonstrated comparable accuracy to that of traditional cast models [10,11,12].

Beyond these diagnostic distance assessments, interproximal information in a digital dental image (DDI) is crucial from a clinical standpoint for tooth segmentation and, furthermore, for orthodontic virtual setup for crowding relief or space closure [13]. Accurate reproduction of the tooth’s proximal surface in digital models remains challenging because only the visible surface of the dentition is captured during scanning [14], and it is further complicated by the presence of saliva, blood, and restricted intraoral access [15].

In pursuit of accurate interproximal information, various virtual setup programs have adopted different tooth-segmentation methods: automatic tooth segmentation (teeth identification and isolation using an AI algorithm), landmark-based tooth segmentation (tooth segmentation based on user-defined anatomical points such as mesial/distal points), and tooth designation and segmentation (using labeling of teeth with 3D boundary definition) [16]. Recently, deep learning techniques have been used for tooth segmentation, and some have been shown to achieve a high success rate, accuracy, and efficiency [16,17,18].

Regarding the accuracy of space measurement in a tooth-segmentation program, Laganà et al. [19] evaluated the accuracy of the planned interproximal reduction (IPR) in ClinCheck compared to the implemented IPR, and other related studies analyzed the accuracy of arch width or mesiodistal width of teeth after IPR [20,21,22]. However, these studies focus on tooth dimensions rather than the interproximal space itself. Moreover, these studies have shown that implemented IPR is less than the planned IPR, implying some possible inaccuracy of measurement in a tooth-segmentation program. Despite its importance, few studies have directly assessed interproximal space in evaluating the accuracy of intraoral scanners. According to Huang et al., the accuracy of intraoral scanners is high for interproximal spaces exceeding 3.5 mm [23].

To the best of the authors’ knowledge, no published studies have directly assessed the accuracy of virtual interproximal measurements, particularly in spaces smaller than 1 mm, nor have they compared these measurements with corresponding clinical values. The assessment of interproximal spaces under 1 mm is clinically essential because they are commonly encountered in clinical practice, often resulting from minor relapse after previous space closure or as a result of interproximal reduction (IPR) for orthodontic treatment. Especially when utilizing virtual setup programs for clear aligner treatment, the accuracy of digital measurements in these specific ranges becomes a critical factor for predictable space closure.

Hence, this study aimed to evaluate the accuracy and reliability of interproximal space measurement with two currently available programs adopting different segmentation methods in comparison with actual clinical space (CS). In addition, the minimum interproximal distance for reliable measurement was evaluated according to distance intervals. The null hypothesis of this study was that interproximal space measurements of the two different programs do not significantly differ from actual clinical space.

2. Materials and Methods

This retrospective study was reviewed and approved by the Institutional Review Board (B-2506-977-102) at Seoul National University Bundang Hospital and passed the exemption review of informed consent on the use of patients’ intraoral scan data. All clinical examinations were conducted in accordance with the Declaration of Helsinki.

A total of 6586 digital dental models of patients scanned at the Department of Orthodontics, Seoul National University Bundang Hospital (Seongnam, Korea) from February 2016 to December 2024 were screened for eligibility. The inclusion criteria were as follows: (1) patients over 16 years old at the beginning of orthodontic treatment with erupted second molars; (2) anterior spacing between canines in the maxillary and/or mandibular dentition; (3) high-quality intraoral scan images including the teeth and gingiva; (4) presence of CS measured using a strip gauge in 0.05 mm units; and (5) healthy periodontium with no mobility of anterior teeth. The exclusion criteria for the digital dental image (DDI) were as follows: (1) missing tooth, (2) tooth deformity such as fusion, (3) heavy restoration such as crowns, or (4) supernumerary tooth between canines. As illustrated in the flow-chart (Figure 1), 89 models with anterior spacing were initially identified from the 3877 models that met the basic clinical criteria. However, 67 model sets were subsequently excluded due to the absence of corresponding clinical space records, resulting in a final sample of 22 digital dental models for analysis.

2.1. Clinical Space Measurement

CS was measured from the patients’ dentition at chairside using a metal strip leaf gauge (OrthoDepot, Nürnberg, Germany) in 0.05 mm units by an experienced orthodontist. The strip gauge was gently engaged in the interproximal areas with adequate resistance so as not to displace the teeth. The space measurement was double-checked from thin to thick strips and from thick to thin strips. The CS was identified as the thickness indicated on the thickest strip leaf gauge engaged. CS served as the reference gold standard (Figure 2A).

2.2. Virtual Space Measurement

Maxillary or mandibular dentition with anterior spacing was pre-dried and scanned using an intra-oral scanner, Medit i500 (Medit, Seoul, Republic of Korea), by the same experienced orthodontist on the same day as the CS measurement. Complete stereolithography (STL) digital records of the dentition were imported into two different tooth-segmentation programs: DentOne^® (version 1.6.6.0, Diorco, Yongin, Republic of Korea) and Orthoanalyzer^® (version 1.9.3.4, 3Shape, Copenhagen, Denmark).

In Orthoanalyzer^®, the mesial and distal points of each tooth were designated at its most mesial and most distal points and corrected by another orthodontist who was not aware of the CS, after which the segmentation proceeded semi-automatically (Figure 2B). In DentOne^®, tooth segmentation was automatically performed without orthodontist intervention (Figure 2C). The interproximal space in the DDI was named VS_S (semi-automatic) in Orthoanalyzer^® and VS_F (full-automatic) in DentOne^®. Eight randomly selected DDIs were reanalyzed through the same process twice after two weeks to examine the intra-examiner reliability.

2.3. Statistical Analysis

Sample size calculation was performed with α = 0.05, power (1−β) = 0.80, and a medium effect size (Cohen’s d = 0.5) due to the absence of prior or pilot studies [24]. The initial sample size of 34 interproximal spaces was determined using G*Power 3.1.9.7 based on a simplified assumption of independence [25]. To account for the clustered nature of the data (85 spaces within 22 patients), a Linear Mixed Model was subsequently employed. The resulting design effect was found to be negligible (DE ≈ 1), suggesting that the clustering within subjects did not significantly impact the statistical power of the study. Prior to statistical analysis, normality of the data was assessed using the Shapiro–Wilk test.
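The sample size figure can be approximated by hand. The following is a minimal sketch, assuming a two-sided paired (one-sample) comparison, which the text does not state explicitly, and using a normal approximation with a small-sample correction rather than the exact noncentral t calculation that G*Power performs. The function names are illustrative, and the intracluster correlation passed to `design_effect` is a placeholder because the study reports only DE ≈ 1.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(d, alpha=0.05, power=0.80):
    """Approximate n for a two-sided paired (one-sample) t-test.

    Normal approximation plus a z^2/2 small-sample correction;
    this is an approximation, not the exact noncentral t method.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


def design_effect(mean_cluster_size, icc):
    """Kish design effect: DE = 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


n_required = paired_sample_size(d=0.5)

# 85 spaces within 22 patients gives about 3.9 spaces per patient.
mean_spaces_per_patient = 85 / 22
# Placeholder intracluster correlation of 0 (assumption), giving DE = 1.
de_no_clustering = design_effect(mean_spaces_per_patient, icc=0.0)
```

With d = 0.5 the approximation returns 34, matching the reported sample size. The design effect stays close to 1 only while the within-patient correlation is near zero, since each patient contributes about four spaces.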

Repeatability, the ability of a program to produce consistent space measurement, was evaluated by the within-program intraclass correlation coefficient (ICC) [26]. Accuracy, the closeness of agreement between each program’s measurements and the gold standard CS, was evaluated via descriptive statistics and between-program ICCs. These estimates were adjusted for within-patient clustering and confirmed by non-parametric Wilcoxon signed-rank tests. Bland–Altman analysis was also used to test the agreement of interproximal space measurement between each of the two programs and CS.
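For readers unfamiliar with Bland–Altman analysis, the sketch below shows the usual parametric calculation of the mean bias and the 95% limits of agreement. It is illustrative only: the data are hypothetical, the function name is an assumption, and no adjustment for within-patient clustering is made, unlike the analysis described above.

```python
from statistics import mean, stdev


def bland_altman(virtual, clinical, z=1.96):
    """Return mean bias, lower/upper limits of agreement, and the
    proportion of differences that fall within those limits.

    Differences are computed as virtual - clinical, so a negative
    bias means the virtual measurement underestimates the space.
    """
    diffs = [v - c for v, c in zip(virtual, clinical)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = bias - z * sd, bias + z * sd
    within = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, within


# Hypothetical measurements in mm (not data from the study).
clinical = [0.10, 0.15, 0.20, 0.25, 0.30, 0.40]
virtual = [0.02, 0.08, 0.11, 0.18, 0.22, 0.35]

bias, lower, upper, within = bland_altman(virtual, clinical)
```

A negative `bias` with limits that straddle zero asymmetrically is the pattern expected when a program systematically underestimates the clinical space.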

Furthermore, a relative difference analysis was performed to determine the minimum distance threshold required for reliable software measurements. The CS measurements were categorized into four discrete intervals: <0.15, 0.15–0.20, 0.20–0.30, and ≥0.30 mm. The relative difference (%) between the CS and VS_S/VS_F was calculated using the formula: |VS − CS|/CS × 100. To evaluate the clinical acceptability of each program, the accuracy was categorized into two groups based on a 50% relative difference threshold. The frequency of measurements exceeding this threshold was compared between programs using the Chi-squared test, or Fisher’s exact test when the expected cell frequency was less than 5, to ensure statistical validity. Cases with CS = 0 were excluded from this analysis to avoid division by zero.
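The relative difference analysis can be sketched as follows. This is a minimal illustration, not the authors’ code: the function names are hypothetical, interval lower bounds are assumed to be inclusive (the text does not say which interval a value of exactly 0.15, 0.20, or 0.30 mm belongs to), and a difference of exactly 50% is assumed to count as within the threshold.

```python
def relative_difference(vs, cs):
    """Relative difference (%) = |VS - CS| / CS * 100."""
    if cs == 0:
        # Mirrors the exclusion of CS = 0 cases described in the text.
        raise ValueError("CS = 0 is excluded to avoid division by zero")
    return abs(vs - cs) / cs * 100


def cs_interval(cs):
    """Assign a CS value (mm) to one of the four intervals.

    Assumption: lower bounds are inclusive, upper bounds exclusive.
    """
    if cs < 0.15:
        return "<0.15"
    if cs < 0.20:
        return "0.15-0.20"
    if cs < 0.30:
        return "0.20-0.30"
    return ">=0.30"


def within_threshold(vs, cs, threshold=50.0):
    """True when the relative difference does not exceed the threshold."""
    return relative_difference(vs, cs) <= threshold


# Applied to the reported mean values (mm), for illustration only.
rel_vs_s = relative_difference(0.11, 0.21)
rel_vs_f = relative_difference(0.13, 0.21)
```

Applied to the reported means, the relative differences are roughly 48% for VS_S and 38% for VS_F. Because CS is the denominator, the same absolute error produces a much larger relative difference in the narrowest spaces.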

The level of significance was set at p < 0.05 for all statistical analyses. All statistical analyses were performed using R Statistical Software (v4.3.0; R Core Team 2023).

3. Results

A total of 85 interproximal spaces from 22 digital dental models that satisfied the inclusion criteria were analyzed. The mean age of the patients was 29.2 ± 10.8 years.

3.1. Intraclass Reliability of the Programs

The repeatability of interproximal measurement showed excellent agreement in both programs [26]. Within-program ICCs were 0.922 in VS_S (95% confidence interval [CI], 0.882 to 0.949) and 0.948 in VS_F (95% CI, 0.921 to 0.966) (Table 1).

3.2. Accuracy of Space Measurement in the Two Programs

As presented in Table 2, the mean interproximal spaces were 0.11 mm in VS_S and 0.13 mm in VS_F, compared to a mean CS of 0.21 mm. Both VS_S and VS_F differed significantly from CS (all, p < 0.001).

The agreement between VS_S and CS was good across both measurements. The ICC for the first measurement was 0.785 (95% CI: 0.688 to 0.855), while the second showed an improved ICC of 0.882 (95% CI: 0.824 to 0.922). For VS_F, the ICCs for the first and second measurements were 0.877 (95% CI: 0.817 to 0.918) and 0.884 (95% CI: 0.828 to 0.923), respectively, representing consistently good agreement across both measurements.
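The qualitative labels attached to these ICCs follow the commonly cited cut-offs of Koo and Li [26]: below 0.5 poor, 0.5 to 0.75 moderate, 0.75 to 0.9 good, and above 0.9 excellent. A small helper makes the mapping explicit. This is a sketch, the function name is hypothetical, and the handling of values that fall exactly on a cut-off is an assumption.

```python
def interpret_icc(icc):
    """Label an ICC point estimate using the Koo and Li cut-offs.

    Assumption: a value exactly on a boundary (0.5, 0.75) goes to the
    higher category, and exactly 0.9 is still labelled "good".
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Point estimates reported in the text.
reported = {
    "VS_S repeatability": 0.922,
    "VS_F repeatability": 0.948,
    "VS_S vs CS (first)": 0.785,
    "VS_S vs CS (second)": 0.882,
    "VS_F vs CS (first)": 0.877,
    "VS_F vs CS (second)": 0.884,
}
labels = {name: interpret_icc(value) for name, value in reported.items()}
```

Under these cut-offs both repeatability estimates are labelled excellent and all four agreement estimates against CS are labelled good. The same labelling can be applied to the confidence bounds rather than the point estimates for a more cautious reading.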

The mean differences between CS and VS (mean bias, calculated as VS − CS) were −0.101 to −0.113 mm in VS_S and −0.078 to −0.082 mm in VS_F. The negative values of mean bias indicated underestimation by both programs.

The Bland–Altman plots also illustrated the level of agreement between the measurements and CS. The 95% limits of agreement for VS_F were narrower (−0.219 to 0.062) compared to VS_S (−0.278 to 0.076) with a higher proportion of measurements falling within the LoA (95.3% vs. 91.8–92.9%) (Table 2 and Figure 3).
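As a quick consistency check, and assuming the limits of agreement were computed in the usual parametric way (mean bias ± 1.96 SD of the differences, which the text does not state explicitly), the midpoint of each interval recovers the mean bias and the half-width recovers the SD. The symbols \bar{d} and s_d are introduced here for illustration:

```latex
\begin{aligned}
\text{VS\_F:}\quad \bar{d} &= \frac{-0.219 + 0.062}{2} = -0.0785\ \text{mm}, &
s_d &= \frac{0.062 - (-0.219)}{2 \times 1.96} \approx 0.072\ \text{mm} \\
\text{VS\_S:}\quad \bar{d} &= \frac{-0.278 + 0.076}{2} = -0.101\ \text{mm}, &
s_d &= \frac{0.076 - (-0.278)}{2 \times 1.96} \approx 0.090\ \text{mm}
\end{aligned}
```

Both midpoints fall within the reported mean-bias ranges (−0.078 to −0.082 mm for VS_F and −0.101 to −0.113 mm for VS_S), and the smaller implied SD for VS_F corresponds to its narrower limits.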

3.3. The Distance Threshold for Reliable Measurement

The relative difference analysis across discrete CS intervals revealed a highly significant association between the CS interval and the proportion of measurements within the ±50% accuracy threshold (all, p < 0.001, Chi-squared test). For both VS_S and VS_F, the proportions of cases within the ±50% accuracy threshold were significantly lower (7.7 to 26.9%) in the smallest interval (<0.15 mm) than in the largest interval (70.6 to 88.2% in ≥0.30 mm). The proportion also increased gradually as CS increased. In all intervals, VS_F had a significantly higher proportion of measurements within the 50% difference compared to VS_S (p = 0.001) (Table 3). In the scatter plots, most measurements remained within the ±50% difference in spaces ≥0.2 mm, whereas greater dispersion and relative error were observed in the range below 0.2 mm (Figure 4).

4. Discussion

Accurate tooth segmentation is crucial for virtual orthodontic setup used for diagnosis, treatment planning, and fabrication of orthodontic appliances [27], especially clear aligners, which are now widely used. Interproximal information is limited in DDI because scans capture only surface information [14], so the measurement of the tooth width or interproximal space in DDI is inevitably different from the actual clinical data [28]. Moreover, these measurements in DDI can be affected by crowding [29] and are difficult to make accurately in the oral cavity or on a model. Therefore, the specific aims of this study were to evaluate the accuracy of interproximal space measurement in two different tooth-segmentation programs, semi-automated and fully automated, compared to clinical in vivo data as a gold standard, and to assess the reliable distance threshold.

Among currently available programs for segmentation of digital models, we used two existing software tools, Orthoanalyzer^® and DentOne^®, which adopt different segmentation methods (semi-automated and fully automated, respectively), for several reasons: (1) both are popular with orthodontists; (2) they differ in the degree of user intervention, e.g., designating mesial and distal points; and (3) they are versatile: Orthoanalyzer^® is paid, while DentOne^® is free to use, and both can export stereolithography files, which are compatible with other programs.

According to the results of this study, interproximal measurements of both VS_S and VS_F were reliable, showing excellent agreement. However, the full-automatic program (VS_F) showed slightly higher repeatability than the semi-automatic program (VS_S), which is expected given the reduced operator dependency in automated measurement (Table 1). Similar to the result of this study, a recent study reported intra-operator consistency of 95.5–98.9% for artificial intelligence (AI)-driven segmentation and 90.9–95.4% for semi-automatic segmentation depending on dentition condition [30].

With CS as the gold standard, accuracy was evaluated with differences in descriptive statistics, between-program ICC, and Bland–Altman analysis [31]. Both programs showed a tendency to slightly underestimate the interproximal spaces compared to clinical measurements, as indicated by the negative mean bias values, with greater mean bias in VS_S than in VS_F. These results coincide with previous reports that tooth-segmentation programs tend to overestimate tooth width. Zilberman et al. [10] observed 0.3 mm overestimation in anterior teeth, and Karslı et al. [32] reported 0.1 mm overestimation in maxillary incisors. In addition, several studies on the accuracy of IPR have shown that implemented IPR is 0.13–0.55 mm less than the planned IPR [20,21,22].

The results of this study indicate that the two tooth-segmentation methods resulted in interproximal space measurements that differed from clinical in vivo data; therefore, the null hypothesis was rejected.

This discrepancy in interproximal space measurement in DDI of a tooth-segmentation program may be attributed to the following reasons: (1) innate inaccuracy of the scanner and scanning, and (2) different tooth-segmentation methods. The scanner used in this study employs the triangulation method, which measures the shape of an object by directing a beam of light at the object and calculating the distance based on the angle of the reflected light [33,34]. It is known that two types of errors can occur, random and systematic errors, which can arise from speckles caused by changes in light wave amplitude and from scanning factors such as scan depth, incident angle, and projected angle [18,34]. Additionally, the accuracy of intraoral scanning is known to be affected by intraoral factors such as existing dental restorations [18], brackets [35], and humidity [33], as well as by scanning strategy, the examiner’s experience, and the software algorithm [14].

Regarding tooth-segmentation methods, in Orthoanalyzer^®, tooth segmentation is performed after manual designation of the mesial and distal points of a tooth based on the principle that each interstice between two adjacent teeth can only be made by connecting two different cutting points, or one cutting point with a joint point [36]. The invisible interproximal surfaces are reconstructed by closing the mesh between the segmented tooth boundaries using geometry-based reconstruction such as best-fit surface/mesh continuation and smoothing across gaps. On the other hand, DentOne^®, an AI-based automation, recognizes the tooth–gingiva and tooth–tooth boundaries through mesh simplification via a convolutional neural network without manual intervention, preserving geometric features [37]. For interproximal surface reconstruction, it estimates virtual root apex positions from crown shapes and reconstructs root surfaces using segmented crown edges. Interproximal spaces are then closed and refined via boundary-aware surface reconstruction. This process maintains structural integrity by considering mesh proximity, surface continuity, and the natural curvature of the teeth. This differential reconstruction would help explain the underestimation of interproximal spaces.

Previous studies revealed that the success rate of tooth segmentation ranged from 87.9% to 97.3%, with a dynamic graph convolutional neural network (DGCNN)-based algorithm showing a high segmentation success rate, accuracy, and efficiency [16], and was 98.21% in Orthoanalyzer^®, a landmark-based segmentation program [18]. Another study using DentOne^® achieved a 99% successful identification rate, but there was a tendency to overestimate the mesiodistal width of canines and premolars [17], which is in concordance with our result that the distances of interproximal spaces were underestimated. However, these studies did not specify the characteristics of the dental models used, such as spacing or the presence of crowding, which could affect the interproximal information. Their practical difficulties lie in the absence of interproximal surfaces in scanned images, i.e., potential inaccuracy in interproximal space measurement.

Agreement between VS_S and CS initially showed good reliability (ICC: 0.785; 95% CI: 0.688 to 0.855), which improved to 0.882 in the second trial, suggesting that the semi-automatic workflow benefits from a learning effect (Table 2, Figure 3). With an upper 95% CI bound of 0.922 in the second measurement, VS_S demonstrates its potential as a highly precise tool when utilized by experienced users. Meanwhile, VS_F exhibited consistent agreement, maintaining ICCs of 0.877 to 0.884 with lower 95% CI bounds above 0.80.

Bland–Altman analysis further supported these observations; VS_F showed slightly narrower LoA, smaller mean bias, and tighter data clustering with 95.3% of measurements within the LoA (Figure 3). Overall, while both programs achieve good accuracy, VS_F could offer a more consistent experience for initial users. This suggests that while VS_S rewards user experience with high precision, VS_F provides a more intuitive framework during the early stages of the learning curve. The better observed performance of VS_F relative to VS_S may be attributed to several technical factors. While semi-automatic segmentation relies on operator-defined seed points, which could lead to subjective variability, AI-based systems utilize deep convolutional neural networks, which excel in feature extraction and boundary detection even in areas with tight contacts. These algorithms are trained on vast datasets, enabling them to more accurately reconstruct complex interproximal morphologies. According to Im et al. [16], measuring tooth dimensions before full segmentation, as in a semi-automatic program, often leads to more conservative values because precise point setting is physically limited within the interproximal contact areas. In contrast, fully automated systems like VS_F calculate tooth mesiodistal widths based on pre-segmented tooth morphologies, allowing for a more comprehensive reconstruction of the proximal surfaces [11].

In addition, relative difference analysis was conducted to assess the software’s clinical reliability across varying scales of interproximal spaces, with a particular focus on measurement stability within extremely narrow gaps. It was observed that relative measurement error increases significantly as the interproximal space narrows. Our findings are consistent with the previous literature. For instance, one study reported a deviation of 20.1 µm for a 1.5 mm preparation width, with greater deviations observed as the width decreased [38]. Similarly, another study identified high accuracy when the distance between prepared and adjacent teeth exceeded 2.0 mm [23,39]. Given that success rates for both programs consistently surpassed 50% beyond 0.20 mm, this distance may be considered a practical limit for maintaining measurement consistency as accuracy tends to diminish significantly in narrower gaps. This may be attributed to the limitations of the triangulation-based scanning method, in which effective light reflection is hindered in narrow interproximal spaces [33]. Such constraints may also increase the likelihood of segmentation errors during tooth segmentation.

This study has several limitations, including a relatively small sample size and the use of a single type of scanner and two tooth-segmentation programs. In addition, the effect of tooth or interdental shape on segmentation outcomes was not considered. Software tools continue to be upgraded; hence, further research with larger cohorts or posterior teeth and a broader range of upgraded software tools would be needed to more comprehensively evaluate the accuracy of interproximal space measurements, particularly in narrow regions. Furthermore, although the strip gauge is a practical gold standard, its inherent limitations, including operator dependency, a 0.05 mm effective unit, and biomechanical factors such as insertion resistance, must be considered when interpreting the results, even with the exclusion of mobile teeth.

Despite these limitations, the study is novel in its assessment of virtual tooth-segmentation accuracy and reliability for interproximal spaces less than 1 mm and in exploring a threshold for accurate measurement within digital models.

The findings of this study suggest that intraoral scanning and virtual segmentation are clinically reliable yet could underestimate interproximal space. Therefore, clinical assessment of spacing or interproximal contacts remains essential when planning orthodontic treatment using virtual setup tools.

5. Conclusions

Both semi-automated and fully automated tooth-segmentation programs offer reliable interproximal measurements albeit with a tendency to underestimate values compared to the clinical gold standard.
While both programs demonstrated high accuracy, supplementing digital data with direct clinical assessment—notably for spaces near 0.2 mm—is recommended to ensure the precision of virtual orthodontic planning and clinical outcomes.

Author Contributions

Conceptualization, writing—original draft preparation, writing—review and editing, project administration, funding acquisition, T.-H.C.; data acquisition, software, validation, formal analysis, writing—original draft preparation, S.-Y.K.; supervision, N.-K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Seoul National University Bundang Hospital (SNUBH) Research Fund (Grant no. 14-2019-0023).

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki. This retrospective study was reviewed and approved by the Institutional Review Board (B-2506-977-102; 21-05-2025) at Seoul National University Bundang Hospital and passed the exemption review of informed consent on the use of patients’ intraoral scan data.

Informed Consent Statement

Patient consent was waived due to the usage of patients’ intraoral scan data.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

Thanks to Soyeon Ahn of Seoul National University Bundang Hospital Medical Research Collaborating Center (Division of Statistics) for her contribution to the statistical analyses and interpretation of the study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IPR	Interproximal reduction
CS	Clinical space
DDI	Digital dental image
VS_S	Virtual space measured in semi-automatic program
VS_F	Virtual space measured in full-automated program
ICC	Intraclass correlation coefficient
LoA	Limits of agreement (95% lower and upper bounds)
AI	Artificial intelligence

References

1. Mattos, C.T.; Gomes, A.C.R.; Ribeiro, A.A.; Nojima, L.I.; Nojima, C. The importance of the diagnostic setup in the orthodontic treatment plan. Int. J. Orthod. Milwaukee 2012, 23, 35–39.
2. Kesling, H.D. The philosophy of the tooth positioning appliance. Am. J. Orthod. Oral Surg. 1945, 31, 297–304.
3. Miller, K.B.; McGorray, S.P.; Womack, R.; Quintero, J.C.; Perelmuter, M.; Gibson, J.; Dolan, T.A.; Wheeler, T.T. A comparison of treatment impacts between Invisalign aligner and fixed appliance therapy during the first week of treatment. Am. J. Orthod. Dentofac. Orthop. 2007, 131, e301–e309.
4. Fillion, D. Lingual straightwire treatment with the Orapix system. J. Clin. Orthod. 2011, 45, 488–497.
5. Wiechmann, D.; Rummel, V.; Thalheim, A.; Simon, J.S.; Wiechmann, L. Customized brackets and archwires for lingual orthodontic treatment. Am. J. Orthod. Dentofac. Orthop. 2003, 124, 593–599.
6. Panayi, N.; Cha, J.Y.; Kim, K.B. 3D printed aligners: Material science, workflow and clinical applications. Semin. Orthod. 2023, 29, 25–33.
7. Rossini, G.; Parrini, S.; Castroflorio, T.; Deregibus, A.; Debernardi, C.L. Diagnostic accuracy and measurement sensitivity of digital models for orthodontic purposes: A systematic review. Am. J. Orthod. Dentofac. Orthop. 2016, 149, 161–170.
8. Warnecki, M.; Nahajowski, M.; Papadopoulos, M.A.; Kawala, B.; Lis, J.; Sarul, M. Assessment of the reliability of measurements taken on digital orthodontic models obtained from scans of plaster models in laboratory scanners: A systematic review and meta-analysis. Eur. J. Orthod. 2022, 44, 522–529.
9. Goracci, C.; Franchi, L.; Vichi, A.; Ferrari, M. Accuracy, reliability, and efficiency of intraoral scanners for full-arch impressions: A systematic review of the clinical evidence. Eur. J. Orthod. 2016, 38, 422–428.
10. Zilberman, O.; Huggare, J.; Parikakis, K.A. Evaluation of the validity of tooth size and arch width measurements using conventional and three-dimensional virtual orthodontic models. Angle Orthod. 2003, 73, 301–306.
11. Santoro, M.; Galkin, S.; Teredesai, M.; Nicolay, O.F.; Cangialosi, T.J. Comparison of measurements made on digital and plaster models. Am. J. Orthod. Dentofac. Orthop. 2003, 124, 101–105.
12. Tomassetti, J.J.; Taloumis, L.J.; Denny, J.M.; Fischer, J.R. A comparison of three computerized Bolton tooth-size analyses with a commonly used method. Angle Orthod. 2001, 71, 351–357.
13. You, J.; Zhou, X.; Xia, X.; Zhang, J.; Liu, Y. Accuracy evaluation and clinical realization of digital interproximal enamel reduction for orthodontics: A case study. Biomed. Eng. OnLine 2025, 24, 1.
14. Son, Y.T.; Son, K.; Lee, J.M.; Lee, K.B. Does Intraoral Scanning at the Subgingival Finish Line Affect the Accuracy of Interim Crowns? J. Funct. Biomater. 2025, 16, 309.
15. Robles-Medina, M.; Romeo-Rubio, M.; Salido, M.P.; Pradíes, G. Digital Intraoral Impression Methods: An Update on Accuracy. Curr. Oral Health Rep. 2020, 7, 361–375.
16. Im, J.; Kim, J.-Y.; Yu, H.-S.; Lee, K.-J.; Choi, S.-H.; Kim, J.-H.; Ahn, H.-K.; Cha, J.-Y. Accuracy and efficiency of automatic tooth segmentation in digital dental models using deep learning. Sci. Rep. 2022, 12, 9429.
17. Yacout, Y.M.; Eid, F.Y.; Tageldin, M.A.; Kassem, H.E. Evaluation of the accuracy of automated tooth segmentation of intraoral scans using artificial intelligence-based software packages. Am. J. Orthod. Dentofac. Orthop. 2024, 166, 282–291.e1.
18. Raju, R.; Tr, P.A. Accuracy of tooth segmentation in the digital Kesling setup of two different software programs: A retrospective study. Cureus 2024, 16, e70306.
19. Laganà, G.; Malara, A.; Lione, R.; Danesi, C.; Meuli, S.; Cozza, P. Enamel interproximal reduction during treatment with clear aligners: Digital planning versus OrthoCAD analysis. BMC Oral Health 2021, 21, 199.
20. Kalemaj, Z.; Levrini, L. Quantitative evaluation of implemented interproximal enamel reduction during aligner therapy: A prospective observational study. Angle Orthod. 2021, 91, 61–66.
21. Hariharan, A.; Arqub, S.A.; Gandhi, V.; Da Cunha Godoy, L.; Kuo, C.L.; Uribe, F. Evaluation of interproximal reduction in individual teeth, and full arch assessment in clear aligner therapy: Digital planning versus 3D model analysis after reduction. Prog. Orthod. 2022, 23, 9.
22. De Felice, M.E.; Nucci, L.; Fiori, A.; Flores-Mir, C.; Perillo, L.; Grassia, V. Accuracy of interproximal enamel reduction during clear aligner treatment. Prog. Orthod. 2020, 21, 28.
23. Huang, M.Y.; Son, K.; Lee, K.B. Effect of distance between the abutment and the adjacent teeth on intraoral scanning: An in vitro study. J. Prosthet. Dent. 2021, 125, 911–917.
24. Sullivan, G.M.; Feinn, R. Using effect size—Or why the P value is not enough. J. Grad. Med. Educ. 2012, 4, 279–282.
25. Faul, F.; Erdfelder, E.; Lang, A.G.; Buchner, A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods 2007, 39, 175–191.
26. Koo, T.K.; Li, M.Y. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 2016, 15, 155–163.
27. Woo, H.; Jha, N.; Kim, Y.J.; Sung, S.J. Evaluating the accuracy of automated orthodontic digital setup models. Semin. Orthod. 2023, 29, 60–67.
28. Makaremi, M.; Ristor, R.; de Brondeau, F.; Choquart, A.; Mengelle, C.; N’Kaoua, B. Estimation of Distances within Real and Virtual Dental Models as a Function of Task Complexity. Diagnostics 2023, 13, 1304.
Yoon, J.H.; Yu, H.S.; Choi, Y.; Choi, T.H.; Choi, S.H.; Cha, J.Y. Model analysis of digital models in moderate to severe crowding: In vivo validation and clinical application. Biomed. Res. Int. 2018, 2018, 8414605. [Google Scholar] [CrossRef] [PubMed]
Wang, X.; Alqahtani, K.A.; Van den Bogaert, T.; Shujaat, S.; Jacobs, R.; Shaheen, E. Convolutional neural network for automated tooth segmentation on intraoral scans. BMC Oral Health 2024, 24, 804. [Google Scholar] [CrossRef]
Mehl, A.; Reich, S.; Beuer, F.; Güth, J.F. Accuracy, trueness, and precision—A guideline for the evaluation of these basic values in digital dentistry. Int. J. Comput. Dent. 2021, 24, 341–352. [Google Scholar] [PubMed]
Karslı, N.; Yurdakul, Z.; Gonca, M.; Çava, K. Does the traditional or digital dental model measurement method affect the results?: A validation study. Eur. Ann. Dent. Sci. 2023, 50, 87–94. [Google Scholar] [CrossRef]
Ji, Z.; Leu, M.C. Design of optical triangulation devices. Opt. Laser Technol. 1989, 21, 339–341. [Google Scholar] [CrossRef]
Feng, H.Y.; Liu, Y.; Xi, F. Analysis of digitizing errors of a laser scanning system. Precis. Eng. 2001, 25, 185–191. [Google Scholar] [CrossRef]
Kim, Y.-K.; Kim, S.-H.; Choi, T.-H.; Yen, E.H.; Zou, B.; Shin, Y.; Lee, N.-K. Accuracy of intraoral scan images in full arch with orthodontic brackets: A retrospective in vivo study. Clin. Oral Investig. 2021, 25, 4861–4869. [Google Scholar] [CrossRef]
Wu, K.; Chen, L.; Li, J.; Zhou, Y. Tooth segmentation on dental meshes using morphologic skeleton. Comput. Graph. 2014, 38, 199–211. [Google Scholar] [CrossRef]
Xu, X.; Liu, C.; Zheng, Y. 3D tooth segmentation and labeling using deep convolutional neural networks. IEEE Trans. Vis. Comput. Graph. 2018, 25, 2336–2348. [Google Scholar] [CrossRef]
Park, Y.; Kim, J.H.; Park, J.K.; Son, S.A. Scanning accuracy of an intraoral scanner according to different inlay preparation designs. BMC Oral Health 2023, 23, 515. [Google Scholar] [CrossRef] [PubMed]
Kim, S.Y.; Son, K.; Bihn, S.K.; Lee, K.B. Effect of the inter-tooth distance and proximal axial wall height of prepared teeth on the scanning accuracy of intraoral scanners. J. Funct. Biomater. 2024, 15, 115. [Google Scholar] [CrossRef]

Figure 1. Flow-chart of digital dental model selection.

Figure 2. Measurement of interproximal space: (A) clinical space (CS); (B) semi-automatic tooth-segmentation program (VS_S; Orthoanalyzer®); (C) full-automatic tooth-segmentation program (VS_F; DentOne®).

Figure 3. Bland–Altman plots illustrating the agreement between CS and VS_S and between CS and VS_F. The solid line represents the mean bias, and the dashed lines represent the 95% limits of agreement.

Figure 4. Scatter plots illustrating the relative differences by CS intervals. The solid line represents 0%, and the dashed lines represent ±50% relative difference.

Table 1. Within-program intraclass correlation coefficients (ICCs) for the repeatability of the programs.

Programs	Measurement Frequency	Mean (mm)	SD (mm)	ICC (2,1) (95% CI)	p-Value	Agreement
VS_S	1	0.11	0.14	0.922 (0.882–0.949)	<0.001	Excellent
VS_S	2	0.10	0.13	0.922 (0.882–0.949)	<0.001	Excellent
VS_F	1	0.13	0.15	0.948 (0.921–0.966)	<0.001	Excellent
VS_F	2	0.13	0.15	0.948 (0.921–0.966)	<0.001	Excellent

ICC (2,1): two-way random effects model, single rater/measurement, absolute agreement. Agreement: excellent (ICC > 0.9), good (0.75–0.9), moderate (0.5–0.75), poor (<0.5). Abbreviations: VS_S, semi-automatic; VS_F, full-automatic; ICC, intraclass correlation coefficient; CI, confidence interval; SD, standard deviation.
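The ICC (2,1) definition and agreement categories in the note above can be illustrated with a minimal sketch. It assumes the standard two-way ANOVA mean-square formulation of ICC (2,1); the function names `icc_2_1` and `classify_icc`, the handling of values exactly on a category boundary, and the sample readings are hypothetical and are not taken from the study or its statistical software.

```python
def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    data: list of rows, one row per measured space, one column per repeat.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Sums of squares for subjects (rows), repeats (columns), and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc):
    """Map an ICC to the categories in the table note.

    Assumption: values exactly on a boundary go to the higher category,
    except 0.9, which the note excludes from "excellent".
    """
    if icc > 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.5:
        return "moderate"
    return "poor"


# Hypothetical repeated readings in mm (rows: spaces; columns: 1st, 2nd reading)
readings = [
    [0.05, 0.06],
    [0.12, 0.10],
    [0.20, 0.22],
    [0.31, 0.30],
    [0.45, 0.43],
]
icc = icc_2_1(readings)
print(f"ICC(2,1) = {icc:.3f} ({classify_icc(icc)})")
```

Applied to the values reported in Table 1, `classify_icc(0.922)` and `classify_icc(0.948)` both fall in the "excellent" category.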

Table 2. Mean measurements and between-program ICCs of two programs compared to clinical space.

Space	Mean ± SD (mm)	ICC (3,1)	95% CI	p-Value ^†	Mean Bias (mm)	95% LoA (mm)	% Within LoA
CS	0.21 ± 0.13	NA	NA	NA	NA	NA	NA
VS_S.1	0.11 ± 0.14	0.785	[0.688, 0.855]	<0.001	−0.101	[−0.278, 0.076]	91.8%
VS_S.2	0.11 ± 0.14	0.882	[0.824, 0.922]	<0.001	−0.113	[−0.242, 0.017]	92.9%
VS_F.1	0.13 ± 0.15	0.877	[0.817, 0.918]	<0.001	−0.078	[−0.219, 0.062]	95.3%
VS_F.2	0.13 ± 0.15	0.884	[0.828, 0.923]	<0.001	−0.082	[−0.215, 0.052]	95.3%

ICC (3,1): two-way mixed effects model (fixed raters), single rater/measurement, absolute agreement. ^† Wilcoxon signed-rank test. Mean bias was calculated as VS − CS; a negative value indicates underestimation by the programs. The 95% limits of agreement were calculated as mean bias ± 1.96 × SD of the differences. Abbreviations: CS, clinical space; VS_S, semi-automatic; VS_F, full-automatic; VS_S/F.1/2, 1st and 2nd measurements, respectively; SD, standard deviation; CI, confidence interval; LoA, limits of agreement; NA, not applicable.
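The mean bias and 95% limits of agreement described in the note above can be sketched as follows. This is an illustration only: the function name `bland_altman` and the paired values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the note does not state which SD estimator was used.

```python
from statistics import mean, stdev


def bland_altman(virtual, clinical):
    """Return (mean bias, lower LoA, upper LoA, fraction of differences within LoA).

    Bias is computed as virtual minus clinical, so a negative bias means
    the virtual measurement underestimates the clinical one.
    """
    diffs = [v - c for v, c in zip(virtual, clinical)]
    bias = mean(diffs)
    sd = stdev(diffs)  # assumption: sample SD (n - 1) of the differences
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    within = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, within


# Hypothetical paired measurements in mm (not study data)
cs = [0.10, 0.20, 0.30, 0.40, 0.25]
vs = [0.05, 0.10, 0.25, 0.30, 0.15]

bias, lower, upper, within = bland_altman(vs, cs)
print(f"bias = {bias:.3f} mm, LoA = [{lower:.3f}, {upper:.3f}] mm, within = {within:.1%}")
```

The same relationship can be read off Table 2: for VS_F.1, the interval [−0.219, 0.062] is symmetric about the mean bias of −0.078 within rounding, as expected from the ± 1.96 × SD construction.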

Table 3. Proportion of measurements within ±50% relative difference by discrete CS intervals.

CS Interval (mm)	N	VS_S.1 (%)	VS_S.2 (%)	VS_F.1 (%)	VS_F.2 (%)
<0.15	26	15.4	7.7	15.4	26.9
[0.15, 0.20)	25	16.0	24.0	44.0	40.0
[0.20, 0.30)	17	47.1	52.9	70.6	64.7
≥0.30	17	70.6	76.5	88.2	88.2

Relative difference = (Measurement − CS)/CS. p-values are from the Chi-squared test for association between CS interval and within ±50% status (p < 0.001 for VS_S.1, VS_S.2, VS_F.1, and VS_F.2). Fisher’s exact test was used if any expected cell count was <5. Abbreviations: CS, clinical space; VS_S, semi-automatic; VS_F, full-automatic; VS_S/F.1/2, 1st and 2nd measurements, respectively.
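The relative-difference criterion and the CS intervals used in Table 3 can be sketched as below. The function names and sample pairs are hypothetical, and treating a relative difference of exactly ±50% as "within" is an assumption, since the note does not say whether the bound is inclusive.

```python
def relative_difference(measurement, cs):
    """(Measurement - CS) / CS; assumes cs > 0, i.e. a space is clinically present."""
    return (measurement - cs) / cs


def interval_label(cs):
    """Assign a clinical space (mm) to one of the four intervals in Table 3."""
    if cs < 0.15:
        return "<0.15"
    if cs < 0.20:
        return "[0.15, 0.20)"
    if cs < 0.30:
        return "[0.20, 0.30)"
    return ">=0.30"


def percent_within(pairs, tolerance=0.5):
    """Percentage of (virtual, clinical) pairs within +/- tolerance, per CS interval."""
    counts = {}
    for vs, cs in pairs:
        label = interval_label(cs)
        hit = abs(relative_difference(vs, cs)) <= tolerance  # assumption: inclusive bound
        n, k = counts.get(label, (0, 0))
        counts[label] = (n + 1, k + int(hit))
    return {label: 100.0 * k / n for label, (n, k) in counts.items()}


# Hypothetical (virtual, clinical) pairs in mm, not study data
pairs = [
    (0.02, 0.10), (0.08, 0.12),
    (0.12, 0.18), (0.05, 0.16),
    (0.20, 0.25),
    (0.33, 0.35),
]
for label, pct in percent_within(pairs).items():
    print(f"{label}: {pct:.1f}% within ±50%")
```

The percentages in Table 3 are consistent with whole counts over the stated N; for example, 15.4% of the 26 spaces in the <0.15 mm interval corresponds to 4 measurements.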


