1. Introduction
External apical root resorption (EARR) is a well-recognized iatrogenic consequence of orthodontic tooth movement and constitutes an important biological limitation of orthodontic treatment. Histopathologically, EARR involves irreversible loss of cementum and dentin at the root apex and may compromise tooth integrity and long-term prognosis, particularly when resorption becomes extensive. The classic biological and clinical frameworks describing orthodontically induced inflammatory root resorption emphasize that, although mild degrees of resorption are frequent and often clinically acceptable, severe EARR can adversely affect tooth longevity and remains a major concern for clinicians and patients alike [1, 2].
The etiology of EARR is multifactorial and reflects a complex interaction between patient susceptibility and treatment-related mechanical loading. Patient-related factors such as age, root morphology, and genetic predisposition have been associated with variability in EARR risk [3, 4, 5]. Treatment-related factors (including force magnitude, force duration, force continuity, total treatment time, appliance system, and the extent or type of tooth movement) are also repeatedly implicated as determinants of both initiation and progression of EARR [6, 7]. Mechanistically, orthodontic forces induce inflammatory and cellular responses within the periodontal ligament (PDL) and adjacent alveolar bone; sustained or excessive loading can promote hyalinization and osteoclastic activity at the root surface, thereby increasing resorption potential [8, 9, 10].
Among treatment-related variables, the orthodontic appliance system is clinically relevant because it shapes force delivery and movement control. Fixed orthodontic appliances are frequently associated with EARR, particularly in anterior teeth where tipping, torque, and intrusive mechanics may be required, and the wire–bracket interface may generate complex and relatively continuous force systems [11, 12]. In contrast, clear aligner therapy has expanded rapidly due to esthetic demands and digital workflow advancements. Biomechanically, aligners typically apply staged, intermittent forces through sequential aligner changes, theoretically allowing periods of force relief and potentially reducing cumulative biological insult to the root–PDL complex [13, 14, 15]. These features have generated the clinical hypothesis that aligner-based treatment may be associated with less EARR than fixed appliances.
Clinical studies comparing EARR between clear aligners and fixed appliances have reported inconsistent findings, which is partly explained by methodological heterogeneity, especially regarding imaging modality and outcome definition [16, 17, 18, 19]. Historically, EARR assessment has relied on two-dimensional (2D) imaging (panoramic or periapical radiographs), which is vulnerable to projection errors and distortion and is sensitive to changes in tooth angulation, potentially underestimating root length changes [20, 21]. Cone-beam computed tomography (CBCT) enables three-dimensional visualization and offers improved accuracy and reproducibility for quantitative root length evaluation relative to 2D radiography [22, 23, 24]. Accordingly, CBCT-based measurement is increasingly preferred in research settings when precise quantification of EARR is required.
Several systematic reviews, meta-analyses, and higher-level evidence syntheses have addressed EARR associated with aligners versus fixed appliances. However, an important recurring limitation is that many reviews have pooled heterogeneous outcome definitions and imaging methods (mixing CBCT with 2D measures), which can compromise comparability and quantitative interpretation [25, 26, 27]. A recent umbrella review further summarized the existing review-level evidence, reinforcing that conclusions remain constrained by heterogeneity of methods and primary studies [28]. Unlike previous reviews that combined heterogeneous imaging modalities, this study exclusively included CBCT-based quantitative measurements to reduce methodological heterogeneity and improve measurement accuracy.
Therefore, consistent with our PROSPERO-registered protocol, the objective of this study is to systematically review and meta-analyze clinical studies comparing clear aligner therapy versus fixed orthodontic appliances with EARR assessed using quantitative CBCT-based linear root length measurements in millimeters (mm). By restricting inclusion to CBCT-derived linear outcomes and applying updated comprehensive searches, this review aims to provide a methodologically robust and clinically interpretable comparison of EARR across these two widely used orthodontic treatment modalities.
2. Materials and Methods
2.1. Protocol Registration and Reporting Standards
This systematic review and meta-analysis was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO; Registration No. CRD420261320269). The study was conducted and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) statement and its updated checklist [29].
A completed PRISMA 2020 checklist is provided in the Supplementary Materials (Supplementary Table S1). The study design, eligibility criteria, outcome definitions, and analytical approach strictly followed the registered protocol to ensure methodological transparency and to minimize selective reporting bias.
2.2. Eligibility Criteria (PICO and Outcome Definition)
Eligibility criteria were defined a priori according to the PICO framework:
Population (P): Human subjects undergoing orthodontic treatment.
Intervention (I): Clear aligner therapy (e.g., Invisalign or other thermoplastic aligner systems).
Comparator (C): Fixed orthodontic appliances (e.g., conventional brackets, self-ligating brackets, passive self-ligating systems).
Outcome (O): External apical root resorption (EARR) assessed using quantitative CBCT-based linear root length measurements (millimeters).
Studies were included if they:
Compared clear aligner therapy with fixed orthodontic appliances.
Used CBCT imaging for EARR assessment.
Reported quantitative root length changes (mm) or provided sufficient data to calculate mean differences.
Were randomized clinical trials (RCTs) or non-randomized clinical studies.
Studies were excluded if they:
Used only two-dimensional radiographic assessment.
Reported qualitative scoring without quantitative data.
Were case reports, reviews, in vitro studies, animal studies, or finite element analyses.
Did not provide a direct comparison between aligner and fixed appliance groups.
2.3. Information Sources and Search Strategy
A comprehensive electronic search was conducted from database inception to January 2026 in the following databases:
To minimize publication bias, grey literature searches were performed in:
The search strategy combined controlled vocabulary and free-text terms related to:
“external apical root resorption”;
“clear aligner” OR “Invisalign”;
“fixed orthodontic appliance” OR “braces” OR “brackets”;
“cone beam computed tomography” OR “CBCT”.
The search strategy was initially developed for PubMed and subsequently adapted for each database according to its specific syntax and indexing system. The detailed search strategies for each database are provided in the Supplementary Materials (Supplementary Table S2).
No restrictions were applied regarding publication year. Only studies published in English were included due to resource limitations for translation.
2.4. Screening and Eligibility Assessment
All records were imported into reference management software and duplicates were removed. Two independent reviewers screened titles and abstracts for eligibility. Potentially relevant articles underwent full-text assessment.
Disagreements were resolved through discussion or consultation with a third reviewer.
The study selection process is illustrated in the PRISMA flow diagram (Figure 1).
2.5. Data Extraction
Two reviewers independently extracted data using a standardized form. Extracted variables included:
Author and year;
Country;
Study design;
Sample size;
Mean age and sex distribution;
Extraction protocol (extraction vs. non-extraction);
CBCT acquisition parameters (voxel size, field of view);
Root length measurement protocol;
Mean root length change (mm);
Standard deviation (SD);
Follow-up duration.
When necessary, corresponding authors were contacted to obtain missing or unclear data, in accordance with the registered protocol.
CBCT imaging in the included studies was performed based on clinical indications rather than solely for research purposes.
2.6. Assessment of Risk of Bias
Risk of bias was independently evaluated by two reviewers.
Randomized controlled trials were assessed using the Cochrane Risk of Bias tool (RoB 2) [30].
Non-randomized studies were assessed using the ROBINS-I tool [31].
Each domain was judged as low, moderate, serious, or critical risk of bias (for ROBINS-I) or low/some concerns/high risk (for RoB 2).
2.7. Effect Measure and Data Synthesis
The primary outcome was the mean difference (MD) in apical root length change (mm) between the clear aligner (CA) and fixed appliance (FA) groups, defined as MD = CA − FA, so that negative values indicate less EARR with aligners. The unit of analysis was the patient-level mean apical root length change; data were extracted at the patient level where available to minimize unit-of-analysis bias.
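To make the effect measure concrete, the following minimal Python sketch computes MD = CA − FA and its standard error from group summary statistics. It is an illustration only and not the analysis code used in this review (which was run in R); the function name `mean_difference`, the normal-approximation 95% interval, and all input values are assumptions chosen for the example.

```python
import math


def mean_difference(mean_ca, sd_ca, n_ca, mean_fa, sd_fa, n_fa):
    """Return (MD, SE, 95% CI) for MD = CA - FA from group summaries.

    Inputs are the mean root length change (mm), SD, and sample size
    of the clear aligner (CA) and fixed appliance (FA) groups.
    """
    md = mean_ca - mean_fa
    # Variance of a difference of two independent group means.
    se = math.sqrt(sd_ca ** 2 / n_ca + sd_fa ** 2 / n_fa)
    # Normal-approximation interval (an assumption of this sketch).
    ci = (md - 1.96 * se, md + 1.96 * se)
    return md, se, ci


# Hypothetical values, not taken from any included study.
md, se, ci = mean_difference(0.6, 0.4, 30, 1.1, 0.5, 30)
print(f"MD = {md:.2f} mm, SE = {se:.3f}")
```

A negative MD in this convention corresponds to less root shortening in the aligner group.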
When studies reported more than one fixed appliance arm, group means and standard deviations were combined using standard formulas for pooling independent groups prior to meta-analysis.
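The text does not state which formulas were used to combine arms. One common choice, assumed here, is the formula for merging two independent groups described in the Cochrane Handbook. The sketch below implements that formula in Python; the function name `combine_groups` and the example numbers are hypothetical.

```python
import math


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two independent groups into one (n, mean, SD).

    Assumed formula: the combined SD accounts for both the
    within-group variances and the gap between the group means.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    numerator = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    )
    sd = math.sqrt(numerator / (n - 1))
    return n, mean, sd


# Hypothetical example: two fixed appliance arms merged into one comparator.
n, mean, sd = combine_groups(20, 1.0, 0.5, 25, 1.2, 0.6)
print(n, round(mean, 3), round(sd, 3))
```

For more than two arms, the same function can be applied sequentially, merging the first two arms and then merging the result with the next.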
Meta-analysis was performed using a random-effects model, considering anticipated clinical and methodological heterogeneity across studies [32]. Between-study variance (tau-squared, τ²) was estimated using the DerSimonian–Laird estimator.
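For readers unfamiliar with the estimator, the following Python sketch shows how a DerSimonian–Laird random-effects pooled estimate is obtained from study-level mean differences and their variances. It is a simplified illustration, not the metafor implementation used for the analysis; the function name `dersimonian_laird` and the input values are assumptions.

```python
import math


def dersimonian_laird(effects, variances):
    """Return (pooled effect, SE, tau^2, Q) under a random-effects model."""
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # Method-of-moments estimate of tau^2, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2, q


# Hypothetical study-level MDs (mm) and variances, for illustration only.
pooled, se, tau2, q = dersimonian_laird([-0.2, -0.8], [0.01, 0.01])
print(round(pooled, 2), round(se, 2), round(tau2, 2), round(q, 1))
```

Because τ² is added to every study's variance, larger between-study variance flattens the weights and widens the confidence interval of the pooled estimate.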
Heterogeneity was assessed using Cochran’s Q test and quantified using the I² statistic. I² values were interpreted as follows: 0–25% (low heterogeneity), 26–50% (moderate heterogeneity), 51–75% (substantial heterogeneity), and >75% (considerable heterogeneity).
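The relationship between Cochran's Q and I², together with the interpretation bands listed above, can be sketched in Python as follows. The function names are hypothetical, and treating non-integer values between bands (e.g., 25.5%) as belonging to the higher band is an assumption of this sketch rather than something specified in the text.

```python
def i_squared(q, k):
    """I^2 (%) from Cochran's Q and the number of studies k."""
    df = k - 1
    if q <= 0:
        return 0.0
    # Share of total variability beyond what chance alone would produce.
    return max(0.0, (q - df) / q) * 100.0


def interpret_i2(i2):
    """Map I^2 (%) onto the bands used in this review."""
    if i2 <= 25:
        return "low"
    if i2 <= 50:
        return "moderate"
    if i2 <= 75:
        return "substantial"
    return "considerable"


# Hypothetical Q statistic from two studies, for illustration only.
value = i_squared(18.0, 2)
print(f"I2 = {value:.1f}% ({interpret_i2(value)})")
```

When Q is smaller than its degrees of freedom, I² is truncated at 0%.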
For pooling, each study contributed study-level mean root length changes for the two groups. When tooth-level data were reported, study-level averages were extracted to avoid unit-of-analysis errors.
All analyses were performed in R (R Foundation for Statistical Computing, Vienna, Austria) using the metafor package [33].
Given the observational nature of the included studies, residual confounding related to treatment allocation and case complexity was anticipated and considered during interpretation of the results.
2.8. Subgroup and Sensitivity Analyses
Pre-specified subgroup analyses were conducted based on:
Extraction vs. non-extraction protocols;
Study design (RCT vs. non-randomized);
Tooth type (anterior vs. mixed, when data permitted).
Sensitivity analyses were performed by excluding one study at a time to evaluate the robustness of pooled estimates.
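The leave-one-out procedure can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the review's R code: the helper `pool_dl` is a compact DerSimonian–Laird pooling routine written for this example, `leave_one_out` is a hypothetical name, and the input values are invented.

```python
import math


def pool_dl(effects, variances):
    """Compact DerSimonian-Laird pooling; returns (pooled effect, SE)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re))


def leave_one_out(effects, variances):
    """Re-pool after omitting each study in turn.

    Returns a list of (omitted index, pooled effect, SE).
    """
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append((i, *pool_dl(rest_e, rest_v)))
    return results


# Hypothetical study-level MDs (mm) and variances, for illustration only.
for idx, est, se in leave_one_out([-0.2, -0.8, -0.5], [0.01, 0.01, 0.01]):
    print(f"omit study {idx}: MD = {est:.2f} (SE {se:.2f})")
```

If the pooled estimate keeps its sign and approximate size across all omissions, no single study is driving the result.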
2.9. Assessment of Reporting Bias
Assessment of publication bias using funnel plots and Egger’s regression test was planned if ≥10 studies were available [34]. Because fewer than 10 studies were included, formal statistical tests for publication bias (e.g., Egger’s regression test) would be underpowered and could yield misleading results. Therefore, publication bias was assessed descriptively and interpreted with caution.
2.10. Certainty of Evidence
The certainty of evidence was evaluated using the GRADE framework [35], considering:
Risk of bias;
Inconsistency;
Indirectness;
Imprecision;
Publication bias.
The overall certainty of evidence was classified as high, moderate, low, or very low.
4. Discussion
This systematic review and meta-analysis synthesized the best available clinical evidence comparing external apical root resorption (EARR) between clear aligner therapy and fixed orthodontic appliances, restricted to studies using quantitative CBCT-based root length outcomes. By limiting inclusion to CBCT-derived linear measurements, we aimed to reduce measurement bias associated with two-dimensional radiographs and enhance cross-study comparability [20, 21, 22, 23]. The principal finding is that clear aligner therapy is associated with less CBCT-measured apical root shortening than fixed appliances, although the certainty of evidence remains constrained by the non-randomized nature of most included studies and residual clinical heterogeneity [31, 35]. However, the limited number of included studies restricts the robustness of the conclusions and reduces the ability to draw definitive clinical inferences.
Across six eligible clinical studies directly comparing aligners with fixed appliances using CBCT-based linear root length change (mm) [16, 17, 18, 36, 37, 38], the pooled effect favored aligners, indicating a moderate reduction in apical root shortening relative to fixed appliances. While the mean difference is numerically modest, it should be interpreted in the context of typical orthodontic EARR magnitudes in anterior teeth and the cumulative biological burden of sustained force systems [6, 7, 11, 12]. Importantly, severe EARR, although less frequent, has been associated with compromised root length and potential long-term prognosis concerns in susceptible teeth, especially maxillary incisors [1, 2, 24]. Accordingly, even moderate reductions in mean apical shortening may be clinically relevant for patients with higher baseline susceptibility (e.g., atypical root morphology, trauma history, or anticipated extensive movement) [3, 4, 5, 24].
A key interpretive point is that EARR is not binary; rather, it exists along a spectrum in which small mean differences may reflect meaningful shifts in the tail of the distribution (i.e., fewer “high-resorption” cases). Because most primary studies report group means, the present synthesis primarily informs average effect and does not fully resolve whether aligners reduce the probability of severe EARR at the individual level—an important direction for future research.
The direction of effect is biologically plausible. Orthodontically induced inflammatory root resorption is mediated by inflammatory and cellular responses within the periodontal ligament (PDL) and adjacent bone, influenced by force magnitude, duration, and continuity [8, 9, 10]. Sustained forces can promote hyalinization and recruitment of clastic cells at the root surface, increasing resorptive activity [8, 9, 10]. Evidence from classic experimental work indicates that interrupted force application is associated with less root resorption than continuous force systems, supporting a mechanistic basis for differences between appliance modalities [7].
Fixed appliances frequently deliver relatively continuous forces through archwire engagement and continuous activation during alignment and space closure, particularly when torque, intrusive mechanics, and complex tooth movements are required [11, 12]. In contrast, aligners commonly apply staged forces with intermittent “force-off” periods between aligner changes, potentially reducing cumulative PDL stress and the duration of sustained hyalinization [7, 13, 14, 15]. Additionally, digital planning inherent to aligner therapy may facilitate more controlled movement trajectories in select scenarios, potentially limiting uncontrolled intrusive vectors or excessive tipping, mechanical patterns often implicated in higher EARR risk [6, 7, 11, 12]. Nevertheless, aligners can also exhibit “uncontrolled tipping” under certain biomechanical conditions, and this limitation is clinically important when interpreting the findings and generalizing them to complex malocclusions.
Between-study heterogeneity was moderate, which is unsurprising given the clinical diversity in extraction protocols, malocclusion types, appliance systems (including passive self-ligating brackets), and CBCT acquisition parameters [16, 17, 18, 36, 37, 38]. Treatment mechanics and the extent of movement are established determinants of EARR [6, 7], and these may differ substantially across cohorts even within the same nominal “aligner” or “fixed” category.
Consistent with our protocol, extraction versus non-extraction (or mixed) treatment protocols represent a particularly plausible source of heterogeneity. Extraction therapy generally involves larger sagittal tooth movement and space closure mechanics that may increase biological risk for EARR, especially in anterior teeth [6, 7, 11, 12]. Such variability substantially limits the clinical interpretability of the pooled estimate and reduces the ability to generalize findings across all orthodontic treatment scenarios. Therefore, the observation that extraction-focused data can show larger absolute resorption values in both arms, while still favoring aligners, should be interpreted as hypothesis-supporting rather than definitive: although the extraction-based study also favored aligners, subgroup evidence is limited in size and number of studies and is insufficient to draw meaningful conclusions regarding extraction protocols.
Sensitivity analyses (e.g., leave-one-out) are essential in a synthesis with a small number of studies because pooled estimates can be disproportionately influenced by a single study with larger sample size, larger effect, or lower variance [32]. The stability of the direction of effect across most included studies supports a consistent association, but precision is still constrained by limited study count and between-study variability.
Publication bias and small-study effects are difficult to judge reliably with fewer than 10 studies. Although funnel plot visualization can be presented, it remains exploratory under these conditions, and formal statistical testing (e.g., Egger’s test) is underpowered and may be misleading [34]. Accordingly, any statements about publication bias should be cautious and framed as inconclusive rather than confirmatory.
A critical methodological nuance in orthodontic EARR research is the unit of analysis. Some studies report outcomes per patient (e.g., averaged across teeth), whereas others report per tooth or per incisor subgroup. It should be noted that primary studies did not consistently account for clustering effects (multiple teeth per patient), and intraclass correlation coefficients were not reported, which may have resulted in overestimation of precision. Therefore, pooled confidence intervals should be interpreted cautiously. Tooth-level analyses, if treated as independent without appropriate clustering adjustment, can inflate precision and narrow confidence intervals, leading to overconfident inferences. Where possible, meta-analyses should prefer patient-level summary measures or appropriately derived composite estimates. The present synthesis prioritized extractable group-level summaries aligned with the protocol outcome definition (CBCT-based linear root length change, mm), but future primary studies should explicitly model within-patient clustering and report both tooth-level and patient-level summaries.
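The effect of ignoring clustering can be illustrated with the standard design effect for equal-sized clusters, DEFF = 1 + (m − 1) × ICC, where m is the number of teeth measured per patient. The Python sketch below is a simplified illustration: the function names are hypothetical, equal cluster sizes are assumed, and the ICC value is invented because the primary studies did not report one.

```python
import math


def design_effect(teeth_per_patient, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (teeth_per_patient - 1) * icc


def effective_sample_size(n_teeth, teeth_per_patient, icc):
    """Number of independent observations the clustered teeth are worth."""
    return n_teeth / design_effect(teeth_per_patient, icc)


# Hypothetical scenario: 30 patients, 4 incisors each, assumed ICC of 0.5.
deff = design_effect(4, 0.5)
n_eff = effective_sample_size(120, 4, 0.5)
# A naive tooth-level standard error would need inflating by sqrt(DEFF).
print(deff, n_eff, round(math.sqrt(deff), 2))
```

Under these assumed values, 120 teeth carry the information of 48 independent observations, so a confidence interval computed as if all teeth were independent would be too narrow.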
Measurement heterogeneity also matters. Although CBCT offers superior accuracy and reproducibility over 2D methods for root length assessment [20, 21, 22, 23], acquisition parameters (voxel size, field of view, reconstruction algorithms) and measurement protocols (landmark selection, observer calibration) can still influence estimates and contribute to heterogeneity. Standardization of CBCT protocols and reporting would improve comparability and reduce measurement-driven variability.
Even when the pooled estimate is statistically significant, the confidence in the effect should be considered through the lens of imprecision and information size. In GRADE terms, imprecision is evaluated by the width of the confidence interval and whether it includes clinically important benefit or harm [35]. Here, the interval favors aligners and does not cross the null, supporting a directionally consistent conclusion. However, because EARR lacks a universally accepted minimal clinically important difference threshold, “clinical importance” should be presented as contextual rather than absolute. This supports a cautious statement: aligner therapy is associated with less EARR on average, but the degree of clinical benefit may vary by patient susceptibility, mechanics, and treatment protocol.
Previous systematic reviews and meta-analyses have reported inconsistent conclusions regarding EARR differences between aligners and fixed appliances, largely due to heterogeneous imaging modalities and outcome definitions, including pooling CBCT measures with 2D radiographic outcomes [25, 26, 27]. Two-dimensional imaging is vulnerable to projection error and sensitivity to tooth angulation changes, which can under- or mis-estimate true root length changes and reduce validity in pooled quantitative comparisons [20, 21, 22, 23]. A recent umbrella review similarly emphasized that the review-level conclusions remain constrained by heterogeneity of primary studies and methodological inconsistencies [28]. The present synthesis advances the field by focusing on a consistent, quantitative CBCT-based linear endpoint and incorporating updated searches, thereby improving interpretability while still acknowledging limitations driven by study design and heterogeneity.
From a clinical standpoint, these results suggest that aligner therapy may be a less biologically burdensome option with respect to apical root shortening in selected patients, particularly those at higher risk for EARR [3, 4, 5, 24]. Nevertheless, appliance selection must remain individualized. Aligners have biomechanical limitations in certain complex movements, and uncontrolled tipping or insufficient root control may occur depending on staging, attachments, and compliance. Therefore, a reduction in mean EARR should not be interpreted as a universal protective effect across all malocclusions or mechanics; rather, it supports the need for risk stratification, careful biomechanical planning, and monitoring. Importantly, EARR is primarily driven by biomechanical factors and individual susceptibility rather than appliance type alone.
Early radiographic detection of EARR remains critical during orthodontic care, as early changes can progress if risk factors persist [24]. Where clinically justified, appropriate imaging and periodic reassessment may help guide force modulation, treatment pacing, or mechanics adjustments in susceptible patients.
The present meta-analysis showed that treatment with clear aligners was associated with significantly lower external apical root resorption than fixed appliances (MD = −0.50 mm). Although the absolute magnitude of this difference is modest, it may be clinically relevant in susceptible patients, for whom treatment modalities that potentially reduce the biological burden on the periodontal ligament may be advantageous; its patient-level clinical importance nevertheless remains uncertain. Accordingly, the present findings should not be interpreted as evidence that clear aligners universally prevent EARR or are inherently superior across all orthodontic scenarios.
The findings should be interpreted in light of several limitations. First, despite comprehensive searching, the number of eligible CBCT-linear comparative studies remains limited, restricting the ability to explore multiple sources of heterogeneity or perform robust meta-regression. Sources of heterogeneity include differences in treatment duration, type of tooth movement (e.g., intrusion, torque), aligner protocols (attachments, staging), and CBCT acquisition parameters. Second, most included studies were non-randomized, and residual confounding is likely even with careful clinical matching and risk of bias assessment using ROBINS-I [31]. Residual confounding due to treatment allocation and case complexity cannot be excluded and may partially explain the observed effect. Clear aligners are often used in less complex cases, which may introduce selection bias. Therefore, appliance-related differences cannot be fully separated from underlying differences in biomechanical complexity. Third, variation in extraction protocols, tooth types evaluated, and measurement parameters contributes to heterogeneity and limits generalizability across all orthodontic contexts. Fourth, publication bias assessment is inherently limited with a small number of studies; funnel plot patterns and Egger-type testing are not definitive under these conditions [34]. Fifth, inconsistent reporting of unit-of-analysis handling and of CBCT acquisition and measurement parameters (including voxel size, landmark identification, and observer calibration) further constrains interpretability and underscores the need for standardized reporting. Additionally, variability among aligner systems (material properties, thickness, wear protocols) and fixed appliance systems further limits direct comparability between studies. Differences in malocclusion severity, extent of tooth movement, previous orthodontic treatment, and dental trauma were not consistently reported, and pooling across these differences may limit clinical interpretability. Therefore, the pooled estimate should be interpreted as an overall trend rather than a uniform clinical effect.
Future studies should prioritize well-designed prospective comparative cohorts and, where feasible, randomized trials with standardized mechanics and CBCT acquisition protocols. Reporting should include voxel size, reconstruction details, reliability metrics, and explicit handling of clustering when tooth-level outcomes are analyzed. Beyond mean root length change, future work should report distributional outcomes (e.g., proportion exceeding clinically meaningful thresholds) and consider patient-level predictors to identify subgroups most likely to benefit. Consensus-driven CBCT-based EARR outcome definitions and core outcome sets would substantially strengthen future evidence synthesis.