The mean (± standard deviation) and median DAI values were 35.4 (±10.9) and 33.0, respectively, with minimum and maximum values of 19 and 98. For the ordinal DHC-IOTN, the values ranged from 1 to 5. The orthodontic treatment need according to the DAI and DHC-IOTN evaluations made by the examiner is presented in Table 1.
The intra-rater reliability assessment resulted in an intraclass correlation coefficient of 0.89 for the DAI (95% CI = 0.64 to 1.0) and 0.87 for the DHC (95% CI = 0.56 to 0.96). Table 2 shows the comparison between the two indices, with the Cohen Kappa and PABAK coefficients.
The time spent (in seconds) to assess the DAI and the DHC-IOTN is presented in Table 3. These variables were not normally distributed (Kolmogorov-Smirnov test, P < 0.05). The time spent to assess the DHC-IOTN was significantly shorter than that for the DAI (Wilcoxon test, P
When comparing the two indices with the gold standard (Table 4), low agreement on the overall diagnosis of treatment need for the models examined was observed (47% for the DAI and 52% for the DHC-IOTN), with a substantial percentage of false positives for both the DAI (41%) and the DHC-IOTN (39%).
The accuracy of the indices, as reflected by the ROC curve, is also presented (Figure 1). In the analysis of the validity of the indices (Table 5), both had high sensitivity and very low specificity, indicating a good ability to identify orthodontic treatment need in patients who have it. However, the positive predictive value (PPV) for both indices was low, which limits the confidence that can be placed in a positive result. In contrast, although the specificity was low, the negative predictive value was high. The new cutoff points (DAI = 31 and DHC = 3) changed the properties of the indices.
The agreement between the gold standard and the DHC assessments in three categories (need, borderline, no need) was also fair (Kappa = 0.18 [95% CI = 0.09 to 0.26]).
Indices can be considered useful for epidemiological and public health applications when they are reliable and valid. Considering the results presented, the DAI and DHC-IOTN could be considered reliable and valid.
For the sample size calculation, we could have used the frequency of orthodontic treatment need as measured by either the DHC-IOTN or the DAI. Considering that orthodontic treatment need based on research with orthodontic study models was about 15.0% [16], we opted for the frequency of orthodontic treatment need determined by the DAI [9] because it ensured a larger sample. The literature has pointed out that it is possible and appropriate to use dental models to validate orthodontic indices [14]. In addition, the reliability of, and agreement between, information obtained clinically and from diagnostic models are high [11]. For these reasons, and for feasibility, we carried out the study using dental models.
Intra-examiner agreement was high for both the original DAI and DHC-IOTN values. The examiner was trained and calibrated in the use of the indices before the evaluation sessions, which contributed to these good results and confirms the need for such steps before an epidemiological survey. However, the examiner was a specialist in orthodontics, whereas epidemiological surveys are normally conducted by general dentists, who may need more prior training. It may therefore be necessary to evaluate the reliability and validity of occlusal indices among general dentists as well. Indices must have a high degree of reproducibility to be useful as research tools. Although the ICC for the DHC-IOTN was lower than that for the DAI, the confidence intervals overlap, suggesting that the reproducibility of the two indices is similar [1
Despite the high percentage of agreement between the two indices, the Cohen Kappa could be considered only fair. However, because most cases were classified as positive (orthodontic treatment need) by both the DAI and the DHC-IOTN, the Cohen Kappa was artificially low. The prevalence-adjusted bias-adjusted Kappa (PABAK) was therefore calculated, and it was considered substantial. Thus, the DAI and DHC-IOTN measure orthodontic treatment need in the same way [15
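The contrast between a fair Cohen Kappa and a substantial PABAK can be illustrated with a minimal sketch. The 2×2 counts below are hypothetical and chosen only to show the effect of a high prevalence of positive classifications; they are not the data in Table 2. The function name and its parameters are likewise introduced here for illustration.

```python
def kappa_and_pabak(both_need, only_a, only_b, neither):
    """Observed agreement, Cohen's kappa and PABAK for two binary ratings.

    both_need: cases rated "need" by both indices
    only_a / only_b: cases rated "need" by one index only
    neither: cases rated "no need" by both indices
    """
    n = both_need + only_a + only_b + neither
    p_o = (both_need + neither) / n  # observed agreement

    # Marginal proportion of "need" for each index
    a_need = (both_need + only_a) / n
    b_need = (both_need + only_b) / n

    # Agreement expected by chance from the marginals
    p_e = a_need * b_need + (1 - a_need) * (1 - b_need)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_o - p_e) / (1 - p_e)
    pabak = 2 * p_o - 1  # depends only on observed agreement
    return p_o, kappa, pabak


# Hypothetical counts (assumption): 100 models, most rated "need" by both indices
p_o, kappa, pabak = kappa_and_pabak(both_need=88, only_a=5, only_b=5, neither=2)
print(f"agreement={p_o:.2f}  kappa={kappa:.2f}  PABAK={pabak:.2f}")
```

With these assumed counts the observed agreement is 90%, yet the Kappa falls in the fair range while the PABAK falls in the substantial range, because almost all of the agreement lies in the positive cell and the chance-expected agreement is therefore very high.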
Most cases showed a need for treatment. This high prevalence is similar to the results of other studies, because such validation studies are usually conducted in orthodontic settings, where study models are made for diagnostic purposes and where most cases requiring treatment are found [14
In validity studies of occlusal indices, an important factor is the definition of the gold standard. The literature has considered a panel of orthodontists to be the gold standard for orthodontic treatment need, as defined by several authors [20]. However, there are at least two ways to define this panel: using a Likert scale [17] or by consensus [12]. The number of orthodontists in this type of panel has varied from two [12] to eighteen [21]. Our study had a consensus panel of three orthodontists, similar to that of Freer and Freer [12]. It seems necessary to standardise the construction of these panels worldwide to better define the need for orthodontic treatment. It is not easy to infer how the validity statistics of the DHC-IOTN and DAI would be affected if the number of specialists on the panel were changed. In Brazil, the postgraduate orthodontic programs vary in content and length of study, which may increase the discordance among specialists’ determinations of treatment need.
The comparison with the gold standard showed a striking number of false positives. This is a worrying finding because about 50% of the cases were judged to need treatment by both indices but not by the committee of experts in orthodontics. An epidemiological survey using these indices may therefore overestimate the need for treatment in a population. Modifying the cutoff points decreased the proportion of false-positive results and increased the proportion of false-negative results for both indices. The overall concordance increased slightly.
In the validity assessment, the DHC-IOTN showed a sensitivity of 100% and the DAI 91%; that is, the probability that the assessment correctly identifies orthodontic treatment need when it is present is high. Both showed low specificity (DAI = 14% and DHC-IOTN = 19%).
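As a reminder of how these validity measures follow from a cross-tabulation of an index against the gold standard, the sketch below computes them from a 2×2 table. The counts are hypothetical and are not taken from Tables 4 or 5; the function name is also an assumption made for illustration.

```python
def validity_measures(tp, fp, fn, tn):
    """Validity measures of a binary index against a gold standard.

    tp: index says "need", gold standard agrees
    fp: index says "need", gold standard says "no need"
    fn: index says "no need", gold standard says "need"
    tn: index says "no need", gold standard agrees
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # need correctly detected
        "specificity": tn / (tn + fp),  # no need correctly detected
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
        "overall_agreement": (tp + tn) / total,
    }


# Hypothetical counts (assumption): high sensitivity, low specificity
measures = validity_measures(tp=27, fp=60, fn=3, tn=10)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In this assumed table, very few true cases are missed, so sensitivity is high, but most cases without need are still flagged, so specificity and PPV are low. This is the same pattern described for the two indices.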
In epidemiological surveys, sensitive tests are useful because they prevent people with a problem from being overlooked. Depending on the problem, this can be a complicating factor in finding a solution. Moreover, specific tests are also desirable because they reduce costs, both for subsequent examinations and for the treatment to be provided. Low specificity is associated with a high proportion of false positives, which undermines the benefit of good sensitivity. It would therefore be desirable to balance these two characteristics, but this did not occur with the DAI and DHC-IOTN. Thus, it is necessary to develop an occlusal index that evaluates orthodontic treatment need more accurately. This development process is not easy and could involve experts in orthodontics, public health, epidemiology, and statistics from all over the world.
The positive predictive value (PPV) for the two indices is low (28 for the DAI and 31 for the DHC-IOTN). Since PPV increases with increasing prevalence, this is another deficiency in the validity of these indices. The modification in cutoff points increased the PPV and specificity for the two indices. However, the other properties (sensitivity and negative predictive value) decreased.
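The dependence of PPV on prevalence can be made explicit with Bayes' rule. The sketch below uses the sensitivity and specificity reported above for the DAI (91% and 14%); the prevalence values are illustrative assumptions, not estimates from this study, and the function name is hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# DAI sensitivity and specificity as reported in the text;
# the prevalence values are assumed for illustration only.
for prevalence in (0.10, 0.30, 0.50, 0.70):
    value = ppv(sensitivity=0.91, specificity=0.14, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={value:.2f}")
```

Because specificity is so low, the false-positive term dominates the denominator at low prevalence, so the PPV stays close to the prevalence itself and only rises as true need becomes more common.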
The deficiencies observed in the characteristics analysed concerning the validity of a test are reflected in the accuracy values (DAI = 0.61 and DHC-IOTN = 0.67). Studies with American [2] or English [1] orthodontists showed better accuracy levels. However, in another study [14], the accuracy of the IOTN was very similar to our results. The validity of an index can depend on the origin of the orthodontic experts who serve as the gold standard [14]. As discussed previously, an expert’s opinion is currently regarded as the best determinant of treatment need because of the difficulty in using occlusal indices to identify and quantify the objective signs of malocclusion and orthodontic treatment need [10]. Therefore, the aggregate decision of orthodontic specialists is generally regarded as the gold standard against which any occlusal index should be validated [20]. The different methods of obtaining the gold standard in validation studies could also explain the different accuracy results for the occlusal indices [12
The time spent on the DAI assessment was longer than that for the DHC-IOTN. This is probably because only the worst occlusal feature is recorded by the DHC-IOTN [3]. In other words, the identification is made through a hierarchical scale of occlusal anomalies, whereas several occlusal features related to spacing and the dentition are recorded to obtain the final DAI score. Reducing the time needed for index application is always important, especially in population studies [22]. We did not assess the aesthetic component of the IOTN because we evaluated dental models [11]; including this component would increase the evaluation time. A disadvantage of the DHC-IOTN is that the ruler proposed for the index is not easily found, whereas the instrument used to measure the DAI (a periodontal probe) is easily accessible.
Some limitations of this study should be pointed out. The study was conducted with a small group of Brazilian orthodontists, and the sample, although probabilistic, is representative of a single orthodontics service in Brazil. Other studies should be conducted to assess the validity and reproducibility of the DAI and IOTN among Brazilian orthodontists. Although orthodontic treatment options in public health in Brazil are limited, the choice of a reliable and valid instrument is essential for a correct epidemiological diagnosis. Additionally, the incorporation of subjective evaluation in the epidemiological diagnosis of orthodontic treatment need is highly relevant [23]. The studied indices are epidemiological tools that aim to assess the degree of treatment need, not to make diagnoses or aid in orthodontic planning. Epidemiological indices usually underestimate the condition under study, which did not occur in this case. Further research in this area is important so that epidemiological findings can be used reliably for planning and evaluating public health actions.