1. Introduction
The existence of cochlear dead regions was first suggested by Brian C. J. Moore in 2000 [1]. The inner hair cells, which are the transducers of the cochlea and are responsible for converting vibration patterns on the basilar membrane into action potentials in the auditory nerve, may be non-functional over a certain region of the cochlea, leading to a loss of transduction of the auditory signal from outside the cochlea to the auditory nerve. A region of the cochlea that has lost its characteristic transducing ability is defined as a cochlear dead region (DR).
Patients with cochlear dead regions have shown poor understanding of speech in noisy conditions and report less satisfaction with hearing aids than patients without cochlear dead regions [2,3]. To achieve better clinical outcomes, a correct differential diagnosis of cochlear dead regions is imperative to help clinicians provide the best possible care to their patients. However, it is still challenging to predict DRs in patients with hearing loss based on clinical and audiologic findings [4].
In our previous study, we adopted a machine learning (ML)-based approach to develop and validate cochlear dead region prediction models as a function of frequency [5]. ML continues to evolve with advances in computing power and computer science. However, we observed some limitations in our approach: when counted by frequency, the prevalence of cochlear dead regions was only about 6.7% of the overall data, producing an imbalanced class distribution. The predictive power or accuracy of a model can be degraded by such an uneven distribution. A chief problem with imbalanced classification datasets is that standard machine learning algorithms do not perform well on them, because many algorithms rely on the class distribution of the training dataset to gauge the likelihood of observing examples of each class when the model makes predictions. The minority class is therefore sometimes treated as less important than the majority class, resulting in greater attention to, and better performance on, the majority class.
To overcome this imbalanced dataset issue, the present study adopted oversampling techniques to duplicate examples in the minority class or to synthesize new examples from the existing examples in the minority class, and then compared the model performance on the two different datasets: the original data and the oversampled data.
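The simplest of these techniques, duplicating minority-class examples until the classes are balanced, can be sketched as follows. This is a minimal illustration on synthetic data; the class sizes and feature dimensions are hypothetical, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic imbalanced dataset: 940 majority samples (no dead region)
# versus 60 minority samples (dead region present).
X = rng.normal(size=(1000, 4))
y = np.array([0] * 940 + [1] * 60)

# Random oversampling: draw minority-class rows with replacement
# until both classes contain the same number of examples.
minority_idx = np.where(y == 1)[0]
n_extra = (y == 0).sum() - (y == 1).sum()
extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)

X_bal = np.vstack([X, X[extra_idx]])
y_bal = np.concatenate([y, y[extra_idx]])

print((y_bal == 0).sum(), (y_bal == 1).sum())  # both classes now 940
```

Duplication balances the class counts but adds no new information; SMOTE, used in this study, instead synthesizes new points between existing minority examples.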
3. Results
A total of 555 ears from 380 patients (3770 test samples) were included in the study. The descriptive statistics of the study population are listed in our previous study [5]. After applying the SMOTE method, the sample size grew to 15,494 samples. Of those 15,494 test samples, the overall frequency-specific prevalence of cochlear dead regions was 18.14%, compared with 6.7% in the original study population. The prevalence of VS etiologies, which had the lowest prevalence among the study population, increased from 7.03% to 8.64% following application of the SMOTE method. In addition, the mean WRS value was 78.9 ± 23.8%; in the original data, the value was 82.1 ± 23.9%. Descriptive statistics of the original data and oversampled data can be found in Table 1.
The distribution of cochlear dead regions according to the hearing thresholds at each frequency in the original data and in the oversampled data is illustrated in Figure 1. The proportion of the minority class in the original data, which corresponds to the frequency-specific prevalence of cochlear dead regions, was increased by the SMOTE method in the oversampled data.
The results of the CT model with the original data were described in our previous study [5]. In summary, factors such as the word recognition score (WRS) (break point: 42%), disease type (SSNHL or VS diagnosis), and the pure-tone average (PTA) across four frequencies (0.5 kHz, 1 kHz, 2 kHz, and 4 kHz), when higher than 47 dB (indicating a poor overall hearing threshold), were used to split the data and detect cochlear dead regions (Figure 2a). Sex, age, and side were not used as significant split variables in the CT models.
In the CT model with the oversampled data, WRS (break point: 90%), pure-tone thresholds at each frequency (break point: 52 dB), and age were used to split the data and detect cochlear dead regions (Figure 2b). With a WRS break point of 90%, the ratio of cochlear dead regions increased from 0.18 to 0.35, indicating that using a WRS lower than 90% as a predictive factor roughly doubles predictability. In contrast to the original data, the diagnosed hearing loss etiology did not increase the model’s predictive power in the oversampled data. Interestingly, for those aged under 65 with a lower WRS, a lower PTA increased the ratio of cochlear dead regions to 0.62.
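A classification tree discovers such break points by searching candidate thresholds that best separate the classes at each split. The following sketch, using scikit-learn on synthetic data, shows a depth-1 tree recovering a WRS-like threshold; the feature, labels, and true cut-off of 50 are made up for illustration and are not the study's values:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Synthetic single-feature data: samples with a "WRS" below 50
# belong to the positive (dead-region) class.
wrs = rng.uniform(0, 100, size=2000).reshape(-1, 1)
y = (wrs.ravel() < 50).astype(int)

# A depth-1 tree (a decision stump) exposes the learned break point
# as the threshold of its root node.
tree = DecisionTreeClassifier(max_depth=1).fit(wrs, y)
break_point = tree.tree_.threshold[0]
print(round(break_point, 1))  # close to the true cut-off of 50
```

With noise-free labels the stump's threshold lands between the two samples straddling the true cut-off; real clinical data would yield a noisier break point such as the 42% or 90% values reported above.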
The results of the multivariate logistic regression analyses for cochlear dead region detection in both the original data and the oversampled data are shown in Table 2. In the original data, VS was significantly associated with the presence of cochlear dead regions (odds ratio = 2.40, 95% confidence interval (CI) 1.36–4.23, p = 0.002), while MD showed a significantly lower odds ratio than the SNHL group (odds ratio = 0.36, 95% CI 0.18–0.73, p = 0.004) [5]. In the oversampled data, VS (odds ratio = 2.67, 95% CI 2.19–3.24, p < 0.001), SSNHL (odds ratio = 1.56, 95% CI 1.31–1.85, p < 0.001), and MD (odds ratio = 0.51, 95% CI 0.41–0.63, p < 0.001) all showed a significant association with the presence of cochlear dead regions. The pure-tone thresholds at the evaluated frequencies showed a positive association with cochlear dead region presence, whereas the odds ratio for the pure-tone average was lower than that of the control groups in both the original data and the oversampled data. The frequencies of 3000 Hz and 4000 Hz showed lower odds ratios than the reference frequency of 1000 Hz (odds ratio = 0.22, 95% CI 0.11–0.46, p < 0.001 and odds ratio = 0.31, 95% CI 0.15–0.62, p < 0.001, respectively) in the original data. In the oversampled data, all of the evaluated frequencies showed significant odds ratios relative to the reference frequency.
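For reference, an odds ratio and its 95% CI of the kind reported in Table 2 can be computed from a 2×2 exposure-by-outcome table with the standard log-odds formula (in the multivariate setting, odds ratios are instead obtained as exp(β) of the regression coefficients). The counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

# Hypothetical 2x2 table: rows = exposed/unexposed (e.g., VS vs. control),
# columns = outcome present/absent (dead region yes/no).
a, b = 30, 70   # exposed:   outcome yes / outcome no
c, d = 50, 350  # unexposed: outcome yes / outcome no

odds_ratio = (a * d) / (b * c)                 # (a/b) divided by (c/d)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of ln(OR)
lo = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
hi = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(round(odds_ratio, 2), round(lo, 2), round(hi, 2))  # 3.0 1.78 5.05
```

An interval excluding 1.0, as here, corresponds to a statistically significant association at the 5% level.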
The accuracy results of the 10-fold cross-validation of the LR and CT with the original data were 0.82 (±0.02) and 0.93 (±0.01), respectively. The accuracy results of the 10-fold cross-validation of the LR and CT with the oversampled data were 0.66 (±0.02) and 0.86 (±0.01), respectively.
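The 10-fold cross-validation procedure behind these accuracy estimates can be sketched as follows with scikit-learn. The synthetic features, label rule, and model settings are stand-ins for the study's predictors and its LR and CT models:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic imbalanced dataset: 4 numeric features, roughly 10% positives.
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + rng.normal(scale=2.0, size=1000) > 2.8).astype(int)

# 10-fold cross-validation: fit on 9 folds, score accuracy on the held-out
# fold, and repeat so every fold serves once as the test set.
for name, model in [("LR", LogisticRegression(max_iter=1000)),
                    ("CT", DecisionTreeClassifier(max_depth=4))]:
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: {scores.mean():.2f} (±{scores.std():.2f})")
```

Note that with such an imbalanced label, plain accuracy is dominated by the majority class, which is one reason the study's accuracy figures should be read alongside the class prevalence.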
4. Discussion
The ML-based approach provided well-validated and ready-to-use prediction models for clinical practitioners. However, most ML-based classification methods tend not to perform well on minority class examples, a problem common to most medical datasets. Our previous study observed an imbalanced data distribution, with only 6.7% of samples in the target class diagnosed as cochlear dead regions. Rahman et al. proposed both oversampling (SMOTE) and undersampling (cluster-based) methods to balance a clinical dataset [15]. They used a cardiovascular disease dataset (823 instances and 26 attributes from the University of Hull) with two classification algorithms (the Fuzzy Unordered Rule Induction Algorithm [FURIA] and the Classification And Regression Tree [CART]) to classify the rebalanced data. The results showed improved sensitivity for both classification algorithms (FURIA: from 64.17% to 83.78%; CART: from 67.50% to 84.21%). This implies that class rebalancing techniques can be applied to clinical datasets and that their performance depends on the ML technique used thereafter.
We attempted to overcome the imbalanced dataset issue in the present study by adopting an oversampling technique to create more evenly distributed data. We applied an oversampling technique (SMOTE) to duplicate examples in the minority class or to synthesize new examples from existing examples in the minority class. SMOTE works by selecting minority-class examples that are close together in the feature space, drawing a line between those examples, and generating a new sample at a point along that line.
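That interpolation step can be sketched in a few lines of NumPy. This is a simplified single-sample illustration on synthetic data, not the full SMOTE algorithm, which repeats the step for many minority examples and conventionally uses k = 5 neighbors:

```python
import numpy as np

rng = np.random.default_rng(1)

# A small minority class in a 2-D feature space.
minority = rng.normal(loc=5.0, scale=0.5, size=(20, 2))

# 1) Pick a minority example and find its k nearest minority neighbors.
i, k = 0, 5
dists = np.linalg.norm(minority - minority[i], axis=1)
neighbor_idx = np.argsort(dists)[1:k + 1]  # skip the point itself

# 2) Choose one neighbor and place a synthetic sample at a random
#    point on the line segment between the two examples.
j = rng.choice(neighbor_idx)
gap = rng.uniform(0.0, 1.0)
synthetic = minority[i] + gap * (minority[j] - minority[i])

print(synthetic)  # lies between minority[i] and minority[j]
```

Because the new point is a convex combination of two real minority samples, it stays inside the region of feature space the minority class already occupies rather than simply repeating an existing record.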
The accuracies in the 10-fold cross-validation of the LR and CT models were 0.82 (±0.02) and 0.93 (±0.01) for the original dataset and 0.66 (±0.02) and 0.86 (±0.01) for the oversampled dataset, respectively. These results indicate that accuracy on the oversampled data was lower than on the original data. Given that the overall frequency-specific prevalence of cochlear dead regions was much higher after applying the SMOTE method than in the original data, this may affect the model’s accuracy on true positive data, i.e., samples in which cochlear dead regions are present. Our hypothesis was that the machine learning models could overcome unevenly distributed data given larger clinical samples; however, the results indicate that simply applying oversampling methods did not improve model performance. More powerful clinical indicators, such as the audiometric configuration or the type of defect when hearing a certain phoneme associated with the cochlear dead region in question, may be needed to more accurately detect the presence of cochlear dead regions.
There are some limitations in our ability to detect cochlear dead regions in the general population and in increasing the applicability of these findings to clinical practice. First, TEN testing in clinics is time-consuming. Introducing more advanced technologies, such as optical coherence tomography, may reduce the time needed for testing through an easily used, non-invasive method for evaluating inner ear structures; however, its imaging depth is limited to a few millimeters due to low tissue penetration [16]. Therefore, the TEN (HL) test is still performed as the primary tool for assessing cochlear dead regions in clinics, although it has limitations that prevent its wider use.
In addition, it is still challenging to predict the presence of cochlear dead regions in patients with hearing loss using clinical and audiologic findings [4], because the prevalence and possible indicators of cochlear dead regions differ depending on the study population [4,17,18,19]. No definitive indicators have been proven to predict the presence of cochlear dead regions in the general population. Although previous studies have revealed some reliable indicators of cochlear dead regions based on detection by TEN (HL) tests [8,20], there are only a few reports that certain hearing thresholds at each test frequency may be possible markers for cochlear dead regions. In addition, no previous studies have specifically addressed cochlear dead region prediction as a function of frequency-specific information. For example, there are no reports on whether frequency, hearing thresholds, or etiologies of hearing loss have been weighted to address cochlear dead region prediction. Therefore, it is still unclear which patients, beyond those with severe-to-profound hearing loss, should undergo TEN (HL) testing, which prevents cochlear dead region assessment from being more deeply integrated into clinical practice.
We addressed the prediction model for cochlear dead regions according to frequency and compared the results between the original data and the SMOTE-oversampled data. The study results can be helpful for predicting or detecting cochlear dead regions according to frequency in clinical settings. The comparison results imply the existence of hidden associated factors or non-linearity in predicting cochlear dead regions. Previous studies have only assessed the prevalence of cochlear dead regions by ear, not by frequency. In our previous study, we assessed the prevalence of cochlear dead regions by frequency; however, the prevalence was only 6.7%, suggesting that an imbalanced class distribution could affect the accuracy of ML model development.
In contrast with previous studies [4,18], the feature “high frequencies” was negatively associated with cochlear dead regions in the present LR model used on the original dataset. This result may depend on the study population. Our study enrolled both ARHL and NIHL patients, populations that show a low prevalence of cochlear dead regions despite poorer hearing thresholds at high frequencies. After adopting the SMOTE method to correct the imbalanced distribution, the results remained similar to those of the original data. Therefore, our results suggest that the feature “high frequencies” might be negatively associated with the presence of cochlear dead regions and should be considered together with the diagnosed etiologies.
Because this study included patients with diverse etiologies and vastly differing levels of hearing loss, the indicators identified here may be beneficial for determining which patients have suspected cochlear dead regions. WRS, etiology type, and hearing thresholds at specific frequencies are all informative factors. WRS, which has been addressed in a previous study [21], can be a useful indicator for predicting cochlear dead regions; this was demonstrated in the present study with the oversampled data, although the cut-off value may vary depending on the study population. In the CT models, a break point of 43% in WRS was suggested for the original dataset and a break point of 90% for the oversampled dataset.
The features of disease etiology, such as VS and MD, did not show any predictive power in the CT model with the oversampled data. This might be related to the oversampling method, which alters the distribution of samples across disease etiologies and thus attenuates the predictive effects of etiology.
This study has some limitations. First, because the possible risk factors for cochlear dead regions are not fully understood, and the present study was performed retrospectively, we could not assess all possible features. The use of a limited feature set may affect the predictive power of the ML models. Second, we applied the SMOTE technique to synthesize examples in the minority class. However, oversampling may exaggerate clinically unimportant features, and this could not be assessed because we still cannot determine which clinical features are most strongly associated with, and most valuable for predicting, cochlear dead regions. If we had selectively oversampled certain features identified in our previous study, the prediction power would have been biased toward minority classes with high predictive power that could not be elucidated in the original dataset. Therefore, we applied the SMOTE method without feature selection to observe the features of this clinical dataset.