Interpretable Acoustic Features from Wakefulness Tracheal Breathing for OSA Severity Assessment
Abstract
1. Introduction
2. Materials and Methods
3. Results
3.1. Clinical Interpretation of Top Features
3.2. Top-Ranked Features
3.3. Feature Stability Across Folds and Models
3.4. Correlation with Anthropometric Data
4. Discussion
4.1. Clinical Relevance of Key Features
4.1.1. Non-OSA vs. Mild-OSA
4.1.2. Non-OSA vs. Moderate-OSA
4.1.3. Non-OSA vs. Severe-OSA
4.1.4. Mild-OSA vs. Moderate-OSA
4.1.5. Mild-OSA vs. Severe-OSA
4.1.6. Moderate-OSA vs. Severe-OSA
4.1.7. Physiological Themes Across Models
- Escalating turbulence, bandwidth, and centroid shifts correspond to rising Reynolds number and more pronounced vibration/snoring as the airway narrows [44].
- Event complexity (diagonals, perimeters, shape metrics) tracks segmented, irregular airflow fragments as OSA severity increases [43].
- Variability (interquartile range (IQR), standard deviation (SD), entropy) reveals unstable ventilatory control, frequent arousals, and abrupt collapse–recovery dynamics [43].
- Amplitude/energy (mean, RMS, total) reflect increasing respiratory effort, loud post-obstructive inspiration, and compensatory surges in disease progression.
4.2. Rationale for Multi-Class OSA Severity Stratification
4.3. Physiological and Clinical Interpretation of Feature Linkage to Severity
4.3.1. Acoustic Signatures of Airway Chaos and Ventilatory Effort (AHI Correlation)
- Decreased Texture Energy (Acoustic Disorganization): This feature exhibits the strongest negative correlation with OSA severity. Texture energy quantifies the uniformity and repetitiveness of local patterns in a spectrogram. It is computed by summing the squared values of the co-occurrence (or filtered spectrogram) matrix, so it reflects how consistent and regular the acoustic structure is. As severity increases, the pharyngeal airway becomes intrinsically more compliant and prone to intermittent vibration and collapse, leading to flow separation and highly random, broadband turbulence. This shift from structured, laminar-like noise to chaotic, broadband turbulence disrupts the consistency of the spectrogram and produces a significant decrease in texture energy. The feature therefore serves as an acoustic marker of increasing pharyngeal instability and vulnerability [39,42].
- Increased Skewness (Compensatory Drive): Conversely, the high positive correlation (r ≈ 0.99) between spectral skewness and OSA severity indicates systematic changes in the distribution of sound amplitude. Positive skewness signifies a heavier tail toward high-amplitude values. Physiologically, this represents the subject’s increased reliance on intermittent, high-force maneuvers (such as a forceful, highly turbulent inhalation or a loud snort/gasp) to maintain adequate flow against increasing pharyngeal resistance. Clinically, this feature is an acoustic signature of heightened respiratory drive and compensatory effort, which scales directly with disease burden [41,45].
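The two descriptors above can be illustrated with a minimal pure-Python sketch. It assumes a spectrogram patch that has already been quantized to integer gray levels and a one-sided power spectrum supplied as parallel lists. The function names, the horizontal pixel offset, and the frequency-weighted definition of skewness (following the Appendix B description of spectral asymmetry) are illustrative assumptions, not the exact implementation used in this study.

```python
from collections import Counter


def glcm_energy(image, dx=1, dy=0):
    """Texture energy of a quantized 2-D patch (hypothetical helper).

    image  : list of rows holding integer gray levels
    dx, dy : pixel offset that defines co-occurring pairs (assumed choice)
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    # Normalize pair counts to joint probabilities, then sum their squares.
    return sum((n / total) ** 2 for n in counts.values())


def spectral_skewness(freqs, power):
    """Third standardized moment of frequency, weighted by spectral power."""
    total = sum(power)
    weights = [p / total for p in power]
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum(w * (f - centroid) ** 2 for f, w in zip(freqs, weights))
    third = sum(w * (f - centroid) ** 3 for f, w in zip(freqs, weights))
    return third / variance ** 1.5


# A perfectly uniform patch has the maximum energy of 1.0;
# a patch with varied neighboring levels scores lower.
uniform_patch = [[1, 1, 1], [1, 1, 1]]
varied_patch = [[0, 1, 2], [2, 0, 1]]
print(glcm_energy(uniform_patch), glcm_energy(varied_patch))
```

Under these assumptions, a more disorganized patch spreads its probability mass over more gray-level pairs, which lowers the sum of squares; a spectrum dominated by low frequencies with a tail toward high frequencies yields positive skewness.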
4.3.2. Morphological and Spectral Markers of Flow Limitation and Airway Dynamics
- Spectral Bandwidth and Flux (Venturi Effect): These features are markers of dynamic flow behavior. Airflow accelerating through a narrow, compliant pharyngeal segment (the site of flow limitation, a manifestation of the Venturi effect) generates high-velocity jets. High spectral flux reflects the rapid, transient changes in the power spectrum as these turbulent jets form and dissipate during the breathing cycle, while increased bandwidth reflects a broader spread of acoustic energy across frequencies. Together, these changes are consistent with the presence and severity of flow-limiting segments, where the degree of narrowing modulates the strength and spectral extent of turbulent eddies [40,55].
- Fractal Dimension and Complexity (Non-linear System Behavior): The high-ranking fractal dimension quantifies the non-linear complexity of the signal. Increased airway resistance and turbulence are hallmarks of a system pushed toward instability. A higher fractal dimension suggests a highly complex, chaotic, and less predictable airflow pattern, aligning with established non-linear control theory, which views the respiratory system as operating close to a chaotic bifurcation point [50,51].
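As a rough illustration of the three descriptors discussed above, the following sketch computes spectral bandwidth, spectral flux, and the Katz fractal dimension in plain Python. Several definitions of each quantity exist in the literature; the ones chosen here (power-weighted standard deviation around the centroid, Euclidean distance between successive spectra, and the planar-curve form of Katz's estimator with unit sample spacing) are assumptions for illustration, and the function names are hypothetical.

```python
import math


def centroid_and_bandwidth(freqs, power):
    """Power-weighted mean frequency and the spread (std) around it."""
    total = sum(power)
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    variance = sum(p * (f - centroid) ** 2 for f, p in zip(freqs, power)) / total
    return centroid, math.sqrt(variance)


def spectral_flux(frames):
    """Euclidean distance between successive spectra (one value per transition)."""
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(prev, cur)))
        for prev, cur in zip(frames, frames[1:])
    ]


def katz_fd(signal):
    """Katz fractal dimension of a waveform treated as a planar curve."""
    steps = [math.hypot(1.0, b - a) for a, b in zip(signal, signal[1:])]
    length = sum(steps)  # total curve length L
    # Largest distance from the first sample to any other sample (diameter d).
    diameter = max(math.hypot(i, x - signal[0]) for i, x in enumerate(signal))
    n = len(steps)  # number of steps along the curve
    return math.log10(n) / (math.log10(n) + math.log10(diameter / length))


# A straight ramp is the least complex curve; a zigzag is more complex.
print(katz_fd([0, 1, 2, 3, 4]), katz_fd([0, 1, 0, 1, 0, 1, 0, 1, 0]))
```

In this sketch, energy spread over more frequency bins widens the bandwidth, abrupt frame-to-frame spectral changes raise the flux, and a more irregular waveform pushes the Katz estimate above 1.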
4.3.3. Validation Through Established Anatomical Risk Factors
4.4. Correlation with Anthropometric Data
4.5. Alignment with Prior Wakefulness-Based OSA Studies
- Instead of a binary OSA vs. non-OSA classification, our framework performs multi-level severity stratification (non-OSA, mild, moderate, and severe), offering finer clinical granularity.
- We introduce novel morphological and time–frequency gap descriptors, extracted from harmonic–percussive (HP) decompositions and spectrogram bounding boxes, which capture airway-specific acoustic signatures not examined in previous work.
- Our use of ensemble-based models with SHAP explainability provides transparent quantification of feature contributions and robustness validation (Abs ΔAUC < 0.04), establishing reproducibility across subjects and folds.
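The robustness criterion mentioned in the last point can be sketched as follows. The sketch assumes that Abs ΔAUC denotes the absolute difference between the AUC values obtained on two data splits (for example, two folds); the exact pairing of splits is not restated here, so this reading, the function names, and the toy scores are assumptions made for illustration only.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def abs_delta_auc(split_a, split_b):
    """Absolute AUC difference between two (scores, labels) splits."""
    return abs(auc(*split_a) - auc(*split_b))


# Toy splits (hypothetical values, not study data).
fold_a = ([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
fold_b = ([0.20, 0.30, 0.60, 0.90], [0, 0, 1, 1])
print(auc(*fold_a), auc(*fold_b), abs_delta_auc(fold_a, fold_b))
```

A small value indicates that a feature or model discriminates the two classes about equally well on both splits, which is the sense in which the threshold above is used as a stability check.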
4.6. Comparison with Other Awake Screening Modalities
4.7. Limitations and Future Work
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AHI | Apnea–Hypopnea Index |
| AUC | Area Under the Receiver Operating Characteristic Curve |
| AbsDeltaAUC | Absolute Change in AUC |
| BBox | Bounding Box |
| BMI | Body Mass Index |
| CV | Cross-Validation |
| CQT | Constant-Q Transform |
| FM | Frequency Mean |
| FM2M | Mean Frequency Ratio |
| HP | Harmonic–Percussive |
| IQR | Interquartile Range |
| k-Fold CV | K-Fold Cross-Validation |
| MFCC | Mel-Frequency Cepstral Coefficients |
| ML | Machine Learning |
| NC | Neck Circumference |
| NEP | Negative Expiratory Pressure |
| OSA | Obstructive Sleep Apnea |
| RFE | Recursive Feature Elimination |
| ROC | Receiver Operating Characteristic |
| RQA | Recurrence Quantification Analysis |
| RMS | Root Mean Square |
| SD | Standard Deviation |
| SHAP | Shapley Additive exPlanations |
| SNR | Signal-to-Noise Ratio |
| TBS | Tracheal Breathing Sounds |
| ΔAUC | Change in AUC |
Appendix A. Top Selected Features for Each Model
| Feature Number | Non-OSA vs. Mild-OSA | Non-OSA vs. Moderate-OSA | Non-OSA vs. Severe-OSA | Mild-OSA vs. Moderate-OSA | Mild-OSA vs. Severe-OSA | Moderate-OSA vs. Severe-OSA |
|---|---|---|---|---|---|---|
| 1 | MouthExpiration_Range_SpectralEntropy | Average_Range_SpectralSkewness | NoseInspiration_Range_SpectralKurtosis | NoseExpiration_Range_MeanPower | MouthExpiration_Range_SCBW_Bandwidth | NoseExpiration_Range_SpectralCrest |
| 2 | MouthExpiration_Range_SpectralCrest | NoseInspiration_Range_SpectralSkewness | NoseInspiration_Range_FreqSkewness | MouthExpiration_IQR | Average_Average_BBoxes_EnFBiD | Average_BBox_NumHoles |
| 3 | MouthExpiration_Range_SpectralCrest | Average_BBox_Entropy | NoseExpiration_Range_SCBW_Bandwidth | MouthInspiration_MFCCMean | NoseExpiration_BBox_Range | MouthInspiration_WaveletApproxEntropy |
| 4 | NoseInspiration_Range_SCBW_Bandwidth | MouthInspiration_Range | Average_BBox_TextureEnergy | NoseExpiration_WaveletApproxSkewness | Average_Range_SpectralCrest | NoseExpiration_PeakCount |
| 5 | MouthInspiration_MFCCMean | MouthExpiration_WaveletApproxSpectralBandwidth | Average_BBox_CentroidY | MouthInspiration_MFCCMean | NoseInspiration_BBox_Median | MouthInspiration_MFCCMean |
| 6 | MouthInspiration_MFCCMedian | MouthExpiration_WaveletDetailEntropy | NoseInspiration_BBox_AspectRatio | MouthInspiration_MFCCStd | NoseInspiration_BBox_PeakValue | MouthExpiration_Range_SCBW_Bandwidth |
| 7 | NoseExpiration_WaveletApproxMean | MouthInspiration_IQR | MouthInspiration_BBox_EulerNumber | MouthInspiration_SpectralCentroid | NoseInspiration_BBox_FreqCentroid | MouthInspiration_Range_RMS |
| 8 | NoseExpiration_Range_BandPower | NoseInspiration_CoefVariation | MouthInspiration_MeanValue | MouthInspiration_SpectralBandwidth | NoseInspiration_BBox_PeakValue | NoseInspiration_Range_MeanPower |
| 9 | NoseExpiration_Range_SpectralSkewness | NoseExpiration_BBox_Entropy | MouthInspiration_StdValue | MouthInspiration_MFCCKurtosis | NoseInspiration_BBox_FreqCentroid | NoseInspiration_Range_SCBW_Bandwidth |
| 10 | MouthInspiration_BBox_NumHoles | Average_DIR_Histogram | MouthExpiration_TotalEnergy | MouthExpiration_WaveletSpectralCentroid | NoseInspiration_BBox_SpectralFlux | NoseInspiration_Range_SpectralEnergy |
| 11 | MouthExpiration_BBox_FreqCentroid | Average_AVP_Histogram | MouthExpiration_NormalizedEnergy | MouthExpiration_CQTBandwidthDynamicRange | NoseInspiration_BBox_ConnectedComponents | NoseInspiration_Range_BandPower |
| 12 | MouthExpiration_BBox_Kurtosis | MouthInspiration_ZeroCrossing | MouthExpiration_StdAbsBi | MouthExpiration_MFCCStd | NoseInspiration_BBox_EulerNumber | MouthInspiration_BBox_RegionArea |
| 13 | MouthExpiration_BBox_Area | MouthInspiration_WaveletDetailSpectralBandwidth | MouthExpiration_SymStd | MouthExpiration_SpectralCentroid | NoseInspiration_BBox_IQR | NoseInspiration_BBox_IQR |
| 14 | MouthExpiration_BBox_Diagonal | MouthInspiration_WaveletDetailSpectralBandwidth | MouthExpiration_Perimeter | MouthExpiration_MFCCSkewness | NoseInspiration_BBox_FreqCentroid | NoseInspiration_BBox_TextureHomogeneity |
| 15 | MouthExpiration_BBox_NumHoles | MouthInspiration_MFCCMean | MouthExpiration_CentroidY | MouthExpiration_MFCCKurtosis | NoseInspiration_BBox_BBoxArea | NoseInspiration_BBox_NumHoles |
| 16 | MouthExpiration_BBox_AspectRatio | MouthInspiration_MFCCKurtosis | MouthExpiration_Perimeter | MouthExpiration_MFCCKurtosis | NoseInspiration_BBox_FreqCentroid | NoseExpiration_BBox_TextureContrast |
| 17 | MouthExpiration_BBox_TextureContrast | MouthInspiration_PBP_Skewness | MouthExpiration_CentroidX | NoseInspiration_ZeroCrossing | NoseInspiration_BBox_TextureEnergy | NoseExpiration_BBox_AspectRatio |
| 18 | NoseInspiration_BBox_Diagonal | MouthInspiration_TP_Histogram | NoseInspiration_Average_BBox_EnFBiD | NoseInspiration_WaveletApproxSkewness | NoseInspiration_BBox_EulerNumber | NoseExpiration_BBox_Perimeter |
| 19 | NoseInspiration_BBox_TextureEnergy | MouthInspiration_TP_MaxProb | NoseInspiration_Average_BBox_MeanBiDF | NoseInspiration_WaveletApproxSpectralCentroid | NoseInspiration_BBox_StdValue | Average_WaveletDetail_MaxToMinRatio |
| 20 | Average_WaveletApproxMaxToMinRatio | MouthInspiration_EP_MaxEnergy | NoseInspiration_Average_BBox_WCOBDFx | NoseInspiration_WaveletDetailKurtosis | NoseInspiration_BBox_Entropy | Average_WaveletDetail_Kurtosis |
| 21 | NoseInspiration_HurstExponent | MouthExpiration_WaveletApproxKurtosis | NoseInspiration_Average_BBox_WCOBDFy | NoseInspiration_CQTStdPower | NoseInspiration_BBox_Entropy | MouthInspiration_LyapunovExponentMean |
| 22 | NoseExpiration_KatzFD | MouthExpiration_WaveletApproxSkewness | NoseInspiration_Average_BBox_Hf1 | NoseInspiration_CQTSkewnessPower | NoseInspiration_BBox_TextureContrast | MouthInspiration_BandPowerHigh |
| 23 | MouthInspiration_Range_PeakFrequency | MouthExpiration_PBP_Kurtosis | NoseInspiration_Average_BBox_Hf2 | NoseInspiration_CQTTemporalCentroid | NoseInspiration_BBox_TextureEnergy | NoseInspiration_WaveletApproxEntropy |
| 24 | MouthExpiration_BBox_CentroidY | MouthExpiration_PBP_Entropy | NoseInspiration_Average_BBox_Reserved | NoseInspiration_CQTSpectralCentroid | NoseInspiration_BBox_StdValue | NoseInspiration_WaveletApproxEntropy |
| 25 | MouthExpiration_BBox_PeakValue | MouthInspiration_Range_FM2MFreq | NoseInspiration_Average_BBox_EnFBiD | NoseInspiration_CQTBandwidthDynamicRange | NoseExpiration_BBox_ConnectedComponents | NoseInspiration_WaveletDetailSpectralCentroid |
| 26 | MouthExpiration_BBox_TextureCorrelation | MouthInspiration_Range_MeanPower | NoseInspiration_Average_BBox_MeanBiDF | NoseInspiration_MFCCSkewness | NoseExpiration_BBox_Median | NoseExpiration_WaveletApproxSkewness |
| 27 | NoseInspiration_BBox_Range | MouthInspiration_Range_RMS | NoseInspiration_Average_BBox_WCOBDFx | NoseInspiration_MFCCKurtosis | NoseExpiration_BBox_IQR | Average_AVP_Mean |
| 28 | NoseInspiration_BBox_Std | MouthInspiration_Range_FM2MFreq | NoseInspiration_Average_BBox_WCOBDFy | NoseInspiration_SpectralBandwidth | NoseExpiration_BBox_TextureCorrelation | MouthInspiration_KatzFD |
| 29 | NoseExpiration_BBox_FractalDimension | MouthInspiration_Range_BandPower | NoseInspiration_Average_BBox_Hf1 | NoseInspiration_MFCCKurtosis | NoseExpiration_BBox_TextureEnergy | MouthInspiration_LyapunovExponentMax |
| 30 | NoseExpiration_BBox_Std | MouthInspiration_Range_Std | NoseInspiration_Average_BBox_Hf2 | NoseInspiration_TP_Skewness | NoseExpiration_BBox_FractalDimension | MouthExpiration_MFCCMedian |
| 31 | MouthExpiration_MFCCMedian | MouthInspiration_Range_FM2MFreq | NoseInspiration_Average_BBox_Reserved | NoseExpiration_WaveletDetailSkewness | NoseExpiration_BBox_ConnectedComponents | NoseInspiration_WaveletDetailEntropy |
| 32 | MouthExpiration_PBP_Kurtosis | MouthInspiration_Range_FM2MFreq | NoseInspiration_Average_BBox_BisEntropy | NoseExpiration_CQTMeanPower | NoseExpiration_BBox_CentroidY | NoseInspiration_CQTGaborEnergyMean |
| 33 | MouthExpiration_TP_Histogram | MouthInspiration_Range_FM2MFreq | NoseInspiration_BBox_SpectralFlux | NoseExpiration_CQTSkewnessPower | NoseExpiration_BBox_Compactness | NoseInspiration_MFCCMedian |
| 34 | MouthExpiration_TP_MaxProb | MouthExpiration_Range_Std | NoseInspiration_BBox_ConnectedComponents | NoseExpiration_CQTSpectralDynamicsStd | NoseExpiration_BBox_Energy | Average_Range_Maximum |
| 35 | MouthExpiration_TP_Ratio | MouthExpiration_Range_Std | NoseInspiration_BBox_Skewness | MouthInspiration_ZeroCrossing | NoseExpiration_BBox_FreqCentroid | Average_Range_SCBW_Bandwidth |
Appendix B. Analysis of Clinical Interpretation of Top Features for All Models
Appendix B.1. Non-OSA vs. Mild-OSA
- MouthInspiration_Range_FreqSkewness: Measures asymmetry of the power spectrum during mouth inspiration within the relevant frequency range. Positive values indicate dominance of lower frequencies; negative values indicate dominance of higher frequencies. Reflects shifts in frequency energy distribution due to airflow changes, turbulence, or airway geometry.
- Average_BBox_TextureEnergy: Indicates uniformity of bispectral texture. Higher energy suggests consistent patterns, while lower values indicate heterogeneous local coupling. In OSA, irregular airflow can produce less uniform bispectral coupling patterns, consistent with more complex or intermittent nonlinear flow-tissue interactions.
- Average_BBox_FrequencyCentroidX: Centroid of bispectral energy along the f1-axis. Shifts in this centroid indicate changes in dominant coupling frequency, reflecting airflow-tissue interactions or turbulence associated with airway narrowing.
- Average_BBox_TextureCorrelation: Measures gray-level co-occurrence texture correlation. Higher values indicate structured bispectral patterns, suggesting stable nonlinear coupling. Reduced correlation reflects heterogeneous coupling, consistent with variable airflow instability.
- Average_BBox_Perimeter: Perimeter of high-intensity regions within the bispectral bounding box. Larger perimeter suggests fragmented coupling, possibly indicating variable airflow instability.
- Average_BBox_TextureHomogeneity: Measures uniformity of gray-level values. Higher values indicate smoother patterns; lower values indicate irregular textures, reflecting turbulent airflow.
- Average_BBox_IQRValue: Range between the first and third quartiles of bispectral intensity. Larger IQR indicates greater variability in the central portion of the sound signal, potentially associated with airflow turbulence.
- Average_BBox_Compactness: How compact high-intensity regions are. Less compact patterns suggest more diffuse sound energy, consistent with irregular airflow.
- MouthInspiration_Range_SpectralSkewness: Asymmetry of the power spectrum during mouth inspiration within a relevant frequency range. Changes may indicate airflow limitation or obstruction.
Appendix B.2. Non-OSA vs. Moderate-OSA
- Average_Range_Maximum: Highest amplitude within the relevant frequency range, representing peak breathing sound energy. Higher values may indicate more turbulent airflow due to airway narrowing.
- Average_BBox_BoundingBoxDiagonal: Length of the bounding box diagonal for a bispectral patch. Larger diagonal implies broader distribution of nonlinear interactions, consistent with more complex flow disturbances in moderate OSA.
- Average_Range_MeanPower: Average spectral energy across the relevant frequency range. Reflects overall intensity of breathing sounds in that band.
- Average_BBox_ConnectedComponents: Counts distinct high-intensity regions in the bispectral patch. More regions suggest fragmented coupling, potentially due to multiple flow structures or vibration sites.
- Average_BBox_TextureContrast: Local contrast differences in bispectral patches. Higher contrast indicates stronger differences between regions of coupling, compatible with intermittent obstructions.
- Average_BBox_FrequencyCentroidY: Centroid of bispectral energy along the f2-axis. Shifts indicate changes in dominant coupling frequencies, reflecting altered airflow-tissue interactions.
- Average_Range_BandPower: Total spectral power within a relevant mid-frequency range. Higher power may indicate stronger turbulent airflow.
- Average_Range_MeanPower: Average spectral energy, indicative of airflow consistency.
- Average_Range_RMS: Root-mean-square amplitude within the frequency range. Higher RMS indicates more variable or intense breathing sounds, associated with stronger flow disturbances.
Appendix B.3. Non-OSA vs. Severe-OSA
- Average_BBox_MeanValue: Average amplitude or intensity of the sound signal. In OSA, increased respiratory effort produces louder, higher-amplitude sounds.
- Average_BBox_TextureEnergy: Uniformity of bispectral texture. Lower energy indicates more heterogeneous local coupling, reflecting complex airflow-tissue interactions.
- MouthInspiration_Range_FreqSkewness: Asymmetry of the power spectrum during mouth inspiration within the relevant frequency range. Positive values indicate low-frequency dominance, negative values high-frequency dominance.
- Average_BBox_FractalDimension: Complexity or irregularity of local patterns. Higher values indicate more intricate structures, compatible with chaotic airflow in severe OSA.
- Average_BBox_MedianValue: Median local intensity; less sensitive to outliers than the mean. Changes reflect shifts in breathing sound intensity.
- Average_BBox_EnergyValue: Total energy within the bounding box. Higher energy indicates stronger, more intense breathing sounds, often associated with airway narrowing.
- Average_BBox_EulerNumber: Topological feature reflecting the number of connected components minus holes. Indicates complexity of sound events, altered by intermittent airflow or airway vibrations.
- Average_BBox_AspectRatio: Width-to-height ratio of detected sound events; changes may reflect altered spectral characteristics due to airway dynamics.
- Average_BBox_ConnectedComponents: Counts distinct high-intensity regions; more regions suggest fragmented coupling due to multiple airflow structures.
- Average_BBox_KurtosisValue: Measures tailedness of the sound signal distribution. Higher kurtosis indicates sudden, sharp sound events associated with airway collapse or reopening.
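The topological descriptors in this list (connected components and the Euler number, defined above as components minus holes) can be illustrated on a small binary mask. The sketch below assumes a thresholded patch given as a list of 0/1 rows, 8-connectivity for high-intensity regions, and 4-connectivity for the background; the thresholding step, the connectivity choices, and all names are assumptions for illustration.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(mask, value, neighbors):
    """Count connected regions equal to `value` and how many touch the border."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    total = touching_border = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != value or (r, c) in seen:
                continue
            total += 1
            stack, on_border = [(r, c)], False
            seen.add((r, c))
            while stack:  # iterative flood fill over one region
                y, x = stack.pop()
                if y in (0, rows - 1) or x in (0, cols - 1):
                    on_border = True
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == value and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            touching_border += on_border
    return total, touching_border


def euler_number(mask):
    """Return (Euler number, components, holes) for a binary mask."""
    components, _ = count_regions(mask, 1, N8)
    background, open_background = count_regions(mask, 0, N4)
    holes = background - open_background  # enclosed background regions only
    return components - holes, components, holes


# One ring (a component with a hole) plus one isolated pixel.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(euler_number(mask))
```

In this toy mask there are two components and one hole, so the Euler number is 1; more fragmented patterns raise the component count, while enclosed gaps lower the Euler number.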
Appendix B.4. Mild-OSA vs. Moderate-OSA
- MouthExpiration_BBox_AspectRatio: Width-to-height ratio during mouth expiration; indicates alterations in spectral characteristics due to airway dynamics.
- Average_BBox_MedianValue: Median local intensity. Changes reflect general shifts in breathing sound intensity.
- Average_BBox_TextureEnergy: Uniformity of sound texture; lower values suggest heterogeneous patterns due to turbulent airflow.
- MouthExpiration_Range_FM2MFreq: Ratio of frequency-modulated to mean spectral energy in a relevant range; reflects redistribution of low-frequency energy, indicating airflow variations.
- Average_BBox_FrequencyBandwidthY: Effective bandwidth along the frequency axis; broader bandwidth suggests wider participation of frequencies, consistent with turbulent airflow.
- Average_BBox_Perimeter: Perimeter of high-intensity regions; larger perimeter indicates more fragmented structure, consistent with variable airflow instability.
- Average_BBox_FrequencyCentroidY: Centroid of bispectral energy along f2-axis; shifts indicate changes in dominant frequency regions.
- MouthExpiration_BBox_TextureEnergy: Uniformity of sound pattern; lower values indicate less consistent sound patterns, reflecting turbulence or intermittent obstruction.
Appendix B.5. Mild-OSA vs. Severe-OSA
- MouthInspiration_BBox_FrequencyCentroidX: Centroid along f1-axis; shifts indicate changes in dominant frequency region, reflecting airflow dynamics.
- Average_Average_BBoxes_Entropy: Overall entropy across bounding boxes; higher values indicate irregular airflow or turbulent sounds in severe OSA.
- Average_Range_RMS: Root-mean-square amplitude within the relevant frequency range; higher RMS indicates variable or stronger breathing sound intensity.
- Average_BBox_EnergyValue: Total energy in bounding box; higher energy reflects increased respiratory effort or airway narrowing.
- MouthInspiration_Range_StandardDeviation: Variability of sound amplitude; higher SD indicates greater fluctuation in airflow or turbulence.
- Average_Range_FM2MFreq: Low-frequency energy ratio; reflects subtle airflow variations or airway patency changes.
- MouthInspiration_BBox_TextureContrast: Local contrast; higher values indicate strong differences between adjacent sound regions.
- MouthExpiration_BBox_FrequencyBandwidthY: Bandwidth along frequency axis; broader bandwidth indicates more turbulent airflow.
- Average_Range_StandardDeviation: Variability of amplitude; higher SD reflects irregular airflow.
- MouthExpiration_Range_SpectralEnergy: Total spectral energy within relevant range; higher energy corresponds to stronger, more turbulent airflow.
Appendix B.6. Moderate-OSA vs. Severe-OSA
- Average_BBox_MedianValue: Median intensity; indicates overall breathing sound level.
- Average_BBox_IQRValue: Range between first and third quartiles; larger IQR suggests greater variability, linked to turbulence.
- Average_BBox_EnergyValue: Total energy; higher values reflect stronger, more intense breathing sounds due to airflow restriction.
- Average_BBox_RangeValue: Difference between maximum and minimum values; larger range indicates pronounced variations in sound intensity.
- Average_BBox_KurtosisValue: Measures tailedness; higher kurtosis indicates sudden peaks in sound, linked to airway collapse or reopening.
- Average_BBox_Compactness: How compact high-intensity regions are; less compact indicates more diffuse sound energy due to turbulent airflow.
- Average_BBox_StdValue: Dispersion of sound values; higher SD indicates greater variability, reflecting irregular airflow.
- Average_BBox_IQRValue: Central variability of sound; larger values indicate more turbulence.
- Average_BBox_EulerNumber: Complexity of sound events; reflects structure or continuity of airflow events.
- Average_BBox_EntropyValue: Randomness or unpredictability of sound; higher entropy indicates more chaotic or turbulent airflow.
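The intensity statistics listed above (median, IQR, range, SD, kurtosis, entropy) can be computed from the values inside one bounding box as in the following sketch. It uses only the Python standard library; the inclusive quartile method, the non-excess kurtosis convention (3.0 for a Gaussian), and the 8-bin equal-width histogram for entropy are assumptions for illustration, and the function name is hypothetical.

```python
import math
import statistics
from collections import Counter


def intensity_summary(values, bins=8):
    """Summary statistics of the intensities inside one bounding box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    # Kurtosis as the fourth standardized moment; undefined for constant input.
    kurtosis = (
        sum((v - mean) ** 4 for v in values) / len(values) / sd ** 4
        if sd > 0 else float("nan")
    )
    # Shannon entropy (bits) of an equal-width histogram of the values.
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    hist = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    entropy = -sum(
        (n / len(values)) * math.log2(n / len(values)) for n in hist.values()
    )
    return {
        "median": median,
        "iqr": q3 - q1,
        "range": hi - lo,
        "sd": sd,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


# Evenly spread values versus a mostly flat box with one sharp peak
# (toy numbers, not study data).
print(intensity_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
print(intensity_summary([0, 0, 0, 0, 0, 0, 0, 0, 10]))
```

With these toy inputs, the single sharp peak produces a higher kurtosis, whereas the evenly spread values produce a higher histogram entropy, matching the qualitative interpretations given in the list.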
Appendix C. Analysis for Non-OSA vs. Mild-OSA
| Feature Name | AUC Test | Corr Test |
|---|---|---|
| MouthInspiration_Range_FreqSkewness | 0.787037 | 0.498762 |
| Average_BBox_TextureEnergy | 0.666667 | 0.301309 |
| Average_BBox_FrequencyCentroidX | 0.657407 | 0.145422 |
| Average_BBox_TextureCorrelation | 0.657407 | 0.145286 |
| Average_BBox_Perimeter | 0.657407 | 0.0772519 |
| Average_BBox_TextureEnergy | 0.644444 | 0.0914112 |
| Average_BBox_TextureHomogeneity | 0.638889 | 0.0837961 |
| Average_BBox_IQRValue | 0.62963 | 0.211449 |
| Average_BBox_Compactness | 0.62963 | 0.211449 |
| MouthInspiration_Range_SpectralSkewness | 0.62963 | 0.237871 |
| Feature Name | AbsDeltaAUC |
|---|---|
| MouthInspiration_Range_SpectralEntropy | 0.0102916 |
| Average_BBox_TextureEnergy | 0.0241135 |
| Average_BBox_FrequencyCentroidY | 0.0441738 |
| Average_BBox_SpectralFlux | 0.0498553 |
| Average_BBox_EntropyValue | 0.0613248 |
| Average_BBox_BoundingBoxDiagonal | 0.0613248 |
| Average_BBox_Compactness | 0.0627066 |
| Average_BBox_SpectralFlux | 0.0728632 |
| Average_BBox_FrequencyCentroidX | 0.0728632 |
| Average_BBox_FractalDimension | 0.0860684 |
| Feature Name | Anthropometric Feature | Pearson Correlation |
|---|---|---|
| Average_BBox_MeanValue | Sex | 0.998692 |
| Average_BBox_NumHoles | MPS | 0.997553 |
| Average_BBox_FrequencyCentroidX | MPS | 0.997288 |
| Average_BBox_StdValue | Sex | 0.997255 |
| Average_BBox_CoefVariation | Sex | 0.997255 |
| Average_BBox_Perimeter | Sex | 0.99691 |
| Average_BBox_MeanValue | MPS | 0.994605 |
| Average_BBox_Compactness | NC | 0.994605 |
| Average_BBox_TextureContrast | NC | 0.993768 |
| Average_BBox_NumHoles | NC | 0.993507 |
| Feature Name | Pearson Correlation |
|---|---|
| Average_BBox_MeanValue | 0.994605 |
| Average_BBox_Compactness | 0.994605 |
| Average_BBox_AspectRatio | 0.984433 |
| Average_BBox_Compactness | 0.984433 |
| Average_BBox_MedianValue | 0.978554 |
| Average_BBox_FrequencyCentroidY | 0.973494 |
| Average_BBox_FrequencyBandwidthY | 0.969688 |
| MouthExpiration_Range_SCBW_Bandwidth | 0.969226 |
| Average_BBox_TextureContrast | 0.960109 |
| Average_BBox_TextureEnergy | 0.951338 |
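For reference, the Pearson correlation coefficient reported in the tables above can be computed as in the minimal sketch below. The function name and the toy values are hypothetical and are not taken from the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy example: an acoustic feature against a binary-coded variable.
feature = [0.2, 0.4, 0.9, 1.1]
binary_variable = [0, 0, 1, 1]
print(pearson_r(feature, binary_variable))
```

When one variable is binary-coded, as with sex, this coefficient is the point-biserial correlation.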
Appendix D. Analysis for Non-OSA vs. Moderate-OSA
| Feature Name | AUC Test | Corr Test |
|---|---|---|
| MouthInspiration_Range_FreqSkewness | 0.787037 | 0.498762 |
| Average_Range_Maximum | 0.849794 | 0.458931 |
| Average_BBox_BoundingBoxDiagonal | 0.81893 | 0.537497 |
| Average_Range_MeanPower | 0.816872 | 0.4697 |
| Average_BBox_ConnectedComponents | 0.792181 | 0.400149 |
| Average_BBox_TextureContrast | 0.788066 | 0.421496 |
| Average_BBox_BoundingBoxDiagonal | 0.781893 | 0.479976 |
| Average_BBox_FrequencyCentroidY | 0.76749 | 0.336108 |
| Average_Range_BandPower | 0.738683 | 0.437269 |
| MouthInspiration_Range_FreqSkewness | 0.738683 | 0.436397 |
| Feature Name | AbsDeltaAUC |
|---|---|
| Average_BBox_FrequencyBandwidthX | 0.00244068 |
| Average_BBox_RangeValue | 0.00810185 |
| Average_BBox_AspectRatio | 0.015558 |
| Average_BBox_EnergyValue | 0.0266176 |
| Average_BBox_MeanValue | 0.0277022 |
| Average_BBox_EulerNumber | 0.0319368 |
| Average_BBox_KurtosisValue | 0.0366769 |
| Average_BBox_RangeValue | 0.0470624 |
| Average_BBox_BoundingBoxArea | 0.0470624 |
| Average_BBox_FrequencyBandwidthX | 0.047544 |
| Feature Name | Anthropometric Feature | Pearson Correlation |
|---|---|---|
| Average_BBox_TextureHomogeneity | Sex | 0.999783 |
| Average_BBox_RangeValue | BMI | 0.997506 |
| NoseExpiration_Range_MeanPower | Smoke History | 0.996784 |
| Average_BBox_FrequencyCentroidY | NC | 0.995961 |
| Average_BBox_BoundingBoxDiagonal | Smoke History | 0.990975 |
| Average_BBox_TextureHomogeneity | MPS | 0.989529 |
| Average_BBox_IQRValue | NC | 0.987245 |
| Average_BBox_TextureHomogeneity | Age | 0.982519 |
| Average_BBox_MeanValue | Sex | 0.982193 |
| Average_BBox_ConnectedComponents | Smoke History | 0.980487 |
| Feature Name | Pearson Correlation |
|---|---|
| Average_BBox_RangeValue | 0.962649 |
| Average_BBox_TextureCorrelation | 0.96016 |
| Average_BBox_Perimeter | 0.930377 |
| Average_BBox_MeanValue | 0.917263 |
| Average_BBox_TextureEnergy | 0.911276 |
| Average_BBox_FrequencyCentroidX | 0.89864 |
| Average_BBox_SkewnessValue | 0.882517 |
| Average_BBox_IQRValue | 0.877315 |
| Average_BBox_StdValue | 0.844572 |
| Average_BBox_FractalDimension | 0.735205 |
Appendix E. Analysis for Non-OSA vs. Severe-OSA
| Feature Name | AUC Test | Corr Test |
|---|---|---|
| Average_BBox_MeanValue | 0.860248 | 0.505319 |
| Average_BBox_TextureEnergy | 0.775463 | 0.444318 |
| MouthInspiration_Range_FreqSkewness | 0.775 | 0.413793 |
| Average_BBox_FractalDimension | 0.73913 | 0.429386 |
| Average_BBox_MedianValue | 0.733333 | 0.242264 |
| Average_BBox_EnergyValue | 0.729167 | 0.216856 |
| Average_BBox_EulerNumber | 0.716667 | 0.246949 |
| Average_BBox_AspectRatio | 0.708333 | 0.200236 |
| Average_BBox_ConnectedComponents | 0.708333 | 0.215239 |
| Average_BBox_KurtosisValue | 0.708333 | 0.267374 |
| Feature Name | AbsDeltaAUC |
|---|---|
| Average_BBox_SkewnessValue | 0.00123128 |
| Average_BBox_FrequencyBandwidthX | 0.0025107 |
| Average_BBox_FrequencyBandwidthX | 0.00366667 |
| MouthInspiration_Range_FreqSkewness | 0.00366667 |
| Average_BBox_Compactness | 0.00418763 |
| Average_BBox_FrequencyBandwidthX | 0.004408 |
| Average_BBox_KurtosisValue | 0.00454831 |
| Average_BBox_SkewnessValue | 0.00578492 |
| Average_BBox_EulerNumber | 0.00608255 |
| Average_BBox_ConnectedComponents | 0.00933082 |
| Feature Name | Anthropometric Feature | Pearson Correlation |
|---|---|---|
| Average_BBox_EnergyValue | NC | 0.998475 |
| Average_BBox_AspectRatio | MPS | 0.99834 |
| Average_BBox_StdValue | Age | 0.995421 |
| Average_BBox_EulerNumber | Sex | 0.995285 |
| Average_BBox_Perimeter | MPS | 0.99373 |
| Average_BBox_IQRValue | Sex | 0.991446 |
| Average_BBox_NumHoles | Smoke History | 0.989047 |
| Average_BBox_ConnectedComponents | Age | 0.988745 |
| Average_BBox_EnergyValue | Age | 0.988332 |
| Average_BBox_MeanValue | Age | 0.987492 |
| Feature Name | Pearson Correlation |
|---|---|
| Average_BBox_PeakValue | 0.895716 |
| Average_BBox_BoundingBoxDiagonal | 0.881183 |
| Average_BBox_BoundingBoxDiagonal | 0.85574 |
| Average_BBox_Perimeter | 0.846486 |
| Average_BBox_KurtosisValue | 0.835585 |
| Average_BBox_Perimeter | 0.828251 |
| Average_BBox_MeanValue | 0.769224 |
| Average_BBox_FrequencyCentroidX | 0.759227 |
| Average_BBox_EnergyValue | 0.752154 |
| Average_BBox_SkewnessValue | 0.734339 |
Appendix F. Analysis for Mild-OSA vs. Moderate-OSA
| Feature Name | AUC Test | Corr Test |
|---|---|---|
| MouthExpiration_BBox_AspectRatio | 0.883333 | 0.496958 |
| Average_BBox_MedianValue | 0.788889 | 0.451368 |
| Average_BBox_TextureEnergy | 0.783333 | 0.258151 |
| MouthExpiration_Range_FM2MFreq | 0.6875 | 0.187878 |
| Average_BBox_FrequencyBandwidthY | 0.671875 | 0.251735 |
| Average_BBox_FrequencyBandwidthY | 0.655556 | 0.0760239 |
| Average_BBox_Perimeter | 0.655556 | 0.0760239 |
| Average_BBox_FrequencyCentroidY | 0.644531 | 0.221861 |
| MouthExpiration_BBox_TextureEnergy | 0.644444 | 0.220908 |
| Average_BBox_TextureEnergy | 0.641667 | 0.294824 |
| Feature Name | AbsDeltaAUC |
|---|---|
| MouthExpiration_BBox_CoefVariation | 0.00180556 |
| Average_BBox_TextureEnergy | 0.00291667 |
| Average_BBox_FrequencyCentroidY | 0.00443837 |
| MouthExpiration_Range_FM2MFreq | 0.00593891 |
| Average_BBox_FrequencyBandwidthX | 0.00599845 |
| Average_BBox_CentroidY | 0.00813298 |
| Average_BBox_FrequencyBandwidthY | 0.0124323 |
| Average_BBox_CentroidY | 0.0142974 |
| Average_BBox_SkewnessValue | 0.0192591 |
| Average_BBox_SkewnessValue | 0.0218056 |
| Feature Name | Anthropometric Feature | Pearson Correlation |
|---|---|---|
| Average_BBox_IQRValue | Sex | 0.998547 |
| MouthExpiration_FrequencyCentroidX | MPS | 0.998212 |
| Average_BBox_CentroidY | MPS | 0.99816 |
| MouthExpiration_StdValue | NC | 0.997504 |
| Average_BBox_MeanValue | MPS | 0.997397 |
| MouthExpiration_FrequencyCentroidX | NC | 0.996663 |
| Average_BBox_FrequencyCentroidY | Sex | 0.996409 |
| Average_BBox_Perimeter | Age | 0.996071 |
| Average_BBox_FrequencyCentroidX | Sex | 0.995729 |
| Average_BBox_Perimeter | MPS | 0.994929 |
| Feature Name | Pearson Correlation |
|---|---|
| Average_BBox_TextureHomogeneity | 0.990363 |
| Average_BBox_BoundingBoxDiagonal | 0.984096 |
| Average_BBox_EntropyValue | 0.9703 |
| Average_BBox_EntropyValue | 0.964663 |
| Average_BBox_FrequencyBandwidthX | 0.962357 |
| Average_BBox_CentroidY | 0.962357 |
| Average_BBox_FrequencyCentroidX | 0.956978 |
| Average_BBox_MedianValue | 0.953038 |
| Average_BBox_NumHoles | 0.952526 |
| Average_BBox_TextureEnergy | 0.951101 |
Appendix G. Analysis for Mild-OSA vs. Severe-OSA
| Feature Name | AUC Test | Corr Test |
|---|---|---|
| MouthInspiration_FrequencyCentroidX | 0.844444 | 0.157994 |
| Average_BBox_EnTBiD | 0.777778 | 0.483544 |
| Average_Range_RMS | 0.744444 | 0.67878 |
| Average_BBox_EnergyValue | 0.733333 | 0.513402 |
| MouthInspiration_Range_StdDev | 0.722222 | 0.776042 |
| Average_Range_FM2MFreq | 0.714286 | 0.572164 |
| MouthInspiration_TextureContrast | 0.711111 | 0.380311 |
| MouthExpiration_FrequencyBandwidthY | 0.69375 | 0.214055 |
| Average_Range_StdDev | 0.688889 | 0.621574 |
| MouthExpiration_Range_SpectralEnergy | 0.688889 | 0.516899 |
| Feature Name | AbsDeltaAUC |
|---|---|
| MouthExpiration_Perimeter | 0.00291667 |
| MouthExpiration_FrequencyBandwidthX | 0.00291667 |
| MouthInspiration_PeakValue | 0.0041751 |
| Average_BBox_NumHoles | 0.0042735 |
| MouthExpiration_Perimeter | 0.00583333 |
| MouthInspiration_ConnectedComponents | 0.00600058 |
| MouthExpiration_CentroidX | 0.00666667 |
| MouthExpiration_NumHoles | 0.0075 |
| MouthExpiration_PeakValue | 0.0075 |
| MouthInspiration_StdValue | 0.00816946 |
| Feature Name | Anthropometric Feature | Pearson Correlation |
|---|---|---|
| MouthExpiration_PeakValue | NC | 0.99998 |
| MouthExpiration_EntropyValue | BMI | 0.99901 |
| MouthInspiration_FrequencyCentroidX | BMI | 0.998535 |
| MouthInspiration_SpectralFlux | BMI | 0.997205 |
| MouthExpiration_PeakValue | NC | 0.996188 |
| MouthInspiration_CentroidY | MPS | 0.996019 |
| MouthExpiration_Range_RMS | NC | 0.99593 |
| MouthExpiration_Range_MeanPower | Sex | 0.994836 |
| MouthInspiration_EulerNumber | Sex | 0.99472 |
| MouthExpiration_Range_BandPower | Sex | 0.994715 |
| Feature Name | Pearson Correlation |
|---|---|
| MouthInspiration_IQRValue | 0.988287 |
| MouthInspiration_EulerNumber | 0.986967 |
| MouthInspiration_AspectRatio | 0.980666 |
| MouthInspiration_CentroidX | 0.965072 |
| MouthInspiration_FrequencyBandwidthX | 0.958011 |
| MouthInspiration_RegionArea | 0.942624 |
| MouthExpiration_StdValue | 0.902562 |
| MouthInspiration_KurtosisValue | 0.879297 |
| MouthInspiration_IQRValue | 0.875097 |
| MouthInspiration_MedianValue | 0.872257 |
Appendix H. Analysis for Moderate-OSA vs. Severe-OSA
| Feature Name | AUC Test | Corr Test |
|---|---|---|
| Average_BBox_MedianValue | 0.71875 | 0.239624 |
| Average_BBox_IQRValue | 0.690972 | 0.354004 |
| Average_BBox_EnergyValue | 0.680556 | 0.361267 |
| Average_BBox_RangeValue | 0.642361 | 0.211027 |
| Average_BBox_KurtosisValue | 0.637153 | 0.397161 |
| Average_BBox_Compactness | 0.615625 | 0.423041 |
| Average_BBox_StdValue | 0.607143 | 0.121142 |
| Average_BBox_IQRValue | 0.607143 | 0.121142 |
| Average_BBox_EulerNumber | 0.59375 | −0.0542083 |
| Average_BBox_EntropyValue | 0.59375 | −0.0327421 |
| Feature Name | AbsDeltaAUC |
|---|---|
| Average_BBox_MedianValue | 0.00355392 |
| Average_BBox_RegionArea | 0.0053935 |
| Average_BBox_FrequencyCentroidX | 0.0101997 |
| Average_BBox_IQRValue | 0.0132378 |
| MouthInspiration_Range_SpectralEnergy | 0.0278361 |
| Average_BBox_KurtosisValue | 0.0325428 |
| Average_BBox_PeakValue | 0.0373264 |
| Average_BBox_EnergyValue | 0.0503472 |
| Average_BBox_CoefVariation | 0.0576721 |
| Average_BBox_SpectralFlux | 0.0649846 |
| Feature Name | Anthropometric Feature | Pearson Correlation |
|---|---|---|
| Average_BBox_BoundingBoxArea | NC | 0.997757 |
| Average_BBox_RangeValue | Sex | 0.997111 |
| Average_BBox_NumHoles | Smoke History | 0.996614 |
| Average_BBox_Perimeter | Smoke History | 0.996614 |
| Average_BBox_FrequencyBandwidthX | Smoke History | 0.995328 |
| Average_BBox_RegionArea | NC | 0.992481 |
| Average_BBox_EnergyValue | MPS | 0.991225 |
| NoseExpiration_Range_SpectralCrest | Age | 0.990493 |
| Average_BBox_PeakValue | Sex | 0.988759 |
| Average_BBox_Perimeter | Sex | 0.987818 |
| Feature Name | Pearson Correlation |
|---|---|
| Average_BBox_RegionArea | 0.977981 |
| Average_BBox_ConnectedComponents | 0.966506 |
| Average_BBox_TextureHomogeneity | 0.937707 |
| Average_BBox_MeanValue | 0.937068 |
| Average_BBox_RegionArea | 0.925231 |
| Average_BBox_BoundingBoxArea | 0.915942 |
| Average_BBox_EnergyValue | 0.909118 |
| Average_BBox_CentroidY | 0.906971 |
| Average_BBox_FrequencyCentroidX | 0.903769 |
| Average_Range_StdDev | 0.890411 |
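The per-feature quantities tabulated in Appendices F to H (AUC Test, Corr Test, AbsDeltaAUC) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the AUC uses the rank-based (Mann-Whitney) formulation, and AbsDeltaAUC is taken here as the absolute difference between two AUC values (for example, from two folds), which is an assumption about how that column is defined.

```python
from math import sqrt


def univariate_auc(values, labels):
    """Rank-based AUC of one feature: fraction of (positive, negative)
    pairs in which the positive sample has the larger feature value.
    Ties count as half a win. Labels are 1 (positive) or 0 (negative)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def abs_delta_auc(auc_a, auc_b):
    """Assumed stability measure: absolute difference between two AUCs."""
    return abs(auc_a - auc_b)


# Toy data (hypothetical, not from the study): one feature, two classes.
toy_values = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = univariate_auc(toy_values, toy_labels)
```

With very few samples per fold, Pearson correlations close to 1 can arise by chance, so values of this kind are best read alongside the fold sizes reported in the cohort tables.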
References
- Rizzo, D.; Baltzan, M.; Sirpal, S.; Dosman, J.; Kaminska, M.; Chung, F. Prevalence and regional distribution of obstructive sleep apnea in Canada: Analysis from the Canadian Longitudinal Study on Aging. Can. J. Public Health 2024, 115, 970–979.
- Lechat, B.; Naik, G.; Reynolds, A.; Aishah, A.; Scott, H.; Loffler, K.A.; Vakulin, A.; Escourrou, P.; McEvoy, R.D.; Adams, R.J.; et al. Multinight Prevalence, Variability, and Diagnostic Misclassification of Obstructive Sleep Apnea. Am. J. Respir. Crit. Care Med. 2022, 205, 563–569.
- Faria, A.; Allen, A.H.; Fox, N.; Ayas, N.; Laher, I. The public health burden of obstructive sleep apnea. Sleep Sci. 2021, 14, 257–265.
- Singh, M.; Liao, P.; Kobah, S.; Wijeysundera, D.N.; Shapiro, C.; Chung, F. Proportion of surgical patients with undiagnosed obstructive sleep apnoea. Br. J. Anaesth. 2013, 110, 629–636.
- Kushida, C.A.; Littner, M.R.; Morgenthaler, T.; Alessi, C.A.; Bailey, D.; Coleman, J., Jr.; Friedman, L.; Hirshkowitz, M.; Kapen, S.; Kramer, M.; et al. Practice parameters for the indications for polysomnography and related procedures: An update for 2005. Sleep 2005, 28, 499–521.
- Chen, L.; Pivetta, B.; Nagappa, M.; Saripella, A.; Islam, S.; Englesakis, M.; Chung, F. Validation of the STOP-Bang questionnaire for screening of obstructive sleep apnea in the general population and commercial drivers: A systematic review and meta-analysis. Sleep Breath. 2021, 25, 1741–1751.
- Mazzotti, D.R.; Keenan, B.T.; Thorarinsdottir, E.H.; Gislason, T.; Pack, A.I.; Sleep Apnea Global Interdisciplinary Consortium. Is the Epworth Sleepiness Scale Sufficient to Identify the Excessively Sleepy Subtype of OSA? Chest 2022, 161, 557–561.
- Alqudah, A.M.; Moussavi, Z. Assessing Obstructive Sleep Apnea Severity During Wakefulness via Tracheal Breathing Sound Analysis. Sensors 2025, 25, 6280.
- Alqudah, A.M.; Elwali, A.; Kupiak, B.; Hajipour, F.; Jacobson, N.; Moussavi, Z. Obstructive sleep apnea detection during wakefulness: A comprehensive methodological review. Med. Biol. Eng. Comput. 2024, 62, 1277–1311.
- Elwali, A.; Meza-Vargas, S.; Moussavi, Z. Using tracheal breathing sounds and anthropometric information for screening obstructive sleep apnoea during wakefulness. J. Med. Eng. Technol. 2019, 43, 111–123.
- Elwali, A.; Moussavi, Z. Obstructive Sleep Apnea Screening and Airway Structure Characterization During Wakefulness Using Tracheal Breathing Sounds. Ann. Biomed. Eng. 2017, 45, 839–850.
- Elwali, A.; Moussavi, Z. A Novel Decision Making Procedure during Wakefulness for Screening Obstructive Sleep Apnea using Anthropometric Information and Tracheal Breathing Sounds. Sci. Rep. 2019, 9, 11467.
- Elwali, A.; Moussavi, Z. Predicting Polysomnography Parameters from Anthropometric Features and Breathing Sounds Recorded during Wakefulness. Diagnostics 2021, 11, 905.
- Hajipour, F.; Jozani, M.J.; Elwali, A.; Moussavi, Z. Regularized logistic regression for obstructive sleep apnea screening during wakefulness using daytime tracheal breathing sounds and anthropometric information. Med. Biol. Eng. Comput. 2019, 57, 2641–2655.
- Hajipour, F.; Jozani, M.J.; Moussavi, Z. A comparison of regularized logistic regression and random forest machine learning models for daytime diagnosis of obstructive sleep apnea. Med. Biol. Eng. Comput. 2020, 58, 2517–2529.
- Montazeri, A.; Giannouli, E.; Moussavi, Z. Assessment of obstructive sleep apnea and its severity during wakefulness. Ann. Biomed. Eng. 2012, 40, 916–924.
- Elwali, A.; Moussavi, Z. Determining Breathing Sound Features Representative of Obstructive Sleep Apnea During Wakefulness with Least Sensitivity to Other Risk Factors. J. Med. Biol. Eng. 2018, 39, 230–237.
- Elwali, A.; Moussavi, Z. A feature reduction and selection algorithm for improved obstructive sleep apnea classification process. Med. Biol. Eng. Comput. 2021, 59, 2063–2072.
- Astfalck, L.C.; Sykulski, A.M.; Cripps, E.J. Debiasing Welch’s Method for Spectral Density Estimation. Biometrika 2023, 111, 1313–1329.
- Mendel, J.M. Tutorial on Higher-Order Statistics (Spectra) in Signal Processing and System Theory: Theoretical Results and Some Applications. Proc. IEEE 1991, 79, 278–305.
- Dlask, M.; Kukal, J. Hurst Exponent Estimation from Short Time Series. Signal Image Video Process. 2018, 12, 745–752.
- Rosenstein, M.T.; Collins, J.J.; De Luca, C.J. A Practical Method for Calculating Largest Lyapunov Exponents from Small Data Sets. Phys. D Nonlinear Phenom. 1993, 65, 117–134.
- Zhao, K.; Wen, H.; Guo, Y.; Scano, A.; Zhang, Z. Feasibility of Recurrence Quantification Analysis (RQA) in Quantifying Dynamical Coordination among Muscles. Biomed. Signal Process. Control 2023, 79, 104042.
- Gosala, B.; Kapgate, P.D.; Jain, P.; Chaurasia, R.N.; Gupta, M. Wavelet Transforms for Feature Engineering in EEG Data Processing: An Overview. Biomed. Signal Process. Control 2023, 85, 104811.
- Abdul, Z.K.; Al-Talabani, A.K. Mel Frequency Cepstral Coefficient and Its Applications: A Review. IEEE Access 2022, 10, 122136–122158.
- Kohlrausch, A. Binaural masking experiments using noise maskers with frequency-dependent interaural phase differences. II: Influence of frequency and interaural-phase uncertainty. J. Acoust. Soc. Am. 1990, 88, 1749–1756.
- Rangayyan, R.M.; Reddy, N.P. Biomedical Signal Analysis: A Case-Study Approach; Pergamon Press: New York, NY, USA, 2002; Volume 30.
- Ojala, T.; Pietikäinen, M.; Mäenpää, T. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987.
- Divya, S.; Suresh, L.P.; John, A. Image Feature Generation Using Binary Patterns—LBP, SLBP and GBP. In Advanced Computing and Intelligent Technologies; Springer Nature: Singapore, 2022; pp. 233–239.
- Jotz, G.P.; Cervantes, O.; Abrahão, M.; Settanni, F.A.P.; de Angelis, E.C. Noise-to-Harmonics Ratio as an Acoustic Measure of Voice Disorders in Boys. J. Voice 2002, 16, 28–31.
- Farrús, M.; Hernando, J.; Ejarque, P. Jitter and Shimmer Measurements for Speaker Recognition. Proc. Interspeech 2007, 2007, 778–781.
- Borowska, M. Entropy-Based Algorithms in the Analysis of Biomedical Signals. Stud. Log. Gramm. Rhetor. 2015, 43, 21–23.
- Hanley, J.A.; McNeil, B.J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982, 143, 29–36.
- Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson Correlation Coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4.
- Guyon, I.; Weston, J.; Barnhill, S.; Vapnik, V. Gene Selection for Cancer Classification using Support Vector Machines. Mach. Learn. 2002, 46, 389–422.
- Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
- Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
- Abeyratne, U.R.; de Silva, S.; Hukins, C.; Duce, B. Obstructive sleep apnea screening by integrating snore feature classes. Physiol. Meas. 2013, 34, 99–121.
- Akhter, S.; Abeyratne, U.R.; Swarnkar, V. Variations of snoring properties with macro sleep stages in a population of Obstructive Sleep Apnea patients. In Proceedings of the 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka, Japan, 3–7 July 2013; pp. 1318–1321.
- Herer, B.; Sarig-Bahat, H.; Tarasiuk, A.; Lavie, L. Relationships between snore sound complexity and upper airway collapse level. Sleep Breath. 2018, 22, 437–445.
- Kim, J.; Kim, T.; Lee, D.; Kim, J.W.; Lee, K. Exploiting temporal and nonstationary features in breathing sound analysis for multiple obstructive sleep apnea severity classification. Biomed. Eng. Online 2017, 16, 6.
- Ng, A.K.; Koh, T.S.; Abeyratne, U.R.; Puvanendran, K. Investigation of obstructive sleep apnea using nonlinear mode interactions in non-stationary snore signals. Ann. Biomed. Eng. 2009, 37, 1796–1806.
- Janott, C.; Schuller, B.; Heiser, C. Snoring—An acoustic definition. Physiol. Meas. 2019, 40, 05T01.
- Ashraf, W.; Fredberg, J.J.; Moussavi, Z. Aeroacoustics of breath sounds in trachea and upper airway. Appl. Acoust. 2026, 241, 111021.
- Dafna, E.; Herer, B.; Eberhardt, R.; Tarasiuk, A. Automatic detection of whole night snoring events using audio recordings. Physiol. Meas. 2013, 34, 1619–1634.
- Janott, C.; Schmitt, M.; Zhang, Y.; Qian, K.; Pandit, V.; Zhang, Z.; Heiser, C.; Hohenhorst, W.; Herzog, M.; Hemmert, W.; et al. Snoring classified: The Munich-Passau Snore Sound Corpus. Comput. Biol. Med. 2018, 94, 106–118.
- Ng, A.K.; Koh, T.S.; Baey, E.; Lee, T.H.; Abeyratne, U.R.; Puvanendran, K. Could formant frequencies of snore signals be an alternative means for the diagnosis of obstructive sleep apnea? Sleep Med. 2008, 9, 894–898.
- Huynh, P.K.; Setty, A.R.; Le, T.Q. Koopman spectral analysis of intermittent dynamics in complex systems: A case study in pathophysiological processes of obstructive sleep apnea. arXiv 2022, arXiv:2202.12430.
- Kim, T.; Kim, J.W.; Lee, K. Detection of sleep disordered breathing severity using acoustic biomarker and machine learning techniques. Biomed. Eng. Online 2018, 17, 16.
- Mikami, T.; Ueki, S.; Takahashi, H.; Yonezawa, K. Detecting nonlinear acoustic properties of snoring sounds using Hilbert–Huang Transform. In Proceedings of the International Conference on Biomedical Electronics and Devices; SCITEPRESS: Setúbal, Portugal, 2015.
- Majumder, S. A Gaussian mixture model method for eigenvalue-based spectrum sensing with uncalibrated multiple antennas. Signal Process. 2022, 192, 108404.
- Sebastian, A.; Cistulli, P.A.; Cohen, G.; de Chazal, P. Automatic classification of OSA-related snoring signals from nocturnal audio recordings. arXiv 2021, arXiv:2102.12829.
- Akhter, S.; Bradley, T.D.; Morrell, M.J. Investigation of macro sleep-stage influences on snore characteristics in OSA. Am. J. Respir. Crit. Care Med. 2014, 189, A567.
- Janott, C.; Schuller, B.; Heiser, C. Acoustic information in snoring noises. HNO 2017, 65, 107–116.
- Ye, Z.; Peng, J.; Zhang, X.; Song, L. Snoring Sound Recognition Using Multi-Channel Spectrograms. Arch. Acoust. 2024, 49, 169–178.
- Hajipour, F.; Moussavi, Z. Spectral and Higher Order Statistical Characteristics of Expiratory Tracheal Breathing Sounds During Wakefulness and Sleep in People with Different Levels of Obstructive Sleep Apnea. J. Med. Biol. Eng. 2019, 39, 244–250.
- Kevat, A.; Bernard, A.; Harris, M.A.; Heussler, H.; Black, R.; Cheng, A.; Waters, K.; Chawla, J. Impact of adenotonsillectomy on growth trajectories in preschool children with mild-moderate obstructive sleep apnea. J. Clin. Sleep Med. 2023, 19, 55–62.
- Cao, S.; Rosenzweig, I.; Bilotta, F.; Jiang, H.; Xia, M. Automatic detection of obstructive sleep apnea based on speech and snoring sounds with AI. J. Thorac. Dis. 2024, 16, 2654–2667.
- Gonzalez-Martinez, F.D.; Carabias-Orti, J.J.; Canadas-Quesada, F.J.; Ruiz-Reyes, N.; Martinez-Munoz, D.; Garcia-Galan, S. Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks. Appl. Acoust. 2024, 216, 109811.
- Nagappa, M.; Liao, P.; Wong, J.; Auckley, D.; Ramachandran, S.K.; Memtsoudis, S.; Mokhlesi, B.; Chung, F. Validation of the STOP-Bang Questionnaire as a Screening Tool for Obstructive Sleep Apnea among Different Populations: A Systematic Review and Meta-Analysis. PLoS ONE 2015, 10, e0143697.
- Balaei, A.T.; Sutherland, K.; Cistulli, P.A.; de Chazal, P. Automatic Detection of Obstructive Sleep Apnea Using Facial Images. In Proceedings of the 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), Melbourne, VIC, Australia, 18–21 April 2017; pp. 215–218.
- Romano, S.; Salvaggio, A.; Hirata, R.P.; Lo Bue, A.; Picciolo, S.; Oliveira, L.V.; Insalaco, G. Upper airway collapsibility evaluated by a negative expiratory pressure test in severe obstructive sleep apnea. Clinics 2011, 66, 567–572.
- Kushida, C.A.; Efron, B.; Guilleminault, C. A predictive morphometric model for the obstructive sleep apnea syndrome. Ann. Intern. Med. 1997, 127, 581–587.
- Sola-Soler, J.; Fiz, J.A.; Torres, A.; Jane, R. Identification of Obstructive Sleep Apnea Patients from Tracheal Breath Sound Analysis during Wakefulness in Polysomnographic Studies. In Proceedings of the 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, IL, USA, 26–30 August 2014; pp. 4232–4235.
- El-Sayed, I.H. Comparison of four sleep questionnaires for screening obstructive sleep apnea. Egypt. J. Chest Dis. Tuberc. 2012, 61, 433–441.
- Lee, R.W.; Petocz, P.; Prvan, T.; Chan, A.S.; Grunstein, R.R.; Cistulli, P.A. Prediction of obstructive sleep apnea with craniofacial photographic analysis. Sleep 2009, 32, 46–52.
- Montero Benavides, A.; Fernández Pozo, R.; Toledano, D.T.; Blanco Murillo, J.L.; López Gonzalo, E.; Hernández Gómez, L. Analysis of Voice Features Related to Obstructive Sleep Apnoea and Their Application in Diagnosis Support. Comput. Speech Lang. 2014, 28, 434–452.
- Simply, R.M.; Dafna, E.; Zigel, Y. Diagnosis of Obstructive Sleep Apnea Using Speech Signals From Awake Subjects. IEEE J. Sel. Top. Signal Process. 2020, 14, 251–260.




| Severity Group | Number of Subjects | AHI (events/h) | Sex | Age (years) | NC (cm) | BMI (kg/m²) | MPS |
|---|---|---|---|---|---|---|---|
| Non-OSA | 74 | 1.2 ± 1.3 | 29 M, 45 F | 46.8 ± 12.9 | 38.8 ± 4.0 | 30.6 ± 6.2 | 41 (1), 19 (2), 6 (3), 8 (4) |
| Mild | 35 | 8.7 ± 2.6 | 21 M, 14 F | 52.3 ± 11.6 | 42.1 ± 6.5 | 34.3 ± 8.4 | 18 (1), 6 (2), 9 (3), 1 (4) |
| Moderate | 50 | 21.5 ± 4.2 | 36 M, 14 F | 54.7 ± 11.3 | 43.1 ± 3.4 | 33.8 ± 6.4 | 17 (1), 17 (2), 8 (3), 8 (4) |
| Severe | 40 | 69.5 ± 33.3 | 30 M, 10 F | 48.9 ± 11.1 | 45.3 ± 3.6 | 39.7 ± 8.7 | 5 (1), 13 (2), 14 (3), 8 (4) |
| Severity Group | Fold | Number of Subjects | AHI | Sex | Age | BMI | NC | MPS |
|---|---|---|---|---|---|---|---|---|
| Non-OSA | 1 | 23 | 0.6 ± 0.8 | 10 M, 13 F | 44.9 ± 12.1 | 29.2 ± 4.7 | 38.0 ± 4.7 | 12 (1), 7 (2), 1 (3), 3 (4) |
| Non-OSA | 2 | 27 | 1.1 ± 1.3 | 10 M, 17 F | 45.7 ± 12.1 | 32.3 ± 7.6 | 39.2 ± 4.3 | 12 (1), 9 (2), 4 (3), 2 (4) |
| Non-OSA | 3 | 24 | 1.8 ± 1.3 | 9 M, 15 F | 50.0 ± 14.3 | 30.0 ± 5.8 | 39.0 ± 2.9 | 17 (1), 3 (2), 1 (3), 3 (4) |
| Mild | 1 | 16 | 8.7 ± 2.4 | 10 M, 6 F | 50.9 ± 12.5 | 36.6 ± 9.9 | 43.5 ± 5.3 | 7 (1), 1 (2), 7 (3), 1 (4) |
| Mild | 2 | 10 | 8.6 ± 2.2 | 5 M, 5 F | 51.3 ± 11.6 | 31.8 ± 8.4 | 38.5 ± 8.8 | 6 (1), 1 (2), 2 (3), 1 (4) |
| Mild | 3 | 9 | 8.8 ± 3.5 | 6 M, 3 F | 56.0 ± 10.3 | 33.0 ± 4.2 | 43.4 ± 2.7 | 5 (1), 4 (2) |
| Moderate | 1 | 16 | 19.9 ± 2.9 | 10 M, 6 F | 56.3 ± 10.8 | 34.6 ± 7.8 | 42.3 ± 4.0 | 4 (1), 6 (2), 2 (3), 4 (4) |
| Moderate | 2 | 18 | 22.8 ± 4.5 | 13 M, 5 F | 53.6 ± 9.7 | 31.8 ± 5.7 | 42.7 ± 3.8 | 8 (1), 5 (2), 3 (3), 2 (4) |
| Moderate | 3 | 16 | 21.6 ± 4.7 | 13 M, 3 F | 54.5 ± 13.9 | 35.2 ± 5.2 | 43.6 ± 2.8 | 5 (1), 6 (2), 3 (3), 2 (4) |
| Severe | 1 | 14 | 72.9 ± 35.0 | 11 M, 3 F | 45.5 ± 10.5 | 39.1 ± 10.1 | 44.3 ± 4.2 | 2 (1), 3 (2), 5 (3), 4 (4) |
| Severe | 2 | 16 | 66.6 ± 29.6 | 13 M, 3 F | 50.2 ± 11.0 | 40.1 ± 8.5 | 46.6 ± 3.2 | 1 (1), 6 (2), 5 (3), 4 (4) |
| Severe | 3 | 10 | 69.6 ± 39.1 | 6 M, 4 F | 51.9 ± 12.2 | 40.2 ± 7.5 | 43.8 ± 3.5 | 2 (1), 4 (2), 4 (3) |
| Severity Model | Dominant Feature Types | Structure–Function–Symptom Interpretation |
|---|---|---|
| Non-OSA vs. Mild-OSA | Spectral skewness of the power spectrum and texture uniformity of the bispectrum | Mild structural airway compliance and early narrowing (Structure) introduce intermittent airflow instability (Function), producing subtle turbulence and disrupted nonlinear coupling that manifest clinically (Symptom) as early increases in AHI. |
| Non-OSA vs. Moderate-OSA | Spectral power from the power spectrum and fragmentation of bispectral patterns | Progressive anatomical narrowing and reduced airway stiffness (Structure) generate sustained turbulent airflow and fragmented nonlinear interactions (Function), corresponding clinically (Symptom) to increased AHI and frequent obstructive events. |
| Non-OSA vs. Severe-OSA | Energy, complexity, and impulsiveness derived from the power spectrum and bispectrum | Severe upper-airway collapsibility and loss of neuromuscular control (Structure) result in chaotic, high-energy airflow and impulsive breathing sounds (Function), which clinically manifest (Symptom) as severe OSA with high AHI. |
| Mild-OSA vs. Moderate-OSA | Bandwidth expansion in the power spectrum and texture irregularity in the bispectrum | Worsening airway compromise (Structure) shifts airflow from intermittent to persistent instability (Function), reflected clinically (Symptom) by escalating AHI and sustained breathing disruption. |
| Mild-OSA vs. Severe-OSA | Entropy, Root Mean Square (RMS), and high-frequency variability of power spectral and bispectral representations | Dominant structural airway vulnerability (Structure) overwhelms compensatory mechanisms, producing highly irregular and energetic airflow (Function) that manifests clinically (Symptom) as severe OSA. |
| Moderate-OSA vs. Severe-OSA | Variability and topological complexity of power spectral and bispectral structures | Further loss of airway resilience and increased collapsibility (Structure) lead to chaotic airflow dynamics and extreme variability (Function), clinically reflected (Symptom) by markedly elevated AHI. |
| Feature Name | Average Rank by Corr | Average Rank by SHAP | Overall Average Rank |
|---|---|---|---|
| Average_BBox_Spectral | 1 | 1 | 1 |
| MouthInspiration_ConnectedComponents | 1 | 2 | 1.5 |
| MouthExpiration_FreqCentroid | 2 | 1 | 1.5 |
| Average_BBox_SpectralFlux | 2 | 1 | 1.5 |
| MouthExpiration_Range_SCBW_Bandwidth | 1 | 2 | 1.5 |
| Average_BBox_CentroidY | 3 | 1 | 2 |
| MouthInspiration_Range_MeanPower | 3 | 1 | 2 |
| Average_BBox_Entropy | 3 | 2 | 2.5 |
| Average_BBox_TextureEnergy | 3 | 3 | 3 |
| NoseExpiration_Range_FreqSkewness | 3 | 3 | 3 |
| Feature Name | Abs Delta AUC | Abs Delta Correlation |
|---|---|---|
| Average_BBox_Skewness | 0.001231 | 0.005491 |
| MouthExpiration_CoefVariation | 0.001806 | 0.137229 |
| Average_BBox_FrequencyBandwidth | 0.002441 | 0.09704 |
| Average_BBox_FrequencyBandwidth | 0.002511 | 0.070444 |
| Average_BBox_TextureEnergy | 0.002917 | 0.076107 |
| MouthExpiration_Perimeter | 0.002917 | 0.0041 |
| MouthExpiration_FrequencyBandwidth | 0.002917 | 0.016699 |
| Average_BBox_Median | 0.003554 | 0.113319 |
| Average_BBox_FrequencyBandwidth | 0.003667 | 0.152313 |
| MouthInspiration_Range_FreqSkewness | 0.003667 | 0.038156 |
| Feature Name | Anthropometric Feature | Pearson Correlation | Fold | Comparison |
|---|---|---|---|---|
| MouthExpiration_Peak | NC | 1 | 2 | Mild-OSA vs. Severe-OSA |
| Average_BBox_TextureHomogeneity | Sex | 0.9998 | 1 | Non-OSA vs. Moderate-OSA |
| MouthExpiration_Entropy | BMI | 0.999 | 2 | Mild-OSA vs. Severe-OSA |
| Average_BBox_Mean | Sex | 0.9987 | 1 | Non-OSA vs. Mild-OSA |
| Average_BBox_IQR | Sex | 0.9985 | 3 | Mild-OSA vs. Moderate-OSA |
| MouthInspiration_FreqCentroid | BMI | 0.9985 | 3 | Mild-OSA vs. Severe-OSA |
| Average_BBox_Energy | NC | 0.9985 | 2 | Non-OSA vs. Severe-OSA |
| Average_BBox_AspectRatio | MPS | 0.9983 | 1 | Non-OSA vs. Severe-OSA |
| MouthExpiration_FreqCentroid | MPS | 0.9982 | 2 | Mild-OSA vs. Moderate-OSA |
| Average_BBox_CentroidY | MPS | 0.9982 | 3 | Mild-OSA vs. Moderate-OSA |
| Modality | Reference | Dataset Size | Task | Performance Metrics | Uncertainty Modeling |
|---|---|---|---|---|---|
| Questionnaires | [60] | 9206 (Meta-analysis) | Screening | Sens: ~90–96%; Spec: ~25–34% | No |
| Questionnaires | [65] | 234 (Clinical) | Screening | Sens: ~95%; Spec: ~5% | No |
| Facial Analysis | [61] | 365 | Binary | Acc: 69.8%; Sens: 68.5%; Spec: 76.4% | No |
| Facial Analysis | [66] | 180 | Binary | Acc: 79.4%; Sens: 69.7% | No |
| Speech Analysis | [67] | 40 OSA/40 Control | Binary | Acc: 81.0%; Sens: 77.5%; Spec: 85.0% | No |
| Speech Analysis | [68] | 190 (AHI < 15) and 208 (AHI > 15) | Binary | Acc: 77.1%; Sens: 75.0%; Spec: 79.0% | No |
| Pharyngometry | [63] | 46 (AHI < 5) and 254 (AHI > 15) | Binary | Sens: 97.6%; Spec: 100% | No |
| NEP | [62] | 24 (AHI < 5) and 24 (AHI > 30) | Binary | Sens: 91.7%; Spec: 95.8% | No |
| Tracheal Sound | [12] | 109 (AHI < 15) and 90 (AHI > 15) | Binary | Acc: 81.4%; Sens: 80.9%; Spec: 82.1% | No |
| Tracheal Sound | [16] | 17 (AHI < 5) and 35 (AHI > 5) | Binary | Acc: 83.3%; Sens: 85.0%; Spec: 81.3% | No |
| Proposed | Proposed | 74 Non-OSA, 35 Mild, 50 Moderate, and 40 Severe | Multi-class | AUC Range: 0.86–0.97 | Yes (Bootstrap Aggregation & 95% CI) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Alqudah, A.M.; Ashraf, W.; Lithgow, B.; Moussavi, Z. Interpretable Acoustic Features from Wakefulness Tracheal Breathing for OSA Severity Assessment. J. Clin. Med. 2026, 15, 1081. https://doi.org/10.3390/jcm15031081

