Artificial Intelligence for Spirometry Quality Evaluation: A Systematic Review
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design
2.2. Inclusion and Exclusion Criteria
2.3. Literature Search Strategy
2.4. Synthesis of the Results
3. Results
3.1. Study Selection
3.2. Characteristics of the Included Studies
| Author | Country | Dataset Source and Origin | Sample Size | Patient Age Range | Clinical Setting | Device | Labeling Process |
|---|---|---|---|---|---|---|---|
| Velickovski et al. [20] | Spain | Basque Country, Spain | 900 | >20 years | Primary care centers participating in forced spirometry training | Sibel 120 (SIBELGroup, Barcelona, Spain) operating at 100 Hz | Spirograms were randomly selected and independently evaluated by three clinical experts from the Lung Function Unit at the Hospital Clinic of Barcelona. Each expert assessed 600 spirograms, labeling them as acceptable if at least two experts agreed on acceptability, and rejected if at least two agreed on rejection. Spirograms for which there was no consensus between the experts were excluded from the dataset. |
| Walag et al. [9] | Poland | AioCare database (Poland) | 1998 | Children aged 9–15 and general population | Primary care and primary schools | AioCare portable spirometer (HealthUp, Warsaw, Poland) | Spirograms were visually inspected by experienced pulmonologists and labeled according to ATS/ERS 2019 criteria, assessing the acceptability and usability of FEV1 and FVC separately. A spirogram was labeled as confirmed if it was acceptable or usable, and rejected otherwise. Spirograms without the consensus of at least three experts were excluded. |
| Wang et al. [21] | China | Pulmonary function tests databases of the First Affiliated Hospital of Guangzhou Medical University | 16,502 | - | Hospital | MasterScreen-Pneumo (Jaeger, Hanover, Germany) | Spirograms were visually inspected and labeled by four pulmonologists according to the ATS/ERS 2005 guidelines and the Chinese Thoracic Society 2014 guidelines. Any uncertainties were resolved by an independent expert. |
| Das et al. [18] | Belgium | NHANES 2011–2012 database | 36,873 | 6–79 years | - | - | Spirograms were reviewed according to ATS/ERS 2005 criteria and classified into four groups: A: acceptable quality; B: large time to peak flow or non-repeatable peak flow; C: <6 s exhalation or no plateau; D: cough or back extrapolated volume. Curves labeled A were considered acceptable, and curves labeled A, B, or C were considered usable. Curves labeled D were excluded. |
| Solinski et al. [19] | Poland | NHANES 2007–2012 database, AioCare database and steady-flow signals | 19,832 | NHANES: 6–79 years; AioCare: 7–80 years | - | AioCare portable spirometer (HealthUp, Poland) and Series 1120 Flow Volume Simulator by Hans Rudolph, Inc. (Shawnee, KS, USA) | Raw spirometry signals were divided into four groups: A: acceptable quality; B: large time to peak flow or non-repeatable peak flow; C: <6 s exhalation or no plateau; D: cough and back extrapolated volume. Curves containing cough were extracted from group D by four experts. Finally, two groups were created: (1) ATS-acceptable and other error curves and (2) cough curves. |
| Bonthada et al. [22] | India | JSS Hospital, Mysore, India | 1600 | - | Hospital | Spirobank G spirometer (Morgan Scientific Inc., Haverhill, MA, USA) | - |
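The consensus labeling described for Velickovski et al. [20] in the table above (a label is kept when at least two experts agree, and spirograms without consensus are excluded) can be illustrated with a minimal sketch. The function name `consensus_label`, the curve identifiers, and the ratings are hypothetical and are not taken from the study.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence


def consensus_label(votes: Sequence[str], min_agreement: int = 2) -> Optional[str]:
    """Return the label that reaches `min_agreement` votes, or None if no
    single label does (no consensus, or a tie between labels)."""
    counts = Counter(votes)
    winners = [label for label, n in counts.items() if n >= min_agreement]
    return winners[0] if len(winners) == 1 else None


# Hypothetical ratings: each spirogram was reviewed by two or three experts.
ratings: Dict[str, List[str]] = {
    "curve_001": ["acceptable", "acceptable", "rejected"],
    "curve_002": ["rejected", "rejected"],
    "curve_003": ["acceptable", "rejected"],  # no consensus, so excluded
}

# Keep only the spirograms for which the experts reached a consensus.
labeled: Dict[str, str] = {}
for curve_id, votes in ratings.items():
    label = consensus_label(votes)
    if label is not None:
        labeled[curve_id] = label

print(labeled)
```

Raising `min_agreement` to 3 would approximate the stricter rule reported for Walag et al. [9], under the assumption that each rater contributes a single label per spirogram.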
3.3. Main Results
3.3.1. Systems Oriented Towards the Determination of Acceptability and Usability
| Author | Model | Features | Classes | Validation | AUC | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|
| Velickovski et al. [20] | 5 ML models: Naïve Bayes, kNN, logistic regression, SVM, Random Forest | Normalized vector: 2 spirometric parameters (BEV, FET) + 7 coefficients derived from the polynomial fitting of the flow-time curve | (1) acceptable/(2) rejected | 422 (70%) training with 10-fold cross-validation; 181 (30%) test | 0.88 | 60% | 91% |
| Walag et al. [9] | CNN | Flow-time signals | FEV1 (1) confirmed (acceptable or usable)/(2) rejected | 1561 (80%) train; 391 (20%) test | 0.95 | 93.1% | 90.0% |
| | | | FVC (1) confirmed (acceptable or usable)/(2) rejected | 1569 (80%) train; 393 (20%) test | 0.98 | 95.6% | 88.3% |
| Wang et al. [21] | Rule based module + object detection module (Faster R-CNN with ResNet50) | Numeric parameters (pulmonary function parameters, data from flow-volume and volume-time curves) + 600 × 1200 images of flow-volume and volume-time curves | FEV1 (1) acceptable/(2) not acceptable | 14,124 (90%) train; 1569 (10%) test | - | 97.8% | 92.4% |
| | | | FEV1 (1) usable/(2) not usable | | - | 99.4% | 85.4% |
| | | | FVC (1) acceptable/(2) not acceptable | | - | 97.5% | 89.6% |
| | | | FVC (1) usable/(2) not usable | | - | 99.5% | 89.0% |
| Das et al. [18] | CNN | 32 × 32 images of the flow-volume curve | (1) acceptable/(2) not acceptable | 29,452 (80%) train; 3683 (10%) validation; 710 (10%) test | 0.93 | 90.0% | 85.0% |
| | | | FEV1 (1) usable/(2) not usable | | 0.98 | 92.0% | 96.0% |
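The AUC, sensitivity, and specificity columns in the table above follow the standard definitions for binary classifiers. The sketch below shows how these three metrics can be computed from labels and model scores. It assumes that class 1 is the positive class and uses a 0.5 decision threshold; both choices, along with the toy data, are illustrative assumptions and do not reproduce any included study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def auc_from_scores(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Because sensitivity and specificity depend on which class is treated as positive and on the chosen threshold, values reported by different studies are only comparable when those conventions match.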
3.3.2. Systems Oriented Towards Error Detection
| Author | Model | Features | Classes | Validation | AUC | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|
| Bonthada et al. [22] | 2 independent CNNs + MLP | 128 × 128 × 1 images of the flow-volume and volume-time curves | (1) Presence of performance errors/(2) Absence of performance errors | 1440 (90%) training 5-fold stratified cross-validation; 160 (10%) test | - | 98.0% | 87.0% |
| | | | (1) Extra breaths/(2) High extrapolated volume/(3) Early termination/(4) Submaximal effort | | | - | - |
| Solinski et al. [19] | 4 models: logistic regression, feed-forward artificial neural network, SVM and Random Forest | Number of local maxima (>0.05 s), horizontal crossings at 15%, 25%, 50%, 75% of PEF, local maxima after PEF with a right slope greater than 0.25 L/s | (1) Presence of cough/(2) Absence of cough | 11,719 (59%) train; 6470 (33%) 5-fold cross-validation; 1643 (8%) test | - | 86.0% | 91.0% |
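The hand-crafted features listed for Solinski et al. [19] in the table above (local maxima and horizontal crossings at fixed fractions of PEF) can be illustrated with a simplified sketch. It is not the authors' implementation: the duration criterion (>0.05 s) and the slope criterion (>0.25 L/s) are omitted, the function names are hypothetical, and the two flow signals are made-up toy values.

```python
from typing import Dict, Sequence


def count_local_maxima(flow: Sequence[float]) -> int:
    """Count samples that are strictly higher than both neighbours."""
    return sum(
        1 for i in range(1, len(flow) - 1) if flow[i - 1] < flow[i] > flow[i + 1]
    )


def count_crossings(flow: Sequence[float], level: float) -> int:
    """Count how many times the signal crosses the horizontal line `level`."""
    above = [v > level for v in flow]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


def cough_features(
    flow: Sequence[float],
    fractions: Sequence[float] = (0.15, 0.25, 0.50, 0.75),
) -> Dict[str, int]:
    """Build a small feature dictionary from a flow-time signal (L/s)."""
    pef = max(flow)  # peak expiratory flow of this maneuver
    features = {"n_local_maxima": count_local_maxima(flow)}
    for f in fractions:
        features[f"crossings_{round(f * 100)}pct_pef"] = count_crossings(flow, f * pef)
    return features


# Toy signals (hypothetical): one smooth exhalation and one with oscillations
# of the kind a cough might produce.
smooth = [0, 4, 8, 6, 4, 2, 1, 0]
coughy = [0, 4, 8, 3, 7, 2, 5, 1, 0]

print(cough_features(smooth))
print(cough_features(coughy))
```

A smooth maneuver crosses each level twice (once rising, once falling), whereas a signal with repeated flow oscillations crosses the higher levels more often; a classifier such as logistic regression or a Random Forest would then be trained on feature vectors of this kind.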
4. Discussion
4.1. Main Findings
4.2. Limitations of the Included Studies
4.3. Clinical Implications
- Assist quality control by enabling the acquisition of high-quality spirometry tests that reflect the expertise of specialized personnel in non-specialized settings, such as primary care.
- Reduce variability in the interpretation of spirometry quality.
- Reduce errors in the diagnosis of respiratory diseases.
- Detect specific errors made by the patient during the performance of the spirometry test.
4.3.1. Practical Implementation and Workflow Integration
- Determination of spirometry acceptability and usability: these systems could be directly integrated into spirometers. In this way, once the patient has performed the maneuver, the intelligent system can automatically trigger alerts for invalid attempts and provide immediate feedback to the operator regarding the need to repeat the test. Additionally, unacceptable maneuvers would be recorded and classified as invalid within the system, reducing the likelihood of diagnostic errors.
- Detection of errors made by the patient during test performance: these systems can be integrated directly into the spirometer to identify immediately whether the patient has made an error during the maneuver. This approach offers a dual benefit: on the one hand, it indicates that the test is unacceptable, and on the other, it specifies the type of error committed. Feedback is thus provided to both the technician and the patient, facilitating error correction and increasing the likelihood of obtaining an acceptable maneuver upon repetition.
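The two integration scenarios above can be combined into a single feedback step that runs after each maneuver. The following sketch is purely illustrative: the `Feedback` structure, the 0.5 threshold, the error identifiers, and the coaching messages are hypothetical placeholders, not validated clinical instructions or the interface of any real spirometer.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical coaching messages keyed by error type (placeholders only).
COACHING: Dict[str, str] = {
    "extra_breath": "Exhale in one continuous blow without breathing in again.",
    "high_extrapolated_volume": "Start the blow without hesitation.",
    "early_termination": "Keep exhaling until the operator says stop.",
    "submaximal_effort": "Fill the lungs completely and blow out as hard as possible.",
}


@dataclass
class Feedback:
    acceptable: bool
    repeat_needed: bool
    messages: List[str]


def build_feedback(
    p_acceptable: float,
    detected_errors: List[str],
    threshold: float = 0.5,  # assumed decision threshold
) -> Feedback:
    """Combine an acceptability score with detected error types."""
    acceptable = p_acceptable >= threshold and not detected_errors
    messages = [
        COACHING.get(err, f"Unrecognized error type: {err}") for err in detected_errors
    ]
    if not acceptable and not messages:
        # The maneuver was flagged, but no specific error was identified.
        messages.append("Maneuver flagged as unacceptable; please repeat the test.")
    return Feedback(acceptable=acceptable, repeat_needed=not acceptable, messages=messages)


# Usage with hypothetical model outputs.
print(build_feedback(0.92, []))
print(build_feedback(0.35, ["early_termination"]))
```

Keeping the acceptability decision and the error-specific messages in one structure would let the device both record the maneuver as invalid and tell the operator what to correct before the next attempt.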
4.3.2. Adoption Barriers and Enablers
4.4. Future Opportunities and Need for Further Analysis
- Advanced analysis of spirometric curves using DL: implementing neural networks to support visual inspection, reducing the subjectivity inherent in manual validation.
- Development of integrative representations: combining images of spirometric curves with numerical parameters included in the report, aiming to build more comprehensive and robust classifiers.
- Multicenter and prospective validation: evaluating the models in diverse clinical settings to ensure their applicability, reliability, and scalability.
- Integration with clinical systems and real-time feedback: enabling intelligent systems to provide immediate alerts to the operator and connect with electronic records to improve traceability and the quality of test control.
4.5. Limitations of This Review
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Graham, B.L.; Steenbruggen, I.; Barjaktarevic, I.Z.; Cooper, B.G.; Hall, G.L.; Hallstrand, T.S.; Kaminsky, D.A.; McCarthy, K.; McCormack, M.C.; Miller, M.R.; et al. Standardization of Spirometry 2019 Update. An Official American Thoracic Society and European Respiratory Society Technical Statement. Am. J. Respir. Crit. Care Med. 2019, 200, E70–E88.
2. Salas, T.; Rubies, C.; Gallego, C.; Muñoz, P.; Burgos, F.; Escarrabill, J. Requerimientos Técnicos de Los Espirómetros En La Estrategia Para Garantizar El Acceso a Una Espirometría de Calidad. Arch. Bronconeumol. 2011, 47, 466–469.
3. Soriano, J.B.; Zielinski, J.; Price, D. Screening for and Early Detection of Chronic Obstructive Pulmonary Disease. Lancet 2009, 374, 721–732.
4. Miller, M.R.; Hankinson, J.; Brusasco, V.; Burgos, F.; Casaburi, R.; Coates, A.; Crapo, R.; Enright, P.; van der Grinten, C.P.M.; Gustafsson, P.; et al. Standardisation of Spirometry. Eur. Respir. J. 2005, 26, 319–338.
5. American Thoracic Society. Standardization of Spirometry, 1994 Update. Am. J. Respir. Crit. Care Med. 1995, 152, 1107–1136.
6. Müller-Brandes, C.; Krämer, U.; Gappa, M.; Seitner-Sorge, G.; Hüls, A.; Von Berg, A.; Hoffmann, B.; Schuster, A.; Illi, S.; Wisbauer, M.; et al. LUNOKID: Can Numerical American Thoracic Society/European Respiratory Society Quality Criteria Replace Visual Inspection of Spirometry? Eur. Respir. J. 2014, 43, 1347–1356.
7. Hueto, J.; Cebollero, P.; Pascal, I.; Cascante, A.; Eguía, M.; Teruel, F.; Carpintero, M. La Espirometría En Atención Primaria En Navarra. Arch. Bronconeumol. 2006, 42, 326–331.
8. Walters, J.A.; Haydn Walters, E.; Nelson, M.; Robinson, A.; Scott, J.; Turner, P.; Wood-Baker, R. Factors Associated with Misdiagnosis of COPD in Primary Care. Prim. Care Respir. J. 2011, 20, 396–402.
9. Waląg, D.; Soliński, M.; Kołtowski, Ł.; Górska, K.; Korczyński, P.; Kuźnar-Kamińska, B.; Grabicki, M.; Basza, M.; Łepek, M. Deep Learning Algorithm for Visual Quality Assessment of the Spirograms. Physiol. Meas. 2023, 44, 085004.
10. Soriano, J.B.; Ancochea, J.; Miravitlles, M.; García-Río, F.; Duran-Tauleria, E.; Muñoz, L.; Jiménez-Ruiz, C.A.; Masa, J.F.; Viejo, J.L.; Villasante, C.; et al. Recent Trends in COPD Prevalence in Spain: A Repeated Cross-Sectional Survey 1997–2007. Eur. Respir. J. 2010, 36, 758–765.
11. López-Canay, J.; Casal-Guisande, M.; Pinheira, A.; Golpe, R.; Comesaña-Campos, A.; Fernández-García, A.; Represas-Represas, C.; Fernández-Villar, A. Predicting COPD Readmission: An Intelligent Clinical Decision Support System. Diagnostics 2025, 15, 318.
12. Mei, S.; Li, X.; Zhou, Y.; Xu, J.; Zhang, Y.; Wan, Y.; Cao, S.; Zhao, Q.; Geng, S.; Xie, J.; et al. Deep Learning for Detecting and Early Predicting Chronic Obstructive Pulmonary Disease from Spirogram Time Series. NPJ Syst. Biol. Appl. 2025, 11, 18.
13. Cho, M.; Hill, D.; Torop, M.; Masoomi, A.; Castaldi, P.; Silverman, E.; Bodduluri, S.; Bhatt, S.; Yun, T.; McLean, C.; et al. Deep Learning of Suboptimal Spirometry to Predict Respiratory Outcomes and Mortality. Res. Sq. 2025, preprint.
14. Sunjaya, A.; Edwards, G.D.; Harvey, J.; Sylvester, K.; Purvis, J.; Rutter, M.; Shakespeare, J.; Moore, V.; El-Emir, E.; Doe, G.; et al. Validation of Artificial Intelligence Spirometry Diagnostic Support Software in Primary Care: A Blinded Diagnostic Accuracy Study. ERJ Open Res. 2025, 11, 00116–02025.
15. Pinheira, A.; Casal-Guisande, M.; López-Canay, J.; Fernández-García, A.; Golpe, R.; Represas-Represas, C.; Torres-Durán, M.; Cerqueiro-Pequeño, J.; Comesaña-Campos, A.; Fernández-Villar, A. Image-Based Formalization of Tabular Data for Threshold-Based Prediction of Hospital Stay Using Convolutional Neural Networks: An Intelligent Decision Support System Applied in COPD. Appl. Syst. Innov. 2025, 8, 128.
16. Topole, E.; Biondaro, S.; Montagna, I.; Corre, S.; Corradi, M.; Stanojevic, S.; Graham, B.; Das, N.; Ray, K.; Topalovic, M. Artificial Intelligence Based Software Facilitates Spirometry Quality Control in Asthma and COPD Clinical Trials. ERJ Open Res. 2023, 9, 00292–2022.
17. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 2021, 372, n71.
18. Das, N.; Verstraete, K.; Stanojevic, S.; Topalovic, M.; Aerts, J.M.; Janssens, W. Deep-Learning Algorithm Helps to Standardise ATS/ERS Spirometric Acceptability and Usability Criteria. Eur. Respir. J. 2020, 56, 2000603.
19. Soliński, M.; Łepek, M.; Kołtowski, Ł. Automatic Cough Detection Based on Airflow Signals for Portable Spirometry System. Inform. Med. Unlocked 2020, 18, 100313.
20. Velickovski, F.; Ceccaroni, L.; Marti, R.; Burgos, F.; Gistau, C.; Alsina-Restoy, X.; Roca, J. Automated Spirometry Quality Assurance: Supervised Learning from Multiple Experts. IEEE J. Biomed. Health Inform. 2018, 22, 276–284.
21. Wang, Y.; Li, Y.; Chen, W.; Zhang, C.; Liang, L.; Huang, R.; Liang, J.; Tu, D.; Gao, Y.; Zheng, J.; et al. Deep Learning for Spirometry Quality Assurance with Spirometric Indices and Curves. Respir. Res. 2022, 23, 98.
22. Bonthada, S.; Perumal, S.P.; Naik, P.P.; Padukudru, M.A.; Rajan, J. An Automated Deep Learning Pipeline for Detecting User Errors in Spirometry Test. Biomed. Signal Process Control 2024, 90, 105845.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
López-Canay, J.; Casal-Guisande, M.; Represas-Represas, C.; Cerqueiro-Pequeño, J.; Bouza-Rodríguez, J.-B.; Comesaña-Campos, A.; Fernández-Villar, A. Artificial Intelligence for Spirometry Quality Evaluation: A Systematic Review. Bioengineering 2025, 12, 1286. https://doi.org/10.3390/bioengineering12121286

