1. Introduction
Chronic obstructive pulmonary disease (COPD) represents a significant global health burden, leading to substantial morbidity, mortality, and health care costs [1]. Current estimates suggest a prevalence of around 3% in the general population, with projections indicating it will become the third leading cause of death and the seventh leading cause of disability-adjusted life years (DALYs) lost by 2050 [2]. In Italy, recent estimates indicate a slightly lower prevalence, affecting approximately 2–2.5% of the general population [2]. COPD is a progressive disease, often resulting in reduced quality of life and increased risk of exacerbations, hospitalizations, and mortality [3]. The current prognostic markers for COPD include a range of clinical, lung function, and imaging parameters. These may include lung capacity measures such as forced expiratory volume in the first second (FEV1) [4], oxygen (O2) saturation, and inflammatory biomarkers such as white blood cell count and C-reactive protein levels [5].
Pulmonary rehabilitation (PR) is a cornerstone intervention in COPD management, encompassing a multidisciplinary approach aimed at improving physical and psychological health. PR programs typically include exercise training, education about COPD management, and psychosocial support [6]. However, the response to PR varies widely across individuals, reflecting the significant heterogeneity of the disease. Indeed, COPD patients present with distinct phenotypes, each characterized by unique etiological, clinical, and prognostic profiles, which complicates efforts to predict rehabilitation outcomes and tailor personalized interventions [7]. Univariate markers such as FEV1 [8], six-minute walk test (6MWT) covered distance [9], and symptom scales like the Medical Research Council (MRC) Dyspnoea Scale [10] have been widely used to assess COPD severity and response to rehabilitation.
However, these single-dimension measures may oversimplify the multifactorial nature of COPD, failing to capture the complex interconnections between clinical, functional, physiological, and psychological factors that shape individual recovery trajectories. This complexity highlights the need for analytical approaches capable of integrating multiple data sources to provide a more comprehensive characterization of disease variability.
Supervised machine learning methods, although highly effective for prediction tasks, rely on predefined outcome labels and therefore cannot uncover latent structures or describe the variability in patient responses.
Unsupervised machine learning techniques have increasingly been used to address the challenges posed by high-dimensional, heterogeneous data [11,12,13]. Their main strength lies in identifying complex, nonlinear relationships that traditional statistical methods may overlook, enabling a more data-driven stratification of patients. Although their clinical applicability is still constrained by the need for large and high-quality datasets, limited reproducibility, and reduced interpretability, these approaches have shown promise in supporting clinical decision-making and complementing conventional assessments.
Building on this rationale, unsupervised clustering is particularly suited to explore the underlying structure of the COPD population and identify patient subgroups with distinct recovery trajectories [14,15,16]. Recent applications of unsupervised clustering in COPD have demonstrated its potential and further support its use in this clinical context [17,18]. Despite the promise of clustering approaches, challenges remain in ensuring their reproducibility and clinical applicability. Variability in patient cohorts, data sources, and clustering methodologies across studies can lead to inconsistent results, raising concerns about the robustness of identified phenotypes.
This study employs unsupervised clustering methods to stratify COPD patients from clinical information at admission in a rehabilitation unit. The resulting index is then assessed for its predictive value on rehabilitation outcomes at discharge. To address the limitations of earlier studies, a set of variables from three distinct assessment domains, i.e., the 6MWT, forced oscillation technique (FOT), and spirometry, was considered. In addition, different clustering approaches were compared to verify consistency and robustness in subgroup identification.
2. Materials and Methods
2.1. Study Design and Collection
This study was based on both a prospective observational study (conducted from 2021 to 2022) and a retrospective observational study (from 2016 to 2018) carried out at the Pulmonary Rehabilitation Unit of IRCCS Fondazione Don Gnocchi ONLUS in Florence. The studies enrolled COPD patients undergoing an outpatient pulmonary rehabilitation program (PRP). PRP was conducted in accordance with the American Thoracic Society (ATS) and the European Respiratory Society (ERS) recommendations [19] and included education, aerobic exercise training for both upper and lower limbs, and breathing retraining. The studies shared the same inclusion criteria: patients had to meet the COPD definition outlined by GOLD standards [20]; the severity of airflow obstruction ranged from moderate to very severe according to the GOLD classification; participants were former smokers in stable condition for at least four weeks prior to enrollment; and they were receiving optimal standard treatment as recommended by GOLD guidelines. Patients with recent cardiovascular events or with neuromuscular or osteoarticular diseases that limited physical exercise and/or compromised lung mechanical properties were excluded from the PRP. The studies were approved by the Research Ethics Committee (r.n.18765_oss; r.n.15217_oss). All participants provided written informed consent at the time of assessment. The variables of interest were evaluated at two time points, namely admission (T0) and discharge (T1), over 20 sessions of the pulmonary rehabilitation program.
This analysis incorporated three distinct respiratory tests (Figure 1), including the FOT [21], spirometry [22], and the 6MWT [23], each serving a specific purpose in assessing respiratory function. The FOT procedure focused on assessing respiratory impedance by recording multiple measurements while participants breathed normally. This non-invasive technique provided detailed insights into the mechanical properties of the respiratory system, helping to identify potential abnormalities or dysfunctions [21]. Spirometry provided insightful data on lung volume and air-flow dynamics [22]. The 6MWT involved participants walking briskly for six minutes while vital signs and the distance covered were recorded. This test provided insights into participants’ functional capacity and endurance, offering a practical measure of their overall cardiopulmonary health [23].
2.2. Data Preparation
The outcome of the study was the 6MWT covered distance variation between T0 and T1 (namely, Delta meters). Patients with missing outcome data were excluded from the analysis. In line with international clinical guidelines, specifically the ATS/ERS technical standard on field walking tests under chronic respiratory conditions [24], a minimal clinically important difference (MCID) threshold of 30 m was used to define clinically significant improvement (CSI) in the 6MWT. Consequently, the outcome was dichotomized as follows:
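The rule itself does not appear in this text. One plausible way to write it is given here as a sketch: it assumes that the 30 m MCID is an inclusive threshold (whether the comparison is strict is not stated), and the symbol names are chosen for illustration to match the 6MWT CSI indicator used in the Results.

```latex
\mathrm{6MWT}_{\mathrm{CSI}} =
\begin{cases}
1, & \Delta_{\mathrm{meters}} \geq 30\ \mathrm{m} \\
0, & \text{otherwise}
\end{cases}
\qquad \text{with } \Delta_{\mathrm{meters}} = \mathrm{6MWT}_{T1} - \mathrm{6MWT}_{T0}
```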
The independent variables of the study were collected at T0 from the three different assessment domains mentioned above, for a total of 26 variables. Specifically, within the FOT, respiratory system resistance (RRS) and reactance (XRS) were measured during inspiration at 5 Hz, along with its variation (ΔXRS). Moreover, inspiratory time (TI), the ratio of TI to total time (TI/TTOT), expiratory time (TE), mechanical ventilation (VE), tidal volume (VT), the percentage of respiratory flow (RF%), and respiratory rate (RR) were recorded. In spirometry, functional parameters were included, such as forced expiratory volume divided by slow vital capacity (FEV/SVC), FEV1, total lung capacity (TLC), inspiratory capacity (IC), functional residual capacity (FRC), and residual volume (VR). During the 6MWT, in addition to recording the total distance walked, patients were assessed from multiple perspectives, including O2 levels, O2 saturation, the Borg Dyspnoea Scale, and the Borg Scale for limb fatigue, measured twice, before and after the test.
A preliminary analysis was adopted to discard variables showing a cross-correlation greater than 0.8. Variables with missing values were imputed using a k-nearest neighbors (kNN)-based imputer from the Scikit-learn library [25]. Then, the remaining features were standardized by removing the mean and scaling to unit variance.
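The three preparation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's code: the correlation is taken as absolute Pearson correlation on pairwise-complete observations, the later column of each correlated pair is the one dropped, the imputer uses scikit-learn's default of five neighbors, and the column names `a`, `b`, `c` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler


def drop_correlated(df, threshold=0.8):
    """Drop the later column of every pair whose |Pearson r| exceeds threshold."""
    corr = df.corr().abs()  # pandas skips missing values pairwise
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


def preprocess(df, threshold=0.8, n_neighbors=5):
    """Correlation filter, then kNN imputation, then z-score standardization."""
    reduced = drop_correlated(df, threshold)
    imputed = KNNImputer(n_neighbors=n_neighbors).fit_transform(reduced)
    scaled = StandardScaler().fit_transform(imputed)
    return pd.DataFrame(scaled, columns=reduced.columns, index=reduced.index)


# Hypothetical data: "b" is almost a rescaled copy of "a"; two values are missing.
rng = np.random.default_rng(0)
a = rng.normal(size=60)
raw = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.1, size=60),
    "c": rng.normal(size=60),
})
raw.iloc[3, 0] = np.nan
raw.iloc[10, 2] = np.nan

processed = preprocess(raw)
```

In this toy run the near-duplicate column `b` is removed before imputation, so the imputer and the scaler only see the retained features, which mirrors the order described in the text.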
2.3. Clustering Methods
Patients were clustered according to four different unsupervised algorithms, including k-means [26], k-medoids [27], a Gaussian mixture model [28], and BIRCH (balanced iterative reducing and clustering using hierarchies) [29]. Input data for the unsupervised models were the independent variables of the analysis.
K-means clusters data by partitioning samples into a number of groups of equal variance, with the aim of minimizing the total within-cluster variance [26]. The algorithm was initialized with the k-means++ method (which selects initial centroids by sampling from a distance-based probability distribution [30]). Computation was sped up using Elkan's method (which applies the triangle inequality to avoid unnecessary distance computations [31]).
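The seeding idea behind k-means++ can be sketched with the standard library alone. This is an illustrative sketch, not the Scikit-learn implementation: the function names and the toy points are hypothetical, and it assumes the data contain at least k distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers: the first uniformly at random, each later one
    with probability proportional to its squared distance to the nearest
    center already chosen."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Already chosen points get weight 0, so they cannot be picked again.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


# Hypothetical toy data: two tight groups far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
seeds = kmeans_pp_seeds(points, k=2, rng=random.Random(0))
```

Because distant points carry far larger weights, the second seed almost always falls in the group that does not contain the first one, which is the behaviour that makes this initialization less sensitive to unlucky starts than uniform sampling.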
The k-medoids algorithm, a variation of k-means, partitions data into clusters by choosing representative points (medoids) and assigning each sample to the nearest medoid [27]. The algorithm was initialized with the k-medoids++ method (following an approach similar to k-means++).
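The assign-then-update loop can be sketched as below. This is a simplified alternating scheme for illustration, not necessarily the variant used in the study: the function names are hypothetical, the initial medoids are passed in by hand rather than drawn with k-medoids++, and the points are assumed to be distinct so that no cluster becomes empty.

```python
def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: dist(p, medoids[j]))
            for p in points]


def update_medoids(points, labels, k):
    """Within each cluster, pick the member with the smallest total
    distance to the other members."""
    medoids = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        medoids.append(min(members, key=lambda c: sum(dist(c, q) for q in members)))
    return medoids


def k_medoids(points, init_medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(init_medoids)
    for _ in range(max_iter):
        new_medoids = update_medoids(points, assign(points, medoids), len(medoids))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Hypothetical toy data with deliberately off-center starting medoids.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
medoids, labels = k_medoids(points, init_medoids=[(0.0, 1.0), (10.0, 11.0)])
```

Unlike a k-means centroid, each medoid is always an observed sample, which is why the method is often preferred when a cluster representative must correspond to a real patient profile.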
The Gaussian mixture model (GMM) assumes data are generated from a mixture of Gaussian distributions [28]. It employs the expectation–maximization algorithm to estimate the distribution parameters and assigns points to clusters based on the maximum a posteriori probability [32]. The algorithm was initialized with the k-means++ method.
BIRCH constructs a feature tree with each of the nodes representing a subcluster. The feature tree expands dynamically as new data points are added [29].
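The dynamic growth of the tree can be observed with Scikit-learn's `Birch` and its `partial_fit` method. The snippet is only a sketch: the two synthetic batches, the `threshold` value, and the choice of `n_clusters=None` (which skips the final global clustering step) are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import Birch

# Hypothetical data arriving in two batches from distant regions.
rng = np.random.default_rng(0)
first_batch = rng.normal(0.0, 0.2, size=(50, 2))
second_batch = rng.normal(10.0, 0.2, size=(50, 2))

# n_clusters=None exposes the raw leaf subclusters of the feature tree.
birch = Birch(threshold=0.5, n_clusters=None)

birch.partial_fit(first_batch)
n_before = len(birch.subcluster_centers_)

# Points far from every existing subcluster cannot be absorbed within the
# radius threshold, so new subclusters are created for them.
birch.partial_fit(second_batch)
n_after = len(birch.subcluster_centers_)
```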
For each algorithm, the number of clusters varied between 2 and 15. The number of clusters, as well as the different initializations, were compared and selected by choosing the configuration that yielded the highest silhouette score [33]. Once the optimal number of clusters was identified, the clustering algorithm was selected based on the best compromise between the silhouette score and balance in the number of patients assigned to clusters.
2.4. Statistical Analysis
Descriptive statistics were calculated before the imputation to provide a comprehensive overview of the effective absolute values. The median and interquartile range (IQR) values were reported for numerical variables, while for categorical variables, absolute frequencies and percentages were calculated. A comparative analysis was conducted between the subgroups identified by the dichotomized outcome. A Mann–Whitney test was performed for numerical variables, while a chi-squared test was conducted for categorical variables. After computing the cluster centroids, a second comparative analysis was conducted (Mann–Whitney test) to assess whether the outcome distributions in the cluster groups were statistically different. Later, the dichotomized outcome was compared with the cluster labels of each algorithm through a contingency table and a chi-squared analysis. Finally, on the model that reported the best results, a Mann–Whitney test was employed to evaluate whether there were statistically significant variations in the distribution of independent variables between the clusters.
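The two cluster-versus-outcome comparisons can be sketched with SciPy. Everything numeric below is synthetic and hypothetical: the labels and Delta meters values are randomly generated, the 30 m cut-off is treated as inclusive, and the chi-squared test is run without continuity correction, which the text does not specify.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu

# Hypothetical cluster labels and continuous outcome (Delta meters).
rng = np.random.default_rng(0)
cluster = rng.integers(0, 2, size=120)
delta_m = rng.normal(30.0, 40.0, size=120) + 15.0 * cluster

# Dichotomized outcome (assumed inclusive 30 m threshold).
csi = (delta_m >= 30).astype(int)

# Continuous outcome: two-sided Mann-Whitney test between the clusters.
u_stat, p_mw = mannwhitneyu(delta_m[cluster == 0], delta_m[cluster == 1],
                            alternative="two-sided")

# Dichotomized outcome: 2x2 contingency table (rows: cluster, columns: CSI).
table = np.zeros((2, 2), dtype=int)
np.add.at(table, (cluster, csi), 1)
chi2, p_chi2, dof, expected = chi2_contingency(table, correction=False)
```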
3. Results
3.1. Descriptive and Univariate Results
A total of 166 patients were initially enrolled, of whom 26 were excluded due to comorbidities, resulting in 140 patients included in the study. Among these, 14 patients had missing outcome data, leading to a final sample size of 126 patients analyzed. In this final cohort (median age 77 years [IQR = 10], males: 56), 50% of participants had a 6MWTCSI = 1 (the median value of Delta meters was 29.5 [IQR = 61]). The preliminary correlational analysis reduced the number of variables to 20. None of the variables related to the FOT and spirometry showed significantly different distributions between the two groups stratified by outcome. Conversely, among the variables of the 6MWT, O2 saturation and Borg Dyspnoea Scale rating measured at the beginning of the test, as well as total meters, significantly differed between the groups (Table 1).
3.2. Cluster Analysis
The optimization of the number of clusters conducted for each of the clustering algorithms led to identical results for all: the configuration with two clusters was the one with the highest silhouette score (Figure 2). The silhouette scores for the two-cluster configuration were 0.20 for the Gaussian mixture model, 0.14 for BIRCH, 0.12 for k-means, and 0.08 for k-medoids.
The number of patients assigned to each cluster was computed for each clustering method to assess group balancing. K-medoids and k-means clustering resulted in the most balanced distributions (Ncl0 = 61, Ncl1 = 65 and Ncl0 = 60, Ncl1 = 66, respectively); conversely, the Gaussian mixture model and BIRCH showed less cluster balance (Ncl0 = 11, Ncl1 = 115 and Ncl0 = 27, Ncl1 = 99, respectively).
Given these findings, the k-means clustering solution was considered the most appropriate for the analysis and is referred to as the respiratory rehabilitation index (R2I).
Concerning the comparison of clustering output with the dichotomized outcome, only the k-means solution showed a statistically significant association (χ2 = 4.58, p = 0.032). Conversely, the continuous outcome distribution was significantly different between the two clusters (Mann–Whitney, p < 0.05) for all the proposed solutions. The Delta meters distribution of the two clusters resulted in a median [IQR] of 21 [46.3] and 43.5 [74], 25 [57] and 30 [60], 20 [30.5] and 30 [57], and 21 [29] and 32 [63] for the k-means, k-medoids, GMM, and BIRCH, respectively (Figure 3). A radar plot illustrating the distribution of independent variables in the two clusters has been provided exclusively for the R2I (Figure 4). Several variables significantly differed between the two identified clusters (Table 2).
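As a quick consistency check, the p-value that corresponds to the reported statistic can be recomputed with the standard library. The sketch assumes one degree of freedom, which is what a 2×2 table of cluster label against dichotomized outcome implies; for that case the chi-squared survival function reduces to the complementary error function.

```python
import math


def chi2_sf_1dof(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Statistic reported for the k-means solution.
p_value = chi2_sf_1dof(4.58)
```

The result is about 0.032, in agreement with the reported value.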
4. Discussion
This study demonstrated that unsupervised clustering techniques can effectively stratify COPD patients into distinct subgroups based on pre-rehabilitation characteristics, offering valuable insights into rehabilitation outcomes. The outcome measure, defined as the change in 6MWT distance between admission and discharge, was dichotomized based on the MCID threshold of 30 m. The optimal clustering solution was obtained using the k-means algorithm with two clusters, resulting in the R2I. The latter, obtained from T0 data, revealed a significant association with the outcome at T1 (p = 0.032): patients with more severe baseline functional and respiratory impairments (R2I = 1) were more likely to improve their walked distance after rehabilitation. In particular, patients in R2I = 0, compared to those in R2I = 1, presented at admission with lower overall mechanical impairment (lower respiratory resistance values and smaller variations in reactance during the test), a more favorable ventilatory pattern and lung volumes, and better functional capacity, as indicated by higher walking performance, greater exercise tolerance, and lower perceived dyspnoea. Identifying these profiles through clustering before rehabilitation could help clinicians anticipate which patients are more likely to achieve a clinically significant increase in 6MWT distance and adapt the intensity, focus, and monitoring of PR programs accordingly, ultimately aiming to maximize individual benefits. While only a few functional parameters of the 6MWT, such as total distance and O2 saturation, showed significant differences between the groups identified by the outcome, the R2I clusters revealed differences in nearly all pre-rehabilitation variables.
These included parameters from both the FOT and spirometry, which were not evident in the outcome-based grouping, indicating that these respiratory measures play a critical role in patient stratification and may better capture the underlying heterogeneity in rehabilitation responses. Key variables that contributed most to the discrimination between R2I clusters included ΔXRS, FEV/SVC, and IC. This multidimensional approach goes beyond single-domain assessments used in previous studies by capturing both respiratory mechanics and functional performance, providing a more accurate characterization of patient profiles.
From a methodological perspective, this study compared several clustering algorithms, namely k-medoids, the GMM, and BIRCH, in addition to k-means, which was ultimately selected as the most appropriate solution. The use of silhouette scores to choose the optimal number of clusters ensured an objective and reproducible approach, reinforcing the validity of the identified subgroups. These methodological strengths addressed critical gaps in the literature, where clustering solutions were often hindered by inconsistent methods and insufficient validation, resulting in a lack of reproducibility and practical relevance. By applying and comparing different clustering methodologies and achieving consistent subgroup identification across algorithms, this study enhanced confidence in the robustness of the R2I for patient stratification.
The most significant practical implication of this study is the potential to personalize rehabilitation strategies for COPD patients. COPD is a highly heterogeneous condition, with patients presenting diverse clinical profiles and responses to therapy, which often limits the effectiveness of standardized rehabilitation protocols. By stratifying patients into more homogeneous subgroups based on pre-rehabilitation features, unsupervised clustering techniques can contribute to understanding the relationship between pulmonary function impairment and mechanisms of response to PR. This approach enables the design of tailored rehabilitation programs with the potential to improve rehabilitation outcomes, reduce variability in responses, and support more effective patient management in clinical practice.
5. Limitations
The relatively small sample size may limit the generalizability of the findings to broader COPD populations. Moreover, conducting the study in a single rehabilitation center may have introduced bias linked to the specific population characteristics or local rehabilitation protocols.
6. Future Directions
Future research should focus on validating the R2I across larger COPD cohorts to enhance its generalizability and clinical applicability. Further investigations could benefit from the inclusion of additional clinical and functional variables, such as psychosocial factors (e.g., anxiety and depression [34]), comorbidities (e.g., cardiac, metabolic, orthopedic, or behavioral health problems [35]), and markers of skeletal muscle dysfunction [36]. These aspects are well established as influential determinants of rehabilitation outcomes in individuals with COPD. Incorporating them into a multidimensional framework may allow for more accurate patient stratification and could enhance the overall predictive value and clinical utility of the R2I.
7. Conclusions
This study shows that the unsupervised clustering of multidimensional admission data enables the identification of clinically meaningful subgroups of COPD patients undergoing pulmonary rehabilitation. By integrating 6MWT, FOT, and spirometry parameters, the R2I offers a data-driven stratification tool capable of predicting rehabilitation outcomes. Specifically, patients with more severe pre-rehabilitation impairment (R2I = 1) were more likely to achieve clinically significant improvements in functional capacity, as measured by the 6MWT.
The R2I captured differences across a broad range of admission variables, many of which were not univariately associated with the outcome. These findings underscore the potential of unsupervised machine learning approaches to uncover hidden patterns in complex clinical data and support more personalized rehabilitation strategies.
Author Contributions
Conceptualization, P.L. and A.M.; methodology, E.M. and P.L.; software, E.M.; validation, I.R. and F.G.; formal analysis, E.M. and P.L.; investigation, E.M., F.G. and I.R.; resources, A.M., F.G. and I.R.; writing—original draft preparation, E.M.; writing—review and editing, P.L., A.M., I.R. and F.G.; visualization, E.M.; supervision, I.R., F.G. and A.M.; project administration, A.M.; funding acquisition, A.M. All authors have read and agreed to the published version of the manuscript.
Funding
This study was funded by the Italian Ministry of Health under the “Ricerca Corrente” program.
Institutional Review Board Statement
The studies were approved by the Research Ethics Committee (approval code: r.n.18765_oss; r.n.15217_oss; approval date: 20 April 2020).
Informed Consent Statement
Written informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The data presented in this study are available upon request from the corresponding author for reproducibility purposes.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
6MWT | Six-Minute Walk Test |
BIRCH | Balanced Iterative Reducing and Clustering using Hierarchies |
CSI | Clinically Significant Improvement |
COPD | Chronic Obstructive Pulmonary Disease |
DALYs | Disability-Adjusted Life Years |
FEV | Forced Expiratory Volume |
FEV1 | Forced Expiratory Volume in the First Second |
FRC | Functional Residual Capacity |
FOT | Forced Oscillation Technique |
GMM | Gaussian Mixture Model |
IC | Inspiratory Capacity |
IQR | Interquartile Range |
kNN | k-Nearest Neighbors |
MCID | Minimal Clinically Important Difference |
MRC | Medical Research Council |
PR | Pulmonary Rehabilitation |
PRP | Pulmonary Rehabilitation Program |
R2I | Respiratory Rehabilitation Index |
RF | Respiratory Flow |
RRS | Respiratory System Resistance |
RR | Respiratory Rate |
SVC | Slow Vital Capacity |
TE | Expiratory Time |
TI | Inspiratory Time |
TLC | Total Lung Capacity |
TTOT | Total Time |
VE | Mechanical Ventilation |
VR | Residual Volume |
VT | Tidal Volume |
XRS | Respiratory System Reactance |
References
- Safiri, S.; Carson-Chahhoud, K.; Noori, M.; Nejadghaderi, S.A.; Sullman, M.J.M.; Heris, J.A.; Ansarin, K.; Mansournia, M.A.; Collins, G.S.; Kolahi, A.-A.; et al. Burden of chronic obstructive pulmonary disease and its attributable risk factors in 204 countries and territories, 1990–2019: Results from the Global Burden of Disease Study 2019. BMJ 2022, 378, e069679.
- Wang, Z.; Lin, J.; Liang, L.; Huang, F.; Yao, X.; Peng, K.; Gao, Y.; Zheng, J. Global, regional, and national burden of chronic obstructive pulmonary disease and its attributable risk factors from 1990 to 2021: An analysis for the Global Burden of Disease Study 2021. Resp. Res. 2025, 26, 2.
- Wedzicha, J.A.; Seemungal, T.A. COPD exacerbations: Defining their cause and prevention. Lancet 2007, 370, 786–796.
- Vestbo, J.; Edwards, L.D.; Scanlon, P.D.; Yates, J.C.; Agusti, A.; Bakke, P.; Calverley, P.M.; Celli, B.; Coxson, H.O.; Crim, C.; et al. Changes in forced expiratory volume in 1 second over time in COPD. N. Engl. J. Med. 2011, 365, 1184–1192.
- Fermont, J.M.; Masconi, K.L.; Jensen, M.T.; Ferrari, R.; Di Lorenzo, V.A.P.; Marott, J.M.; Schuetz, P.; Watz, H.; Waschki, B.; Müllerova, H.; et al. Biomarkers and clinical outcomes in COPD: A systematic review and meta-analysis. Thorax 2019, 74, 439–446.
- Troosters, T.; Janssens, W.; Demeyer, H.; Rabinovich, R.A. Pulmonary rehabilitation and physical interventions. Eur. Respir. Rev. 2023, 32, 220222.
- Corlateanu, A.; Mendez, Y.; Wang, Y.; Garnica, R.d.J.A.; Botnaru, V.; Siafakas, N. Chronic obstructive pulmonary disease and phenotypes: A state-of-the-art. Pulmonology 2020, 26, 95–100.
- Jones, P.W.; Agusti, A.G.N. Outcomes and markers in the assessment of chronic obstructive pulmonary disease. Eur. Respir. J. 2006, 27, 822–832.
- Jenkins, S.C. Six-minute walk test in patients with COPD: Clinical applications in pulmonary rehabilitation. Physiotherapy 2007, 93, 175–182.
- Bestall, J.C.; Paul, E.A.; Garrod, R.; Garnham, R.; Jones, P.W.; Wedzicha, J.A. Usefulness of the Medical Research Council (MRC) dyspnoea scale as a measure of disability in patients with chronic obstructive pulmonary disease. Thorax 1999, 54, 581–586.
- Komorowski, M.; Green, A.; Tatham, K.C.; Seymour, C.; Antcliffe, D. Sepsis biomarkers and diagnostic tools with a focus on machine learning. EBioMedicine 2022, 86, 104394.
- Miller, R.J.H.; Bednarski, B.P.; Pieszko, K.; Kwiecinski, J.; Williams, M.C.; Shanbhag, A.; Liang, J.X.; Huang, C.; Sharir, T.; Hauser, M.T.; et al. Clinical phenotypes among patients with normal cardiac perfusion using unsupervised learning: A retrospective observational study. EBioMedicine 2024, 99, 104930.
- Alexander, N.; Alexander, D.C.; Barkhof, F.; Denaxas, S. Identifying and evaluating clinical subtypes of Alzheimer’s disease in care electronic health records using unsupervised machine learning. BMC Med. Inf. Decis. Mak. 2021, 21, 343.
- Burgel, P.R.; Paillasseur, J.-L.; Caillaud, D.; Tillie-Leblond, I.; Chanez, P.; Escamilla, R.; Court-Fortune, I.; Perez, T.; Carré, P.; Roche, N. Clinical COPD phenotypes: A novel approach using principal component and cluster analyses. Eur. Respir. J. 2010, 36, 531–539.
- Pikoula, M.; Quint, J.K.; Nissen, F.; Hemingway, H.; Smeeth, L.; Denaxas, S. Identifying clinically important COPD sub-types using data-driven approaches in primary care population-based electronic health records. BMC Med. Inf. Decis. Mak. 2019, 19, 1–14.
- Burgel, P.R.; Quint, J.K.; Nissen, F.; Hemingway, H.; Smeeth, L.; Denaxas, S. A simple algorithm for the identification of clinical COPD phenotypes. Eur. Respir. J. 2017, 50, 1701034.
- Chikhanie, Y.A.; Bailly, S.; Amroussa, I.; Veale, D.; Hérengt, F.; Verges, S. Clustering of COPD patients and their response to pulmonary rehabilitation. Respir. Med. 2022, 198, 106861.
- Spruit, M.A.; Augustin, I.M.L.; Vanfleteren, L.E.; Janssen, D.J.A.; Gaffron, S.; Pennings, H.-J.; Smeenk, F.; Pieters, W.; van den Bergh, J.J.A.M.; Michels, A.-J.; et al. Differential response to pulmonary rehabilitation in COPD: Multidimensional profiling. Eur. Respir. J. 2015, 46, 1625–1635.
- Rochester, C.L.; Vogiatzis, I.; Holland, A.E.; Lareau, S.C.; Marciniuk, D.D.; Puhan, M.A.; Spruit, M.A.; Masefield, S.; Casaburi, R.; Clini, E.M.; et al. An official American Thoracic Society/European Respiratory Society policy statement enhancing implementation, use, and delivery of pulmonary rehabilitation. Am. J. Respir. Crit. Care Med. 2015, 192, 1373–1386.
- Agustí, A.; Celli, B.R.; Criner, G.J.; Halpin, D.; Anzueto, A.; Barnes, P.; Bourbeau, J.; Han, M.K.; Martinez, F.J.; de Oca, M.M.; et al. Global Initiative for Chronic Obstructive Lung Disease 2023 report. GOLD executive summary. Eur. Respir. J. 2023, 61, 2300239.
- Oostveen, E.; MacLeod, D.; Lorino, H.; Farré, R.; Hantos, Z.; Desager, K.; Marchal, F. The FOT in clinical practice: Methodology, recommendations and future developments. Eur. Respir. J. 2003, 22, 1026–1041.
- Graham, B.L.; Steenbruggen, I.; Miller, M.R.; Barjaktarevic, I.Z.; Cooper, B.G.; Hall, G.L.; Hallstrand, T.S.; Kaminsky, D.A.; McCarthy, K.; McCormack, M.C.; et al. Standardization of spirometry: 2019 update. Am. J. Respir. Crit. Care Med. 2019, 200, e70–e88.
- Enright, P.L. The six-minute walk test. Respir. Care. 2003, 48, 783–785.
- Singh, S.J.; Puhan, M.A.; Andrianopoulos, V.; Hernandes, N.A.; Mitchell, K.E.; Hill, C.J.; Lee, A.L.; Camillo, C.A.; Troosters, T.; Spruit, M.A.; et al. An official systematic review of the European Respiratory Society/American Thoracic Society: Measurement properties of field walking tests in chronic respiratory disease. Eur. Respir. J. 2014, 44, 1447–1478.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 21 June–18 July 1965, 27 December 1965–7 January 1966; Le Cam, L.M., Neyman, J., Eds.; University of California Press: Oakland, CA, USA, 1967.
- Park, H.S.; Jun, C.H. A simple and fast algorithm for K-medoids clustering. Expert. Syst. Appl. 2009, 36, 3336–3341.
- Rasmussen, C. The infinite Gaussian mixture model. Adv. Neural Inf. Process. Syst. 1999, 12, 554–560.
- Zhang, T.; Ramakrishnan, R.; Livny, M. BIRCH: An efficient data clustering method for very large databases. ACM SIGMOD Rec. 1996, 25, 103–114.
- Arthur, D.; Vassilvitskii, S. k-Means++: The Advantages of Careful Seeding; Technical Report; Stanford University: Palo Alto, CA, USA, 2006.
- Elkan, C. Using the triangle inequality to accelerate k-means. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), Washington, DC, USA, 21–24 August 2003; pp. 147–153.
- Moon, T.K. The expectation-maximization algorithm. IEEE Signal Process Mag. 1996, 13, 47–60.
- Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65.
- Gordon, C.S.; Waller, J.W.; Cook, R.M.; Cavalera, S.L.; Lim, W.T.; Osadnik, C.R. Effect of pulmonary rehabilitation on symptoms of anxiety and depression in COPD: A systematic review and meta-analysis. Chest 2019, 156, 80–91.
- Tunsupon, P.; Lal, A.; Abo Khamis, M.; Mador, M.J. Comorbidities in patients with chronic obstructive pulmonary disease and pulmonary rehabilitation outcomes. J. Cardiopulm. Rehabil. Prev. 2017, 37, 283–289.
- Jaitovich, A.; Barreiro, E. Skeletal muscle dysfunction in chronic obstructive pulmonary disease: What we know and can do for our patients. Am. J. Respir. Crit. Care Med. 2018, 198, 175–186.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).