AI-Driven Clustering-Based Stratification of Allergic Patients Towards Smart Healthcare Systems in Southern Italy
Abstract
1. Introduction
1.1. Background and Current Drawbacks
1.2. Difficulties in Subjective Assessment and Reproducible Insights
1.3. Engineering Value and Research Gap: Use of Unsupervised AI to Discover Unbiased Patient Profiles
1.4. Study Objectives and Novelty
2. Study Design and Analytical Framework Overview
2.1. Study Aims and Healthcare Implications
1. Parsimonious and interpretable stratification: demonstrate that three routine measures (age, white blood cell count, and total IgE) yield reproducible, clinically interpretable, and useful patient subgroups for outpatient management in Southern Italy;
2. In an unsupervised context, both the stability of the clusters and their reliability must be established in the absence of reference labels. To this end, we distinguish two aspects:
   - Internal consistency of the results: build the clusters with an ensemble-consensus strategy that merges four clustering paradigms, aligns their labels with the Hungarian algorithm, and accepts an assignment only when at least three of the four methods agree (3/4 majority rule); justify the final choice of k = 5 using standard internal indices and the clinical coherence of the differences between subgroups; then assess the internal consistency of the resulting subgroups.
   - Estimation of actual reliability without a gold standard: quantify the reliability of the partitions with label-free indicators (assignment consensus rate, bootstrap stability, prediction strength on half-half splits, and the silhouette distribution, reported as median and tails), which together offer an interpretable measure of separability.
3. Translation into clinical actions: derive an outpatient decision guide that specifies the frequency of check-ups, when to repeat tests (total IgE and white blood cell count), which targeted investigations to prioritize, and the referral criteria, designed for integration into the electronic medical record in the Apulia region. The study explicitly aims to stratify risk for outpatient services in Apulia in order to provide region-specific decision support.
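The consensus step named in the aims (label matching via the Hungarian algorithm followed by a 3/4 majority rule) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `align_labels` and `consensus_labels`, the choice of the first partition as the reference, and the use of -1 for patients without consensus are all hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_labels(reference, labels, k):
    """Relabel `labels` so they best match `reference` (Hungarian algorithm)."""
    # Contingency table: rows index the labels to remap, columns the reference labels.
    overlap = np.zeros((k, k), dtype=int)
    for ref, lab in zip(reference, labels):
        overlap[lab, ref] += 1
    # Maximising the total overlap equals minimising its negation.
    rows, cols = linear_sum_assignment(-overlap)
    mapping = dict(zip(rows, cols))
    return np.array([mapping[lab] for lab in labels])


def consensus_labels(partitions, k, min_votes=3):
    """Majority vote over aligned partitions; -1 marks points without consensus."""
    reference = partitions[0]  # assumption: the first partition is the reference
    aligned = np.vstack(
        [reference] + [align_labels(reference, p, k) for p in partitions[1:]]
    )
    # votes[c, i] = number of methods that assign point i to cluster c
    votes = np.stack([(aligned == c).sum(axis=0) for c in range(k)])
    winner = votes.argmax(axis=0)
    return np.where(votes.max(axis=0) >= min_votes, winner, -1)


# Toy example with four partitions of six points (synthetic labels).
a = np.array([0, 0, 1, 1, 2, 2])
b = np.array([1, 1, 2, 2, 0, 0])  # same grouping as `a`, permuted labels
c = np.array([2, 2, 0, 0, 1, 1])  # same grouping as `a`, permuted labels
d = np.array([0, 1, 1, 1, 2, 0])  # disagrees on two points
consensus = consensus_labels([a, b, c, d], k=3)
```

With `min_votes=3` the two points on which `d` disagrees still reach consensus, because the other three partitions agree; raising the threshold to 4 would leave them unassigned.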
2.2. Methodological Workflow
3. Materials and Methods
3.1. Dataset Description and Preprocessing
- Age, recorded at the clinical evaluation as the variable age (in years). Age influences allergy pathophysiology because immune responses commonly change with immunological maturity [27];
- White blood cell (WBC) count, expressed in cells per mm³ (equivalently, cells per μL, since 1 mm³ = 1 μL), which lets clinicians track systemic inflammation and monitor immune activity indirectly. Elevated WBC counts are commonly associated with both allergic responses and subsequent infections [28];
- Total immunoglobulin E (IgE), expressed in kU/L, a marker of allergic disease and hypersensitivity. Clinical severity in allergic patients typically increases with higher total IgE concentrations [29].
3.1.1. Rationale for Variable Selection
3.1.2. Ethical Considerations
3.2. Clustering Algorithms and Design Choices
3.2.1. K-Means Clustering
3.2.2. Agglomerative Hierarchical Clustering
3.2.3. Gaussian Mixture Model
3.2.4. Spectral Clustering
3.2.5. Implementation Specifics and Reproducibility Settings
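A minimal sketch of how the four clustering paradigms listed above could be fitted reproducibly with scikit-learn. Everything beyond the four algorithm names and k = 5 is an assumption made for illustration: the seed value, the z-score standardization, Ward linkage, the RBF affinity for spectral clustering, and the synthetic data (drawn around the reported feature means) are not taken from the study.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

SEED = 42  # hypothetical seed; the study's actual setting is not reproduced here
K = 5      # number of clusters selected in the study


def fit_partitions(X, k=K, seed=SEED):
    """Fit the four paradigms on standardized features and return their labels."""
    Xs = StandardScaler().fit_transform(X)  # assumption: z-score scaling
    return {
        "kmeans": KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Xs),
        "agglomerative": AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(Xs),
        "gmm": GaussianMixture(n_components=k, random_state=seed).fit_predict(Xs),
        "spectral": SpectralClustering(n_clusters=k, affinity="rbf", random_state=seed).fit_predict(Xs),
    }


# Synthetic stand-in for the (WBC, age, total IgE) matrix; not patient data.
rng = np.random.default_rng(SEED)
X_demo = rng.normal(loc=[9750.0, 54.0, 251.0], scale=[2500.0, 13.5, 72.0], size=(300, 3))
partitions = fit_partitions(X_demo)
```

Fixing `random_state` wherever an estimator accepts it makes repeated runs return the same partitions, which is a precondition for comparing methods and building a consensus.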
3.3. Evaluation Metrics and Validation of Clustering Results
3.4. Consensus Clustering Strategy
3.5. Internal Proxy Assessment of Cluster Separability Using Supervised Learning
4. Results and Discussion: Findings and Clinical Stratification
4.1. Data Preprocessing
4.1.1. Handling Missing Values
4.1.2. Outlier Screening
- WBC: 0 missing values;
- Age: 0 missing values;
- Total IgE: 0 missing values;
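The per-variable missing-value counts listed above can be obtained with a one-line pandas check. The sketch below is illustrative only: the column names mirror the variable names used in this article, and the small data frame is synthetic (it deliberately contains one missing WBC value to show a non-zero count).

```python
import numpy as np
import pandas as pd


def missing_report(df, columns):
    """Return the number of missing values in each listed column."""
    return df[columns].isna().sum()


# Synthetic rows, not patient data.
df_demo = pd.DataFrame(
    {
        "WBC": [9800.0, 7400.0, np.nan],
        "Age": [55, 42, 63],
        "Total IgE": [242.6, 199.9, 301.7],
    }
)
report = missing_report(df_demo, ["WBC", "Age", "Total IgE"])
```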

4.2. Exploratory Data Analysis
4.2.1. Feature Distributions
4.2.2. Correlation Analysis
4.3. Clustering Analysis and Comparison of Techniques
Internal Metrics for Each Clustering Method
4.4. Consensus Clustering Analysis
4.4.1. Clinical Profile of the Consensus Clusters
4.4.2. ANOVA Analysis
- White blood cells: WBC counts differ significantly across clusters;
- Age: age distributions differ significantly among the identified groups;
- Total IgE: IgE levels vary significantly between clusters.
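A one-way ANOVA of a single feature across clusters can be run with `scipy.stats.f_oneway`. The sketch below is a hedged illustration: the helper `anova_by_cluster` is hypothetical, and the data are synthetic values drawn around the per-cluster total IgE means and standard deviations reported for the consensus clusters, with an assumed 60 patients per cluster (the real cluster sizes are not used here).

```python
import numpy as np
from scipy.stats import f_oneway


def anova_by_cluster(values, labels):
    """One-way ANOVA of `values` grouped by cluster label."""
    groups = [values[labels == c] for c in np.unique(labels)]
    return f_oneway(*groups)


rng = np.random.default_rng(0)
ige_means = [230.85, 332.69, 316.75, 256.26, 152.12]  # reported cluster means
ige_stds = [34.61, 36.29, 58.40, 34.10, 29.02]        # reported cluster SDs
n_per_cluster = 60                                     # assumption for the demo

labels_demo = np.repeat(np.arange(5), n_per_cluster)
ige_demo = np.concatenate(
    [rng.normal(m, s, n_per_cluster) for m, s in zip(ige_means, ige_stds)]
)
result = anova_by_cluster(ige_demo, labels_demo)
# result.statistic is the F value and result.pvalue the associated p-value.
```

Because the clusters were built from these same features, a significant ANOVA mainly confirms that the partition separates the data along each variable; it is not independent evidence of clinical validity.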
4.4.3. Findings and Implications
4.4.4. Clinical Interpretation of Biomarker-Based Clusters
4.5. Simplified Clinical Decisional Guide
4.6. Validation-Oriented Analysis: Supervised Proxy Prediction of Cluster Labels
- Accuracy: 97.78%;
- Cross-Validation Accuracy Scores: [91.67%, 98.33%, 96.67%, 98.33%, 91.67%];
- Mean Cross-Validation Accuracy: 95.33%.
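The reported mean follows directly from the five fold scores: (91.67 + 98.33 + 96.67 + 98.33 + 91.67) / 5 ≈ 95.33%. The fold scores are also arithmetically consistent with folds of 60 patients (for example, 55/60 ≈ 91.67%), although the actual split is not restated here. The sketch below checks the mean and shows one way a supervised proxy could be cross-validated; the random forest settings, the stratified shuffling, and the synthetic blobs standing in for the three features and cluster labels are assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Arithmetic check of the reported mean cross-validation accuracy.
reported_folds = np.array([91.67, 98.33, 96.67, 98.33, 91.67])
reported_mean = reported_folds.mean()  # about 95.33

# Synthetic stand-in: 300 points, 3 features, 5 groups (not patient data).
X_demo, y_demo = make_blobs(n_samples=300, centers=5, n_features=3, random_state=0)

# Hypothetical proxy classifier predicting cluster labels from the features.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X_demo, y_demo, cv=cv, scoring="accuracy")
```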
Model Validation and Clinical Utility
5. Conclusions and Clinical Implications
5.1. Novelty from Clusters to Care
1. Minimal-cost outpatient risk stratification: using just three routine measures (age, white blood cell count, and total IgE), patients can be stratified into those requiring more frequent check-ups and those who can be followed with scheduled surveillance (before resorting to specialist testing), with direct applicability in the Apulia region. Specifically, five clinically interpretable strata were obtained via an ensemble-consensus rule (agreement of 3/4 algorithms, with label alignment by Hungarian matching), supported by internal indices (Silhouette, Calinski–Harabasz, Davies–Bouldin) and by significant ANOVA differences in age, WBC, and total IgE, potentially facilitating future EMR integration through simple reference ranges.
2. Reduction of variability between physicians: the proposed reference table standardizes operating thresholds and consequent actions, promoting more consistent decisions regarding when to reassess, intensify treatment, and refer for specialist consultation.
3. Translation into clinical actions: the derived profiles are operationalized into concrete outpatient steps:
   - Care pathways by phenotype: each profile identified by clustering is translated into practical recommendations: visit frequency, timing for repeat total IgE and white blood cell counts, opportunities for targeted in-depth testing (e.g., eosinophils, exhaled nitric oxide, spirometry), and criteria for referral to the most appropriate specialist. These pathways are intended for deployment in Southern Italian clinics;
   - Signs of subclinical inflammation in the elderly: the profile characterized by older age, elevated white blood cells, and only moderate total IgE suggests possible inflammation that is not clinically evident. In these cases, targeted screening (e.g., C-reactive protein, white cell differential, airway evaluation, or a work-up for chronic infections) is indicated to reduce potential diagnostic delays;
   - Prioritization of referrals: profiles that combine elevated white blood cell and IgE values at a younger age are associated with greater allergic activity and deserve priority for evaluation and treatment planning. Conversely, profiles with lower values can be managed with structured surveillance and periodic reassessment;
   - Monitoring over time: the movement of an individual patient from one profile to another can be used as a pragmatic indicator of clinical response, useful in daily practice.
5.2. Methodological Limitations and Future Directions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| ANOVA | Analysis of Variance |
| ARI | Adjusted Rand Index |
| CH | Calinski–Harabasz Index |
| DB | Davies–Bouldin Index |
| EDA | Exploratory Data Analysis |
| EMR | Electronic Medical Record |
| EU | European Union |
| FMI | Fowlkes–Mallows Index |
| GDPR | General Data Protection Regulation |
| GMM | Gaussian Mixture Model |
| IgE | Immunoglobulin E |
| k-NN | k-Nearest Neighbors |
| NMI | Normalized Mutual Information |
| RF | Random Forest |
| SD | Standard Deviation |
| SVM | Support Vector Machine |
| WBC | White Blood Cell |
| WCSS | Within-Cluster Sum of Squares |
References
- Pawankar, R. Allergic diseases and asthma: A global public health concern and a call to action. World Allergy Organ. J. 2014, 7, 12.
- Calderón, M.A.; Linneberg, A.; Kleine-Tebbe, J.; De Blay, F.; Hernandez Fernandez de Rojas, D.; Virchow, J.C.; Demoly, P. Respiratory allergy caused by house dust mites: What do we really know? J. Allergy Clin. Immunol. 2015, 136, 38–48.
- Zhang, X.; Xing, F.; Zhao, Y.; Li, C. Efficacy of probiotics in the treatment of allergic diseases: A meta-analysis. Front. Nutr. 2025, 12, 1502390.
- Juniper, E.F.; Guyatt, G.H.; Dolovich, J. Assessment of quality of life in adolescents with allergic rhinoconjunctivitis: Development and testing of a questionnaire for clinical trials. J. Allergy Clin. Immunol. 1994, 93, 413–423.
- Baiardini, I.; Braido, F.; Bonini, M.; Compalati, E.; Canonica, G.W. Why do doctors and patients not follow guidelines? Curr. Opin. Allergy Clin. Immunol. 2009, 9, 228–233.
- Protudjer, J.L.P.; Davis, C.M.; Gupta, R.S.; Perry, T.T. Social determinants and quality of life in food allergy management and treatment. J. Allergy Clin. Immunol. Pract. 2025, 13, 745–750.
- Warren, C.M.; Otto, A.K.; Walkner, M.M.; Gupta, R.S. Quality of life among food allergic patients and their caregivers. Curr. Allergy Asthma Rep. 2016, 16, 38.
- Bousquet, J.; Mantzouranis, E.; Cruz, A.A.; Aït-Khaled, N.; Baena-Cagnani, C.E.; Bleecker, E.R.; Brightling, C.E.; Burney, P.; Bush, A.; Busse, W.W.; et al. Uniform definition of asthma severity, control, and exacerbations: Document presented for the World Health Organization Consultation on Severe Asthma. J. Allergy Clin. Immunol. 2010, 126, 926–938.
- Eguiluz-Gracia, I.; Tay, T.R.; Hew, M.; Escribese, M.M.; Barber, D.; O’Hehir, R.E.; Torres, M.J. Recent developments and highlights in biomarkers in allergic diseases and asthma. Allergy 2018, 73, 2290–2305.
- Ansotegui, I.J.; Melioli, G.; Canonica, G.W.; Caraballo, L.; Villa, E.; Ebisawa, M.; Passalacqua, G.; Savi, E.; Ebo, D.; Gómez, R.M.; et al. IgE allergy diagnostics and other relevant tests in allergy, a World Allergy Organization position paper. World Allergy Organ. J. 2020, 13, 100080.
- Vitte, J.; Santos, A.F. Editorial: In vitro diagnosis of allergic and mast cell-mediated disorders. Front. Allergy 2024, 5, 1483398.
- Fritzsching, B. Personalized Medicine in Allergic Asthma: At the Crossroads of Allergen Immunotherapy and “Biologicals”. Front. Pediatr. 2017, 5, 31.
- Muraro, A.; Fernandez-Rivas, M.; Beyer, K.; Cardona, V.; Clark, A.; Eller, E.; Hourihane, J.O.B.; Jutel, M.; Sheikh, A.; Agache, I.; et al. The urgent need for a harmonized severity scoring system for acute allergic reactions. Allergy 2018, 73, 1792–1800.
- Caddick, Z.A.; Fraundorf, S.H.; Rottman, B.M.; Nokes-Malach, T.J. Cognitive perspectives on maintaining physicians’ medical expertise: II. Acquiring, maintaining, and updating cognitive skills. Cogn. Res. Princ. Implic. 2023, 8, 47.
- Niven, D.J.; McCormick, T.J.; Straus, S.E.; Hemmelgarn, B.R.; Jeffs, L.; Barnes, T.R.M.; Stelfox, H.T. Reproducibility of clinical research in critical care: A scoping review. BMC Med. 2018, 16, 26.
- Yang, W.C.; Lai, J.P.; Liu, Y.H.; Lin, Y.L.; Hou, H.P.; Pai, P.F. Using medical data and clustering techniques for a smart healthcare system. Electronics 2023, 13, 140.
- Aljuhani, M.; Ashraf, A.; Edison, P. Use of artificial intelligence in imaging dementia. Cells 2024, 13, 1965.
- Subramanian, J.; Simon, R. Gene expression-based prognostic signatures in lung cancer: Ready for clinical use? J. Natl. Cancer Inst. 2010, 102, 464–474.
- van Breugel, M.; Fehrmann, R.S.N.; Bügel, M.; Rezwan, F.I.; Holloway, J.W.; Nawijn, M.C.; Fontanella, S.; Custovic, A.; Koppelman, G.H. Current state and prospects of artificial intelligence in allergy. Allergy 2023, 78, 2623–2643.
- Haider, S.; Fontanella, S.; Ullah, A.; Turner, S.; Simpson, A.; Roberts, G.; Murray, C.S.; Holloway, J.W.; Curtin, J.A.; Cullinan, P.; et al. Evolution of eczema, wheeze, and rhinitis from infancy to early adulthood: Four birth cohort studies. Am. J. Respir. Crit. Care Med. 2022, 206, 950–960.
- Monti, S.; Tamayo, P.; Mesirov, J.; Golub, T. Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Mach. Learn. 2003, 52, 91–118.
- Kiselev, V.Y.; Kirschner, K.; Schaub, M.T.; Andrews, T.; Yiu, A.; Chandra, T.; Natarajan, K.N.; Reik, W.; Barahona, M.; Green, A.R.; et al. SC3: Consensus clustering of single-cell RNA-seq data. Nat. Methods 2017, 14, 483–486.
- Fred, A.L.; Jain, A.K. Combining multiple clusterings using evidence accumulation. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 835–850.
- Ahmed, M.B.; Ad’hiah, A.H. Effects of age, gender and allergen type on immunoglobulin E level in asthma and allergic rhinitis patients. Iraqi J. Sci. 2022, 63, 1498–1506.
- Al-Ghamdi, B.R.; Koshak, E.A.; Omer, F.M.; Awadalla, N.J.; Mahfouz, A.A.; Ageely, H.M. Immunological factors associated with adult asthma in the Aseer region, southwestern Saudi Arabia. Int. J. Environ. Res. Public Health 2019, 16, 2495.
- Leung, T.F.; Kong, A.P.S.; Chan, I.H.S.; Choi, K.C.; Ho, C.S.; Chan, M.H.M.; So, W.Y.; Lam, C.W.K.; Wong, G.W.K.; Chan, J.C.N. Association between obesity and atopy in Chinese schoolchildren. Int. Arch. Allergy Immunol. 2009, 149, 133–140.
- Guo, J.; Huang, X.; Dou, L.; Yan, M.; Shen, T.; Tang, W.; Li, J. Aging and aging-related diseases: From molecular mechanisms to interventions and treatments. Signal Transduct. Target. Ther. 2022, 7, 391.
- Chmielewski, P.P.; Strzelec, B. Elevated leukocyte count as a harbinger of systemic inflammation, disease progression, and poor prognosis: A review. Folia Morphol. 2018, 77, 171–178.
- Amarasekera, M. Immunoglobulin E in health and disease. Asia Pac. Allergy 2011, 1, 12–15.
- Junod, V.; Elger, B. Retrospective research: What are the ethical and legal requirements? Swiss Med. Wkly. 2010, 140, w13041.
- Winter, E.M.; Maughan, R.J. Requirements for ethics approvals. J. Sport. Sci. 2009, 27, 985.
- Gollogly, L. Ethical approval for operational research. Bull. World Health Organ. 2006, 84, 766.
- Wani, A.A. Comprehensive analysis of clustering algorithms: Exploring limitations and innovative solutions. PeerJ Comput. Sci. 2024, 10, e2286.
- Huang, H.; Liao, Z.; Wei, X.; Zhou, Y. Combined Gaussian mixture model and Pathfinder algorithm for data clustering. Entropy 2023, 25, 946.
- Bertsimas, D.; Orfanoudaki, A.; Wiberg, H. Interpretable clustering: An optimization approach. Mach. Learn. 2021, 110, 89–138.
- Javed, H.; El-Sappagh, S.; Abuhmed, T. Robustness in deep learning models for medical diagnostics: Security and adversarial challenges towards robust AI applications. Artif. Intell. Rev. 2024, 58, 12.
- Boutalbi, R.; Labiod, L.; Nadif, M. Implicit consensus clustering from multiple graphs. Data Min. Knowl. Discov. 2021, 35, 2313–2340.
- Tangherloni, A.; Ricciuti, F.; Besozzi, D.; Liò, P.; Cvejic, A. Analysis of single-cell RNA sequencing data based on autoencoders. BMC Bioinform. 2021, 22, 309.
- Handschuh, L.; Kaźmierczak, M.; Milewski, M.; Góralski, M.; Łuczak, M.; Wojtaszewska, M.; Uszczyńska-Ratajczak, B.; Lewandowski, K.; Komarnicki, M.; Figlerowicz, M. Gene expression profiling of acute myeloid leukemia samples from adult patients with AML-M1 and -M2 through boutique microarrays, real-time PCR and droplet digital PCR. Int. J. Oncol. 2017, 656–678.
- Karras, C.; Karras, A.; Giotopoulos, K.C.; Avlonitis, M.; Sioutas, S. Consensus big data clustering for Bayesian mixture models. Algorithms 2023, 16, 245.
- Gabrovšek, B.; Novak, T.; Povh, J.; Rupnik Poklukar, D.; Žerovnik, J. Multiple Hungarian method for k-assignment problem. Mathematics 2020, 8, 2050.
- Hu, J.; Szymczak, S. A review on longitudinal data analysis with random forest. Brief. Bioinform. 2023, 24, bbad002.
- Sarica, A.; Cerasa, A.; Quattrone, A. Random forest algorithm for the classification of neuroimaging data in Alzheimer’s disease: A systematic review. Front. Aging Neurosci. 2017, 9, 329.
- Musa, A.B. Comparative study on classification performance between support vector machine and logistic regression. Int. J. Mach. Learn. Cybern. 2013, 4, 13–24.
- Ben-Hur, A.; Weston, J. A User’s Guide to Support Vector Machines. In Methods in Molecular Biology; Clifton, N.J., Ed.; Humana Press: Totowa, NJ, USA, 2010; pp. 223–239.
- Gelman, A.; Jakulin, A.; Pittau, M.G.; Su, Y.S. A weakly informative default prior distribution for logistic and other regression models. Ann. Appl. Stat. 2008, 2, 1360–1383.
- Friedman, J.; Hastie, T.; Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010, 33, 1–22.
- Hu, L.Y.; Huang, M.W.; Ke, S.W.; Tsai, C.F. The distance function effect on k-nearest neighbor classification for medical datasets. Springerplus 2016, 5, 1304.
- Probst, P.; Wright, M.; Boulesteix, A.L. Hyperparameters and tuning strategies for random forest. arXiv 2018, arXiv:1804.03515.
- Lange, T.M.; Gültas, M.; Schmitt, A.O.; Heinrich, F. optRF: Optimising random forest stability by determining the optimal number of trees. BMC Bioinform. 2025, 26, 95.
- Zhou, S.; Mentch, L. Trees, forests, chickens, and eggs: When and why to prune trees in a random forest. arXiv 2021, arXiv:2103.16700.
- Nadi, A.; Moradi, H. Increasing the views and reducing the depth in random forest. Expert Syst. Appl. 2019, 138, 112801.
- Saito, T.; Rehmsmeier, M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 2015, 10, e0118432.
- Nembrini, S.; König, I.R.; Wright, M.N. The revival of the Gini importance? Bioinformatics 2018, 34, 3711–3718.
- Dunne, R.; Reguant, R.; Ramarao-Milne, P.; Szul, P.; Sng, L.M.F.; Lundberg, M.; Twine, N.A.; Bauer, D.C. Thresholding Gini variable importance with a single-trained random forest: An empirical Bayes approach. Comput. Struct. Biotechnol. J. 2023, 21, 4354–4360.
- Lötvall, J.; Akdis, C.A.; Bacharier, L.B.; Bjermer, L.; Casale, T.B.; Custovic, A.; Lemanske, R.F., Jr.; Wardlaw, A.J.; Wenzel, S.E.; Greenberger, P.A. Asthma endotypes: A new approach to classification of disease entities within the asthma syndrome. J. Allergy Clin. Immunol. 2011, 127, 355–360.
- Moore, W.C.; Meyers, D.A.; Wenzel, S.E.; Teague, W.G.; Li, H.; Li, X.; D’Agostino, R., Jr.; Castro, M.; Curran-Everett, D.; Fitzpatrick, A.M.; et al. Identification of asthma phenotypes using cluster analysis in the Severe Asthma Research Program. Am. J. Respir. Crit. Care Med. 2010, 181, 315–323.
- Haldar, P.; Pavord, I.D.; Shaw, D.E.; Berry, M.A.; Thomas, M.; Brightling, C.E.; Wardlaw, A.J.; Green, R.H. Cluster analysis and clinical asthma phenotypes. Am. J. Respir. Crit. Care Med. 2008, 178, 218–224.
- Loza, M.; Adcock, I.; Auffray, C.; Chung, K.F.; Djukanovic, R.; Sterk, P.; Susulic, V.; Barnathan, E.; Baribaud, F.; Silkoff, P. Longitudinally Stable, Clinically Defined Clusters of Patients with Asthma Independently Identified in the ADEPT and U-BIOPRED Asthma Studies. Ann. Am. Thorac. Soc. 2016, 13, S102–S103.
- Bousquet, J.; Anto, J.; Auffray, C.; Akdis, M.; Cambon-Thomsen, A.; Keil, T.; Haahtela, T.; Lambrecht, B.; Postma, D.; Sunyer, J.; et al. MeDALL (Mechanisms of the Development of ALLergy): An integrated approach from phenotypes to systems medicine. Allergy 2011, 66, 596–604.
- Guida, G.; Bertolini, F.; Carriero, V.; Levra, S.; Sprio, A.E.; Sciolla, M.; Orpheu, G.; Arrigo, E.; Pizzimenti, S.; Ciprandi, G.; et al. Reliability of total serum IgE levels to define type 2 high and low asthma phenotypes. J. Clin. Med. 2023, 12, 5447.
- Chuang, Y.C.; Tsai, H.H.; Lin, M.C.; Wu, C.C.; Lin, Y.C.; Wang, T.N. Cluster analysis of phenotypes, job exposure, and inflammatory patterns in elderly and nonelderly asthma patients. Allergol. Int. 2024, 73, 214–223.
- Ilmarinen, P.; Tuomisto, L.E.; Niemelä, O.; Tommola, M.; Haanpää, J.; Kankaanranta, H. Cluster analysis on longitudinal data of patients with adult-onset asthma. J. Allergy Clin. Immunol. Pract. 2017, 5, 967–978.e3.
- Carr, T.F.; Kraft, M. Use of biomarkers to identify phenotypes and endotypes of severe asthma. Ann. Allergy Asthma Immunol. 2018, 121, 414–420.
- Wu, W.; Bleecker, E.; Moore, W.; Busse, W.W.; Castro, M.; Chung, K.F.; Calhoun, W.J.; Erzurum, S.; Gaston, B.; Israel, E.; et al. Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data. J. Allergy Clin. Immunol. 2014, 133, 1280–1288.
- Sendín-Hernández, M.P.; Ávila-Zarza, C.; Sanz, C.; García-Sánchez, A.; Marcos-Vadillo, E.; Muñoz-Bellido, F.J.; Laffond, E.; Domingo, C.; Isidoro-García, M.; Dávila, I. Cluster analysis identifies 3 phenotypes within allergic asthma. J. Allergy Clin. Immunol. Pract. 2018, 6, 955–961.e1.
- Lee, E.; Hong, S.J. Phenotypes of allergic diseases in children and their application in clinical situations. Korean J. Pediatr. 2019, 62, 325–333.
- Loza, M.J.; Djukanovic, R.; Chung, K.F.; Horowitz, D.; Ma, K.; Branigan, P.; Barnathan, E.S.; Susulic, V.S.; Silkoff, P.E.; Sterk, P.J.; et al. Validated and longitudinally stable asthma phenotypes based on cluster analysis of the ADEPT study. Respir. Res. 2016, 17, 165.
- Nadif, R.; Febrissy, M.; Andrianjafimasy, M.V.; Le Moual, N.; Gormand, F.; Just, J.; Pin, I.; Siroux, V.; Matran, R.; Dumas, O.; et al. Endotypes identified by cluster analysis in asthmatics and non-asthmatics and their clinical characteristics at follow-up: The case-control EGEA study. BMJ Open Respir. Res. 2020, 7, e000632.
- Lee, Y.; Quoc, Q.L.; Park, H.S. Biomarkers for severe asthma: Lessons from longitudinal cohort studies. Allergy Asthma Immunol. Res. 2021, 13, 375–389.
- Denton, E.; Price, D.B.; Tran, T.N.; Canonica, G.W.; Menzies-Gow, A.; FitzGerald, J.M.; Sadatsafavi, M.; Perez de Llano, L.; Christoff, G.; Quinton, A.; et al. Cluster analysis of inflammatory biomarker expression in the International Severe Asthma Registry. J. Allergy Clin. Immunol. Pract. 2021, 9, 2680–2688.e7.








| Statistic | WBC (cells/μL) | Age (years) | Total IgE (kU/L) |
|---|---|---|---|
| Count | 300 | 300 | 300 |
| Mean | 9749.54 | 53.66 | 251.22 |
| Std. Dev. | 2506.91 | 13.53 | 72.21 |
| Minimum | 5113.87 | 22.00 | 84.25 |
| 25th Percentile | 7427.31 | 42.00 | 199.92 |
| Median (50%) | 9819.78 | 55.00 | 242.60 |
| 75th Percentile | 12,035.65 | 63.00 | 301.70 |
| Maximum | 13,939.69 | 86.00 | 459.01 |

| Method 1 | Method 2 | ARI | NMI | V-Measure | FMI |
|---|---|---|---|---|---|
| K-means | Agglomerative | 0.791 | 0.787 | 0.787 | 0.835 |
| K-means | GMM | 0.526 | 0.613 | 0.613 | 0.626 |
| K-means | Spectral | 0.899 | 0.883 | 0.883 | 0.920 |
| Agglomerative | GMM | 0.495 | 0.588 | 0.588 | 0.603 |
| Agglomerative | Spectral | 0.823 | 0.824 | 0.824 | 0.860 |
| GMM | Spectral | 0.543 | 0.630 | 0.630 | 0.639 |
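
The four pairwise agreement indices in the table above are all available in scikit-learn. The sketch below is illustrative: the helper `agreement` and the toy label vectors are hypothetical. It also shows why the NMI and V-Measure columns coincide: with scikit-learn's default arithmetic normalization, normalized mutual information is identical to the V-measure.

```python
from sklearn.metrics import (
    adjusted_rand_score,
    fowlkes_mallows_score,
    normalized_mutual_info_score,
    v_measure_score,
)


def agreement(labels_a, labels_b):
    """Pairwise agreement between two partitions of the same patients."""
    return {
        "ARI": adjusted_rand_score(labels_a, labels_b),
        "NMI": normalized_mutual_info_score(labels_a, labels_b),
        "V-Measure": v_measure_score(labels_a, labels_b),
        "FMI": fowlkes_mallows_score(labels_a, labels_b),
    }


# Toy partitions (synthetic labels).
p1 = [0, 0, 1, 1, 2, 2]
p2 = [1, 1, 0, 0, 2, 2]  # same grouping as p1 with permuted labels
p3 = [0, 1, 1, 1, 2, 0]  # partially different grouping

same = agreement(p1, p2)     # all four indices equal 1 up to rounding
partial = agreement(p1, p3)  # indices below 1; NMI equals V-Measure
```

All four indices are invariant to label permutation, so no label alignment is needed before computing them.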

| Method | Silhouette Score | Calinski–Harabasz Index | Davies–Bouldin Score |
|---|---|---|---|
| K-means | 0.406 | 190.00 | 0.900 |
| Agglomerative | 0.361 | 160.41 | 1.016 |
| GMM | 0.233 | 103.89 | 1.289 |
| Spectral | 0.398 | 182.57 | 0.936 |
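
The three internal indices in the table above, together with the silhouette distribution (median and tails) mentioned in the study aims, can be computed as sketched below. This is an illustration on synthetic blobs, not the study data; the helper `internal_indices` and the 5th/95th percentiles chosen to represent the tails are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_samples,
    silhouette_score,
)


def internal_indices(X, labels):
    """Label-free quality indices for one partition."""
    per_sample = silhouette_samples(X, labels)
    p5, median, p95 = np.percentile(per_sample, [5, 50, 95])
    return {
        "silhouette": silhouette_score(X, labels),           # higher is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),   # lower is better
        "silhouette_p5": p5,
        "silhouette_median": median,
        "silhouette_p95": p95,
    }


# Synthetic stand-in: 300 points, 3 features, 5 groups.
X_demo, _ = make_blobs(n_samples=300, centers=5, n_features=3, random_state=0)
labels_demo = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X_demo)
indices = internal_indices(X_demo, labels_demo)
```

Read this way, the table ranks K-means first on all three indices (highest Silhouette and Calinski–Harabasz, lowest Davies–Bouldin), with spectral clustering a close second.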

| Cluster | WBC (Mean) | Age (Mean) | Total IgE (Mean) | WBC (Std) | Age (Std) | Total IgE (Std) |
|---|---|---|---|---|---|---|
| 0 | 12,067.38 | 65.50 | 230.85 | 982.07 | 8.73 | 34.61 |
| 1 | 12,521.79 | 38.02 | 332.69 | 589.05 | 6.24 | 36.29 |
| 2 | 7323.33 | 61.25 | 316.75 | 1368.22 | 9.10 | 58.40 |
| 3 | 8584.14 | 41.60 | 256.26 | 1490.74 | 6.14 | 34.10 |
| 4 | 7313.74 | 58.70 | 152.12 | 1265.03 | 7.70 | 29.02 |
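
One simple way to place a new patient relative to the consensus profiles in the table above is a nearest-centroid rule. The sketch below is purely illustrative and is not the study's assignment procedure: it uses the reported cluster means as centroids and, as an assumption, scales each feature by the overall standard deviation from the descriptive statistics (2506.91, 13.53, 72.21) so that WBC does not dominate the distance. The function name `nearest_cluster` is hypothetical.

```python
import numpy as np

# Reported consensus-cluster means: WBC (cells/μL), age (years), total IgE (kU/L).
CENTROIDS = np.array(
    [
        [12067.38, 65.50, 230.85],  # cluster 0
        [12521.79, 38.02, 332.69],  # cluster 1
        [7323.33, 61.25, 316.75],   # cluster 2
        [8584.14, 41.60, 256.26],   # cluster 3
        [7313.74, 58.70, 152.12],   # cluster 4
    ]
)
# Assumption: scale by the overall standard deviation of each feature.
GLOBAL_STD = np.array([2506.91, 13.53, 72.21])


def nearest_cluster(wbc, age, total_ige):
    """Return the index of the closest centroid in scaled feature space."""
    x = np.array([wbc, age, total_ige], dtype=float)
    distances = np.linalg.norm((CENTROIDS - x) / GLOBAL_STD, axis=1)
    return int(distances.argmin())


young_high = nearest_cluster(12500, 38, 330)  # high WBC and IgE at a younger age
older_low = nearest_cluster(7300, 59, 150)    # low WBC and IgE at an older age
```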

| Cluster | Age Range (Years) | WBC (Cells/μL) | Total IgE (kU/L) | Clinical Profile | Recommended Clinical Management |
|---|---|---|---|---|---|
| 4 | 55–61 | 6345–8091 | 137–172 | Low systemic immune activity | Low priority |
| 3 | 39–46 | 7899–9769 | 232–272 | Intermediate / stable profile | Moderate priority |
| 2 | 55–66 | 6392–7978 | 266–376 | Predominantly allergic activity without marked systemic inflammation | Allergy-focused priority |
| 0 | 58–72 | 11,625–12,760 | 204–245 | Possible inflammatory burden | Inflammatory/comorbidity-focused priority |
| 1 | 35–42 | 12,017–12,888 | 310–335 | High immuno-allergic activity | High immuno-allergic priority |

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Palazzo, S.; Hazar, E.; Gokceoglu, A.U.; Zambetta, G.; Caldelli, R.; Loconsole, C. AI-Driven Clustering-Based Stratification of Allergic Patients Towards Smart Healthcare Systems in Southern Italy. Computers 2026, 15, 296. https://doi.org/10.3390/computers15050296

