Article

Integrating Risk Factors and Symptoms for Urinary Tract Infection Diagnosis Using an Explainable AI Approach in Low-Resource Regions

1
Department of Computer Science, Faculty of Computing, Ritman University, Ikot Ekpene 530101, Nigeria
2
Department of Information Systems, Faculty of Computing, University of Uyo, Uyo 520103, Nigeria
3
Department of Computing Sciences, Faculty of Science, Admiralty University of Nigeria, Ibusa 320103, Nigeria
4
Department of Computer Science, Faculty of Computing, University of Uyo, Uyo 520103, Nigeria
5
Department of Business Administration, Faculty of Social and Management Sciences, Ritman University, Ikot Ekpene 530101, Nigeria
6
Department of Computer Science, College of Science, Engineering and Technology, Texas Southern University, 3100 Cleburne St, Houston, TX 77004, USA
7
Novena Computers and Technologies Limited, Uyo 520103, Nigeria
8
Institute of Health Research and Development, University of Uyo Teaching Hospital, Uyo 520103, Nigeria
9
Department of Mathematics and Computing, Mount Royal University, Calgary, AB T3E 6K6, Canada
*
Author to whom correspondence should be addressed.
Information 2026, 17(5), 435; https://doi.org/10.3390/info17050435
Submission received: 6 March 2026 / Revised: 19 April 2026 / Accepted: 28 April 2026 / Published: 1 May 2026
(This article belongs to the Section Artificial Intelligence)

Abstract

Urinary Tract Infections (UTIs) represent one of the most prevalent bacterial infections globally, posing significant health burdens, especially in low- and middle-income countries (LMICs), due to delayed diagnoses, limited access to laboratory services, and rising antimicrobial resistance. This study presents a machine learning (ML)-based diagnostic support framework for early UTI detection, leveraging structured clinical data and explainable artificial intelligence (XAI) techniques to enhance interpretability and trust among healthcare providers. A patient dataset containing 4865 records was used to train and test Extreme Gradient Boosting (XGBoost), Decision Tree (DT), and Random Forest (RF) classifiers, while class imbalance was addressed using the Synthetic Minority Over-sampling Technique (SMOTE). Model performance was evaluated using accuracy, precision, recall, F1-score, Log Loss, and AUC-ROC, and RF showed the best results (accuracy: 86.43%, F1-score: 86.71%, AUC-ROC: 0.8695). To support adoption by stakeholders in the health sector, Local Interpretable Model-agnostic Explanations (LIME) were integrated, which identified painful urination, urinary frequency, and suprapubic pain as the primary predictors in the model. This study shows that interpretable ML models can help predict UTIs in resource-limited regions, thereby offering a means of improving the management of infections in these regions.


1. Introduction

Urinary Tract Infections (UTIs) are bacterial infections that affect individuals across all age groups and are among the most common infections globally. Women, older adults, and people with underlying medical conditions like diabetes and immunosuppression have a disproportionately higher burden of UTIs [1]. In paediatric populations, UTIs are a notable cause of febrile illness, particularly in infants and young children, with Escherichia coli responsible for more than 80% of reported cases [2]. Clinically, UTIs are commonly classified according to the anatomical site of infection as upper or lower UTIs and are further distinguished as uncomplicated or complicated based on patient-specific factors, such as age, sex, pregnancy status, and the presence of comorbidities. The clinical presentation of UTIs is highly variable, ranging from mild or asymptomatic bacteriuria to severe systemic manifestations. Common symptoms include increased urinary frequency, dysuria, haematuria, cloudy urine, abdominal pain, oliguria, fatigue, and fever [3,4]. Severe cases can lead to urosepsis and septic shock, which can be difficult to diagnose and treat. Real-world prescription patterns show persistent non-adherence to clear national and international clinical guidelines that discourage the indiscriminate use of broad-spectrum antibiotics, such as fluoroquinolones, for simple UTIs. This pervasive overprescription has compromised the long-term efficacy of commonly used antibiotics and has been closely associated with the rising incidence of antimicrobial resistance [5,6]. Nitrofurantoin is acknowledged as a potential alternative for some UTI cases, yet its clinical application remains restricted: in a number of countries, it is used only when other safer or more potent antibiotics or chemotherapeutic agents are judged inappropriate.
UTIs pose a serious clinical and public health challenge in low- and middle-income countries (LMICs). This problem is worsened by the rampant use of antibiotics and delays in commencing proper treatment. It is also aggravated by limited access to diagnostic laboratory facilities. These challenges contribute to the growing problem of antimicrobial resistance (AMR) in these settings [7]. Neugebauer et al. [8] noted that rising bacterial resistance patterns have made it harder to select the best empirical antibiotic therapy in routine clinical practice. Research shows that many antimicrobial prescriptions are written by medical professionals who lack specific training in infection control. Prescribing decisions can be negatively affected when non-expert prescribers lack a proper understanding of antimicrobial agents and resistance mechanisms [9]. This lack of knowledge often leads to poor antibiotic choices. There is a strong need to develop diagnostic support tools that are affordable and reliable. These tools must also be easy for healthcare workers to understand and use. These systems should include the contextual factors that influence prescribing behaviour at the point of care. They should also provide prompt and easily accessible guidance.
Machine learning (ML) is transforming healthcare by analysing clinical data and generating predictive insights from patient information [10,11]. These techniques can model complex non-linear relationships that traditional statistical methods often cannot handle. ML methods are now widely used to support clinical decision-making by using patient histories, symptoms, and risk factors to make disease predictions. Traditional supervised learning algorithms such as Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost) perform well in medical prediction tasks. These models are valued for their interpretability and robustness [12]. Also, they can handle class imbalance and noisy real-world data, making them suitable for healthcare settings that rely on structured clinical data. Previous studies have reported many applications of ML techniques in diagnosing UTIs. Naik et al. [13] reported that artificial neural networks (ANN) dominate applications in this domain, followed by XGBoost, SVM, CatBoost, and ensemble learning, including RF, DT, and logistic regression [14]. de-Vries et al. [15] revealed that integrating urinalysis, Gram stain, and routine clinical parameters improves UTI prediction via semi-supervised ensembles. Furthermore, Flores et al. [1] employed RF and ANN models to predict UTIs in a large cross-sectional study and reported that their deployment influenced clinician behaviour, including antibiotic prescribing patterns and requests for urine cultures. The suggested models produced area under the curve (AUC) values ranging from 0.81 to 0.88 using data from over 8000 cases. The model combinations produced satisfactory results.
These results imply that combining predictive models with contemporary urinalysis methods and digital health infrastructures may enhance antimicrobial stewardship initiatives, reduce laboratory workload, and improve diagnostic accuracy. Recent studies have incorporated Explainable Artificial Intelligence (XAI) techniques, such as Local Interpretable Model-Agnostic Explanations (LIME), as well as Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT), into conventional machine learning approaches to improve the transparency of febrile disease diagnosis [16,17]. While these studies demonstrate strong predictive performance, many focus on accuracy, with limited emphasis on interpretability, clinical usability, and deployment in resource-constrained environments [1,18,19]. Moreover, numerous studies underscore the significance of explainability in healthcare machine learning systems, illustrating that interpretable models can enhance clinician trust and facilitate safer decision-making [20,21,22]. Nonetheless, the utilisation of these methodologies in low-resource environments remains inadequately investigated, especially in contexts characterised by constrained computational resources and limited access to laboratory diagnostics [23,24].
In contrast, this study evaluates the performance of three machine learning algorithms, Random Forest, Decision Tree, and Extreme Gradient Boosting, for UTI diagnosis using patient symptoms and risk factors, with a particular emphasis on model interpretability and applicability in low-resource settings. By integrating explainable AI techniques and focusing on clinically relevant features, this work addresses the gap between high-performing predictive models and their practical adoption in real-world healthcare environments.
Although conventional ML models have well-established predictive ability, they are often regarded as black boxes, which reduces their chances of adoption in clinical settings because stakeholders in the healthcare sector often require justification for predictions [25]. Therefore, LIME is integrated in our study to generate feature-level explanations for individual model predictions, thereby mitigating this limitation [26]. These XAI explanations provide visual interpretations of how each feature influences a positive diagnosis of a medical condition, assisting healthcare professionals in interpreting and understanding model outputs [16,27,28]. This study also addressed class imbalance within the dataset using the Synthetic Minority Over-sampling Technique (SMOTE) to enhance classifier performance, reduce the risk of overfitting, and ensure balanced classification results. The main contributions of this work include:
(i) The development and evaluation of interpretable machine learning models for urinary tract infection (UTI) diagnosis using patient symptoms and risk factors in a low-resource context;
(ii) The application of the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance in the dataset, thereby improving model learning and robustness;
(iii) The integration of explainable artificial intelligence (XAI) techniques, specifically LIME, to provide feature-level explanations of model predictions and enhance interpretability for clinical use;
(iv) A comparative evaluation of multiple machine learning models using appropriate performance metrics to identify a suitable candidate for potential deployment in clinical decision support;
(v) A discussion of the potential applicability of the proposed framework in low-resource settings, highlighting its ability to support early UTI detection and assist clinical decision-making while promoting responsible antibiotic use.
The rest of the paper is organised as follows. Section 2 presents the methodology and experimental setup, describing and justifying the choice of predictive models used to assess UTI risk early and reduce unnecessary antibiotic use. It highlights the classical ML models with XAI for interpretability, as well as a description of the dataset and symptoms used for the study. Section 3 discusses the results, summarising the predictive performance of the models across all evaluation metrics, along with a confusion matrix and the LIME analysis to reveal the most influential features driving UTI prediction. Section 4 concludes the paper with directions for future work.

2. Materials and Methods

The workflow for UTI diagnosis is presented in Figure 1. It begins with the collection of input data, comprising risk factors and symptoms, followed by preprocessing to handle missing values and encode categorical features. SMOTE was then employed to handle class imbalance within the dataset, and the ML models were trained with hyperparameter optimisation using GridSearchCV from the scikit-learn library (version 1.7.0) to enhance performance. The models were then evaluated with standard performance metrics, and LIME was integrated to clarify feature-level contributions to individual predictions. In the decision-support stage, the model predictions and explanatory insights assist healthcare professionals in making informed and transparent diagnostic decisions.

2.1. Dataset and Preprocessing

The dataset used in this study was obtained from the New Frontiers in Research Fund (NFRF) project and initially comprised 4870 patient records. Data were collected prospectively during routine clinical practice by experienced physicians across both public and private hospitals in four states within the Niger Delta region of Nigeria (Akwa Ibom, Cross River, Imo, and Rivers State). This multi-center and multi-setting data collection approach enhances the diversity of the dataset and improves its representativeness of real-world clinical populations within the region. Patient information was recorded on clinic days using a structured questionnaire developed with Open Data Kit (ODK) and deployed on Android tablets. The dataset includes demographic characteristics (as summarized in Table 1), reported symptoms, associated risk factors, and confirmed diagnostic outcomes [29]. The demographic distribution shows representation across all age groups, including pediatric (<5 years), adolescent (5–19 years), adult (20–64 years), and elderly (≥65 years) populations, with a total of 2175 male and 2695 female participants, further supporting the dataset’s heterogeneity. Clinical diagnoses were established following appropriate laboratory investigations, and only laboratory-confirmed cases were entered into the system, ensuring diagnostic reliability. A 5-point Likert scale (absent, mild, moderate, severe, and very severe) was used to systematically capture the presence and severity of each symptom and risk factor, as well as the severity of the confirmed diagnosis. Data preprocessing was performed to ensure quality and relevance. Records with incomplete or missing critical information, particularly those related to symptoms and risk factors, were excluded, resulting in a final dataset of 4865 records. Additionally, variables not directly relevant to urinary tract infection (UTI) diagnosis, including febrile conditions outside the scope of this study, were removed. 
The final dataset retained only clinically relevant features, including key symptoms, associated risk factors, and the target variable (UTI diagnosis), as presented in Table 2. Despite the strengths of the dataset, several potential sources of bias should be acknowledged. First, selection bias may be present, as the data were collected from patients attending specific public and private healthcare facilities, which may not fully capture individuals who do not seek formal medical care. Second, the use of physician-assigned Likert scale ratings for symptom severity introduces the possibility of subjective interpretation and inter-observer variability. Third, although efforts were made to include diverse clinical settings, the dataset is geographically limited to the Niger Delta region, which may affect the generalizability of the findings to other regions with different epidemiological or socio-demographic characteristics. These factors may influence model performance and should be considered when interpreting the results.
The target labels used in this study were derived from laboratory-confirmed diagnoses of urinary tract infection (UTI). Specifically, cases labeled as UTI-positive were confirmed through laboratory testing, ensuring that the ground truth reflects objective diagnostic evidence. UTI-negative cases correspond to records without laboratory confirmation of infection. This approach ensures that the machine learning models are trained on reliable and clinically validated outcome labels.
The inclusion criteria for the dataset consisted of patient records with complete clinical information relevant to UTI assessment, while records with missing or ambiguous entries were excluded. However, this filtering process may introduce selection bias, as excluded records could systematically differ from those retained. Furthermore, the dataset reflects the population captured within the NFRF project, which may limit its generalizability to other regions or healthcare settings. The processed dataset exhibited class imbalance, with 961 UTI-positive cases and 3904 UTI-negative cases. To mitigate potential bias during model training, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to the training data, resulting in a balanced training set of 3129 positive and 3129 negative instances. While SMOTE improves model learning, it may also introduce synthetic patterns that do not fully capture real-world variability, which should be considered when interpreting the results.
To prevent data leakage, the dataset was first split into training and testing sets before applying any resampling techniques. The Synthetic Minority Over-sampling Technique (SMOTE) was applied exclusively to the training data to address class imbalance, while the test set was left untouched to ensure an unbiased evaluation of model performance.
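The ordering described above (split first, then oversample only the training partition) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's pipeline: `smote_like` and `split_then_balance` are hypothetical helper names, the toy data are randomly generated Likert-style codes (0 to 4), and the interpolation step only mimics the core idea of SMOTE rather than reproducing any library implementation.

```python
import random

def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic

def split_then_balance(X, y, test_fraction=0.2, seed=0):
    """Hold out a test set first, then oversample positives in the training set only."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]   # never touched by resampling
    y_test = [y[i] for i in test_idx]
    positives = [x for x, label in zip(X_train, y_train) if label == 1]
    n_negatives = len(y_train) - len(positives)
    new_rows = smote_like(positives, n_negatives - len(positives), seed=seed)
    return X_train + new_rows, y_train + [1] * len(new_rows), X_test, y_test

# Toy data (hypothetical): 200 records, 3 Likert-coded features, 40 positives.
data_rng = random.Random(42)
X = [[data_rng.randint(0, 4) for _ in range(3)] for _ in range(200)]
y = [1 if i % 5 == 0 else 0 for i in range(200)]
X_bal, y_bal, X_test, y_test = split_then_balance(X, y)
print(y_bal.count(1), y_bal.count(0), len(X_test))
```

Because the synthetic rows are interpolated, they can take non-integer values even when the original features are ordinal Likert codes, which is one concrete way synthetic patterns can differ from real records.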

2.2. Model Development, Performance Evaluation, and Explainability with LIME

RF, XGBoost, and DT algorithms were employed in this study due to their proven effectiveness in prediction tasks and their suitability for handling structured clinical data. The dataset was initially partitioned into training and testing sets using an 80:20 train–test split to enable unbiased evaluation of model performance. To enhance model robustness and generalizability, hyperparameter optimisation was performed using GridSearchCV with 5-fold cross-validation on the training data. This approach ensured that each candidate model was evaluated across multiple folds, thereby reducing the risk of overfitting and improving the stability of the selected hyperparameters. For the XGBoost model, the number of estimators ranged from 100 to 200, with maximum tree depths of 3 and 5. The Random Forest model utilised the same range of estimators with maximum depths of None and 10, while the Decision Tree model considered maximum depths of None, 10, and 20. The best-performing models were selected based on the average cross-validation accuracy. Following model selection, the optimised models were evaluated on the held-out test set to assess their generalisation performance. Model evaluation was conducted using standard classification metrics, including Accuracy, Precision, Recall, F1 Score, Logarithmic Loss (Log Loss), and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), given their relevance and importance in healthcare-related predictive modelling.
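The tuning procedure described above can be illustrated with a small, library-free sketch of exhaustive grid search with k-fold cross-validation. Several details are assumptions made for illustration: the grids treat 100 and 200 as the two candidate values for the number of estimators, `score_fn` is a hypothetical callback that would fit a model with the given parameters on the training indices and return validation accuracy, and the folds are contiguous rather than stratified (scikit-learn's GridSearchCV uses stratified folds for classifiers by default).

```python
from itertools import product

# Grids as described in the text; the exact candidate values are assumed.
PARAM_GRIDS = {
    "xgboost": {"n_estimators": [100, 200], "max_depth": [3, 5]},
    "random_forest": {"n_estimators": [100, 200], "max_depth": [None, 10]},
    "decision_tree": {"max_depth": [None, 10, 20]},
}

def candidates(grid):
    """Expand a parameter grid into a list of parameter dictionaries."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

def kfold_indices(n_rows, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, val_idx
        start += size

def grid_search(grid, score_fn, n_rows, k=5):
    """Return the parameters with the highest mean cross-validation score."""
    best_params, best_score = None, float("-inf")
    for params in candidates(grid):
        scores = [score_fn(params, tr, va) for tr, va in kfold_indices(n_rows, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

def dummy_score(params, train_idx, val_idx):
    """Stand-in scorer (hypothetical): pretends max_depth=10 validates best."""
    return 0.85 if params["max_depth"] == 10 else 0.80

n_fits = sum(len(candidates(g)) for g in PARAM_GRIDS.values()) * 5
best_dt, best_dt_score = grid_search(PARAM_GRIDS["decision_tree"], dummy_score, n_rows=100)
print(n_fits, best_dt)
```

Under these assumed grids, the search covers 4 + 4 + 3 = 11 candidate configurations, so 5-fold cross-validation requires 55 model fits in total before the selected models are refit and scored on the held-out test set.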
This study employed LIME to enhance the model’s prediction transparency and interpretability by quantifying the contribution of individual features to the outcome of the prediction, which is essential for improving the usability of the model. The model development, training, testing and evaluation steps were carried out using a cloud-based Jupyter notebook environment (Google Colaboratory).
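The idea behind LIME can be sketched without the library: perturb the instance of interest, query the model on the perturbed points, weight each point by its proximity to the original instance, and fit a weighted linear surrogate whose coefficients serve as local feature contributions. The code below is a simplified illustration, not the `lime` package's API or the study's configuration. The Gaussian perturbation, the kernel `exp(-d^2 / width^2)`, the function names, and the toy logistic "black box" are all assumptions.

```python
import math
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def lime_like(predict, instance, n_samples=500, scale=1.0, width=2.0, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.
    Returns one local coefficient per feature (intercept dropped)."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [v + rng.gauss(0.0, scale) for v in instance]   # perturbed copy
        dist2 = sum((a - b) ** 2 for a, b in zip(z, instance))
        rows.append([1.0] + z)                               # 1.0 = intercept column
        targets.append(predict(z))                           # black-box output
        weights.append(math.exp(-dist2 / width ** 2))        # closer points count more
    p = len(instance) + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y.
    A = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(p)]
         for i in range(p)]
    for i in range(p):
        A[i][i] += 1e-8                                      # tiny ridge for stability
    b = [sum(w * r[i] * t for r, t, w in zip(rows, targets, weights)) for i in range(p)]
    return solve(A, b)[1:]

def toy_model(x):
    """Hypothetical black box: probability rises with x[0] and x[1], falls with x[2]."""
    score = 1.5 * x[0] + 0.8 * x[1] - 0.5 * x[2] - 2.0
    return 1.0 / (1.0 + math.exp(-score))

coefs = lime_like(toy_model, instance=[1.0, 1.0, 1.0])
print([round(c, 3) for c in coefs])
```

In this toy setting, the surrogate assigns the largest positive weight to the first feature and a negative weight to the third, mirroring how a LIME plot separates features that push a prediction towards or away from the positive class. Tabular LIME implementations typically add steps omitted here, such as feature discretisation and feature selection.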

3. Results and Discussion

This section presents the performance evaluation results of the models, including the prediction performance of the SMOTE and non-SMOTE models, and the XAI method adopted in the study. The RF model was the most reliable, with a high number of true positives and true negatives and few misclassifications, as shown in Figure 2. Table 3 further presents RF’s strong precision (87.12%), indicating a low rate of false positives, together with high recall (86.43%), an F1 score of 86.71%, and an AUC-ROC value of 0.8695, which together highlight the model’s strong discriminative capability. Although a slight reduction in overall performance was noticed when the model was trained using SMOTE, this technique was retained to address potential class imbalance and promote model fairness. This consideration is particularly important in clinical settings, where the underrepresentation of positive UTI cases may lead to clinically significant false negatives.
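To make the relationship between these metrics and the confusion matrix concrete, the following sketch computes them from raw counts. The counts are hypothetical and are not taken from Figure 2 or Table 3. The second function illustrates a general identity: support-weighted recall across classes always equals accuracy, which would be consistent with the reported recall matching the reported accuracy (86.43%), although the averaging scheme used in Table 3 is not stated and is assumed here only for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # few false positives -> high precision
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # few false negatives -> high recall
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def weighted_recall(tp, fp, fn, tn):
    """Per-class recall averaged with class supports as weights."""
    n_pos, n_neg = tp + fn, tn + fp
    recall_pos = tp / n_pos
    recall_neg = tn / n_neg
    return (n_pos * recall_pos + n_neg * recall_neg) / (n_pos + n_neg)

# Hypothetical confusion-matrix counts, chosen only for easy arithmetic.
demo = binary_metrics(tp=80, fp=20, fn=10, tn=90)
demo_weighted_recall = weighted_recall(tp=80, fp=20, fn=10, tn=90)
print(demo, demo_weighted_recall)
```

With these illustrative counts, precision is 80/100, recall is 80/90, and the weighted recall reduces to (80 + 90)/200, the same value as accuracy.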
The XGBoost model (version 3.0.2) showed strong performance based on its confusion matrix (Figure 3). It achieved the lowest Log Loss (0.4566) among the three models. However, its F1 score (86.25%) and AUC-ROC (0.8307) were lower than those of RF, indicating slightly more misclassifications despite better probabilistic predictions. This shows that XGBoost is useful when accurate probability estimates are important. However, it does not consistently outperform RF in reliably detecting UTI cases.
The DT model performed worse than RF and XGBoost. It produced many misclassifications, both false negatives and false positives (Figure 4). Its accuracy, recall, and F1 score were all around 81%. A high Log Loss (5.5892) and low AUC-ROC (0.7263) indicate poorly calibrated probability estimates and weak class discrimination.
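A short worked example shows why Log Loss can differ sharply between models with similar accuracy. The sketch below is an illustration under assumptions and not an analysis of the study's models: the labels and probabilities are invented, and the clipping constant `eps` is an assumed value. It shows that a model emitting hard probabilities of exactly 0 or 1 (as a fully grown decision tree can) is penalised heavily for each error, whereas a model with softer probabilities and the same number of errors is not. Whether this explains the DT result reported here is not established by the source.

```python
import math

def log_loss(y_true, p_pos, eps=1e-15):
    """Mean negative log-likelihood of the true labels.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

y_true = [1, 1, 1, 1, 0]

# Both models make exactly one error at a 0.5 threshold (the fourth record).
hard_probs = [1.0, 1.0, 1.0, 0.0, 0.0]   # fully confident, even when wrong
soft_probs = [0.9, 0.9, 0.9, 0.4, 0.1]   # hedged probabilities

hard_loss = log_loss(y_true, hard_probs)
soft_loss = log_loss(y_true, soft_probs)
print(round(hard_loss, 4), round(soft_loss, 4))
```

Both toy models have 80% accuracy, yet the confident error alone contributes about 34.5 to the summed loss of the hard model, so its mean Log Loss is more than an order of magnitude larger.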
A critical examination of the model performance presented in Table 3 reveals that the Random Forest model achieved the highest overall performance under the SMOTE setting, followed by XGBoost, while the Decision Tree model demonstrated lower performance across all metrics, which may be attributed to its limited capacity to capture complex, non-linear relationships compared to ensemble-based approaches. These findings are consistent with existing studies on machine learning-based diagnosis of urinary tract infections and other infectious diseases, where ensemble models such as Random Forest and XGBoost often outperform simpler models due to their ability to aggregate multiple decision boundaries [16,17,20]. Nevertheless, variations in reported performance across studies can be influenced by differences in dataset size, feature representation, and preprocessing strategies, including the use of resampling techniques such as SMOTE. While SMOTE contributed to improved class balance in this study, it may also introduce synthetic patterns that do not fully reflect real-world data distributions.
The LIME plots in Figure 5, Figure 6 and Figure 7 show how different symptoms influence each model’s diagnoses across the test dataset, highlighting both positive and negative contributions. The LIME plot for the RF model in Figure 5 identifies painful urination (PNFLURNTN) as the most influential feature in predicting UTI. Other symptoms with strong positive contributions include urinary frequency (URNFQC), suprapubic pain (SPPBPN), and cloudy urine (CLDYURN). These symptoms significantly increase the likelihood of UTI predictions by the model. The results align with known clinical signs of UTI. This suggests that the RF model makes predictions based on medically meaningful patterns.
The XGBoost model in Figure 6 shows that painful urination (PNFLURNTN) is the most important predictor of UTI. Other key symptoms with positive influence are urinary frequency (URNFQC), suprapubic pain (SPPBPN), and cloudy urine (CLDYURN). The similarity in important features across models increases confidence in their predictions. Although XGBoost performs slightly lower than RF in overall metrics, it still identifies relevant clinical patterns. This indicates that the model’s predictions are medically meaningful.
The DT model in Figure 7 identifies painful urination (PNFLURNTN) and urinary frequency (URNFQC) as the most important features. However, it ranks the remaining symptoms differently from the other models. In this model, cloudy urine (CLDYURN) is considered more important than suprapubic pain (SPPBPN). Across all three models, bloody urine (BLDYURN) has a negative effect on UTI predictions. One possible explanation is that, in this dataset, bloody urine is more often linked to other urological conditions (e.g., kidney stones or bladder disorders) than to UTIs.
While the proposed models demonstrate strong predictive performance, it is important to note that the initial evaluation was conducted using a single train-test split without an external validation dataset or formal statistical significance testing. Although k-fold cross-validation was applied during the hyperparameter tuning stage through GridSearchCV, the final performance assessment was still based on a held-out test set, which may limit the ability to fully generalize the observed differences between models, particularly between Random Forest and XGBoost. Consequently, future studies should incorporate repeated cross-validation and external validation datasets, alongside statistical significance testing, to provide a more robust and reliable assessment of model stability and generalizability across different clinical populations and settings. In addition to predictive performance, the suitability of the proposed models for low-resource environments is enhanced by their relatively low computational complexity compared to deep learning architectures. The selected models are lightweight, requiring modest memory and processing resources, which makes them suitable for deployment on devices with limited computational capacity. Furthermore, since the framework relies on structured clinical data rather than high-dimensional or unstructured inputs, both storage and computational demands remain manageable. This increases the feasibility of deploying the proposed system in environments with constrained infrastructure, intermittent internet connectivity, and limited access to high-performance computing resources.
The performance of the Random Forest model in this study (Accuracy: 86.43%, AUC-ROC: 0.8695) is generally consistent with recent studies that have demonstrated the effectiveness of ensemble learning methods for urinary tract infection (UTI) prediction. For example, Farashi et al. [18] reported that an ensemble model combining XGBoost, decision tree, and Light Gradient Boosting Machines with a voting strategy achieved an accuracy of 85.64% and an AUC of 88.53%, which closely aligns with the performance obtained in this study. By comparison, Favresse et al. [30] reported higher predictive performance using a CatBoost classifier, achieving an AUC ranging from 92.0% to 94.7% and an average precision between 68.2% and 81.6%, suggesting that more advanced gradient boosting approaches applied to larger and more heterogeneous clinical datasets may yield stronger discriminative capability. However, differences in performance across these studies can be attributed to variations in dataset characteristics, feature engineering strategies, and model complexity. In particular, the use of structured symptom- and risk factor–based data in this study provides a more constrained but clinically interpretable feature space, whereas Favresse et al. [30] benefited from a larger and more diverse dataset, which may partly explain their higher AUC values. Additionally, the incorporation of SMOTE-based class balancing in this study improved minority class representation and contributed to model stability, while differences in outcome definitions and preprocessing pipelines across studies further influence comparative performance. Furthermore, the application of k-fold cross-validation during hyperparameter tuning enhances the robustness and generalizability of the proposed model, reducing the likelihood of overfitting compared to approaches relying solely on single train-test splits.
Overall, these findings align with existing literature emphasizing that predictive performance in clinical machine learning is strongly influenced by dataset composition, feature representation, algorithm selection, and validation methodology.
Figure 5, Figure 6 and Figure 7 present LIME-based explanations that illustrate the contribution of individual features to specific predictions at the instance level. To complement this local interpretability, a global feature importance analysis was conducted to capture overall model behaviour across the dataset. The results show that painful urination, urinary frequency, and suprapubic pain are consistently the most influential predictors, aligning with LIME-based findings. This agreement between local and global explanations enhances confidence in the robustness and clinical relevance of the model.
Beyond interpretability, LIME provides meaningful clinical value by translating model outputs into understandable reasoning that can support decision-making in low-resource settings where laboratory confirmation is often unavailable. For example, in cases classified as high risk for urinary tract infection, LIME highlights key contributing symptoms such as urinary frequency and painful urination, enabling clinicians to validate model reasoning against observed clinical findings. This supports more informed decisions regarding empirical treatment, further diagnostic testing, or referral, particularly in primary healthcare settings where non-specialist providers operate under high workload constraints. By making prediction rationale explicit, LIME enhances trust, reduces diagnostic uncertainty, and strengthens the role of the model as a clinical decision-support tool.
However, the effectiveness of such explanations depends on their usability in practice. While LIME improves transparency, its outputs may still require minimal familiarity with machine learning concepts. Therefore, integration into simple clinical interfaces such as mobile health applications or point-of-care dashboards is essential. In addition, alternative techniques such as SHapley Additive exPlanations (SHAP) or rule-based models may offer more intuitive or globally consistent interpretability, and should be explored in future work. While Random Forest achieved the best predictive performance, its ensemble structure limits inherent interpretability compared to simpler models such as decision trees. Although LIME partially addresses this limitation, a trade-off remains between performance and transparency. In low-resource clinical environments, model selection should therefore balance accuracy with interpretability and usability. The practical deployment of the proposed framework through mHealth systems could further support triage and early diagnosis, although challenges such as infrastructure limitations, digital literacy, and user trust must be addressed.
While this study focuses on individual-level diagnosis using structured clinical data, machine learning applications extend beyond point-of-care decision support to broader public health contexts. Recent work on privacy-enhancing digital contact tracing demonstrates how machine learning can support epidemic control while preserving sensitive user data through privacy-preserving techniques and distributed frameworks [31]. Similarly, automated contact tracing models have been developed to estimate infection risk based on transmission-related parameters, enabling real-time identification and prioritisation of high-risk individuals during disease outbreaks [32]. These developments highlight the potential of ML systems to scale from individual diagnosis to population-level health management, emphasizing real-time responsiveness, scalability, and privacy-aware deployment. In this context, the proposed model can be viewed as a component within a broader ecosystem of intelligent healthcare solutions.
In addition to these practical considerations, several ethical implications must be addressed when deploying machine learning-based diagnostic tools, particularly in LMICs. One key concern is decision accountability, as such systems should function strictly as decision-support tools rather than replacements for clinical expertise. Ensuring that healthcare providers retain ultimate responsibility for patient care is essential to prevent over-reliance on automated predictions. Furthermore, potential biases in model predictions must be carefully considered. The use of a dataset derived from a specific population, along with preprocessing steps such as data filtering and synthetic balancing techniques, may introduce biases that affect model generalizability and fairness. If not properly addressed, these biases could result in unequal diagnostic performance across patient groups. Patient data privacy is another critical concern, particularly in resource-constrained settings where data governance frameworks may be less robust. Safeguards such as data anonymisation, secure storage, and controlled access mechanisms are necessary to protect sensitive health information, alongside adherence to ethical standards and regulatory guidelines.
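One simple way to check for unequal diagnostic performance across patient groups is to compute recall (sensitivity) separately for each subgroup, for example by age band. The sketch below is a minimal, self-contained illustration under stated assumptions: the function `recall_by_group`, the label encoding (1 for UTI, 0 for no UTI), and the records are hypothetical and are not drawn from the study dataset.

```python
from collections import defaultdict


def recall_by_group(records):
    """Return recall (sensitivity) per group from (group, y_true, y_pred) triples.

    Labels are assumed to be 1 for UTI and 0 for no UTI.
    Groups without any positive cases are reported as None.
    """
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    groups = set()
    for group, y_true, y_pred in records:
        groups.add(group)
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                true_pos[group] += 1
    return {
        g: (true_pos[g] / positives[g] if positives[g] else None)
        for g in sorted(groups)
    }


# Made-up records for illustration; not taken from the study dataset.
records = [
    ("<5", 1, 1), ("<5", 1, 0), ("<5", 0, 0),
    ("20-64", 1, 1), ("20-64", 1, 1), ("20-64", 1, 1), ("20-64", 1, 0),
    (">=65", 0, 0),
]
group_recall = recall_by_group(records)
print(group_recall)
```

Large gaps in recall between subgroups would suggest that the model misses infections more often in some populations, which is the kind of disparity that should be examined before deployment.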
While the proposed model demonstrates effectiveness using structured symptom and risk factor data, its clinical utility could be further enhanced through the integration of additional data modalities. Incorporating longitudinal patient history, including prior UTI occurrences, comorbidities, and treatment outcomes, may enable more personalised and context-aware predictions. Additionally, the increasing availability of wearable health technologies presents an opportunity to integrate continuous physiological data for early detection and monitoring of infection progression. Although such multimodal integration would require more advanced data fusion techniques and robust data management frameworks, it holds significant potential for improving model performance and clinical relevance.
However, a key limitation of this study is the absence of external validation using independent datasets from different geographical regions or healthcare settings. This limitation restricts the ability to fully assess the generalizability and robustness of the model, particularly when deployed in diverse real-world clinical environments. As such, the current findings should be interpreted with caution in terms of broader clinical applicability. Future work should therefore prioritise external validation on independent datasets, as well as the development of scalable and interpretable architectures capable of handling heterogeneous data sources, particularly in low-resource settings.

4. Conclusions

This study demonstrates the potential of machine learning models, particularly Random Forest and XGBoost, for supporting the diagnosis of urinary tract infections using patient symptoms and risk factors in settings where laboratory diagnostics may be limited. The integration of explainable artificial intelligence techniques, such as LIME, enhances the transparency of model predictions by identifying clinically relevant features, including painful urination and urinary frequency, thereby supporting interpretability and clinical reasoning. Although the Random Forest model achieved the best overall performance, the relatively small differences between models suggest that multiple approaches may offer comparable predictive capabilities depending on the clinical context. Importantly, the use of structured clinical data and computationally efficient models highlights the feasibility of deploying such approaches in low-resource environments, where rapid and accessible decision-support tools are critically needed. However, a key limitation of this study is the absence of external validation using independent datasets, which restricts the immediate generalizability and clinical applicability of the proposed models. While internal validation strategies, including cross-validation during model development and evaluation on a held-out test set, were employed to enhance robustness, these approaches cannot fully substitute for validation across diverse populations and real-world clinical settings. Therefore, the findings of this study should be interpreted as preliminary and exploratory, serving as a foundation for further validation rather than as definitive evidence for clinical deployment. In addition, the use of synthetic data balancing techniques may have influenced model behaviour and should be carefully considered when interpreting the results. 
Future work will prioritize external validation using multi-center and geographically diverse datasets, as well as prospective clinical evaluation to assess real-world performance, reliability, and usability. Further research will also explore advanced explainability methods and integration into digital health platforms to support practical implementation. By addressing these limitations, subsequent studies can enhance the robustness, generalizability, and translational impact of machine learning-based diagnostic systems in healthcare.

Author Contributions

Conceptualization, F.-M.U., O.O. and K.A. (Kingsley Attai); methodology, K.A. (Kingsley Attai), K.A. (Kingsley Akputu), D.A., O.O., C.T. and F.-M.U.; validation, F.-M.U., K.A. (Kingsley Attai), D.A. and K.A. (Kingsley Akputu); formal analysis, K.A. (Kingsley Attai), D.A., C.T., F.-V.U. and E.A.; data curation, K.A. (Kingsley Attai), K.A. (Kingsley Akputu) and E.A.; writing—original draft preparation, K.A. (Kingsley Attai), E.A., K.A. (Kingsley Akputu), D.A., F.-V.U., C.A. and F.-M.U.; writing—review and editing, D.A., O.O., K.A. (Kingsley Attai), C.A. and F.-M.U.; supervision, F.-M.U., O.O., D.A., K.A. (Kingsley Attai) and K.A. (Kingsley Akputu); project administration, O.O. and F.-M.U.; funding acquisition, F.-M.U. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the New Frontiers in Research Fund (NFRF) under grant number NFRFE-2019-01365, from April 2020 to March 2024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent for participation was obtained from all subjects involved in the study.

Data Availability Statement

The dataset has been deposited in the Zenodo repository (https://doi.org/10.5281/zenodo.13756418, 5 May 2025). However, due to an ongoing study and ethical considerations, the dataset is currently under restricted access and is not publicly downloadable. The data are available from the corresponding author upon reasonable request and with permission from the relevant authorities.

Conflicts of Interest

Author Ekerette Attai was employed by the company Novena Computers and Technologies Limited. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Flores, E.; Martínez-Racaj, L.; Blasco, A.; Diaz, E.; Esteban, P.; Lopez-Garrigos, M.; Salinas, M. A step forward in the diagnosis of urinary tract infections: From machine learning to clinical practice. Comput. Struct. Biotechnol. J. 2024, 24, 533–541.
  2. Karas, D.R.; Upadhyayula, S.; Love, A.; Bigham, M.T. Utilising clinical decision support in the treatment of urinary tract infection across a large pediatric primary care network. Pediatr. Qual. Saf. 2023, 8, e655.
  3. Asuquo, D.; Attai, K.; Obot, O.; Ekpenyong, M.; Akwaowo, C.; Arnold, K.; Uzoka, F.-M. Febrile disease modeling and diagnosis system for optimizing medical decisions in resource-scarce settings. Clin. eHealth 2024, 7, 52–76.
  4. Ramsay, J.A.; Mascaro, S.; Campbell, A.J.; Foley, D.A.; Mace, A.O.; Ingram, P.; Borland, M.L.; Blyth, C.C.; Larkins, N.G.; Robertson, T.; et al. Urinary tract infections in children: Building a causal model-based decision support tool for diagnosis with domain knowledge and prospective data. BMC Med. Res. Methodol. 2022, 22, 218.
  5. Flores-Mireles, A.L.; Walker, J.N.; Caparon, M.; Hultgren, S.J. Urinary tract infections: Epidemiology, mechanisms of infection and treatment options. Nat. Rev. Microbiol. 2015, 13, 269–284.
  6. Delory, T.; Le Bel, J.; Lariven, S.; Peiffer-Smadja, N.; Lescure, F.-X.; Bouvet, E.; Jeanmougin, P.; Tubach, F.; Boelle, P.-Y. Computerized decision support system use for surveillance of antimicrobial resistance in urinary tract infections in primary care. J. Antimicrob. Chemother. 2022, 77, 524–530.
  7. Von Vietinghoff, S.; Shevchuk, O.; Dobrindt, U.; Engel, D.R.; Jorch, S.K.; Kurts, C.; Wagenlehner, F. The global burden of antimicrobial resistance—Urinary tract infections. Nephrol. Dial. Transplant. 2024, 39, 581–588.
  8. Neugebauer, M.; Ebert, M.; Vogelmann, R. Clinical decision support system improves antibiotic therapy for upper urinary tract infection in a randomized single-blinded study. BMC Health Serv. Res. 2020, 20, 185.
  9. Rawson, T.M.; Moore, L.S.P.; Hernandez, B.; Charani, E.; Castro-Sanchez, E.; Herrero, P.; Hayhoe, B.; Hope, W.; Georgiou, P.; Holmes, A.H. A systematic review of clinical decision support systems for antimicrobial management: Are we failing to investigate these interventions appropriately? Clin. Microbiol. Infect. 2017, 23, 524–532.
  10. Asuquo, D.; Umoren, I.; Osang, F.; Attai, K. A machine learning framework for length of stay minimization in healthcare emergency department studies. Stud. Eng. Technol. 2023, 10, 1–17.
  11. Puli, S.K.; Usha, P. Transforming healthcare: Advancements, applications, and future directions of machine learning. In Proceedings of the 2024 10th International Conference on Smart Computing and Communication (ICSCC); IEEE: New York, NY, USA, 2024; pp. 502–506.
  12. Devi, A.; Raj, T.N. Enhanced heart disease prediction through optimized ensemble random forest model. In Proceedings of the 2024 4th International Conference on Sustainable Expert Systems (ICSES); IEEE: New York, NY, USA, 2024; pp. 1702–1707.
  13. Naik, N.; Talyshinskii, A.; Shetty, D.K.; Hameed, B.M.Z.; Zhankina, R.; Somani, B.K. Smart diagnosis of urinary tract infections: Is artificial intelligence the fast-lane solution? Curr. Urol. Rep. 2024, 25, 37–47.
  14. Jeng, S.L.; Huang, Z.J.; Yang, D.C.; Teng, C.H.; Wang, M.C. Machine learning to predict the development of recurrent urinary tract infection related to single uropathogen, Escherichia coli. Sci. Rep. 2022, 12, 17216.
  15. de-Vries, S.; Ten Doesschate, T.; Totté, J.E.; Heutz, J.W.; Loeffen, Y.G.; Oosterheert, J.J.; Boel, E. A semi-supervised decision support system to facilitate antibiotic stewardship for urinary tract infections. Comput. Biol. Med. 2022, 146, 105621.
  16. Attai, K.; Ekpenyong, M.; Amannah, C.; Asuquo, D.; Ajuga, P.; Obot, O.; Johnson, E.; John, A.; Maduka, O.; Akwaowo, C.; et al. Enhancing the interpretability of malaria and typhoid diagnosis with explainable AI and large language models. Trop. Med. Infect. Dis. 2024, 9, 216.
  17. Attai, K.F.; Amannah, C.; Ekpenyong, M.E.; Asuquo, D.E.; Akputu, O.K.; Obot, O.U.; Ajuga, P.C.; Obi, J.C.; Maduka, O.; Akwaowo, C.; et al. Developing an explainable artificial intelligence system for the mobile-based diagnosis of febrile diseases using random forest, LIME, and GPT. Healthc. Inform. Res. 2025, 31, 125–135.
  18. Farashi, S.; Momtaz, H.E. Prediction of urinary tract infection using machine learning methods: A study for finding the most-informative variables. BMC Med. Inform. Decis. Mak. 2025, 25, 13.
  19. Yen, C.-C.; Ma, C.-Y.; Tsai, Y.-C. Interpretable Machine Learning Models for Predicting Critical Outcomes in Patients with Suspected Urinary Tract Infection with Positive Urine Culture. Diagnostics 2024, 14, 1974.
  20. Amannah, C.; Attai, K.F.; Uzoka, F.-M. A Data-Driven Intelligent Methodology for Developing Explainable Diagnostic Model for Febrile Diseases. Algorithms 2025, 18, 190.
  21. Attai, K.; Akwaowo, C.; Asuquo, D.; Esubok, N.E.; Nelson, U.A.; Dan, E.; Obot, O.; Amannah, C.; Uzoka, F.M. Explainable AI modelling of comorbidity in pregnant women and children with tropical febrile conditions. In Proceedings of the International Conference on Artificial Intelligence and Its Applications; OJS/PKP: Burnaby, BC, Canada, 2023; pp. 152–159.
  22. Markus, A.F.; Kors, J.A.; Rijnbeek, P.R. The role of explainability in creating trustworthy artificial intelligence for healthcare: A comprehensive survey of the terminology, design choices, and evaluation strategies. J. Biomed. Inform. 2021, 113, 103655.
  23. Adekoya, A.; Okezue, M.A.; Menon, K. Medical Laboratories in Healthcare Delivery: A Systematic Review of Their Roles and Impact. Laboratories 2025, 2, 8.
  24. Han, G.R.; Goncharov, A.; Eryilmaz, M.; Ye, S.; Palanisamy, B.; Ghosh, R.; Lisi, F.; Rogers, E.; Guzman, D.; Yigci, D.; et al. Machine learning in point-of-care testing: Innovations, challenges, and opportunities. Nat. Commun. 2025, 16, 3165.
  25. Rasheed, K.; Qayyum, A.; Ghaly, M.; Al-Fuqaha, A.; Razi, A.; Qadir, J. Explainable, trustworthy, and ethical machine learning for healthcare: A survey. Comput. Biol. Med. 2022, 149, 106043.
  26. Tan, L.; Huang, C.; Yao, X. A concept-based local interpretable model-agnostic explanation approach for deep neural networks in image classification. In Proceedings of the International Conference on Intelligent Information Processing; Springer Nature: Cham, Switzerland, 2024; pp. 119–133.
  27. Das, N.; Topalovic, M.; Raskin, J.; Aerts, J.M.; Troosters, T.; Janssens, W. Explaining predictions of an automated pulmonary function test interpretation algorithm. Eur. Respir. J. 2019, 54, PA2227.
  28. Kumar-Akulasinghe, N.B.; Blomberg, T.; Liu, J.; Leao, A.S.; Papapetrou, P. Evaluating local interpretable model-agnostic explanations on clinical machine learning classification models. In Proceedings of the 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS); IEEE: New York, NY, USA, 2020; pp. 7–12.
  29. NFRF; University of Uyo Teaching Hospital; Mount Royal University. NFRF Project Patient Dataset with Febrile Diseases [Data Set]. Zenodo 2024.
  30. Favresse, J.; Cabo, J.; Bosse, M.; Lardinois, B.; Cadrobbi, J.; Laffineur, K.; Elsen, M.; Douxfils, J.; Roelandts, L.; De Bruyne, S. Machine learning algorithms for predicting urinary tract infections: Integration of demographic data and dipstick reflectance results. Clin. Chem. 2025, 71, 1083–1094.
  31. Hang, C.-N.; Tsai, Y.-Z.; Yu, P.-D.; Chen, J.; Tan, C.-W. Privacy-Enhancing Digital Contact Tracing with Machine Learning for Pandemic Response: A Comprehensive Review. Big Data Cogn. Comput. 2023, 7, 108.
  32. Aklah, Z.; Al-Safi, A.; Abdali, M.H.; Al-jabery, K. A Machine Learning Model for Automated Contact Tracing during Disease Outbreaks. Healthc. Anal. 2025, 7, 100389.
Figure 1. Study workflow.
Figure 2. Random Forest Confusion Matrix.
Figure 3. XGBoost Confusion Matrix.
Figure 4. Decision Tree Confusion Matrix.
Figure 5. Random Forest LIME Diagram. Green bars indicate features that support the predicted class, while red bars indicate features that contradict the predicted class.
Figure 6. XGBoost LIME Diagram. Green bars indicate features that support the predicted class, while red bars indicate features that contradict the predicted class.
Figure 7. Decision Tree LIME Diagram. Green bars indicate features that support the predicted class, while red bars indicate features that contradict the predicted class.
Table 1. Statistics of patients in the dataset.
Patient Age (Years) | Male | Female | Total
<5 | 534 | 419 | 953
5–12 | 346 | 323 | 669
13–19 | 150 | 213 | 363
20–64 | 1012 | 1605 | 2617
≥65 | 133 | 135 | 268
Total | 2175 | 2695 | 4870
Table 2. Symptoms and risk factors used in the study.
SN | Symptom/Risk Factor | SN | Symptom/Risk Factor
1 | Abdominal pains (ABDPN) | 11 | Urinary frequency (URNFQC)
2 | Bloody urine (BLDYURN) | 12 | Vomiting (VMT)
3 | Chills and rigors (CHLNRIG) | 13 | High Blood Pressure (HIBP)
4 | Cloudy urine (CLDYURN) | 14 | High Cholesterol Level (HICOLV)
5 | Fatigue (FTG) | 15 | Poor Personal Hygiene (PPHYG)
6 | Fever (FVR) | 16 | Intravenous Drug Use (IVNDRUS)
7 | Nausea (NUS) | 17 | Skin Puncture (SKPUPR)
8 | Upper back pain (UPBCKPN) | 18 | Low Fluid Intake (LWFLIN)
9 | Painful urination (PNFLURNTN) | 19 | Underlying Chronic Illness (UNCHRIL)
10 | Suprapubic pains (SPPBPN) | |
Table 3. Prediction model performance.
Model | Accuracy | Precision | Recall | F1 Score | Log Loss | AUC-ROC
SMOTE Models
Random Forest | 0.8643 | 0.8712 | 0.8643 | 0.8671 | 0.6057 | 0.8695
XGBoost | 0.8623 | 0.8628 | 0.8623 | 0.8625 | 0.4566 | 0.8307
Decision Tree | 0.8129 | 0.8282 | 0.8129 | 0.8191 | 5.5892 | 0.7263
Non-SMOTE Models
Random Forest | 0.8849 | 0.8808 | 0.8849 | 0.8750 | 0.3117 | 0.8833
XGBoost | 0.8869 | 0.8816 | 0.8869 | 0.8802 | 0.3223 | 0.8835
Decision Tree | 0.8674 | 0.8594 | 0.8674 | 0.8558 | 2.3150 | 0.7124
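In Table 3, the Accuracy and Recall columns coincide in every row. This is what support-weighted averaging of per-class recall produces, although the averaging mode is not stated in this part of the text and is therefore an assumption here. The following minimal Python sketch shows the relationship on a hypothetical confusion matrix; the function name `weighted_metrics` and the counts are illustrative and are not taken from Figures 2–4.

```python
def weighted_metrics(confusion):
    """Compute accuracy and support-weighted precision, recall and F1.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Support-weighted averaging is an ASSUMPTION
    about how the table values were aggregated.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    precision = recall = f1 = 0.0
    for i in range(n_classes):
        tp = confusion[i][i]
        support = sum(confusion[i])  # samples that truly belong to class i
        predicted = sum(confusion[r][i] for r in range(n_classes))  # predicted as i
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total  # class share of all samples
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical binary confusion matrix; NOT taken from the study's figures.
metrics = weighted_metrics([[80, 5], [10, 5]])
print(metrics)
```

Because each class's recall is weighted by its share of the samples, the weighted sum reduces to the fraction of correctly classified samples, so weighted recall and accuracy agree up to floating-point rounding. Per-class values would be needed to judge sensitivity for the UTI class specifically.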
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Attai, K.; Asuquo, D.; Akputu, K.; Obot, O.; Thomas, C.; Uzoka, F.-V.; Attai, E.; Akwaowo, C.; Uzoka, F.-M. Integrating Risk Factors and Symptoms for Urinary Tract Infection Diagnosis Using an Explainable AI Approach in Low-Resource Regions. Information 2026, 17, 435. https://doi.org/10.3390/info17050435

Chicago/Turabian Style

Attai, Kingsley, Daniel Asuquo, Kingsley Akputu, Okure Obot, Cornelia Thomas, Faith-Valentine Uzoka, Ekerette Attai, Christie Akwaowo, and Faith-Michael Uzoka. 2026. "Integrating Risk Factors and Symptoms for Urinary Tract Infection Diagnosis Using an Explainable AI Approach in Low-Resource Regions" Information 17, no. 5: 435. https://doi.org/10.3390/info17050435
