Leveraging Artificial Intelligence to Predict Potential TB Hotspots at the Community Level in Bangui, Republic of Central Africa
Abstract
1. Introduction
2. Materials and Methods
2.1. Setting
2.2. Data Collection
2.3. Data Analysis
2.3.1. Integration of Relevant Variables
2.3.2. Incorporation of Program Data
2.3.3. Application of Bayesian Model and Prediction Analysis
2.3.4. Model Training
$$p(X = x \mid Y = y) = \frac{p(X = x, Y = y)}{p(Y = y)} = \frac{p(X = x)\, p(Y = y \mid X = x)}{p(Y = y)}$$

where:
- p(X = x|Y = y) is the posterior probability;
- p(X = x, Y = y) is the joint probability;
- p(Y = y) is the marginal probability of Y;
- p(X = x) is the prior probability of X;
- p(Y = y|X = x) is the likelihood.
- Defining the Prior p(X = x): Facility-level TB notification data at a 100 m × 100 m resolution were used to determine the probability that each tile’s TB rate (X) falls within each specific percentile range. This provided the baseline TB risk for each tile without incorporating additional contextual data.
- Establishing the Likelihood p(Y = y|X = x): The likelihood specifies how probable it is to observe a particular covariate pattern (Y = y), given that TB status (or rate range) is x. Observed notification data were used to determine how socio-environmental factors correlate with TB notification rates. These probabilities are encoded in Conditional Probability Tables (CPTs). For each TB notification rate interpercentile range, the CPT indicates the likelihood of each combined set of covariate states.
- Combining Prior and Likelihood: For each 100 m × 100 m tile, the software retrieves its covariate profile y. Using the CPT, it obtains p(Y = y|X = x), reflecting the probability of those covariate ranges if TB state x is observed. The product p(X = x) × p(Y = y|X = x) gives an unnormalized posterior, a measure of how likely it is that a tile’s TB notification rate falls in a particular percentile range x, given the observed covariates.
- Normalizing to Obtain the Posterior: To ensure probabilities sum to 1, the algorithm divides by the marginal probability p(Y = y), which is the sum of all prior-likelihood products over possible TB states. After normalization, the posterior p(X = x|Y = y) denotes the updated probability of each TB rate range, having accounted for both local covariates and baseline TB rates.
- Inference at High Resolution: This procedure is repeated for each tile in the study area. The posterior distribution consists of the probabilities assigned to each TB notification interpercentile range. The median (50th percentile) of the posterior distribution is used as the predicted TB positivity rate, with 95% credible intervals reflecting uncertainty.
- Calculating Predicted TB Burden: Once the predicted TB positivity rate (the posterior median) is determined for each tile, multiplying that rate by the tile’s population yields the tile’s absolute TB burden. Summing the tile-level burdens provides an estimate of the total TB burden across the city.
- Software Implementation: The trained model is queried at high spatial resolution using proprietary Python-based inference software. Each tile’s covariates are fed into the CPTs, and the resulting posterior distribution informs local TB burden estimates.
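The steps above can be illustrated with a minimal sketch in plain Python. It is not the proprietary inference software described in the text. The three TB states, their representative rates, the prior, the CPT entries, and the tile list are all hypothetical values chosen for illustration, and the function names (`posterior`, `median_rate`, `city_burden`) are assumptions. Credible intervals are omitted for brevity.

```python
from itertools import accumulate

# All numbers below are made up for illustration; none come from the study.
RATE_BINS = [0.02, 0.06, 0.12]  # representative positivity rate for each TB state x
PRIOR = [0.5, 0.3, 0.2]         # p(X = x), one entry per TB state

# CPT[x][y] = p(Y = y | X = x); each row sums to 1 over covariate profiles y.
CPT = [
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]


def posterior(prior, cpt, y):
    """Return p(X = x | Y = y) for every TB state x."""
    # Unnormalized posterior: prior times likelihood for the observed profile y.
    unnormalized = [p * row[y] for p, row in zip(prior, cpt)]
    evidence = sum(unnormalized)  # marginal probability p(Y = y)
    return [u / evidence for u in unnormalized]


def median_rate(post, rates):
    """Return the rate of the first state whose cumulative posterior reaches 0.5."""
    for cumulative, rate in zip(accumulate(post), rates):
        if cumulative >= 0.5:
            return rate
    return rates[-1]  # guard against floating-point rounding


def city_burden(tiles):
    """Sum population times predicted rate over (covariate_profile, population) tiles."""
    total = 0.0
    for y, population in tiles:
        post = posterior(PRIOR, CPT, y)
        total += population * median_rate(post, RATE_BINS)
    return total


# Hypothetical tiles: (index of covariate profile y, tile population).
tiles = [(0, 120), (2, 300), (1, 80)]
print(round(city_burden(tiles), 2))
```

In this sketch each covariate profile is reduced to a single column index of the CPT. A real model would map the combined states of many covariates to that index.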
2.4. Model Evaluation
2.5. Ethics
3. Results
3.1. TB Positivity Rates
3.2. Predicted TB Positivity Rate
3.3. Comparison of Notified and Predicted TB Risk in the Vicinity of TB Clinics
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Ministère de la Santé et de la Population, République Centrafricaine. Plan Stratégique National De Lutte Contre La Tuberculose 2024–2028; Ministère de la Santé et de la Population: Bangui, République Centrafricaine, 2023.
2. World Bank. Central African Republic Poverty Assessment—A Road Map Towards Poverty Reduction in the Central African Republic (English); World Bank Group: Washington, DC, USA, 2023; Available online: http://documents.worldbank.org/curated/en/099111323121515851/P17739108d680e074088b608a00615bcba3 (accessed on 20 June 2024).
3. Médecins Sans Frontières. HIV Is in a State of Silent Crisis in Central African Republic. 2020. Available online: https://www.msf.org/hiv-state-silent-crisis-car (accessed on 23 November 2023).
4. The Joint United Nations Programme on HIV/AIDS (UNAIDS). Central African Republic 2022. Available online: https://www.unaids.org/en/regionscountries/countries/centralafricanrepublic (accessed on 23 November 2023).
5. Statista. Share of Undernourished Population in Africa Between 2020 and 2022, by Country. 2023. Available online: https://www.statista.com/statistics/1305636/prevalence-of-undernourishment-in-africa-by-country/ (accessed on 20 June 2024).
6. United Nations Children’s Fund. West and Central Africa: Water, Sanitation and Hygiene. 2023. Available online: https://www.unicef.org/wca/what-we-do/wash (accessed on 23 November 2023).
7. Médecins Sans Frontières. As Cycle of Violence in CAR Continues, People Struggle to Find Support. 2021. Available online: https://www.msf.org/cycle-violence-car-continues-people-struggle-find-support (accessed on 20 June 2024).
8. Macrotrends. Bangui, Central African Republic Metro Area Population 1950–2024. 2024. Available online: https://www.macrotrends.net/global-metrics/cities/20410/bangui/population#google_vignette (accessed on 20 June 2024).
9. de Abreu, E.S.M.; Di Lorenzo Oliveira, C.; Teixeira Neto, R.G.; Camargos, P.A. Spatial distribution of tuberculosis from 2002 to 2012 in a midsize city in Brazil. BMC Public Health 2016, 16, 912.
10. Parums, D.V. Editorial: Infectious Disease Surveillance Using Artificial Intelligence (AI) and its Role in Epidemic and Pandemic Preparedness. Med. Sci. Monit. 2023, 29, e941209.
11. Siddig, E.E.; Eltigani, H.F.; Ahmed, A. The Rise of AI: How Artificial Intelligence is Revolutionizing Infectious Disease Control. Ann. Biomed. Eng. 2023, 51, 2636–2637.
12. Teibo, T.K.A.; Andrade, R.L.D.P.; Rosa, R.J.; Tavares, R.B.V.; Berra, T.Z.; Arcêncio, R.A. Geo-spatial high-risk clusters of Tuberculosis in the global general population: A systematic review. BMC Public Health 2023, 23, 1586.
13. Noykhovich, E.; Mookherji, S.; Roess, A. The Risk of Tuberculosis among Populations Living in Slum Settings: A Systematic Review and Meta-analysis. J. Urban Health 2019, 96, 262–275.
14. Gelaw, Y.A.; Yu, W.; Magalhães, R.J.S.; Assefa, Y.; Williams, G. Effect of Temperature and Altitude Difference on Tuberculosis Notification: A Systematic Review. J. Glob. Infect. Dis. 2019, 11, 63–68.
15. Harling, G.; Lima Neto, A.S.; Sousa, G.S.; Machado, M.M.T.; Castro, M.C. Determinants of tuberculosis transmission and treatment abandonment in Fortaleza, Brazil. BMC Public Health 2017, 17, 508.
16. Siddalingaiah, N.; Chawla, K.; Nagaraja, S.B.; Hazra, D. Risk factors for the development of tuberculosis among the pediatric population: A systematic review and meta-analysis. Eur. J. Pediatr. 2023, 182, 3007–3019.
17. Franco, J.V.; Bongaerts, B.; Metzendorf, M.I.; Risso, A.; Guo, Y.; Silva, L.P.; Boeckmann, M.; Schlesinger, S.; Damen, J.A.; Richter, B.; et al. Undernutrition as a risk factor for tuberculosis disease. Cochrane Database Syst. Rev. 2024, 6, CD015890.
18. Kwan, C.K.; Ernst, J.D. HIV and tuberculosis: A deadly human syndemic. Clin. Microbiol. Rev. 2011, 24, 351–376.
19. WorldPop Hub. The Spatial Distribution of Population Density in 2020, Central African Republic 2020. Available online: https://hub.worldpop.org/geodata/summary?id=40591 (accessed on 3 October 2022).
20. Institute for Health Metrics and Evaluation (IHME). Child Growth Failure 2020. Available online: https://www.healthdata.org/data-tools-practices/interactive-visuals/child-growth-failure (accessed on 3 October 2022).
21. Global Health Data Exchange (GHDx). Low- and Middle-Income Country Drinking Water and Sanitation Facilities Access Geospatial Estimates 2000–2017. 2020. Available online: https://ghdx.healthdata.org/record/ihme-data/lmic-wash-access-geospatial-estimates-2000-2017 (accessed on 3 October 2022).
22. Institute for Health Metrics and Evaluation (IHME). Africa Diphtheria-Pertussis-Tetanus Vaccine Coverage Geospatial Estimates 2000–2016. 2019. Available online: https://ghdx.healthdata.org/record/ihme-data/africa-diphtheria-pertussis-tetanus-vaccine-coverage-geospatial-estimates-2000-2016 (accessed on 3 October 2022).
23. Institute for Health Metrics and Evaluation (IHME). Low- and Middle-Income Country MCV1 Coverage Geospatial Estimates 2000–2019. 2020. Available online: https://ghdx.healthdata.org/record/ihme-data/lmic-mcv1-coverage-geospatial-estimates-2000-2019 (accessed on 3 October 2022).
24. Institute for Health Metrics and Evaluation (IHME). Low- and Middle-Income Country Educational Attainment Geospatial Estimates 2000–2017. 2019. Available online: https://ghdx.healthdata.org/record/ihme-data/lmic-education-geospatial-estimates-2000-2017 (accessed on 3 October 2022).
25. Global Health Data Exchange (GHDx). Distance to OSM Major Roads 2016, Central African Republic 2018. Available online: https://hub.worldpop.org/geodata/summary?id=17342 (accessed on 3 October 2022).
26. Global Health Data Exchange (GHDx). Low- and Middle-Income Country Neonatal, Infant, and Under-5 Mortality Geospatial Estimates 2000–2017. 2019. Available online: https://ghdx.healthdata.org/record/ihme-data/lmic-under5-mortality-rate-geospatial-estimates-2000-2017 (accessed on 3 October 2022).
27. Global Health Data Exchange (GHDx). Africa HIV Prevalence Geospatial Estimates 2000–2017. 2019. Available online: https://ghdx.healthdata.org/record/ihme-data/africa-hiv-prevalence-geospatial-estimates-2000-2017 (accessed on 3 October 2022).
28. The Humanitarian Data Exchange (HDx). Central African Republic Healthsites. 2024. Available online: https://data.humdata.org/dataset/central-african-republic-healthsites (accessed on 20 June 2024).
29. Koller, D.; Friedman, N. Probabilistic Graphical Models: Principles and Techniques; MIT Press: Cambridge, MA, USA, 2009.
30. MacKay, D.J. Information Theory, Inference and Learning Algorithms; Cambridge University Press: Cambridge, UK, 2003.
31. Jordan, M.I. An Introduction to Probabilistic Graphical Models; University of California: Berkeley, CA, USA, 2003; Available online: https://www.cse.iitk.ac.in/users/piyush/courses/pml_winter16/expfam_glm.pdf (accessed on 20 June 2024).
32. Murphy, K.P. Machine Learning: A Probabilistic Perspective; MIT Press: Cambridge, MA, USA, 2012.
Variable | Description/Definition | Resolution | Source |
---|---|---|---|
Total population density | Number of people per square kilometer | 100 m a | WorldPop Hub [19] |
Prevalence of underweight in Children | Percentage of children underweight (below −2 SD of weight for age according to the WHO standard) | 5 × 5 km b | Institute for Health Metrics and Evaluation [20] |
Access to improved water source | Percentage of the de jure population living in households whose main source of drinking water is an improved source | 5 × 5 km | Global Health Data Exchange [21] |
Access to improved sanitation facilities | Percentage of the de jure population living in households whose main type of toilet facility is no facility (open defecation) | 5 × 5 km | Global Health Data Exchange [21] |
Prevalence of Stunting in Children | Percentage of children stunted (below −2 SD of height for age according to the WHO standard) | 5 × 5 km | Institute for Health Metrics and Evaluation [20] |
Vaccination coverage (DPT1, DPT3, and measles) | Percentage of children 12–23 months who were vaccinated | 5 × 5 km | Institute for Health Metrics and Evaluation [22,23] |
Literacy (men and women) | Percentage of men and women (age 15–49 years) who are literate | 5 × 5 km | Global Health Data Exchange [24] |
Distance to built settlements | Distance of a built settlement from the centroid of a population cluster measured in meters | 100 m | WorldPop Hub [25] |
Distance to major roads | Distance of a major road from the centroid of a population cluster measured in meters | 100 m | WorldPop Hub [25] |
Under-5 child mortality | Estimated number of deaths among children under 5 years of age | 5 × 5 km | Global Health Data Exchange [26] |
HIV prevalence | Estimated prevalence among 15–59-year-old individuals | 5 × 5 km | Global Health Data Exchange [27] |
Health Facility coverage (density) | Number of health facilities per square kilometer | Point-level data | The Humanitarian Data Exchange (HDx) [28] |
Night-time lights | VIIRS night-time radiance, measured in nW/cm²/sr | 100 m | WorldPop Hub [25] |
Elevation | Elevation above the sea level (in meters) | 100 m | WorldPop Hub [25] |
TB Clinic | Predicted TB Positivity | Notified TB Positivity |
---|---|---|
Petevo Centre de Santé | Medium | Low |
Lakounanga Urbain Centre de Santé | High | High |
Centre de Santé Saint Joseph | Low | No data available |
Complexe pédiatrique | Medium | No data available |
CNRISTAR CTA | Low | No data available |
Castors CSU | High | High |
CNHUB HN | Medium | Low |
Mamadou Mbaiki Centre de Santé | High | High |
Hospital Communautaire | Low | High |
Obrou Fidel Camp Centre de Santé | Medium | High |
Amis Afrique ONG | Medium | Low |
Malimaka | Medium | Medium |
Hospital Amite | Medium | Medium |
Bédé Combattant CSU | Medium | High |
Variable | Pearson Correlation Coefficient (with Predicted TB Positivity) | p-Value |
---|---|---|
Population density | 0.67 | <0.001 |
Observed TB positivity rate | 0.61 | <0.001 |
Night-time lights | 0.47 | <0.001 |
DPT 3 vaccination (%) | 0.30 | <0.001 |
DPT 1 vaccination (%) | 0.30 | <0.001 |
Children underweight (%) | −0.31 | <0.001 |
HIV Incidence rate | −0.27 | <0.001 |
Distance to built settlements | 0.25 | <0.001 |
Children stunted (%) | −0.23 | <0.001 |
Measles vaccination (%) | 0.21 | <0.001 |
Access to improved water source (%) | 0.205 | <0.001 |
Lacking sanitation services (%) | 0.18 | <0.001 |
Distance to major roads | −0.13 | <0.001 |
Elevation | −0.10 | <0.001 |
Literacy among males (%) | 0.09 | <0.001 |
Literacy among females (%) | 0.05 | <0.001 |
Child mortality rate | 0.03 | <0.001 |
Variable | Pearson Correlation Coefficient (with Notified TB Positivity) | p-Value |
---|---|---|
Predicted TB positivity rate | 0.61 | <0.001 |
Population density | 0.58 | <0.001 |
DPT 1 vaccination (%) | 0.51 | <0.001 |
Lacking sanitation services (%) | 0.43 | <0.001 |
Children underweight (%) | −0.42 | <0.001 |
Children stunted (%) | −0.41 | <0.001 |
Distance to major roads | −0.37 | <0.001 |
Night-time lights | 0.32 | <0.001 |
DPT 3 vaccination (%) | 0.32 | <0.001 |
Measles vaccination (%) | 0.31 | <0.001 |
Literacy among males (%) | 0.25 | <0.001 |
Access to improved water source (%) | 0.22 | <0.001 |
Elevation | −0.18 | <0.001 |
HIV Incidence rate | −0.17 | <0.001 |
Literacy among females (%) | 0.12 | <0.001 |
Child mortality rate | 0.09 | <0.001 |
Distance to built settlements | 0.07 | <0.001 |
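The coefficients in the two tables above are Pearson correlations. As a minimal sketch of how such a coefficient can be computed from paired tile-level values, the function below implements the standard formula in plain Python. The function name `pearson_r` and the sample values are hypothetical, the source does not state which software produced the tables, and the p-value calculation is left out.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up tile values for illustration only (not study data).
population_density = [1200.0, 3400.0, 800.0, 5600.0, 2100.0]
predicted_positivity = [0.04, 0.09, 0.03, 0.12, 0.05]
print(round(pearson_r(population_density, predicted_positivity), 2))
```

A coefficient near 1 or -1 indicates a strong linear association. It says nothing on its own about causation or about non-linear relationships between a covariate and TB positivity.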
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Koura, K.G.; Hashmi, S.; Menon, S.; Gando, H.G.; Yamodo, A.K.; Budts, A.-L.; Meurrens, V.; Lapelou, S.-C.S.K.; Mbitikon, O.B.; Potgieter, M.; Cauwelaert, C.V. Leveraging Artificial Intelligence to Predict Potential TB Hotspots at the Community Level in Bangui, Republic of Central Africa. Trop. Med. Infect. Dis. 2025, 10, 93. https://doi.org/10.3390/tropicalmed10040093