Enhancing Site Selection Decision-Making Using Bayesian Networks and Open Data
Abstract
1. Introduction
- A novel decision-making framework is proposed, explicitly modeling cause-and-effect relationships among site selection factors using Bayesian Networks.
- A hierarchical BN structure is designed, effectively capturing complex inter-dependencies among factors and enabling dynamic probabilistic inference as new data become available.
- An implementation using real-world open data validates the framework’s practical efficacy, demonstrating its adaptability to various facility types beyond eco-friendly vehicle charging stations.
2. Related Works
2.1. Overview of MCDM Methods
2.2. Probabilistic Approaches
3. Preliminaries
3.1. Bayesian Network
3.2. Generic Types of Cause–Effect Relationships
4. Proposed Framework
4.1. Probability Transformation Based on Data Types
4.2. Construction of Bayesian Network
4.3. Model Validation for Objectivity
- Forward Propagation: Conditional Probability Tables (CPTs) are calculated based on provided data and prior probabilities. This method determines output variable probabilities based on input variables.
- Backward Propagation: CPTs are computed using posterior probabilities of known output variables, thus determining input variable probabilities based on output data.
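
The two validation directions can be illustrated on a toy network. The sketch below is one common reading of forward and backward propagation, not the authors' implementation: it assumes two binary cause nodes `A` and `B` with hypothetical priors, one binary effect node `C`, and an illustrative CPT, and it computes both directions by brute-force enumeration of the joint distribution.

```python
from itertools import product

# Hypothetical priors P(A=True), P(B=True); not values from the paper.
prior = {"A": 0.6, "B": 0.3}

# Illustrative CPT: P(C=True | A, B), keyed by (A, B).
cpt_c = {
    (True, True): 0.91,
    (True, False): 0.79,
    (False, True): 0.54,
    (False, False): 0.07,
}

STATES = [True, False]


def p_parent(name, value):
    """Prior probability that a cause node takes the given value."""
    p_true = prior[name]
    return p_true if value else 1.0 - p_true


def joint(a, b, c):
    """Joint probability P(A=a, B=b, C=c) from the chain rule."""
    p_c_true = cpt_c[(a, b)]
    p_c = p_c_true if c else 1.0 - p_c_true
    return p_parent("A", a) * p_parent("B", b) * p_c


def forward():
    """Forward direction: P(C=True), marginalizing over both causes."""
    return sum(joint(a, b, True) for a, b in product(STATES, repeat=2))


def backward(name):
    """Backward direction: P(cause=True | C=True) via Bayes' rule."""
    idx = 0 if name == "A" else 1
    numerator = sum(
        joint(a, b, True)
        for a, b in product(STATES, repeat=2)
        if (a, b)[idx]
    )
    return numerator / forward()


p_effect = forward()            # approximately 0.58 with the values above
p_a_given_c = backward("A")     # posterior belief in A after observing C
p_b_given_c = backward("B")     # posterior belief in B after observing C
```

Observing the effect raises the belief in both causes above their priors here, which is the kind of consistency check a backward pass makes possible. Enumeration is only practical for very small networks; a real model would use a BN library's inference routines.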
5. Case Study: Eco-Friendly Vehicle Charging Stations
5.1. Selection of Necessary Factors
5.2. Data Preprocessing
5.3. Probabilistic Modeling Results
5.3.1. Social Factors Modeling
- Population Density: The Probability Density Function (PDF) is characterized by a mean of 24,372 and a standard deviation of 8179, constrained within a data range from 13,242 to 46,577 persons. The expert-defined threshold is 20,000 persons.
- Monthly Average Floating Population Density: This PDF exhibits a mean of 441,344 and a standard deviation of 205,145, bounded within the range [121,290, 867,841] persons. A threshold of 400,000 persons was selected based on expert judgment.
- Number of Eco-friendly Vehicles per Charging Station: This variable quantifies the maximum number of eco-friendly vehicles per available charging station. The constructed PDF has a mean of 4 and a standard deviation of 173, ranging from 0 to 1942 units. The threshold established by experts is 5 units.
- Road Coverage: Road coverage is computed from road capacity, population distribution, and land area, and is used to evaluate local traffic conditions around potential charging stations. The associated PDF has a mean of 3.70% and a standard deviation of 1.87%, with a range between 1.29% and 15.34%. The threshold determined through expert analysis is set at 2.9%.
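
To make the role of the mean, standard deviation, range, and threshold concrete, the following is a minimal sketch of one plausible transformation. It assumes that each continuous factor follows a normal distribution truncated to its observed range and that the probability of interest is the mass above the threshold. Both the truncated-normal form and the "above threshold is favorable" direction are assumptions for illustration, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_cdf(x, mean, std, low, high):
    """CDF of a normal(mean, std) distribution truncated to [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    f_low = normal_cdf((low - mean) / std)
    f_high = normal_cdf((high - mean) / std)
    return (normal_cdf((x - mean) / std) - f_low) / (f_high - f_low)


def prob_above_threshold(threshold, mean, std, low, high):
    """P(X >= threshold) under the truncated-normal assumption."""
    return 1.0 - truncated_normal_cdf(threshold, mean, std, low, high)


# Population density: statistics and threshold as reported in the list above.
p_population = prob_above_threshold(20_000, 24_372, 8_179, 13_242, 46_577)

# Road coverage (in percent): statistics and threshold as reported above.
p_road = prob_above_threshold(2.9, 3.70, 1.87, 1.29, 15.34)

print(f"P(population density >= threshold) = {p_population:.3f}")
print(f"P(road coverage >= threshold)      = {p_road:.3f}")
```

For factors where a lower value is preferable, the complementary probability `truncated_normal_cdf(threshold, ...)` would be used instead.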
5.3.2. Environmental Factors Modeling
- Nitrogen Dioxide Air Pollution Level: Primarily emitted from internal combustion engine vehicles, this pollutant poses significant health risks. The constructed PDF has a mean of 0.031292 ppm, a standard deviation of 0.014482 ppm, and a defined range between 0.001 and 0.088 ppm. An expert-based threshold of 0.012 ppm is used.
- Carbon Monoxide Air Pollution Level: Generated mainly from fuel combustion, carbon monoxide is a harmful pollutant that requires reduction in ambient air. The PDF is characterized by a mean of 0.441904 ppm and a standard deviation of 0.150112 ppm, with a data range between 0.1 and 1.1 ppm. The threshold value is set at 0.3 ppm.
- Ozone Air Pollution Level: Ozone produced by internal combustion vehicles is modeled using a PDF with a mean of 0.011388 ppm, a standard deviation of 0.011069 ppm, and a range between 0.001 and 0.052 ppm. The threshold is established at 0.003 ppm based on expert recommendations.
- Particulate Matter: The constructed PDF has a mean of 37.623809, a standard deviation of 19.035647, and a range spanning from 6 to 99. An expert-defined threshold is set at 15.
5.3.3. Technical Factors Modeling
- Charger Capacity: Charger capacity is treated as discrete data due to distinct categorization into standard capacity ranges. This variable is modeled using a Bernoulli distribution, indicating suitability (True) or unsuitability (False) based on common and most frequently available charger capacities.
- Hydrogen Fuel Supply Method: This variable is inherently categorical, reflecting specific hydrogen fuel supply methods. The most stable and advantageous method is identified, and its suitability is modeled through a Bernoulli distribution, yielding binary True or False outcomes.
- Number of Vehicles Served per Charging Station: This variable quantifies the potential number of vehicles serviced at candidate sites, based on available parking capacity. The corresponding PDF is defined by a mean of 65.667 vehicles, a standard deviation of 5.1316, and a data range between 50.272 and 85.395 vehicles. An expert-derived threshold of 33.456 vehicles is applied.
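
For the discrete and categorical factors, a Bernoulli parameter can be estimated from category frequencies. The sketch below is an illustration only: it assumes that the "suitable" category is the most frequent one and that the Bernoulli parameter is its relative frequency. The sample capacities are invented for the example and the helper name is hypothetical.

```python
from collections import Counter


def bernoulli_from_categories(observations):
    """Return (most frequent category, its relative frequency).

    The relative frequency is used as the Bernoulli parameter
    P(suitable = True) under the assumption stated above.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    counts = Counter(observations)
    mode, mode_count = counts.most_common(1)[0]
    return mode, mode_count / len(observations)


# Hypothetical charger capacities in kW; not data from the paper.
capacities_kw = [50, 7, 50, 100, 50, 7, 50, 200]

mode_kw, p_suitable = bernoulli_from_categories(capacities_kw)
print(f"most common capacity: {mode_kw} kW, P(suitable) = {p_suitable}")
```

The same helper applies to the hydrogen fuel supply method if the preferred method is chosen by frequency; if it is instead chosen by expert judgment, the indicator would compare each observation with that fixed method.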
5.3.4. Economic Factors Modeling
- Charging Station Construction Cost: This variable represents the installation cost for the charging station infrastructure. The PDF has a mean of 31,940,000 Korean Won (KRW) (approximately 24,200 USD) and a standard deviation of 13,302,982 KRW, with data ranging between 16,900,000 and 58,000,000 KRW. The expert-defined threshold is set at 35,000,000 KRW.
- Average Income at Candidate Site: Reflecting residents’ average annual income at potential locations, this PDF is characterized by a mean of 3021.68 (10,000 KRW), a standard deviation of 753.25 (10,000 KRW), and a range between 1752.97 and 5066.34 (10,000 KRW). An expert-based threshold of 3600 (10,000 KRW) is applied.
- Expected Revenue of Hydrogen Charging Station: Representing anticipated operational revenue from the installed hydrogen charging stations, the PDF has a mean of −110,000,000 KRW and a standard deviation of 100,000,000 KRW, with a range from −300,000,000 to 160,000,000 KRW. The threshold value set by experts is 0 KRW. This projection explicitly excludes government subsidies.
5.4. Bayesian Network Construction Results
5.5. Model Validation for Increased Objectivity
5.6. Site Selection Probability Results
6. Discussion
- Generalizability. Although the empirical analysis focuses on Busan, the proposed framework is structurally general. The four-step pipeline—factor identification, probabilistic transformation, hierarchical BN construction, and validation—can be reproduced for other cities or infrastructure types by substituting appropriate sub-layer indicators and recalibrating the CPTs. Because the model structure is modular, only the parameterization changes when applying the method to different contexts.
- Adaptability to Rural and International Contexts. The modular BN structure allows site-selection factors to be replaced with region-specific indicators. For rural regions, factors such as grid connectivity, land availability, and transportation demand may dominate. For international applications, socio-economic and regulatory factors would require recalibration. In both cases only the sub-layer indicators and their CPT parameters need to be re-specified, not the overall framework, which makes the method straightforward to transfer.
- Computational Feasibility. The hierarchical structure of the proposed BN reduces computational complexity by limiting parent sets, which prevents exponential CPT expansion. A CPT grows exponentially with the number of parents of its node, not with the total number of variables, so adding sub-layer variables or candidate locations keeps marginalization and belief propagation tractable as long as each node has few parents. This makes the framework computationally feasible for larger cities or expanded factor sets.
- Comparison with Classical MCDM Methods. While AHP, TOPSIS, and Fuzzy MCDM approaches provide structured ranking mechanisms [1,2,5,10,11], they do not incorporate causal relationships or probabilistic uncertainty. When we applied a simplified TOPSIS model using the same factors, the resulting rankings differed from the BN-derived probabilities due to the linear compensatory nature of TOPSIS. BN results instead reflect conditional interactions among factors, which explains discrepancies and highlights the advantage of causal modeling for complex site-selection tasks.
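
The computational-feasibility argument can be checked with simple counting. The sketch below assumes that every node is binary and that the site-selection node has the four category nodes (Social, Environmental, Technical, Economic) as its only parents. The factor counts per category follow the case study, while the binary-state assumption and the helper name are illustrative.

```python
def cpt_rows(num_parents, states=2):
    """Number of parent-state combinations in one node's CPT."""
    return states ** num_parents


# Number of sub-layer factors per category in the case study.
sub_layer = {"Social": 4, "Environmental": 4, "Technical": 3, "Economic": 3}

# Flat design: all 14 factors are direct parents of the site-selection node.
flat_rows = cpt_rows(sum(sub_layer.values()))

# Hierarchical design: one CPT per category node, plus the top-level CPT
# whose parents are the four category nodes.
hierarchical_rows = (
    sum(cpt_rows(n) for n in sub_layer.values()) + cpt_rows(len(sub_layer))
)

print(f"flat: {flat_rows} rows, hierarchical: {hierarchical_rows} rows")
# flat: 16384 rows, hierarchical: 64 rows
```

Adding one more factor doubles the flat table, but only doubles the CPT of the single category node it attaches to in the hierarchical design.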
7. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| BN | Bayesian Network |
| PGM | Probabilistic Graphical Model |
| MCDM | Multi-Criteria Decision-Making |
| AHP | Analytic Hierarchy Process |
| TOPSIS | Technique for Order Preference by Similarity to Ideal Solution |
| GIS | Geographic Information System |
| SFS | Spherical Fuzzy Sets |
| CBN | Copula Bayesian Network |
| FAHP | Fuzzy Analytic Hierarchy Process |
| PCA | Principal Component Analysis |
| PDF | Probability Density Function |
| PMF | Probability Mass Function |
| CPT | Conditional Probability Table |
| HMM | Hidden Markov Model |
| EV | Electric Vehicle |
References
- Saaty, R.W. The analytic hierarchy process—What it is and how it is used. Math. Model. 1987, 9, 161–176.
- Behzadian, M.; Otaghsara, S.K.; Yazdani, M.; Ignatius, J. A state-of the-art survey of TOPSIS applications. Expert Syst. Appl. 2012, 39, 13051–13069.
- Pan, Y.; Zhang, L.; Koh, J.; Deng, Y. An adaptive decision making method with copula Bayesian network for location selection. Inf. Sci. 2021, 544, 56–77.
- Farahani, R.Z.; SteadieSeifi, M.; Asgari, N. Multiple criteria facility location problems: A survey. Appl. Math. Model. 2010, 34, 1689–1709.
- Pathan, A.I.; Girish Agnihotri, P.; Said, S.; Patel, D. AHP and TOPSIS based flood risk assessment—A case study of the Navsari City, Gujarat, India. Environ. Monit. Assess. 2022, 194, 509.
- Şener, B.; Süzen, M.L.; Doyuran, V. Landfill site selection by using geographic information systems. Environ. Geol. 2006, 49, 376–388.
- Vahidnia, M.H.; Alesheikh, A.A.; Alimohammadi, A. Hospital site selection using fuzzy AHP and its derivatives. J. Environ. Manag. 2009, 90, 3048–3056.
- Özcan, T.; Çelebi, N.; Esnaf, Ş. Comparative analysis of multi-criteria decision making methodologies and implementation of a warehouse location selection problem. Expert Syst. Appl. 2011, 38, 9773–9779.
- Rikalovic, A.; Cosic, I.; Lazarevic, D. GIS Based Multi-criteria Analysis for Industrial Site Selection. Procedia Eng. 2014, 69, 1054–1063.
- Kutlu Gündoğdu, F.; Kahraman, C. A novel spherical fuzzy analytic hierarchy process and its renewable energy application. Soft Comput. 2020, 24, 4607–4621.
- Mathew, M.; Chakrabortty, R.K.; Ryan, M.J. A novel approach integrating AHP and TOPSIS under spherical fuzzy sets for advanced manufacturing system selection. Eng. Appl. Artif. Intell. 2020, 96, 103988.
- Zhang, Y.; Teoh, B.K.; Zhang, L. Integrated Bayesian networks with GIS for electric vehicles charging site selection. J. Clean. Prod. 2022, 344, 131049.
- Xue, J.; Yip, T.L.; Wu, B.; Wu, C.; van Gelder, P. A novel fuzzy Bayesian network-based MADM model for offshore wind turbine selection in busy waterways: An application to a case in China. Renew. Energy 2021, 172, 897–917.
- Heckerman, D. A Tutorial on Learning with Bayesian Networks. In Innovations in Bayesian Networks: Theory and Applications; Holmes, D.E., Jain, L.C., Eds.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 33–82.
- Marcot, B.G.; Penman, T.D. Advances in Bayesian network modelling: Integration of modelling technologies. Environ. Model. Softw. 2019, 111, 386–393.
- Kitson, N.K.; Constantinou, A.C.; Guo, Z.; Liu, Y.; Chobtham, K. A survey of Bayesian Network structure learning. Artif. Intell. Rev. 2023, 56, 8721–8814.
- Geiger, D.; Verma, T.; Pearl, J. Identifying independence in Bayesian networks. Networks 1990, 20, 507–534.
- Laskey, K.B. Sensitivity Analysis for Probability Assessments in Bayesian Networks. In Uncertainty in Artificial Intelligence; Heckerman, D., Mamdani, A., Eds.; Morgan Kaufmann: San Francisco, CA, USA, 1993; pp. 136–142.
- Feng, J.; Xu, S.X.; Li, M. A novel multi-criteria decision-making method for selecting the site of an electric-vehicle charging station from a sustainable perspective. Sustain. Cities Soc. 2021, 65, 102623.
- Mishra, A.R.; Rani, P.; Saha, A. Single-valued neutrosophic similarity measure-based additive ratio assessment framework for optimal site selection of electric vehicle charging station. Int. J. Intell. Syst. 2021, 36, 5573–5604.






| T | T | 0.91 |
| T | F | 0.79 |
| F | T | 0.54 |
| F | F | 0.07 |

| Effect Variables | Causal Variables | Dataset Name |
|---|---|---|
| Social | Population Density | Resident Population by Administrative Region/Age Group |
| Social | Monthly Average Floating Population Density | Busan Metropolitan City Floating Population Data |
| Social | Number of Eco-friendly Vehicles per Charging Station | Nationwide Charging Station and Vehicle Status Data |
| Social | Road Coverage | OpenStreetMap API |
| Environmental | Ozone Air Pollution Level | Monthly Ozone Air Pollution Levels |
| Environmental | Carbon Monoxide Air Pollution Level | Monthly Carbon Monoxide Air Pollution Levels |
| Environmental | Nitrogen Dioxide Air Pollution Level | Monthly Nitrogen Dioxide Air Pollution Levels |
| Environmental | Particulate Matter | Monthly Particulate Matter Air Pollution Levels |
| Technical | Charger Capacity | Nationwide Charging Station Status (Electric, Hydrogen, LPG) |
| Technical | Hydrogen Fuel Supply Method | Hydrogen Vehicle Charging Station Status |
| Technical | Number of Vehicles Served per Charging Station | Busan Metropolitan City Public Parking Information |
| Economic | Charging Station Construction Cost | Ministry of Environment Statistics Data |
| Economic | Average Income in Candidate Site | Nationwide Household Income Data |
| Economic | Expected Revenue of Hydrogen Charging Station | Ministry of Environment Statistics Data |

| Annotation | Description |
|---|---|
| | Mean of variable x |
| | Standard deviation of variable x |
| | Maximum (right bound) value of data for variable x |
| | Minimum (left bound) value of data for variable x |
| | Threshold values based on expert judgment or literature |
| | Probability density function transformation for variable x |
| | Probabilistic representation of variable x |

| Variables | Probability Transformation Process |
|---|---|
| Population Density | |
| Monthly Average Floating Population Density | |
| Number of Eco-friendly Vehicles per Charging Station | |
| Road Coverage | |

| Variables | Probability Transformation Process |
|---|---|
| Nitrogen Dioxide Air Pollution Level | |
| Carbon Monoxide Air Pollution Level | |
| Ozone Air Pollution Level | |
| Particulate Matter | |

| Variables | Probability Transformation Process |
|---|---|
| Charger Capacity | |
| Hydrogen Fuel Supply Method | |
| Number of Vehicles Served per Charging Station | |

| Variables | Probability Transformation Process |
|---|---|
| Charging Station Construction Cost | |
| Average Income in Candidate Site | |
| Expected Revenue of Hydrogen Charging Station | |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Han, J.; Kim, D.; Park, J.; Chun, S. Enhancing Site Selection Decision-Making Using Bayesian Networks and Open Data. Mathematics 2025, 13, 3943. https://doi.org/10.3390/math13243943

