This study proposes a weight optimization method that integrates expert-defined constraints with data-driven methodologies. Dual-source training labels combine field reference data and expert knowledge: the field reference data reflect the combined influence of multiple criteria, while AHP–WLC generates expert evaluation results, with the WLC thresholds determined via I-KMEANS clustering. A multilayer perceptron (MLP) is employed to model the relationship between input criteria combinations and the hybrid training labels, and SHAP interpretability analysis is applied to quantify the contribution of each criterion. This framework balances expert knowledge with data-driven patterns.
2.4.1. AHP–WLC Label Generation
- (1) Expert-Driven Weight Determination via AHP
The standard workflow for determining weights using the AHP consists of defining the set of evaluation criteria, constructing pairwise comparison matrices based on expert judgment, and verifying matrix consistency. Valid weights are output only when the consistency check is passed ($CR < 0.1$) [60]. The judgment matrices are constructed based on Table 3; the obtained judgment matrix is presented in Table 4.
The weights are calculated using the geometric mean method:

$$w_i = \frac{\left( \prod_{j=1}^{n} a_{ij} \right)^{1/n}}{\sum_{k=1}^{n} \left( \prod_{j=1}^{n} a_{kj} \right)^{1/n}}, \quad i = 1, 2, \ldots, n,$$

where $a_{ij}$ is an element of the pairwise comparison matrix and $n$ is its order.
Using the derived weights, the maximum eigenvalue $\lambda_{\max}$ of the pairwise comparison matrix is computed:

$$\lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{(A \mathbf{w})_i}{w_i},$$

where $A$ is the pairwise comparison matrix and $\mathbf{w}$ is the weight vector.
The Consistency Index ($CI$) is then computed from the maximum eigenvalue:

$$CI = \frac{\lambda_{\max} - n}{n - 1}.$$
The Consistency Ratio ($CR$) is computed from the Consistency Index ($CI$) and the Random Index ($RI$):

$$CR = \frac{CI}{RI}.$$
Finally, verify whether $CR$ exceeds 0.1. If $CR < 0.1$, the consistency check is passed and the weights obtained via the geometric mean method are adopted as the final criteria weights; otherwise, the judgment matrix is revised and re-evaluated.
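The full workflow can be condensed into a short numpy sketch; the 3 × 3 judgment matrix below is a placeholder for illustration, not the matrix of Table 4:

```python
import numpy as np

# Saaty's Random Index (RI) values for matrix orders 3-9.
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A):
    """Geometric-mean weights and consistency ratio for a judgment matrix A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    gm = np.prod(A, axis=1) ** (1.0 / n)   # row-wise geometric means
    w = gm / gm.sum()                      # normalized weights
    lam_max = np.mean((A @ w) / w)         # maximum eigenvalue estimate
    ci = (lam_max - n) / (n - 1)           # Consistency Index
    cr = ci / RI[n]                        # Consistency Ratio
    return w, cr

# Illustrative 3x3 judgment matrix (placeholder values, not Table 4).
A = [[1, 3, 5],
     [1/3, 1, 2],
     [1/5, 1/2, 1]]
w, cr = ahp_weights(A)
assert cr < 0.1, "Consistency check failed: revise the judgment matrix."
print(w, cr)
```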
- (2) Determination of WLC Thresholds via I-KMEANS
Pearson correlation coefficients, which measure linear relationships between continuous variables based on their covariance and standard deviations, were calculated among the criteria, and the threshold determination method for each criterion was assigned according to its correlation level: strongly correlated criteria underwent joint data-driven partitioning, while weakly correlated criteria were partitioned using expert knowledge and data distribution patterns.
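As an illustration of this grouping step, the following minimal sketch assumes the criteria are columns of a pandas DataFrame; the 0.6 cutoff for a "strong" correlation and the criterion names are assumed placeholders, not values from this study:

```python
import numpy as np
import pandas as pd

def split_by_correlation(criteria: pd.DataFrame, cutoff: float = 0.6):
    """Partition criteria into strongly and weakly correlated groups."""
    corr = criteria.corr(method="pearson").abs()
    strong: set[str] = set()
    for i, a in enumerate(corr.columns):
        for b in corr.columns[i + 1:]:
            if corr.loc[a, b] >= cutoff:
                strong.update({a, b})   # joint data-driven partitioning
    weak = [c for c in corr.columns if c not in strong]  # expert-guided
    return sorted(strong), weak

# Placeholder criteria for illustration.
rng = np.random.default_rng(0)
ghi = rng.random(100)
df = pd.DataFrame({"GHI": ghi,
                   "temperature": 0.8 * ghi + 0.1 * rng.random(100),
                   "slope": rng.random(100)})
print(split_by_correlation(df))  # GHI and temperature group together
```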
Data-driven unsupervised learning algorithms face two critical limitations. First, a lack of prior constraints: sensitivity to the distributions of discrete variables degrades clustering performance and interpretability. Second, unregulated thresholds: evaluation dimensions often yield non-restrictive thresholds that deviate significantly from rational ranges (e.g., PV construction on slopes > 7° remains feasible but substantially increases costs).
This study introduces domain-knowledge-driven threshold filtering rules to construct expert-informed constraints. By presetting valid domains in feature space, the method applies empirical constraints to the input data, achieving two key benefits: it mitigates the cognitive bias risks of fully expert-dependent weighting (e.g., AHP), and it enhances the robustness and semantic traceability of the unsupervised algorithm.
The traditional K-means algorithm partitions the feature space by minimizing within-cluster variance and represents the cluster centroids as a two-dimensional matrix, where each column corresponds to an evaluation criterion. However, in renewable energy suitability evaluation, the criteria are subject to strict monotonicity constraints (e.g., higher GHI values should correspond to higher suitability scores), and traditional clustering often produces disordered centroid arrangements that violate these constraints.
This study proposes an Isotonic K-means (I-KMEANS) algorithm, which enforces monotonicity by alternating between unconstrained optimization steps and constrained projection operations.
The I-KMEANS algorithm is inspired by the projected gradient method in constrained nonlinear optimization, establishing a dual-loop iterative structure (see Figure 5). The outer loop performs standard K-means updates of the cluster centroids, analogous to gradient descent, while the inner loop performs constraint projection, in which isotonic regression enforces the monotonicity constraints on the unconstrained centroids from the outer loop. To prevent identical column values, a small perturbation is applied to duplicate elements, followed by SLSQP-based column scaling to preserve similarity to the original distribution while ensuring sufficient variance. This sequence of operations constitutes a single iteration of the inner loop.
The projection steps can be decomposed into the following operations:
- (i) Isotonic regression: Isotonic regression modifies each column of the unconstrained cluster centroids individually to satisfy the specified monotonicity constraints. For an individual column, the adjustment objective can be expressed as

$$\min_{\hat{c}_1 \le \hat{c}_2 \le \cdots \le \hat{c}_k} \; \sum_{i=1}^{k} w_i \left( \hat{c}_i - c_i \right)^2,$$

where $c_i$ and $\hat{c}_i$ are the $i$th entries of the column before and after adjustment, and $k$ is the number of clusters.
In this study, the PAV (Pool Adjacent Violators) algorithm was utilized to perform the isotonic regression. Initially, each observation is treated as an independent block, and a monotonic sequence is constructed by iteratively merging adjacent blocks that violate the monotonicity constraint using weighted averaging. The merge step can be formally expressed as

$$\mu_{\text{new}} = \frac{\sum_{i \in B_j \cup B_{j+1}} w_i y_i}{\sum_{i \in B_j \cup B_{j+1}} w_i} \quad \text{whenever } \mu_j > \mu_{j+1},$$

where $B_j$ denotes the $j$th block, $w_i$ represents the weight of the $i$th sample, $y_i$ is the value of the $i$th sample, $\mu_j$ refers to the mean of block $B_j$, and $\mu_{\text{new}}$ indicates the mean of the newly merged block. One iteration of the PAV algorithm is completed by assigning $\mu_{\text{new}}$ to all values in the merged violating blocks; the process ends when all blocks satisfy the required monotonicity.
- (ii) Result perturbation: Equal group sizes across successive iterations frequently result in identical values along the same dimension. A perturbation strategy is therefore applied to identical entries in the isotonic regression output, as sketched below.
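The following is a minimal illustrative sketch of such a perturbation; the offset-by-$\varepsilon$ rule here is an assumption for illustration, not the exact rule used in the study:

```python
import numpy as np

def perturb_duplicates(col, eps=1e-6):
    """Break ties in a monotone column by adding tiny increasing offsets.

    Illustrative rule only: each repeated value is shifted upward by eps so
    the column becomes strictly increasing while staying close to the input.
    """
    col = np.asarray(col, dtype=float).copy()
    for i in range(1, len(col)):
        if col[i] <= col[i - 1]:
            col[i] = col[i - 1] + eps
    return col

print(perturb_duplicates([0.2, 0.5, 0.5, 0.5, 0.9]))
```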
- (iii) Result scaling: The final results are derived through column-wise scaling of the perturbed centroids. The parameters $a$ and $b$ of the scaling formula are optimized on a per-column basis using the Sequential Least-Squares Quadratic Programming (SLSQP) algorithm, with the objective of minimizing the deviation between the scaled and original values while simultaneously maximizing the variance within each column. Predefined weighting coefficients balance these dual objectives, ensuring that the scaled data retains the key structural characteristics of the original data while promoting more uniform distributions across columns. In this formulation, $\lambda_1$ and $\lambda_2$ are hyperparameters controlling the weights of the variance and distribution constraints, respectively, $V_{\min}$ is the minimum variance threshold, and $\Delta_{\min}$ is the minimum spacing requirement (both hyperparameters).
The algorithm optimizes the scaling parameters $a$ and $b$ per column by approximating the objective function with quadratic models and the constraints with linear ones, iteratively updating the solution along the optimal search direction so that the objective decreases and the constraints remain satisfied. This yields an optimal scaling that preserves data fidelity while improving feature discriminability. Once the parameters $a$ and $b$ are determined for all columns of the centroid matrix, the scaling is applied to complete the projection step, concluding a single inner-loop iteration. A full iteration consists of an outer loop and a corresponding inner loop; iteration terminates when the displacement of every cluster centroid between two successive iterations falls below a predefined threshold $\varepsilon$. Three thresholds are then calculated from the four cluster centroids derived via I-KMEANS, taken per column as the midpoints of adjacent ordered centroid values:

$$T_j = \frac{c_j + c_{j+1}}{2}, \quad j = 1, 2, 3,$$

where $c_1 \le c_2 \le c_3 \le c_4$ are the ordered centroid values of the corresponding criterion.
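A minimal sketch of the per-column scaling with SciPy's SLSQP solver follows; the affine form $\tilde{c} = a c + b$ and the penalty-style objective below are assumptions consistent with the description above, not the study's exact formulation:

```python
import numpy as np
from scipy.optimize import minimize

def scale_column(col, lam1=1.0, lam2=1.0, v_min=0.01, d_min=0.05):
    """Scale a centroid column as a*col + b (affine form is an assumption).

    Illustrative objective: stay close to the original values while rewarding
    variance (weight lam1) and penalizing gaps below d_min (weight lam2);
    a minimum-variance constraint v_min is enforced by SLSQP.
    """
    col = np.asarray(col, dtype=float)

    def objective(p):
        a, b = p
        s = a * col + b
        deviation = np.sum((s - col) ** 2)               # fidelity term
        spread = np.var(s)                               # variance reward
        gaps = np.diff(np.sort(s))
        spacing = np.sum(np.maximum(0.0, d_min - gaps))  # uniformity penalty
        return deviation - lam1 * spread + lam2 * spacing

    cons = [{"type": "ineq",
             "fun": lambda p: np.var(p[0] * col + p[1]) - v_min}]
    res = minimize(objective, x0=[1.0, 0.0], method="SLSQP", constraints=cons)
    a, b = res.x
    return a * col + b

print(scale_column(np.array([0.30, 0.31, 0.32, 0.90])))
```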
- (3) Result Computation via WLC
Criteria classification thresholds were determined by combining I-KMEANS clustering with expert knowledge, enabling multi-criteria weighting and aggregation under the AHP–WLC framework. For the supply–demand suitability evaluation, conditional normalization eliminated the influence of the land price criterion prior to result computation.
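The WLC aggregation itself is a weighted sum of classified criterion scores; the sketch below uses placeholder thresholds, class scores, and weights (not values from this study):

```python
import numpy as np

def classify(values, thresholds, scores=(1, 2, 3, 4)):
    """Map raw criterion values to class scores using three thresholds."""
    return np.asarray(scores)[np.searchsorted(thresholds, values)]

def wlc(criteria_scores, weights):
    """Weighted linear combination: S = sum_i w_i * x_i per grid cell."""
    return np.tensordot(weights, criteria_scores, axes=1)

# Placeholder example: two criteria over three grid cells.
ghi = classify(np.array([4.8, 5.6, 6.3]), thresholds=[5.0, 5.5, 6.0])
slope = classify(np.array([2.0, 6.5, 12.0]), thresholds=[3.0, 7.0, 15.0],
                 scores=(4, 3, 2, 1))  # lower slope -> higher suitability
S = wlc(np.stack([ghi, slope]), weights=np.array([0.6, 0.4]))
print(S)  # suitability score per grid cell
```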
2.4.2. Training Sample Generation
The field reference data and the AHP–WLC-derived supply–demand suitability scores were independently normalized within each grid cell, where $E_i$ denotes the field reference data of the $i$th grid cell and $S_i$ denotes the AHP–WLC suitability score of the $i$th grid cell.
The two normalized datasets are subsequently aggregated using a weighted summation,

$$L_i = w_1 \tilde{E}_i + w_2 \tilde{S}_i,$$

where $\tilde{E}_i$ and $\tilde{S}_i$ denote the normalized values of $E_i$ and $S_i$. The weighted summation results are then normalized to obtain the final output. In this paper, $w_1$ is set to 0.4 and $w_2$ is set to 0.6.
Training samples are constructed by combining the fused labels with the normalized values of the criteria involved in MLP training within their corresponding grid cells, formatted as [label, x1, x2, …, xn]. Twenty percent of the total samples are randomly selected as validation data to evaluate model performance after each training epoch; these validation samples are held out throughout the training process.
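A minimal sketch of this label fusion and sample assembly follows; min–max normalization is an assumed choice here, as the text does not name the normalization method:

```python
import numpy as np

def minmax(v):
    """Min-max normalization to [0, 1] (an assumed choice of normalization)."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

def build_samples(E, S, X, w1=0.4, w2=0.6, val_frac=0.2, seed=0):
    """Fuse labels L = w1*E' + w2*S' and split into training/validation sets."""
    label = minmax(w1 * minmax(E) + w2 * minmax(S))   # fused, re-normalized
    samples = np.column_stack([label, X])             # rows: [label, x1..xn]
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_val = int(val_frac * len(samples))              # 20% held-out validation
    return samples[idx[n_val:]], samples[idx[:n_val]]

# Placeholder data: 100 grid cells, 5 normalized criteria.
rng = np.random.default_rng(1)
train, val = build_samples(rng.random(100), rng.random(100),
                           rng.random((100, 5)))
print(train.shape, val.shape)   # (80, 6) (20, 6)
```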
2.4.3. Refined Weight Determination Via MLP
The multilayer perceptron (MLP), a classic feedforward neural network, achieves high-dimensional feature representation through hierarchical nonlinear transformations. Its modular architecture includes an input layer, multiple hidden layers using ReLU or Sigmoid activation functions, and a task-oriented output layer. The model extracts abstract features through successive affine transformations and nonlinear activations in the hidden layers, and is trained via backpropagation to minimize a loss function using gradient descent [61].
This study constructs a three-layer MLP architecture using ReLU activations to model nonlinear relationships. The model quantifies prediction deviations via the Mean Squared Error (MSE) and optimizes its parameters through backpropagation, learning a nonlinear mapping from inputs to targets. The architecture effectively captures complex geographical correlations in PV suitability by combining hidden-layer feature abstraction (dimensional expansion) with an output-layer dimensionality-reduction mapping. The specific structure can be expressed mathematically as
$$\hat{y} = W^{(2)} g\left( W^{(1)} \mathbf{x} + b^{(1)} \right) + b^{(2)}$$

Dimension specifications:
- Input vector: $\mathbf{x} \in \mathbb{R}^d$ ($d$-dimensional features)
- Hidden layer: $W^{(1)} \in \mathbb{R}^{h \times d}$ (weight matrix), $b^{(1)} \in \mathbb{R}^h$ (bias vector)
- Output layer: $W^{(2)} \in \mathbb{R}^{o \times h}$ (weight matrix), $b^{(2)} \in \mathbb{R}^o$ (bias vector)
- $g(\cdot)$: hidden-layer activation (e.g., ReLU: $g(z) = \max(0, z)$)
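The same structure in PyTorch (a minimal sketch; the class name, the hidden width of 64, and the 8 input criteria are placeholders, as the exact dimensions are not stated here):

```python
import torch
import torch.nn as nn

class SuitabilityMLP(nn.Module):
    """Three-layer MLP: input -> ReLU hidden layer -> scalar suitability."""
    def __init__(self, d_in: int, h: int = 64):
        super().__init__()
        self.hidden = nn.Linear(d_in, h)   # W(1) x + b(1)
        self.act = nn.ReLU()               # g(z) = max(0, z)
        self.out = nn.Linear(h, 1)         # W(2) g(.) + b(2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(self.act(self.hidden(x)))

model = SuitabilityMLP(d_in=8)          # e.g., 8 input criteria
print(model(torch.randn(4, 8)).shape)   # torch.Size([4, 1])
```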
In MLP training, parameter updates are governed by a training strategy comprising an optimizer and a scheduler. This study employs the Adam optimizer [62] and a cosine annealing scheduler [63]. The Adam optimizer dynamically adjusts parameter update step sizes using first and second moment estimates of the gradients, while the cosine annealing scheduler periodically modulates the learning rate along a cosine curve, enabling the model to escape local optima and enhancing generalization capability.
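A sketch of this training strategy with the MSE loss (learning rate, epoch count, and data shapes are placeholders):

```python
import torch
import torch.nn as nn

# Placeholder model and data standing in for the MLP and fused-label samples.
model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
X, y = torch.randn(80, 8), torch.rand(80, 1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
loss_fn = nn.MSELoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)   # MSE between predictions and fused labels
    loss.backward()               # backpropagation
    optimizer.step()              # Adam: step sizes from gradient moments
    scheduler.step()              # cosine-annealed learning rate
```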
For the trained model, SHAP (SHapley Additive exPlanations) values [64] are computed to quantify the contribution of each input feature to the model's predictions. Rooted in Shapley values from cooperative game theory, SHAP quantifies feature impacts through weighted integration of marginal contributions. Specifically for deep neural networks, the DeepExplainer algorithm (an enhanced variant of DeepLIFT) is employed, which calculates feature attribution values through backpropagated gradients.
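A sketch of the attribution step with the `shap` package (the model and data below are placeholder stand-ins for the trained MLP and its training samples):

```python
import numpy as np
import shap
import torch
import torch.nn as nn

# Placeholder trained model and samples (stand-ins for the fitted MLP).
model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
X = torch.randn(80, 8)

explainer = shap.DeepExplainer(model, X[:50])   # background set anchors E[f(x)]
shap_values = explainer.shap_values(X[50:])     # per-feature attributions

# Criterion contribution: mean |SHAP| per feature, normalized to sum to 1.
sv = np.abs(np.asarray(shap_values)).reshape(-1, 8)
weights = sv.mean(axis=0)
weights /= weights.sum()
print(weights)
```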
The recalibrated weights are computed by summing the weight allocations of the criteria participating in the computation, with the land price criterion explicitly excluded from this process; this summation yields the final recalibrated weight values.
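One plausible reading of this step, continuing from the `weights` array above (the criterion names are hypothetical placeholders), is to drop the land price entry and renormalize the remaining allocations:

```python
criteria = ["GHI", "slope", "aspect", "road_dist", "grid_dist",
            "water_dist", "land_use", "land_price"]   # hypothetical names
keep = [i for i, c in enumerate(criteria) if c != "land_price"]
final_w = weights[keep] / weights[keep].sum()   # renormalized over kept criteria
print(dict(zip([criteria[i] for i in keep], np.round(final_w, 3))))
```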