Beyond Infrastructure: A Capability-Conversion Diagnostic Framework for Rural Water Access Inequality in Morocco

Boudrik, Youness; Hasnaoui, Rachid; Touil, Achraf; Oulakhmis, Abdellah; Aissaoui, Nawfal

doi:10.3390/w18080936


by Youness Boudrik ¹,*, Rachid Hasnaoui ¹, Achraf Touil ², Abdellah Oulakhmis ¹ and Nawfal Aissaoui ³

¹ Economics and Public Policy Laboratory, Ibn Tofail University of Kénitra, P.O. Box 242, Kénitra 14000, Morocco

² LMII, Faculty of Sciences and Technology, Hassan 1st University, P.O. Box 577, Settat 26000, Morocco

³ Faculty of Legal, Economic and Social Sciences Souissi, Mohamed V University, P.O. Box 8007, Rabat 10000, Morocco

* Author to whom correspondence should be addressed.

Water 2026, 18(8), 936; https://doi.org/10.3390/w18080936

Submission received: 22 February 2026 / Revised: 26 March 2026 / Accepted: 9 April 2026 / Published: 14 April 2026

(This article belongs to the Section Water Resources Management, Policy and Governance)


Abstract

Rural water access in developing countries remains deeply unequal, even among communes with comparable infrastructure. This paradox motivates a shift from infrastructure-centered analysis toward a framework that explicitly accounts for governance and local conversion factors. We develop a Capability-Conversion Diagnostic Framework, grounded in Sen’s Capability Approach, that decomposes water access variance into three components: measurable infrastructure, provincial governance context, and commune-level unmeasured conversion factors. Applied to 1324 rural communes from Morocco’s 2024 General Census (RGPH 2024), the framework combines k-means capability segmentation ($k = 3$, selected via a composite validation criterion), cross-validated infrastructure-to-water prediction using OLS with engineered features ($R_{CV}^{2} = 0.274$, out-of-fold), conversion residual extraction, and spatial decomposition. The central finding is a three-way variance partition: infrastructure explains 27.4%, provincial context 34.1%, and commune-level unmeasured factors 38.5%, the latter representing an upper bound that includes omitted variables and measurement noise alongside conversion factors. Theil decomposition confirms that 89.9% of water access inequality occurs within capability tiers, consistent with Sen’s emphasis on conversion factors. A six-archetype policy matrix classifies communes into differentiated intervention strategies: 19.9% require comprehensive multi-sector transformation, 10.0% need governance reform despite adequate infrastructure, and 20.0% need targeted bottleneck interventions. The framework offers planners a diagnostic tool that identifies not only where infrastructure is lacking, but where it fails to deliver outcomes and what complementary interventions are needed.

Keywords:

water access inequality; Capability Approach; conversion factors; infrastructure diagnostics; variance decomposition; rural development; Morocco; RGPH 2024

1. Introduction

Access to safe drinking water remains one of the most critical challenges facing developing nations, with profound implications for public health, economic development, and social equity [1]. The United Nations Sustainable Development Goal 6 explicitly targets universal and equitable access by 2030, yet significant disparities persist, particularly in rural areas of developing countries [2]. Understanding the multidimensional nature of water access challenges requires analytical frameworks that move beyond simple infrastructure metrics to capture the complex interplay between resources, capabilities, and actual outcomes. This complexity is particularly evident in countries like Morocco, where aggregate improvements in national water access statistics mask substantial spatial inequalities between urban and rural areas, and—more critically—among rural communes themselves.

Morocco has made significant strides in improving water access over the past three decades, with national rates increasing from 14% in 1990 to over 87% by 2020 [3]. However, this aggregate progress conceals pronounced disparities that continue to challenge development planners. Rural Morocco presents a striking paradox: commune-level water access ranges from 0% to 99.8%, with a mean of only 55.3% and a standard deviation of 36.3% [4]. The Moroccan government’s Programme National d’Approvisionnement en Eau Potable des Populations Rurales (PAGER) has invested billions of dirhams in rural water infrastructure since 1995 [5], yet the translation of infrastructure resources into actual water access remains strikingly uneven across communes with similar infrastructure endowments. This unevenness raises a fundamental question: If infrastructure investment is necessary for water access, why is it so manifestly insufficient?

Traditional approaches to water access analysis fall into two broad categories that share a common limitation. Regression-based predictive models estimate water connectivity from socio-economic covariates, implicitly treating the problem as one of prediction [6]. Spatial analyses map coverage gaps using geographic information systems and remotely sensed data. Both approaches assume that water access is primarily determined by observable infrastructure characteristics. When models achieve modest predictive power, the standard response is to seek additional predictors—governance indicators, climate data, institutional variables—in the hope of closing the explanatory gap. This framing treats unexplained variance as a measurement problem rather than as a substantive finding.

Amartya Sen’s Capability Approach [7] offers a fundamentally different interpretation. Sen argues that resources (commodities, infrastructure) do not automatically produce well-being; their conversion into valued functionings depends on personal, social, and environmental conversion factors that are inherently context-dependent [8]. Applied to water access, this means that identical infrastructure endowments will produce different outcomes depending on governance quality, community organization, hydrogeological conditions, and other contextual factors. From this perspective, a modest predictive $R^{2}$ is not necessarily a model failure; it is consistent with the theoretical expectation that conversion factors occupy a substantial share of outcome variance, although the residual may also reflect omitted variables, measurement noise, and model limitations.

Despite the theoretical appeal of Sen’s framework, its application to infrastructure assessment has remained largely qualitative [8,9]. The challenge of operationalization lies in estimating the relative contributions of measurable resources versus the residual share attributable to unmeasured conversion factors, omitted variables, and measurement limitations. Existing studies either report overall model fit without decomposition, or attribute unexplained variance to measurement error without distinguishing between its different sources.

This paper addresses this challenge by developing a Capability-Conversion Diagnostic Framework that decomposes water access variance into three non-circular components:

Infrastructure (measurable resources): The share of water access variance explained by observable infrastructure features, estimated via out-of-fold cross-validated prediction ( $R_{CV}^{2}$ ).
Provincial context (governance and geography): The share of residual variance attributable to systematic province-level differences, estimated via intraclass correlation of prediction residuals.
Commune-level unmeasured factors: The remaining variance, which may reflect local conversion factors, omitted infrastructure variables, measurement noise, and model limitations not captured by infrastructure data or province membership.

The framework is applied to 1324 rural communes from Morocco’s RGPH 2024 census spanning 39 provinces and nine macro-regions. The principal contributions are as follows:

A non-circular three-way variance decomposition quantifying the infrastructure–governance–conversion split (27.4%/34.1%/38.5%), providing the first such estimate for Moroccan rural water access.
A within versus between province diagnostic revealing that infrastructure predicts between-province variation ( $R^{2} = 0.384$ ) better than within-province variation ( $R^{2} = 0.255$ ), with direct policy implications.
Consistency with Sen’s framework: the finding that 89.9% of water access inequality occurs within capability tiers provides quantitative evidence consistent with the theoretical centrality of conversion factors, although the residual share may also include omitted variables and measurement noise.
A six-archetype policy prioritization matrix translating capability theory into differentiated intervention strategies for 1324 communes.
A co-occurring bottleneck analysis demonstrating that 48.6% of communes face multiple binding infrastructure constraints, challenging single-sector investment approaches.

The remainder of this paper is structured as follows: Section 2 reviews the relevant literature. Section 3 presents the seven-phase diagnostic framework. Section 4 reports the empirical results. Section 5 interprets findings and derives policy implications. Section 6 concludes the paper.

2. Literature Review

2.1. Water Access Inequality in Developing Countries

The relationship between water infrastructure and actual access has been extensively documented. Hutton and Haller [10] estimated the global burden of inadequate water at USD 260 billion annually. Hunter et al. [11] provided systematic evidence on waterborne disease burdens, while Pickering and Davis [12] quantified the time costs of water collection disproportionately borne by women and children. In Morocco, Molle and Tanouti [13] documented tensions in water allocation, and Devoto et al. [14] studied piped water adoption impacts in urban settings. A consistent finding across this literature is that aggregate investment statistics poorly predict commune-level outcomes, motivating more granular analytical approaches.

Recent work has shifted attention from binary access metrics (connected/not connected) toward service quality dimensions including reliability, affordability, and seasonal variation [15]. This shift aligns with the capability perspective: a household may be nominally connected to a water network, yet unable to access safe water due to intermittent supply, prohibitive pricing, or contamination.

2.2. Rural Water Governance and Institutional Factors

A growing body of empirical research emphasizes governance and institutional capacity as determinants of rural water outcomes. Foster and Hope [16] demonstrated that institutional sustainability—including maintenance funding, community management structures, and accountability mechanisms—explains a substantial share of variation in rural water system functionality across sub-Saharan Africa. Marks et al. [17] used longitudinal data from 346 communities in three countries to show that management practices and institutional support predict water point sustainability more strongly than infrastructure age or technology type. In a systematic review covering 174 studies, Valcourt et al. [18] identified governance, finance, and community engagement as the most frequently cited determinants of rural water sustainability, ahead of technical factors. These findings provide direct empirical support for the capability framework’s theoretical expectation that conversion factors—particularly governance—mediate the infrastructure-to-outcome relationship.

2.3. Spatial Analysis of Water Access and Inequality Metrics

Spatial econometric methods have been increasingly applied to water access analysis. Giné-Garriga et al. [19] developed composite indices for monitoring rural water access that integrate spatial heterogeneity in service delivery across administrative units. Marson and Savin [20] demonstrated strong spatial clustering in WASH outcomes, with administrative boundaries creating discontinuities in access that reflect governance rather than geography. Elliott et al. [21] applied geospatial analysis to sub-national water access disparities in 26 countries, finding that within-country inequality often exceeds between-country inequality. These spatial approaches complement the present study’s province-based decomposition by providing empirical precedent for the view that administrative context shapes water outcomes independently of infrastructure.

2.4. Sen’s Capability Approach and Infrastructure

Sen’s Capability Approach [7] distinguishes between resources (commodities, infrastructure), capabilities (substantive freedoms to achieve valued states), and functionings (actually achieved outcomes). Robeyns [8] formalized the role of conversion factors (personal, social, and environmental) that mediate the transformation of resources into capabilities and ultimately into functionings. Alkire and Foster [9] operationalized the framework through multidimensional poverty measurement, although their counting approach does not decompose the resource-to-functioning gap.

In infrastructure studies, the capability lens has been applied qualitatively [22,23], but rarely quantitatively. The present paper addresses this gap not by measuring conversion factors directly, but by estimating an upper bound on the variance they may occupy through decomposition of prediction residuals—an approach that transforms the inherent difficulty of measuring conversion factors from a methodological limitation into a diagnostic tool, while acknowledging that residual variance also contains omitted variables and measurement noise.

2.5. Infrastructure Determinants of Water Access

Distance to paved roads affects investment costs, maintenance logistics, and travel to water points [24]. Abubakar [25] identified electricity access and housing type as determinants of household water access in Nigeria. Behera et al. [26] emphasized the interconnected nature of water, sanitation, and waste services. These studies establish the empirical relevance of the infrastructure variables used in our framework.

2.6. Spatial Structure of Service Delivery

Water access exhibits strong spatial clustering: neighboring communes tend to have similar access rates, reflecting shared governance regimes, hydrogeological conditions, and investment histories. Provincial and regional boundaries often demarcate administrative jurisdictions that determine budget allocation, institutional capacity, and policy implementation. Understanding whether infrastructure explains this spatial clustering or merely correlates with it has direct policy implications: if infrastructure accounts for provincial differences, investment can close the gap; if not, governance reform is needed alongside infrastructure.

2.7. Clustering for Development Typologies

Cluster analysis enables identification of population typologies without imposing predetermined boundaries [27]. Vyas and Kumaranayake [28] demonstrated clustering utility for wealth indices; Filmer and Pritchett [29] employed similar techniques for socio-economic stratification. In the optimization literature, metaheuristic-based clustering has attracted considerable attention [30,31], with algorithms including Genetic Algorithms [32], Grey Wolf Optimizer [33], and Whale Optimization Algorithm [34]. While these algorithms can outperform standard k-means on certain benchmarks, their advantage is most pronounced on high-dimensional or non-convex problems; for moderate-dimensional, well-structured data such as infrastructure indicators, standard k-means with multiple restarts often achieves comparable quality [35]. In this study, we employ k-means with 50 random restarts as a robust, reproducible baseline, consistent with the emphasis on the diagnostic framework rather than algorithmic novelty.

3. Methodology

The Capability-Conversion Diagnostic Framework proceeds in seven phases, each building on the outputs of the previous phase. This section first presents the theoretical grounding in Sen’s Capability Approach, then details each analytical phase.

3.1. Conceptual Framework

The analytical design operationalizes Sen’s Capability Approach [7] through a three-level structure (Figure 1):

Level 1 (Resources and Infrastructure): Observable commune-level endowments drawn from the RGPH 2024 census, including road proximity, sanitation, waste management, electricity, housing characteristics, and household composition.
Level 2 (Conversion Factors/Capabilities): Composite indices that capture the substantive freedoms enabled by infrastructure, namely physical accessibility to services, health-related environmental conditions, available time and energy for water collection, and the quality of living conditions [8].
Level 3 (Functioning): The observed outcome of interest, namely the commune-level rate of access to drinking water.

The critical insight from this framework is that the same infrastructure endowment can yield different water access outcomes depending on how effectively resources are converted into capabilities and ultimately into functionings. This conversion depends on personal, social, and environmental factors—governance quality, community organization, hydrogeological conditions—that are inherently context-dependent and often unobservable in census data. Rather than treating this unobservable component as mere measurement error, our framework estimates the residual variance share through decomposition of prediction residuals. This residual is consistent with the presence of conversion factors, but may also include omitted infrastructure variables, functional form misspecification, and aggregation effects.

3.2. Data Source and Study Area

The dataset comprises 1324 rural communes from Morocco’s 2024 General Census of Population and Housing (RGPH 2024), conducted by the High Commission for Planning (HCP) [3]. The communes span 39 provinces and 9 macro-regions, capturing the full diversity of rural Morocco from peri-urban plains to remote mountain communities. The census provides commune-level aggregate indicators derived from household questionnaires administered to all households during the enumeration period. The unit of observation is the commune (the lowest administrative unit in Morocco), with each observation representing aggregated percentages computed from household-level responses. The target variable—water access rate—is defined as the proportion of households in each commune with access to drinking water from an improved source (piped network, public standpipe, or protected well). Province and macro-region identifiers are administrative classifications assigned by HCP and used for the spatial decomposition in Phase 4. No individual-level data are used, and no personally identifiable information is present in the analysis.

3.3. Phase 1: Capability Segmentation

3.3.1. Variable Selection and Screening

The conceptual framework (Figure 1) identifies eight candidate Level 1 indicators from the census. Three variables are excluded through empirical screening prior to clustering:

Household size: Excluded due to non-significant correlation with water access ( $r = 0.039$ , $p = 0.161$ ). However, household size is retained in the construction of capability dimension C3 (Time and Energy), where it contributes to the interpretive—but not clustering—layer of the framework.
Toilets: Excluded due to high collinearity with wastewater evacuation ( $r > 0.85$ ), which captures the same sanitation dimension more broadly.
Housing occupancy status: Excluded due to weak discriminant power across communes ( $r = 0.031$ , $p = 0.256$ with water access).

This screening reduces the theoretical set to five operational indicators, balancing parsimony with coverage of all four capability dimensions.

3.3.2. Infrastructure Variables (Level 1)

Five variables capture the infrastructure resources available to rural communes:

$V_{1}$ : $\log(1 + \text{DistanceRoute})$ : Log-transformed distance to nearest paved road (km), addressing extreme positive skewness (skewness falls from 8.2 on the raw scale to 1.4 after the transform) that would otherwise collapse 81% of communes into a narrow normalized range [24].
$V_{2}$ : Wastewater evacuation rate (%): Sanitation infrastructure, combining public sewerage and septic systems [26].
$V_{3}$ : Household waste collection rate (%): Environmental health services, combining container and truck collection [26].
$V_{4}$ : Electricity access rate (%): Energy infrastructure, noting a ceiling effect (mean $= 93.9\%$ ) that limits discriminant power at the upper end [25].
$V_{5}$ : Secure housing type rate (%): Dwelling quality, combining villas, apartments, and Moroccan-style houses [36].

Missing values are imputed with column medians (affecting $< 1\%$ of observations). All variables are min–max normalized to $[0, 1]$ prior to clustering.
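The preprocessing described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `preprocess_column`, the sample distances, and the choice to impute before applying the log transform are all assumptions, since the text does not state the order of those two steps.

```python
import math
from statistics import median


def preprocess_column(values, log_transform=False):
    """Median-impute None entries, optionally apply log(1 + x), then min-max scale."""
    observed = [v for v in values if v is not None]
    fill = median(observed)  # column median of the observed entries
    filled = [fill if v is None else v for v in values]
    if log_transform:
        filled = [math.log1p(v) for v in filled]
    lo, hi = min(filled), max(filled)
    if hi == lo:  # constant column: map everything to 0 to avoid dividing by zero
        return [0.0] * len(filled)
    return [(v - lo) / (hi - lo) for v in filled]


# Hypothetical road distances in km (illustrative, not census values).
distance_km = [0.5, 2.0, None, 40.0, 1.0]
v1 = preprocess_column(distance_km, log_transform=True)
```

Applying the log before scaling keeps the single 40 km commune from compressing the remaining values toward zero, which is the motivation the text gives for transforming $V_{1}$.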

Table 1 provides the complete mapping from census variables to theoretical dimensions, including the empirical justification and screening outcome for each candidate indicator.

3.3.3. Capability Dimensions (Level 2)

Following Sen’s framework [7,8], the five infrastructure variables are mapped onto four interpretive capability dimensions for post hoc cluster profiling and theoretical interpretation. These dimensions are not used as clustering features and are not inputs to the prediction model; clustering and prediction operate exclusively on the five Level 1 variables. The capability dimensions serve solely to connect the empirical cluster profiles back to the theoretical structure of the Capability Approach:

C1: Physical Accessibility: Derived from road proximity ( $V_{1}$ ). Lower distance indicates greater accessibility to service delivery networks, markets, and maintenance logistics [24].
C2: Good Health: Derived from sanitation and waste management indicators ( $V_{2}$ , $V_{3}$ ). These environmental conditions enable health as a precondition for water-related well-being [26].
C3: Time and Energy: A composite reflecting the time burden of water collection, informed by road proximity ( $V_{1}$ ), electricity ( $V_{4}$ ), and household size (the latter retained at the interpretive level despite its exclusion from clustering due to non-significant correlation with water access) [37].
C4: Living Conditions: Derived from secure housing rate ( $V_{5}$ ), capturing the dwelling quality dimension of the capability to maintain safe and decent living conditions [22].

Each dimension is normalized to $[0, 1]$ for comparability. The mapping from infrastructure variables to capability dimensions is intentionally transparent: it reflects established theoretical linkages in the capability literature rather than data-driven optimization, ensuring that the interpretive layer remains grounded in Sen’s framework rather than in statistical artifacts.

3.3.4. Cluster Selection

The number of clusters is determined by a composite criterion applied to the range $k \in \{3, \dots, 9\}$. We exclude $k = 2$ on theoretical grounds: a binary partition reduces the Capability Approach to a poverty-line dichotomy, contradicting Sen’s emphasis on multidimensional heterogeneity [7]. The composite score combines three normalized validation metrics:

$$S(k) = 0.4 \cdot \widetilde{\mathrm{Sil}}(k) + 0.3 \cdot \widetilde{\mathrm{DBI}}^{-1}(k) + 0.3 \cdot \widetilde{\mathrm{CH}}(k) \tag{1}$$

where $\tilde{\cdot}$ denotes min–max normalization across the candidate range, Sil is the Silhouette Coefficient [38], DBI is the Davies–Bouldin Index [39] (inverted, so higher is better), and CH is the Calinski–Harabasz Index [40]. The weights prioritize cohesion (Silhouette, 40%) while equally valuing separation (DBI) and between-to-within variance ratio (CH).
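As a rough sketch of how the composite score in Equation (1) could be computed once the three validation metrics are available for each candidate $k$: the metric values below are invented for illustration, and treating "inverted" DBI as the reciprocal normalized afterwards is an assumption (one minus the normalized DBI would be another reasonable reading).

```python
def min_max(values):
    """Rescale a list to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def composite_scores(ks, silhouette, davies_bouldin, calinski_harabasz):
    """Return {k: S(k)} using the 0.4 / 0.3 / 0.3 weights of Equation (1)."""
    sil_n = min_max(silhouette)
    # Assumption: "inverted" DBI means the reciprocal, normalized afterwards.
    dbi_inv_n = min_max([1.0 / d for d in davies_bouldin])
    ch_n = min_max(calinski_harabasz)
    return {
        k: 0.4 * s + 0.3 * d + 0.3 * c
        for k, s, d, c in zip(ks, sil_n, dbi_inv_n, ch_n)
    }


# Hypothetical metric values for three candidate k (not the paper's results).
scores = composite_scores(
    ks=[3, 4, 5],
    silhouette=[0.40, 0.30, 0.28],
    davies_bouldin=[0.9, 1.2, 1.1],
    calinski_harabasz=[900.0, 700.0, 750.0],
)
best_k = max(scores, key=scores.get)
```

Because every metric is min–max normalized across the candidate range, a value of $k$ that is best on all three metrics scores exactly 1, and the scores are only comparable within the same candidate range.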

The choice $k = 3$ achieves the highest composite score ($S(3) = 0.867$), followed by $k = 6$ ($S = 0.691$) and $k = 4$ ($S = 0.606$). The selection of $k = 3$ is further supported by the elbow criterion (Figure 4c) and by the theoretical requirement for interpretable capability tiers that can meaningfully inform differentiated policy responses. Sensitivity analysis confirms that the main findings (the three-way variance decomposition and the dominance of within-cluster inequality) are qualitatively robust to $k = 4$ and $k = 5$ (see Section 4).

Clustering is performed using k-means with 50 random restarts and k-means++ initialization [41,42]. The solution with the lowest within-cluster sum of squares (WCSS) across restarts is retained. Clusters are ordered by descending mean water access rate and are labeled according to capability level (High-Capability, Moderate–High, Transitional).
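The clustering step can be illustrated with a small pure-Python sketch of k-means with k-means++ seeding, multiple restarts, and relabeling by mean outcome. It is a toy version under stated assumptions: the function names, the fixed seed, and the nine-point dataset are hypothetical, and a production analysis would more likely rely on an established library implementation.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: later centres are drawn with probability proportional to D^2."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point coincides with a centre; fall back to uniform
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def kmeans_once(points, k, rng, max_iter=100):
    """One Lloyd run; returns (WCSS, labels)."""
    centers = kmeans_pp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous centre
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return wcss, labels


def kmeans_best_of(points, k, n_restarts=50, seed=0):
    """Keep the restart with the lowest within-cluster sum of squares."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(n_restarts)),
               key=lambda run: run[0])


def order_by_outcome(labels, outcome):
    """Relabel clusters so that 0 has the highest mean outcome, 1 the next, and so on."""
    totals = {}
    for lab, y in zip(labels, outcome):
        s, n = totals.get(lab, (0.0, 0))
        totals[lab] = (s + y, n + 1)
    ranking = sorted(totals, key=lambda lab: totals[lab][0] / totals[lab][1], reverse=True)
    relabel = {old: new for new, old in enumerate(ranking)}
    return [relabel[lab] for lab in labels]


# Toy data: three tight groups in a 2-D normalized feature space (illustrative only).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (1.0, 1.0), (0.9, 1.0), (1.0, 0.9),
          (0.0, 1.0), (0.1, 1.0), (0.0, 0.9)]
water_access = [0.2, 0.2, 0.2, 0.9, 0.9, 0.9, 0.5, 0.5, 0.5]
wcss, labels = kmeans_best_of(points, k=3)
tiers = order_by_outcome(labels, water_access)
```

Note that water access enters only in the relabeling step; the partition itself is computed from the feature vectors alone, which mirrors the separation between clustering inputs and the outcome described in the text.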

3.4. Phase 2: Infrastructure-to-Functioning Prediction

3.4.1. Feature Engineering

The five raw infrastructure features are augmented with engineered terms to capture non-linear relationships and synergies identified in the literature:

Six interaction terms: Proximity × WasteCollection, Proximity × Electricity, Proximity × Housing, WasteCollection × Wastewater, WasteCollection × Electricity, and Electricity × Housing. These capture joint effects—for instance, proximity × electricity reflects the simultaneous availability of grid power and road access required for water pumping infrastructure [25].
Five squared terms: One per raw feature, capturing diminishing returns and threshold effects.

This produces a 16-dimensional engineered feature space. All features are standardized (zero mean, unit variance) prior to model fitting.
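A minimal sketch of the feature construction follows, assuming hypothetical column names for the five raw features and two invented communes. It builds the six interaction terms listed above and one squared term per raw feature, then standardizes each column.

```python
from statistics import mean, pstdev

# Hypothetical column names for the five Level 1 features.
RAW = ["proximity", "wastewater", "waste_collection", "electricity", "housing"]
# The six pairs named in the text.
INTERACTIONS = [
    ("proximity", "waste_collection"),
    ("proximity", "electricity"),
    ("proximity", "housing"),
    ("waste_collection", "wastewater"),
    ("waste_collection", "electricity"),
    ("electricity", "housing"),
]


def engineer(row):
    """Expand one commune's 5 raw features into the 16-term feature set."""
    feats = {name: row[name] for name in RAW}
    for a, b in INTERACTIONS:
        feats[f"{a}_x_{b}"] = row[a] * row[b]
    for name in RAW:
        feats[f"{name}_sq"] = row[name] ** 2
    return feats


def standardize(rows):
    """Scale every feature to zero mean and unit (population) variance."""
    names = list(rows[0])
    out = [dict() for _ in rows]
    for name in names:
        col = [r[name] for r in rows]
        mu, sd = mean(col), pstdev(col)
        for r_out, v in zip(out, col):
            r_out[name] = (v - mu) / sd if sd > 0 else 0.0
    return out


# Two hypothetical communes with features already scaled to [0, 1].
communes = [
    {"proximity": 0.8, "wastewater": 0.5, "waste_collection": 0.2,
     "electricity": 0.9, "housing": 0.6},
    {"proximity": 0.3, "wastewater": 0.1, "waste_collection": 0.0,
     "electricity": 0.7, "housing": 0.4},
]
engineered = [engineer(c) for c in communes]
scaled = standardize(engineered)
```

The count works out as 5 raw + 6 interactions + 5 squares = 16 features per commune. In a cross-validated setting the standardization statistics would normally be estimated on the training fold only; the sketch standardizes all rows at once purely for brevity.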

3.4.2. Model Comparison

Seven model configurations are compared via $5 \times 3$ repeated stratified cross-validation (15 folds total), yielding 15 independent train-test splits per model to ensure robust performance estimation: OLS on 5 raw features (baseline), OLS on 16 engineered features, Ridge regression ($\alpha$ tuned via nested CV), Random Forest (5 raw and 16 engineered), Gradient Boosting (16 engineered), and Extra Trees (16 engineered). The best model is selected by mean out-of-fold $R_{CV}^{2}$. In this study, OLS with 16 engineered features achieves the highest $R_{CV}^{2} = 0.272$, outperforming both simpler linear specifications (OLS 5 raw: $R_{CV}^{2} = 0.250$) and non-linear ensemble methods (Random Forest: $0.261$; Gradient Boosting: $0.237$). The OLS model is retained for all downstream analyses because it combines the best cross-validated performance with interpretable coefficients needed for the counterfactual bottleneck analysis in Phase 6.

Critically, all downstream analyses (Phases 3–7) use out-of-fold predictions ($\hat{y}_{CV}$) from a single 10-fold cross-validation run to prevent information leakage. In each fold, the model is trained on approximately 1192 communes (90%) and predictions are generated for the held-out 132 communes (10%), ensuring that no commune’s observed water access influences its own predicted value. A separate OLS model fitted on all 1324 observations with the 16 engineered features is retained exclusively for the counterfactual analysis in Phase 6, where interpretable coefficients are required. The overfitting gap between full-sample $R^{2}$ and out-of-fold $R_{CV}^{2}$ is reported to verify model stability.
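The out-of-fold logic can be sketched as follows. To stay self-contained, a one-feature least-squares line stands in for the 16-feature OLS model, and the data are synthetic; the function names and the fold assignment scheme are assumptions made for illustration only.

```python
import math
import random


def fit_line(x, y):
    """Closed-form least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def out_of_fold_predictions(x, y, n_folds=10, seed=0):
    """Predict each observation from a model that never saw it."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    y_hat = [None] * len(x)
    for f in range(n_folds):
        held_out = idx[f::n_folds]  # every n_folds-th shuffled index
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            y_hat[i] = a + b * x[i]
    return y_hat


def r_squared(y, y_hat):
    my = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data (illustrative only): a linear signal plus a deterministic wiggle.
x = [i / 49 for i in range(50)]
y = [0.2 + 0.5 * xi + 0.05 * math.sin(i) for i, xi in enumerate(x)]

y_hat_cv = out_of_fold_predictions(x, y)
r2_cv = r_squared(y, y_hat_cv)
a_full, b_full = fit_line(x, y)
r2_full = r_squared(y, [a_full + b_full * xi for xi in x])
gap = r2_full - r2_cv  # the overfitting gap
residuals = [yi - pi for yi, pi in zip(y, y_hat_cv)]  # observed minus out-of-fold prediction
```

For least squares the gap is non-negative, because held-out errors are never smaller in aggregate than the in-sample errors of the full fit; a small gap suggests the model is not fitting noise.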

3.4.3. Non-Circularity Guarantee

The prediction model is fitted on infrastructure features only; it has no access to cluster labels, province identifiers, or the target variable during out-of-fold prediction. This ensures that the three-way decomposition in Phase 4 is non-circular: infrastructure’s explanatory share ($R_{CV}^{2}$) is estimated independently of the spatial structure that will be measured from the residuals.

3.5. Phase 3: Conversion Residual Extraction

The conversion residual for commune $i$ is defined as

$$g_{i} = y_{i} - \hat{y}_{i, CV} \tag{2}$$

where $y_{i}$ is observed water access (expressed as a proportion in $[0, 1]$) and $\hat{y}_{i, CV}$ is the out-of-fold predicted value from Phase 2. Within the capability framework, the residual $g_{i}$ admits the following interpretation:

$g_{i} > 0$ (over-conversion): The commune achieves more water access than its infrastructure predicts, suggesting favorable conditions beyond measured infrastructure (effective governance, community mobilization, advantageous hydrogeology, or unmeasured infrastructure assets).
$g_{i} < 0$ (under-conversion): The commune achieves less than predicted, suggesting conversion barriers or unmeasured disadvantages (institutional failures, terrain constraints, poor service management, or infrastructure variables absent from the model).

We note that residuals may also reflect model misspecification, omitted variables, and measurement error; the capability interpretation is a theoretically motivated framing rather than a definitive attribution. The mean residual across all communes is approximately zero by construction of the cross-validation procedure, confirming calibration.

3.6. Phase 4: Spatial Decomposition of Residuals

3.6.1. Three-Way Variance Decomposition

The total variance of water access is partitioned into three sequential components:

$$\sigma_{infra}^{2} = R_{CV}^{2} \cdot \sigma_{total}^{2} \tag{3}$$

$$\sigma_{province}^{2} = \mathrm{ICC}_{resid} \cdot (1 - R_{CV}^{2}) \cdot \sigma_{total}^{2} \tag{4}$$

$$\sigma_{commune}^{2} = (1 - \mathrm{ICC}_{resid}) \cdot (1 - R_{CV}^{2}) \cdot \sigma_{total}^{2} \tag{5}$$

where $\mathrm{ICC}_{resid}$ is the intraclass correlation coefficient of the prediction residuals $g_{i}$ at the province level, estimated from one-way ANOVA:

$$\mathrm{ICC}_{resid} = \frac{MS_{between} - MS_{within}}{MS_{between} + (n_{0} - 1) \cdot MS_{within}} \tag{6}$$

with $n_{0}$ being the average province size. The ICC captures the proportion of residual variance that is systematic at the province level, that is, attributable to shared governance regimes, budget allocation patterns, water utility management, and hydrogeological conditions within provinces.
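The ICC estimator can be sketched directly from the one-way ANOVA mean squares. The function name and the four-commune examples are hypothetical; following the text, $n_{0}$ is taken as the simple average group size, which is an assumption for unbalanced provinces where an adjusted $n_{0}$ is sometimes used instead.

```python
from collections import defaultdict


def icc_one_way(residuals, provinces):
    """ICC of residuals by province from one-way ANOVA mean squares."""
    groups = defaultdict(list)
    for g, p in zip(residuals, provinces):
        groups[p].append(g)
    n_total, k = len(residuals), len(groups)
    grand = sum(residuals) / n_total
    ss_between = sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in groups.values())
    ss_within = sum((g - sum(v) / len(v)) ** 2 for v in groups.values() for g in v)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    n0 = n_total / k  # assumption: simple average province size, as in the text
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Illustrative residuals only: first, provinces that differ completely.
icc_structured = icc_one_way([0.1, 0.1, -0.1, -0.1], ["A", "A", "B", "B"])
# Second, provinces with identical means, so all variation is within province.
icc_unstructured = icc_one_way([0.1, -0.1, 0.1, -0.1], ["A", "A", "B", "B"])
```

When all residual variation lies between provinces the estimator returns 1; when province means coincide it can turn negative, which in practice is usually read as no detectable province-level structure.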

Non-circularity: Because $R_{CV}^{2}$ is computed from out-of-fold predictions (infrastructure features only, no province information) and $\mathrm{ICC}_{resid}$ is computed from the residuals of that prediction, information leakage between the two components is prevented. However, this sequential design ensures non-circularity, not strict orthogonality: infrastructure variables may vary systematically across provinces, so the infrastructure and provincial components are not causally independent. The decomposition partitions predictive variance, not causally orthogonal components.

Interpretation: The infrastructure share quantifies what observable resources explain; the provincial share captures what systematic administrative context explains beyond the infrastructure features included in the model; the commune share captures what remains—a residual consistent with local conversion factors that Sen’s theory predicts, but which may also include omitted variables, measurement error, and model limitations.

Upper-bound interpretation of the commune-level share: It is essential to recognize that the commune-level component (38.5%) is a residual defined as $1 - R_{CV}^{2} - \mathrm{ICC}_{resid} \cdot (1 - R_{CV}^{2})$. This residual provides an upper bound on the variance attributable to conversion factors because it also absorbs: (i) omitted infrastructure variables not captured in the census (e.g., water network age, storage capacity, pipe material); (ii) measurement noise in the dependent variable (commune-level aggregation of household responses); (iii) functional form misspecification of the OLS model; and (iv) any random variation unrelated to systematic factors. The capability interpretation is a theoretically motivated framing consistent with Sen’s framework [7], not a definitive causal attribution. Future studies incorporating additional predictors (governance indicators, hydrological data) would reduce this residual share and sharpen the decomposition.
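As a worked check of the arithmetic, the reported shares can be reproduced from the three-way partition. The ICC value is not quoted in this section, so the figure of roughly 0.470 below is back-calculated from the reported 27.4% and 34.1% shares and should be read as an inference rather than a reported estimate.

$$
\begin{aligned}
\text{infrastructure share} &= R_{CV}^{2} = 0.274 \\
\text{provincial share} &= \mathrm{ICC}_{resid} \cdot (1 - 0.274) = 0.341 \;\Rightarrow\; \mathrm{ICC}_{resid} \approx 0.341 / 0.726 \approx 0.470 \\
\text{commune share} &= (1 - 0.470) \times 0.726 \approx 0.385 \\
0.274 + 0.341 + 0.385 &= 1.000
\end{aligned}
$$

The three shares sum to one by construction, so any change in $R_{CV}^{2}$ from adding predictors is reallocated between the provincial and commune-level components rather than lost.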

3.6.2. Within vs. Between Province $R^{2}$

To determine whether infrastructure predicts geographic stratification or local variation, we compute two complementary $R^{2}$ values:

Between-province $R^{2}$: Regressing province-mean observed water access (${\bar{y}}_{p}$) on province-mean predicted values (${\bar{\hat{y}}}_{p}$), weighted by province size. This measures how well infrastructure explains why some provinces have higher access than others.
Within-province $R^{2}$: Regressing demeaned observed values ($y_{i} - {\bar{y}}_{p}$) on demeaned predicted values (${\hat{y}}_{i} - {\bar{\hat{y}}}_{p}$). This measures how well infrastructure explains why communes within the same province differ from each other.

If $R_{between}^{2} \gg R_{within}^{2}$, infrastructure features partly proxy for provincial development level rather than capturing genuine commune-level determinants.
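The two diagnostics can be sketched as follows, under the assumption that each $R^{2}$ is the squared (weighted) correlation from a simple linear regression with intercept. The helper names `r2_of_fit` and `between_within_r2` and the toy data are hypothetical and not taken from the study's code.

```python
from collections import defaultdict


def r2_of_fit(x, y, w=None):
    """R^2 of a (weighted) simple regression of y on x, i.e. squared correlation."""
    w = w or [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return sxy ** 2 / (sxx * syy)


def between_within_r2(province, y_obs, y_pred):
    groups = defaultdict(list)
    for p, y, yhat in zip(province, y_obs, y_pred):
        groups[p].append((y, yhat))
    # Province means of observed and predicted access.
    mean_obs = {p: sum(y for y, _ in v) / len(v) for p, v in groups.items()}
    mean_pred = {p: sum(h for _, h in v) / len(v) for p, v in groups.items()}
    keys = list(groups)
    # Between: province means, weighted by the number of communes per province.
    r2_between = r2_of_fit([mean_pred[p] for p in keys],
                           [mean_obs[p] for p in keys],
                           [float(len(groups[p])) for p in keys])
    # Within: commune values demeaned by their own province mean.
    r2_within = r2_of_fit([h - mean_pred[p] for p, h in zip(province, y_pred)],
                          [y - mean_obs[p] for p, y in zip(province, y_obs)])
    return r2_between, r2_within


# Toy data (illustrative only): predictions track province means well,
# but explain less of the variation inside each province.
province = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
y_pred = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 70.0, 80.0, 90.0]
y_obs = [35.0, 25.0, 30.0, 65.0, 55.0, 60.0, 85.0, 75.0, 80.0]
r2_between, r2_within = between_within_r2(province, y_obs, y_pred)
print(round(r2_between, 3), round(r2_within, 3))
```

In this toy case the between-province fit is perfect while the within-province fit is weak, which is the pattern the diagnostic is designed to flag.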

3.7. Phase 5: Theil Inequality Decomposition

The Theil-T index of water access inequality is decomposed into between-cluster (structural) and within-cluster (conversion) components:

$$T = T_{between} + T_{within} \tag{7}$$

where

$$T = \frac{1}{N} \sum_{i = 1}^{N} \frac{y_{i}}{\bar{y}} \ln\left(\frac{y_{i}}{\bar{y}}\right), \qquad T_{between} = \sum_{c = 1}^{k} s_{c} \frac{{\bar{y}}_{c}}{\bar{y}} \ln\left(\frac{{\bar{y}}_{c}}{\bar{y}}\right) \tag{8}$$

with $s_{c} = n_{c} / N$ being the share of communes in cluster $c$, ${\bar{y}}_{c}$ the cluster mean, and $\bar{y}$ the overall mean. Water access values of zero are replaced with a small positive constant ($\varepsilon = 0.001$) to avoid undefined logarithms.
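A minimal sketch of this decomposition is given below. It assumes the standard form of the within-cluster term, $T_{within} = \sum_{c} s_{c} \frac{{\bar{y}}_{c}}{\bar{y}} T_{c}$ with $T_{c}$ the Theil-T index inside cluster $c$, which is not spelled out in the text. The function names and toy values are hypothetical.

```python
import math

EPS = 0.001  # replacement for zero water-access values, as stated in the text


def theil_t(values):
    """Theil-T index of a list of strictly positive values."""
    mean = sum(values) / len(values)
    return sum((v / mean) * math.log(v / mean) for v in values) / len(values)


def theil_decomposition(values, clusters):
    """Return (T, T_between, T_within) for values grouped by cluster label."""
    y = [v if v > 0 else EPS for v in values]  # avoid log(0)
    n = len(y)
    grand_mean = sum(y) / n
    groups = {}
    for v, c in zip(y, clusters):
        groups.setdefault(c, []).append(v)
    t_between = 0.0
    t_within = 0.0
    for members in groups.values():
        share = len(members) / n                        # s_c = n_c / N
        ratio = (sum(members) / len(members)) / grand_mean
        t_between += share * ratio * math.log(ratio)
        t_within += share * ratio * theil_t(members)    # assumed standard form
    return theil_t(y), t_between, t_within


# Toy water-access percentages (illustrative only).
access = [0.0, 20.0, 40.0, 60.0, 80.0, 99.8]
tiers = ["low", "low", "mid", "mid", "high", "high"]
t_total, t_between, t_within = theil_decomposition(access, tiers)
print(round(t_between / t_total, 3), round(t_within / t_total, 3))
```

The two printed shares sum to one, so the within-cluster share can equivalently be obtained as $T - T_{between}$.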

The between-cluster share quantifies how much inequality is attributable to structural differences in infrastructure endowments across capability tiers. The within-cluster share captures inequality among communes with similar infrastructure—a residual consistent with the role of conversion factors, although it may also reflect within-tier heterogeneity in unmeasured infrastructure or local conditions. Additionally, cluster-specific Gini coefficients and coefficients of variation are computed to characterize inequality intensity within each tier.

3.8. Phase 6: Counterfactual Bottleneck Identification

For each commune, the full-sample OLS model (retained from Phase 2 for its interpretable coefficients) predicts the water access gain from raising each infrastructure feature individually to its 90th percentile value (or 10th percentile for distance, where lower is better), holding all other features at their observed values:

$$\Delta_{j}^{(i)} = \hat{f}\left(x_{i}^{(j \to p_{90})}\right) - \hat{f}\left(x_{i}\right) \tag{9}$$

where $x_{i}^{(j \to p_{90})}$ denotes the feature vector of commune $i$ with feature $j$ replaced by its 90th-percentile value, and $\hat{f}$ is the OLS prediction function. The feature yielding the largest $\Delta_{j}^{(i)}$ is designated the primary bottleneck for commune $i$.

To identify co-occurring constraints, we classify feature $j$ as binding for commune $i$ if

$$\Delta_{j}^{(i)} > 0.5 \cdot \max_{j'} \Delta_{j'}^{(i)} \tag{10}$$

where the maximum is taken over all features $j'$ of the same commune. This relative threshold prevents features with wide dynamic ranges (e.g., waste collection, range $[0, 94\%]$) from mechanically dominating the analysis through absolute gain comparisons alone.
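The counterfactual gain and the relative binding rule can be sketched as follows. The linear coefficients, intercept, and percentile targets are purely illustrative assumptions, and the model here omits the engineered features of the actual Phase 2 specification; none of these numbers come from the study.

```python
# HYPOTHETICAL linear model on normalized features (illustrative values only).
COEF = {"waste": 30.0, "housing": 25.0, "electricity": 15.0, "distance": -20.0}
INTERCEPT = 20.0
# Assumed targets: 90th percentile per feature, 10th percentile for distance.
TARGET = {"waste": 0.75, "housing": 0.90, "electricity": 0.98, "distance": 0.10}


def predict(x):
    """Predicted water access (percentage points) for one commune."""
    return INTERCEPT + sum(COEF[name] * x[name] for name in COEF)


def counterfactual_gains(x):
    """Gain from moving one feature at a time to its target, others held fixed."""
    base = predict(x)
    gains = {}
    for name in COEF:
        shifted = dict(x)
        shifted[name] = TARGET[name]
        gains[name] = predict(shifted) - base
    return gains


def bottlenecks(x, rel_threshold=0.5):
    """Primary bottleneck plus all features exceeding the relative threshold."""
    gains = counterfactual_gains(x)
    primary = max(gains, key=gains.get)
    top = gains[primary]
    binding = sorted(n for n, g in gains.items()
                     if top > 0 and g > rel_threshold * top)
    return primary, binding, gains


commune = {"waste": 0.02, "housing": 0.40, "electricity": 0.95, "distance": 0.60}
primary, binding, gains = bottlenecks(commune)
print(primary, binding)
```

In this toy commune, waste collection yields the largest gain and housing also clears half of that gain, so both count as binding, whereas distance and electricity do not.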

Caveat: These counterfactual gains are computed under a model with $R_{CV}^{2} = 0.274$. They represent the model’s best estimate of marginal infrastructure effects, not causal impacts. Correlation-based attribution cannot distinguish between direct infrastructure effects and correlated confounders.

3.9. Phase 7: Policy Prioritization Matrix

Communes are classified into a $k \times 2$ policy matrix based on two dimensions:

Capability tier: The cluster assignment from Phase 1, representing the commune’s infrastructure endowment profile.
Conversion efficiency: The sign of the residual $g_{i}$ from Phase 3, indicating whether the commune over-converts ($g_{i} > 0$) or under-converts ($g_{i} < 0$) relative to its infrastructure.

For $k = 3$ clusters, this yields six archetypes, each with a distinct policy logic (Table 2):

The matrix translates the diagnostic framework’s statistical findings into differentiated intervention strategies, so that infrastructure investment is directed where conversion capacity exists and governance reform is prioritized where infrastructure alone has failed to deliver outcomes. Each archetype is defined by the intersection of two empirically derived dimensions, cluster assignment (from Phase 1) and conversion residual sign (from Phase 3), so the typology is grounded in the statistical analysis rather than imposed a priori. The policy recommendations within each archetype follow from the empirical profile: for example, the Diagnose archetype groups communes with high infrastructure scores (cluster mean capability $= 0.79$) but negative residuals (mean gap $= -13.0$ pp), so additional infrastructure investment is unlikely to raise access, while the systematic under-conversion pattern indicates that governance investigation is warranted.
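The assignment of communes to cells of the $k \times 2$ matrix can be sketched as below. The cells are labeled by (tier, conversion sign) rather than by archetype name, because the full name mapping is given in Table 2 and is not reproduced here. The treatment of a residual of exactly zero and the toy communes are assumptions.

```python
from collections import Counter

# Tier labels as used in the text (ASCII hyphen used for Moderate-High).
TIERS = ("High-Capability", "Moderate-High", "Transitional")


def policy_cell(tier, residual_pp):
    """Return the (tier, conversion) cell for one commune; residual in pp."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier}")
    # ASSUMPTION: a residual of exactly zero is grouped with under-conversion.
    conversion = "over-converting" if residual_pp > 0 else "under-converting"
    return (tier, conversion)


# Toy communes: (identifier, tier, conversion residual g_i in pp).
communes = [
    ("c1", "High-Capability", -13.0),
    ("c2", "High-Capability", 8.5),
    ("c3", "Moderate-High", -4.2),
    ("c4", "Transitional", -21.7),
    ("c5", "Transitional", 15.3),
]
cells = Counter(policy_cell(tier, g) for _, tier, g in communes)
for cell, count in sorted(cells.items()):
    print(cell, count)
```

A high-capability commune with a residual of $-13.0$ pp lands in the (High-Capability, under-converting) cell, which corresponds to the Diagnose profile described above.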

4. Results

Figure 2 provides an overview of the three core analytical outputs: the capability-to-functioning relationship, the explanatory power of different modeling approaches, and the resulting three-way variance decomposition.

4.1. Dataset Characteristics

Table 3 presents descriptive statistics for the 1324 rural communes. Water access averages 55.3% with extreme heterogeneity (range: 0.0–99.8%). A Kolmogorov–Smirnov test rejects normality ($p < 10^{-24}$). All five infrastructure variables correlate significantly with water access, with waste collection ($r = 0.415$) and secure housing ($r = 0.409$) showing the strongest associations. Figure 3 presents the full correlation structure.

4.2. Phase 1: Capability Segmentation

Table 4 presents cluster validation metrics. The composite score (Equation (1)) selects $k = 3$ (score $= 0.867$), balancing statistical separation with theoretical interpretability. Figure 4 visualizes the composite criterion and internal metrics across candidate $k$ values.

The three resulting clusters are characterized in Table 5. Figure 5 displays their capability radar profiles, and Figure 6 presents water access distributions. Figure 7 provides integrated centroid profiles, and Figure 8 visualizes the cluster structure in reduced dimensions.

A one-way ANOVA confirms significant between-cluster differences in water access ($F = 110.9$, $p < 10^{-44}$, $\eta^{2} = 0.144$). The Kruskal–Wallis test corroborates this ($H = 177.3$, $p < 10^{-38}$). Critically, the moderate $\eta^{2}$ indicates that clusters explain only 14.4% of water access variance, consistent with the expectation that infrastructure endowments, which define the clusters, are insufficient to determine water outcomes. The silhouette plot (Figure 9) confirms that all clusters show predominantly positive silhouette values, with higher definition at the distribution extremes.
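For readers who wish to check how $\eta^{2}$ and $F$ relate, the sketch below computes both from grouped data and also recovers $F$ from $\eta^{2}$ via the standard identity $F = \frac{\eta^{2} / (k - 1)}{(1 - \eta^{2}) / (N - k)}$. The function names and toy groups are hypothetical. Applied to the rounded values reported above, the identity gives an $F$ close to, but not exactly, the reported statistic, as expected from rounding.

```python
def anova_eta_squared(groups):
    """Return (eta_squared, F) for a one-way layout given a list of groups."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = ss_total - ss_between
    eta_squared = ss_between / ss_total          # share of variance between groups
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return eta_squared, f_stat


def f_from_eta_squared(eta_squared, k, n):
    """Recover the one-way ANOVA F statistic from eta squared."""
    return (eta_squared / (k - 1)) / ((1.0 - eta_squared) / (n - k))


# Toy groups (illustrative only).
eta2, f_stat = anova_eta_squared([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]])
print(eta2, f_stat)

# Rounded values from the text: eta^2 = 0.144, k = 3 clusters, N = 1324 communes.
f_check = f_from_eta_squared(0.144, k=3, n=1324)
print(round(f_check, 1))
```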

4.3. Phase 2: Infrastructure Prediction

Table 6 compares seven model configurations. OLS with 16 engineered features achieves the best $R_{CV}^{2} = 0.274$, a 9.6% relative improvement over the five-feature baseline ($R_{CV}^{2} = 0.250$). The overfitting gap is minimal ($R_{full}^{2} = 0.293$ vs. $R_{CV}^{2} = 0.274$, a difference of 0.019), supporting the model’s stability for downstream analyses. Figure 10 presents prediction diagnostics.

Permutation importance (Figure 11) identifies waste collection and its proximity interaction as the most influential features. Partial dependence analysis using Random Forest (Figure 12) reveals threshold effects in wastewater evacuation (linearity $= 0.31$) and secure housing (linearity $= 0.71$), suggesting diminishing returns at high coverage levels.

4.4. Phase 3: Conversion Residuals

The conversion residuals (Equation (2)) have mean $+0.01$ pp (confirming out-of-fold calibration) and standard deviation 30.96 pp. Notably, 792 communes (59.8%) exhibit absolute residuals exceeding 20 pp, indicating widespread divergence between infrastructure endowment and water outcomes. Residual means are approximately zero within each cluster (ranging from $-0.1$ to $+0.1$ pp), confirming that the prediction model is not systematically biased by cluster membership.
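The summary statistics above can be sketched as follows, assuming the residual is observed access minus the out-of-fold prediction, in line with the sign convention that positive residuals indicate over-conversion. Function names and toy values are hypothetical.

```python
import statistics


def conversion_residuals(y_obs, y_oof):
    """Residual g_i = observed access minus out-of-fold predicted access (pp)."""
    return [obs - pred for obs, pred in zip(y_obs, y_oof)]


def share_exceeding(residuals, threshold_pp=20.0):
    """Share of communes whose absolute residual exceeds the threshold."""
    return sum(abs(g) > threshold_pp for g in residuals) / len(residuals)


# Toy values (illustrative only), in percentage points.
y_obs = [80.0, 30.0, 55.0, 10.0, 95.0]
y_oof = [50.0, 45.0, 60.0, 40.0, 70.0]
g = conversion_residuals(y_obs, y_oof)
print(statistics.mean(g), round(statistics.stdev(g), 2), share_exceeding(g))
```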

4.5. Phase 4: Three-Way Variance Decomposition

Table 7 presents the central finding of this study. Figure 13 visualizes the partition.

The within vs. between province diagnostic further sharpens the policy implications. Infrastructure features predict between-province variation ($R^{2} = 0.384$) substantially better than within-province variation ($R^{2} = 0.255$). This asymmetry implies that infrastructure features partly proxy for provincial development level: provinces with better roads, sanitation, and housing also tend to have higher water access. However, within a province, where communes share governance structures and hydrogeological conditions, infrastructure has less predictive power.

Province-level clustering of both water access and prediction residuals is strong ($I_{water} = 0.493$, $I_{residuals} = 0.486$), as shown in Figure 14. The near-identical clustering statistics before and after removing infrastructure effects indicate that the infrastructure variables account for little of the province-level clustering.

4.6. Phase 5: Inequality Decomposition

Theil-T inequality of water access ($T = 0.282$) decomposes into between-cluster and within-cluster components, as reported in Table 8 and visualized in Figure 15. The overwhelming dominance of within-cluster inequality (89.9%) shows that communes at the same capability level achieve vastly different outcomes.

4.7. Phase 6: Bottleneck Identification

Table 9 presents the counterfactual bottleneck analysis using the OLS model. Figure 16 visualizes the bottleneck matrix across clusters.

Waste collection dominates as the primary bottleneck because it has both the strongest water access correlation ($r = 0.415$) and the widest gap between the current distribution (median normalized value $= 0.015$) and the 90th-percentile target ($0.748$). However, the co-occurring analysis reveals that 48.6% of communes face multiple binding constraints. For High-Capability communes specifically, bottlenecks are distributed across distance (24%), electricity (27%), waste (30%), and housing (12%), indicating that the most developed communes face context-specific rather than uniform constraints.

4.8. Phase 7: Policy Prioritization

Table 10 presents the six-archetype policy matrix. Figure 17 visualizes the distribution.

5. Discussion

5.1. The Three-Way Decomposition as the Central Finding

The most important result of this study is the three-way variance decomposition: infrastructure explains 27.4% of water access variance, provincial context accounts for 34.1%, and commune-level unmeasured factors contribute 38.5%. This finding has three immediate implications.

First, infrastructure is necessary, but manifestly insufficient. The 27.4% share means that even the best cross-validated infrastructure model leaves nearly three-quarters of water access variation unexplained. This large residual is consistent with the theoretical expectation from Sen’s Capability Approach that conversion factors occupy a substantial share of outcome variance [7,8], although it also includes omitted infrastructure variables (e.g., water network age, storage capacity), measurement noise, and model limitations. The finding aligns with the persistent observation in development practice that infrastructure investment alone fails to close access gaps without complementary governance and institutional reform [5].

Second, provincial context is the largest identified (non-residual) component, exceeding the infrastructure share. The 34.1% share attributable to provincial context captures the systematic effects of administrative jurisdiction, including budget allocation, institutional capacity, water utility management, and hydrogeological conditions shared within provinces. This finding is consistent with Jiménez and Pérez-Foguet’s [6] emphasis on local government capacity and with the observation that Morocco’s decentralized water management creates substantial inter-provincial variation in service delivery effectiveness [5].

Third, commune-level factors remain the plurality. The 38.5% commune-level share represents an upper bound on the variance attributable to local factors. It is consistent with the presence of conversion factors that Sen’s framework predicts—community organization, local leadership, terrain-specific construction costs, seasonal water availability, and the quality of household-level connections—but may also include omitted infrastructure variables not available in the census, measurement error, and functional form limitations of the prediction model.

5.2. Within vs. Between Province Asymmetry

The finding that infrastructure predicts between-province variation ($R^{2} = 0.384$) better than within-province variation ($R^{2} = 0.255$) has a precise policy interpretation. Infrastructure features partly proxy for provincial development level: provinces with better roads, sanitation, and housing also tend to have higher water access. However, within a province, where communes share governance structures and hydrogeological conditions, infrastructure has less predictive power. This means that leveling infrastructure across provinces would reduce between-province inequality, but leave substantial within-province disparities driven by factors not captured in the infrastructure variables used here, potentially including local conversion factors, unmeasured infrastructure, and commune-specific conditions.

5.3. Conversion Efficiency as the Primary Inequality Driver

The Theil decomposition reveals that 89.9% of water access inequality occurs within capability tiers rather than between them. This finding is consistent with a core expectation of Sen’s Capability Approach: that inequality in functionings is driven more by differential conversion efficiency than by differential resource endowments [7]. However, within-tier inequality may also reflect heterogeneity in unmeasured infrastructure or local conditions not captured by the five clustering variables. The pattern intensifies at lower capability levels (Transitional Theil $= 0.431$ vs. High-Capability Theil $= 0.038$), indicating that outcome heterogeneity is greatest precisely where infrastructure is weakest.

5.4. Moderate–High vs. Transitional: Same Outcomes, Different Profiles

The Moderate–High and Transitional clusters achieve similar water outcomes (50.9% vs. 45.1%, Cohen’s $d = 0.16$), but differ dramatically in infrastructure profiles, particularly wastewater (normalized: 0.656 vs. 0.198). This divergence illustrates a key capability insight: clusters represent distinct infrastructure endowments, not functioning outcomes. Two communes with different infrastructure profiles can achieve similar water access if their local conditions (whether conversion factors, unmeasured infrastructure, or geographic advantages) compensate in different ways. This finding would be invisible in a framework that segments by outcomes rather than by inputs.
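As a reminder of how a standardized mean difference of this kind is obtained, the sketch below computes Cohen's $d$ with a pooled sample standard deviation. Whether the study used exactly this pooled variant is an assumption, and the toy values are illustrative only.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Toy water-access values for two clusters (illustrative only).
d = cohens_d([50.0, 60.0, 70.0], [40.0, 50.0, 60.0])
print(d)
```

A small $d$, such as the 0.16 reported above, means the cluster means differ by a small fraction of the pooled spread in outcomes.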

5.5. Bottleneck Multiplicity and Policy Implications

The finding that 48.6% of communes face multiple binding infrastructure constraints challenges the implicit assumption of many infrastructure programs that single-sector investment suffices. For High-Capability communes, bottlenecks are diversified across distance (24%), electricity (27%), and waste (30%), requiring context-specific investment packages. For lower-capability communes, waste collection dominates (>89%), reflecting the extremely low baseline of this service in rural Morocco (median $= 2.2\%$). Waste collection likely serves as a proxy for overall municipal service capacity: communes with organized waste management tend to have stronger institutional frameworks that support water service delivery [26].

The Diagnose archetype (133 communes, 10.0%) deserves particular attention. These communes possess infrastructure comparable to the High-Capability tier, but under-convert by 13 pp on average. Additional infrastructure investment is therefore unlikely to raise access; what is needed is an investigation of governance barriers, institutional failures, and service management problems. This archetype has no analogue in frameworks that segment only by infrastructure level, and its identification represents a concrete policy contribution of the capability-conversion approach.

5.6. From Statistical Diagnostics to Actionable Policy

The quantitative framework developed here goes beyond the general observation that “governance matters alongside infrastructure” by providing three operationally specific contributions. First, the three-way decomposition quantifies the relative magnitude of governance versus infrastructure constraints—34.1% versus 27.4%—giving planners an evidence base for allocating reform effort across sectors. Second, the six-archetype classification assigns each of the 1324 communes to a specific intervention category, enabling programmatic targeting at the commune level rather than blanket regional investment. Third, the bottleneck analysis identifies the specific infrastructure dimension most constraining each commune, preventing the waste of resources on non-binding constraints.

These contributions connect directly to established operational frameworks for water sector reform. The “Seven Starting Points” framework for sustainable local water management [43] identifies governance capacity, institutional sustainability, and adaptive management as prerequisites that infrastructure investment alone cannot satisfy. Our empirical decomposition provides quantitative evidence for this diagnosis: the 49.9% of communes classified as under-converting (Diagnose, Target, and Transform archetypes) represent precisely those communities where infrastructure-first approaches are likely to produce “sunk investments” without complementary governance reform. The framework thus provides a tool for ex ante identification of communes where the conditions for investment success are not yet met, enabling sequenced interventions that build governance capacity before or alongside infrastructure.

Concretely, the framework enables the following policy actions: (a) provincial authorities can rank communes within their jurisdiction by conversion efficiency, directing governance audits to the most under-converting communes; (b) national planners can allocate infrastructure budgets preferentially to Target communes (20.0% of the total) where conversion capacity already exists; and (c) comprehensive reform programs can be directed to Transform communes (19.9%) where both infrastructure and governance deficits are simultaneously binding. As subsequent census rounds become available, the framework can track whether communes transition between archetypes, providing a longitudinal measure of policy effectiveness.

5.7. Limitations

Several limitations should be acknowledged. First, the cross-sectional design precludes causal inference. The three-way decomposition describes variance structure, not causal pathways. Second, the commune-level residual share (38.5%) provides an upper bound on the variance attributable to conversion factors; it also contains omitted infrastructure variables, measurement error, functional form misspecification, and aggregation bias. The capability interpretation of residuals is a theoretically motivated framing, not a definitive attribution. Third, the three-way decomposition assumes sequential additivity: infrastructure and provincial effects are separated through a non-circular procedure (preventing information leakage), but they are not strictly orthogonal; infrastructure variables vary systematically across provinces, so partial overlap between components cannot be ruled out. Fourth, the silhouette coefficient of 0.324 indicates moderate cluster separation; the continuous nature of infrastructure variation means that sharp boundaries between capability tiers are inherently approximate. Fifth, the absence of commune-level geographic coordinates prevents true spatial autocorrelation analysis; our province-based clustering statistic is equivalent to a re-weighted intraclass correlation and should not be interpreted as geographic proximity effects. Sixth, the 27.4% infrastructure share may underestimate the true infrastructure contribution if important variables (e.g., water network age, storage capacity) are absent from the census. Seventh, the counterfactual bottleneck analysis identifies correlational constraints, not causal mechanisms; waste collection dominance may reflect broader municipal capacity rather than a direct causal link to water access. Eighth, results are specific to rural Morocco; generalization requires validation in other contexts.

6. Conclusions

This paper develops and applies a Capability-Conversion Diagnostic Framework to 1324 rural communes in Morocco using RGPH 2024 census data. The framework’s central contribution is a sequential three-way variance decomposition showing that infrastructure explains 27.4% of water access variance, provincial context accounts for 34.1%, and commune-level unmeasured factors contribute 38.5%. This quantification provides, to our knowledge, the first such estimate of the infrastructure–context–residual split for rural water access in a developing country.

The finding that nearly three-quarters of water access variance lies beyond measurable infrastructure is consistent with Sen’s theoretical prediction that resources are necessary but insufficient for functioning achievement, although the residual share also includes omitted variables and measurement limitations. The practical implication is clear: infrastructure investment alone cannot close rural water access gaps. Provincial governance reform (addressing the 34.1% provincial share) and commune-level capacity building (addressing the 38.5% local share) must accompany physical infrastructure programs.

The six-archetype policy matrix translates these findings into actionable typologies. The 263 Transform communes (19.9%) require comprehensive multi-sector intervention. The 133 Diagnose communes (10.0%) have adequate infrastructure but poor conversion, signaling governance rather than investment problems. The 265 Target communes (20.0%) need specific bottleneck-focused interventions. Together, these three under-converting archetypes account for 49.9% of rural communes—a substantial share for which conventional infrastructure-first approaches are unlikely to succeed.
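The shares quoted above follow directly from the commune counts, as the short check below illustrates. The counts and the total of 1324 communes are taken from the text; the variable names are hypothetical.

```python
N_COMMUNES = 1324  # rural communes in the study sample

# Under-converting archetypes and their commune counts, as reported in the text.
under_converting = {"Transform": 263, "Diagnose": 133, "Target": 265}

shares_pct = {name: round(100.0 * count / N_COMMUNES, 1)
              for name, count in under_converting.items()}
total_count = sum(under_converting.values())
total_pct = round(100.0 * total_count / N_COMMUNES, 1)

print(shares_pct)
print(total_count, total_pct)
```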

Several directions merit future investigation. Most importantly, longitudinal analysis as subsequent census data become available would enable tracking of capability transitions and policy impact assessment; for instance, comparing archetype assignments between RGPH 2024 and future census rounds would reveal whether targeted interventions successfully move communes from the Transform or Diagnose archetypes toward Sustain or Scale. Incorporation of governance indicators, hydrological data, and water utility performance metrics could reduce the unmeasured conversion factor share and sharpen the decomposition. Multilevel modeling with province random effects could more precisely decompose the governance contribution. Geographic expansion to other developing country contexts would test the framework’s generalizability. Integration with operational frameworks for sustainable water management [43] could translate the diagnostic output directly into sequenced intervention programs.

The Capability-Conversion Diagnostic Framework provides regional planners with an analytical tool that goes beyond asking “how much infrastructure is needed?” to asking “where does infrastructure fail to deliver, and why?” By estimating the residual variance share—an upper bound on the role of conversion factors—the framework directs attention toward the complementary governance and institutional interventions that physical infrastructure alone cannot replace.

Author Contributions

Conceptualization, Y.B. and R.H.; methodology, Y.B. and A.T.; software, A.T. and Y.B.; validation, A.T. and A.O.; formal analysis, Y.B. and A.T.; investigation, Y.B., R.H. and N.A.; resources, R.H. and A.O.; data curation, Y.B. and A.T.; writing—original draft preparation, Y.B.; writing—review and editing, A.T., R.H. and N.A.; visualization, A.T.; supervision, R.H.; project administration, R.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Census data are publicly available through the High Commission for Planning (HCP) of Morocco. The analytical code is available from the corresponding author upon request.

Acknowledgments

The authors thank the Moroccan provincial authorities for data access support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. World Health Organization; UNICEF. Progress on Household Drinking Water, Sanitation and Hygiene 2000–2022: Special Focus on Gender; WHO/UNICEF Joint Monitoring Programme: Geneva, Switzerland, 2023.
2. United Nations. Transforming Our World: The 2030 Agenda for Sustainable Development; Resolution A/RES/70/1; United Nations General Assembly: New York, NY, USA, 2015.
3. Haut-Commissariat au Plan. Recensement Général de la Population et de l’Habitat 2024: Résultats Préliminaires; HCP: Rabat, Morocco, 2024.
4. Office National de l’Électricité et de l’Eau Potable. Rapport Annuel 2023: Activité Eau Potable; ONEE: Rabat, Morocco, 2023.
5. Ministère de l’Équipement, du Transport, de la Logistique et de l’Eau. Programme National d’Approvisionnement en Eau Potable des Populations Rurales: Bilan et Perspectives; METLE: Rabat, Morocco, 2020.
6. Jiménez, A.; Pérez-Foguet, A. Building the role of local government authorities towards the achievement of the human right to water in rural Tanzania. Nat. Resour. Forum 2010, 34, 93–105.
7. Sen, A. Development as Freedom; Oxford University Press: Oxford, UK, 1999.
8. Robeyns, I. The Capability Approach: A theoretical survey. J. Hum. Dev. 2005, 6, 93–117.
9. Alkire, S.; Foster, J. Counting and multidimensional poverty measurement. J. Public Econ. 2011, 95, 476–487.
10. Hutton, G.; Haller, L. Evaluation of the Costs and Benefits of Water and Sanitation Improvements at the Global Level; World Health Organization: Geneva, Switzerland, 2004.
11. Hunter, P.R.; MacDonald, A.M.; Carter, R.C. Water supply and health. PLoS Med. 2010, 7, e1000361.
12. Pickering, A.J.; Davis, J. Freshwater availability and water fetching distance affect child health in sub-Saharan Africa. Environ. Sci. Technol. 2012, 46, 2391–2397.
13. Molle, F.; Tanouti, O. Squaring the circle: Agricultural intensification vs. water conservation in Morocco. Agric. Water Manag. 2017, 192, 170–179.
14. Devoto, F.; Duflo, E.; Dupas, P.; Parienté, W.; Pons, V. Happiness on Tap: Piped water adoption in urban Morocco. Am. Econ. J. Econ. Policy 2012, 4, 68–99.
15. Bain, R.E.S.; Gundry, S.W.; Wright, J.A.; Yang, H.; Pedley, S.; Bartram, J.K. Accounting for water quality in monitoring access to safe drinking-water as part of the Millennium Development Goals: Lessons from five countries. Bull. World Health Organ. 2012, 90, 228–235.
16. Foster, T.; Hope, R. A multi-decadal and social-ecological systems analysis of community waterpoint payment behaviours in rural Kenya. J. Rural Stud. 2017, 47, 357–369.
17. Marks, S.J.; Komives, K.; Davis, J. Community participation and water supply sustainability: Evidence from handpump projects in rural Ghana. J. Plan. Educ. Res. 2014, 34, 276–286.
18. Valcourt, N.; Walters, J.; Javernick-Will, A.; Linden, K.; Hacker, B. Understanding rural water services as a complex system: An assessment of key factors as potential leverage points for improved service sustainability. Sustainability 2020, 12, 1243.
19. Giné-Garriga, R.; Jiménez-Fdez de Palencia, A.; Pérez-Foguet, A. Water-sanitation-hygiene mapping: An improved approach for data collection at local level. Sci. Total Environ. 2013, 463–464, 700–711.
20. Marson, M.; Savin, I. Ensuring sustainable access to drinking water in sub-Saharan Africa: Conflict between financial and social objectives. World Dev. 2015, 76, 26–39.
21. Elliott, M.; MacDonald, M.C.; Chan, T.; Kearton, A.; Shields, K.F.; Bartram, J.; Hadwen, W.L. Multiple household water sources and their use in remote communities with evidence from Pacific Island countries. Water Resour. Res. 2017, 53, 9106–9117.
22. Coates, D.; Anand, P.; Norris, M. Housing, Happiness and Capabilities: A Summary of International Evidence and Models. Open Discuss. Pap. Econ. 2015, 81, 1–30.
23. Mehta, L. Water and human development. World Dev. 2014, 59, 59–69.
24. Mikou, M.; Rozenberg, J.; Koks, E.; Fox, C.; Quiros, T.P. Assessing rural accessibility and rural roads investment needs using open source data. World Bank Policy Res. Work. Pap. 2019, 8746, 1–47.
25. Abubakar, I.R. Factors influencing household access to drinking water in Nigeria. Util. Policy 2019, 58, 40–51.
26. Behera, B.; Rahut, D.B.; Sethi, N. Analysis of household access to drinking water, sanitation, and waste disposal services in urban areas of Nepal. Util. Policy 2020, 62, 100996.
27. Jain, A.K. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 2010, 31, 651–666.
28. Vyas, S.; Kumaranayake, L. Constructing socio-economic status indices: How to use principal components analysis. Health Policy Plan. 2006, 21, 459–468.
29. Filmer, D.; Pritchett, L.H. Estimating wealth effects without expenditure data—Or tears: An application to educational enrollments in states of India. Demography 2001, 38, 115–132.
30. Ezugwu, A.E.; Ikotun, A.M.; Oyelade, O.O.; Abualigah, L.; Agushaka, J.O.; Eke, C.I.; Akinyelu, A.A. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Eng. Appl. Artif. Intell. 2022, 110, 104743.
31. Nanda, S.J.; Panda, G. A survey on nature inspired metaheuristic algorithms for partitional clustering. Swarm Evol. Comput. 2014, 16, 1–18.
32. Maulik, U.; Bandyopadhyay, S. Genetic algorithm-based clustering technique. Pattern Recognit. 2000, 33, 1455–1465.
33. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey Wolf Optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
34. Mirjalili, S.; Lewis, A. The Whale Optimization Algorithm. Adv. Eng. Softw. 2016, 95, 51–67.
35. Celebi, M.E.; Kingravi, H.A.; Vela, P.A. A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst. Appl. 2013, 40, 200–210.
36. Morote, Á.F.; Hernández, M.; Rico, A.M. Causes of domestic water consumption trends in the city of Alicante: Exploring the links between the housing bubble, the types of housing and the socioeconomic factors. Water 2016, 8, 374.
37. Asian Development Bank. Balancing the Burden? Desk Review of Women’s Time Poverty and Infrastructure in Asia and the Pacific; ADB: Manila, Philippines, 2015.
38. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65.
39. Davies, D.L.; Bouldin, D.W. A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 1979, PAMI-1, 224–227.
40. Caliński, T.; Harabasz, J. A dendrite method for cluster analysis. Commun. Stat. 1974, 3, 1–27.
41. Arthur, D.; Vassilvitskii, S. k-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms; SIAM: New Orleans, LA, USA, 2007; pp. 1027–1035.
42. MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1967; Volume 1, pp. 281–297.
43. GROW. Seven Starting Points to Secure Water Services and Stop Sunk Investments; BMBF GROW Initiative: Bonn, Germany, 2019; Available online: https://bmbf-grow.de (accessed on 15 January 2026).

Figure 1. Conceptual framework mapping Sen’s Capability Approach to rural water access. Level 1 resources and infrastructure (left) are mediated through four capability dimensions—physical accessibility (C1), good health (C2), time and energy (C3), and living conditions (C4)—to produce the observed functioning of drinking water access (right). The theoretical framework encompasses eight candidate indicators; the final operationalization retains five after empirical screening (see Section 3.3.1).

Figure 2. Summary of core analytical results. (a) Capability-to-functioning relationship: aggregate capability score versus observed water access, with smoothed trend line. (b) Variance explained by three approaches: OLS prediction ($R_{CV}^{2} = 0.250$), best cross-validated model ($R_{CV}^{2} = 0.274$), and capability-tier clustering ($\eta^{2} = 0.144$). (c) Three-way variance decomposition: infrastructure (27.4%), provincial context (34.1%), and commune-level unmeasured factors (38.5%).

Figure 3. Correlation matrix of infrastructure features and water access. Waste collection and secure housing show the strongest positive associations with water access ($r > 0.40$). Feature inter-correlations are moderate, confirming that the five variables capture complementary dimensions.

Figure 4. Optimal k selection. (a) Internal validation metrics across $k = 2$–$9$. (b) Composite score for $k \geq 3$. (c) Elbow criterion (WCSS vs. k). $k = 3$ achieves the highest composite score (0.867). $k = 2$ is excluded to avoid trivial binary partitioning inconsistent with capability theory.

Figure 5. Capability radar profiles for three clusters. The High-Capability cluster dominates across all four dimensions. The Moderate–High and Transitional clusters differ sharply on the Good Health dimension (wastewater and waste collection), yet achieve similar water outcomes, demonstrating that clusters capture infrastructure profiles rather than functioning outcomes directly.

Figure 6. Water access distributions by cluster. (a) Mean access with standard deviation bars. (b) Box plots revealing extensive within-cluster variance, particularly for Moderate–High and Transitional clusters. The overlap between C2 and C3 (Cohen’s $d = 0.16$) reflects the capability framework’s prediction that similar infrastructure can produce widely different outcomes.

Figure 7. Cluster centroids heatmap combining Level 1 resource variables (normalized) and Level 2 capability dimensions. The High-Capability cluster scores highest across all dimensions, while the Transitional cluster shows a distinctive deficit in wastewater-related variables.

Figure 8. Principal component analysis visualization of clusters. Stars indicate centroids. Adjacent clusters show overlap, consistent with the continuous nature of infrastructure variation. The first two principal components jointly capture the majority of variance in the five infrastructure features.

Figure 9. Silhouette analysis for the $k = 3$ clustering solution. Mean silhouette coefficient $= 0.324$. The High-Capability cluster shows the strongest profile. Intermediate clusters exhibit moderate boundary ambiguity, reflecting the continuous nature of infrastructure variation across rural communes.

Figure 10. Infrastructure-to-water prediction. (a) Observed vs. predicted water access (out-of-fold), colored by cluster. The scatter around the identity line reflects the 72.6% of variance not explained by infrastructure features. (b) Residual distribution across clusters.

Figure 11. Permutation feature importance from the OLS model (30 repeats). Waste collection and its proximity interaction dominate, followed by housing and electricity interactions. Error bars show standard deviation across permutations.

Figure 12. Partial dependence profiles (Random Forest on five raw features, used for non-linearity detection). Wastewater evacuation (linearity $= 0.31$) and secure housing (linearity $= 0.71$) exhibit non-linear relationships with water access, suggesting threshold effects.

Figure 13. Three-way variance decomposition. Infrastructure explains 27.4%, provincial context accounts for 34.1%, and commune-level unmeasured factors contribute 38.5%. The decomposition is non-circular (no information leakage between components), although the components are not strictly orthogonal because infrastructure varies systematically across provinces.

Figure 14. Spatial structure analysis. Province-level clustering of water access ($I = 0.493$) and prediction residuals ($I = 0.486$) remain nearly identical, indicating that infrastructure variables explain variance roughly uniformly across provinces rather than differentially capturing spatial structure. Note: Province membership defines the weight matrix; these statistics quantify province-block clustering rather than geographic-distance-based spatial autocorrelation. *** denotes $p < 0.001$.

Figure 15. Theil inequality decomposition. (a) Between-cluster (10.1%) vs. within-cluster (89.9%) shares. (b) Lorenz curves by cluster. (c) Cluster-specific inequality metrics (Theil-T, Gini, CV) by capability tier. The Transitional cluster’s Lorenz curve deviates most from the equality line, confirming that conversion efficiency—not infrastructure endowment—drives the bulk of inequality.

Figure 16. Bottleneck matrix by cluster. (a) Primary bottleneck distribution (%). High-Capability communes have diversified bottlenecks (waste 30%, electricity 27%, distance 24%), while lower-capability communes are dominated by waste collection (>89%). (b) Binding frequency (gain $> 50\%$ of max), revealing co-occurring constraints. The chi-square test confirms significant association between cluster and bottleneck type ($\chi^{2}(8) = 560.5$, $p < 10^{-115}$).

Figure 17. Policy prioritization matrix. Each commune is positioned by capability level (horizontal) and conversion efficiency (vertical). The six archetypes correspond to distinct policy responses. Transform communes (19.9%, water $= 12.8\%$) require the most intensive intervention, while Diagnose communes (10.0%, water $= 67.3\%$) have adequate infrastructure but poor conversion, signaling governance rather than investment problems.

Table 1. Variable selection and mapping to capability dimensions.

Variable	Capability Dimension	Theoretical Justification	r with Water	Retained?
$V_{1}$ : Distance to road (log)	C1: Physical access	Service delivery costs	$- 0.326$ ***	Yes
$V_{2}$ : Wastewater evac.	C2: Good health	Environmental sanitation	$+ 0.208$ ***	Yes
$V_{3}$ : Waste collection	C2: Good health	Municipal service capacity	$+ 0.415$ ***	Yes
$V_{4}$ : Electricity access	C3: Time and energy	Pumping infrastructure	$+ 0.256$ ***	Yes
$V_{5}$ : Secure housing	C4: Living conditions	Dwelling quality	$+ 0.409$ ***	Yes
Household size	C3: Time and energy	Collection time burden	${+ 0.039}^{ns}$	No
Toilets	C2: Good health	Collinear with $V_{2}$	$r_{V_{2}} > 0.85$	No
Occupancy status	C4: Living conditions	Weak discriminant power	${+ 0.031}^{ns}$	No

Note: *** $p < 0.001$; ns = not significant. The five retained variables cover all four capability dimensions from the conceptual framework (Figure 1).

Table 2. Policy archetype definitions.

Archetype	Tier	Residual	Policy Logic
Sustain	High	$g_{i} > 0$	High infrastructure, effective conversion. Monitor and maintain existing systems; identify success factors for replication.
Diagnose	High	$g_{i} < 0$	High infrastructure, poor conversion. Investigate institutional barriers and governance failures; additional infrastructure investment would be wasteful.
Scale	Mod–High	$g_{i} > 0$	Moderate infrastructure, effective conversion. Replicate and expand what works; these communes demonstrate that modest infrastructure can deliver results with good governance.
Consolidate	Mod–High	$g_{i} < 0$	Moderate infrastructure, poor conversion. Strengthen institutional capacity alongside targeted infrastructure upgrades.
Target	Transitional	$g_{i} > 0$	Low infrastructure, effective conversion. Prioritize infrastructure investment where conversion capacity already exists.
Transform	Transitional	$g_{i} < 0$	Low infrastructure, poor conversion. Comprehensive multi-sector intervention combining infrastructure, governance reform, and community capacity building.
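
The tier-by-residual logic of Table 2 can be sketched as a simple lookup. This is an illustrative sketch, not the authors' implementation: the function name `assign_archetype`, the ASCII tier label `"Mod-High"`, and the decision to reject $g_{i} = 0$ (a case Table 2 does not define) are all assumptions made here.

```python
# Archetype lookup keyed by (capability tier, residual is positive), following Table 2.
# The tier labels are hypothetical ASCII spellings of the tiers named in the table.
ARCHETYPES = {
    ("High", True): "Sustain",
    ("High", False): "Diagnose",
    ("Mod-High", True): "Scale",
    ("Mod-High", False): "Consolidate",
    ("Transitional", True): "Target",
    ("Transitional", False): "Transform",
}


def assign_archetype(tier: str, residual: float) -> str:
    """Map a commune's tier and conversion residual g_i to a Table 2 archetype."""
    if residual == 0:
        # Table 2 only defines g_i > 0 and g_i < 0; the tie case is an open choice.
        raise ValueError("g_i = 0 is not covered by Table 2")
    return ARCHETYPES[(tier, residual > 0)]


# A high-tier commune that under-converts its infrastructure.
print(assign_archetype("High", -13.0))  # Diagnose
```

Keeping the mapping in a dictionary rather than nested conditionals makes the six cells of the matrix visible at a glance and easy to audit against the table.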

Table 3. Descriptive statistics and water access correlations ($N = 1324$ communes).

Variable	Mean	Std	Pearson r	Spearman $ρ$	p-Value
Distance to road (log)	—	—	$- 0.326$	$- 0.349$	< $10^{- 33}$
Wastewater evac. (%)	14.4	11.2	$+ 0.208$	$+ 0.214$	< $10^{- 13}$
Waste collection (%)	16.2	12.8	$+ 0.415$	$+ 0.385$	< $10^{- 55}$
Electricity (%)	93.9	4.2	$+ 0.256$	$+ 0.337$	< $10^{- 20}$
Secure housing (%)	38.4	17.8	$+ 0.409$	$+ 0.410$	< $10^{- 53}$
Water access (%)	55.3	36.3	—	—	—

Note: Distance to road reported after $\log(1 + x)$ transformation (skewness $8.2 \to 1.4$). Household size excluded ($r = 0.039$, $p = 0.161$). Data span 39 provinces and nine macro-regions.

Table 4. Cluster validation metrics across candidate k values.

k	Silhouette	DBI	CH	WCSS	Composite
2	0.432	0.960	1044	203	(excluded)
3	0.324	1.123	951	149	0.867
4	0.304	1.125	839	125	0.606
5	0.295	1.112	775	108	0.522
6	0.308	1.083	763	93	0.691
7	0.297	1.128	723	85	0.426
8	0.269	1.166	694	77	0.079
9	0.266	1.173	660	72	0.000

Note: $k = 2$ excluded from composite scoring. Composite $= 0.4 \cdot \widetilde{Sil} + 0.3 \cdot \widetilde{DBI}^{-1} + 0.3 \cdot \widetilde{CH}$. The selected optimal cluster count is $k = 3$, which has the highest composite score (0.867).
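
The composite score can be recomputed from the tabulated metrics as a check. The note does not spell out how the tilde terms are normalized, so the sketch below assumes min-max scaling over $k = 3$ to $9$, with the DBI term inverted as (max − DBI)/(max − min) so that lower DBI scores higher. Under that assumption the rounded inputs closely reproduce the Composite column (for example, 0.867 for $k = 3$ and 0.000 for $k = 9$); the helper name `min_max` is introduced here for illustration.

```python
# Validation metrics for k = 3..9, copied from Table 4 (k = 2 is excluded).
# Each tuple is (silhouette, DBI, CH).
metrics = {
    3: (0.324, 1.123, 951),
    4: (0.304, 1.125, 839),
    5: (0.295, 1.112, 775),
    6: (0.308, 1.083, 763),
    7: (0.297, 1.128, 723),
    8: (0.269, 1.166, 694),
    9: (0.266, 1.173, 660),
}


def min_max(values):
    """Rescale values linearly to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


ks = sorted(metrics)
sil = min_max([metrics[k][0] for k in ks])
# DBI is "lower is better": negating before scaling gives (max - DBI) / (max - min).
dbi_inv = min_max([-metrics[k][1] for k in ks])
ch = min_max([metrics[k][2] for k in ks])

composite = {
    k: 0.4 * s + 0.3 * d + 0.3 * c
    for k, s, d, c in zip(ks, sil, dbi_inv, ch)
}
best_k = max(composite, key=composite.get)

for k in ks:
    print(k, round(composite[k], 3))
print("selected k:", best_k)
```

Small differences from the table in the third decimal for some $k$ are expected, because the inputs here are already rounded.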

Table 5. Cluster characteristics ($k = 3$, $N = 1324$).

Cluster	n	%	Water (%)	Std (%)	Avg Cap.	Wastewater
High Capability	297	22.4	80.5	19.9	0.79	0.656
Moderate–High	524	39.6	50.9	35.3	0.57	0.656
Transitional	503	38.0	45.1	37.9	0.48	0.198

Note: Clusters ordered by descending mean water access. Wastewater column (normalized) highlights the key infrastructure dimension differentiating Moderate–High from Transitional (0.656 vs. 0.198) despite similar water outcomes.
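
The similarity of water outcomes between the Moderate–High and Transitional clusters can be quantified from the summary statistics in Table 5. The sketch below assumes the pooled-standard-deviation form of Cohen's $d$ (the source does not state which variant was used); with the tabulated means, standard deviations, and cluster sizes it gives the value of 0.16 reported in the Figure 6 caption.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Moderate-High vs. Transitional water access (%), values from Table 5.
d = cohens_d(50.9, 35.3, 524, 45.1, 37.9, 503)
print(round(d, 2))  # 0.16
```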

Table 6. Model comparison ($5 \times 3$ repeated cross-validation).

Model	$R_{CV}^{2}$	${MAE}_{CV}$
OLS (5 raw)	$0.250 \pm 0.048$	$0.268 \pm 0.009$
OLS (16 eng.)	$0.272 \pm 0.055$	$0.260 \pm 0.009$
Ridge (16 eng.)	$0.268 \pm 0.049$	$0.263 \pm 0.009$
RF (5 raw)	$0.261 \pm 0.056$	$0.260 \pm 0.010$
RF (16 eng.)	$0.253 \pm 0.059$	$0.259 \pm 0.010$
GBR (16 eng.)	$0.237 \pm 0.060$	$0.261 \pm 0.009$
ExtraTrees (16 eng.)	$0.271 \pm 0.050$	$0.260 \pm 0.009$

Note: Best model selected by mean $R_{CV}^{2}$. Out-of-fold $R_{CV}^{2} = 0.274$ (single ten-fold run for residual computation). All downstream analyses use out-of-fold predictions exclusively.
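
To illustrate what "out-of-fold" means here, the following minimal sketch predicts every observation from a model that never saw it during fitting, then computes $R^{2}$ and residuals from those predictions. It is a simplified stand-in, not the paper's pipeline: it uses a single synthetic feature and closed-form simple regression instead of the 16 engineered features, and all function names and data are hypothetical.

```python
import random


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def out_of_fold_predictions(xs, ys, n_folds=10, seed=0):
    """Predict each observation from a model fitted without its fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    preds = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = a + b * xs[i]
    return preds


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic stand-in data (not the census data): one feature plus noise.
rng = random.Random(42)
x = [rng.uniform(0, 1) for _ in range(300)]
y = [0.2 + 0.8 * xi + rng.gauss(0, 0.3) for xi in x]

oof = out_of_fold_predictions(x, y)
residuals = [yi - pi for yi, pi in zip(y, oof)]  # analogue of conversion residuals
r2_cv = r_squared(y, oof)
print(f"out-of-fold R^2 = {r2_cv:.3f}")
```

Because each residual comes from a model fitted without that observation, downstream statistics built on the residuals do not reuse information from the fitting step.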

Table 7. Three-way variance decomposition of water access ($N = 1324$).

Source	Share (%)	Estimation Method
Infrastructure (measurable)	27.4	$R_{CV}^{2}$ (out-of-fold)
Provincial context	34.1	ICC of residuals $\times (1 - R_{CV}^{2})$
Commune-level unmeasured	38.5	Remainder
Total	100.0

Note: Components are non-circular by construction (no information leakage), although infrastructure and provincial effects are not strictly orthogonal. ICC of prediction residuals at the province level $= 0.469$ ($F = 30.7$, $p < 10^{-148}$). See Equations (3)–(5).
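
The arithmetic behind Table 7 can be traced from the two reported quantities. The sketch below applies the estimation methods listed in the table to the rounded values of $R_{CV}^{2}$ and the ICC; with these rounded inputs the provincial and commune shares come out at about 34.0% and 38.6%, which agrees with the reported 34.1% and 38.5% up to rounding of the inputs. Variable names are illustrative.

```python
r2_cv = 0.274  # out-of-fold R^2 (infrastructure share), from Table 7
icc = 0.469    # province-level ICC of prediction residuals, from the note

infrastructure = r2_cv
provincial = icc * (1 - r2_cv)               # province share of total variance
commune = 1 - infrastructure - provincial    # remainder: unmeasured commune-level factors

for name, share in [
    ("infrastructure", infrastructure),
    ("provincial context", provincial),
    ("commune-level unmeasured", commune),
]:
    print(f"{name}: {100 * share:.1f}%")
```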

Table 8. Theil inequality decomposition by cluster.

Cluster	Theil-T	Gini	CV
High Capability	0.038	0.128	0.248
Moderate–High	0.297	0.397	0.694
Transitional	0.431	0.473	0.842
Overall	0.282	0.371	—
Between-cluster	0.029 (10.1%)	—	—
Within-cluster	0.254 (89.9%)	—	—

Note: Within-cluster inequality is lowest for High Capability (Theil $= 0.038$) and highest for Transitional (Theil $= 0.431$), indicating that conversion efficiency is most variable where infrastructure is weakest.
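
The between/within split in Table 8 relies on the additive decomposability of the Theil-T index. The sketch below shows the standard decomposition, in which each group is weighted by its share of the total; the data are hypothetical values chosen for illustration (not the census data), and the function names are assumptions.

```python
import math


def theil_t(values):
    """Theil-T index; terms with x = 0 contribute 0 (limit of x * ln x)."""
    mu = sum(values) / len(values)
    return sum((x / mu) * math.log(x / mu) for x in values if x > 0) / len(values)


def theil_decomposition(groups):
    """Split overall Theil-T into between- and within-group parts.

    groups: dict mapping a group label to a list of non-negative values
    (each group is assumed to have a positive mean).
    """
    all_values = [x for vals in groups.values() for x in vals]
    n = len(all_values)
    mu = sum(all_values) / n
    between = within = 0.0
    for vals in groups.values():
        mu_g = sum(vals) / len(vals)
        share = (len(vals) * mu_g) / (n * mu)  # group's share of the total
        between += share * math.log(mu_g / mu)
        within += share * theil_t(vals)
    return theil_t(all_values), between, within


# Hypothetical water-access percentages for three small groups (illustration only).
groups = {
    "High Capability": [95.0, 80.0, 70.0, 85.0],
    "Moderate-High": [90.0, 20.0, 55.0, 40.0],
    "Transitional": [5.0, 85.0, 30.0, 60.0],
}
total, between, within = theil_decomposition(groups)
print(f"total={total:.4f} between={between:.4f} within={within:.4f}")
print(f"within share = {100 * within / total:.1f}%")
```

The between and within terms sum exactly to the overall index, which is what allows shares such as those in Table 8 to be read as percentages of total inequality.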

Table 9. Infrastructure bottleneck analysis ($N = 1324$).

Feature	Primary (%)	Binding (%)	Mean Gain (pp)
Waste collection	77.1	86.6	+20.7
Electricity	8.2	19.0	+4.8
Secure housing	7.5	20.2	+3.5
Distance (log)	5.5	22.5	+7.4
Wastewater evac.	1.7	10.8	+5.2

Note: Primary = feature yielding the largest counterfactual gain. Binding = gain exceeds 50% of the maximum gain (co-occurring constraint). In total, 48.6% of communes have ≥2 binding features.
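
The primary and binding definitions in the note can be expressed as a short classification rule applied to each commune's counterfactual gains. The sketch below is illustrative only: the gains are hypothetical numbers, the function and feature names are assumptions, and it presumes the largest gain is positive.

```python
def classify_bottlenecks(gains, threshold=0.5):
    """Return (primary feature, binding features) from counterfactual gains.

    Primary: feature with the largest gain.
    Binding: every feature whose gain exceeds `threshold` times the maximum gain,
    so the primary feature is itself binding when the maximum gain is positive.
    """
    primary = max(gains, key=gains.get)
    max_gain = gains[primary]
    binding = [f for f, g in gains.items() if g > threshold * max_gain]
    return primary, binding


# Hypothetical counterfactual gains (percentage points) for one commune.
gains = {
    "waste_collection": 20.0,
    "electricity": 4.0,
    "secure_housing": 3.0,
    "distance": 12.0,
    "wastewater": 5.0,
}
primary, binding = classify_bottlenecks(gains)
print(primary, binding)  # waste_collection ['waste_collection', 'distance']
```

A commune like this one, with two binding features, falls among those counted as having co-occurring constraints.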

Table 10. Policy prioritization matrix (six archetypes).

Archetype	n	%	Water	Gap	Recommended Intervention
Sustain	164	12.4	91.2	$+ 10.8$	Monitor and maintain existing systems
Diagnose	133	10.0	67.3	$- 13.0$	Identify institutional barriers; governance reform
Scale	240	18.1	80.4	$+ 31.5$	Replicate successful conversion; expand infrastructure
Consolidate	259	19.6	82.1	$+ 30.3$	Strengthen trajectory; prevent regression
Target	265	20.0	20.3	$- 29.5$	Address specific conversion bottlenecks
Transform	263	19.9	12.8	$- 29.0$	Comprehensive multi-sector intervention

Note: Gap = mean conversion residual (pp). Positive = over-converting; negative = under-converting relative to infrastructure endowment.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
