This study employs a mixed-methods quantitative design to examine public perceptions, attitudes, and WTP for a proposed stormwater management fee in Norwalk, Connecticut. The research design integrates survey-based data collection with descriptive statistical analysis and machine learning techniques to identify the demographic, attitudinal, and perceptual factors that shape policy acceptance. The following sections detail the methodological approach, including details about the survey instrument, the analytical framework guiding data processing and modeling, and statistical techniques used to assess sample characteristics and explore key predictors of support.
Figure 1 presents the study framework, illustrating how stormwater management context, resident perceptions, and analytical methods are linked to WTP outcomes. Together, these methods provide the empirical foundation for understanding how socio-economic context, perceived benefits and risks, and individual beliefs influence public support for a local stormwater management fee.
2.1. Study Area: City of Norwalk
Norwalk is a mid-sized coastal city in southwestern Connecticut with a population of approximately 93,000 residents. Situated along the Long Island Sound, it maintains a combination of inland watersheds and coastal drainage systems. The city experiences a generally humid climate with increasing precipitation intensity, resulting in recurring stormwater challenges associated with heavy rainfall, localized inland flooding, and coastal storm surge. These meteorological conditions place sustained pressure on municipal drainage infrastructure and contribute to water quality impacts in local waterways and coastal receiving waters.
Land use in Norwalk is predominantly urban and residential, with a mix of dense neighborhoods, commercial corridors, transportation infrastructure, and coastal development zones. A substantial portion of the city’s built environment predates contemporary stormwater design standards, contributing to high levels of impervious surface coverage and limited on-site stormwater retention. Runoff from residential, commercial, and transportation areas represents a significant component of stormwater volume and pollutant loading, reinforcing the need for system-wide management strategies rather than site-specific controls alone.
Within this context, stormwater management is a locally important issue for residents because coastal and inland flooding directly affects neighborhoods and quality of life. Flooding presents a regular threat to property and to socially vulnerable populations, as a significant portion of the flood zone overlaps economically distressed areas of the City. These physical conditions provide an important backdrop for evaluating public perceptions of stormwater infrastructure investment and fee-based financing mechanisms, as residents’ WTP is shaped not only by socio-economic characteristics but also by lived experience with flooding risks and infrastructure performance.
2.2. Survey Design and Data Collection
The survey instrument was designed in collaboration with municipal officials to align with the City of Norwalk’s policy and communication objectives. Before distribution, the instrument underwent pilot testing in May 2024 with a group of residents (n = 87). This sample size was chosen in consideration of established survey methodology for instrument pretesting and was sufficient to evaluate question clarity, logical sequencing, internal consistency, and coherence between research objectives and data requirements. Minor adjustments were made following pilot feedback, including revisions to item wording, response options, and survey flow to enhance measurement validity and reduce nonresponse bias [24, 25]. The final questionnaire was structured into five thematic sections designed to capture the range of factors influencing public perceptions of a potential stormwater management fee:
Demographic Characteristics: Age, gender, household income, educational attainment, race/ethnicity, housing tenure, neighborhood, and length of residency were collected to characterize the respondent population and support analysis of socio-demographic influences on policy preferences.
Personal Beliefs and Attitudes: Respondents rated their agreement with statements on municipal spending priorities, environmental protection, and social equity, as well as their perceptions of government effectiveness. A ranking task asked participants to prioritize economic development, environmental improvement, and equity initiatives for potential government spending.
Perceived Benefits: Participants evaluated the potential positive impacts of a stormwater fee, including improved infrastructure, enhanced climate resilience, reduced flood risk, and improved water quality. These items were preceded by a short, neutrally worded informational video to ensure a consistent baseline understanding of stormwater management concepts.
WTP: This section assessed respondents’ financial support thresholds under two framings: a modest fee presented as a general concept, and a contextualized scenario referencing existing, locally relevant stormwater fees in Connecticut.
Perceived Risks: Respondents rated concerns about potential negative impacts, including increased living costs, disproportionate effects on low-income households and small businesses, limited environmental effectiveness, and insufficient adaptation outcomes.
Qualitative insights were drawn from open-ended survey responses and a follow-up discussion with municipal staff involved in stormwater management. Responses were reviewed using a thematic approach to identify recurring concerns related to affordability, transparency, fairness, and perceived benefits. These qualitative findings were used to contextualize the quantitative results rather than to support formal inference.
Sampling, Distribution, and Data Processing
The survey targeted adult residents (aged 18 and older) in Norwalk, Connecticut, with the aim of capturing a broad sample of the city’s population. It was conducted over a three-month period from late May through July 2024, representing a cross-sectional snapshot of public perceptions during the early policy design phase. Recruitment employed a multimodal strategy to maximize accessibility and representation. The survey was distributed electronically via the City’s email distribution list and promoted through official social media channels. To increase participation among traditionally underrepresented neighborhoods, physical mailers and informational flyers were distributed on public buses, and the U.S. Postal Service’s targeted-mailing tool was used to reach geographic areas with historically low survey response rates.
A total of 457 residents participated voluntarily after providing informed consent. Survey participation was open to the general public, and sample characteristics were compared with American Community Survey (ACS) benchmarks to contextualize representativeness and guide interpretation. As the survey relied on open distribution channels, an exact response rate could not be calculated, but the total number of responses was comparable to prior municipal survey efforts. All participants were required to confirm their residency within the City of Norwalk to ensure sample validity. Responses were collected using Google Forms and exported into a master dataset for cleaning and preprocessing. Incomplete, duplicate, or invalid entries were removed prior to analysis. Categorical and Likert-scale responses were converted into numerical codes to facilitate statistical analysis. The resulting dataset formed the basis for descriptive statistics, inferential testing, and predictive modeling. Statistical analyses were conducted using SPSS version 29, and machine-learning applications, including Random Forest classification, were implemented in Python 3.10 using the scikit-learn library [26, 27]. Participation in the study was anonymous, and all procedures complied with institutional ethical standards for research involving human subjects. Generative AI tools were not used in the development of this study or manuscript.
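The cleaning and coding steps described above can be illustrated with a minimal sketch. The field names, the response labels, and the rule for detecting duplicates are assumptions made for illustration; the study's actual export format and cleaning rules are not specified here.

```python
# Hypothetical five-point agreement scale; the survey's exact labels are assumed.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def clean_and_code(rows, likert_fields):
    """Drop incomplete, invalid, and duplicate rows, then map labels to 1-5."""
    seen = set()
    cleaned = []
    for row in rows:
        # Incomplete: a required field is missing or blank.
        if any(not row.get(field) for field in likert_fields):
            continue
        # Invalid: a label that is not on the expected scale.
        if any(row[field] not in LIKERT_CODES for field in likert_fields):
            continue
        # Duplicate: an identical submission was already kept (assumed rule).
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        coded = dict(row)
        for field in likert_fields:
            coded[field] = LIKERT_CODES[row[field]]
        cleaned.append(coded)
    return cleaned


# Illustrative rows only; these are not study data.
raw = [
    {"timestamp": "t1", "wtp_modest": "Agree", "flood_benefit": "Strongly agree"},
    {"timestamp": "t1", "wtp_modest": "Agree", "flood_benefit": "Strongly agree"},
    {"timestamp": "t2", "wtp_modest": "", "flood_benefit": "Neutral"},
    {"timestamp": "t3", "wtp_modest": "Disagree", "flood_benefit": "Maybe"},
    {"timestamp": "t4", "wtp_modest": "Strongly disagree", "flood_benefit": "Neutral"},
]
coded = clean_and_code(raw, ["wtp_modest", "flood_benefit"])
```

In this toy input, the second row is a repeat of the first, the third is incomplete, and the fourth carries an off-scale label, so two coded rows remain.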
2.3. Analytical Framework
The analytical framework for this study was developed to investigate the socio-demographic, perceptual, and attitudinal determinants of public support for a proposed stormwater management fee and to evaluate how advanced machine learning techniques can complement traditional statistical approaches. The primary objective was to capture both the structural characteristics of the survey sample and the behavioral dynamics underlying WTP with particular attention to the latent psychological and perceptual variables that shape policy acceptance.
The analysis proceeded in two distinct but complementary stages. In the first stage, descriptive and inferential statistics were calculated to summarize the survey respondents’ key demographic and socio-economic characteristics. These measures provided an initial profile of the sample’s composition and allowed for comparison with the broader population of the City of Norwalk. Variables, including age, gender, income, education, housing tenure, and race/ethnicity, were compared against population benchmarks. Measures of central tendency (mean, median) and distribution (frequency, percentage) were calculated to profile the sample. Inferential statistical tests, specifically chi-square goodness-of-fit analyses, were conducted to evaluate whether observed differences between the survey sample and population benchmarks were statistically significant within the contexts of demographic groups and their relative WTP.
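A chi-square goodness-of-fit comparison of this kind can be sketched as follows. The counts and benchmark shares are invented for illustration and are not the study's data; the analysis itself was run in SPSS, so this Python version is only an equivalent sketch. The critical value of 5.991 is the standard tabulated threshold for two degrees of freedom at the 0.05 level.

```python
def chi_square_gof(observed_counts, benchmark_shares):
    """Return the chi-square statistic and its degrees of freedom.

    observed_counts: respondents per category in the sample.
    benchmark_shares: population proportions per category (summing to 1).
    """
    total = sum(observed_counts)
    statistic = 0.0
    for observed, share in zip(observed_counts, benchmark_shares):
        expected = total * share  # count expected if the sample matched the benchmark
        statistic += (observed - expected) ** 2 / expected
    return statistic, len(observed_counts) - 1


# Hypothetical three-category variable (illustrative numbers only).
observed = [60, 90, 50]
benchmark = [0.25, 0.50, 0.25]

statistic, dof = chi_square_gof(observed, benchmark)

CRITICAL_VALUE_DF2_ALPHA_05 = 5.991  # standard chi-square table value
differs_from_benchmark = statistic > CRITICAL_VALUE_DF2_ALPHA_05
```

With these illustrative inputs the statistic is 3.0 on two degrees of freedom, which is below the critical value, so the hypothetical sample would not differ significantly from the benchmark.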
In addition to demographic analysis, a correlation matrix was constructed to map the attitudinal landscape and examine interrelationships among key perceptual variables, including environmental concern, equity considerations, perceived government effectiveness, and WTP. This exploratory analysis provided insight into how attitudes cluster and co-vary within the sample, offering contextual understanding of public perceptions prior to subsequent modeling. An ordered logistic regression using representative perceptual variables and demographic controls was estimated as a baseline benchmark for comparison with the Random Forest results. This combined approach, integrating descriptive profiling, inferential testing, and attitudinal landscape mapping, reflects a widely accepted methodological standard in environmental economics and policy research for identifying participation biases, establishing baseline sample characteristics, and contextualizing subsequent analytical findings [28, 29].
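As an illustration of the correlation matrix step, the sketch below computes pairwise Pearson coefficients for a few coded variables. The choice of Pearson's coefficient is an assumption, since the type of correlation used is not stated; a rank-based coefficient may be preferable for ordinal items. The variable names and scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations, keyed by variable name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Illustrative coded responses (1-5) for six hypothetical respondents.
scores = {
    "environmental_concern": [5, 4, 4, 2, 1, 3],
    "gov_effectiveness": [4, 4, 3, 2, 2, 3],
    "wtp": [5, 4, 3, 2, 1, 3],
}
matrix = correlation_matrix(scores)
```

Each entry `matrix[a][b]` can then be read as the strength and direction of co-variation between two attitudinal measures.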
In the second stage, a supervised machine learning model, specifically a Random Forest classifier, was implemented to assess the relative influence of individual predictors on WTP and to detect complex interactions among variables that are often missed by conventional statistical models. WTP was measured using a five-point Likert scale ranging from “strongly disagree” to “strongly agree” and was modeled directly as a multi-class classification outcome. Separate Random Forest classifiers were estimated for the modest conceptual fee and the specific example fee amount, allowing the model to capture variation in the intensity of support across response categories and compare how predictors influenced WTP under different payment contexts.
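The multi-class setup can be sketched with scikit-learn as follows. The data are synthetic, and the predictor names, the simulated response process, and the hyperparameter values are assumptions for illustration only; they do not reproduce the study's dataset or final settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-ins for predictors (hypothetical, not study data).
X = np.column_stack([
    rng.normal(size=n),          # perceived benefits index
    rng.normal(size=n),          # perceived risks index
    rng.integers(1, 6, size=n),  # income bracket coded 1-5
])


def simulate_wtp(shift):
    """Simulate a five-point response coded 1 to 5 from a noisy latent score."""
    latent = X[:, 0] - X[:, 1] + shift + rng.normal(scale=0.5, size=n)
    # Cut the latent score into five ordered categories.
    return np.digitize(latent, bins=[-1.5, -0.5, 0.5, 1.5]) + 1


# One outcome per payment context, each modeled by its own classifier.
targets = {
    "modest_fee": simulate_wtp(0.3),
    "example_fee": simulate_wtp(-0.3),
}

models = {}
for name, y in targets.items():
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    models[name] = model.fit(X, y)

# Class probabilities show the predicted intensity of support per respondent.
probabilities = models["modest_fee"].predict_proba(X[:5])
```

Fitting one classifier per outcome keeps the two payment contexts separate, so predictor rankings can be compared between them.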
This two-stage analytical design integrates the strengths of classical statistical methods, particularly their ability to describe sample characteristics, and leverages the enhanced explanatory power of machine learning. The framework aligns with established theoretical models in environmental behavior, public goods provision, and policy acceptance, which emphasize that support for environmental initiatives is shaped by a multidimensional interplay of beliefs, perceived outcomes, perceived risks, and behavioral intentions. Such integration of descriptive, inferential, and computational approaches provides a comprehensive basis for examining the drivers of public WTP and for developing more targeted, socially responsive stormwater management policies.
2.3.1. Latent Variables and Model Structure
A core objective of this study was to identify and measure latent constructs underlying psychological and attitudinal dimensions that, while not directly observable, exert significant influence over WTP for environmental policies. These constructs were derived from four thematic sections of the survey instrument, which were purposefully designed to capture key determinants of public support for stormwater management.
Personal Beliefs: The first section of the survey measured respondents’ orientations toward municipal spending priorities, environmental protection, and social equity, as well as their trust in government institutions and sense of civic duty. In this context, institutional trust was defined as confidence that local authorities will manage resources transparently, deliver promised outcomes, and act in the public interest. Similarly, civic responsibility captures internalized social norms around collective action and environmental stewardship, which are shown to enhance support for public goods and sustainability initiatives [30].
Perceived Benefits: The second section assessed respondents’ expectations of positive outcomes associated with a stormwater fee, including improved infrastructure, reduced flood risk, enhanced water quality, and increased climate resilience. These items operationalize the construct of environmental efficacy, defined as the belief that a policy will achieve its stated objectives. Perceived efficacy is a critical mediator between ecological concern and behavioral intention, with higher efficacy perceptions consistently linked to stronger public support [31].
Perceived Risks: The third section focused on potential drawbacks, such as financial burden, equity implications, and doubts about the policy’s effectiveness or credibility. These measures capture latent perceptions of fairness, encompassing both distributive fairness (whether costs are shared equitably across demographic groups) and procedural fairness (whether decision-making processes are transparent and inclusive) [32].
WTP: The final section collected data on WTP directly as a behavioral intention variable by asking respondents to indicate their financial support thresholds for a proposed stormwater fee, first as the general concept of a monthly fee and then as a specific example amount. These variables were intended to provide actionable insight in line with behavioral intention models such as the Theory of Planned Behavior (TPB), which link attitudes and perceptions to concrete policy support and WTP.
For each latent construct, individual Likert-scale items were first coded in a consistent direction and standardized to ensure comparability across measures. Composite indices were then constructed by averaging the standardized item scores within each construct, with equal weighting applied to all items to preserve interpretability and avoid imposing assumptions about relative item importance. Reliability was assessed using Cronbach’s alpha, with all indices exceeding the conventional 0.70 threshold (Table 1), indicating strong internal consistency [33]. These composite indices were subsequently used as independent variables alongside demographic and socio-economic data in the Random Forest classification model to evaluate their relative influence on WTP and identify the most relevant factors in citizen decision-making. Although exploratory factor analysis is often used to identify latent dimensions in survey data, it was not applied here because the analysis focused on demonstrating how Random Forest models can be used to examine nonlinear relationships and conditional patterns in WTP, rather than on reducing dimensionality or validating latent constructs. Given this objective, the ordinal nature of the Likert-scale items is well-suited to tree-based methods, which can accommodate nonlinear effects and interactions without requiring the assumptions or transformations associated with factor analysis.
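The index construction and reliability check can be sketched as follows. The item scores are invented, the reverse-coding rule assumes a 1 to 5 scale, and alpha is computed here on the direction-aligned raw items; whether the study computed it on raw or standardized items is not stated.

```python
from statistics import mean, pstdev, variance


def reverse_code(scores, scale_max=5):
    """Flip a negatively worded item so higher always means more agreement."""
    return [scale_max + 1 - s for s in scores]


def standardize(scores):
    """Convert scores to z-scores (mean 0, standard deviation 1)."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


def composite_index(items):
    """Equal-weight average of standardized items, one value per respondent."""
    z_items = [standardize(item) for item in items]
    return [mean(row) for row in zip(*z_items)]


def cronbach_alpha(items):
    """Cronbach's alpha from item variances and the variance of the sum score."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical construct with three items and six respondents.
item_a = [5, 4, 4, 2, 1, 3]
item_b = [4, 4, 5, 2, 2, 3]
item_c = reverse_code([1, 2, 2, 4, 5, 3])  # negatively worded item

items = [item_a, item_b, item_c]
index = composite_index(items)
alpha = cronbach_alpha(items)
```

An alpha above 0.70 for a construct would meet the conventional threshold referenced above; the resulting `index` values are what would enter the classifier as a single predictor.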
A Random Forest classifier was selected as the primary analytical tool to explore the determinants of WTP for a stormwater management fee. This ensemble-based approach was chosen because of its capacity to model complex, nonlinear relationships among characteristics typical of public perception and behavioral intention data in an environmental policy context [34, 35, 36, 37]. Traditional econometric techniques such as logistic regression or ordinary least squares are often constrained by assumptions of linearity, independence, and additivity among predictors. These limitations can obscure important behavioral dynamics and interactions between socio-demographic variables, perceptions, and latent constructs [38, 39, 40, 41].
Random Forests, by contrast, build an ensemble of decision trees on bootstrapped dataset samples and combine their results through majority voting or averaging. This ensemble approach reduces overfitting, improves predictive accuracy, and allows the model to capture subtle interactions among latent variables that would otherwise remain undetected [42, 43]. In addition, tree-based machine learning methods such as Random Forest do not rely on assumptions of linearity, normality, or interval scaling. As a result, ordinal Likert-scale variables can be incorporated directly, with model splits based on relative ordering rather than distributional properties. This makes Random Forest models well-suited for social survey data that integrate demographic, attitudinal, and perceptual measures [44, 45, 46].
The Random Forest model was trained using a set of predictors that included both observable characteristics (e.g., age, income, education, housing tenure, race/ethnicity) and latent indices derived from the four thematic survey dimensions. Random Forest hyperparameters were selected through an iterative tuning process guided by model performance and interpretability. Key parameters, including the number of trees, maximum tree depth, and minimum samples per leaf, were adjusted incrementally, with model performance evaluated using out-of-bag (OOB) error estimates and cross-validation. Parameter values were refined until further adjustments produced minimal improvements in classification accuracy, with final settings selected to balance predictive performance and minimize overfitting rather than to maximize model complexity.
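One way to organize such tuning is sketched below. It substitutes a small exhaustive grid for the incremental adjustment described above, uses synthetic data from `make_classification`, and picks candidate values arbitrarily; none of these reflect the study's actual search or final settings.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic three-class data standing in for the survey dataset.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, n_classes=3, random_state=0
)

# Candidate values are illustrative assumptions.
grid = {
    "n_estimators": [100, 200],
    "max_depth": [4, 8],
    "min_samples_leaf": [1, 5],
}

results = []
for n_trees, depth, leaf in product(*grid.values()):
    model = RandomForestClassifier(
        n_estimators=n_trees,
        max_depth=depth,
        min_samples_leaf=leaf,
        oob_score=True,  # score each tree on the rows left out of its bootstrap
        random_state=0,
    )
    model.fit(X, y)
    results.append(((n_trees, depth, leaf), 1.0 - model.oob_score_))

# Keep the setting with the lowest out-of-bag error.
best_params, best_oob_error = min(results, key=lambda item: item[1])

# Confirm the chosen setting with 5-fold cross-validation.
best_model = RandomForestClassifier(
    n_estimators=best_params[0],
    max_depth=best_params[1],
    min_samples_leaf=best_params[2],
    random_state=0,
)
cv_accuracy = cross_val_score(best_model, X, y, cv=5).mean()
```

Comparing OOB error across settings and then checking the chosen setting with cross-validation mirrors the stopping rule described above: tuning ends once further changes yield negligible gains.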
2.3.2. Feature Importance and Interpretation
Model interpretation focused on assessing the relative importance of individual predictors in shaping WTP. Feature importance scores, calculated as the average reduction in impurity across all decision trees, were used to rank variables according to their contribution to predictive performance [47]. This approach enabled the identification of the most influential demographic, perceptual, and attitudinal factors without imposing restrictive parametric assumptions.
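To make the impurity-reduction idea concrete, the sketch below computes the decrease for a single split. Gini impurity is assumed because it is the default criterion in scikit-learn; the criterion used in the study is not stated, and the labels are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Hypothetical WTP labels at one node, split on some predictor.
parent = [1, 1, 2, 2, 4, 4, 5, 5]
left = [1, 1, 2, 2]   # respondents below the split threshold
right = [4, 4, 5, 5]  # respondents at or above it

decrease = impurity_decrease(parent, left, right)
```

Here the split lowers impurity from 0.75 to a weighted 0.50, a decrease of 0.25. In scikit-learn, the `feature_importances_` attribute of a fitted forest aggregates such decreases for each predictor, weighted by the share of samples reaching each node, and normalizes them to sum to one.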
Overall model performance was validated using cross-validation and OOB error estimation to ensure generalizability and reduce overfitting. Accuracy, precision, recall, and the F1 score were calculated to assess predictive performance. Feature importance measures were used as an explainable machine learning tool to support transparent interpretation of model predictions. These steps align with best practices in applied machine learning for policy analysis, where transparent reporting of model validation metrics is essential for interpretability and reproducibility [48].
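For a five-class outcome, precision, recall, and F1 must be averaged across classes. The sketch below uses macro averaging (an unweighted mean over classes), which is an assumption because the averaging scheme is not specified; the true and predicted labels are invented.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    k = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1_scores) / k,
    }


# Hypothetical held-out labels and predictions on the 1-5 scale.
y_true = [1, 2, 3, 3, 4, 5, 5, 2]
y_pred = [1, 2, 3, 4, 4, 5, 4, 2]

metrics = macro_metrics(y_true, y_pred)
```

Macro averaging gives each response category equal weight, which matters when some categories (for example, "strongly disagree") are rare; a support-weighted average would instead favor the most common categories.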
It is important to emphasize that feature importance should not be interpreted as evidence of causality. Rather, it identifies the variables most strongly associated with WTP and provides direction to weight decision factors within their context rather than assessing them as independent variables. By triangulating these findings with descriptive statistics and inferential tests, the analysis provides a more comprehensive and nuanced understanding of the behavioral dynamics underlying support for the stormwater management fee.