Review

Artificial Intelligence Application in Nonpoint Source Pollution Management: A Status Update

1
Office of International Agriculture Programs, Florida Agricultural and Mechanical University, 1740 S Martin Luther King Jr Blvd, Tallahassee, FL 32307, USA
2
Biological Systems Engineering, Florida Agricultural and Mechanical University, Tallahassee, FL 32307, USA
*
Author to whom correspondence should be addressed.
Sustainability 2025, 17(13), 5810; https://doi.org/10.3390/su17135810
Submission received: 1 February 2025 / Revised: 13 May 2025 / Accepted: 1 June 2025 / Published: 24 June 2025
(This article belongs to the Special Issue AI Application in Sustainable MSWI Process)

Abstract

Artificial intelligence (AI) has the potential to significantly advance the management of nonpoint source pollution (NPSP), a critical environmental issue characterized by diffuse sources and complex transport mechanisms. This study systematically examines current AI applications addressing NPSP through bibliometric and systematic analyses. A total of 124 studies were included after rigorous identification, screening, and eligibility assessments based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) framework. Key findings from the bibliometric analysis include publication trends, regional research contributions, author and journal contributions, and core concepts in NPSP. The systematic analysis further provided: (a) a comprehensive synthesis of NPSP characterization, covering pollution sources, key drivers, pollutants, transport pathways, and environmental impacts; (b) identification of emerging AI technologies such as the Internet of Things, unmanned aerial vehicles, and geographic information systems, and their potential applications in NPSP contexts; (c) a detailed classification of AI models used in NPSP assessment, highlighting predictors, predictands, and performance metrics specifically in water quality prediction and monitoring, groundwater vulnerability mapping, and pollutant-specific modeling; and (d) a critical assessment of knowledge gaps categorized into AI model development and validation, data constraints, governance and policy challenges, and system integration, alongside proposed targeted future research directions emphasizing adaptive governance, transparent AI modeling, and interdisciplinary collaboration. The findings from this study provide essential insights for researchers, policymakers, environmental managers, and communities aiming to implement AI-driven strategies to mitigate NPSP.

1. Introduction

Artificial intelligence (AI) has emerged as a transformative technology with applications across various domains, such as climate change modeling [1], hydrometeorological forecasting (e.g., temperature and precipitation) [2,3], waste and water resource management [4,5,6], and pollution detection [7]. Through its ability to process and analyze vast and complex datasets, AI supports predictive analytics, pattern recognition, and decision-support systems, enabling more effective mitigation strategies [8,9]. For example, Zulkifli et al. [10] developed AI-based predictive models for pollutant loads and their impact on water bodies, allowing for proactive planning and targeted pollution control measures. By analyzing complex relationships between critical factors (e.g., slope, land cover, runoff, rainfall intensity), AI can assist in optimizing management strategies, such as identifying the most effective land use practices or suggesting the location of best management practices to minimize pollutant runoff [9,11]. Given its potential, AI is increasingly being explored as a solution to one of the most persistent and complex environmental challenges: nonpoint source pollution (NPSP).
NPSP has emerged as one of the most critical threats to both ecosystems and the management of water resources [12,13,14]. NPSP refers to pollution that originates from multiple diffuse sources, such as land surfaces or the atmosphere, rather than a single, identifiable point. It is typically carried by rainfall or snowmelt as it moves over or through the ground, eventually entering lakes, rivers, wetlands, coastal waters, and underground water resources. The source and magnitude of pollution cannot be accurately identified. This makes it a leading threat to water quality and one of the most challenging forms of water pollution to manage globally [15,16]. From a scientific and technical perspective, NPSP refers to the diffuse discharge of pollutants into the environment from multiple sources such as agricultural lands, urban stormwater runoff, and atmospheric deposition, making it inherently challenging to monitor and mitigate effectively. Agriculture is currently a major contributor to NPSP with activities such as livestock and poultry breeding, field irrigation, and excessive use of pesticides and fertilizers [17]. Despite thorough exploration and diligent efforts, viable and scalable solutions for addressing NPSP remain elusive [12,18]. Researchers have embraced interdisciplinary research methods integrating AI and associated technologies (e.g., Internet of Things (IoT), drones, remote sensing, Geographic Information System (GIS), satellites) to address the systemic complexity of NPSP and support more adaptive management strategies [19,20,21,22]. However, despite growing interest in AI applications in NPSP research, their practical implementation remains limited.
To date, while AI has been widely applied in environmental pollution research, only a limited number of review studies have specifically addressed its applications in NPSP. These review articles are summarized in Figure 1 and Supplementary Table S1. They vary in format, including bibliometric, systematic, comprehensive, and narrative methods, and either cover a wide range of pollution types, from air and water to agricultural and multi-environmental contexts, or adopt a broad approach without directly referencing NPSP. For example, Cabaneros et al. [23] reviewed standalone AI models for air pollution forecasting, while Fan et al. [24] assessed hybrid artificial neural network (ANN) models for pollutant removal in water treatment. Hussain et al. [25] examined the role of AI, remote sensing, and unmanned aerial vehicles (UAVs) in supporting data-driven mitigation strategies in agricultural pollution. Zulkifli et al. [10] focused on AI-based detection of contaminants in water supply systems using wireless sensor networks. Tiyasha et al. [26] traced the evolution of AI models for river water quality modeling, highlighting a shift toward hybrid and ensemble techniques. Similarly, Wong et al. [27] explored the use of back propagation neural network (BPNN) and support vector machine (SVM) in analyzing pollutant interactions across air, soil, and water in mangrove ecosystems. Ye et al. [28] investigated AI applications in wastewater treatment and early warning systems, while Bhatt et al. [29] evaluated deep learning (DL) models for real-time river pollution control. Additionally, bibliometric reviews by Guo et al. [30] and Li et al. [31] have explored AI trends in air pollution and machine learning in groundwater pollution research, respectively.
Despite the growing number of AI-related environmental reviews, several critical aspects of NPSP remain underexplored. There is a need to synthesize and clarify the following:
  • The AI methods applied in NPSP and the metrics used to evaluate their performance.
  • The technical trade-offs associated with current AI methods, including data availability and quality, computational complexity, and model interpretability.
  • How the integration of AI with supporting technologies (remote sensing, IoT, and GIS) enhances the scalability, efficiency, and precision of NPSP monitoring and control.
  • The key knowledge gaps, unresolved challenges, and emerging opportunities that can inform future research, policy development, and real-world implementation of AI-based NPSP solutions.
Therefore, this study aims to address these gaps through a combination of bibliometric analysis and systematic review of AI applications in NPSP, guided by the following objectives:
  • To provide a status update on the current structure and evolution of AI applications in NPSP research.
  • To provide a function-based classification of AI models, mapping their inputs, outputs, and roles in solving specific NPSP challenges such as nutrient loading, runoff prediction, and source identification.
  • To offer a comparative synthesis of commonly used AI techniques (e.g., ANN, SVM, and hybrid models), assessing their respective strengths and limitations in terms of data requirements, computational costs, scalability, and interpretability.
  • To explore how AI can be enhanced through integration with remote sensing, IoT, GIS, and other enabling technologies, improving its applicability for real-time, large-scale NPSP management.
  • To identify and categorize critical knowledge gaps, including (a) AI model development, optimization, and validation, (b) data limitations and monitoring challenges, (c) governance, policy, and social dimensions, and (d) system integration (IoT, remote sensing, GIS) and to propose targeted directions for future research, emphasizing adaptive governance, transparent model development, and interdisciplinary solutions.
Table 1 shows a list of abbreviations.

2. Materials and Methods

2.1. Framework Adoption

The methodology adopted in this study followed a structured approach aligned with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards. Using this methodology, key findings were systematically identified and extracted to clarify the interconnection between AI and NPSP.
The PRISMA method was chosen due to its transparent and rigorous procedures that document the steps of the review process, making it easy for others to replicate, verify, and build upon [34]. The PRISMA checklist includes critical items such as details on the specific sources searched, the search strategies employed, eligibility criteria for including or excluding studies, and assessment of study quality based on internal and external validity. Most importantly, the PRISMA framework provides a flow diagram visually depicting the main phases of identification, screening, eligibility, and inclusion of studies (Figure 2). This flow diagram facilitates obtaining a set of high-quality, relevant articles that can be relied upon to represent the current state of research on NPSP accurately.

2.2. Database Sourcing and Search Strategy

This section constitutes the identification phase of PRISMA. The studies were identified through a systematic search from the “Web of Science” and “ScienceDirect” databases, chosen for their multidisciplinary coverage of peer-reviewed academic sources. An iterative process was used to develop a comprehensive search string that captured all relevant studies that addressed the topic in the title, abstract, or keywords. The first search string, conducted in January 2023 on the Web of Science database, consisted of the following combinations connected by the Boolean operator “AND”: Artificial intelligence AND Nonpoint source pollution, Machine learning AND Nonpoint source pollution, Data mining AND Nonpoint source pollution, Expert systems AND Nonpoint source pollution, Predictive modeling AND nonpoint source pollution, and Deep learning AND Nonpoint source pollution. Each combination yielded the following number of results: eight studies for “Artificial intelligence AND Nonpoint source pollution”, 24 studies for “Machine learning AND Nonpoint source pollution”, 53 studies for “Data mining AND Nonpoint source pollution”, 18 studies for “Expert systems AND Nonpoint source pollution”, 40 studies for “Predictive modeling AND nonpoint source pollution”, and three studies for “Deep learning AND Nonpoint source pollution.” In total, 146 unique, English-language, peer-reviewed articles were identified from this search. A second search was conducted in March 2025 to enhance the quality of the study by incorporating more recent publications. This follow-up search, using the keywords “nonpoint source pollution” combined with “artificial intelligence” (using the quotation strategy), was performed on the ScienceDirect database and was limited to studies published in 2024 and 2025. A total of 56 studies were identified from this second search, bringing the total number of studies to 202 (146 + 56). To evaluate whether variations in our search terms would yield significantly different article pools, we also conducted a sensitivity analysis; more details are presented in Supplementary Table S2.
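As a quick consistency check on the identification phase, the per-query hit counts reported above can be tallied in a few lines. This is only a minimal sketch: the dictionary and function names are illustrative and not part of the original workflow, and the counts are copied directly from the text.

```python
# Hit counts per query string, as reported for the January 2023 Web of Science search.
WOS_HITS = {
    "Artificial intelligence AND Nonpoint source pollution": 8,
    "Machine learning AND Nonpoint source pollution": 24,
    "Data mining AND Nonpoint source pollution": 53,
    "Expert systems AND Nonpoint source pollution": 18,
    "Predictive modeling AND nonpoint source pollution": 40,
    "Deep learning AND Nonpoint source pollution": 3,
}

# Records reported for the March 2025 ScienceDirect follow-up search.
SCIENCEDIRECT_HITS = 56


def total_identified(first_search_hits, second_search_hits):
    """Return (first-search total, combined total) of identified records."""
    first_total = sum(first_search_hits.values())
    return first_total, first_total + second_search_hits


wos_total, grand_total = total_identified(WOS_HITS, SCIENCEDIRECT_HITS)
print(f"Web of Science: {wos_total}; both searches: {grand_total}")
```

The two totals correspond to the 146 and 202 (146 + 56) records stated in the text.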

2.3. Eligibility Criteria and Screening

This section corresponds to the second and third phases (Screening and Eligibility) of the PRISMA framework. Each retrieved article was carefully reviewed to determine its suitability for inclusion in the systematic review. Eligibility was assessed based on the inclusion and exclusion criteria outlined in Table 2, with a focus on the article’s relevance, methodological rigor, and alignment with the study’s objectives. In addition to meeting the predefined general criteria such as language, document type, and keyword relevance, each article was evaluated for scientific rigor and completeness. Quality assessment was based on the clarity of research objectives, transparency in methodology (e.g., model design, training, and validation), data availability, and peer-review status. Particular attention was given to whether the study presented original methods, empirically tested models, or comparative evaluations of AI techniques. Studies lacking robust methodology, reproducible outcomes, or adequate explanation of AI application were excluded. This ensured that only high-quality, relevant, and empirically sound studies were included in the final synthesis.
The screening phase, conducted according to the eligibility criteria, consisted of two distinct stages. The initial stage involved evaluating study titles, abstracts, and keywords, focusing on the presence of the key search terms “artificial intelligence” and “nonpoint source pollution.” This initial screening excluded 21 (10 + 11) irrelevant studies, yielding 181 (136 + 45) potentially relevant articles for further review. Nine of these were duplicates, and an additional five articles were inaccessible in full text. Consequently, 167 (122 + 45) distinct records remained for full-text assessment. The second stage comprised a more in-depth evaluation of full-text articles based on the eligibility criteria previously discussed and led to the exclusion of 43 (15 + 28) additional articles.

2.4. Study Selection Process

This section addresses the inclusion phase of PRISMA. It encompasses 124 (107 + 17) studies that passed the identification, screening, and eligibility phases. These studies satisfied the predefined eligibility criteria and were analyzed and incorporated into the synthesis. Conversely, 39 articles from the first (Web of Science) search were excluded at various stages for different reasons. The specific reasons for exclusion at each phase were documented to ensure transparency in the systematic review process. Specifically, nine articles were removed as duplicates found during the database searches. Another ten studies were screened out because their titles, abstracts, and keywords did not relate to the designated scope of applying AI techniques to monitor and address NPSP. After a full-text review, 15 articles lacked sufficient data or were unrelated to the core subject matter. Lastly, five studies had to be excluded because the full texts were unavailable even after exhaustive attempts to retrieve them from various sources, preventing an adequate evaluation. The remaining 39 exclusions (11 at title, abstract, and keyword screening and 28 at full-text review) came from the second (ScienceDirect) search.
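The record flow through the PRISMA phases can be retraced with simple arithmetic. The sketch below is illustrative only: the `Stream` class and its field names are hypothetical, and assigning all nine duplicates and all five inaccessible articles to the first search is an inference from the reported subtotals (136 + 45 becoming 122 + 45), not an explicit statement in the text.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Record counts for one database search (field names are illustrative)."""
    identified: int
    screened_out: int        # excluded on title, abstract, and keywords
    duplicates: int
    no_full_text: int
    full_text_excluded: int

    def after_screening(self) -> int:
        return self.identified - self.screened_out

    def assessed(self) -> int:
        return self.after_screening() - self.duplicates - self.no_full_text

    def included(self) -> int:
        return self.assessed() - self.full_text_excluded


# Counts taken from the text; the split of duplicates and inaccessible
# articles between the two searches is inferred (assumption).
web_of_science = Stream(146, 10, 9, 5, 15)
science_direct = Stream(56, 11, 0, 0, 28)
streams = [web_of_science, science_direct]

for label, step in [("after screening", Stream.after_screening),
                    ("full-text assessed", Stream.assessed),
                    ("included", Stream.included)]:
    parts = [step(s) for s in streams]
    print(f"{label}: {sum(parts)} ({parts[0]} + {parts[1]})")
```

Under these assumptions the three printed totals match the 181, 167, and 124 records reported for the corresponding phases.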

2.5. Data Extraction and Preparation

Data for this review were collected through both automated and manual extraction methods. Bibliometrix/Biblioshiny (www.bibliometrix.org (accessed on 17 June 2025)), an online data analysis framework based on the shiny package in R software (Version 4.2.3) and loaded via install.packages(“bibliometrix”) and library(bibliometrix), was utilized to extract bibliometric data (authors, date and number of publications, citations, journals, and country affiliation) [35]. Additionally, VosViewer (www.vosviewer.com (accessed on 17 June 2025)), which features a text mining function [36], was used for keyword extraction. Keywords with a frequency of five or more were selected to support co-occurrence and thematic analyses. These high-frequency terms were treated as indicators of each article’s core content and research focus.
To complement the automated extraction, a structured data extraction spreadsheet was also developed to manually gather content-specific information from each study systematically. This included detailed entries on the type and classification of AI models used, supporting technologies, their specific applications to NPSP, the pros and cons of the models, associated technical or implementation challenges, and knowledge gaps identified by the authors. Each article was reviewed manually to ensure consistency, accuracy, and comparability across all data entries.
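The keyword threshold described in this section (a frequency of five or more) amounts to a simple counting filter. The following is a minimal sketch of that idea rather than the procedure VosViewer or Bibliometrix actually runs: the function name, the case-insensitive matching, the once-per-article counting rule, and the toy keyword lists are all assumptions made for illustration.

```python
from collections import Counter


def frequent_keywords(keyword_lists, min_count=5):
    """Count each keyword once per article and keep those meeting the threshold."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and whitespace; the set avoids double counting
        # a keyword that an article lists twice.
        counts.update({k.strip().lower() for k in keywords})
    return {k: c for k, c in counts.items() if c >= min_count}


# Hypothetical author-keyword lists, one list per article.
articles = (
    [["Machine learning", "Nonpoint source pollution"]] * 5
    + [["GIS", "machine learning"]] * 2
)

selected = frequent_keywords(articles, min_count=5)
print(selected)
```

In this toy input, "gis" appears in only two articles and is therefore dropped, while the two keywords that reach the threshold are retained.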

2.6. Data Analysis

The extracted data were analyzed through a combination of Bibliometrix/Biblioshiny and VosViewer. Bibliometrix and Biblioshiny, which run on the shiny package in R software (Version 4.2.3), facilitated the exploration of diverse facets of this study [35]. This included annual publication trends, collaborative patterns among authors from different nations, scientific productivity of countries and authors, and themes associated with NPSP. The tool also has a text mining feature for keyword extraction, which supports keyword analysis, and it reports the number of citations. The keywords present in a study serve as a concise summary and refinement of its core content. To gain insights into the research trends and directions in AI applications in NPSP, keywords with a frequency of five or greater were drawn as a word cloud.
A publication trend analysis was conducted using a simple linear regression equation (Equation (1)) and the coefficient of determination (R²) (Equation (2)) to quantify and model the temporal trend in the number of AI-based NPSP publications.
$$y = ax + b \tag{1}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2} \tag{2}$$
where $y$ is the number of publications, $x$ is the year, $a$ is the slope, $b$ is the y-intercept, $n$ is the number of observations, $\bar{Y}$ is the average number of publications, $\hat{Y}_i$ is the predicted number of publications for a given year, and $Y_i$ is the actual number of publications for that year.
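To make Equations (1) and (2) concrete, the sketch below fits a least-squares line and computes the coefficient of determination in plain Python. It is an illustration under stated assumptions, not the authors' script: the function name is hypothetical, the yearly counts are invented, and indexing the years as 1, 2, 3, ... is a choice made here for convenience.

```python
def fit_linear_trend(x, y):
    """Ordinary least-squares fit of y = a*x + b, returning (a, b, R^2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    a = sxy / sxx                      # slope, Equation (1)
    b = y_mean - a * x_mean            # intercept, Equation (1)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_res / ss_tot           # Equation (2)
    return a, b, r2


# Hypothetical yearly publication counts (not the study's data).
year_index = list(range(1, 11))
publications = [0, 1, 1, 2, 2, 4, 3, 6, 5, 8]

a, b, r2 = fit_linear_trend(year_index, publications)
print(f"y = {a:.4f}x + {b:.4f}; R2 = {r2:.4f}")
```

An R² close to 1 would indicate that a straight line explains most of the year-to-year variation, whereas lower values suggest the growth is irregular or nonlinear.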
Multiple correspondence analysis was also conducted using the shiny package in R software (Version 4.2.3) to illustrate the interconnections among the keywords, condensing complex data with multiple variables into a two-dimensional representation. This approach enables the exploration of relationships between keywords, identifying co-occurrence patterns, and revealing underlying themes within the bibliometric data [35]. The distance between the points on the resulting graph reflects the similarity between the keywords, and keywords closer to the central point are those that have received significant attention in recent years [35,37].
A classification tree of the AI models used in NPSP studies was created using the “collapsibletree” package in RStudio (Version 2024.12.1+563), based on the learning methods and algorithm purposes defined by Mukhamediev et al. [38]. Spatial distributions of the studies across different regions were mapped using Bibliometrix to look for geographic patterns.

2.7. Summarizing and Reporting of the Results

The findings of this study were illustrated using a range of visualization techniques. Bar graphs effectively visualized the chronological evolution of publication output and the journals in which they were published. Network maps depicted interactions among authors and co-authors, while geographical maps provided insight into international collaboration patterns. A thematic map and word cloud were employed to dissect and illustrate the core themes of AI-focused NPSP research. A collapsible tree diagram was created using the RStudio software (Version 2024.12.1+563) to elucidate AI classifications, and customized figures were employed to explain the PRISMA framework and application of AI in NPSP and the challenges that emerged from the selected studies. This array of visualization techniques helps communicate the study’s findings in a readily understandable manner.

3. Results

3.1. Bibliometric Analysis

3.1.1. Trends in Scientific Studies on AI in NPSP

The application of AI techniques to address NPSP has progressed over the past few decades, as NPSP continues to be a widespread environmental challenge [39]. Peer-reviewed studies published between 1995 and 2022 were used in the trend analysis. As illustrated in Figure 3, the amount of AI-based research in NPSP has grown substantially since 1995. The fitted line (y = 0.5402x − 4.0476; R² = 0.4907) demonstrates a moderate growth in publication rate over time. Notably, over 50% of the studies were published during the final six years of this period (2017–2022). The surge in publications indicates a growing interest in utilizing AI to address environmental issues associated with NPSP, particularly as more robust data sources and computational methods become available.

3.1.2. Country Productivity and Collaboration

Figure 4 illustrates both the global distribution of publications and the collaborative efforts among countries in the field of AI applications for NPSP. China leads significantly with 29.4% of the total publications, followed by India (11.2%) and the United States (8.6%), reflecting their dominant research presence. However, the findings also highlight a notably limited contribution from regions such as South America and Africa, revealing a critical gap in global research efforts that could affect these regions’ ability to tackle local pollution challenges effectively. Beyond individual country productivity, Figure 4 reveals important patterns of international collaboration. The thick connector line between China and the United States, representing seven joint publications, indicates a strong bilateral synergy in applying AI to NPSP problems. Similarly, Iran’s collaborative links with Vietnam (five), China (three), and Australia (three) demonstrate an active role in global research networks. These partnerships highlight the importance of scientific cooperation in advancing technological solutions to diffuse pollution. Such collaborations foster knowledge exchange, enhance model development, and build capacity across regions with varying levels of expertise. Encouraging broader international partnerships, especially involving underrepresented continents (e.g., South America and Africa), is essential to ensure a more inclusive and globally responsive approach to AI-based NPSP management.

3.1.3. Author, Citation, and Source Analysis

This authorship analysis examines the contributions of individual authors to the AI-focused NPSP literature. Figure 5 highlights the most prolific authors within the selected research articles, with Liu H. leading in publication volume (4 publications), followed by Liu Y. and Zhang A., tied for second place (3 publications each). Although these authors have collaborated on multiple research papers, they primarily served as co-authors, reflecting cooperation, close contact, and mutual interest in the specific topic. Figure 5 also depicts numerous clusters, each representing a group of researchers focusing on a particular application of AI in NPSP. However, only three of the nine clusters reveal significant research collaboration. The most extensive collaboration clusters are centered around Liu H. and Zhang A., who collaborated with six other authors. The first cluster, led by Liu H., focuses on studies about water quality monitoring using satellite and machine learning [20], mitigation of fertilizer-based pollution [40], characteristics of NPSP research [41], performance and trends in NPSP modeling research [42], air pollution [43], and nonpoint pollutant sources management [19]. The second cluster, led by Liu Y., focuses on studies related to pollution in drinking water source areas [44], river pollution under the rainfall-runoff impacts [45], prediction of urban water quality [46], and heavy metal contamination concentration in surface waters [47]. The third cluster, led by Zhang A., predominantly involves studies on data-driven machine learning in environmental pollution [48,49,50]. These clusters highlight these authors’ focused research efforts and collaborations in AI applications in NPSP.
The paper titled “A comprehensive analysis of AI models for river water quality modeling: 2000–2020” by Tiyasha et al. [26] stands out as the most frequently cited article among the chosen papers for this review, with 414 citations (Figure 6a). The significant citations reflect this study’s influence on the NPSP field. The paper, titled “A Review of ANN Models for Ambient Air Pollution Prediction”, authored by Cabaneros et al. [23], holds the second position with a total of 283 citations. A comparison of these papers across both raw citation counts and citations per year (CPY) reveals consistent dominance by Tiyasha et al. [26], which leads with 414 raw citations and an impressive 69.0 CPY. This robust performance across both metrics underscores the paper’s foundational significance and enduring relevance in the field. Cabaneros et al. [23] similarly maintain their strong position, accumulating 283 raw citations and averaging 40.4 CPY, solidifying their sustained impact since publication, although at a slightly slower rate than the top-ranked paper. However, this alignment between raw citations and CPY is not consistent across all top-performing articles, emphasizing the importance of considering publication age in assessing influence. For example, Qu et al. [47], while ranking third in raw citations with 242, occupy the fifth position when considering citations per year (30.3 CPY). This suggests a steady but perhaps less rapid accumulation of citations over its 8-year lifespan. Conversely, Asha 2022 [51], despite being sixth in raw citations with 137, demonstrates a more rapid recent impact by moving up to third place in citations per year (34.3 CPY). This rapid ascent in CPY for a relatively newer paper (published in 2022) highlights its immediate relevance and significant contemporary influence, indicating that it is quickly becoming a key reference in the field despite having less time to accumulate raw citations. 
This distinction between cumulative impact (raw citations) and current scholarly attention (citations per year) provides a more nuanced understanding of a paper’s evolving influence within the research landscape.
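The contrast between raw citations and citations per year (CPY) can be reproduced with a short calculation. The sketch below is illustrative only: the citation counts are taken from the text, the 8-year age for Qu et al. [47] is stated in the text, and the other ages (6, 7, and 4 years) are assumptions inferred by dividing the reported citations by the reported CPY. Only four of the top-cited papers are included, so the ranks printed here are relative to this subset rather than to the full list in Figure 6a.

```python
def citations_per_year(citations, years_since_publication):
    """Average number of citations accumulated per year since publication."""
    if years_since_publication <= 0:
        raise ValueError("years_since_publication must be positive")
    return citations / years_since_publication


# (raw citations, age in years). Ages other than Qu et al.'s are inferred.
papers = {
    "Tiyasha et al. [26]": (414, 6),
    "Cabaneros et al. [23]": (283, 7),
    "Qu et al. [47]": (242, 8),
    "Asha 2022 [51]": (137, 4),
}

cpy = {name: citations_per_year(c, age) for name, (c, age) in papers.items()}

by_raw = sorted(papers, key=lambda name: papers[name][0], reverse=True)
by_cpy = sorted(cpy, key=cpy.get, reverse=True)

for name in by_cpy:
    print(f"{name}: {papers[name][0]} citations, {cpy[name]:.1f} CPY")
```

Within this subset, the newer paper overtakes Qu et al. [47] once citations are normalised by age, which is the same reordering the paragraph above describes.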
This study also investigates the sources of the publications addressing AI in NPSP (Figure 6b). A catalog of environmental science and water management journals addressing issues related to NPSP was explored. Figure 6b highlights the top 10 journals that have significantly contributed to the field by publishing influential articles over the past decade. Environmental Science and Pollution Research (8 publications; Impact Factor (IF): 5.8), based in Germany, emerged as the most prominent journal in this area in terms of publication volume. Among the journals with the highest impact factors, Water Research (5 publications; IF: 11.5) from the United Kingdom, along with the Journal of Cleaner Production (6 publications; IF: 9.8) and Science of the Total Environment (6 publications; IF: 8.2) from the Netherlands, stand out for their significant contributions. Other journals also played a role, including Atmospheric Environment (UK), Environmental Modelling and Software (Netherlands), Environmental Monitoring and Assessment (Netherlands), The International Journal of Environmental Research and Public Health (Switzerland), and Sustainability (Switzerland). While some journals, like Environmental Science and Pollution Research, lead in publication volume, others, such as Water Research, despite a lower number of publications, demonstrate higher impact factors, indicating a focus on highly cited research. Journals with the highest impact factors, such as Water Research, Journal of Cleaner Production, and Science of the Total Environment, indicate a trend toward publication in more rigorous, multidisciplinary outlets. These journals provide invaluable insights and research findings that enhance our understanding of today’s complex environmental challenges.

3.1.4. Co-Occurrence and Multiple Correspondence Analysis

The co-occurrence analysis of the keywords indicates that “artificial intelligence” (25 occurrences), “machine learning” (16 occurrences), and “nonpoint source pollution” (11 occurrences) are the most commonly occurring keywords (Figure 7a). This finding suggests a significant focus on using AI and ML techniques to address NPSP issues. For instance, AI models such as random forest and support vector machines have been used to assess NPSP by estimating the input and export of nutrients in water bodies [44], and SOM and ANN have been used for assessing water quality [57]. Five distinct patterns emerged from the co-occurrence analysis. The first revolves around the general application of ML for evaluating groundwater quality [58,59,60,61,62]. The second encompasses the use of SVM [63], RF [64], FL [62], and ANN [65] for assessing NPSP-related problems. The third emphasizes the connection between air pollution and water quality concerns [43,49,66]. The fourth investigates the overall use of AI for assessing NPSP from an environmental sustainability perspective [67]. The last one provides a comprehensive overview of various types of pollution [68]. In addition, a multiple correspondence analysis was conducted, which led to the creation of Figure 7b. The results showed two main clusters. Cluster 1 (in red) comprises 32 keywords related to pollution assessment, including risk assessment, prediction, variability, water type (surface, groundwater), and pollution type (chemical with N, P, dissolved oxygen, organic matter, and microbiological with E. coli). Cluster 2 (in blue) consists of six keywords (air pollution, mortality, impact, area, emissions, and exposure) focused on the pathway and impacts of pollution.
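At its core, a keyword co-occurrence analysis counts how often two keywords appear together in the same article. The sketch below shows only that counting step and does not reproduce the normalization, clustering, or layout performed by VosViewer; the function name, the case-insensitive matching, and the toy keyword lists are assumptions introduced for illustration.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists):
    """Count, for every unordered keyword pair, the articles containing both."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted({k.strip().lower() for k in keywords})
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical author-keyword lists, one list per article.
articles = [
    ["Artificial intelligence", "Machine learning", "Nonpoint source pollution"],
    ["Artificial intelligence", "Machine learning"],
    ["Machine learning", "Nonpoint source pollution"],
]

pairs = keyword_cooccurrence(articles)
for (first, second), count in pairs.most_common():
    print(f"{first} + {second}: {count}")
```

Pairs with higher counts would appear as thicker links in a network map such as Figure 7a, although the exact link weights depend on the counting method chosen in the software.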

3.2. Synthesis of AI Applications in NPSP Studies

3.2.1. NPSP Characterization

NPSP presents a complex environmental challenge and studies have focused on diffuse origins (Supplementary Table S1), across land-use activities, such as agriculture, urban runoff, atmospheric deposition, and industrial discharges [41,69]. Characterizing NPSP requires understanding its impact across a broad scope of pollution types and environmental settings. This includes water pollution, focus on river systems for detection and control [26,29,50], strategies for pollutant removal [24], and analysis of contaminants in groundwater and water supplies [31]. Air pollution is another significant area, emphasizing outdoor air pollution forecasting and broader AI research trends in air quality management [23,30,32,33]. Agricultural pollution is also a major contributor to NPSP, requiring dedicated study [25]. Expanding beyond single-media assessments, NPSP is increasingly viewed from a multi-environmental perspective, encompassing pollution across water, air, and soil, especially in vulnerable ecosystems like mangrove forests and in the general context of environmental pollution management [27,28].
The pollutants associated with NPSP are wide-ranging, and their occurrence exhibits significant spatiotemporal variability. Agricultural runoff is a major transport pathway of nutrients and sediment, while urban runoff contributes a cocktail of chemicals and solids [70,71,72]. Atmospheric deposition, which links air pollution from industrial and vehicular emissions to water and soil contamination [43,51,71,73], is mostly associated with fine particulate matter and gaseous pollutants (e.g., NO2, CO2). Given the complex, dynamic processes associated with NPSP across different spatial scales, advanced monitoring and modeling techniques, such as AI, are needed to simulate pollutant behavior and quantify source contributions. See Table 3 for more details.

3.2.2. Integration of Emerging AI Technologies

AI applications in NPSP management have been enhanced through the synergistic integration of complementary technologies that support data acquisition and real-time monitoring. For instance, low-cost IoT stations (e.g., qHAWAX) [88] and mobile sensor networks [89] are deployed for real-time measurement of pollutant concentrations (e.g., NO2, PM), providing crucial localized data. Similarly, AI-enhanced toxicology platforms (e.g., ETAPM-AIT) [51] and smart agriculture systems with intelligent nutrient sensors [25,90] have integrated IoT networks to improve pollution monitoring and predictive precision. High-resolution spatiotemporal data for assessing urban water quality, agricultural runoff, and composting activities is obtained through unmanned aerial vehicles (e.g., drones) equipped with multispectral cameras [91,92], while satellite imaging platforms like Landsat and Sentinel-2 are widely applied to track broad-scale land-use change and pollutant dispersion [93,94,95]. GIS further enhances these capabilities by integrating diverse data streams to map pollutant transport pathways and identify pollution hotspots [44,54]. Together, these integrated tools provide AI models with the rich and diverse data necessary to detect pollution patterns, forecast pollutant loads, and inform timely, targeted interventions across complex NPSP landscapes.
Beyond data acquisition, the synergistic integration of these enabling tools with advanced AI techniques significantly amplifies model functionality, leading to substantial improvements in prediction accuracy and decision-support capacity. The robust and diverse data streams generated by sensors, UAVs, and remote platforms (e.g., satellites) provide ML models like XGBoost [91] and optimized extreme learning machines (ELM), such as mixed kernel ELM with particle swarm optimization (PSO-MK-ELM) [95], with enriched input features, ultimately enhancing their generalizability and performance stability. Furthermore, DL architectures (e.g., ANNs) excel at modeling complex, long-term environmental trends by effectively leveraging the dense, high-frequency datasets originating from IoT and satellite networks [93,94]. To address the inherent consequences of NPSP (e.g., eutrophication), explainable tools like SHapley Additive exPlanation (SHAP) [83] offer crucial transparency by interpreting AI model inputs and outputs, fostering greater stakeholder trust and facilitating policy adoption. The integration of big data platforms like Hadoop MapReduce [89] translates AI-based air quality analyses into graphical formats (e.g., pollution maps) for practitioners and policymakers. Moreover, the coupling of AI with advanced analytical tools such as spectroscopy and fluorescence probes [96,97] allows for precise pollutant characterization, adding a critical layer of detail for pollutant source identification and real-time management strategies. These combinations equip AI frameworks with enhanced analytical depth, improved real-time responsiveness, and greater operational scalability, essential attributes for effectively tackling the multifaceted challenges of NPSP control.
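To make the idea behind Shapley-based explanation concrete, the following minimal sketch computes exact Shapley values for a toy model by averaging each feature's marginal contribution over all feature orderings. It does not use the SHAP library itself, and the model, its coefficients, the feature names, and the baseline values are all hypothetical choices made for illustration; none of them come from the reviewed studies.

```python
from itertools import permutations
from statistics import mean


def toy_model(x):
    """Hypothetical nitrate-load model; coefficients are illustrative only."""
    return (2.0 * x["fertilizer"]
            + 1.5 * x["rainfall"]
            + 0.5 * x["fertilizer"] * x["rainfall"]   # interaction term
            - 1.0 * x["buffer_width"])


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features are switched from their baseline
    value to the value observed in `instance`."""
    features = list(instance)
    contributions = {f: [] for f in features}
    for order in permutations(features):
        current = dict(baseline)          # start from the reference point
        previous = model(current)
        for f in order:
            current[f] = instance[f]      # reveal one feature at a time
            updated = model(current)
            contributions[f].append(updated - previous)
            previous = updated
    return {f: mean(v) for f, v in contributions.items()}


# Hypothetical observation and reference (baseline) conditions.
instance = {"fertilizer": 3.0, "rainfall": 2.0, "buffer_width": 1.0}
baseline = {"fertilizer": 1.0, "rainfall": 1.0, "buffer_width": 0.0}

phi = shapley_values(toy_model, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that makes this kind of explanation easy to communicate to stakeholders. Enumerating all orderings is only feasible for a handful of features; practical tools rely on approximations.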

3.3. AI Modeling Approaches in NPSP Management

3.3.1. Overview of AI Model Applications

AI models are extensively applied in NPSP assessment, offering enhanced predictive accuracy, automation, and adaptability across varied spatial contexts such as rivers [74,84], aquifers [76], marine ecosystems [98], and landfills [67]. Their ability to model complex nonlinear relationships, process high-dimensional datasets, and capture spatiotemporal variability makes them particularly suited for addressing the diffuse and dynamic nature of NPSP [53]. AI techniques have been employed for major NPSP functional tasks such as water quality prediction, groundwater vulnerability mapping, and pollutant-specific modeling. Table 4 details the AI models, their inputs and outputs, and the metrics used to evaluate their performance.
Water Quality Prediction and Monitoring: AI models such as SVM, GEP, and MLP have been widely applied in river systems to estimate key water quality parameters, including electrical conductivity, total dissolved solids, and sodium adsorption ratio [74]. ANNs have facilitated virtual water quality monitoring by modeling a broad spectrum of chemical indicators, such as total suspended solids, chemical oxygen demand, and nutrients like sulfate, phosphate, bicarbonate, and nitrate [77]. For water quality index estimation, models like BPNN, ANFIS, and SVM used simple yet meaningful inputs such as dissolved oxygen, ammonia, and pH [53]. Similar approaches have also been successfully applied to assess NPSP in lakes, where DT, SVM, and ANN models predict WQI using variables like temperature, pH, turbidity, and coliform counts [99]. These applications demonstrate that AI models can effectively detect subtle pollution patterns and forecast water quality, thereby supporting real-time, data-driven decisions and the proactive management of NPSP for sustainable water resource protection.
Groundwater Vulnerability Mapping: In groundwater systems, AI is increasingly used for pollution vulnerability mapping and risk classification, providing critical insights for water resource management. BRT and KNN have been applied to nitrate vulnerability assessments by integrating hydrogeological, land-use, and topographic predictors, allowing for more precise identification of areas at risk of contamination [21]. CNN further enhances spatial modeling by capturing the heterogeneity of aquifer pollution, making it particularly effective in regions with complex spatial variability [76]. AI-based frameworks that combine PSO, SVM, and NBC have also been used for groundwater water quality index prediction using physicochemical profiles, including electrical conductivity, pH, alkalinity, sulfate, and nitrate [60]. Additionally, FL models provide rule-based classification for groundwater quality monitoring, offering enhanced interpretability that is valuable for regulatory and policy applications [62]. Building upon these diverse applications, AI enhances groundwater vulnerability mapping by automating the integration of diverse datasets to simulate scenarios and anticipate the impacts of land use, agriculture, and climate variability on groundwater quality, leading to more informed decision-making and proactive management strategies.
Pollutant-Specific Modeling: Assessing the levels of specific pollutants, such as trace metals and nutrients, is also a crucial component in addressing NPSP. In the context of highway runoff, the MT–GA hybrid model, combining Model Trees and a Genetic Algorithm, was used to predict concentrations of chromium, lead, zinc, total organic carbon, and total suspended solids [72]. Models such as WER-GBO, LSSVM, and ANFIS have demonstrated high accuracy in monthly sodium concentration prediction. These models are effective because they capture complex, non-linear relationships between input parameters, such as lag-time of discharge and sodium, and pollutant levels [100]. GBT has also been successfully used to predict arsenic concentrations using dissolved oxygen, pH, and salinity data; its ensemble nature allows it to handle complex interactions among these variables, leading to robust predictions [61]. RF, BRT, and LR have been utilized for arsenic hazard mapping in groundwater, integrating diverse spatial data to delineate areas prone to high arsenic levels [101]. Groundwater nitrate concentration mapping has been addressed using SVM, RF, and Bayesian-ANN, which integrate terrain, hydrology, and land-use data for improved spatial accuracy. These models can help identify complex spatial patterns and relationships between land surface characteristics and groundwater nitrate levels [64]. BGLM and BART have further contributed to nitrate modeling by offering probabilistic outputs and greater flexibility in handling variable-rich datasets. The Bayesian framework provides not just point predictions but also uncertainty estimates, which are crucial for risk assessment and decision-making [102]. Overall, AI models are particularly advantageous for pollutant-specific tasks due to their ability to handle nonlinear interactions and optimize performance, even with limited or imbalanced datasets [26,32].
AI Applications in Other NPSP Contexts: Beyond traditional aquatic systems, AI has also been applied to NPSP sources such as landfill leachate, urban runoff, and marine sediments. Models like FL, RBFANN, and MLPANN have accurately predicted landfill leachate characteristics based on physicochemical parameters (e.g., hardness, turbidity, oxygen demand) and heavy metal concentrations (e.g., lead, chromium, cadmium) [67]. Genetic Algorithm–Ridge Regression models have also been used to assess aquifer vulnerability near landfill sites [58]. These models are capable of handling the complex, multivariate nature of leachate composition, which is crucial for treatment and monitoring. Additionally, they demonstrate AI’s ability to simulate complex subsurface processes and evaluate the potential for leachate migration to contaminate groundwater [58,67]. In coastal and marine settings, SVR and GA have been employed to predict polyaromatic hydrocarbon concentrations for pollution mapping, demonstrating the ability of these models to handle the complex fate and transport of pollutants in dynamic marine environments [98]. Similarly, the health and pollutant levels in mangrove ecosystems affected by multi-source pollution have been assessed using a range of AI models, including ANN, BPNN, GRNN, RBFNN, XGBoost, and FL. These models enable the integration of data from diverse pollution origins, facilitating a comprehensive evaluation of their cumulative impact on sensitive coastal ecosystems [27].
AI Model Performance: The effectiveness of AI models across diverse NPSP settings (e.g., rivers, lakes, highways, aquifers, mangroves, marine ecosystems) is consistently demonstrated by various performance metrics reported in the literature. Commonly used metrics include R², RMSE, MAPE, NSE, and AUC, which serve to validate model accuracy and reliability. For instance, studies using models like SVM, GEP, and MLP for river water quality estimation reported a high R² value (0.88) and a low RMSE (19.71), signaling good model fit and minimal prediction error [74]. Similarly, hybrid and ensemble models have demonstrated effectiveness in predicting complex outputs. An MT–GA hybrid model achieved a high R² (0.87) when predicting runoff metal concentrations [72]. In another study, WER-GBO and LSSVM models predicted monthly sodium loads with a low RMSE (0.639) [100]. Groundwater vulnerability assessments have also shown strong spatial predictive power. For example, CNN applications reported a high AUC (0.95), indicating excellent predictive capability [76]. BRT applications likewise demonstrated classification reliability with an AUC value of 0.79 [21]. Additionally, ANN and ANFIS models used in virtual monitoring and WQI prediction demonstrated strong modeling capabilities, capturing non-linear trends while maintaining acceptable MAPE (<10%) and NSE (0.68) values [53,77]. These consistent performance trends across diverse scenarios underscore AI’s capacity to deliver reliable, scalable, and context-sensitive predictions in support of proactive NPSP management.
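For readers less familiar with the metrics named above, the following sketch shows one common way to compute them. It assumes R² is taken as the squared Pearson correlation (some studies instead use the coefficient of determination, which coincides with NSE), and the observed, simulated, label, and score values are made-up numbers, not results from any cited study.

```python
import math


def rmse(obs, sim):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs, sim):
    """Mean absolute percentage error (%); assumes no observed value is zero."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def auc(labels, scores):
    """Area under the ROC curve: probability that a randomly chosen positive
    case is scored above a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only (e.g., a pollutant concentration in mg/L).
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.5]
print(rmse(observed, simulated), mape(observed, simulated))
print(nse(observed, simulated), r_squared(observed, simulated))

# Illustrative vulnerability classification: 1 = contaminated well.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
```

Because RMSE carries the units of the predicted variable, values such as 19.71 and 0.639 reported by different studies are not directly comparable, whereas R², NSE, and AUC are dimensionless.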
Table 4. Summary of AI Applications in NPSP.
No | Purpose | Environmental Context | Applied AI Models | Input (Predictor) | Output (Predictand) | Performance Metrics | Reference
1 | Water quality evaluation | River | SVM, GEP, MLP | EC, TDS, SAR | EC, TDS, SAR | RMSE, MAE, R², DDR | [74]
2 | Prediction of highway runoff quality | Highway | MT–GA | Cr, Pb, Zn, TOC and TSS; annual average daily, antecedent dry period, rainfall, maximum 5-min rain intensity | Cr, Pb, Zn, TOC and TSS | R² | [72]
3 | Aquifer vulnerability mapping | Aquifer | CNN | Na+, K+, Ca2+, Mg2+, Cl, SO42−, HCO3 and HCO32− | Vulnerability maps: IVI, SVI, and TVI | AUC | [76]
4 | Nitrate pollution vulnerability mapping | Groundwater | BRT and KNN | NO3, depth to groundwater, hydraulic conductivity, aquifer thickness, net recharge, distance from river, drainage density, land use, well density, soil texture, permeability, soil organic matter content, and surface slope | Groundwater vulnerability maps | Sensitivity, specificity, area under ROC curve, and Kappa | [21]
5 | Virtual water quality monitoring | River | ANN | T, pH, TSS, hardness, alkalinity, EC, BOD, COD, DO, CO2, Ca, Mg, P, Cl, SO42−, PO43 HCO3 and NO3 | T, pH, TSS, hardness, alkalinity, EC, BOD, COD, DO, CO2, Ca, Mg, P, Cr, Cl, SO42−, PO43 HCO3 and NO3 | MAE, RMSE, R², MAPE, NSE | [77]
6 | Water quality prediction | River | BPNN, ANFIS, SVR, MLR | DO, BOD, T, pH, NH3, and WQI | WQI | DC, RMSE, and R | [53]
7 | Water quality prediction | Groundwater | PSO, NBC, SVM | EC, pH, TDSs, TH, alkalinity, bicarbonate, Cl, SO4, NO3, fluoride, Ca, Mg, Na, K, Fe | WQI | Confusion matrix | [60]
8 | Water sodium concentration prediction | River | WER-GBO, LSSVM, ANFIS | Discharge and Na | Monthly sodium prediction | R, RMSE, KGE, MAE, MAPE, IA | [100]
9 | Arsenic concentration prediction | River | GBT | As, DO, pH, T, salinity, DS | As concentration | R² | [61]
10 | Groundwater pollution vulnerability | Groundwater | GA-Ridge regression | Depth to water, net recharge, topography, and impact of vadose zone media | Depth to water, net recharge, topography, and impact of vadose zone media | MSE | [58]
11 | Environmental pollution mapping | Near-shore marine sediments | SVR and GA | Total petroleum hydrocarbons descriptor | Total polyaromatic hydrocarbons | R, MSE, MAE, MAPD | [98]
12 | Groundwater contamination modelling | Groundwater | BGLM, BRNN, BART, and BRR | Elevation, slope, plan curvature, profile curvature, annual rainfall, groundwater depth, distance from residential, distance from the river, Na, K, and topographic wetness index | Groundwater nitrate concentration | R² | [102]
13 | Water quality prediction | Lake | SVM, DT, ANN | T°, pH, turbidity, and coliforms | WQI | MSE, RMSE, and RSE | [99]
14 | Groundwater quality monitoring | Groundwater | FL | pH, T°, turbidity, COD, BOD, PO4, NO2, NO3, NH4, DO, EC, and FC | WQI | | [62]
15 | Landfill leachate penetration management | Landfills | FL, RBFANN, and MLPANN | Fe, Pb, Cr, Cd, Molybdenum, N, Al, Na, COD, TDS, EC, Cl, hardness, turbidity | Predicted leachate | R², RMSE | [67]
16 | Spatial groundwater nitrate concentration estimation | Groundwater | SVM, RF, Bayesian-ANN | Elevation, slope, plan curvature, profile curvature, rainfall, piezometric depth, distance from the river, distance from residential, Na, K, and topographic wetness index | Groundwater nitrate concentration | R², RMSE | [64]
17 | Groundwater arsenic hazard modelling | Groundwater | RF, BRT, LR | Elevation, slope, aquifer connectivity, distance from the Ganges and other major rivers, minor rivers, streams and estuaries, groundwater depth and fluctuation, potential groundwater recharge, groundwater-fed irrigated area, land cover, and population | As concentration | Sensitivity, specificity, accuracy | [101]

3.3.2. Classification of AI Techniques

Understanding AI hierarchical classification is essential when looking at AI models in NPSP. In Figure 8 and Table 1, the AI models applied in addressing NPSP issues were classified based on the AI classes generated by Mukhamediev et al. [38]. In NPSP studies, three subclasses of ML were identified: supervised learning (SL), deep learning (DL), and reinforcement learning (RL). SL includes several AI models, such as SVM, applied to estimate the input and export of nutrients [44]. RF, a decision tree-based ensemble learning method, has demonstrated its efficacy in evaluating spatial water quality distribution and identifying features impacting the water quality [103]. Boosted regression trees were used to quantify the effects of nonpoint source nitrate pollution in groundwater [21]. ANNs have been widely employed to model and predict water quality parameters affected by NPSP [57]. FL can form the basis for implementing control strategies to enable decision-making or supervisory control for pollution minimization and mitigation processes [104]. FL was also used to control the aerobic stage of wastewater treatment processes [28], while ANFIS assisted in predicting the water quality index [65].
The findings suggest that DL methods are the most prominent subset of AI in NPSP studies, with a focus on precise and accurate predictions [105]. CNN was used to map the total aquifer pollution vulnerability [76]. FFNN has been used to model and predict water quality [65]. RNN was used to analyze and accurately forecast pollutants [52]. LSTM was applied for water quality prediction [55], while DNN showed good performance in pollution forecasting [32].
The results indicate that the RL model used to assess NPSP employed Monte Carlo (MC) simulation, which is well-suited for complex simulations and is mainly applied in combination with other AI models. MC simulation techniques were used to quantify uncertainty while applying ANN for pollution forecasting [106]. A water quality risk assessment was conducted using MC simulation and the artificial neural network method [107]. Virtual water quality monitoring was performed using Monte Carlo-optimized artificial neural networks [77]. A machine learning-based algorithm for water supply pollution source identification was developed by combining ANN, RF, and MC simulations [108]. These innovative approaches enable real-time monitoring and adaptive management [66], allowing for quick responses to changing environmental conditions [26].
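The pairing of MC simulation with a predictive model for uncertainty quantification can be sketched as follows. This is a minimal illustration under stated assumptions: `surrogate_model` is a hypothetical linear stand-in for a trained ANN, and the input distributions (means and standard deviations for rainfall and fertilizer application) are invented for the example rather than taken from the cited studies.

```python
import random
import statistics


def surrogate_model(rainfall_mm, fertilizer_kg_ha):
    """Hypothetical stand-in for a trained ANN that predicts a pollutant
    concentration (mg/L); the coefficients are illustrative only."""
    return 0.02 * rainfall_mm + 0.05 * fertilizer_kg_ha


def monte_carlo_prediction(n_draws=5000, seed=42):
    """Propagate input uncertainty through the model by repeated sampling
    and summarise the spread of the resulting predictions."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    draws = []
    for _ in range(n_draws):
        rainfall = rng.gauss(100.0, 15.0)      # assumed input uncertainty
        fertilizer = rng.gauss(60.0, 10.0)     # assumed input uncertainty
        draws.append(surrogate_model(rainfall, fertilizer))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]        # empirical 2.5th percentile
    upper = draws[int(0.975 * n_draws) - 1]    # empirical 97.5th percentile
    return statistics.mean(draws), lower, upper


mean_pred, lower_95, upper_95 = monte_carlo_prediction()
print(f"mean = {mean_pred:.2f} mg/L, 95% interval = [{lower_95:.2f}, {upper_95:.2f}]")
```

Reporting an interval alongside the point prediction is what makes this kind of coupling useful for risk assessment, since a decision-maker can see how much of the predicted range exceeds a regulatory threshold.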

3.3.3. Advantages and Limitations of AI Models

AI models have been widely adopted to address the complex, non-linear, and dynamic nature of NPSP due to their inherent flexibility, proficiency in analyzing large and heterogeneous environmental datasets, ability to uncover pollution trends, and capacity to generate accurate pollution predictions at regional, national, and global scales [9,109]. Models such as ANN, SVM, and RF have demonstrated strong performance in predicting water quality indices, nutrient loads, and pollutant concentrations across diverse systems, including rivers, lakes, groundwater, and urban runoff [24,53,99]. ANNs are particularly tolerant of incomplete datasets and effective in modeling pollutant transport and water quality, while RF models are known for their robustness against overfitting and ability to handle noisy NPSP data [91]. LSTM and CNN are especially suited for capturing temporal pollution dynamics and spatial heterogeneity from remote sensing data, respectively [78,83]. ANFIS offers relative interpretability advantages and is particularly useful in uncertainty-prone scenarios or for early warning systems, although it is generally less effective for highly dynamic NPSP processes unless integrated with other modeling techniques [67,110]. Similarly, ensemble learning methods (e.g., WER-GBO, LSSVM, PSO-SVM, GA-ANN) can significantly boost pollution prediction accuracy and stability by combining the strengths of multiple models, though they often introduce added complexity and higher computational overhead [29,83]. These capabilities make AI models very useful for NPSP assessment.
However, despite their advantages, AI models also present several limitations (Table 5). A primary concern is the “black-box” nature of deep learning models such as ANN, CNN, and LSTM, which limits interpretability and transparency [10,32]. The lack of transparency into how predictions (e.g., pollutant concentrations, transport pathways) are generated may undermine trust and confidence in the model results. This can hinder their acceptance and effective use in decision-making and regulatory contexts such as setting water quality standards, guiding pollution control interventions, and supporting environmental impact assessments [56,76]. Furthermore, many AI models (e.g., ANN, LSTM, CNN) require large volumes of high-quality, representative training data, which are often unavailable or imbalanced in real-world environmental monitoring settings (e.g., surface water bodies, catchments/watersheds, agricultural fields, urban areas) [51]. These models may also sometimes struggle to generalize beyond the specific conditions of the training dataset, reducing their reliability when applied to different locations, time periods, and aforementioned environmental contexts [51,110]. Overfitting remains a risk, particularly when data is noisy or unbalanced, and many advanced models (e.g., GBM, RT, CART) also impose high computational demands [85,111]. Additionally, training instability on small datasets and the need for expert knowledge for hyperparameter optimization limit the accessibility and scalability of these models [112,113]. Ultimately, while AI models offer powerful tools for improving the accuracy and depth of NPSP analysis, their successful deployment depends on balancing predictive capabilities with transparency, data requirements, and usability within real-world decision-making frameworks.

3.4. Knowledge Gaps and Future Research Directions

Despite recent advancements in using AI for nonpoint source pollution (NPSP) assessment, several key research gaps still require attention. These fall into four main areas: (a) AI Model Development, Optimization, and Validation; (b) Data Limitations and Monitoring Challenges; (c) Governance, Policy, and Social Dimensions; and (d) System Integration.
When focusing on AI model development, many studies aimed at predicting water quality struggle with selecting the most appropriate inputs and do not fully utilize ensemble optimization techniques, which limits both AI model performance and generalizability [53]. Moreover, DL models like CNNs and LSTM networks are not yet fully utilized in AI-based environmental monitoring platforms such as OAI-AQPC and ETAPM-AIT. This limits their ability to effectively capture complex spatiotemporal pollution dynamics [51,89]. Data-related challenges represent another significant gap. Limited data availability, inconsistent spatial and temporal resolution, and insufficient standardization make it harder to effectively monitor pollution and properly validate models [20,83].
On the policy side, AI-based studies often fail to integrate important socio-political factors such as population dynamics, income inequality, stakeholder perspectives, and environmental justice considerations. Yet, including these is crucial for developing pollution management strategies that are truly inclusive and actionable [91,116]. Finally, few studies demonstrate effective integration between AI systems and crucial enabling technologies such as IoT sensor networks, remote sensing platforms, and GIS. This lack of integration makes it challenging to achieve synchronized pollution data collection, scale up for real-time, basin-wide applications, and ensure different systems work together smoothly to support quick and informed decision-making in pollution management [27,29,73]. Future research on AI applications in NPSP should address these critical gaps to enhance model robustness, data availability, policy relevance, and cross-system integration. More details on these knowledge gaps and recommended research directions are summarized in Table 6 below.

4. Discussion

4.1. AI in NPSP: Current Status and Perspectives

The findings highlight the status of AI-based research for NPSP and provide valuable insights into the trends and collaborations in this field. The recent increase (Figure 3) in research indicates a growing recognition within the scientific community of AI’s potential for improving NPSP source identification [55,117,118], forecasting [50,53], and remediation efforts [47]. This highlights AI’s potential in developing practical solutions for NPSP and underscores the importance of integrating advanced technologies into environmental research [33,55,119]. As for country productivity, China is leading in AI applications for NPSP, accounting for a significant portion (29.4%) of the total publications (Figure 3). Collaborations between China and other countries, such as the United States, Canada, Iran, and Australia, are also prominent, indicating a scientific synergy among these nations in employing cutting-edge technologies to address NPSP challenges.
Nevertheless, given the significant impact of NPSP, it is reasonable to anticipate even greater collaboration, considering that NPSP is a worldwide issue. The findings suggested limited involvement from the South American continent. More balanced participation across continents could positively impact the productivity and economic growth of nations (e.g., those in South America) over time, especially as the use of AI in NPSP contexts becomes more critical for achieving sustainability goals [116]. With sufficient domestic research and development, South American countries may be better prepared to leverage cutting-edge AI techniques that can help monitor pollution sources, model impacts on ecosystems and human health, and design cost-effective remediation strategies. Increased engagement from scientists and decision-makers could help address this disparity and better position these nations for long-term environmental protection and economic prosperity [65,116].
Several authors, such as Liu H., Liu Y., and Zang A., have demonstrated their involvement in AI applications in NPSP research by conducting studies focusing on water quality monitoring [20] and prediction [120], pollution mitigation [40], and data-driven machine learning [49] (Figure 5). These authors remain the most productive, having collaborated on several studies, and their contributions demonstrate significant involvement and expertise in AI-focused NPSP research. However, Tiyasha et al. [26] and Cabaneros et al. [23] were the two most cited studies according to our results (Figure 6). Together, these two papers significantly contribute to the field of NPSP by shedding light on the application of AI models in water quality modeling, the use of ANN models in air pollution prediction, and the risk analysis of heavy metal concentration in waters. Moreover, by offering valuable insights, methodologies, and recommendations for further research, these studies aim to advance the understanding, prediction, and mitigation of environmental pollution. The proximity in time between these two top-ranked papers suggests that the citation rate is influenced less by the publication period than by the content of the papers. This study’s co-occurrence and multiple correspondence analyses addressed the content aspect by shedding light on the underlying themes and directions of the studies on AI applications for NPSP assessment (Figure 7). This can help identify gaps and under-explored areas, preventing duplication of efforts. It also provides ideas for future research by highlighting popular AI models and factors like impacted water bodies and pollution types. Seeing which types of pollution (e.g., nutrients, organic matter, and microbes) have received attention may shape how researchers prioritize certain issues.
The recent advancement of AI has shown how it can be a crucial tool in addressing the complex challenges posed by NPSP [49]. To effectively utilize AI for NPSP mitigation, it is essential to have a comprehensive structure underlying the NPSP system that covers various aspects, as shown in Figure 9. AI-driven solutions rely on diverse data types, including visual and numerical data, to understand the complexities of NPSP sources and pollutant types [54,121]. Data is then processed and analyzed using supervised, reinforcement, and deep learning models to extract meaningful insights [38]. The successful application of AI in NPSP mitigation also depends on the technology available, specialized software, and appropriate tools [66,90]. Additionally, expertise in both environmental domains and AI is crucial for developing and deploying AI-based models tailored to address NPSP challenges comprehensively. The combination of AI’s analytical capabilities and environmental insights empowers decision-makers and stakeholders to proactively combat NPSP from various sources, such as the atmosphere [115], agricultural operations, and urban areas [122]. By focusing on sediment [98], chemical contaminants, and pathogens [10,118] as pollutant types, AI can identify sources [118], track their movement through processes like runoff and infiltration [67], and predict impacts on water quality, human health [61], and the economy [116].
As NPSP is a widespread environmental issue with diverse ecological and societal consequences [49], integrating environmental and AI expertise can transform NPSP management strategies. AI-driven solutions offer a proactive approach to NPSP, using historical [121] and real-time data [90] to identify potential hotspots and prioritize intervention efforts. Various ML techniques, as presented in Figure 8, have been used to assess NPSP. The scientific community must determine the most effective methods for NPSP and environmental pollutant assessment [49]. This classification provides researchers with a roadmap of recommended models for specific projects on NPSP [104]. It is crucial to establish the links between AI models (e.g., SVM, ANN, MLP) and other AI algorithms (e.g., GA, MC, KNN), as emphasized by Mukhamediev et al. [38], since such combinations can improve model performance.
The application of AI in NPSP offers numerous advantages, such as the ability to handle vast and complex datasets [57], as well as the ability to make precise and efficient estimations and predictions [65]. Moreover, AI’s flexibility in simulating pollution scenarios [123] and its automatic and iterative features empower researchers to develop adaptive management approaches, ensuring prompt and effective responses to pollution issues [90]. Although the applications of AI in NPSP may encounter obstacles, their advantages for NPSP may outweigh these challenges. AI-driven NPSP solutions can be pivotal in a broader context of environmental sustainability. They enhance pollution management and raise public awareness by providing accessible and real-time data [66,90,124]. This increased awareness can trigger a sense of responsibility, leading to more sustainable practices. Moreover, AI can support evidence-based policy-making by providing comprehensive data and insights, ensuring environmental policies are based on scientific rigor [125].

4.2. AI in NPSP: Opportunities and Challenges

The integration of AI into NPSP assessment and management represents new opportunities to address some of the most complex aspects of NPSP (e.g., diffuse sources, complex causes, and lack of clear, simple solutions), considering that AI is capable of learning complex, non-linear, and high-dimensional relationships directly from data, which are inherently characteristic of NPSP dynamics [9,112]. This inherent suitability translates into significant opportunities for enhancing NPSP management despite documented limitations (see Table 5). Some of the opportunities that AI presents include:
  • Early Detection of Pollution Events by processing high-frequency time-series data, allowing for real-time anomaly detection (e.g., sudden spikes in pollutant concentrations) and enabling rapid response to pollution incidents and mitigation of downstream ecological and health impacts [57,88].
  • Effective Use of Incomplete Data by inferring patterns, imputing missing values, or building robust models to improve applicability in data-scarce regions, to ensure broader geographic implementation of pollution management strategies [10,24,30].
  • Precision Monitoring with Ensemble Models by combining the strengths of multiple AI models to improve prediction stability and reliability, making them highly suitable for complex or variable pollution scenarios [85,102].
  • Scalable Integration Across Data Systems by facilitating data integration from diverse sources (e.g., IoT sensor networks, weather databases, satellites, GIS) [90,91] and types (e.g., meteorological, water quality, geospatial, physicochemical data) into AI models, allowing for dynamic scenario planning, pollution hotspot identification, and real-time decision support [45,74,97].
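To make the early-detection opportunity more concrete, the following is a minimal sketch of spike detection on a sensor time series using a trailing-window z-score. It is an illustration only and is not drawn from the reviewed studies: the function name `detect_spikes`, the window length, the threshold, and the nitrate readings are all hypothetical, and an operational system would more likely rely on a trained model (e.g., an LSTM) than on this simple statistical rule.

```python
from statistics import mean, stdev


def detect_spikes(readings, window=6, threshold=3.0):
    """Return indices whose value exceeds the trailing-window mean by more
    than `threshold` trailing standard deviations (hypothetical rule)."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]  # the `window` readings before i
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat window gives no scale for comparison; skip it.
            continue
        z = (readings[i] - mu) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged


# Hypothetical hourly nitrate readings (mg/L) with a sudden spike at index 8.
nitrate = [2.1, 2.0, 2.2, 2.1, 2.3, 2.2, 2.1, 2.2, 6.5, 2.3]
spikes = detect_spikes(nitrate)
print(spikes)
```

Only upward departures are flagged here because the concern is a sudden increase in concentration; the reading that follows the spike is not flagged, since the spike itself inflates the trailing standard deviation.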
Despite these substantial opportunities, the application of AI in NPSP contexts is not without challenges. Key challenges include:
  • Interpretability and Transparency (The “Black Box” Problem): The most prominent limitation is the complexity and lack of transparency in how many advanced AI models, particularly DL models (LSTM, ANN, CNN), arrive at their predictions. Often labeled as “black boxes”, these models make it difficult to understand the underlying reasoning process. As a result, stakeholders who require clear, scientifically grounded explanations may struggle to validate the model logic and outputs, which in turn makes the results harder to communicate to the public and harder for the public to trust [70,114].
  • Data Availability, Quality, and Representativeness: Many AI models (e.g., DL models) require large volumes of high-quality and representative training data to perform effectively. In environmental settings relevant to NPSP (e.g., scattered monitoring stations across large watersheds, intermittent sampling), such data are often sparse, incomplete, inconsistent in quality, or imbalanced (e.g., focusing on specific conditions or locations). Furthermore, significant data heterogeneity (variability in measurement methods, spatial scales, and temporal coverage) can severely affect model generalizability, limiting model reliability when applied to geographic regions or time periods that differ from those represented in the training data [57,85].
  • Computational Demands and Expertise Requirements: Many AI models (e.g., GBM, ensemble learning) impose high computational demands, often requiring access to specialized hardware like powerful GPUs and significant processing times for training and testing [24,30,85,88]. Applying these models effectively in NPSP assessment often involves processing large, complex datasets (e.g., multi-year pollutant records) and simulating hydrological and pollutant transport processes across diverse landscapes (e.g., urban areas, agricultural fields, coastal wetlands), which further increases computational demands. Moreover, the application of AI requires specialized expertise in model selection, architecture design, hyperparameter optimization, and results interpretation [53,115]. Therefore, the effective application of AI in this area requires combining AI expertise with environmental and NPSP domain knowledge. This need for interdisciplinary expertise, combined with high computational requirements, can present significant barriers to the widespread adoption and practical implementation of AI in NPSP management.
However, these challenges should not discourage the adoption of AI in NPSP research and practice, as the growing urgency of environmental degradation underscores the need for scalable, adaptive, and accurate tools like AI. Even with these shortcomings, the ability of AI to uncover pollution patterns (e.g., dominant pollutant sources), enhance predictive precision, and support evidence-based interventions highlights its growing importance as an essential instrument for sustainable and effective NPSP management.
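The generalizability concern raised under data representativeness can be probed by holding out entire watersheds during validation rather than random samples. The sketch below illustrates this leave-one-group-out idea under stated assumptions: the watershed labels and total nitrogen values are invented, and the "model" is a deliberately trivial placeholder (the training mean) standing in for whatever ML model a study would actually fit.

```python
from collections import defaultdict
from math import sqrt


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each group."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical samples: watershed label and observed total nitrogen (mg/L).
watershed = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
total_n = [1.0, 1.2, 1.1, 2.0, 2.2, 2.1, 4.9, 5.1, 5.0]

scores = {}
for held_out, train_idx, test_idx in leave_one_group_out(watershed):
    # Placeholder model: predict the mean of the training watersheds.
    train_mean = sum(total_n[i] for i in train_idx) / len(train_idx)
    preds = [train_mean] * len(test_idx)
    scores[held_out] = rmse([total_n[i] for i in test_idx], preds)

for name, score in scores.items():
    print(f"held-out watershed {name}: RMSE = {score:.2f} mg/L")
```

In this toy data set the error is largest for the watershed whose concentrations differ most from the others, which is the kind of signal a random train/test split would tend to hide.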

4.3. Assumptions and Limitations

This study systematically searched peer-reviewed papers using various combinations of keywords related to AI and NPSP, and the presented results reflect the time frame and search terms used. Different search time frames and terms may have resulted in different studies being identified. Additionally, it should be noted that only the “Web of Science” and ScienceDirect search engines were used for this study, which may limit the scope of coverage, potentially excluding relevant studies from other databases such as Scopus, IEEE Xplore, or regional repositories like CNKI. Non-English studies were excluded to avoid translation and interpretation challenges, and we recognize this as a limitation. We therefore recommend the inclusion of multilingual and regionally indexed databases in future studies to provide a more comprehensive global perspective. Some studies may have been missed because the keywords were not explicitly mentioned in their titles, abstracts, or publications. The results may therefore not reflect all available information about AI applications in NPSP. However, we believe the selected documents contain the information necessary to draw valid and reliable conclusions.
The study also utilized bibliometric analysis, a powerful tool for analyzing large amounts of data from different sources. However, it is essential to acknowledge its limitations. The software used for bibliometric analysis, such as Biblioshiny, can introduce inaccuracies if inputs are not thoroughly reviewed beforehand, and the programmer’s biases can affect the software’s decision-making capabilities. The co-authorship and keyword analyses in this study also depend on the selection conditions applied: authors who collaborated on multiple documents were included, and keywords with a frequency of five or more were considered. Different selection conditions would have yielded different results. Nonetheless, the findings are highly relevant to future scientific research on the applications of AI in NPSP.

5. Conclusion and Next Steps

The present study demonstrates the evolving research landscape regarding AI applications for NPSP management. It highlights the essential role of enabling technologies such as IoT, GIS, and remote sensing in facilitating real-time pollution monitoring and enhancing the accuracy and scalability of AI models. Furthermore, the study categorized the AI models while critically assessing current limitations and future research needs related to AI model development, data constraints, and the crucial need for robust governance and policy frameworks, thereby underscoring the need to bring together expertise from both environmental science and AI modeling.
Building on these insights, we propose the following actionable steps for future research and implementation:
  • Develop AI-Driven Early Warning Systems: Create real-time monitoring platforms using machine learning (e.g., LSTM, CNN) to forecast runoff events like phosphorus spikes following rainfall or pesticide discharge post-irrigation. Couple these systems with watershed models (e.g., SWAT-AI hybrids) to issue alerts and recommend pre-emptive measures such as delayed fertilizer application or buffer activation.
  • Advance Explainable and Robust AI Models: Use interpretable AI methods, such as SHapley Additive exPlanations (SHAP), which is based on cooperative game theory, and Local Interpretable Model-agnostic Explanations (LIME), to ensure transparency in predictions of NPSP. For example, explain how each feature (e.g., rainfall, land slope, and fertilizer application) contributes to increasing or decreasing nitrate concentrations in runoff, considering both global interpretability (overall feature importance across the model) and local interpretability (feature impact on individual predictions). Validate models using multi-watershed datasets across different climates and land uses. Standardize model performance reporting (e.g., RMSE, MAE, recall, F1 score), including uncertainty quantification, to facilitate reproducibility and policy uptake.
  • Integrate Governance and Policy Frameworks: Develop institutional guidelines for AI-based environmental decision-making that mandate transparency, data ethics, and fairness. Engage local stakeholders (farmers, community groups, regulators) in the design and deployment phases. Address environmental justice by prioritizing pollution monitoring in historically marginalized or overburdened regions.
  • Invest in Enabling Infrastructure: Fund the deployment of low-cost water quality sensors (e.g., nitrate, turbidity, pH) and remote sensing systems (e.g., drones, satellites). Develop open-access geospatial databases and cloud-computing platforms (e.g., Google Earth Engine) to allow researchers and agencies to access, share, and analyze environmental data at scale.
  • Focus on Emerging and Understudied Pollutants: Use AI techniques (e.g., ensemble learning, anomaly detection) to map and model microplastic dispersion, legacy pesticides, antibiotic runoff, and emerging contaminants. Conduct integrated modeling of their sources (e.g., landfills, plasticulture), transport mechanisms, and ecological/human health risks in different agroecosystems.
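The local interpretability idea in the explainability step above rests on Shapley values from cooperative game theory. As a minimal sketch, the code below computes exact Shapley values for a three-feature toy model by enumerating all feature subsets. Everything here is an assumption for illustration: `toy_nitrate_model`, its coefficients, and the baseline and instance values are invented, and features that are "absent" from a coalition are set to a single baseline value, which is a simplification relative to how SHAP libraries estimate these quantities.

```python
from itertools import combinations
from math import factorial

FEATURES = ("rainfall", "slope", "fertilizer")


def toy_nitrate_model(x):
    """Hypothetical stand-in for a trained model: nitrate in runoff (mg/L)."""
    return (1.0 + 0.05 * x["rainfall"] + 0.2 * x["slope"]
            + 0.001 * x["rainfall"] * x["fertilizer"])


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline`."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` take the instance value, the rest the baseline.
        x = {f: (instance[f] if f in subset else baseline[f]) for f in FEATURES}
        return model(x)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical baseline (typical conditions) and one storm-event instance.
baseline = {"rainfall": 10.0, "slope": 2.0, "fertilizer": 50.0}
instance = {"rainfall": 40.0, "slope": 5.0, "fertilizer": 120.0}
phi = shapley_values(toy_nitrate_model, instance, baseline)

for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f} mg/L")
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that makes such attributions straightforward to communicate. A global importance measure could then be approximated by averaging absolute contributions over many instances.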
A critical next phase of this work involves systematically evaluating and quantitatively ranking the performance of various AI models across key NPSP assessment domains (such as water quality forecasting, pollution source identification, and groundwater vulnerability mapping) using standardized, comparable metrics. This effort will equip researchers and practitioners with evidence-based guidance for selecting the most effective AI tools tailored to specific pollution scenarios. Ultimately, continued research and multidisciplinary collaborative efforts are essential to realize the transformative potential of AI in achieving sustainable NPSP management by enabling precision pollution control, data-informed policymaking, and sustainable environmental practices.
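As an illustration of what standardized, comparable reporting might involve, the sketch below computes the regression metrics (RMSE, MAE) and classification metrics (recall, F1 score) named earlier in this section. The observed and predicted values are hypothetical, and the function names are chosen here for illustration only; uncertainty quantification is not covered.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_and_f1(y_true, y_pred):
    """Recall and F1 score for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Hypothetical observed vs. predicted concentrations (mg/L).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 5.0]
model_rmse = rmse(observed, predicted)
model_mae = mae(observed, predicted)

# Hypothetical exceedance labels (1 = threshold exceeded).
exceed_true = [1, 0, 1, 1, 0, 1]
exceed_pred = [1, 0, 0, 1, 1, 1]
model_recall, model_f1 = recall_and_f1(exceed_true, exceed_pred)

print(f"RMSE={model_rmse:.3f}, MAE={model_mae:.3f}, "
      f"recall={model_recall:.2f}, F1={model_f1:.2f}")
```

Reporting the same set of metrics, computed in the same way, for every candidate model is what makes a quantitative ranking across studies feasible.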

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su17135810/s1, Table S1: Synthesis of review papers included in this study; Table S2: Sensitivity analysis.

Author Contributions

A.M. and A.A.: Conceptualization, Visualization, Methodology. A.A.: Resources, Supervision, Project administration, Funding acquisition. A.M., R.N., K.P., L.H., M.J., B.W. and A.A.: Data curation and writing—original draft preparation. A.M. and A.A.: Writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the National Institute of Food and Agriculture of the United States Department of Agriculture (USDA-NIFA) to Florida A&M University through Non-Assistance Cooperative Agreement grant no. 58-6066-1-044. Additional support was provided by USDA-NIFA capacity building grants 2017-38821-26405 and 2022-38821-37522; the USDA-NIFA Evans-Allen Project, Grant 11979180/2016–01711; USDA-NIFA grant no. 2018-68002-27920; and National Science Foundation Grant no. 1735235, awarded as part of the National Science Foundation Research Traineeship, and Grant no. 2123440.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors thank Herbert Franklin and Ernesta Hunter for their feedback and editing support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ahi, Y.; Dilcan, C.C.; Koksal, D.D.; Gultas, H.T. Reservoir Evaporation Forecasting Based on Climate Change Scenarios Using Artificial Neural Network Model. Water Resour. Manag. 2022, 37, 2607–2624. [Google Scholar] [CrossRef]
  2. Anandhi, A.; Srinivas, V.V.; Nanjundiah, R.S.; Nagesh Kumar, D. Downscaling Precipitation to River Basin in India for IPCC SRES Scenarios Using Support Vector Machine. Int. J. Climatol. 2008, 28, 401–420. [Google Scholar] [CrossRef]
  3. Anandhi, A.; Srinivas, V.V.; Kumar, D.N.; Nanjundiah, R.S. Role of Predictors in Downscaling Surface Temperature to River Basin in India for IPCC SRES Scenarios Using Support Vector Machine. Int. J. Climatol. 2009, 29, 583–603. [Google Scholar] [CrossRef]
  4. Markovič, G. Wastewater Management Using Artificial Intelligence. E3S Web Conf. 2018, 45, 00050. [Google Scholar] [CrossRef]
  5. Morain, A.; Ilangovan, N.; Delhom, C.; Anandhi, A. Artificial Intelligence for Water Consumption Assessment: State of the Art Review. Water Resour. Manag. 2024, 38, 3113–3134. [Google Scholar] [CrossRef]
  6. Tang, H.W.; Lei, Y.; Lin, B.; Zhou, Y.L.; Gu, Z.H. Artificial Intelligence Model for Water Resources Management. Proc. Inst. Civ. Eng.-Water Manag. 2010, 163, 175–187. [Google Scholar] [CrossRef]
  7. Zharikova, E.P.; Grigoriev, J.Y.; Grigorieva, A.L. Artificial Intelligence Methods for Detecting Water Pollution. IOP Conf. Ser. Earth Environ. Sci. 2022, 988, 022082. [Google Scholar] [CrossRef]
  8. Uhlenbrook, S.; Yu, W.; Schmitter, P.; Smith, D.M. Optimising the Water We Eat—Rethinking Policy to Enhance Productive and Sustainable Use of Water in Agri-Food Systems across Scales. Lancet Planet. Health 2022, 6, e59–e65. [Google Scholar] [CrossRef]
  9. Wang, S.; Wang, Y.; Wang, Y.; Wang, Z. Assessment of Influencing Factors on Non-Point Source Pollution Critical Source Areas in an Agricultural Watershed. Ecol. Indic. 2022, 141, 109084. [Google Scholar] [CrossRef]
  10. Zulkifli, S.N.; Rahim, H.A.; Lau, W.-J. Detection of Contaminants in Water Supply: A Review on State-of-the-Art Monitoring Technologies and Their Applications. Sens. Actuators B Chem. 2018, 255, 2657–2689. [Google Scholar] [CrossRef]
  11. Arabi, M.; Govindaraju, R.S.; Hantush, M.M. Cost-Effective Allocation of Watershed Management Practices Using a Genetic Algorithm. Water Resour. Res. 2006, 42. [Google Scholar] [CrossRef]
  12. Lei, P.; Shrestha, R.K.; Zhu, B.; Han, S.; Yang, H.; Tan, S.; Ni, J.; Xie, D. A Bibliometric Analysis on Nonpoint Source Pollution: Current Status, Development, and Future. Int. J. Environ. Res. Public Health 2021, 18, 7723. [Google Scholar] [CrossRef]
  13. Muhammed, K.; Anandhi, A.; Chen, G.; Poole, K. Define–Investigate–Estimate–Map (DIEM) Framework for Modeling Habitat Threats. Sustainability 2021, 13, 11259. [Google Scholar] [CrossRef]
  14. Deepa, R.; Anandhi, A.; Alhashim, R. Volumetric and Impact-Oriented Water Footprint of Agricultural Crops: A Review. Ecol. Indic. 2021, 130, 108093. [Google Scholar] [CrossRef]
  15. Adu, J.; Kumarasamy, M.V. Assessing Non-Point Source Pollution Models: A Review. Pol. J. Environ. Stud. 2018, 27, 1913–1922. [Google Scholar] [CrossRef]
  16. Xepapadeas, A. The Economics of Non-Point-Source Pollution. Annu. Rev. Resour. Econ. 2011, 3, 355–373. [Google Scholar] [CrossRef]
  17. Xue, L.; Hou, P.; Zhang, Z.; Shen, M.; Liu, F.; Yang, L. Application of Systematic Strategy for Agricultural Non-Point Source Pollution Control in Yangtze River Basin, China. Agric. Ecosyst. Environ. 2020, 304, 107148. [Google Scholar] [CrossRef]
  18. Xie, Z.; Ye, C.; Li, C.; Shi, X.; Shao, Y.; Qi, W. The Global Progress on the Non-Point Source Pollution Research from 2012 to 2021: A Bibliometric Analysis. Environ. Sci. Eur. 2022, 34, 121. [Google Scholar] [CrossRef]
  19. Kang, O.; Lee, S.; Wasewar, K.; Kim, M.; Liu, H.; Oh, T.; Janghorban, E.; Yoo, C. Determination of Key Sensor Locations for Non-Point Pollutant Sources Management in Sewer Network. Korean J. Chem. Eng. 2013, 30, 20–26. [Google Scholar] [CrossRef]
  20. Li, N.; Ning, Z.; Chen, M.; Wu, D.; Hao, C.; Zhang, D.; Bai, R.; Liu, H.; Chen, X.; Li, W.; et al. Satellite and Machine Learning Monitoring of Optically Inactive Water Quality Variability in a Tropical River. Remote Sens. 2022, 14, 5466. [Google Scholar] [CrossRef]
  21. Motevalli, A.; Naghibi, S.A.; Hashemi, H.; Berndtsson, R.; Pradhan, B.; Gholami, V. Inverse Method Using Boosted Regression Tree and K-Nearest Neighbor to Quantify Effects of Point and Non-Point Source Nitrate Pollution in Groundwater. J. Clean. Prod. 2019, 228, 1248–1263. [Google Scholar] [CrossRef]
  22. Sivertun, A.; Prange, L. Non-Point Source Critical Area Analysis in the Gisselo Watershed Using GIS. Environ. Model. Softw. 2003, 18, 887–898. [Google Scholar] [CrossRef]
  23. Cabaneros, S.M.; Calautit, J.K.; Hughes, B.R. A Review of Artificial Neural Network Models for Ambient Air Pollution Prediction. Environ. Model. Softw. 2019, 119, 285–304. [Google Scholar] [CrossRef]
  24. Fan, M.; Hu, J.; Cao, R.; Ruan, W.; Wei, X. A Review on Experimental Design for Pollutants Removal in Water Treatment with the Aid of Artificial Intelligence. Chemosphere 2018, 200, 330–343. [Google Scholar] [CrossRef]
  25. Hussain, F.; Ahmed, S.; Muhammad Zaigham Abbas Naqvi, S.; Awais, M.; Zhang, Y.; Zhang, H.; Raghavan, V.; Zang, Y.; Zhao, G.; Hu, J. Agricultural Non-Point Source Pollution: Comprehensive Analysis of Sources and Assessment Methods. Agriculture 2025, 15, 531. [Google Scholar] [CrossRef]
  26. Tiyasha; Tung, T.M.; Yaseen, Z.M. A Survey on River Water Quality Modelling Using Artificial Intelligence Models: 2000–2020. J. Hydrol. 2020, 585, 124670. [Google Scholar] [CrossRef]
  27. Wong, W.Y.; Al-Ani, A.K.I.; Hasikin, K.; Khairuddin, A.S.M.; Razak, S.A.; Hizaddin, H.F.; Mokhtar, M.I.; Azizan, M.M. Water, Soil and Air Pollutants’ Interaction on Mangrove Ecosystem and Corresponding Artificial Intelligence Techniques Used in Decision Support Systems—A Review. IEEE Access 2021, 9, 105532–105563. [Google Scholar] [CrossRef]
  28. Ye, Z.; Yang, J.; Zhong, N.; Tu, X.; Jia, J.; Wang, J. Tackling Environmental Challenges in Pollution Controls Using Artificial Intelligence: A Review. Sci. Total Environ. 2020, 699, 134279. [Google Scholar] [CrossRef]
  29. Bhatt, D.; Swain, M.; Yadav, D. Artificial Intelligence Based Detection and Control Strategies for River Water Pollution: A Comprehensive Review. J. Contam. Hydrol. 2025, 271, 104541. [Google Scholar] [CrossRef]
  30. Guo, Q.; Ren, M.; Wu, S.; Sun, Y.; Wang, J.; Wang, Q.; Ma, Y.; Song, X.; Chen, Y. Applications of Artificial Intelligence in the Field of Air Pollution: A Bibliometric Analysis. Front. Public Health 2022, 10, 933665. [Google Scholar] [CrossRef]
  31. Li, X.; Liang, G.; He, B.; Ning, Y.; Yang, Y.; Wang, L.; Wang, G. Recent Advances in Groundwater Pollution Research Using Machine Learning from 2000 to 2023: A Bibliometric Analysis. Environ. Res. 2025, 267, 120683. [Google Scholar] [CrossRef] [PubMed]
  32. Masood, A.; Ahmad, K. A Review on Emerging Artificial Intelligence (AI) Techniques for Air Pollution Forecasting: Fundamentals, Application and Performance. J. Clean. Prod. 2021, 322, 129072. [Google Scholar] [CrossRef]
  33. Subramaniam, S.; Raju, N.; Ganesan, A.; Rajavel, N.; Chenniappan, M.; Prakash, C.; Pramanik, A.; Basak, A.K.; Dixit, S. Artificial Intelligence Technologies for Forecasting Air Pollution and Human Health: A Narrative Review. Sustainability 2022, 14, 9951. [Google Scholar] [CrossRef]
  34. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. Int. J. Surg. 2010, 8, 336–341. [Google Scholar] [CrossRef]
  35. Aria, M.; Cuccurullo, C. Bibliometrix: An R-Tool for Comprehensive Science Mapping Analysis. J. Informetr. 2017, 11, 959–975. [Google Scholar] [CrossRef]
  36. Van Eck, N.J.; Waltman, L. Text Mining and Visualization Using VOSviewer. arXiv 2011, arXiv:1109.2058. [Google Scholar]
  37. Xie, H.; Zhang, Y.; Wu, Z.; Lv, T. A Bibliometric Analysis on Land Degradation: Current Status, Development, and Future Directions. Land 2020, 9, 28. [Google Scholar] [CrossRef]
  38. Mukhamediev, R.I.; Popova, Y.; Kuchin, Y.; Zaitseva, E.; Kalimoldayev, A.; Symagulov, A.; Levashenko, V.; Abdoldina, F.; Gopejenko, V.; Yakunin, K.; et al. Review of Artificial Intelligence and Machine Learning Technologies: Classification, Restrictions, Opportunities and Challenges. Mathematics 2022, 10, 2552. [Google Scholar] [CrossRef]
  39. Lintern, A.; McPhillips, L.; Winfrey, B.; Duncan, J.; Grady, C. Best Management Practices for Diffuse Nutrient Pollution: Wicked Problems Across Urban and Agricultural Watersheds. Environ. Sci. Technol. 2020, 54, 9159–9174. [Google Scholar] [CrossRef] [PubMed]
  40. Zhang, F.; Sun, Q.; Mehrabadi, M.; Khoshnevisan, B.; Zhang, Y.; Fan, X.; Zhai, L.; Xia, Y.; Wu, M.; Liu, D.; et al. Joint Analytical Hierarchy and Metaheuristic Optimization as a Framework to Mitigate Fertilizer-Based Pollution. J. Environ. Manag. 2021, 278, 111493. [Google Scholar] [CrossRef]
  41. Xiang, C.; Wang, Y.; Liu, H. A Scientometrics Review on Nonpoint Source Pollution Research. Ecol. Eng. 2017, 99, 400–408. [Google Scholar] [CrossRef]
  42. Li, S.; Zhuang, Y.; Zhang, L.; Du, Y.; Liu, H. Worldwide Performance and Trends in Nonpoint Source Pollution Modeling Research from 1994 to 2013: A Review Based on Bibliometrics. J. Soil Water Conserv. 2014, 69, 121A–126A. [Google Scholar] [CrossRef]
  43. Liu, H.; Yue, F.; Xie, Z. Quantify the Role of Anthropogenic Emission and Meteorology on Air Pollution Using Machine Learning Approach: A Case Study of PM2.5 during the COVID-19 Outbreak in Hubei Province, China. Environ. Pollut. 2022, 300, 118932. [Google Scholar] [CrossRef] [PubMed]
  44. Zheng, Y.; Wang, Q.; Zhang, X.; Yu, J.; Li, C.; Chen, L.; Liu, Y. Nitrogen and Phosphorus Retention Risk Assessment in a Drinking Water Source Area under Anthropogenic Activities. Remote Sens. 2022, 14, 2070. [Google Scholar] [CrossRef]
  45. Tian, Z.; Yu, Z.; Li, Y.; Ke, Q.; Liu, J.; Luo, H.; Tang, Y. Prediction of River Pollution Under the Rainfall-Runoff Impact by Artificial Neural Network: A Case Study of Shiyan River, Shenzhen, China. Front. Environ. Sci. 2022, 10, 887446. [Google Scholar] [CrossRef]
  46. Zhi-Guo, Z.; Yi-sheng, S.; Zong-xue, X. Prediction of Urban Water Demand on the Basis of Engel’s Coefficient and Hoffmann Index: Case Studies in Beijing and Jinan, China. Water Sci. Technol. 2010, 62, 410–418. [Google Scholar] [CrossRef]
  47. Qu, L.; Huang, H.; Xia, F.; Liu, Y.; Dahlgren, R.A.; Zhang, M.; Mei, K. Risk Analysis of Heavy Metal Concentration in Surface Waters across the Rural-Urban Interface of the Wen-Rui Tang River, China. Environ. Pollut. 2018, 237, 639–649. [Google Scholar] [CrossRef]
  48. Liu, Y.; Jing, Y.; Lu, Y. Research on Quantitative Remote Sensing Monitoring Algorithm of Air Pollution Based on Artificial Intelligence. J. Chem. 2020, 2020, 7390545. [Google Scholar] [CrossRef]
  49. Liu, X.; Lu, D.; Zhang, A.; Liu, Q.; Jiang, G. Data-Driven Machine Learning in Environmental Pollution: Gains and Problems. Environ. Sci. Technol. 2022, 56, 2124–2133. [Google Scholar] [CrossRef]
  50. Liu, Y.; Liang, Y.; Ouyang, K.; Liu, S.; Rosenblum, D.S.; Zheng, Y. Predicting Urban Water Quality With Ubiquitous Data-A Data-Driven Approach. IEEE Trans. Big Data 2022, 8, 564–578. [Google Scholar] [CrossRef]
  51. Asha, P.; Natrayan, L.; Geetha, B.T.; Beulah, J.R.; Sumathy, R.; Varalakshmi, G.; Neelakandan, S. IoT Enabled Environmental Toxicology for Air Pollution Monitoring Using AI Techniques. Environ. Res. 2022, 205, 112574. [Google Scholar] [CrossRef] [PubMed]
  52. Feng, R.; Zheng, H.; Gao, H.; Zhang, A.; Huang, C.; Zhang, J.; Luo, K.; Fan, J. Recurrent Neural Network and Random Forest for Analysis and Accurate Forecast of Atmospheric Pollutants: A Case Study in Hangzhou, China. J. Clean. Prod. 2019, 231, 1005–1015. [Google Scholar] [CrossRef]
  53. Abba, S.I.; Pham, Q.B.; Saini, G.; Linh, N.T.T.; Ahmed, A.N.; Mohajane, M.; Khaledian, M.; Abdulkadir, R.A.; Bach, Q.-V. Implementation of Data Intelligence Models Coupled with Ensemble Machine Learning for Prediction of Water Quality Index. Environ. Sci. Pollut. Res. 2020, 27, 41524–41539. [Google Scholar] [CrossRef]
  54. Xiao, H.; Ji, W. Relating Landscape Characteristics to Non-Point Source Pollution in Mine Waste-Located Watersheds Using Geospatial Techniques. J. Environ. Manag. 2007, 82, 111–119. [Google Scholar] [CrossRef] [PubMed]
  55. Wang, P.; Yao, J.; Wang, G.; Hao, F.; Shrestha, S.; Xue, B.; Xie, G.; Peng, Y. Exploring the Application of Artificial Intelligence Technology for Identification of Water Pollution Characteristics and Tracing the Source of Water Quality Pollutants. Sci. Total Environ. 2019, 693, 133440. [Google Scholar] [CrossRef]
  56. Brokamp, C.; Jandarov, R.; Rao, M.B.; LeMasters, G.; Ryan, P. Exposure Assessment Models for Elemental Components of Particulate Matter in an Urban Environment: A Comparison of Regression and Random Forest Approaches. Atmos. Environ. 2017, 151, 1–11. [Google Scholar] [CrossRef]
  57. Sengorur, B.; Koklu, R.; Ates, A. Water Quality Assessment Using Artificial Intelligence Techniques: SOM and ANN-A Case Study of Melen River Turkey. Water Qual. Expo. Health 2015, 7, 469–490. [Google Scholar] [CrossRef]
  58. Ahn, J.J.; Kim, Y.M.; Yoo, K.; Park, J.; Oh, K.J. Using GA-Ridge Regression to Select Hydro-Geological Parameters Influencing Groundwater Pollution Vulnerability. Environ. Monit. Assess. 2012, 184, 6637–6645. [Google Scholar] [CrossRef]
  59. Kourakos, G.; Harter, T. Vectorized Simulation of Groundwater Flow and Streamline Transport. Environ. Model. Softw. 2014, 52, 207–221. [Google Scholar] [CrossRef]
  60. Agrawal, P.; Sinha, A.; Kumar, S.; Agarwal, A.; Banerjee, A.; Villuri, V.G.K.; Annavarapu, C.S.R.; Dwivedi, R.; Dera, V.V.R.; Sinha, J.; et al. Exploring Artificial Intelligence Techniques for Groundwater Quality Assessment. Water 2021, 13, 1172. [Google Scholar] [CrossRef]
  61. Ahmed, M.F.; Lim, C.K.; Bin Mokhtar, M.; Khirotdin, R.P.K. Predicting Arsenic (As) Exposure on Human Health for Better Management of Drinking Water Sources. Int. J. Environ. Res. Public Health 2021, 18, 7997. [Google Scholar] [CrossRef]
  62. Azzirgue, E.M.; Cherif, E.K.; Tchakoucht, T.A.; Azhari, H.E.; Salmoun, F. Testing Groundwater Quality in Jouamaa Hakama Region (North of Morocco) Using Water Quality Indices (WQIs) and Fuzzy Logic Method: An Exploratory Study. Water 2022, 14, 3028. [Google Scholar] [CrossRef]
  63. Ji, X.; Lu, J. Forecasting Riverine Total Nitrogen Loads Using Wavelet Analysis and Support Vector Regression Combination Model in an Agricultural Watershed. Environ. Sci. Pollut. Res. 2018, 25, 26405–26422. [Google Scholar] [CrossRef] [PubMed]
  64. Band, S.S.; Janizadeh, S.; Pal, S.C.; Chowdhuri, I.; Siabi, Z.; Norouzi, A.; Melesse, A.M.; Shokri, M.; Mosavi, A. Comparative Analysis of Artificial Intelligence Models for Accurate Estimation of Groundwater Nitrate Concentration. Sensors 2020, 20, 5763. [Google Scholar] [CrossRef] [PubMed]
  65. Hmoud Al-Adhaileh, M.; Waselallah Alsaade, F. Modelling and Prediction of Water Quality by Using Artificial Intelligence. Sustainability 2021, 13, 4259. [Google Scholar] [CrossRef]
  66. Dhanwani, R.; Prajapati, A.; Dimri, A.; Varmora, A.; Shah, M. Smart Earth Technologies: A Pressing Need for Abating Pollution for a Better Tomorrow. Environ. Sci. Pollut. Res. 2021, 28, 35406–35428. [Google Scholar] [CrossRef]
  67. Bagheri, M.; Bazvand, A.; Ehteshami, M. Application of Artificial Intelligence for the Management of Landfill Leachate Penetration into Groundwater, and Assessment of Its Environmental Impacts. J. Clean. Prod. 2017, 149, 784–796. [Google Scholar] [CrossRef]
  68. Zhang, W.; Gao, P.; Chen, Z.; Qiu, H. Preventing Agricultural Non-Point Source Pollution in China: The Effect of Environmental Regulation with Digitization. Int. J. Environ. Res. Public Health 2023, 20, 4396. [Google Scholar] [CrossRef]
  69. Wang, M.; Chen, L.; Wu, L.; Zhang, L.; Xie, H.; Shen, Z. Review of Nonpoint Source Pollution Models: Current Status and Future Direction. Water 2022, 14, 3217. [Google Scholar] [CrossRef]
  70. Chan, P.L.R.; Arhonditsis, G.B.; Thompson, K.A.; Eimers, M.C. A Regional Examination of the Footprint of Agriculture and Urban Cover on Stream Water Quality. Sci. Total Environ. 2024, 945, 174157. [Google Scholar] [CrossRef]
  71. Krupnova, T.G.; Rakova, O.V.; Bondarenko, K.A.; Tretyakova, V.D. Environmental Justice and the Use of Artificial Intelligence in Urban Air Pollution Monitoring. Big Data Cogn. Comput. 2022, 6, 75. [Google Scholar] [CrossRef]
  72. Opher, T.; Friedler, E. A Preliminary Coupled MT–GA Model for the Prediction of Highway Runoff Quality. Sci. Total Environ. 2009, 407, 4490–4496. [Google Scholar] [CrossRef] [PubMed]
  73. Almalawi, A.; Alsolami, F.; Khan, A.I.; Alkhathlan, A.; Fahad, A.; Irshad, K.; Qaiyum, S.; Alfakeeh, A.S. An IoT Based System for Magnify Air Pollution Monitoring and Prognosis Using Hybrid Artificial Intelligence Technique. Environ. Res. 2022, 206, 112576. [Google Scholar] [CrossRef] [PubMed]
  74. Sarafaraz, J.; Ahmadzadeh Kaleybar, F.; Mahmoudi Karamjavan, J.; Habibzadeh, N. Predicting River Water Quality: An Imposing Engagement between Machine Learning and the QUAL2Kw Models (Case Study: Aji-Chai, River, Iran). Results Eng. 2024, 21, 101921. [Google Scholar] [CrossRef]
  75. Opher, T.; Ostfeld, A.; Friedler, E. Modeling Highway Runoff Pollutant Levels Using a Data Driven Model. Water Sci. Technol. 2009, 60, 19–28. [Google Scholar] [CrossRef]
  76. Nadiri, A.A.; Moazamnia, M.; Sadeghfam, S.; Gnanachandrasamy, G.; Venkatramanan, S. Formulating Convolutional Neural Network for Mapping Total Aquifer Vulnerability to Pollution. Environ. Pollut. 2022, 304, 119208. [Google Scholar] [CrossRef]
  77. Mitrović, T.; Antanasijević, D.; Lazović, S.; Perić-Grujić, A.; Ristić, M. Virtual Water Quality Monitoring at Inactive Monitoring Sites Using Monte Carlo Optimized Artificial Neural Networks: A Case Study of Danube River (Serbia). Sci. Total Environ. 2019, 654, 1000–1009. [Google Scholar] [CrossRef] [PubMed]
  78. Kuo, Y.-M.; Munoz-Carpena, R. Simplified Modeling of Phosphorus Removal by Vegetative Filter Strips to Control Runoff Pollution from Phosphate Mining Areas. J. Hydrol. 2009, 378, 343–354. [Google Scholar] [CrossRef]
  79. Huang, Y.; Chen, S.; Tang, X.; Sun, C.; Zhang, Z.; Huang, J. Dynamic Patterns and Potential Drivers of River Water Quality in a Coastal City: Insights from a Machine-Learning-Based Framework and Water Management. J. Environ. Manag. 2024, 370, 122911. [Google Scholar] [CrossRef]
  80. Huan, J.; Fan, Y.; Xu, X.; Zhou, L.; Zhang, H.; Zhang, C.; Hu, Q.; Cai, W.; Ju, H.; Gu, S. Deep Learning Model Based on Coupled SWAT and Interpretable Methods for Water Quality Prediction under the Influence of Non-Point Source Pollution. Comput. Electron. Agric. 2025, 231, 109985.
  81. He, Y.; He, Y.; Sen, B.; Li, H.; Li, J.; Zhang, Y.; Zhang, J.; Jiang, S.C.; Wang, G. Storm Runoff Differentially Influences the Nutrient Concentrations and Microbial Contamination at Two Distinct Beaches in Northern China. Sci. Total Environ. 2019, 663, 400–407.
  82. Yang, R.; Yin, L.; Hao, X.; Liu, L.; Wang, C.; Li, X.; Liu, Q. Identifying a Suitable Model for Predicting Hourly Pollutant Concentrations by Using Low-Cost Microstation Data and Machine Learning. Sci. Rep. 2022, 12, 19949.
  83. Zhu, L.; Cui, T.; Runa, A.; Pan, X.; Zhao, W.; Xiang, J.; Cao, M. Robust Remote Sensing Retrieval of Key Eutrophication Indicators in Coastal Waters Based on Explainable Machine Learning. ISPRS J. Photogramm. Remote Sens. 2024, 211, 262–280.
  84. Zhang, Y.; Li, W.; Wen, W.; Zhuang, F.; Yu, T.; Zhang, L.; Zhuang, Y. Universal High-Frequency Monitoring Methods of River Water Quality in China Based on Machine Learning. Sci. Total Environ. 2024, 947, 174641.
  85. Feng, B.; Ma, J.; Liu, Y.; Wang, L.; Zhang, X.; Zhang, Y.; Zhao, J.; He, W.; Chen, Y.; Weng, L. Application of Machine Learning Approaches to Predict Ammonium Nitrogen Transport in Different Soil Types and Evaluate the Contribution of Control Factors. Ecotoxicol. Environ. Saf. 2024, 284, 116867.
  86. Shi, C.; Zhuang, N.; Li, Y.; Xiong, J.; Zhang, Y.; Ding, C.; Liu, H. Identifying Factors Influencing Reservoir Eutrophication Using Interpretable Machine Learning Combined with Shoreline Morphology and Landscape Hydrological Features: A Case Study of Danjiangkou Reservoir, China. Sci. Total Environ. 2024, 951, 175450.
  87. Xu, Y.; Su, B.; Wang, H. Development of a Runoff Pollution Empirical Model and Pollution Machine Learning Models of the Paddy Field in the Taihu Lake Basin Based on the Paddy In Situ Observation Method. Water 2022, 14, 3277.
  88. Montalvo, L.; Fosca, D.; Paredes, D.; Abarca, M.; Saito, C.; Villanueva, E. An Air Quality Monitoring and Forecasting System for Lima City With Low-Cost Sensors and Artificial Intelligence Models. Front. Sustain. Cities 2022, 4, 849762.
  89. Hamza, M.A.; Shaiba, H.; Marzouk, R.; Alhindi, A.; Asiri, M.M.; Yaseen, I.; Motwakel, A.; Rizwanullah, M. Big Data Analytics with Artificial Intelligence Enabled Environmental Air Pollution Monitoring Framework. CMC-Comput. Mater. Contin. 2022, 73, 3235–3250.
  90. Zhuang, Y.; Wen, W.; Ruan, S.; Zhuang, F.; Xia, B.; Li, S.; Liu, H.; Du, Y.; Zhang, L. Real-Time Measurement of Total Nitrogen for Agricultural Runoff Based on Multiparameter Sensors and Intelligent Algorithms. Water Res. 2022, 210, 117992.
  91. Chen, B.; Mu, X.; Chen, P.; Wang, B.; Choi, J.; Park, H.; Xu, S.; Wu, Y.; Yang, H. Machine Learning-Based Inversion of Water Quality Parameters in Typical Reach of the Urban River by UAV Multispectral Data. Ecol. Indic. 2021, 133, 108434.
  92. Song, B.; Park, K. Comparison of Outdoor Compost Pile Detection Using Unmanned Aerial Vehicle Images and Various Machine Learning Techniques. Drones 2021, 5, 31.
  93. Fouladi Osgouei, H.; Zarghami, M.; Mosaferi, M.; Karimzadeh, S. A Novel Analysis of Critical Water Pollution in the Transboundary Aras River Using the Sentinel-2 Satellite Images and ANNs. Int. J. Environ. Sci. Technol. 2022, 19, 9011–9026.
  94. Jakovljevic, G.; Alvarez-Taboada, F.; Govedarica, M. Long-Term Monitoring of Inland Water Quality Parameters Using Landsat Time-Series and Back-Propagated ANN: Assessment and Usability in a Real-Case Scenario. Remote Sens. 2024, 16, 68.
  95. Lin, Y.; Li, L.; Yu, J.; Hu, Y.; Zhang, T.; Ye, Z.; Syed, A.; Li, J. An Optimized Machine Learning Approach to Water Pollution Variation Monitoring with Time-Series Landsat Images. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102370.
  96. Bertone, E.; Burford, M.A.; Hamilton, D.P. Fluorescence Probes for Real-Time Remote Cyanobacteria Monitoring: A Review of Challenges and Opportunities. Water Res. 2018, 141, 152–162.
  97. Tawabini, B.; Yassin, M.A.; Benaafi, M.; Adetoro, J.A.; Al-Shaibani, A.; Abba, S.I. Spatiotemporal Variability Assessment of Trace Metals Based on Subsurface Water Quality Impact Integrated with Artificial Intelligence-Based Modeling. Sustainability 2022, 14, 2192.
  98. Akinpelu, A.A.; Ali, M.E.; Owolabi, T.O.; Johan, M.R.; Saidur, R.; Olatunji, S.O.; Chowdbury, Z. A Support Vector Regression Model for the Prediction of Total Polyaromatic Hydrocarbons in Soil: An Artificial Intelligent System for Mapping Environmental Pollution. Neural Comput. Appl. 2020, 32, 14899–14908.
  99. Azrour, M.; Mabrouki, J.; Fattah, G.; Guezzaz, A.; Aziz, F. Machine Learning Algorithms for Efficient Water Quality Prediction. Model. Earth Syst. Environ. 2022, 8, 2793–2801.
  100. Ahmadianfar, I.; Shirvani-Hosseini, S.; Samadi-Koucheksaraee, A.; Yaseen, Z.M. Surface Water Sodium (Na+) Concentration Prediction Using Hybrid Weighted Exponential Regression Model with Gradient-Based Optimization. Environ. Sci. Pollut. Res. 2022, 29, 53456–53481.
  101. Chakraborty, M.; Sarkar, S.; Mukherjee, A.; Shamsudduha, M.; Ahmed, K.M.; Bhattacharya, A.; Mitra, A. Modeling Regional-Scale Groundwater Arsenic Hazard in the Transboundary Ganges River Delta, India and Bangladesh: Infusing Physically-Based Model with Machine Learning. Sci. Total Environ. 2020, 748, 141107.
  102. Alkindi, K.M.; Mukherjee, K.; Pandey, M.; Arora, A.; Janizadeh, S.; Pham, Q.B.; Anh, D.T.; Ahmadi, K. Prediction of Groundwater Nitrate Concentration in a Semiarid Region Using Hybrid Bayesian Artificial Intelligence Approaches. Environ. Sci. Pollut. Res. 2022, 29, 20421–20436.
  103. Wang, F.; Wang, Y.; Zhang, K.; Hu, M.; Weng, Q.; Zhang, H. Spatial Heterogeneity Modeling of Water Quality Based on Random Forest Regression and Model Interpretation. Environ. Res. 2021, 202, 111660.
  104. Chan, C.; Huang, G. Artificial Intelligence for Management and Control of Pollution Minimization and Mitigation Processes. Eng. Appl. Artif. Intell. 2003, 16, 75–90.
  105. Papadomanolaki, M.; Vakalopoulou, M.; Zagoruyko, S.; Karantzalos, K. Benchmarking Deep Learning Frameworks for the Classification of Very High Resolution Satellite Multispectral Data. In Proceedings of the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXIII ISPRS Congress, Prague, Czech Republic, 12–19 July 2016; Volume III-7, pp. 83–88.
  106. Cabaneros, S.M.; Hughes, B. Methods Used for Handling and Quantifying Model Uncertainty of Artificial Neural Network Models for Air Pollution Forecasting. Environ. Model. Softw. 2022, 158, 105529.
  107. Jiang, Y.; Nan, Z.; Yang, S. Risk Assessment of Water Quality Using Monte Carlo Simulation and Artificial Neural Network Method. J. Environ. Manag. 2013, 122, 130–136.
  108. Grbčić, L.; Lučin, I.; Kranjčević, L.; Družeta, S. A Machine Learning-Based Algorithm for Water Network Contamination Source Localization. Sensors 2020, 20, 2613.
  109. Luo, M.; Liu, X.; Legesse, N.; Liu, Y.; Wu, S.; Han, F.X.; Ma, Y. Evaluation of Agricultural Non-Point Source Pollution: A Review. Water Air Soil Pollut. 2023, 234, 657.
  110. Yonar, A.; Yonar, H. Modeling Air Pollution by Integrating ANFIS and Metaheuristic Algorithms. Earth Syst. Environ. 2023, 9, 1621–1631.
  111. Kwon, S.; Noh, H.; Seo, I.W.; Jung, S.H.; Baek, D. Identification Framework of Contaminant Spill in Rivers Using Machine Learning with Breakthrough Curve Analysis. Int. J. Environ. Res. Public Health 2021, 18, 1023.
  112. Feng, R.; Zheng, H.; Zhang, A.; Huang, C.; Gao, H.; Ma, Y. Unveiling Tropospheric Ozone by the Traditional Atmospheric Model and Machine Learning, and Their Comparison: A Case Study in Hangzhou, China. Environ. Pollut. 2019, 252, 366–378.
  113. Meng, L.; Yan, Y.; Jing, H.; Yousuf Jat Baloch, M.; Du, S.; Du, S. Large-Scale Groundwater Pollution Risk Assessment Research Based on Artificial Intelligence Technology: A Case Study of Shenyang City in Northeast China. Ecol. Indic. 2024, 169, 112915.
  114. Dubinsky, E.A.; Butkus, S.R.; Andersen, G.L. Microbial Source Tracking in Impaired Watersheds Using PhyloChip and Machine-Learning Classification. Water Res. 2016, 105, 56–64.
  115. Carbajal-Hernández, J.J.; Sánchez-Fernández, L.P.; Carrasco-Ochoa, J.A.; Martínez-Trinidad, J. Fco. Assessment and Prediction of Air Quality Using Fuzzy Logic and Autoregressive Models. Atmos. Environ. 2012, 60, 37–50.
  116. Ma, B. The Impact of Environmental Pollution on Residents’ Income Caused by the Imbalance of Regional Economic Development Based on Artificial Intelligence. Sustainability 2023, 15, 637.
  117. Carroll, S.P.; Dawes, L.; Hargreaves, M.; Goonetilleke, A. Faecal Pollution Source Identification in an Urbanising Catchment Using Antibiotic Resistance Profiling, Discriminant Analysis and Partial Least Squares Regression. Water Res. 2009, 43, 1237–1246.
  118. Zhang, Y.; Brusseau, M.L.; Neupauer, R.M.; Wei, W. General Backward Model to Identify the Source for Contaminants Undergoing Non-Fickian Diffusion in Water. Environ. Sci. Total Environ. 2019, 693, 133440.
  119. Kuai, P.; Li, W.; Liu, N. Evaluating the Effects of Land Use Planning for Non-Point Source Pollution Based on a System Dynamics Approach in China. PLoS ONE 2015, 10, e0135572.
  120. Lui, W.; Xu, Y.; Fan, D.; Li, Y.; Shao, X.-F.; Zheng, J. Alleviating Corporate Environmental Pollution Threats toward Public Health and Safety: The Role of Smart City and Artificial Intelligence. Saf. Sci. 2021, 143, 105433.
  121. Sriram, S.; Santhiya, S.; Rajeshkumar, G.; Gayathri, S.; Vijaya, K. Predict the Quality of Freshwater Using Support Vector Machines. In Proceedings of the 2023 2nd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, 4–6 May 2023; pp. 370–377.
  122. Kumwimba, M.N.; Zhu, B.; Stefanakis, A.I.; Ajibade, F.O.; Dzakpasu, M.; Soana, E.; Wang, T.; Arif, M.; Muyembe, D.K.; Agboola, T.D. Advances in Ecotechnological Methods for Diffuse Nutrient Pollution Control: Wicked Issues in Agricultural and Urban Watersheds. Front. Environ. Sci. 2023, 11, 1199923.
  123. Sun, R.; Cheng, X.; Chen, L. A Precipitation-Weighted Landscape Structure Model to Predict Potential Pollution Contributions at Watershed Scales. Landsc. Ecol. 2018, 33, 1603–1616.
  124. Liu, D.; Yao, Z.; Yang, X.; Xiong, C.; Nie, Q. Research Progress and Trend of Agricultural Non-Point Source Pollution from Non-Irrigated Farming Based on Bibliometrics. Water 2023, 15, 1610.
  125. Pouyanfar, N.; Harofte, S.Z.; Soltani, M.; Siavashy, S.; Asadian, E.; Ghorbani-Bidkorbeh, F.; Kecili, R.; Hussain, C.M. Artificial Intelligence-Based Microfluidic Platforms for the Sensitive Detection of Environmental Pollutants: Recent Advances and Prospects. Trends Environ. Anal. Chem. 2022, 34, e00160.
Figure 1. Review articles on AI applications in NPSP [10,23,24,25,26,27,28,29,30,31,32,33].
Figure 2. PRISMA diagram for article inclusion/exclusion in this systematic review.
Figure 3. Annual distribution of publications.
Figure 4. Map showing productivity and collaboration among countries. Line thickness indicates collaboration frequency (thickest line: 7; thinnest line: 1); color intensity indicates each country's share of the documents selected in this study (darkest blue: 29.4%).
Figure 5. Network map showing authorship analysis. The size of the nodes reflects the frequency. The more the item occurs, the larger the node and font size. The colors indicate the cluster to which a node has been assigned.
Figure 6. Most cited papers (a) and top 10 journals (b). Bold and underlined values indicate the top three articles in each category. *: Paper age in years. Numbers in red indicate journal impact factors [23,26,28,47,51,52,53,54,55,56].
Figure 7. (a) Wordcloud of keywords and (b) conceptual structure map.
Figure 8. Subsections of AI classes linked to NPSP models.
Figure 9. A broad perspective on AI applications in NPSP.
Table 1. List of abbreviations.
General Concepts & Frameworks
AI: Artificial Intelligence
DL: Deep Learning
GIS: Geographic Information System
IoT: Internet of Things
ML: Machine Learning
NPSP: Non-Point Source Pollution
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RL: Reinforcement Learning
SL: Supervised Learning
UAV: Unmanned Aerial Vehicle
USA: United States of America
Machine-Learning & AI Models/Algorithms
ACA: Ant Colony Algorithm
ANFIS: Adaptive Neuro-Fuzzy Inference System
ANN: Artificial Neural Network
BART: Bayesian Additive Regression Trees
Baysia-ANN: Bayesian Artificial Neural Network
BGLM: Bayesian Generalised Linear Model
BPNN: Back-Propagation Neural Network
BRNN: Bayesian Regularised Neural Network
BRR: Bayesian Ridge Regression
BRT: Boosted Regression Tree
CNN: Convolutional Neural Network
CS: Cuckoo Search
DL: Deep Learning
DNN: Deep Neural Network
DT: Decision Tree
ELM: Extreme Learning Machine
ESN: Echo State Network
FFNN: Feed-Forward Neural Network
FL: Fuzzy Logic
FNN: Fuzzy Neural Network
GA: Genetic Algorithm
GA-ANN: Genetic-Algorithm-optimised Artificial Neural Network
GA-Ridge: Genetic-Algorithm-optimised Ridge Regression
GBM: Gradient Boosting Machine
GBT: Gradient Boosting Tree
GEP: Gene Expression Programming
GRNN: Generalised Regression Neural Network
HCA: Hierarchical Cluster Analysis
IA: Immune Algorithm
ICA: Imperialist Competitive Algorithm
KNN: k-Nearest Neighbour
LR: Logistic Regression
LSSVM: Least-Squares Support Vector Machine
LSTM: Long Short-Term Memory
LUR: Land Use Regression
LURF: Land Use Regression Forest
MARS: Multivariate Adaptive Regression Splines
MCS: Monte Carlo Simulation
MLP: Multi-Layer Perceptron
MLPANN: Multi-Layer Perceptron Artificial Neural Network
MT–GA: Model Tree combined with a Genetic Algorithm
NBC: Naïve Bayes Classifier
PSO: Particle Swarm Optimisation
PSO-SVM: Particle Swarm Optimisation-tuned Support Vector Machine
RBFNN: Radial Basis Function Neural Network
RBN: Radial Basis Network
RF: Random Forest
RNN: Recurrent Neural Network
SOM: Self-Organising Map
SVM: Support Vector Machine
SVR: Support Vector Regression
WA: Wavelet Analysis
WER-GBO: Weighted Exponential Regression with Gradient-Based Optimiser
WNARX: Wavelet Non-linear Autoregressive model with Exogenous inputs
WNN: Wavelet Neural Network
XGB: eXtreme Gradient Boosting
Performance Metrics & Statistical Indices
AUC: Area Under the ROC Curve
DC: Determination Coefficient
DDR: Degree of Determination Ratio
IA: Index of Agreement
Kappa: Cohen’s Kappa statistic
KGE: Kling-Gupta Efficiency
MAE: Mean Absolute Error
MAPD: Mean Absolute Percentage Deviation
MAPE: Mean Absolute Percentage Error
MSE: Mean Squared Error
NSE: Nash–Sutcliffe Efficiency
R: Pearson correlation coefficient/R language
RMSE: Root Mean Square Error
ROC: Receiver Operating Characteristic
RSE: Relative Squared Error
R2: Coefficient of Determination
WQI: Water Quality Index
Indices/Vulnerability Maps
AQI: Air Quality Index
IVI: Intrinsic Vulnerability Index
SVI: Specific Vulnerability Index
TVI: Total Vulnerability Index

Platforms & Systems
ETAPM-AIT: Environmental Toxicology for Air Pollution Monitoring using AI Technique
LIME: Local Interpretable Model-agnostic Explanations
OAI-AQPC: Optimal AI-based Air Quality Prediction and Classification
SWAT: Soil and Water Assessment Tool
Environmental & Pollutant Parameters
Al: Aluminium
As: Arsenic
BOD: Biochemical Oxygen Demand
Ca: Calcium
Cd: Cadmium
CH4: Methane
Cl: Chloride ion
Co: Cobalt
CO: Carbon Monoxide
COD: Chemical Oxygen Demand
CODMn: Permanganate-index Chemical Oxygen Demand
CO2: Carbon Dioxide
Cr: Chromium
DIN: Dissolved Inorganic Nitrogen
DO: Dissolved Oxygen
DS: Dissolved Solids
E. coli: Escherichia coli
EC: Electrical Conductivity
Fe: Iron
HCO3: Bicarbonate ion
K: Potassium
Mg: Magnesium
Na: Sodium
NH3: Ammonia
NH3-N: Ammonia nitrogen
NH4+-N: Ammonium nitrogen
Ni: Nickel
NO: Nitric Oxide
NO2: Nitrogen Dioxide
NO3: Nitrate
NOx: Nitrogen Oxides
O3: Ozone
P: Phosphorus
PAHs: Polycyclic Aromatic Hydrocarbons
Pb: Lead
PM10: Particulate Matter ≤10 µm
PM2.5: Particulate Matter ≤2.5 µm
PO₄³⁻: Phosphate ion
S: Sulfur
SAR: Sodium Adsorption Ratio
SO2: Sulfur Dioxide
SO₄²⁻: Sulfate ion
T: Temperature
TDS: Total Dissolved Solids
TH: Total Hardness
TN: Total Nitrogen
TOC: Total Organic Carbon
TP: Total Phosphorus
TPH: Total Petroleum Hydrocarbons
TSS: Total Suspended Solids
V.: Vibrio genus bacteria
VOCs: Volatile Organic Compounds
Zn: Zinc
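Several of the performance metrics abbreviated in Table 1 (RMSE, MAE, NSE, KGE) have standard closed-form definitions. The following is a minimal sketch of those textbook definitions in plain Python, assuming paired observed and simulated series of equal length. The function names and the sample concentrations are hypothetical and are not taken from any study in this review.

```python
import math
from statistics import mean, pstdev


def rmse(obs, sim):
    """Root Mean Square Error."""
    return math.sqrt(mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mae(obs, sim):
    """Mean Absolute Error."""
    return mean([abs(o - s) for o, s in zip(obs, sim)])


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs_mean = mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - residual / spread


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical total nitrogen concentrations (mg/L), for illustration only.
obs = [1.2, 1.8, 2.5, 3.1, 2.0]
sim = [1.0, 2.0, 2.4, 3.3, 2.1]

print(f"RMSE={rmse(obs, sim):.3f}  MAE={mae(obs, sim):.3f}  "
      f"NSE={nse(obs, sim):.3f}  KGE={kge(obs, sim):.3f}")
```

RMSE and MAE are in the units of the predictand, whereas NSE and KGE are dimensionless, so the two pairs answer different questions and are usually reported together.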
Table 2. Inclusion and exclusion criteria.
Stage: Screening & Eligibility
  • Type of publication
  • Full-text availability
  • Language of articles
  • Relevance to the study
  • Quality of the article
  • Comparative analysis of content
  • Assessing, judging, and identifying potential bias risks, and appraising internal or external validity
Inclusion Criteria
  • Peer-reviewed journal articles
  • Published in English
  • Full-text availability
  • Only high-quality articles with internal and external validity
  • Includes search terms: AI, ML, data mining, expert systems, predictive modeling, deep learning, nonpoint source pollution
  • Apply AI to NPSP (monitoring, prediction, modeling, source apportionment, etc.)
  • Present original methods, models, or comparative evaluations
  • Demonstrate interdisciplinary integration (e.g., AI + hydrology, GIS, IoT)
Exclusion Criteria
  • Non-peer-reviewed sources (e.g., blogs, magazines, grey literature)
  • Articles written in languages other than English
  • Abstract-only or inaccessible articles
  • Articles lacking both AI relevance and NPSP context
  • Studies focused only on point-source pollution with no clear connection to NPSP
  • Papers using AI terms superficially without applying or evaluating them
  • Studies lacking evidence of cross-field application or conceptual integration
  • Duplicate records or studies with substantial content overlap (>50%)
  • Books, theses, dissertations, technical reports, and conference abstracts
Justification
  • Ensures scientific quality, credibility, and reliability of included studies.
  • Limits were set based on language proficiency and available resources for accurate assessment and to avoid loss in translation.
  • Required to enable full evaluation of study objectives, methods, and findings.
  • Aligns directly with search strategy and ensures relevance to the study’s central theme.
  • Maintains alignment with the review’s specific focus on NPSP and avoids dilution of results.
  • Ensures that included studies contribute methodologically or empirically to the AI–NPSP field.
  • Supports the framing of NPS pollution as a wicked problem requiring multi-disciplinary solutions.
  • Prevents redundancy and ensures a diverse set of perspectives and methods.
  • These formats sometimes lack peer review, methodological detail, or accessibility for critical evaluation.
Table 3. NPSP characterization based on factors such as pollution sources, drivers, pollutants, transport pathways, and impacts.
No | Pollution Source | Drivers | Pollutants | Transport | Impacts | Reference
1 | Industrial, agricultural, residential, and urban discharges; atmospheric dry/wet deposition | Poor wastewater management and mixed land-use activities | Organic compounds, heavy metals, N and P compounds | Diffuse runoff and atmospheric deposition | Eutrophication risks from nutrient buildup (algal blooms) threatening drinking water and irrigation supply; overall degradation of ecosystem services and human health due to contaminated freshwater | [74]
2 | Highway stormwater runoff, vehicle-derived contaminants (oil, tire wear, etc.) on road surfaces | High traffic volume and precipitation events | Heavy metals (e.g., Cr, Pb, Zn), PAHs, TOC, and TSS | Stormwater runoff | Acute and chronic ecological effects, including soil and water contamination by metals and PAHs. Polluted runoff degrades water quality and can infiltrate aquifers, posing risks to drinking water | [75]
3 | Agricultural activities | Intensive agriculture (e.g., fertilizer) and improper waste disposal | NO3, As, and fluoride | Leaching and runoff | Elevated nitrate and toxin levels compromise groundwater quality and pose health risks. Pollution transfer is accelerated in vulnerable areas, threatening drinking water supplies | [76]
4 | Urban wastewater discharges and agricultural fertilizers | Expanding urban areas and intensive farming | NO3 | Leaching and runoff | Groundwater nitrate contamination leads to serious human health issues: increased risks of cancer, methemoglobinemia (“blue baby syndrome”), thyroid disorders, and other illnesses | [21]
5 | Untreated municipal wastewaters | Poor wastewater treatment | Organic matter (high BOD), nutrients (N, P), and elevated levels of salts | Sewage outfalls and runoff | Eutrophication and hypoxia in the Danube are exacerbated, harming aquatic life and ecosystem health. Altered thermal and flow regimes (warmer, stagnant water) boost algal growth and disrupt natural stratification, leading to biodiversity losses | [77]
6 | Industrial, agricultural, and domestic sewage discharges | Rapid urbanization with inadequate sewer infrastructure and treatment systems | Excess N (total N, ammoniacal nitrogen) and P | Sewage effluent and runoff | Elevated N and P levels cause eutrophication (algal overgrowth), threatening freshwater resources and ecosystem services like drinking water, food supply, and biodiversity | [20]
7 | Phosphate mining operations | Landscape disturbance (excavation of phosphate rock) and insufficient containment of waste materials | Various forms of P (total P, particulate P, dissolved P) | Rainfall runoff | Degrades water quality, causing eutrophication in downstream ecosystems | [78]
8 | Industrial smokestacks, highways, landfills, and parking lots | Rapid industrialization and urban growth | Fine particulate matter (PM2.5, PM10) laden with toxic heavy metals (As, Cr, Co, Cd, Ni, Pb) and PAHs | Atmospheric dispersion, traffic, and wind | Deteriorating air quality | [71]
9 | Municipal wastewater, agricultural fertilizers, and manure | High population density and intensive fertilizer use | P, N, permanganate COD | Runoff | Excess nutrient inputs degrade water quality, leading to algal blooms and oxygen depletion | [79]
10 | Livestock, fertilizer use, sewage discharge, livestock manure | Intensive farming and inadequate rural waste treatment | NH3-N and TN | Nutrients leach from soils into groundwater and run off overland during rainfall | Eutrophication | [80]
11 | Stormwater runoff, treated sewage effluent, and submarine groundwater discharge | Intense rainfall events and inadequate runoff infrastructure; coastal development near animal farms and sewage sources | Nutrients (N, P), agricultural pesticides, heavy metals, and microbial pathogens, especially marine Vibrio bacteria (e.g., V. parahaemolyticus, V. vulnificus) and E. coli | Surface runoff, sewage outfalls | Post-storm deterioration of water quality, public health risks, and coastal ecosystem imbalance | [81]
12 | Outdoor compost piles (OCPs) and nutrient-rich waste | Poor management of agricultural waste (high-nutrient compost and manure left exposed); heavy rainfall events | High concentrations of N and P | Rainfall runoff | Eutrophication, water quality degradation, harm to aquatic ecosystems, and impacts on urban water supplies | [70]
13 | Urban industrial and traffic emissions | Excessive urban (traffic) and industrial activities | NO2, SO2, CO, particulate matter (PM2.5/PM10), O3, and VOCs | Atmospheric deposition, wind | Public health risks | [52]
14 | Landfills | Insufficient containment at landfill sites and high waste volumes | Heavy metals (Fe, Pb, Cr, Cd, Zn, Ni, etc.), hCOD, and other inorganic solutes (e.g., Na, SO4) | Leachate infiltrates through topsoil and subsoil into groundwater | Soil ecosystem disturbance, groundwater contamination, and plant heavy metal uptake | [67]
15 | Agricultural activities (crop residue burning), fossil fuel combustion (vehicles, power plants), residential heating (wood/coal burning), natural events (forest fires) | Population growth, climate change, and industrial development | PM2.5, CO2, NO2, CO, CH4, and NH3 | Atmospheric deposition | Reduced air quality and public health risks | [51]
16 | Industrial activities, transportation (vehicle emissions), coal-fired power plants, and household use of solid fuels | Rapid urbanization, high energy demand, and improper waste management | CO2, SO2, NO2, O3, PM2.5, and PM10 | Atmospheric deposition | Reduced air quality and public health risks | [73]
17 | Vehicle exhaust and coal burning, natural processes (desert dust storms) | Urban traffic and industrial emissions | PM2.5, PM10, and gaseous pollutants such as CO, NOx (NO and NO2), and SO2 | Airborne particle dispersion, seasonal winds | Reduced air quality and public health risks | [82]
18 | Fertilizer | Rapid population growth, intensive agriculture, industrialization, and urban expansion | DIN and soluble reactive P | Atmospheric deposition, agricultural runoff, and sewage | Eutrophication in bay waters, algal blooms, and red tides | [83]
19 | Domestic sewage, garbage, and human waste from villages and towns, plus livestock excrement and agricultural chemicals from cropland and pasture | Open dumping of sewage, intensive animal husbandry, and unmanaged manure | N and P | Runoff | Eutrophication and water quality decline | [44]
20 | Physical debris (plastic waste), chemical contaminants, thermal discharges, and oil spills | Human maritime activities (shipping accidents, illegal dumping) and local industry (which can warm water or release chemicals) | Plastic debris, Cu, and pesticides | Floating debris and oil, atmospheric deposition | Mercury biomagnification, mortality of native aquatic organisms, and changes in species composition | [7]
21 | Agricultural practices and industrial activities | Industrial expansion, increased fertilizer use, urbanization, and population growth | TN, TP, and NH4+-N | Rainfall runoff and erosion | Low access to safe drinking water, eutrophication, loss of aquatic biodiversity, and public health risks | [84]
22 | Farmlands and rural settlements | Land-use changes, excessive use of fertilizers, and poor waste management | TP and TN | Rainfall runoff from agricultural fields and villages | Algae proliferation and ecosystem stress | [63]
23 | Agricultural farmlands | Over-application of fertilizers beyond crop needs and improper timing of fertilizer application | NH4+-N and P | Percolation and runoff | Nutrient enrichment of surface waters (eutrophication) and contamination of groundwater with nitrates, reducing drinking water quality | [85]
24 | Agricultural landscapes, fertilizer and pesticide use | Poor fertilizer management and intense cultivation practices | N and P | Surface runoff | Eutrophication and algal blooms | [86]
25 | Oil palm plantations and mining activities | Inadequate wastewater management | As, Al, Cd, and Cr | Runoff | Drinking water source contamination and public health risk | [61]
26 | Stormwater waste and debris, agricultural activities | Unregulated discharges, industrial growth, and high fertilizer use | NH3 | Urban runoff (sewer overflow, wash-off from contaminated soils) and agricultural return flows | Water quality deterioration (water unfit for drinking and bathing), ecosystem health decline, and loss of biodiversity | [53]
27 | Agricultural paddy field | Heavy use of fertilizers and pesticides | TN, NH4+-N, TP, and organics (measured as CODMn) | Surface runoff during irrigation | Algal blooms, water quality deterioration, and adverse effects on drinking water supplies and biodiversity | [87]
Table 5. Pros and cons of AI models applied in NPSP.
Table 5. Pros and cons of AI models applied in NPSP.
ModelAdvantages (Pros)Limitations (Cons)References
ANN
  • Highly flexible for modeling complex nonlinear pollutant relationships
  • Can effectively incorporate a priori unknown knowledge from the given training pollution dataset
  • Tolerant to incomplete data, suitable for nutrient load prediction and river water quality forecasting
  • Require large training datasets of historical pollution data
  • Needs careful optimization to accurately model specific pollutant behaviors
  • Prone to overfitting under complex pollution scenarios (e.g., diffuse pollution sources)
  • Limited interpretability (‘black box’ nature), making it difficult to understand specific pollution cause-and-effect relationships
[10,23,24,26,29,32]
SVM
  • Effective for small-to-medium water quality datasets with nonlinear relationships
  • Strong generalization and classification flexibility for pollution source identification.
  • Effective avoidance of overfitting and multiclass minimization when classifying pollution sources or levels.
  • Sensitive to kernel choice and parameter tuning for optimal pollution prediction accuracy.
  • Scalability is limited for large-scale watershed pollution modeling datasets.
  • Computationally expensive and memory-intensive for larger environmental datasets.
  • Lack of transparency and interpretability in the predicted pollution outcome
[20,29,32,53,92,112]
RF
  • Robust prediction accuracy for pollution levels.
  • Resilient to overfitting with complex environmental data. Handles high-dimensional and noisy NPSP data (Diverse pollutant sources and pathways). Provides variable importance ranking, identifying key factors driving NPSP
  • Limited interpretability, making it hard to explain predicted pollution patterns.
  • May require large, balanced datasets for stable predictions across spatially heterogeneous watersheds
[20,56,85,91,92,111,114]
LSTM
  • Excels at capturing temporal pollutant dynamics
Ideal for modeling pollution linked to rainfall-runoff events
  • High computational demand for training on long-term pollution time series.
Large historical pollution datasets are required to prevent unstable training
[29,52]
CNN
  • Powerful in extracting spatial features from satellite/UAV imagery
  • Effective for pollution mapping across large river basins
  • High training data requirement, needing extensive labeled pollution imagery.
  • Instability with small or imbalanced pollution datasets.
  • Limited explainability, hindering use in pollution policy contexts
[76,113]
GBM
  • High predictive accuracy for modeling complex pollutant interactions
  • Supports feature selection in heterogeneous NPSP environments
  • Computationally demanding when modeling numerous pollution variables.
  • Overfitting risk if not carefully tuned to specific pollution datasets.
  • Moderate interpretability regarding pollution drivers
[83,84,91,115]
ANFIS
  • Effectively combines interpretability and nonlinearity in pollution modeling.
  • Useful for pollutant load forecasting under uncertainty
  • Training complexity increases with more environmental input variables.
  • Labor-intensive tuning of membership functions for specific pollutant characteristics
[51,110]
Ensemble Learning
  • Enhance stability and pollution prediction accuracy by combining different AI models
  • Increases overall model complexity for pollution forecasting. Demands more computational resources for combined pollution analyses
[24,29,53,83]
FL
  • Handles uncertainty and imprecise environmental data
  • Applicable for early warning systems in case of pollution events
  • Efficiently handles outliers and ambiguous non–linear relationships between the input variables (e.g., Pollutants)
  • Limited capability for highly dynamic, nonlinear pollutant behaviors without integration with other AI techniques.
  • Individual approaches often underperform for both long-term and short-term forecasting.
  • Lacks adaptive and self-learning capabilities.
[32,67,115]
RT and CART
  • Interpretable models for understanding basic pollution relationships
  • Effective in handling nonlinear pollution patterns
  • Prone to overfitting with complex pollution data.
  • Standalone trees offer weaker pollution predictive performance without ensemble boosting.
[56,83]
DL
  • Capable of automatic feature extraction from large, heterogeneous environmental datasets
  • Supports detection of subtle pollution signals
  • Improves spatial and temporal stability for multi-step-ahead forecasting of pollutants
  • Needs a large volume of environmental training data for efficient performance.
  • Requires significant computational resources for complex pollution modeling.
  • Major interpretability challenges for environmental applications.
[32,83,89]
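Table 5 lists fuzzy logic (FL) as suited to imprecise environmental data and early warning systems, and notes that ANFIS requires tuning of membership functions. As a minimal sketch of what a membership function and a small rule base look like, the Python below scores a runoff pollution risk from two inputs. Everything specific here is an assumption for illustration and is not taken from the reviewed studies: the inputs (rainfall intensity and nitrate concentration), the breakpoints, the four rules and their output levels, and the zero-order Sugeno-style weighted average used to combine them.

```python
import math


def trapezoid(x, a, b, c, d):
    """Membership degree in [0, 1]: rises on a..b, equals 1 on b..c, falls on c..d."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def warning_risk(rain_mm_h, nitrate_mg_l):
    """Hypothetical early-warning score in [0, 1] from two imprecise readings."""
    # Assumed fuzzy sets; the breakpoints are illustrative, not calibrated values.
    rain_light = trapezoid(rain_mm_h, 0, 0, 5, 20)
    rain_heavy = trapezoid(rain_mm_h, 5, 20, math.inf, math.inf)
    nitrate_low = trapezoid(nitrate_mg_l, 0, 0, 2, 10)
    nitrate_high = trapezoid(nitrate_mg_l, 2, 10, math.inf, math.inf)

    # Each rule: (firing strength using min as fuzzy AND, assumed risk level).
    rules = [
        (min(rain_light, nitrate_low), 0.1),
        (min(rain_heavy, nitrate_low), 0.5),
        (min(rain_light, nitrate_high), 0.6),
        (min(rain_heavy, nitrate_high), 0.9),
    ]

    total = sum(weight for weight, _ in rules)
    if total == 0:
        return 0.0  # inputs outside every fuzzy set (e.g., negative readings)
    # Weighted average of rule outputs (zero-order Sugeno-style defuzzification).
    return sum(weight * level for weight, level in rules) / total
```

With these assumed breakpoints, a dry, low-nitrate reading scores 0.1 and a heavy-rain, high-nitrate reading scores 0.9; readings in between blend the rules gradually rather than switching at a hard threshold. The fixed breakpoints also illustrate the limitation noted in the table: nothing in this sketch adapts or learns, so every parameter would have to be tuned by hand or by pairing FL with another technique.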
Table 6. Summary of knowledge gaps and direction for future studies.
Identified Research Gaps | Future Research Suggested | References
AI Model Development, Optimization, and Validation
  • AI models for water quality prediction are limited by suboptimal input selection and lack integration with advanced optimization techniques, reducing their applicability.
  • Explore the optimization of input parameter sets tailored for complex, diffuse pollution conditions and integrate advanced optimization algorithms with ensemble AI models to further enhance prediction accuracy for water quality assessment.
[53]
  • Limited exploration of deep learning architectures in AI-based environmental monitoring frameworks like OAI-AQPC constrains the prediction and classification accuracy for complex NPSP scenarios.
  • Explore extending the OAI-AQPC framework by designing and integrating advanced deep learning architectures (e.g., CNNs, LSTMs) for more robust prediction and classification of diffuse pollution patterns and real-time pollution monitoring.
[89]
  • Current IoT–AI environmental monitoring frameworks (ETAPM-AIT) limit predictive accuracy and adaptability in complex, dynamic pollution conditions, so more flexible and context-aware modeling approaches are needed.
  • Enhance the predictive performance of the ETAPM-AIT model by integrating advanced deep learning architectures, which may yield more accurate and robust predictions, particularly in complex or data-rich environments.
[51]
  • ANFIS models for air pollution prediction are sometimes restricted to a set of metaheuristic optimization techniques, which limits generalizability and comparative benchmarking across different environments.
  • Expand air pollution prediction efforts by applying trained ANFIS structures across diverse geographical locations. Additionally, explore and benchmark the effectiveness of alternative algorithms for ANFIS optimization to enhance predictive robustness and adaptability to varying pollution and meteorological conditions.
[110]
  • Current retrieval models rely on pixel-by-pixel satellite image processing, neglecting spatiotemporal correlations of WQPs. Additionally, heavy reliance on manual feature engineering for non-optically active WQPs increases workload.
  • Advance water quality retrieval by integrating DL algorithms (e.g., CNNs for spatial feature extraction and RNNs for temporal dynamics) to automatically capture complex spatiotemporal patterns in remote sensing images, minimizing reliance on manual feature engineering. Further, incorporate hydrometeorological conditions (e.g., droughts, floods) as additional inputs to enhance the predictive performance of eutrophication indicator models in coastal environments.
[83]
  • Limited exploration of different CNN architectures for groundwater vulnerability mapping.
  • Explore the application of advanced CNN architectures and systematically evaluate the effects of model dimensionality (1D, 2D, 3D CNNs) on the accuracy and spatial prediction patterns of groundwater vulnerability to pollution, to optimize model selection and improve mapping precision.
[76]
  • Existing ANN-based air pollution forecasting studies often lack systematic handling of missing data and ignore the computational efficiency and running time constraints critical for real-time applications.
  • Investigate and integrate advanced imputation methods for handling missing environmental datasets and systematically evaluate the computational penalties and runtime performance of ANN models to enhance their practical applicability for real-time pollution forecasting and monitoring.
[23]
  • AI-driven air quality models primarily focus on pollutant concentration predictions but lack systematic validation through comparative benchmarking against other indices and real-world health impact data, limiting their credibility for NPSP management and public health applications.
  • Strengthen the validation of AI-based air quality indices by (i) systematically comparing the proposed AQI against other established air quality indices, (ii) correlating AQI scores with disease incidence rates, and (iii) analyzing specific polluted areas to assess the effectiveness of predictive models in capturing the health and environmental impacts of urban NPSP.
[115]
Data Limitations and Monitoring Challenges
  • Limited availability of long-term continuous data from hyperspectral remote sensing for monitoring optically inactive WQPs; current satellite observations often fail to capture the high temporal variability of nutrient concentrations.
  • Explore the integration of high-temporal-resolution multispectral and hyperspectral remote sensing data with machine learning frameworks to improve the monitoring of rapid nutrient fluctuations in large-scale inland river systems, particularly focusing on NPSP dynamics.
[20]
  • Although the ML-based (e.g., ETR) indirect monitoring method demonstrated strong performance, its predictive capacity could be further improved by integrating additional hydrological and anthropogenic predictors. Moreover, the method’s transferability to different types of water bodies with varying hydro-environmental conditions remains untested.
  • Explore the integration of supplementary predictors such as sediment load and flow velocity to enhance prediction accuracy at specific stations. Additionally, adapt and optimize the proposed machine learning-based framework for application to other water bodies (e.g., lakes, estuaries) by tailoring methodological steps to local hydro-environmental characteristics and nutrient cycling dynamics.
[84]
  • Despite advances in ML applications for groundwater pollution, challenges remain regarding the high computational costs of model training and deployment, limiting scalability and broader implementation.
  • Explore methods to mitigate computational costs in groundwater pollution modeling to enable scalable and real-time pollution assessment and management.
[31]
  • Existing AI models (e.g., FL, NN) effectively predict landfill leachate penetration into groundwater but do not address the downstream impacts of heavy metal contamination on soil health and plant physiology.
  • Investigate the levels of heavy metal accumulation in soils affected by landfill leachate infiltration and assess their impact on plant growth and physiological functions to inform the development of targeted remediation strategies and promote sustainable agricultural practices in NPSP-affected areas.
[67]
Governance, Policy, and Social Dimensions
  • Limited understanding of the complex relationship between income inequality and environmental pollution and the inadequacy of addressing both issues through a single, unified strategy.
  • Explore integrated modeling approaches that combine socio-economic and environmental data using advanced AI techniques (e.g., hybrid KNN-SVM models) to better capture the dynamic interactions between income inequality and environmental pollution across different regional development contexts.
[116]
  • GBT provides exposure estimates for arsenic pollution but lacks causal analysis linking long-term arsenic ingestion via drinking water to specific human health outcomes; insufficient evaluation of water treatment interventions further limits risk mitigation strategies.
  • Conduct causal studies to establish direct relationships between arsenic ingestion from diffuse water sources and human health risks and evaluate the effectiveness of water treatment technologies at both the centralized treatment plant and household levels to enhance safe drinking water supply management.
[61]
  • Traditional LUR models struggle to capture complex spatial and nonlinear relationships in particulate matter exposure assessment.
  • LURF is expected to be a valuable tool for exposure assessment in epidemiological studies that examine the relationship between elemental components of particulate matter and associated health outcomes. Looking ahead, the integration of random forest and other advanced machine learning techniques into land use regression models may significantly improve the accuracy and resolution of exposure assessments, supporting more precise and data-driven public health research.
[56]
  • The GA-XGBoost algorithm was used primarily for urban river water quality monitoring and focused on pollutant concentration prediction, overlooking dynamic socio-environmental factors, such as human migration patterns, that can influence NPSP trends.
  • Investigate the relationship between population mobility (e.g., migration into/out of towns) and changes in pollutant levels, particularly arsenic concentrations, to enhance AI-driven urban water quality monitoring models and better capture emerging NPSP dynamics.
[91]
System Integration: IoT, Remote Sensing
  • Limited evaluation of AI–IoT frameworks under real-time, variable environmental conditions constrains their scalability and adaptability for NPSP monitoring and forecasting
  • Evaluate AI systems integrated with IoT in terms of response time and predictive performance across system components, and assess their effectiveness under different operational conditions, applying dynamic context-based optimization based on system operator parameters. Incorporating multiple nodes capable of running predictive algorithms could further enhance speed, scalability, and reliability.
[73]
  • Difficulty in gathering synchronized air, water, and soil quality data limits the ability to develop holistic and accurate AI-based environmental decision support systems for mangrove ecosystems.
  • Develop integrated AI-based monitoring frameworks (e.g., AI, sensors) capable of capturing synchronized air, water, and soil quality datasets in mangrove ecosystems and handling missing data to improve NPSP management and resilience.
[27]
  • AI frameworks for river water quality assessment often suffer from insufficient integration of real-time IoT sensor data, limited dataset scalability, and a lack of interdisciplinary approaches, restricting their precision, adaptability, and large-scale implementation for NPSP management.
  • Explore the integration of advanced AI techniques with IoT-based sensor networks to enhance data quality, availability, and real-time monitoring capabilities, while also fostering interdisciplinary research that combines environmental science, ML, and remote sensing.
[29]
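Several rows of Table 6 call for handling missing environmental data and for reporting runtime alongside accuracy in real-time applications. The sketch below shows one way such an evaluation could be set up. It is illustrative only: linear interpolation is used here as a simple baseline rather than one of the advanced imputation methods the table recommends, and the function names `impute_linear` and `timed` are hypothetical helpers, not part of any reviewed framework.

```python
import time


def impute_linear(series):
    """Fill None gaps in a regularly sampled sensor series.

    Interior gaps are linearly interpolated between the nearest observed
    neighbours; leading and trailing gaps copy the nearest observed value.
    """
    known = [i for i, value in enumerate(series) if value is not None]
    if not known:
        raise ValueError("series has no observed values")

    filled = list(series)
    first, last = known[0], known[-1]

    # Leading and trailing gaps: carry the nearest observation outward.
    for i in range(first):
        filled[i] = series[first]
    for i in range(last + 1, len(series)):
        filled[i] = series[last]

    # Interior gaps: straight line between consecutive observed points.
    for lo, hi in zip(known, known[1:]):
        step = (series[hi] - series[lo]) / (hi - lo)
        for i in range(lo + 1, hi):
            filled[i] = series[lo] + step * (i - lo)
    return filled


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Synthetic readings with gaps; the values are made up for illustration.
    readings = [None, 2.0, None, None, 8.0, None]
    filled, seconds = timed(impute_linear, readings)
    print(filled, f"{seconds:.6f} s")
```

Wrapping each preprocessing or prediction step in a timer like this makes the computational cost of a more advanced imputation method directly comparable with the baseline on the same data.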
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Morain, A.; Nedd, R.; Poole, K.; Hawkins, L.; Jones, M.; Washington, B.; Anandhi, A. Artificial Intelligence Application in Nonpoint Source Pollution Management: A Status Update. Sustainability 2025, 17, 5810. https://doi.org/10.3390/su17135810
