3.1. Contribution of the Selected Articles
The analysis of the eight articles reveals a clear convergence in the application of the IoT and artificial intelligence (AI) to water quality monitoring and management. This convergence is evident in all the articles, which explore different approaches and applications to ensure the safety of drinking water and the sustainability of water resources.
Real-time data collection is a central point in all studies. Various types of sensors are used to measure the physicochemical parameters of water, such as pH, temperature, turbidity, conductivity, dissolved oxygen, and specific pollutant levels, such as nitrate and phosphate. IoT plays a crucial role in connecting these sensors, enabling the transmission of data to centralized systems, which facilitates remote monitoring and real-time water quality analysis. This emphasis on real-time data collection underscores the importance of up-to-date information for making quick and effective water management decisions.
AI, mainly through machine learning techniques like artificial neural networks (ANNs), is widely used to analyze the data collected by the sensors. These techniques allow the identification of complex patterns in the data, classification of water potability, and prediction of future trends. Ref. [20] uses regression models to estimate biological indicators, while refs. [17,18] employ machine learning algorithms to analyze sensor data and detect anomalies in water quality. Ref. [5] uses ensemble learning to classify water potability, combining different algorithms to improve prediction accuracy. Ref. [19] explores advanced deep learning techniques, such as Time-Distributed Convolutional Neural Network–Long Short-Term Memory (TD-CNN-LSTM), to further enhance the accuracy of water quality prediction.
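The ensemble idea mentioned for ref. [5] can be illustrated with a minimal hard-voting sketch. This is not the implementation from that study: the three base "classifiers" are hypothetical threshold rules, and the feature names (`ph`, `turbidity_ntu`, `tds_mg_l`) and cut-off values are placeholders chosen only to make the example runnable.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label chosen by the most base classifiers.

    Ties go to the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical base classifiers (assumed thresholds, not regulatory limits).
def rule_ph(sample):
    return "potable" if 6.5 <= sample["ph"] <= 8.5 else "not potable"


def rule_turbidity(sample):
    return "potable" if sample["turbidity_ntu"] <= 5.0 else "not potable"


def rule_tds(sample):
    return "potable" if sample["tds_mg_l"] <= 500.0 else "not potable"


def classify(sample):
    """Collect one vote per base classifier and return the majority label."""
    votes = [rule(sample) for rule in (rule_ph, rule_turbidity, rule_tds)]
    return hard_vote(votes)
```

In practice the base learners would be trained models (for example KNN or CART) rather than fixed rules; only the voting step is the point of the sketch.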
Water quality monitoring systems based on IoT and AI find applications in various areas. Ref. [20] focuses on catchment areas, while [5] develops a portable and low-cost system for rural areas where access to safe drinking water is challenging. Ref. [17] addresses monitoring at water treatment plants, and [21] proposes a system to monitor water consumption in apartments. This diversity of applications demonstrates the flexibility and potential of these technologies to be adapted to different contexts and needs.
Despite significant advances, challenges such as the costs of implementing and maintaining systems, limited connectivity in remote areas, and the need to ensure the quality of collected data still need to be addressed. The interpretability of AI models is also crucial, as it allows decision-makers to understand the factors influencing water quality and take appropriate measures.
This review reveals a dynamic and constantly evolving field of research. The combination of IoT and AI is revolutionizing water quality monitoring and management, offering innovative solutions to ensure safe drinking water, the sustainability of water resources, and public health. Continued research in these areas can bring significant benefits to society and the environment.
3.2. Comparative Synthesis of the Selected Works
A01 [16] provides a broad review that consolidates IoT + AI applications for water quality monitoring and prediction. Its advantages are its scope, with a synthesis across sensing, connectivity, and ANN-based analytics, and the context it provides for the subsequent primary studies.
A02 [5] conducts rural potability classification using KNN/CART/HV-NB with Zigbee and 2G/3G backhaul; parameters include pH, turbidity, TDS, and temperature. Its advantages include low-cost design targeting underserved areas with intermittent connectivity.
A03 [17] addresses real-time drinking-water monitoring with KNN/DT/SVM; parameters span pH, turbidity, temperature, conductivity, and cellular-level metrics. Its advantage is that it provides multi-algorithm comparison for potability assessment in operational settings.
A04 [4] performs cloud-integrated monitoring over Wi-Fi using ESP-class MCU and employs deep learning for inference; parameters include pH, temperature, and dissolved oxygen. Its advantage is its scalable architecture with seamless device–cloud integration.
A05 [18] proposes a portable, field-deployable system (Arduino Uno + LoRa) for agricultural water; parameters include pH, temperature, nitrate, and phosphate, with KNN classification. Its advantage lies in providing a long-range, low-power link enabling large-area coverage without Wi-Fi.
A06 [19] provides a TD-CNN-LSTM time-series model over Arduino-class edge; parameters include pH, temperature, conductivity, and flow. Its advantage involves temporal modeling that captures dynamics beyond static thresholds.
A07 [20] conducts catchment-scale analysis using zero-inflated models (ZIP/ZINB) on microbiological and optical parameters (e.g., coliforms, E. coli, color). Its advantage is its statistically principled handling of excess zeros in environmental counts.
A08 [21] utilizes a Wi-Fi + Arduino Mega system for automated usage/flow monitoring with ANN-based analytics. Its advantage lies in its emphasis on consumption patterns, enabling demand-side management.
Collectively, these studies span household, agricultural, and watershed contexts. Their complementary emphases on connectivity choices, embedded platforms, and applied algorithms reveal design trade-offs across application domains. A comprehensive synthesis of these characteristics is presented in Table 8, while the frequency patterns are further detailed in Table 9, Table 10, Table 11, and Table 12 and in Figure 4, Figure 5, Figure 6, and Figure 7.
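To make the "excess zeros" point for A07 concrete, the following is a minimal sketch of the standard zero-inflated Poisson (ZIP) probability mass function, which mixes a point mass at zero with an ordinary Poisson count. The parameter names `lam` (Poisson rate) and `pi` (probability of a structural zero) are illustrative, and nothing here reproduces the fitted models of the original study; the ZINB variant is not shown.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson model.

    With probability `pi` the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with rate `lam`.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from both the structural part and the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected count under the ZIP model: (1 - pi) * lam."""
    return (1.0 - pi) * lam
```

For example, with `lam = 2.0` and `pi = 0.3`, the probability of a zero count is `0.3 + 0.7 * exp(-2)`, noticeably higher than the plain Poisson value `exp(-2)`.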
3.3. Answers to the Research Questions
This section presents the answers to the defined research questions. Each answer addresses the primary question rather than its sub-questions (e.g., QP01.1), since the primary question encompasses the subjects covered by its sub-questions.
The articles presented a wide variety of parameters extracted from water samples, as shown in Table 9, with many present in only one or two articles. The parameters that stood out the most were temperature, pH, turbidity, and conductivity, which appear in 100%, 87.5%, 50%, and 37.5% of the articles, respectively.
Table 9.
Research Question 01.
| Parameter | Paper |
|---|---|
| pH | A01, A02, A03, A04, A05, A06, A07 |
| Turbidity | A01, A02, A03, A07 |
| Total Dissolved Solids | A02 |
| Temperature | A01, A02, A03, A04, A05, A06, A07, A08 |
| Conductivity | A01, A03, A06 |
| Salinity | A01 |
| Dissolved Oxygen | A01, A04 |
| Fecal Coliforms | A01 |
| Total Coliforms | A01, A07 |
| Escherichia coli | A01, A07 |
| Presumptive or Intestinal Enterococcus | A01, A07 |
| Ammonium | A03 |
| Nitrate | A03, A05 |
| Potassium | A03 |
| Live and Dead Cell Count | A03 |
| Total Cell Count | A03 |
| Percentage of Intact Cells | A03 |
| CO2 | A04 |
| Humidity | A04 |
| Phosphate | A05 |
| Flow | A06, A08 |
| Clostridium perfringens | A07 |
| Color | A07 |
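The usage percentages quoted for the four most frequent parameters follow directly from Table 9: each is the number of articles listing the parameter divided by the eight articles reviewed. A minimal sketch of that arithmetic is given below; the dictionary copies only the four headline rows of the table, and the names `table9` and `usage_percent` are illustrative.

```python
TOTAL_PAPERS = 8  # articles A01 to A08

# The four most frequent rows of Table 9 (parameter -> articles using it).
table9 = {
    "Temperature": ["A01", "A02", "A03", "A04", "A05", "A06", "A07", "A08"],
    "pH": ["A01", "A02", "A03", "A04", "A05", "A06", "A07"],
    "Turbidity": ["A01", "A02", "A03", "A07"],
    "Conductivity": ["A01", "A03", "A06"],
}


def usage_percent(papers, total=TOTAL_PAPERS):
    """Share of the reviewed articles (in percent) that use a parameter."""
    return 100.0 * len(set(papers)) / total


shares = {name: usage_percent(papers) for name, papers in table9.items()}
```

The same function applies unchanged to the rows of Tables 10 to 12.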
The remaining parameters are also important to the monitoring process, but which of them are required depends on the application and the intended use of the analyzed water. A case-by-case analysis is needed to determine whether they improve a given application. Consequently, emphasis is generally placed on the most frequently used parameters, which cover a broader range of applications.
Figure 4 shows how often each parameter appears in the analyzed articles.
Temperature, pH, turbidity, and conductivity recur across studies because they are (i) fundamental physical–chemical indicators that proxy broad classes of contamination and treatment efficacy; (ii) inexpensive to sense with mature, commercially available probes; (iii) relatively straightforward to calibrate and maintain in field conditions; and (iv) widely incorporated by regulatory indices and operational heuristics used by utilities. In addition, temperature modulates solubility and reaction kinetics; pH captures acid–base balance affecting corrosion, disinfection, and bioavailability; turbidity reflects particulate load and is an early-warning proxy for runoff and microbial risk; and conductivity tracks ionic strength and salinity, correlating with TDS and intrusion events [22,23,24,25,26]. These variables offer high signal-to-effort ratios for automated classification, which explains their prevalence in Table 9.
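As an illustration of how two of these variables interact in practice, the sketch below applies a linear temperature compensation to a conductivity reading and then estimates TDS from it. Both coefficients are assumptions rather than values from the reviewed articles: a compensation slope of about 2% per °C and a conversion factor of 0.65 are commonly quoted rules of thumb, and the appropriate values depend on the ionic composition of the water.

```python
def ec_at_25c(ec_us_cm, temp_c, alpha=0.02):
    """Normalize a conductivity reading (µS/cm) to 25 °C.

    Uses the linear model EC25 = EC_T / (1 + alpha * (T - 25)),
    where `alpha` is an assumed compensation slope per °C.
    """
    return ec_us_cm / (1.0 + alpha * (temp_c - 25.0))


def estimate_tds_mg_l(ec25_us_cm, factor=0.65):
    """Rough TDS estimate (mg/L) from temperature-compensated conductivity.

    `factor` is an assumed empirical constant, not a universal value.
    """
    if ec25_us_cm < 0:
        raise ValueError("conductivity must be non-negative")
    return factor * ec25_us_cm
```

A reading taken at 30 °C is scaled down before conversion, because warmer water conducts better at the same ion concentration.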
Figure 4.
Frequency of occurrence of the parameters in the analyzed articles.
Regarding the machine learning algorithms used for data analysis, two methods stand out in terms of frequency, namely K-Nearest Neighbors (KNN) and artificial neural networks (ANNs), used in 37.5% and 25% of the articles, respectively. The other articles presented various other artificial intelligence algorithms (listed in Table 10) and their respective strengths and weaknesses. However, it is noted that the most commonly used techniques (shown in order of use in Figure 5) tend to be more robust and to suit a wider variety of applications and analysis purposes.
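Since KNN is the most frequent algorithm in the selected works, a minimal sketch of how it classifies a water sample may help. The training points, the two features (pH and turbidity), and the labels are hypothetical and are not data from any of the reviewed articles; real deployments would also scale the features so that no single parameter dominates the distance.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled samples: (pH, turbidity in NTU) -> class.
train = [
    ((7.0, 1.0), "potable"),
    ((7.2, 0.8), "potable"),
    ((6.9, 1.5), "potable"),
    ((9.1, 8.0), "not potable"),
    ((5.2, 9.5), "not potable"),
    ((8.9, 7.2), "not potable"),
]
```

Calling `knn_predict(train, (7.1, 1.2))` assigns the new sample the label held by most of its three closest neighbours.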
Once again, it is essential to emphasize that selecting techniques is contingent upon the specific application. While certain methods may be less pervasive, they may offer heightened relevance in particular contexts. For instance, the “Classification and Regression Tree” and “Decision Tree” methods are predominantly applied in scenarios where pre-established classification rules prevail.
Table 10.
Research Question 02.
| Algorithm | Paper |
|---|---|
| Zero-Inflated Poisson Regression (ZIP) | A07 |
| Zero-Inflated Negative Binomial Regression (ZINB) | A07 |
| TD-CNN-LSTM | A06 |
| KNN | A02, A03, A05 |
| Deep Learning (DL) | A04 |
| Decision Tree (DT) | A03 |
| Support Vector Machine (SVM) | A03 |
| ANN | A01, A08 |
| Classification and Regression Tree (CART) | A02 |
| Hard Voting with Naive Bayes (HV-NB) | A02 |
Figure 5.
Frequency of use of algorithms.
Two factors were analyzed among the technologies: the form of communication and the microcontroller used.
Among the types of communication listed in Table 11, Wi-Fi was used in 50% of the analyzed articles, with Zigbee also standing out at 25%. The prevalence of Wi-Fi in the reviewed studies is attributed to the high reliability of the technology, as well as its broad coverage and ease of integration with various services and applications, whether third-party or custom-developed.
Table 11.
Research Question 03—technology.
| Technology | Paper |
|---|---|
| Wi-Fi | A01, A03, A04, A08 |
| Bluetooth | A01 |
| Zigbee | A02, A03 |
| 2G/3G | A02 |
| LoRa | A05 |
Figure 6 shows the frequency of the data transmission technologies used in the analyzed articles.
Figure 6.
Frequency of technologies.
Wi-Fi modules (notably ESP8266/ESP32-class) are inexpensive, integrate TCP/IP stacks, and leverage existing LAN/routers in homes, labs, and plants, reducing deployment friction and recurring costs. Their throughput supports multi-parameter payloads and over-the-air updates, and the ubiquitous development ecosystem (Arduino IDE v2.3.6, PlatformIO Core 6, MicroPython 1.26) lowers time-to-prototype. Where infrastructure is absent or energy budgets are tighter, studies resort to LoRa or cellular (as in A05 and A02), but most deployments in the analyzed works took place in Wi-Fi-covered environments, which explains the dominance of Wi-Fi in Figure 6.
Regarding the microcontroller used, shown in Table 12, there was no clear preference for a single one, and some articles did not detail the processing core used in the application. However, if we consider families of microcontrollers, Arduino boards stand out, being present in 62.5% of the analyzed articles. The use of Arduino development boards is attributed to the ease of their development environment and their broad market presence.
Table 12.
Research Question 03—microcontroller.
| Microcontroller | Paper |
|---|---|
| Arduino Mega | A01, A08 |
| ESP8266 | A01, A04 |
| Arduino Uno | A05, A06 |
| Arduino (not specified) | A02 |
| ESP32 | A01 |
| Raspberry Pi 3 | A01 |
The graph in Figure 7 shows the distribution of the use of microcontrollers mentioned in the articles.
Arduino Uno/Mega and ESP-based boards present a gentle learning curve, abundant libraries for common water quality probes (pH, turbidity, DO) and stable toolchains. They balance I/O richness (ADC, UART/I²C) with low cost, and their communities provide robust examples and shields. For lightweight inference, they support edge preprocessing and simple ML pipelines, while heavier models can be offloaded. This practicality drives their frequency in Table 12. We note that domain constraints (e.g., ultra-low power or industrial EMC) can motivate alternatives (e.g., STM32, TI, or PLCs), which appear less often in the selected set.
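The kind of edge preprocessing referred to above can be as simple as smoothing raw probe readings before they are classified or transmitted. The following is a minimal sketch using a sliding-window median, which suppresses isolated spikes; it is written in ordinary Python for readability rather than as firmware for any specific board, and the window size and sample values are assumptions.

```python
from collections import deque
from statistics import median


def median_smooth(readings, window=3):
    """Return the running median of `readings` over the last `window` values.

    Early outputs use however many readings are available so far.
    """
    buffer = deque(maxlen=window)
    smoothed = []
    for value in readings:
        buffer.append(value)
        smoothed.append(median(buffer))
    return smoothed


# Hypothetical pH readings with one spurious spike at index 2.
raw_ph = [7.0, 7.1, 12.0, 7.0, 7.1]
clean_ph = median_smooth(raw_ph)
```

With a window of three, the single outlier never becomes the median, so the smoothed series stays close to the neighbouring readings.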
Figure 7.
Frequency of microcontrollers.