Review

Advanced Chemometric Techniques for Environmental Pollution Monitoring and Assessment: A Review

1
Department of Chemical Engineering, Jubail Industrial College, Al-Jubail 31961, Saudi Arabia
2
Global Forensic and Justice Center, Department of Chemistry and Biochemistry, Florida International University, Miami, FL 33199, USA
*
Author to whom correspondence should be addressed.
Chemosensors 2025, 13(7), 268; https://doi.org/10.3390/chemosensors13070268
Submission received: 14 May 2025 / Revised: 15 July 2025 / Accepted: 17 July 2025 / Published: 21 July 2025
(This article belongs to the Special Issue Chemometrics Tools Used in Chemical Detection and Analysis)

Abstract

Chemometrics has emerged as a powerful approach for deciphering complex environmental systems, enabling the identification of pollution sources through the integration of faunal community structures with physicochemical parameters and in situ analytical data. Leveraging advanced technologies—including satellite imaging, drone surveillance, sensor networks, and Internet of Things platforms—chemometric methods facilitate real-time and longitudinal monitoring of both pristine and anthropogenically influenced ecosystems. This review provides a critical and comprehensive overview of the foundational principles underpinning chemometric applications in environmental science. Emphasis is placed on identifying pollution sources, their ecological distribution, and potential impacts on human health. Furthermore, the study highlights the role of chemometrics in interpreting multidimensional datasets, thereby enhancing the accuracy and efficiency of modern environmental monitoring systems across diverse geographic and industrial contexts. A comparative analysis of analytical techniques, target analytes, application domains, and the strengths and limitations of selected in situ and remote sensing-based chemometric approaches is also presented.

1. Introduction

Chemometrics represents a robust and systematic methodology that integrates measurement-based analytical chemistry with advanced statistical and mathematical modeling. It plays a pivotal role in analytical chemistry, particularly in optimizing experimental conditions and extracting meaningful chemical information from complex datasets (Figure 1). By applying both qualitative and quantitative analyses, chemometric tools facilitate the exploration of variable relationships using univariate and multivariate techniques. These methods are instrumental in mining data derived from environmental monitoring processes, employing both supervised and unsupervised pattern recognition strategies.
Supervised learning methods rely on predefined output variables, enabling targeted assessments such as sample classification and quality prediction based on known parameters. This approach enhances the precision of environmental monitoring by reducing data dimensionality and improving interpretability. In contrast, unsupervised methods autonomously identify intrinsic data structures by grouping variables or samples based on similarity, uncovering latent patterns without prior knowledge. Principal component analysis (PCA), factor analysis (FA), and cluster analysis (CA) are among the most widely used unsupervised methods and are particularly suited to multidimensional environmental datasets [1,2,3,4,5,6].
Traditional statistical techniques often fall short in handling the complexity and variability inherent to environmental systems. In contrast, multivariate statistical and geostatistical approaches offer more comprehensive insights. For example, CA has been applied to evaluate spatial correlations among sampling points while PCA and multivariate regression models have been employed at regional scales to discern environmental attributes and pollutant distributions [7,8,9,10].
Figure 2 illustrates the growing significance of chemometric techniques, as evidenced by a marked increase in publications from 2000 to 2024 using the keywords “chemometrics,” “environmental analysis,” and “pollution assessment.” The literature reveals chemometrics as a highly effective tool for identifying pollution sources by integrating faunal structure dynamics, physicochemical profiling, and in situ toxicity analysis. These approaches have proven particularly useful in analyzing large and complex datasets derived from soil and water systems, including assessments of heavy metal contamination [11,12,13].
Unbiased chemometric techniques, such as PCA, hierarchical agglomerative cluster analysis (HACA), discriminant analysis (DA), and multiple linear regression (MLR), have been extensively used to evaluate the quality of air, water, and soil. HACA typically delineates pollution sources into three clusters: low, moderate, and high intensity. DA, applied via forward and backward stepwise procedures, has helped isolate key variables responsible for environmental degradation. PCA and FA have further confirmed major pollution contributors, including mobile sources (e.g., vehicular emissions), stationary sources (e.g., fossil fuel combustion and industrial operations), cross-boundary pollution (e.g., transnational wildfires and volcanic emissions), and agricultural activities [14,15,16,17,18,19,20,21,22,23]. Mobile air pollution is predominantly generated by transportation systems such as cars, buses, and trucks, whereas stationary sources arise from industrial activities, power plants, open burning, and food processing facilities [24,25]. Cross-boundary pollution, meanwhile, encompasses airborne contaminants originating from external events such as wildfires or volcanic activity in neighboring regions.
Modern analytical techniques now generate vast datasets related to environmental pollutants in water, soil, and air. However, sustaining ecosystem health requires actionable insights derived from these data. Ecological resilience is increasingly threatened by anthropogenic pressures—deforestation, biodiversity loss, climate change, and unregulated development—which have already led to the degradation of critical habitats such as coral reefs. Nearly 19% of the world’s coral reefs have been lost, with an additional 35% at risk due to ocean acidification, rising temperatures, pollution, and illegal fishing. This decline jeopardizes the livelihoods of over a billion people, nearly 13% of the global population, who live within 100 km of a reef and depend on its biodiversity for sustenance and economic security [26,27,28,29].
Despite the widespread application of chemometric methods, a comprehensive understanding of their core principles and best practices remains limited in existing literature. This review aims to address this gap by providing a critical evaluation of chemometric tools best suited for environmental applications. It also explores potential pollution sources and their implications for human health. Ultimately, this study advocates for the integration of chemometric frameworks into modern environmental monitoring systems, enabling more responsive, accurate, and scalable assessments in both scientific and industrial domains.

2. Environmental Monitoring and Assessment

Environmental monitoring is fundamental to assessing and managing the quality of natural ecosystems. Chemometric techniques play a crucial role in this domain by enabling the efficient handling of large, complex datasets, uncovering hidden patterns, and facilitating accurate identification of pollution sources. These data-driven approaches are indispensable for interpreting the growing volume of information generated by modern environmental monitoring systems.
Monitoring involves systematic sampling and analysis of environmental media—such as air, water, soil, or integrated ecosystems—to detect changes in contaminant levels over time and space. Effective monitoring requires well-defined sampling strategies and robust analytical protocols that account for both internal variability and external environmental factors. As technologies evolve, the scale and complexity of data increase, necessitating the adoption of advanced statistical and computational tools for data analysis.
At its core, environmental monitoring includes measuring key analytes, identifying pollution sources, and communicating findings. However, comprehensive monitoring is often resource-intensive, involving time-consuming and labor-intensive sampling protocols. These challenges underscore the need for streamlined and automated analytical methods that can deliver reliable insights in real-time or near-real-time settings [30].

2.1. In Situ Monitoring

In situ chemometrics refers to the application of statistical and mathematical models directly at the point of environmental data acquisition. This real-time or on-site analysis approach minimizes sample handling, accelerates decision-making, and improves the accuracy of environmental assessments. Recent advances in in situ monitoring technologies have significantly expanded the capabilities of environmental surveillance, offering fast, cost-effective, and scalable solutions across diverse ecological systems [31,32,33,34,35]. One innovative technique employs planar microwave sensors for continuous in situ monitoring of water quality in freshwater systems, particularly in mining-impacted areas. These sensors operate by emitting microwave signals and analyzing shifts in resonant frequencies within the GHz range. Changes in resonance indicate alterations in water composition, which correlate with the presence of trace metal pollutants. Field deployments at four contaminated sites in the UK confirmed the system’s sensitivity and responsiveness to metals such as lead, cadmium, arsenic, and mercury. This technology provides a rapid, low-cost, and reliable method for continuous monitoring of freshwater pollution, enabling early detection and timely environmental interventions [32].
One notable application involved the development of an in situ Raman spectroscopy method for monitoring salt disproportionation in pharmaceuticals. This approach enabled the real-time quantification of polymorphic forms, including two metastable and one stable free base, arising from the disproportionation of two salt types. Unlike earlier studies that utilized pre-characterized polymorph mixtures, this method analyzed evolving polymorph compositions in situ, providing deeper insight into polymorphic transitions during the disproportionation process [31,35,36,37,38]. Bojko et al. [35] investigated the application of in situ SPME on Mediterranean marine sponges for untargeted exometabolomic profiling. This technique enabled the capture of chemical signatures directly from the sponge–environment interface, revealing rich chemical information that reflects both biological processes and environmental stressors. The work highlighted the effectiveness of SPME in minimally invasive, real-time environmental surveillance.
Parallel progress has occurred in pharmaceutical sciences, where in situ spectroscopic techniques are increasingly employed to investigate solid-state transformations, particularly salt disproportionation during drug dissolution and storage. Nie et al. [36] utilized Raman spectroscopy and mapping to study salt-to-free-base conversions within tablet matrices. Their spatially resolved data revealed significant heterogeneity in conversion patterns, underscoring the importance of Raman tools in stability assessment and formulation optimization. Previous research has also integrated NIR imaging with Raman mapping to investigate the disproportionation of drug HCl salts during dissolution. The combined approach enabled real-time visualization of phase changes, providing a deeper understanding of how these transformations affect drug release and solubility [37]. Ewing et al. [38] employed ATR-FTIR imaging alongside Raman mapping to study salt disproportionation under hydrated conditions. Their results emphasized the spatial and temporal evolution of salt forms during drug delivery, providing crucial insights into formulation performance and release kinetics. In the same context, Nie et al. [39] compared backscattering and transmission Raman spectroscopy methods for quantifying solid-state form conversions in pharmaceutical tablets. The authors found transmission Raman spectroscopy to be more accurate for analyzing internal structures because of its greater penetration depth and reduced surface bias. This work underlines the importance of selecting suitable Raman modalities for in situ pharmaceutical analyses.
A summary of recent in situ chemometric approaches—including their targeted analytes, analytical methods, environmental applications, advantages, and limitations—is provided in Table 1.

2.2. Remote Monitoring

Remote monitoring refers to the use of advanced technologies to collect, analyze, and interpret data from distant locations without the need for physical presence. This approach is particularly valuable for observing expansive, difficult-to-access, or hazardous environmental systems. It has found widespread applications in diverse sectors, including environmental surveillance, industrial process control, healthcare, agriculture, and security. In the context of environmental monitoring, remote technologies enable continuous assessment of parameters such as air and water quality, deforestation, and indicators of climate change. These systems utilize a suite of tools, including satellites, UAVs, sensor networks, and IoT devices, to facilitate real-time and long-term data acquisition across terrestrial and aquatic environments [40]. Table 2 provides an overview of key remote sensing technologies and their typical environmental applications [41,42,43,44,45,46,47].
Satellite-based remote sensing plays a vital role in environmental monitoring by providing extensive spatial and temporal coverage for assessing variables such as air quality, water pollution, land use changes, and vegetation dynamics. The integration of satellite retrievals with chemometric techniques enables advanced, scalable, and cost-effective environmental monitoring solutions. These tools are particularly valuable in regions with limited ground infrastructure and have enormous potential in addressing global climate and sustainability challenges. The data, however, are often high-dimensional and complex. Chemometric techniques, which encompass multivariate statistical methods and machine learning tools, are essential for analyzing these complex datasets, identifying patterns, and making precise predictions [48]. Satellite retrieval refers to the process of deriving geophysical or environmental parameters (e.g., aerosol optical depth (AOD), chlorophyll concentration, and land surface temperature) from measurements made by satellite sensors. Popular satellite sensors include MODIS, TROPOMI on S5P, the Landsat series, and VIIRS. These sensors provide raw radiometric or spectral data, which are processed through retrieval algorithms to extract quantitative environmental indicators. To interpret and model these retrievals, chemometric techniques are widely used. ATBM is highly effective in integrating geometric and radiometric information for tensor power iteration and image detection. ANN helps to reduce data dimensionality while preserving variance, and it is often used to improve computational efficiency in power iterations [49]. Geographically weighted regression (GWR) is employed to estimate surface-level pollutant concentrations, such as PM2.5, from satellite-derived AOD. The GWR predictors also included simulated aerosol composition and land use information [50]. 
GOSAT observations have become widely used for detecting CH4, and their analysis has been enhanced by the XGBoost algorithm. The model yielded an RMSE of 0.0251 µg/L, a cross-validated R2 of 0.79, and an MAPE of 0.88%. It is also capable of capturing complex nonlinear relationships between satellite column concentrations and their influencing factors [51]. Supervised classification methods such as SVM and RF are applied for vegetation health monitoring, land use categorization, and water quality assessment [52]. For example, Van Donkelaar et al. [50] developed a hybrid geophysical-statistical model combining satellite AOD data with chemical transport models and ground-based observations to generate global estimates of PM2.5. Gholizadeh et al. [48] demonstrated the application of chemometrics in analyzing water quality by using PCA and ANN to interpret MODIS data for parameters such as chlorophyll-a and turbidity. The satellite types, along with chemometric methods for environmental monitoring, are presented in Table 3. Despite their effectiveness, these methods encounter challenges such as atmospheric interferences, sensor calibration errors, and the need for adequate ground-truth data for validation. The integration of chemometric techniques with machine learning and data fusion approaches (e.g., combining satellite, UAV, and ground sensor data) is a growing trend to improve accuracy and spatio-temporal coverage.
A wide range of sensor types, as follows, is employed in remote environmental monitoring, often in conjunction with chemometric techniques to enhance data interpretation and predictive modeling:
Optical sensors: Utilizing UV–Vis, NIR, and Raman spectroscopy, these sensors are employed for both qualitative and quantitative analysis of chemical constituents in environmental samples, with a primary focus on monitoring water quality. They are well suited to estimating physicochemical variables such as dissolved organic carbon, nitrate, and turbidity, which also improves the accuracy of absorbance measurements [53].
Electrochemical sensors: Commonly applied for detecting heavy metals, pesticides, and organic pollutants in soil and water through redox and ion-selective mechanisms. Such sensors have been used effectively to study the cytotoxicity of heavy metals such as Cd2+, Cr6+, Cu2+, Pb2+, and Zn2+ with eukaryotic cells, an approach that is highly sensitive and cost-effective [54].
Biosensors: Engineered for high specificity, these sensors target biological contaminants, including pathogens, toxins, and microbial pollutants. They also offer numerous benefits over conventional analytical techniques, including portability, sensitivity, ease of use, affordability, selectivity, and the capability to measure the toxicity of contaminated water [55].
Gas sensors: Utilized for ambient air monitoring, particularly for gaseous pollutants like NO2, O3, and VOCs. Nanostructure-based gas sensors can enhance detection performance by using composites of nanoparticles and carbon materials. SnO2 is used on a large scale as the active material in commercial sensors for detecting various gases. Researchers nevertheless continue to explore new sensing materials that offer higher selectivity and sensitivity at low operating temperatures, a faster response, better recovery, and better reproducibility than commercial sensors, while maintaining the flexibility and toughness required for environmental gas sensors [56].
Electronic noses and tongues: These sensor arrays mimic human olfactory and gustatory systems to detect and differentiate complex chemical mixtures in air and water environments [57].
The integration of these sensor platforms with chemometric algorithms significantly enhances the ability to interpret large, multidimensional datasets, enabling robust assessments of environmental quality and more accurate identification of pollution sources. As sensor sensitivity and communication technologies advance, remote chemometric monitoring is expected to play an increasingly pivotal role in real-time environmental decision-making and early-warning systems.
Table 4 summarizes key remote monitoring strategies involving chemometrics, outlining their focus areas, analytical methodologies, target analytes, advantages, and limitations as reported in the recent literature.

3. Types of Chemometric Techniques

Chemometric techniques encompass a suite of mathematical and statistical methods designed to extract relevant insights from complex chemical datasets. These techniques are particularly valuable in fields such as spectroscopy, chromatography, environmental science, and food analysis. Chemometric approaches can be broadly categorized into unsupervised learning, supervised learning, regression analysis, correlation techniques, and time series analysis, depending on the nature and purpose of the study. With large datasets, the data are typically divided into training, validation, and test sets. When datasets contain fewer samples, resampling algorithms such as cross-validation (CV) are favored for splitting the data, on the assumption that the data used by these algorithms are also representative of future data. In k-fold CV, the data are divided into k folds, where one fold is used as the validation set and the remaining k − 1 folds are used to build the model. This procedure is repeated k times so that each fold serves once as the validation set, and the predictive performance is estimated as the average over the k validation folds (Figure 3).
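As a minimal sketch of the k-fold procedure described above, the following Python example splits a small calibration set into folds, fits a straight line on k − 1 folds, and averages the validation RMSE over the held-out folds. The absorbance and nitrate values, the function names, and the use of an unshuffled contiguous split are illustrative assumptions and are not taken from any of the studies reviewed here.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more sample
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def cross_validated_rmse(x, y, k):
    """Average validation RMSE over k folds; each fold is held out exactly once."""
    fold_rmse = []
    for fold in k_fold_indices(len(x), k):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        sq_err = [(y[i] - (a + b * x[i])) ** 2 for i in fold]
        fold_rmse.append(mean(sq_err) ** 0.5)
    return mean(fold_rmse)


# Hypothetical calibration data: absorbance (x) vs. nitrate concentration in mg/L (y).
absorbance = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00]
nitrate = [1.1, 2.0, 2.9, 4.2, 5.0, 5.9, 7.1, 8.0, 9.1, 9.9]
cv_rmse = cross_validated_rmse(absorbance, nitrate, k=5)
print(f"5-fold cross-validated RMSE: {cv_rmse:.3f} mg/L")
```

In practice the samples would usually be shuffled or stratified before splitting, and a library routine such as scikit-learn's `KFold` would typically replace the hand-written index logic.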

3.1. Unsupervised Learning Methods

Unsupervised learning methods are primarily employed for pattern recognition, exploratory data analysis, and dimensionality reduction, especially when class labels are unknown. These methods play a central role in extracting hidden structures and trends from complex, high-dimensional datasets in chemometrics.

3.1.1. Cluster Analysis

CA is a fundamental unsupervised chemometric technique used to identify natural groupings or patterns in multivariate datasets. In chemical and environmental sciences, CA is frequently applied to spectral, chromatographic, and environmental data to group samples with similar chemical profiles without requiring predefined class labels.
For instance, El-Rawy et al. [3] employed an integrated chemometric approach that combined PCA and HCA to assess groundwater quality using 180 samples. The SAR classification indicated that 91.67% of the samples were suitable for irrigation, with no adverse effects related to sodium in the groundwater samples, as indicated by an R2 value greater than 0.8. The study identified key hydrogeochemical processes, such as evaporation, seawater intrusion, and ion exchange, and grouped sampling sites based on water quality similarities, producing a spatial distribution map that highlights areas unsuitable for irrigation.
In another study, Adejuwon et al. [68] utilized PCA, HCA, and FA to evaluate surface water quality in a lake system. Chemometric analysis enabled the identification of pollution sources, primarily municipal runoff and domestic wastewater, by clustering sampling locations and interpreting factor loadings. The study indicated that higher values of TSS, DO, etc., were observed in the summer season compared to other seasons and showed a strong correlation among them, notably between phosphate and total hardness (r = 0.978) in the wet season and between pH and temperature (r = 0.995) in the dry season.
Similarly, Curcic et al. [69] applied PCA and CA to monitor pesticide residues in surface waters using HPLC. The study effectively tracked the spatial and temporal contamination patterns of organic micropollutants across two river systems with a primary focus on the presence of dimethachlor and its metabolites, such as dimethachlor ethanesulfonic acid and dimethachlor oxalamic acid, in surface water. The accuracy of the method was determined using RMSE and R2 values, which ranged from 0.569 to 1.242 and from 0.997 to 0.998, respectively.
An innovative approach by Tinnevelt et al. [60] analyzed high-resolution flow cytometry data through a novel chemometric framework, DAMACY, which integrates PCA and discriminant analysis to track phytoplankton community shifts in real time and to provide early warning indicators for aquatic ecosystem disturbances. The investigation predicted all the nutrient variables, including NO2 and NO3, with R2 values ranging from 0.49 to 0.73, indicating that these variables have a significant influence on phytoplankton cells.
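The agglomerative clustering that underlies HCA in the studies above can be sketched in a few lines of Python. This is a dependency-free illustration under stated assumptions: the site coordinates are hypothetical standardized values for two water quality variables, average linkage with Euclidean distance is one common choice among several, and the function names are invented for this example.

```python
from itertools import combinations
from math import dist


def average_linkage(cluster_a, cluster_b, points):
    """Mean Euclidean distance over all pairs drawn from the two clusters."""
    total = sum(dist(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clusters(points, n_clusters):
    """Start with one cluster per sample and merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Hypothetical sampling sites described by two standardized variables
# (for example conductivity and nitrate); values are illustrative only.
sites = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),   # low-pollution sites
    (1.00, 1.10), (1.10, 0.90),                 # moderate-pollution sites
    (3.00, 3.20), (3.10, 2.90),                 # high-pollution sites
]
groups = sorted(agglomerative_clusters(sites, n_clusters=3))
print(groups)
```

With these inputs the three groups correspond to the low, moderate, and high clusters. For real datasets, `scipy.cluster.hierarchy.linkage` offers the same idea with a dendrogram and additional linkage rules such as Ward's method.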

3.1.2. Artificial Neural Networks

ANNs are powerful machine learning models that are increasingly adopted in chemometrics to address nonlinear, multivariate problems that traditional linear models (e.g., PCA and PLS) may not adequately resolve. ANNs are particularly effective in uncovering hidden structures, reducing data dimensionality, and facilitating predictive modeling in environmental systems (Figure 4).
Nurhayati et al. [71] demonstrated the application of ANN for real-time monitoring of DOC in wastewater using fluorescence and UV absorbance data. The model achieved high predictive accuracy (R2 = 0.9079; RMSE = 0.2989 mg/L), positioning ANN as a viable alternative to standard DOC measurement protocols.
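The figures of merit quoted for such models can be computed as shown in the short Python sketch below, which covers RMSE and R2 together with the MAPE reported elsewhere in this review. The observed and predicted values are hypothetical and serve only to illustrate the arithmetic; the R2 shown is the coefficient of determination (1 minus the ratio of residual to total sum of squares), which may differ from a squared correlation coefficient reported by some studies.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error, in the units of the measured quantity."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical reference and model-predicted concentrations (mg/L).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(rmse(observed, predicted), r_squared(observed, predicted), mape(observed, predicted))
```

Because RMSE carries the units of the analyte while R2 and MAPE are dimensionless, the three metrics are best read together when comparing models across different analytes.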
ANNs have also been applied to UV–Vis spectroscopic data to estimate nutrient concentrations (such as nitrogen and phosphorus) as well as COD and suspended solids in eutrophic rivers, outperforming traditional regression methods [72]. Further, ANN models have been employed to predict trihalomethane formation in drinking water networks using diverse water quality parameters, achieving high prediction accuracy and enabling proactive water treatment management [73]. Additional studies (Figure 5) have highlighted the effectiveness of ANN in air quality forecasting [74] and in predicting the degradation behavior of pharmaceutical contaminants in aquatic systems, with strong experimental validation [75,76]. AI and ML techniques are increasingly used to simulate atmospheric properties related to environmental issues such as pollution, which is dominated by primary pollutants, including gases and PM. Such simulations require atmospheric data that combine pollutant and meteorological variables; in one study, the inputs were PM10, TT, TT−1, TMean, RHT, RHT−1, RHMean, WST, WST−1, WSMean, PRT, PR48H, PR72H, and the consecutive day on which the data were recorded. The model successfully simulated PM10 concentrations between 10 μg/m3 and 50 μg/m3, which is satisfactory given that these concentrations comprise most of the recorded data in the area [75]. QSAR models based on multiple linear regression play a significant role in studies of pollutant removal from water. The model utilizes MDs to describe the contaminant’s physicochemical properties, including energy difference, electron affinity, halogen atoms, and ring atoms [76]. The study revealed that pH, DOC, and alkalinity could influence the QSAR model used to determine the water quality.
A comparative summary of ANN-based models applied in environmental monitoring is provided in Table 5, highlighting their focus, input variables, performance metrics, and application areas.

3.2. Supervised Learning Methods

Supervised learning methods are employed when both input variables and corresponding output classes (labels) are known. These techniques are instrumental in classification tasks, predictive modeling, and pattern recognition, especially when prior knowledge of group membership is available. One of the most widely used supervised chemometric tools in environmental analysis is DA.

Discriminant Analysis

DA is a powerful multivariate technique widely used in environmental monitoring for classifying and differentiating among predefined groups—such as pollution sources, sampling locations, or ecological conditions—based on observed variables, including chemical concentrations or spectral features. DA constructs linear or nonlinear functions that maximize the separation between classes, facilitating both classification and interpretation of key discriminating variables.
Common forms of DA include the following:
Linear Discriminant Analysis: Assumes homogeneity in class covariance matrices; optimal for linearly separable data with equal variance.
Quadratic Discriminant Analysis: This method allows each class to have a distinct covariance structure, providing flexibility when class variances differ.
Partial Least Squares Discriminant Analysis: Integrates dimensionality reduction with classification, particularly useful in handling multicollinearity and noisy datasets.
Compared to more complex nonlinear models such as ANNs, DA is computationally efficient, provides interpretable decision boundaries, and yields insights into variable importance. It is often used in combination with other chemometric tools to enhance classification accuracy and interpretability:
PCA–DA: PCA is used to reduce dimensionality before applying DA to classify the resulting components.
PLS–DA: Combines PLS regression and DA to model the relationship between predictor variables and categorical outcomes.
CA with DA: Clustering methods may first identify natural groupings in unlabeled data, which are then validated or refined using DA.
The integration of DA with these complementary techniques enhances its utility in complex environmental monitoring scenarios where spatial, temporal, and multivariate data are prevalent. A comparative summary of DA applications in environmental monitoring, including study focus, classification performance, and target analytes, is presented in Table 6.
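To make the linear form of DA concrete, the sketch below implements a two-class, two-variable linear discriminant in plain Python. It assumes a shared (pooled) covariance matrix and equal class priors, which is the LDA setting described above; the upstream and downstream measurements, the class labels "A" and "B", and all function names are hypothetical.

```python
def mean_vector(rows):
    """Column means of a list of (x1, x2) samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(class_a, class_b):
    """Pooled within-class 2x2 covariance (the equal-covariance LDA assumption)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (class_a, class_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(class_a) + len(class_b) - 2
    return [[v / dof for v in row] for row in s]


def fit_lda(class_a, class_b):
    """Return (w, threshold); a sample x is assigned to class B when w.x > threshold."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    (a, b), (c, d) = pooled_covariance(class_a, class_b)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]      # inverse of the 2x2 matrix
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    midpoint = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]  # equal priors assumed
    threshold = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, threshold


def predict(x, w, threshold):
    """Classify one sample by its projection onto the discriminant direction."""
    return "B" if w[0] * x[0] + w[1] * x[1] > threshold else "A"


# Hypothetical two-variable measurements from upstream (A) and downstream (B) sites.
upstream = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.8, 1.9)]
downstream = [(3.0, 4.1), (3.5, 3.8), (3.2, 4.4), (2.9, 3.9)]
w, threshold = fit_lda(upstream, downstream)
print(predict((1.1, 2.0), w, threshold), predict((3.1, 4.0), w, threshold))
```

The relative magnitudes of the entries of `w` indicate which variable contributes most to the separation, which is the interpretability advantage noted above. For more variables or classes, scikit-learn's `LinearDiscriminantAnalysis` would be a more practical choice.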

3.3. Factorial Methods

Factorial methods are a family of multivariate statistical techniques used to reduce data dimensionality, identify latent structures, and highlight patterns in complex environmental datasets. Several factorial methods are crucial, including exploratory factor analysis, confirmatory factor analysis, correspondence analysis, multiple correspondence analysis, and PCA. Among them, PCA is widely used in environmental chemometrics to simplify complex datasets, identify fundamental patterns, and enhance monitoring techniques. In chemometrics, factorial methods are valuable for analyzing data from water quality assessments, groundwater contamination analyses, soil and agricultural monitoring, and air quality monitoring [78,80,81,82,83,84]. A comparative analysis of factorial method applications in environmental monitoring is summarized in Table 7.

Principal Component Analysis

PCA is a foundational unsupervised chemometric technique widely employed for data exploration, dimensionality reduction, and pattern recognition. It transforms a dataset containing potentially correlated variables into a new set of orthogonal variables known as principal components. These components are ranked by the amount of variance they capture, with the first few typically preserving most of the dataset’s variability. This transformation enables the simplification of complex, multivariate data while retaining essential structural information [85].
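The transformation described above can be illustrated with a dependency-free Python sketch that mean-centers a small data matrix, forms its covariance matrix, and extracts the first principal component by power iteration. The data values and function names are hypothetical, power iteration is only one of several ways to obtain the leading eigenvector, and the variables are left unscaled here, whereas real applications often autoscale them first.

```python
from math import sqrt


def center(rows):
    """Subtract the column mean from every variable."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (n - 1 in the denominator)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_principal_component(cov, n_iter=200):
    """Power iteration: returns (unit loading vector, eigenvalue) of the largest eigenvalue."""
    p = len(cov)
    v = [1.0 / sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue


# Hypothetical samples with three measured variables; the first two are strongly
# correlated, the third is nearly constant.
samples = [
    [2.0, 4.1, 1.0],
    [3.0, 6.0, 1.2],
    [4.0, 7.9, 0.9],
    [5.0, 10.2, 1.1],
    [6.0, 11.8, 1.0],
]
cov = covariance(center(samples))
loadings, eigenvalue = first_principal_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(f"PC1 loadings: {loadings}, explained variance ratio: {explained:.4f}")
```

Because the first two variables move together, a single component captures nearly all of the variance in this toy dataset. A full decomposition with all components is usually obtained from a singular value decomposition, for example through scikit-learn's `PCA`.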
PCA plays a critical role in chemometrics by uncovering latent patterns, identifying relationships among variables, and reducing data complexity, thereby enhancing interpretation and decision-making. For example, Younes et al. [86] applied PCA to evaluate the efficiency of different aerogel materials in ion removal processes. The analysis helped identify the most influential factors affecting performance, namely pH, porosity, Brunauer–Emmett–Teller surface area, and density, supporting optimal material selection for water treatment applications. Numerous recent studies highlight the versatility of PCA across various scientific domains [87,88,89,90,91]. In one study, PCA was used to estimate the specific area size and variation in environmental resonant measurements, including carrying capacity related to resource amount, ecological quality, social economy, and infrastructure composition. It is a beneficial tool for appraising environmental carrying capacity (ECC) and also reveals the spatial distribution of ECC indices on a large scale. The study shows that the ecological quality carrying capacity is the lowest, followed by the infrastructure composition carrying capacity. The socio-economic carrying capacity, and ultimately the amount of resources, depends on advancing environmental quality and improving resource utilization efficiency in a region [87]. A spatial-distribution PCA model has also performed well, effectively linking the spatial features of soil pollution with data that show a linear relationship. Using eigenvector-based PCA, it evaluates soil pollution in three dimensions, making it easier to identify the potential sources of heavy metals. Agriculture was the major source of soil pollution (65.5%), contributing to the deposition of numerous heavy metals, followed by traffic (17.9%) and natural sources (11.1%) [88]. 
However, multicollinearity between multiple variables can often delay interpretation, which can significantly impact environmental health research. Nevertheless, Supervised Principal Component Analysis can easily solve this issue [89]. The procedure did not account for any potential nonlinear relationship between exposure and response, which may decrease the residual confounding risk by including nonlinear and interaction terms. The SPCA is efficiently adopting the computational and interpretable system to explain collinearity and exposure misclassification, which is inevitable. Predictably, if there were a factual correlation with the health outcome, the consequence evaluations could be biased towards being insignificant due to standard random errors. The PCA analysis is also effective in determining water quality, which is highly significant for human health. PCA modeling revealed that the primary source of anthropogenic pollution is attributed to the five main PCs. In open dumping sites, groundwater is predominantly contaminated with microbial and heavy metals [90]. The PCA and RF algorithm, combined with LCA, is a powerful tool for measuring the environmental effects of numerous solid waste disposal technologies and guiding ecological protection efforts. Multiple methods of LCA–PCA–RF modeling identified the leading cause of environmental loads as human toxicity potential, resulting from the accumulation of large amounts of dioxins and PM2.5 formed during the carbothermal reduction and oxygen-enhanced side-gusting of inorganic waste disposal [91]. Therefore, encourage the collaborative consumption of solid waste in all regions, both indoors and outdoors, and reinforce environmental management to achieve zero pollution.
Beyond its core applications, PCA offers several advantages that significantly enhance chemometric analyses:
Dimensionality reduction: PCA condenses large datasets into a smaller number of variables (principal components) while preserving the majority of the variance. This simplification improves data visualization and interpretability, particularly for high-dimensional spectroscopic or chromatographic data [92].
Noise reduction: By focusing on components that explain the most variance, PCA helps filter out experimental noise and emphasizes meaningful patterns, improving the reliability of downstream analyses [93].
Outlier detection: PCA is effective for identifying anomalies within datasets, as deviations from the main data structure become more visible when projected onto the principal component space. This capability supports quality control and error detection in analytical workflows [94].
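The noise-reduction and outlier-detection points in the list above can be illustrated together. The sketch below is only an assumption-laden example: it assumes NumPy, uses a hypothetical helper `pca_reconstruct`, works on synthetic data with one deliberately planted anomalous sample, and uses the squared reconstruction residual (often called the Q residual) as the outlier score. Published workflows may use other diagnostics, such as Hotelling's T2.

```python
import numpy as np

def pca_reconstruct(X, n_components):
    """Project centred data onto the first principal components and map it back."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T            # loadings, shape (variables, components)
    scores = Xc @ P
    return scores @ P.T + mean, scores

# Synthetic data (assumption): 40 samples, 5 variables, one hidden factor plus noise.
rng = np.random.default_rng(1)
t = rng.normal(size=(40, 1))
direction = np.array([[1.0, 0.5, -0.7, 0.2, 0.9]])
X = t @ direction + 0.05 * rng.normal(size=(40, 5))
X[7] += np.array([0.0, 3.0, 0.0, -3.0, 0.0])   # plant one anomalous sample

# X_hat keeps only the dominant component, so it acts as a denoised copy of X.
X_hat, scores = pca_reconstruct(X, n_components=1)

# Q residual: squared distance between each sample and its reconstruction.
q_residual = np.sum((X - X_hat) ** 2, axis=1)
suspect = int(np.argmax(q_residual))
print("sample with largest Q residual:", suspect)
```

Samples that follow the main data structure are reconstructed almost exactly, whereas the planted sample leaves a large residual, which is what makes it stand out in the principal component space.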
PCA’s adaptability enables its broad application in various chemometric domains, including food authentication, environmental monitoring, and pharmaceutical analysis. In environmental chemistry, PCA helps interpret large datasets to assess pollution levels and identify sources of contamination. In food science, PCA is used to detect adulteration, verify geographical origin, and monitor product quality—for example, distinguishing between apple cultivars, monitoring beer aging, and identifying adulteration in Camellia oils [95]. In pharmaceutical applications, PCA, combined with techniques such as near-infrared spectroscopy and hyperspectral imaging, has enabled the detection of counterfeit drugs and the identification of substandard formulations, contributing to enhanced product integrity and quality assurance [96].
In chemometrics, validation refers to assessing the quality of a predictive model, which requires diagnostic metrics based on explained variance or residuals. The choice among the many available techniques can considerably affect the quality of interference correction and analysis in numerous applications (Figure 6). The methods vary in their ability to handle different types of interference and data complexity, making technique selection essential for obtaining accurate results. PCA is efficient at dimensionality reduction and at identifying the primary sources of variation, which helps to isolate interference effects, whereas PLS regression is better suited to handling overlapping spectral features and to quantitative analysis in the presence of interference. Such interference is common in spectroscopic and chromatographic analysis because samples are often complex mixtures that yield intricate data. However, the diversity of available methods allows specialists to select techniques tailored to the complexity of the data, ensuring that interference is examined efficiently. No single chemometric technique, whether PCA, PLS, DA, RF, or SVM, is always suitable on its own for monitoring and assessing environmental scenarios. Inappropriate use of a technique can lead to misleading interpretations; this issue can be addressed by applying a diverse set of techniques, allowing analysts to cross-validate their outcomes and confirm the robustness of their inferences. With appropriate chemometric methods, the accuracy and reliability of analytical results can be enhanced by properly handling and correcting the various types of interference.
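One residual-based validation metric of the kind mentioned above is the root mean square error of cross-validation (RMSECV). The sketch below is a deliberately simple illustration using only the Python standard library: it assumes a one-predictor least-squares calibration and leave-one-out splitting, and the function names and calibration values are hypothetical. Multivariate models such as PLS would be validated in the same spirit with a different fitting step.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmsecv_leave_one_out(x, y):
    """Root mean square error of leave-one-out cross-validation."""
    squared_errors = []
    for i in range(len(x)):
        # Refit the model without sample i, then predict sample i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        squared_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical calibration data: instrument signal against reference concentration.
signal = [0.10, 0.21, 0.29, 0.42, 0.50, 0.61, 0.69, 0.80]
concentration = [1.0, 2.1, 2.9, 4.1, 5.2, 6.0, 7.1, 7.9]

rmsecv = rmsecv_leave_one_out(signal, concentration)
print("RMSECV:", round(rmsecv, 3))
```

Because every prediction is made for a sample the model has not seen, RMSECV gives a more honest picture of predictive quality than the residuals of the fitted calibration alone.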

4. Environmental Application of Chemometric Techniques

4.1. Air Quality

Air pollution remains one of the most pressing environmental challenges globally, arising from the contamination of the atmosphere by physical, chemical, and biological agents in both indoor and outdoor environments. Poor indoor air quality is particularly hazardous, often linked to conditions such as Sick Building Syndrome and other occupational health risks. Beyond human health, air pollution negatively impacts vegetation, livestock, infrastructure, and ecosystems [97,98].
Atmospheric pollution occurs when concentrations of airborne contaminants surpass ambient regulatory thresholds. These pollutants, which vary in type and concentration over time, are typically more pronounced in densely populated urban centers and industrial regions [28]. Due to the dynamic interactions between physical transport mechanisms and chemical transformations, atmospheric pollutants exhibit complex, nonlinear behaviors. ANNs have proven particularly adept at modeling such nonlinear relationships, making them highly effective tools for predicting pollutant behavior and analyzing environmental data [99].
ANNs have demonstrated excellent performance in distinguishing regional patterns of air pollution. In a study analyzing 11 years of air quality data, pollutants such as PM10, NO2, SO2, CO, O3, and the API were found to be within Malaysian guideline limits. Strong positive correlations were observed between PM10, NO2, SO2, and API values while CO and O3 showed negative correlations with the API [100]. Notably, PM10 and SO2 emerged as the primary contributors influencing API values, primarily originating from industrial activities and vehicular emissions. In contrast, air quality in mining regions exhibited spatial variation based on proximity to water bodies and mining elevation.
Vehicle activity within mining areas is a significant contributor to both gaseous and particulate air pollutants. This emission load is further intensified by environmental factors such as wind speed and surface radiation, which facilitate the dispersion and distribution of contaminants. Statistical tools, such as Pearson correlation analysis, have proven effective in establishing relationships between surface temperature and concentrations of ambient air pollutants, offering insights into pollution dynamics in such areas [101]. Advanced chemometric techniques, including PCA and PMF, have been successfully employed to identify the sources of air pollutants such as CO, NO2, SO2, and PMs (Figure 7). PCA indicated that the type of fuel could have a significant influence on emission levels, PAHs, and TOC in the dust, while CA identified a correlation between PAH concentrations, dust amounts, and TOC, with the proportionality related to the size of the molecules [102]. PCA and FA identified eight pollutants in the LPS and SHPS regions and eleven pollutants in the MPS region. MLR models indicated that the primary pollutant is PM10, which is attributed to industrial activities, transportation, and agronomic practices [20]. The SHPS model was found to be the most accurate for predicting total API, offering the best correlation between predicted and calculated API; the R2 values for SHPS, MPS, and LPS were 0.894, 0.878, and 0.837, respectively [103]. PLS-DA and PCA are also highly useful for characterizing pollutants such as PM10, PM2.5, PM1.0, and O3, which affect indoor air quality [104]. These techniques can provide a clear picture of pollutant patterns and trends, along with their associated sources. The primary pollutants emitted, NO2, CO, and PM10, are attributed to industrial activities, building construction, and motor vehicles [105].
It is assumed that the formation of pollutants, including CH4, NmHC, THC, O3, and PM10, is primarily caused by motor vehicles [106]. These methods detect patterns in pollutant distribution and can pinpoint likely emission sources even in the absence of on-site sampling or field investigations. This capability is particularly valuable for assessing spatial variability in air quality across remote or inaccessible regions (Table 8).
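The Pearson correlations and R2 values quoted in this subsection rest on two short formulas. The following standard-library sketch shows one way to compute them. The function names and every number in it are hypothetical and chosen only for illustration; they are not data from the cited studies. R2 is computed here as the coefficient of determination, which is an assumption, since some studies report the squared correlation coefficient instead.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only.
api_calculated = [42.0, 55.0, 61.0, 48.0, 70.0, 66.0]
api_predicted = [45.0, 52.0, 63.0, 50.0, 67.0, 68.0]
pm10 = [30.0, 44.0, 52.0, 35.0, 60.0, 57.0]

print("r(PM10, API):", round(pearson_r(pm10, api_calculated), 3))
print("R2 of the prediction:", round(r_squared(api_calculated, api_predicted), 3))
```

A correlation close to 1 between a pollutant and the index suggests that the pollutant tracks the index closely, while R2 summarizes how much of the variation in the calculated index a model reproduces.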
Analytical evaluations have further demonstrated the effectiveness of smokeless fuels in reducing airborne pollutants and dust emissions. While solid fuels remain commonly used, particularly for residential heating during colder months, evidence suggests that smokeless alternatives can significantly lower emission levels. These fuels represent a viable pathway toward cleaner coal technologies, especially in regions where solid fuel use is prevalent in household energy consumption.
The adoption of smokeless fuels, combined with improved land use planning and rigorous air quality monitoring, could support the development of more sustainable urban environments. In particular, urban planning strategies that integrate risk assessment for industrial zones and building expansions can contribute to more effective control of air pollution. Additionally, incorporating sustainable transportation solutions—such as biofuel-powered systems—can further reduce the release of toxic emissions, supporting long-term improvements in urban air quality and public health.

4.2. Water Quality

Organic pollutants are among the primary contributors to water pollution, posing significant challenges to both global environmental health and public health. The escalating discharge of untreated wastewater—driven by the expansion of livestock farming, industrial activities, and urban development—continues to deteriorate water quality worldwide. Climate change, particularly global warming, exacerbates this issue by reducing the availability of freshwater from rivers, lakes, and reservoirs. The presence of persistent and toxic contaminants in water further heightens the risk to aquatic ecosystems, natural resources, and overall biodiversity.
The degradation of water resources containing organic pollutants represents a critical environmental crisis. Multivariate statistical methods, particularly chemometric techniques, have proven to be powerful tools for assessing and quantifying water pollution. These approaches are instrumental in evaluating water quality for various uses, safeguarding aquatic life, and managing environmental resources effectively [107]. Chemometric techniques facilitate the reduction of large, complex datasets, enabling the extraction of key variables that significantly influence water quality. They also help identify the sources of contamination, whether natural, such as water–rock interactions, or anthropogenic, including agricultural runoff and industrial discharge [108].
To assess the physicochemical properties of water, samples were collected from drainage systems carrying untreated sewage, agricultural runoff, and industrial effluents across urban, peri-urban, and rural areas. Fifteen water quality parameters were analyzed using a combination of chemometric models, including PCA, FA, CA, and DA. These methods revealed strong interrelationships among the variables, providing insight into the sources of pollution. PCA successfully identified the dominant factors influencing water quality, while FA highlighted agricultural and domestic wastes as the primary contributors to contamination (Table 9). Through HCA and DA, the spatial variability of water quality parameters (Figure 8) was simplified into four principal influencing factors [109,110]. However, several parameter concentrations, including TH, BOD, total alkalinity, and TDS, exceeded permissible limits, indicating significant pollution from agricultural, domestic, and industrial sources. Routine monitoring and assessment of drinking water are crucial for detecting contamination risks and ensuring water safety [111]. Public awareness campaigns are equally important for promoting an understanding of water quality issues and mitigating the health impacts associated with polluted water sources. Chemometric methods, such as PCA, have also been effectively applied to determine heavy metal concentrations in river water, providing precise and reproducible analyses [112,113,114]. These models serve as valuable tools in independently evaluating water quality and identifying hazardous residues that pose threats to public health. Their predictive capabilities enable the development of practical water management strategies, including intervention plans and early-warning systems.
The outcomes of such analyses have facilitated the creation of pollution hazard maps, which provide critical data for environmental authorities and communities to assess risk levels and implement measures to reduce the adverse effects of water contamination on human health and ecosystems.
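The grouping of monitoring sites by hierarchical cluster analysis, as used in the water quality studies above, can be sketched with the standard library alone. The example is an illustration under stated assumptions: the site values for BOD, TDS, and TH are invented, the helper names are hypothetical, variables are autoscaled first, and single linkage on Euclidean distance is used for brevity. The cited studies may well have used other linkage rules, such as Ward's method.

```python
import math

def standardize(rows):
    """Autoscale each column to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def agglomerative_clusters(points, n_clusters):
    """Single-linkage agglomerative clustering on Euclidean distance."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest members of two clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical sites; columns are BOD (mg/L), TDS (mg/L), and TH (mg/L).
sites = [
    [2.1, 310.0, 120.0],
    [2.4, 295.0, 131.0],
    [1.9, 330.0, 118.0],
    [8.7, 910.0, 402.0],
    [9.3, 875.0, 390.0],
    [8.1, 940.0, 415.0],
]

groups = agglomerative_clusters(standardize(sites), n_clusters=2)
print("site groups:", groups)
```

In this toy dataset the three low-concentration sites and the three high-concentration sites fall into separate groups, which is the kind of spatial grouping that can feed a pollution hazard map.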

4.3. Soil Quality

Agricultural activities, illegal waste dumping, and leakage from underground storage tanks are primarily responsible for soil pollution, a critical global concern. Soil pollution also complicates public health protection and regulatory compliance because it contaminates groundwater, which directly affects ecological systems. Elemental soil analysis is crucial for providing detailed information on contamination resulting from atmospheric deposition or from transport in surface water and groundwater. Multivariate techniques, including CA, DA, FA, and PCA models, have recently been refined to improve the analysis of two-dimensional datasets obtained from soil sampling [115]. These methods can identify polluted sites and categorize the toxic pollutants that affect the environment, as well as their relationships with the influencing variables. The soil profiles indicated the presence of chromium (Cr) and vanadium (V) [115], which is typically attributed to atmospheric pollutants, possibly from transportation and agricultural sources. The absolute principal component score approach is applied to the latent factors of the PCA model to reveal the contribution of each identified source to the total concentration of each chemical parameter. Cluster analysis was used to interpret the similarity of variables across different patterns in terms of pollution risk or spatial setting [116]. PCA first evaluates the structure of the dataset to identify the most influential factors. It could support soil quality assessment around a non-ferrous lead–zinc smelter and help determine the origin of the contamination and its consequences for the region [117].
Chemometric analysis has also included PLSR, SPCA, and Least Absolute Shrinkage and Selection Operator calibration, using UV–Vis-NIR spectra to predict twenty-two physical, chemical, and biological properties of the soil. The highest values were reported for Ca and Mg concentrations, as well as for β-glucosidase, fungal biomass, and phospholipid fatty acids, and the approach improved prediction accuracy, particularly by reducing prediction error and bias for many soil properties [118]. Approaches that integrate Vis-NIR and SWIR data are capable of processing spectral data to quantify SOM [119]. The study used topsoil characteristics to establish the optimal wavelength band for developing a model to estimate SOM content from rivers. In addition, PLSR and SVR models were applied after a pre-processing procedure that reduced the error caused by spectral overlaps, thereby enhancing the model's performance (Figure 9).
The chemometric approach was able to quantify the major pollution sources separately, which may be related to manufacturing activity, soil-specific properties, and aerosol alluviation practices. The corresponding regression models provide a quantitative estimate of each identified source's contribution to the total chemical parameter concentrations used for soil quality monitoring and regulatory compliance. Moreover, spatial analysis of the sampling locations was performed to reveal differences between sampling regions with similar patterns. It showed higher pollution levels in industrial areas and comparatively lower pollution levels near the shoreline and in mountainous regions. These methods are effective for soil characterization (Table 10). Traditional methods for agrochemical analysis, however, are often time-consuming and costly, and they rely on hazardous chemical reagents. Chemometric techniques offer an eco-friendly alternative that can generate accurate predictions for most soil characteristics.
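The absolute principal component score approach mentioned above is commonly written as the following sequence of steps. The notation is introduced here only for illustration and is not taken from the cited studies: C_{ij} is the concentration of parameter j in sample i, w_{jk} is the PCA score coefficient of parameter j on component k, and p is the number of retained components.

```latex
\begin{align}
  % autoscaled concentrations, and an artificial sample with zero concentration
  z_{ij} &= \frac{C_{ij} - \bar{C}_j}{\sigma_j}, &
  z_{0j} &= \frac{0 - \bar{C}_j}{\sigma_j} \\
  % component scores of the real samples and of the zero sample
  t_{ik} &= \sum_{j} w_{jk}\, z_{ij}, &
  t_{0k} &= \sum_{j} w_{jk}\, z_{0j} \\
  % absolute principal component scores
  \mathrm{APCS}_{ik} &= t_{ik} - t_{0k} \\
  % multiple linear regression of each parameter on the APCS
  C_{ij} &= b_{0j} + \sum_{k=1}^{p} b_{kj}\, \mathrm{APCS}_{ik}
\end{align}
```

Under this formulation, the mean of b_{kj} APCS_{ik} over all samples is read as the average contribution of source k to parameter j, and b_{0j} collects the share that the retained components do not explain.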

5. Conclusions

Environmental monitoring and assessment are inherently complex due to the dynamic interplay between natural processes and anthropogenic activities. This complexity often leads to overlapping variables and non-trivial datasets that challenge conventional analysis methods. Chemometric techniques offer a powerful and systematic approach to managing this complexity, enabling the extraction of meaningful information from large, multivariate datasets with high accuracy.
DA, when applied to geospatial datasets, has demonstrated its effectiveness in distinguishing between monitoring locations based on environmental variables. HACA has revealed temporal variations in pollution levels, providing insight into monthly and annual trends often associated with suspended atmospheric particulates such as smoke, dust, and chemical pollutants. PCA has been particularly effective in reducing data dimensionality and identifying key pollution sources—primarily emissions from fossil fuel combustion in transportation and industrial processes.
Moreover, ANNs have demonstrated superior predictive performance and selectivity compared to traditional methods, such as DA, especially in regional discrimination tasks. These findings underscore the importance of selecting chemometric techniques that are specifically tailored to the nature of the environmental problem being addressed.
Ultimately, the integration of appropriate chemometric methods into environmental monitoring frameworks can significantly enhance data interpretation, pollution source identification, and temporal trend analysis. To ensure effective environmental management and pollution mitigation, it is crucial to implement targeted action plans based on robust chemometric analysis.

Author Contributions

Conceptualization, A.K. and S.M.H.; investigation, A.K., S.M.H. and Y.U.; methodology, S.M.H. and Y.U.; writing—original draft preparation, S.M.H. and Y.U.; writing—review and editing, A.K., S.M.H. and Y.U. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing does not apply to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Aerosol Optical Depth (AOD)
Active Pharmaceutical Ingredients (API)
Affinity Tensor-Based Matching (ATBM)
Agglomerative Hierarchical Cluster (AHC)
Air Pollution Index (API)
Aluminum (Al)
Ammonia (NH4)
Arsenic (As)
Artificial Intelligence (AI)
Artificial Neural Network (ANN)
Asymmetric Least Squares Splines Regression (AsLSSR)
Atomic Absorption Spectroscopy (AAS)
Attenuated Total Reflection (ATR)
Average air Temperature between current and previous day (TMean)
Average relative Humidity between current and previous day (RHMean)
Average wind Speed between current and previous day (WSMean)
Bicarbonate (HCO3)
Biochemical Oxygen Demand (BOD)
Boron (B)
Cadmium (Cd)
Calcium (Ca)
Calcium Carbonate (CaCO3)
Carbon Monoxide (CO)
Chemical Oxygen Demand (COD)
Chloride (Cl)
Chromium (Cr)
Cluster Analysis (CA)
Collected total precipitation past 48 h (PR48H)
Collected total precipitation past 72 h (PR72H)
Colored dissolved organic matter (CDOM)
Continuous Wavelet Transform (CWT)
Copper (Cu)
Correlation Optimized Warping (COW)
Current-day air temperature (TT)
Current-day relative humidity (RHT)
Current-day total precipitation (PRT)
Current-day wind speed (WST)
Discrete Wavelet Transform (DWT)
Discriminant Analysis (DA)
Dissolved Organic Carbon (DOC)
Dissolved Oxygen (DO)
Discriminant Analysis of Multi-Aspect Cytometry (DAMACY)
Electrical Conductivity (EC)
Electrothermal Atomic Absorption Spectroscopy (ETAAS)
Energy-Dispersive X-ray Fluorescence (EDXRF)
Environmental Carrying Capacity (ECC)
Extreme Gradient Boosting (XGBoost)
eXtensible Computational Mass Spectrometry (XCMS)
Factor Analysis (FA)
Flame Ionization Detector (FID)
Fast Fourier Transform (FFT)
Functional Analysis of Variance (FANOVA)
Fourier Transform Infrared (FTIR)
Gas Chromatography–Mass Spectrometry (GCMS)
Geographically Weighted Regression (GWR)
Gigahertz (GHz)
Hazard Quotient (HQ)
Hierarchical Agglomerative Cluster Analysis (HACA)
Hierarchical Cluster Analysis (HCA)
High-Performance Liquid Chromatography (HPLC)
Hydrochloride (HCl)
Hyperspectral Vegetation Indices (HVIs)
Inductively Coupled Plasma Optical Emission Spectrometry (ICP-OES)
Internet of Things (IoT)
Interval Correlation Optimized Shifting (icoshift)
Ion Chromatography (IC)
Iron (Fe)
Land Use and Land Cover (LULC)
Life Cycle Assessment (LCA)
Linear Discriminant Analysis (LDA)
Liquid Chromatography–Mass Spectrometry (LCMS)
Low Pollution Source (LPS)
Magnesium (Mg)
Mean Absolute Percentage Error (MAPE)
Mercury (Hg)
Metabolomic Analysis and Visualization ENgine (MAVEN)
Methane (CH4)
Moderate Pollution Source (MPS)
Moderate Resolution Imaging Spectroradiometer (MODIS)
Molecular descriptors (MDs)
Multi-Angle Imaging Spectroradiometer (MISR)
Multiple Linear Regression (MLR)
Multiplicative Scatter Correction (MSC)
National Institute of Standards and Technology (NIST)
Near-Infrared (NIR)
Near-Infrared Reflectance Spectroscopy (NIRS)
Nickel (Ni)
Nitrate (NO3)
Nitrogen Dioxide (NO2)
Non-methane Hydrocarbons (NmHC)
Normalized Difference Vegetation Index (NDVI)
Not Available (NA)
Norris-Williams derivation (NW)
Orthogonal Partial Least Squares (OPLS)
Ozone (O3)
Partial Least Squares (PLS)
Partial Least Squares Regression (PLSR)
Particulate Matter with a Diameter of 2.5 µm or Less (PM2.5)
Particulate Matter with a Diameter of 10 µm or Less (PM10)
Lead (Pb)
Phosphorus (P)
Polycyclic Aromatic Hydrocarbons (PAHs)
Positive Matrix Factorization (PMF)
Potassium permanganate (KMnO4)
Previous-day air Temperature (TT-1)
Previous-day relative Humidity (RHT-1)
Previous-day wind Speed (WST-1)
Principal Component Analysis (PCA)
Quadratic Discriminant Analysis (QDA)
Quantitative Structure–Activity Relationship (QSAR)
Random Forest (RF)
Root Mean Square Error of Cross-Validation (RMSECV)
Root Mean Squared Error of Prediction (RMSEP)
Savitzky–Golay polynomial filters (SG)
Sea surface salinity (SSS)
Secchi disk depth (SDD)
Selenium (Se)
Sentinel-5 Precursor (S5P)
Short-Wavelength Infrared (SWIR)
Slightly High Pollution Source (SHPS)
Sodium (Na)
Sodium Adsorption Ratio (SAR)
Soil Organic Matter (SOM)
Solid-Phase Microextraction (SPME)
Standard Normal Variate (STV)
Sulfur Dioxide (SO2)
Support Vector Machines (SVM)
Suspended Solids (SS)
Tin Dioxide (SnO2)
Titanium (Ti)
Total Dissolved Solids (TDS)
Total Hardness (TH)
Total Hydrocarbons (THC)
Total Kjeldahl Nitrogen method (TKN)
Total Organic Carbon (TOC)
Total Phosphorus (TP)
Total Suspended Solids (TSS)
Trihalomethanes (THMs)
Ultraviolet-Visible (UV–Vis)
United Kingdom (UK)
Unmanned Aerial Vehicles (UAVs)
Vanadium (V)
Visible Infrared Imaging Radiometer Suite (VIIRS)
Volatile Organic Compounds (VOCs)
Water Quality Index (WQI)
Wireless Sensor Networks (WSNs)
Zinc (Zn)

References

  1. Sara, T.; Cinti, S. How Can Chemometrics Support the Development of Point of Need Devices? Anal. Chem. 2021, 93, 2713–2722.
  2. González-Domínguez, R.; Sayago, A.; Fernández-Recamales, A. An Overview on the Application of Chemometrics Tools in Food Authenticity and Traceability. Foods 2022, 11, 3940.
  3. El-Rawy, M.; Fathi, H.; Abdalla, F.; Alshehri, F.; Eldeeb, H. An Integrated Principal Component and Hierarchical Cluster Analysis Approach for Groundwater Quality Assessment in Jazan, Saudi Arabia. Water 2023, 15, 1466.
  4. Chen, T.; Zhang, H.; Sun, C.; Li, H.; Gao, Y. Multivariate statistical approaches to identify the major factors governing groundwater quality. Appl. Water Sci. 2018, 8, 215.
  5. Raphaëlle, V.; Judith, G.-A.; Florent, M.; Nicole, L.M.; Jordi, D.B.; Gemma, M.; Christophe, P.; Isabelle, R.; Francine, K.; Jean, M. Assessment of dietary patterns in nutritional epidemiology: Principal component analysis compared with confirmatory factor analysis. Am. J. Clin. Nutr. 2012, 96, 1079–1092.
  6. Michael, G.; Patrick, J.F.G.; Trevor, H.; Alfonso, I.D.; Angelos, M.; Elena, T. Principal component analysis. Nat. Rev. Methods Primers 2022, 2, 100.
  7. Ryu, J.; Liu, K.B.; McCloskey, T.A. The use of multivariate PCA dataset in identifying the underlying drivers of critical stressors, looking at global problems through a local lens. Data Brief 2022, 41, 107946.
  8. Gregory, O.N.; Jean, V.; Isabelle, C.; Olivier, T. Using a multivariate regression tree to analyze trade-offs between ecosystem services: Application to the main cropping area in France. Sci. Total Environ. 2021, 764, 142815.
  9. Chu, K.; Liu, W.; She, Y.; Hua, Z.; Tan, M.; Liu, X.; Gu, L.; Jia, Y. Modified Principal Component Analysis for Identifying Key Environmental Indicators and Application to a Large-Scale Tidal Flat Reclamation. Water 2018, 10, 69.
  10. Benjamin, P.; Per, A. Principal component analyses for integrated ecosystem assessments may primarily reflect methodological artefacts. ICES J. Mar. Sci. 2018, 75, 1021–1028.
  11. Abel, I.; Vanya, N.; Tsado, J.M.; Stanley, O.; Lucky, E.; Alexander, I.A.; Esther, B.; Jonathan, I.; Mariam, A.M.; Singh, K.R.B. Chemometric approach in environmental pollution analysis: A critical review. J. Environ. Manag. 2022, 309, 114653.
  12. Jelena, V.P.; Sladana, C.A.; Snezana, M.M.; Snezana, B.T.; Mile, M.B. Chemometric characterization of heavy metals in soils and shoots of the two pioneer species sampled near the polluted water bodies in the close vicinity of the copper mining and metallurgical complex in Bor (Serbia): Phytoextraction and biomonitoring contexts. Chemosphere 2021, 262, 127808.
  13. Nur, Z.S.; Ahmad, S.M.S.; Jyh, C.P.; Izuddin, F.A.; Norzahir, S.; Mohd, K.A.K.; Hammad, F.M.S. Application of chemometrics techniques to solve environmental issues in Malaysia. Heliyon 2019, 5, e02534.
  14. Abhijeet, D. An optimized approach for predicting water quality features and a performance evaluation for mapping surface water potential zones based on Discriminant Analysis (DA), Geographical Information System (GIS) and Machine Learning (ML) models in Baitarani River Basin, Odisha. Desalina. Water Treat. 2025, 321, 101039.
  15. Jafar, R.; Awad, A.; Hatem, I.; Jafar, K.; Awad, E.; Shahrour, I. Multiple Linear Regression and Machine Learning for Predicting the Drinking Water Quality Index in Al-Seine Lake. Smart Cities 2023, 6, 2807–2827.
  16. Dargahi, P.; Nasseri, S.; Hadi, M.; Nodehi, R.N.; Mahvi, A.H. Prediction models for groundwater quality parameters using a multiple linear regression (MLR): A case study of Kermanshah, Iran. J. Environ. Health Sci. Eng. 2022, 21, 63–71.
  17. Nur, F.S.Z.; Mohd, S.M.N.; Fatimah, A.R.; Munira, I.; Mohd, A.A. Hybridization of hierarchical clustering with persistent homology in assessing haze episodes between air quality monitoring stations. J. Environ. Manag. 2022, 306, 114434.
  18. Zulkepli, N.F.S.; Noorani, M.S.M.; Razak, F.A.; Ismail, M.; Alias, M.A. Cluster Analysis of Haze Episodes Based on Topological Features. Sustainability 2020, 12, 3985.
  19. Srinivasan, K.; Thirumalini, P.K. Assessment of Chennai’s Ambient Air Quality Data using Multivariate Analysis from 2005 to 2015. Asian J. Appl. Sci. 2017, 5, 320–329.
  20. Azman, A.; Hafizan, J.; Ezureen, E.; Mohd, E.T.; Azizah, E.; Mohd, N.A.R.; Kamaruzzaman, Y.; Mohd, K.A.K.; Che, N.C.H.; Ahmad, S.M.S.; et al. Identification Source of Variation on Regional Impact of Air Quality Pattern Using Chemometric. Aerosol Air Qual. Res. 2015, 15, 1545–1558.
  21. Thara, S.; Kamonrat, K.; Chatchawal, W. A comprehensive review on advancements in sensors for air pollution applications. Sci. Total Environ. 2024, 951, 175696.
  22. Masthurah, A.; Juahir, H.; Mohd, Z.N.B. Case study Malaysia: Spatial water quality assessment of Juru, Kuantan and Johor River Basins using environmetric techniques. J. Surv. Fish. Sci. 2021, 7, 19–40.
  23. Veerasingan, A.S.; Hafizan, J.; Ananthy, R.; Ali, M.M. Chemometric Interpretation on the Occurrence of Endocrine Disruptors in Source Water from Malaysia. Clean Soil Air Water 2015, 43, 804–810.
  24. Arshad, K.; Hussain, N.; Ashraf, M.H.; Saleem, M.Z. Air pollution and climate change as grand challenges to sustainability. Sci. Total Environ. 2024, 928, 172370.
  25. Perera, F. Pollution from Fossil-Fuel Combustion is the Leading Environmental Threat to Global Pediatric Health and Equity: Solutions Exist. Int. J. Environ. Res. Public Health 2017, 15, 16.
  26. Xinghui, L.; Kuppusamy, S.; Huichao, Z.; Kuldeep, K.S.; Fuchun, Z.; Saraschandra, N.; Anbarasu, K.; Ramya, R.; Aruliah, R.; Xiang, G. Frontiers in environmental cleanup: Recent advances in remediation of emerging pollutants from soil and water. J. Hazard. Mater. Adv. 2024, 16, 100461.
  27. Miller, T.; Durlik, I.; Kostecka, E.; Kozlovska, P.; Łobodzińska, A.; Sokołowska, S.; Nowy, A. Integrating Artificial Intelligence Agents with the Internet of Things for Enhanced Environmental Monitoring: Applications in Water Quality and Climate Data. Electronics 2025, 14, 696.
  28. Sing, W.A.; Vrontos, S.; Taylor, M.L. An assessment of people living by coral reefs over space and time. Glob. Change Biol. 2022, 28, 7139–7153.
  29. Abrego, D.; Howells, E.J.; Smith, S.D.A.; Madin, J.S.; Sommer, B.; Schmidt-Roach, S.; Cumbo, V.R.; Thomson, D.P.; Rosser, N.L.; Baird, A.H. Factors Limiting the Range Extension of Corals into High-Latitude Reef Regions. Diversity 2021, 13, 632.
  30. Madeleine, F.D.; Aaron, E.; Daniel, C.; James, C.; Vi, K.T.; Russell, J.C.; Kay, L. Chemometrics for environmental monitoring: A review. Anal. Methods 2020, 12, 4597–4620.
  31. Mohan, S.; Li, Y.; Chu, K.; Shi, B.; De La Paz, L.; Bakre, P.; Foti, C.; Rucker, V.; Lai, C. Developing In Situ Chemometric Models with Raman Spectroscopy for Monitoring an API Disproportionation with a Complex Polymorphic Landscape. Pharmaceuticals 2023, 16, 327.
  32. Frau, I.; Wylie, S.; Byrne, P.; Onnis, P.; Cullen, J.; Mason, A.; Korostynska, O. Microwave Sensors for In Situ Monitoring of Trace Metals in Polluted Water. Sensors 2021, 21, 3147.
  33. Inobeme, A.; Natarajan, A.; Pradhan, S.; Adetunji, C.O.; Ajai, A.I.; Inobeme, J.; Tsado, M.J.; Jacob, J.O.; Pandey, S.S.; Singh, K.R.; et al. Chemical Sensor Technologies for Sustainable Development: Recent Advances, Classification, and Environmental Monitoring. Adv. Sens. Res. 2024, 3, 2400066.
  34. Felemban, S.; Vazquez, P.; Moore, E. Future Trends for In Situ Monitoring of Polycyclic Aromatic Hydrocarbons in Water Sources: The Role of Immunosensing Techniques. Biosensors 2019, 9, 142.
  35. Bojko, B.; Onat, B.; Boyaci, E.; Psillakis, E.; Dailianis, T.; Pawliszyn, J. Application of in situ Solid-Phase Microextraction on Mediterranean Sponges for Untargeted Exometabolome Screening and Environmental Monitoring. Front. Mar. Sci. 2019, 6, 632.
  36. Nie, H.; Liu, Z.; Marks, B.C.; Taylor, L.S.; Byrn, S.R.; Marsac, P.J. Analytical approaches to investigate salt disproportionation in tablet matrices by Raman spectroscopy and Raman mapping. J. Pharm. Biomed. Anal. 2016, 118, 328–337.
  37. Wray, P.S.; Sinclair, W.E.; Jones, J.W.; Clarke, G.S.; Both, D. The use of in situ near infrared imaging and Raman mapping to study the disproportionation of a drug HCl salt during dissolution. Int. J. Pharm. 2015, 493, 198–207.
  38. Ewing, A.V.; Wray, P.S.; Clarke, G.S.; Kazarian, S.G. Evaluating drug delivery with salt formation: Drug disproportionation studied in situ by ATR-FTIR imaging and Raman mapping. J. Pharm. Biomed. Anal. 2015, 111, 248–256. [Google Scholar] [CrossRef] [PubMed]
  39. Nie, H.; Klinzing, G.; Xu, W. A comparative study of applying backscattering and transmission Raman spectroscopy to quantify solid-state form conversion in pharmaceutical tablets. Int. J. Pharm. 2022, 617, 121608. [Google Scholar] [CrossRef] [PubMed]
  40. Sabins, F.F. Remote Sensing: Principles and Interpretation, 3rd ed.; Waveland Press Inc.: Long Grove, IL, USA, 2007. [Google Scholar]
  41. Hadi; Krasovskii, A.; Maus, V.; Yowargana, P.; Pietsch, S.; Rautiainen, M. Monitoring Deforestation in Rainforests Using Satellite Data: A Pilot Study from Kalimantan, Indonesia. Forests 2018, 9, 389.
  42. Gu, Z.; Zeng, M. The Use of Artificial Intelligence and Satellite Remote Sensing in Land Cover Change Detection: Review and Perspectives. Sustainability 2024, 16, 274.
  43. Aleksey, V.; Denis, K.; Andrey, O.; Dmitry, K.; Irina, I. Scientific Bases Development for Oil Spill Accidents Automated Detection Using Drones. Transport. Res. Procedia 2023, 68, 585–590.
  44. Muksimova, S.; Umirzakova, S.; Mardieva, S.; Abdullaev, M.; Cho, Y.I. Revolutionizing Wildfire Detection Through UAV-Driven Fire Monitoring with a Transformer-Based Approach. Fire 2024, 7, 443.
  45. Singh, Y.; Walingo, T. Smart Water Quality Monitoring with IoT Wireless Sensor Networks. Sensors 2024, 24, 2871.
  46. Udaya, D.; Lumini, B.; Ridma, W.; Kishanga, K.; Bathiya, J. Forest fire detection system using wireless sensor networks and machine learning. Sci. Rep. 2022, 46, 12.
  47. Majumder, A.; Losito, M.; Paramasivam, S.; Kumar, A.; Gatto, G. Buoys for marine weather data monitoring and LoRaWAN communication. Ocean Eng. 2024, 313, 119521.
  48. Gholizadeh, M.H.; Melesse, A.M.; Reddi, L. A Comprehensive Review on Water Quality Parameters Estimation Using Remote Sensing Techniques. Sensors 2016, 16, 1298.
  49. Chen, S.; Yuan, X.; Yuan, W.; Niu, J.; Xu, F.; Zhang, Y. Matching Multi-Sensor Remote Sensing Images via an Affinity Tensor. Remote Sens. 2018, 10, 1104.
  50. Donkelaar, A.V.; Martin, R.V.; Brauer, M.; Hsu, N.C.; Kahn, R.A.; Levy, R.C.; Apte, J.S. Global estimates of fine particulate matter using a combined geophysical-statistical method with information from satellites, models, and monitors. Environ. Sci. Technol. 2015, 50, 3762–3772.
  51. Wan, Y.; Chen, F.; Fan, L.; Sun, D.; He, H.; Dai, Y.; Li, L.; Chen, Y. Conversion of surface CH4 concentrations from GOSAT satellite observations using XGBoost algorithm. Atmos. Environ. 2023, 301, 119694.
  52. Tan, Y.-C.; Duarte, L.; Teodoro, A.C. Comparative Study of Random Forest and Support Vector Machine for Land Cover Classification and Post-Wildfire Change Detection. Land 2024, 13, 1878.
  53. Kumar, M.; Khamis, K.; Stevens, R.; Hannah, D.M.; Bradley, C. In-situ optical water quality monitoring sensors—Applications, challenges, and future opportunities. Front. Water 2024, 6, 1380133.
  54. Zhu, X.; Qin, H.; Liu, J.; Zhang, Z.; Lu, Y.; Yuan, X.; Wu, D. A novel electrochemical method to evaluate the cytotoxicity of heavy metals. J. Hazard. Mater. 2014, 271, 210–219.
  55. Fang, D.; Gao, G.; Shen, J.; Yu, Y.; Zhi, J. A reagentless electrochemical biosensor based on thionine wrapped E. coli and chitosan-entrapped carbon nanodots film modified glassy carbon electrode for wastewater toxicity assessment. Electrochim. Acta 2016, 222, 303–311.
  56. Shivani, D.; Mehta, B.R.; Tyagi, A.K.; Sood, K. A review on environmental gas sensors: Materials and technologies. Sens. Int. 2021, 2, 100116.
  57. James, C.; Vi, K.T.; Aaron, E.; Sheeana, G.; Samuel, C.; Piumie, R.; Kay, L.; Russell, J.C.; Daniel, C. Combining Chemometrics and Sensors: Toward New Applications in Monitoring and Environmental Analysis. Chem. Rev. 2020, 120, 6048–6069.
  58. Falcioni, R.; Gonçalves, J.V.F.; de Oliveira, K.M.; de Oliveira, C.A.; Reis, A.S.; Crusiol, L.G.T.; Furlanetto, R.H.; Antunes, W.C.; Cezar, E.; de Oliveira, R.B.; et al. Chemometric Analysis for the Prediction of Biochemical Compounds in Leaves Using UV-VIS-NIR-SWIR Hyperspectroscopy. Plants 2023, 12, 3424.
  59. Krzebietke, S.; Daszykowski, M.; Czarnik-Matusewicz, H.; Stanimirova, I.; Pieszczek, L.; Sienkiewicz, S.; Wierzbowska, J. Monitoring the concentrations of Cd, Cu, Pb, Ni, Cr, Zn, Mn and Fe in cultivated Haplic Luvisol soils using near-infrared reflectance spectroscopy and chemometrics. Talanta 2023, 251, 123749.
  60. Gerjen, H.T.; Olga, L.; Dillen, A.; Mathijs, L.; Rinze, W.G.; Machteld, R.; Harrie, K.; George, D.; Arnold, V.; Lutgarde, M.C.B.; et al. Water quality monitoring based on chemometric analysis of high-resolution phytoplankton data measured with flow cytometry. Environ. Int. 2022, 170, 107587.
  61. Malavi, D.; Nikkhah, A.; Raes, K.; Van, H.S. Hyperspectral Imaging and Chemometrics for Authentication of Extra Virgin Olive Oil: A Comparative Approach with FTIR, UV-VIS, Raman, and GC-MS. Foods 2023, 12, 429.
  62. Kumar, S.P.J.; Kuriachan, L. Chemometric appraisal of groundwater quality for domestic, irrigation and industrial purposes in Lower Bhavani River basin, Tamil Nadu, India. Int. J. Environ. Anal. Chem. 2020, 102, 3437–3460.
  63. Gao, H.; Lv, C.; Song, Y.; Zhang, Y.; Zheng, L.; Wen, Y.; Peng, J.; Yu, H. Chemometrics data of water quality and environmental heterogeneity analysis in Pu River, China. Environ. Earth Sci. 2015, 73, 5119–5129.
  64. Al-Odaini, N.A.; Zakaria, M.P.; Zali, M.A.; Juahir, H.; Yaziz, M.I.; Surif, S. Application of chemometrics in understanding the spatial distribution of human pharmaceuticals in surface water. Environ. Monit. Assess. 2012, 184, 6735–6748.
  65. Alvarez, G.M.; Ballabio, D.; Amigo, J.M.; Viguri, J.R.; Bro, R. A chemometric approach to the environmental problem of predicting toxicity in contaminated sediments. J. Chemom. 2010, 24, 379–386.
  66. Hafizan, J.; Sharifuddin, M.Z.; Ahmad, Z.A.; Mohd, K.Y.; Mazlin, B.M. Spatial assessment of Langat river water quality using chemometrics. J. Environ. Monit. 2010, 12, 287–295.
  67. Manuel, D.P.D.; Artur, K. A guide to good practice in chemometric methods for vibrational spectroscopy, electrochemistry, and hyphenated mass spectrometry. TrAC Trends Anal. Chem. 2021, 135, 116157.
  68. Adejuwon, E.O.; Ogwueleka, T.C.; Ogungbemi, E.O.; Prabhu, R.; Nava, A.R.; Yates, K. Assessment of Surface Water Quality Using Chemometric Tools: A Case Study of Jabi Lake, Abuja, Nigeria. Iran. J. Sci. Technol. Trans. Civ. Eng. 2025, 49, 829–852.
  69. Curcic, L.; Loncar, B.; Pezo, L.; Stojic, N.; Prokic, D.; Filipovic, V.; Pucarevic, M. Chemometric Approach to Pesticide Residue Analysis in Surface Water. Water 2022, 14, 4089.
  70. Rocha, W.F.D.C.; Prado, C.B.D.; Blonder, N. Comparison of Chemometric Problems in Food Analysis using Non-Linear Methods. Molecules 2020, 25, 3025.
  71. Nurhayati, M.; You, Y.; Park, J.; Lee, B.J.; Kang, H.G.; Lee, S. Artificial neural network implementation for dissolved organic carbon quantification using fluorescence intensity as a predictor in wastewater treatment plants. Chemosphere 2023, 335, 139032.
  72. Lyu, Y.; Zhao, W.; Kinouchi, T.; Nagano, T.; Tanaka, S. Development of statistical regression and artificial neural network models for estimating nitrogen, phosphorus, COD, and suspended solid concentrations in eutrophic rivers using UV–Vis spectroscopy. Environ. Monit. Assess. 2023, 195, 1114.
  73. Babaei, A.A.; Tahmasebi, B.Y.; Baboli, Z.; Heydar Maleki, H.; Angali, K.A. Using water quality parameters to prediction of the ion-based trihalomethane by an artificial neural network model. Environ. Monit. Assess. 2023, 195, 917.
  74. Zarra, T.; Galang, M.G.; Ballesteros, F.; Belgiorno, V.; Naddeo, V. Environmental odour management by artificial neural network—A review. Environ. Int. 2019, 133, 105189.
  75. de Lima, B.D.; de Cassia, M.A.R.; de Oliveira, G.G.; Paim, B.L. The performance of artificial neural networks for modeling daily concentrations of particulate matter from meteorological data. Environ. Monit. Assess. 2023, 195, 1305.
  76. Amaya, J.A.G.; Colmenares, A.N.N.; Rodríguez, A.F.C.; Pulido, J.G. Artificial neural network-based QSAR model for predicting degradation techniques of pharmaceutical contaminants in water bodies with experimental verification. Environ. Sci. Water Res. Technol. 2024, 10, 1492–1498.
  77. Zhang, J.; Yang, R.; Chen, R.; Li, Y.C.; Peng, Y.; Liu, C. Multielemental Analysis Associated with Chemometric Techniques for Geographical Origin Discrimination of Tea Leaves (Camelia sinensis) in Guizhou Province, SW China. Molecules 2018, 23, 3013.
  78. Athamena, A.; Gaagai, A.; Aouissi, H.A.; Burlakovs, J.; Bencedira, S.; Zekker, I.; Krauklis, A.E. Chemometrics of the Environment: Hydrochemical Characterization of Groundwater in Lioua Plain (North Africa) Using Time Series and Multivariate Statistical Analysis. Sustainability 2023, 15, 20.
  79. Reta, C.; Asmellash, T.; Atlabachew, M.; Mehari, B. Multielement analysis coupled with chemometrics modelling for geographical origin classification of teff [Eragrostis tef (Zuccagni) Trotter] grains from Amhara Region, Ethiopia. BMC Chem. 2023, 17, 50.
  80. Benkov, I.; Varbanov, M.; Venelinov, T.; Tsakovski, S. Principal Component Analysis and the Water Quality Index—A Powerful Tool for Surface Water Quality Assessment: A Case Study on Struma River Catchment, Bulgaria. Water 2023, 15, 1961.
  81. Kamal, M.A.; Almohana, A.I. Assessment of Physicochemical Water Quality using Principal Component Analysis: A Case Study Wadi Hanifa, Riyadh. Civ. Eng. Res. J. 2022, 12, 555850.
  82. Su, Q.; Yu, H.; Xu, X.; Chen, B.; Yang, L.; Fu, T.; Liu, W.; Chen, G. Using Principal Component Analysis (PCA) Combined with Multivariate Change-Point Analysis to Identify Brine Layers Based on the Geochemistry of the Core Sediment. Water 2023, 15, 1926.
  83. Martini, E.; Wollschläger, U.; Musolff, A.; Werban, U.; Zacharias, S. Principal Component Analysis of the Spatiotemporal Pattern of Soil Moisture and Apparent Electrical Conductivity. Vadose Zone J. 2017, 16, 1–12.
  84. Acal, C.; Aguilera, A.M.; Sarra, A.; Evangelista, A.; Di Battista, T.; Palermi, S. Functional ANOVA Approaches for Detecting Changes in Air Pollution During the COVID-19 Pandemic. Stoch. Environ. Res. Risk Assess. 2022, 36, 1083–1101.
  85. Karamizadeh, S.; Abdullah, S.; Manaf, A.; Zamani, M.; Hooman, A. An Overview of Principal Component Analysis. J. Signal Inf. Process. 2013, 4, 173–175.
  86. Younes, K.; Kharboutly, Y.; Antar, M.; Chaouk, H.; Obeid, E.; Mouhtady, O.; Abu-samha, M.; Halwani, J.; Murshid, N. Application of Unsupervised Machine Learning for the Evaluation of Aerogels’ Efficiency towards Ion Removal—A Principal Component Analysis (PCA) Approach. Gels 2023, 9, 304.
  87. Liu, Y.; Zhang, J.; Wang, S.; Wang, Y.; Zhao, A. Assessment of Environmental Carrying Capacity Using Principal Component Analysis. J. Geosci. Environ. Protect. 2018, 6, 54–65.
  88. Liu, J.; Kang, H.; Tao, W.; Li, H.; He, D.; Ma, L.; Tang, H.; Wu, S.; Yang, K.; Li, X. A spatial distribution—Principal component analysis (SD-PCA) model to assess pollution of heavy metals in soil. Sci. Total Environ. 2023, 859, 160112.
  89. Mamouei, M.; Zhu, Y.; Nazarzadeh, M.; Hassaine, A.; Salimi-Khorshidi, G.; Cai, Y.; Rahimi, K. Investigating the association of environmental exposures and all-cause mortality in the UK Biobank using sparse principal component analysis. Sci. Rep. 2022, 12, 9239.
  90. Akbar, T.A.; Javed, A.; Ullah, S.; Ullah, W.; Pervez, A.; Akbar, R.A.; Javed, M.F.; Mohamed, A.; Mohamed, A.M. Principal Component Analysis (PCA)–Geographic Information System (GIS) Modeling for Groundwater and Associated Health Risks in Abbottabad, Pakistan. Sustainability 2022, 14, 14572.
  91. Chen, S.; Yu, L.; Zhang, C.; Wu, Y.; Li, T. Environmental impact assessment of multi-source solid waste based on a life cycle assessment, principal component analysis, and random forest algorithm. J. Environ. Manag. 2023, 339, 117942.
  92. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A 2016, 374, 20150202.
  93. Keithley, R.B.; Wightman, R.M.; Heien, M.L. Multivariate concentration determination using principal component regression with residual analysis. TrAC Trends Anal. Chem. 2009, 28, 1127–1136.
  94. Mejia, A.F.; Nebel, M.B.; Eloyan, A.; Caffo, B.; Lindquist, M.A. PCA leverage: Outlier detection for high-dimensional functional magnetic resonance imaging data. Biostatistics 2017, 18, 521–536.
  95. Biancolillo, A.; Marini, F.; Ruckebusch, C.; Vitale, R. Chemometric Strategies for Spectroscopy-Based Food Authentication. Appl. Sci. 2020, 10, 6544.
  96. Alessandra, B.; Federico, M. Chemometric Methods for Spectroscopy-Based Pharmaceutical Analysis. Front. Chem. 2018, 6, 576.
  97. Ferreira, S.L.C. Chemometrics and Statistics, Experimental Design. In Encyclopedia of Analytical Science, 3rd ed.; Worsfold, P., Poole, C., Townshend, A., Miro, M., Eds.; Academic Press: San Diego, CA, USA, 2019; pp. 420–424.
  98. Veron, J.E.N.; Hoegh-Guldberg, O.; Lenton, T.M.; Lough, J.M.; Obura, D.O.; Pearce-Kelly, P.; Sheppard, C.R.C.; Spalding, M.; Stafford-Smith, M.G.; Rogers, A.D. The coral reef crisis: The critical importance of <350 ppm CO2. Mar. Pollut. Bull. 2009, 58, 1428–1436.
  99. Mutalib, S.N.S.A.; Juahir, H.; Azid, A.; Sharif, S.M.; Latif, M.T.; Aris, A.Z.; Zain, S.M.; Dominick, D. Spatial and temporal air quality pattern recognition using environmetric techniques: A case study in Malaysia. Environ. Sci. Process. Impacts 2013, 15, 1717–1728.
  100. Abdullah, A.; Saudi, A.S.M.; Shafii, N.Z.; Kamarudin, M.K.A.; Sukki, F.M. Temporal analysis and predictive modeling of ambient air quality in Hulu Langat district, Selangor, Malaysia: A chemometric approach. J. Malays. Inst. Plan. 2024, 22, 394–408.
  101. Pavel, M.R.S.; Zaman, S.U.; Jeba, F.; Islam, M.S.; Salam, A. Long-term (2003–2019) air quality, climate variables, and human Health Consequences in Dhaka, Bangladesh. Front. Sustain. Cities 2021, 3, 681759.
  102. Muzyka, R.; Chrubasik, M.; Pogoda, M.; Sajdak, M. Chemometric analysis of air pollutants in raw and thermally treated coals–low-emission fuel for domestic applications, with a reduced negative impact on air quality. J. Environ. Manag. 2021, 281, 111787.
  103. Hua, A.K. Applied chemometric approach in identification sources of air quality pattern in Selangor, Malaysia. Sains Malays. 2018, 47, 471–479.
  104. Azid, A.; Amran, M.A.; Samsudin, M.S.; Rani, N.L.A.; Khalit, S.I.; Gasim, M.B.; Yunus, K.; Saudi, A.S.M.; Amin, S.N.S.M.; Yusof, K.M.K.K. Assessing indoor air quality using chemometric models. Pol. J. Environ. Stud. 2018, 27, 2443–2450.
  105. Hanapiah, S.M.; Saudi, A.S.M.; Rizman, Z.I. Assessment on pattern of urban air quality by using chemometric technique: A case study in Kota Kinabalu, Sabah. J. Fundam. Appl. Sci. 2017, 9, 861–870.
  106. Azid, A.; Juahir, H.; Toriman, M.E.; Endut, A.; Kamarudin, M.K.A.; Rahman, M.N.A.; Hasnam, C.N.C.; Saudi, A.S.M.; Yunus, K. Source apportionment of air pollution: A case study in Malaysia. J. Tek. 2015, 72, 83–88.
  107. Thomas, E.O. Evaluation of groundwater quality using multivariate, parametric and non-parametric statistics, and GWQI in Ibadan, Nigeria. Water Sci. 2023, 37, 117–130.
  108. Kaur, H.; Rajor, A.; Kaleka, A.S. Application of chemometric modeling for identification of pollution sources from drains of Ghaggar River, Punjab, India. Sadhana 2022, 47, 251.
  109. Elkorashey, R.M. Utilizing chemometric techniques to evaluate water quality spatial and temporal variation. A case study: Bahr El-Baqar drain–Egypt. Environ. Technol. Innov. 2022, 26, 102332.
  110. Kustomo; Rasidah; Oktaviano, D. Chemometrics analysis for the groundwater quality assessment in UIN Walisongo Semarang. Adv. Eng. Res. 2021, 211, 53–60.
  111. Pathak, H. Chemometric analysis of drinking water quality parameters of Sagar city, Madhya Pradesh, India. Ovidius Univ. Ann. Chem. 2020, 31, 99–105.
  112. Benamar, A.; Mahjoubi, F.Z.; Ali, G.A.M.; Kzaiber, F.; Oussama, A. A chemometric method for contamination sources identification along the Oum Er Rbia river (Morocco). Bulg. Chem. Commun. 2020, 52, 159–171.
  113. Ioele, G.; De Luca, M.; Grande, F.; Durante, G.; Trozzo, R.; Crupi, C.; Ragno, G. Assessment of surface water quality using multivariate analysis: Case study of the Crati River, Italy. Water 2020, 12, 2214.
  114. Miller, T.I. Assessment of water quality using chemometric methods—A case study of Rusałka Lake, NW-Poland. Ecol. Montenegrina 2020, 27, 80–89.
  115. Bam, E.K.P.; Akumah, A.M.; Bansah, S. Geochemical and chemometric analysis of soils from a data scarce river catchment in West Africa. Environ. Res. Commun. 2020, 2, 035001.
  116. Nedyalkova, M.; Simeonov, V. Chemomertic risk assessment of soil pollution. Open Chem. 2019, 17, 711–721.
  117. Dimitrov, D.S.; Nedyalkova, M.A.; Donkova, B.V.; Simeonov, V.D. Chemometric assessment of soil pollution and pollution source apportionment for an industrially impacted region around a non-ferrous metal smelter in Bulgaria. Molecules 2019, 24, 883.
  118. Bellino, A.; Colombo, C.; Iovieno, P.; Alfani, A.; Palumbo, G.; Baldantoni, D. Chemometric technique performances in predicting forest soil chemical and biological properties from UV-Vis-NIR reflectance spectra with small, high dimensional datasets. IForest 2015, 9, 101–108.
  119. Kim, M.J.; Lee, H.I.; Choi, J.H.; Lim, K.J.; Mo, C. Development of a Soil Organic Matter Content Prediction Model Based on Supervised Learning Using Vis-NIR/SWIR Spectroscopy. Sensors 2022, 22, 5129.
Figure 1. The applications of chemometrics in processing analytical big data obtained from various sources.
Figure 2. The number of publications related to chemometrics from 2000 to 2024, according to the Dimensions AI research database.
Figure 3. Chemometrics pipeline overview—when the experimental dataset (n) is large enough (a threshold of 40 is indicated in the scheme), the data can be split into a training set and a test set. On the other hand, when working with small datasets (n < 40), resampling algorithms for cross-validation are preferred over data splitting (adopted from [67], Copyright©Elsevier).
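The decision rule summarized in Figure 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure of [67]: the function name `validation_plan`, the 25% test fraction, the five folds, and the fixed random seed are all assumptions introduced here; only the threshold of 40 samples comes from the caption.

```python
import random


def validation_plan(n_samples, threshold=40, test_fraction=0.25, k_folds=5, seed=0):
    """Return (strategy, splits), where splits is a list of (train, test) index lists.

    Hypothetical helper: only the threshold of 40 follows Figure 3; the
    test fraction, number of folds, and seed are illustrative assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly

    if n_samples >= threshold:
        # Large dataset: a single hold-out split into training and test sets.
        n_test = max(1, round(n_samples * test_fraction))
        return "holdout", [(indices[n_test:], indices[:n_test])]

    # Small dataset: k-fold cross-validation, so every sample is tested once.
    k = min(k_folds, n_samples)
    folds = []
    for i in range(k):
        test = indices[i::k]  # every k-th shuffled index forms one fold
        held_out = set(test)
        train = [j for j in indices if j not in held_out]
        folds.append((train, test))
    return "cross-validation", folds
```

For example, `validation_plan(100)` yields one hold-out split, whereas `validation_plan(20)` yields five folds in which each sample is held out exactly once.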
Figure 4. A multilayer perceptron consisting of input, hidden, and output layers of nodes and the links between them (adapted from [70], Copyright©MDPI).
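To make the structure in Figure 4 concrete, the following Python sketch passes one input vector through a single hidden layer to an output node. The layer sizes, the sigmoid activation, the weights, and the function names are hypothetical choices made for illustration and are not taken from [70].

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    """Linear output activation, as is common for regression."""
    return z


def dense(inputs, weights, biases, activation):
    """One fully connected layer: each node sums its weighted links plus a bias."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input layer -> hidden layer -> output layer."""
    hidden = dense(inputs, hidden_w, hidden_b, sigmoid)
    return dense(hidden, out_w, out_b, identity)


# Illustrative network: 3 inputs, 2 hidden nodes, 1 output (weights are made up).
example_output = mlp_forward(
    inputs=[0.2, 0.5, 0.1],
    hidden_w=[[0.4, -0.6, 0.1], [0.3, 0.8, -0.5]],
    hidden_b=[0.0, 0.1],
    out_w=[[1.2, -0.7]],
    out_b=[0.05],
)
```

In practice the weights are fitted to calibration data by backpropagation; the sketch shows only how a trained network maps inputs to a prediction.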
Figure 5. Application of ANN in environmental odor management (adopted from [74], Copyright©Elsevier).
Figure 6. Network analysis in which yellow, blue, and green signify the three main clusters, corresponding to chemometric methods, instrumental techniques, and applications, respectively (adopted from [67], Copyright©Elsevier).
Figure 7. Source contributions of air pollutants (PM2.5, PM10, CO, NO2, and SO2) obtained from PMF modeling (adopted from [101], Copyright©Frontiers).
Figure 8. Chemometric approaches to analyzing water quality in a complex dataset (adopted from [109], Copyright©Elsevier).
Figure 9. Experimental flow diagram for emerging SOM prediction models (adopted from [119], Copyright©MDPI).
Table 1. The focus, analytical methods, analytes, application, advantages, and limitations of some in situ chemometric methods reported in the literature.
| Focus | Chemometric Approach | Analytical Method | Analytes | Accuracy/Precision & R² | Application | Advantages | Limitations | Ref. |
|---|---|---|---|---|---|---|---|---|
| Monitoring of polymorphic transitions of pharmaceutical compounds | PCA, PLS | In situ Raman spectroscopy and X-ray diffraction | API salts (HCl and maleate) | For HCl salt, RMSECV = 0.056, 0.034, and 0.022; for maleate salt, RMSECV = 0.016 and 0.023 | Real-time monitoring for quality control and stability assessment | Detects multiple polymorphs to enhance calibration for complex systems | Requirements of complex calibration | [31] |
| Real-time monitoring of trace metal contamination | PLS | Microwave spectroscopy and planar sensors | Trace metals (e.g., Pb, Cd, As, and Hg) | R² > 0.96 | Continuous environmental monitoring of water quality | Immediate response, in situ monitoring, and cost-effective | Non-specific metal detection | [32] |
| In situ SPME for untargeted exometabolome screening | PCA, PLS, DA | GC-MS/LC-MS | Primary (amino acids and fatty acids), secondary (alkaloids and terpenes), and pollutants | 80.92–100%; p < 0.05 | Marine health monitoring, chemical ecology, and natural product discovery | Non-invasive and eco-friendly with minimal contamination | Specific for polar compounds | [35] |
| Detection of salt disproportionation | PCA, PLS | Raman spectroscopy and X-ray diffraction | Pioglitazone HCl salt | RMSECV = 0.45; R² = 0.996 | Quality control and stability assessment | Detects minor species, provides spatial distribution, and enhances sensitivity | Spectral interference requires advanced techniques | [36] |
| Disproportionation of the drug HCl salt | NA | NIR and Raman spectroscopy | Avicel and API salt | NA | Formulation stability, drug release, and efficacy | Non-destructive, spatially resolved monitoring, and better process understanding | Requires advanced data analysis | [37] |
| Disproportionation of drugs | NA | ATR-FTIR and Raman spectroscopy | Avicel and API salt | NA | Particularly for salt-based drug delivery systems, drug solubility, and bioavailability | High spatial and chemical specificity | Requires complex data interpretation and spectral deconvolution | [38] |
| Solid-state form conversion within intact pharmaceutical tablets | PCA, PLS | Raman spectroscopy | Pioglitazone hydrochloride salt | RMSECV = 0.506, 0.837; R² = 0.928, 0.981 | Process monitoring, quality control, and regulatory | Reducing surface bias, non-destructive, and rapid | Requires careful calibration and uniform sample density | [39] |
Table 2. Brief descriptions and applications of key technologies for remote environmental monitoring.
| Technology | Description | Example | Ref. |
|---|---|---|---|
| Satellites | Use optical, thermal, or radar sensors to observe Earth from space | Monitoring deforestation, glacial retreat, and land use changes | [41,42] |
| Drones (UAVs) | Provide high-resolution aerial images and multispectral data | Mapping crop health, detecting oil spills, and wildfire monitoring | [43,44] |
| IoT Sensors | Ground-based devices connected via wireless networks | Air/water quality sensors in urban areas or rivers | [45] |
| WSNs | Networks of sensors transmitting environmental data | Forest fire detection and weather stations | [46] |
| Buoys and Floating Platforms | For marine environments | Ocean temperature, salinity, pH, and wave height monitoring | [47] |
Table 3. Satellite types and chemometric methods for environmental monitoring.
| Satellite/Sensor | Environmental Parameters Retrieved | Common Chemometric Methods | Applications | Ref. |
|---|---|---|---|---|
| MODIS | CDOM, SDD, TSS, TP, SSS, DO, BOD, COD | PCA, ANN | Evaluating and quantifying the water quality | [48] |
| ATBM, UAV | Geometric and radiometric information | ANN | Tensor power iteration and detection process | [49] |
| MODIS, MISR | AOD | GWR | PM2.5 estimation | [50] |
| GOSAT | CH4 | XGBoost | Atmospheric profiling and greenhouse gas monitoring | [51] |
| Sentinel 2A | LULC, NDVI | SVM, RF | Water quality assessment, agricultural monitoring, and land cover change detection | [52] |
Table 4. The focus, analytical methods, analytes, application, advantages, and limitations of some selected remote monitoring chemometric methods reported in the literature.
| Focus | Chemometric Approach | Analytical Method | Analytes | Accuracy/Precision & R² | Application | Advantages | Limitations | Ref. |
|---|---|---|---|---|---|---|---|---|
| Development of non-destructive, rapid, and accurate method | HVIs, PCA, PLSR | UV–VIS-NIR-SWIR | Chlorophylls, carotenoids, flavonoids, and lignin | R² > 0.75; p < 0.01 | Real-time plant health monitoring and breeding ecophysiological studies | Non-invasive, rapid, and high throughput | High hyperspectral equipment cost, and complex data analysis | [58] |
| Estimate the concentrations of heavy metals in soil | PLS, PCA | NIRS | Heavy metals (Cd, Cu, Pb, Ni, Cr, Zn, Mn, and Fe) | RMSEP = 9.63, 11.5%; R² = 0.86, 0.58 | Assessing soil contamination | Non-destructive, rapid, cost-effective, and high throughput | Calibration dependency, matrix effects, and limited analyte scope | [59] |
| Detection of highly polar pesticide residues | OPLS, PCA | LC-MS/MS | 50 medium to highly polar pesticides | R² = 0.49–0.73 | Monitor sediment contamination | High sensitivity and selectivity | Requires advanced equipment and technical skill | [60] |
| The study evaluates the use of plant-based ingredients | PLS, PCA, DA | FTIR, UV–Vis, Raman, GC-MS | Plant-derived ingredients and microbial counts | 100, 99.8, 99.6, 96.6 and 93.7%; RMSEP-1.1%; R² = 0.97 | Useful in the industry for producing reduced fat and natural preservation | Reduces fat content, and enhances safety and shelf life | Limited details on the specific plant ingredients | [61] |
| Improving water quality assessment | OPLS, PCA | Flow cytometry | 50 medium to highly polar pesticides | R² = 0.49–0.73 | Early detection of environmental changes, pollution events, and ecosystem health monitoring | Rapid, real-time tracking, and high-sensitivity | Requires specialized equipment and expertise, high cost, and complex data interpretation | [60] |
| Assessing groundwater quality | PCA | WQI, SAR, EC, UV–Vis, IC | Ca, Mg, and Cl | WQI = 17–47%; EC = 17–64% | Water quality assessment | Identification of patterns in water quality data | Depending on the quality of the data available | [62] |
| Heavy metal contamination in the groundwater | HCA, PCA | AAS | Fe, Mn, Pb, Cd, Cr, and As | 34.21–82.97%; p < 0.05 | Assessing health risks and guiding safe water | Identifies health-threatening pollutants | It does not cover all possible contaminants. | [63] |
| Assessing human pharmaceuticals in water | CA, PCA, DA | LC-MS/MS | Nineteen pharmaceuticals | 100%; R² > 0.75; p < 0.05 | River water quality monitoring | Identifies pharmaceuticals | Focus limited to selected pharmaceuticals | [64] |
| Evaluating sediment quality | PCA, DA, PLS, ANN | NA | Heavy metals and polycyclic aromatic hydrocarbons | 92.3–97.2% | Broad use in environmental analysis | Enhances accuracy, and reduces experimental trials | Requires high-quality, representative data | [65] |
| Quantifying inorganic arsenic species | DA, ANN | ETAAS | As(III) and As(V) | 92–98%; 88–91% | Assessment of arsenic contamination | High selectivity and sensitivity | It requires careful pH control and multiple extraction steps. | [66] |
Table 5. Comparative overview of ANN techniques’ performance in some environmental monitoring studies.
Environmental Target | Input Data | ANN Role | Performance Highlights | Ref.
DOC in Wastewater | Fluorescence intensity and UV absorbance | Quantification of DOC | R² = 0.9079; RMSE = 0.2989 mg/L | [71]
Nutrient and COD levels in Rivers | UV–Vis spectral data | Estimation of N, P, COD, and SS | ANN outperformed regression models | [72]
THMs in Water | Physicochemical water parameters | Prediction of THM levels | High accuracy vs. traditional models | [73]
PM10 in the Air | Meteorological variables | Forecasting air pollution | R² = 0.81; RMSE = 7.40 µg/m³ | [75]
Pharmaceutical Degradation in Water | Molecular descriptors | Predicting optimal degradation techniques | Experimentally validated predictions | [76]
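The R² and RMSE values in Table 5 summarize how closely ANN predictions track measured values. The following is a minimal sketch of how such metrics are commonly computed, assuming the coefficient-of-determination definition R² = 1 − SS_res/SS_tot; individual studies may instead report a squared correlation coefficient. The function name `r2_and_rmse` and the sample readings are hypothetical and are not taken from any cited study.

```python
import math


def r2_and_rmse(observed, predicted):
    """Return (R^2, RMSE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


# Hypothetical DOC readings in mg/L (illustrative values only).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2, rmse = r2_and_rmse(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f} mg/L")
```

RMSE carries the units of the measured quantity (mg/L or µg/m³ in Table 5), whereas R² is dimensionless, so the two are best read together.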
Table 6. Comparative overview of the DA techniques used in some environmental monitoring studies.
Environmental Matrix | Chemometric Approach | Type of DA | Key Discriminating Variables | Accuracy/Precision & R² | Main Application | Ref.
Tea leaf samples | Multielemental analysis + chemometrics | LDA | As, K, La, and Pb | 98.9% | Discriminating tea origins based on geochemical fingerprint | [77]
Surface water (lake) | Physicochemical analysis + DA | Not specified | pH, EC, BOD, and TDS | 98.5–100% | Assessment and classification of water quality | [68]
Groundwater | Hydrochemical analysis + multivariate statistics | Not explicitly DA-only | Major ions such as Ca²⁺, Mg²⁺, Cl⁻, NO₃⁻, etc. | R² = 0.62–0.96 | Characterizing groundwater facies and pollution sources | [78]
Phytoplankton (flow cytometry) | DAMACY algorithm (DA-based) | LDA-based (with anomaly detection) | Cell size, fluorescence, and scattering | R² = 0.49–0.73 | Real-time monitoring and early pollution detection | [60]
Teff grain | Multielemental ICP-OES + chemometrics | LDA | Fe, Mn, Zn, Ca, etc. | 96% | Origin authentication of grains from different zones | [79]
Table 7. Comparative analysis of factorial method applications in environmental monitoring.
Environmental Matrix | Main Objective | Factorial Method | Key Findings | Chemometric Contribution | Ref.
Surface water (river) | Assess pollution sources and seasonal changes in water quality | PCA | Identified major pollution sources; separated seasonal trends | PCA revealed key influencing parameters (DO, BOD, NO3, etc.) and anthropogenic vs. natural impact | [80]
Surface water (wadi) | Evaluate spatial variation in water quality in Wadi Hanifa | PCA | Explained 85% of total variance with 3 PCs; salinity and nutrients were the main drivers | PCA helped classify water quality zones and contamination levels | [81]
Groundwater (plain) | Characterize hydrochemical processes and pollution sources | PCA | Identified geochemical processes: silicate weathering, evaporation, and salinization | PCA simplified hydrochemical data into manageable PCs for interpretation | [78]
Sediment core samples | Identify brine layers and geochemical change points | PCA + Change-Point Analysis | Revealed stratification and historical geochemical transitions | PCA reduced complexity; change-point detection linked transitions to salinity | [82]
Soil sample | Determine the pattern of soil moisture and apparent electrical conductivity | PCA | Provided insights into the controlling factors and the major aspects of soil water change responsible for the spatial pattern of soil moisture | The PCs account for 86% of the total variance in the dataset, and all are significant in illustrating the spatial association between the topsoil and its sequential variations in soil moisture | [83]
Air quality data sample (several monitoring stations) | Air quality changes in terms of air pollution | Analysis of variance for functional data (FANOVA) | Significant reduction of NO2 but increased PM10 and PM2.5 in the lockdown period | FANOVA was feasible, allowing comparison of the mean functions and rejection of the null hypothesis of equal mean functions for all contaminants | [84]
Table 8. Applications of chemometric methods for air quality monitoring.
Chemometric Techniques | Analytical Method | Pollutants and Levels | Accuracy/Precision & R² | Ref.
DA, HACA, PCA, ANNs | API | Station 1 (SO2: Maximum 0.1 ppm, Average 0.015 ppm; NO2: Maximum 0.22 ppm, Average 0.053 ppm; O3: Maximum 0.15 ppm, Average 0.034 ppm; CO: Maximum 10.41 ppm, Average 2.138 ppm; PM10: Maximum 806 µg/m³, Average 88.24 µg/m³; API: Maximum 392, Average 57.651); Station 2 (SO2: Maximum 0.06 ppm, Average 0.006 ppm; NO2: Maximum 0.05 ppm, Average 0.022 ppm; O3: Maximum 0.14 ppm, Average 0.045 ppm; CO: Maximum 5.72 ppm, Average 1.393 ppm; PM10: Maximum 640 µg/m³, Average 83.27 µg/m³; API: Maximum 153, Average 51.068); Station 3 (SO2: Maximum 0.02 ppm, Average 0.003 ppm; NO2: Maximum 0.12 ppm, Average 0.012 ppm; O3: Maximum 0.08 ppm, Average 0.024 ppm; CO: Maximum 4.13 ppm, Average 0.978 ppm; PM10: Maximum 411 µg/m³, Average 70.616 µg/m³; API: Maximum 188, Average 41.762) | 87.2%; R² > 0.75; p < 0.05 | [99]
PCA, SPC, ANNs | API | SO2: Maximum 0.084 ppm, Average 0.003 ppm; NO2: Maximum 1.325 ppm, Average 0.013 ppm; O3: Maximum 0.149 ppm, Average 0.022 ppm; CO: Maximum 5.658 ppm, Average 0.702 ppm; PM10: Maximum 438.61 µg/m³, Average 51.866 µg/m³; API: Maximum 323, Average 56.431 | R² = 0.9 | [100]
PCA, PMF | HQ | PM2.5: 65 ppm; PM10: 150 ppm; CO: 35 ppm; O3: 0.12 ppm; NO3: 0.053 ppm; SO2: 0.014 ppm | R² = 0.37; p < 0.05 | [101]
CA, PCA | GC-FID | Anthracene: Maximum 6420 ppm; Phenanthrene: Maximum 13,880 ppm; Fluorene: 5200 ppm; Acenaphthene: 5791 ppm | 99% | [102]
HCA, DA, PCA, MLR | API | O3: Average 0.1 ppm; CO: Average 30 ppm; NO2: Average 0.18 ppm; SO2: Average 0.15 ppm; PM10: Average 120 µg/m³ | 95.38%; p < 0.05 | [103]
PCA, PLS-DA, LDA, AHC | Turnkey dust mate detector, and gas meter | O3: 0.467 ppm; CO: 0.781 ppm; CO2: 0.892 ppm; PM1: 0.798 µg/m³; PM2.5: 0.752 µg/m³; PM10: 0.751 µg/m³ | 89.05%; R² > 0.75 | [104]
PCA, SPC | API | CO: CL 0.631 ppm, Upper control limit (UCL) 0.915 ppm, Lower control limit (LCL) 0.347 ppm, Maximum 37 ppm; PM10: CL 47.304 µg/m³, UCL 68.463 µg/m³, LCL 26.146 µg/m³ | R² = 0.49–1 | [105]
PCA | UV-fluorescence, and Teledyne API-FID | Station 1 (CO: Maximum 4.85 ppm, Average 1.24 ppm; O3: Maximum 0.12 ppm, Average 0.03 ppm; PM10: Maximum 780 µg/m³, Average 81.24 µg/m³; SO2: Maximum 0.13 ppm, Average 0.01 ppm; NO2: Maximum 0.06 ppm, Average 0.02 ppm; CH4: Maximum 9.75 ppm, Average 2.49 ppm; NmHC: Maximum 5.15 ppm, Average 0.055 ppm; THC: Maximum 10.5 ppm, Average 2.96 ppm; API: Maximum 125.88, Average 57.84); Station 4 (CO: Maximum 2.84 ppm, Average 0.86 ppm; O3: Maximum 0.16 ppm, Average 0.04 ppm; PM10: Maximum 202 µg/m³, Average 58.70 µg/m³; SO2: Maximum 0.1 ppm, Average 0.01 ppm; NO2: Maximum 0.06 ppm, Average 0.02 ppm; CH4: Maximum 9.33 ppm, Average 2.91 ppm; NmHC: Maximum 4.81 ppm, Average 0.41 ppm; THC: Maximum 9.6 ppm, Average 3.24 ppm; API: Maximum 158, Average 50.14); Station 7 (CO: Maximum 3.82 ppm, Average 0.99 ppm; O3: Maximum 0.12 ppm, Average 0.02 ppm; PM10: Maximum 760 µg/m³, Average 94.66 µg/m³; SO2: Maximum 0.06 ppm, Average 0.01 ppm; NO2: Maximum 0.06 ppm, Average 0.01 ppm; CH4: Maximum 6.4 ppm, Average 2.24 ppm; NmHC: Maximum 6.17 ppm, Average 0.58 ppm; THC: Maximum 8.2 ppm, Average 2.75 ppm; API: Maximum 151, Average 57.32); Station 10 (CO: Maximum 3.32 ppm, Average 0.57 ppm; O3: Maximum 0.06 ppm, Average 0.02 ppm; PM10: Maximum 357 µg/m³, Average 55.56 µg/m³; SO2: Maximum 0.04 ppm, Average 0.00 ppm; NO2: Maximum 0.04 ppm, Average 0.01 ppm; CH4: Maximum 6.64 ppm, Average 2.2 ppm; NmHC: Maximum 4.54 ppm, Average 0.4 ppm; THC: Maximum 7.6 ppm, Average 2.54 ppm; API: Maximum 97, Average 38.41) | 1%; R² > 0.75; p < 0.05 | [106]
HACA, DA, PCA, FA, MLR | API | LPS region (CO: 0.896 ppm; NO2: 0.939 ppm; SO2: 0.697 ppm; PM10: 0.646 µg/m³; O3: 0.343 ppm; CH4: 0.263 ppm; NO: 0.873 ppm; Non-methane hydrocarbon: 0.887 ppm); MPS region (CO: 0.933 ppm; NO2: 0.733 ppm; SO2: 0.906 ppm; O3: 0.213 ppm; CH4: 0.913 ppm; NO: 0.857 ppm); SHPS region (CO: 0.801 ppm; NO2: 0.747 ppm; SO2: 0.108 ppm; O3: 0.024 ppm; CH4: 0.263 ppm; NO: 0.918 ppm; Non-methane hydrocarbon: 0.218 ppm) | 91.67–97.22% | [20]
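Table 8 reports a center line (CL) with upper and lower control limits (UCL, LCL) for the SPC-based study [105]. The reported limits are symmetric about the CL, which is consistent with limits of the form mean ± k·σ. The sketch below illustrates that idea under the assumption of a Shewhart-style chart with k = 3; the chart type and multiplier actually used in [105] are not stated here, and the function names and readings are hypothetical.

```python
import statistics


def control_limits(values, k=3.0):
    """Return (LCL, CL, UCL) as the mean -/+ k sample standard deviations."""
    if len(values) < 2:
        raise ValueError("need at least 2 baseline readings")
    cl = statistics.fmean(values)
    sigma = statistics.stdev(values)  # sample standard deviation
    return cl - k * sigma, cl, cl + k * sigma


def out_of_control(values, lcl, ucl):
    """Return the indices of readings that fall outside [LCL, UCL]."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical baseline CO readings in ppm (illustrative values only).
baseline = [0.60, 0.65, 0.62, 0.66, 0.61, 0.64]
lcl, cl, ucl = control_limits(baseline)

# Hypothetical new readings checked against the baseline limits.
new_readings = [0.63, 0.90, 0.59]
flagged = out_of_control(new_readings, lcl, ucl)
print(f"LCL = {lcl:.3f}, CL = {cl:.3f}, UCL = {ucl:.3f}, flagged indices = {flagged}")
```

Limits are estimated from a baseline period and then applied to new readings, so a single pollution episode does not inflate the limits it is judged against.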
Table 9. Applications of chemometric methods to determine water quality.
Chemometric Techniques | Analytical Method | Parameters and Concentration | Accuracy/Precision & R² | Ref.
HCA, PCA | pH meter, conductivity meter, spectrophotometry, and flame photometer | pH: Minimum 4.4, Maximum 7.10, Mean 6.49; EC: Minimum 270, Maximum 1870, Mean 893.56; TDS: Minimum 142, Maximum 1720, Mean 536.88; K+: Minimum 9.8, Maximum 99.6, Mean 56.97; Na+: Minimum 5.1, Maximum 59.52, Mean 31.96; Mg2+: Minimum 3.73, Maximum 9.5, Mean 6.02; Ca2+: Minimum 1.3, Maximum 7.37, Mean 4.71; Cl: Minimum 57.6, Maximum 1476, Mean 445.01; SO4: Minimum 0, Maximum 6.1, Mean 2.26; HCO3: Minimum 100.5, Maximum 609, Mean 253; NO3: Minimum 0, Maximum 6.09, Mean 1.36 | 69.9%; R² = 0.849, 0.968; p < 0.05 | [107]
PCA, FA, CA, DA | Conductivity meter, titration, UV-spectrophotometry, and TKN | pH: Minimum 6.8, Maximum 8.3, Mean 7.8; COD: Minimum 40, Maximum 120, Mean 81.6; BOD: Minimum 12, Maximum 48, Mean 22.2; Alkalinity: Minimum 222, Maximum 514, Mean 360; TDS: Minimum 104, Maximum 360, Mean 264; TSS: Minimum 7, Maximum 153, Mean 74.2; SO4-S: Minimum 49.4, Maximum 185.3, Mean 89.9; NO3-N: Minimum 0.1, Maximum 4.1, Mean 1.6; NO2-N: Minimum 0.7, Maximum 45.7, Mean 15.8 | 100%; R² > 0.7 | [108]
PCA, HCA | pH meter, conductivity meter, spectrophotometry, TKN, IC, and ICP-MS | pH: 7.33; EC: 1.03; TDS: 728; BOD: 18; COD: 22; K+: 0.56; Na+: 4.67; Mg2+: 1.72; Ca2+: 3.44; Cl: 3.68; SO4: 2.08; HCO3: 4.38; NO3: 12.55; NH4: 3.31 | 76–81.1%; R² = 0.3–1; p < 0.05 | [109]
PCA, CA | pH meter, conductivity meter, UV–Vis spectrophotometer, and AAS | pH: Minimum 6.83, Maximum 7.63, Mean 7.29; TDS: Minimum 148.5, Maximum 662, Mean 328.3; Fe: Minimum 0, Maximum 0.03, Mean 0.0177; SO4: Minimum 0, Maximum 2, Mean 1; NO3: Minimum 0.6, Maximum 1, Mean 0.833; CaCO3: Minimum 184, Maximum 678.8, Mean 354; Cr: Minimum 0, Maximum 0.05, Mean 0.0267; Zn: Minimum 0.09, Maximum 0.43, Mean 0.2567; CN: Minimum 0, Maximum 0.04, Mean 0.0133; KMnO4: Minimum 2.43, Maximum 3.63, Mean 3.22 | 18.43–81.57%; p < 0.05 | [110]
FA, CA | pH meter, conductivity meter, incubation and titration, argentometric titration, and complexometric titration | pH: Minimum 7.16, Maximum 8.34, Mean 7.86; DO: Minimum 6.7, Maximum 8.8, Mean 7.786; TDS: Minimum 304.3, Maximum 452.8, Mean 348.2; BOD: Minimum 3.2, Maximum 5.8, Mean 4.72; Cl: Minimum 24.61, Maximum 55.85, Mean 36.31; Mg2+: Minimum 18.74, Maximum 119.14, Mean 45.84; Ca2+: Minimum 115.9, Maximum 168.26, Mean 131.05 | 75.4–83.05%; p < 0.05 | [111]
PCA | pH meter, conductivity meter, turbidimetry, nephelometric method, titrimetry, and ICP-OES | pH: Minimum 7.42, Maximum 8.59, Mean 8.21; DO: Minimum 4.62, Maximum 8.8, Mean 7.24; CE: Minimum 856, Maximum 2420, Mean 1827.58; Nitrites: Minimum 0.003, Maximum 2.09, Mean 0.447; Cl: Minimum 134.9, Maximum 724.9, Mean 458.19; NO3: Minimum 4.22, Maximum 13.64, Mean 9.68; Cu: Minimum 0.036, Maximum 0.539, Mean 0.135; Cd: Minimum 0.088, Maximum 0.378, Mean 0.137; Pb: Minimum 0.069, Maximum 0.307, Mean 0.109; Cr: Minimum 0.0143, Maximum 0.278, Mean 0.073 | 84–96%; R² = 0.256–0.989 | [112]
PCA | pH meter, conductivity meter, and GC-MS | Samples–Crati 13 (pH: 8; NH4+: 0.19; N-NO2: 0.06; Al3+: 0.09; As: 0.09; Cr: 0.4; Fe: 36; Hg: 0.3; Ni: 0.5; Pb: 3; B: 5; Se: 0.04) | 29%, 49% | [113]
CA, FA, DA | pH meter, conductivity meter, incubation and titration, and complexometric titration | pH: 7.30–8.96; BOD: 0.6–9; DO: 4.3–16.4; NO3: 0.02–1.5; NO2: 0.006–0.953; NH4: 0.08–2.8; COD: 22; Mg2+: 4–66; Ca2+: 43–253; Cl: 25–91; SO4: 19–185; Pb: 1–13.6; Cd: 1–7 | 76–100% | [114]
Table 10. Applications of chemometric methods to determine soil quality.
Chemometric Techniques | Analytical Method | Parameters and Concentration | Accuracy/Precision & R² | Ref.
HCA, PCA | pH meter and EDXRF | NIST SRM-1646a (Estuarine Sediment): K: 0.67 cg/kg; Ca: 0.456 cg/kg; Fe: 1.743 cg/kg; Ti: 0.556; V: 47.62 mg/kg; Cr: 63.9 mg/kg; Ni: 21.8 mg/kg; Cu: 7.9 mg/kg; Zn: 45.57 mg/kg. IAEA SOIL-7 (Austria): K: 1.34 cg/kg; Ca: 22.75 cg/kg; Fe: 3.28 cg/kg; Ti: 4583.29; V: 68.84 mg/kg; Cr: 123.89 mg/kg; Ni: <22.9 mg/kg; Cu: 14.3 mg/kg; Zn: 104.21 mg/kg | 36.94–85.99% | [115]
HCA, PCA | ICP-MS and UV–VIS spectroscopy | pH: minimum 5.45, maximum 8.25, mean 6.91; N total (mg/kg): minimum 794.66, maximum 2856, mean 1564.65; P total (mg/kg): minimum 268.16, maximum 1920.83, mean 744.87; TC (%): minimum 0.88, maximum 4.42, mean 2.28; TOC (%): minimum 0.93, maximum 3.73, mean 1.98; As (mg/kg): minimum 3.59, maximum 16.63, mean 7.55; Cu (mg/kg): minimum 11.76, maximum 97.42, mean 44.86; Cr (mg/kg): minimum 18.1, maximum 230.42, mean 71.87; Ni (mg/kg): minimum 9.44, maximum 85.19, mean 37.63; Cd (mg/kg): minimum 0.22, maximum 0.63, mean 0.4; Zn (mg/kg): minimum 30.58, maximum 115.03, mean 64.03; Pb (mg/kg): minimum 11.41, maximum 37.42, mean 20.78 | 90–110%; R² = 0.79–0.91 | [116]
CA, PCA | pH meter, ICP-OES, and ETAAS | Location K: Zn (mg/kg): 103.74; Cd (mg/kg): 0.17; Pb (mg/kg): 17.66; Cu (mg/kg): 33.55; Hg (mg/kg): 0.033. Location KRM: Zn (mg/kg): 66.33; Cd (mg/kg): 0.38; Pb (mg/kg): 14.33; Cu (mg/kg): 12.07; Hg (mg/kg): 0.036. Location Kagri: Zn (mg/kg): 384.6; Cd (mg/kg): 2.22; Pb (mg/kg): 19.69; Cu (mg/kg): 8; Hg (mg/kg): 0.106. Location AS: Zn (mg/kg): 409.7; Cd (mg/kg): 0.97; Pb (mg/kg): 31.68; Cu (mg/kg): 25.22; Hg (mg/kg): 0.072. Location PB: Zn (mg/kg): 330.4; Cd (mg/kg): 0.81; Pb (mg/kg): 14.73; Cu (mg/kg): 30.76; Hg (mg/kg): 0.043 | 80%; R² = 0.81–0.93 | [117]
PCA, CA | Potentiometry, UV–Vis-NIR, ICP-OES, and GC | Location F. Sylvatica: pH: 6.07 ± 0.24; TOC (mg/g): 171.30 ± 26.80; Total N (mg/g): 10.32 ± 1.85; Total Ca (mg/g): 46.81 ± 11.68; Total K (mg/g): 11.18 ± 4.08; Total Mg (mg/g): 7.10 ± 2.82; Total Mn (mg/g): 1.35 ± 0.56; Total Na (mg/g): 2.76 ± 1.17; Total Fe (mg/g): 24.49 ± 6.70; Total Al (mg/g): 42.46 ± 15.72; Py-Fe (mg/g): 6.18 ± 1.75; Py-Al (mg/g): 18.02 ± 3.97; Ox-Fe (mg/g): 12.65 ± 2.88; Ox-Al (mg/g): 23.93 ± 7.79. Location Q. Cerris: pH: 6.57 ± 0.23; TOC (mg/g): 80.10 ± 9.70; Total N (mg/g): 6.12 ± 0.71; Total Ca (mg/g): 20.89 ± 12.93; Total K (mg/g): 3.84 ± 1.70; Total Mg (mg/g): 9.33 ± 4.62; Total Mn (mg/g): 1.98 ± 1.34; Total Na (mg/g): 0.14 ± 0.06; Total Fe (mg/g): 22.59 ± 4.72; Total Al (mg/g): 22.23 ± 12.20; Py-Fe (mg/g): 1.95 ± 0.42; Py-Al (mg/g): 2.23 ± 0.69; Ox-Fe (mg/g): 8.33 ± 2.31; Ox-Al (mg/g): 6.45 ± 2.37. Location Q. Ilex: pH: 6.87 ± 0.29; TOC (mg/g): 328.80 ± 50.30; Total N (mg/g): 17.65 ± 3.47; Total Ca (mg/g): 140.06 ± 49.95; Total K (mg/g): 6.25 ± 3.04; Total Mg (mg/g): 15.12 ± 7.51; Total Mn (mg/g): 0.96 ± 0.37; Total Na (mg/g): 1.12 ± 0.69; Total Fe (mg/g): 11.14 ± 2.47; Total Al (mg/g): 25.13 ± 19.75; Py-Fe (mg/g): 1.64 ± 0.56; Py-Al (mg/g): 4.86 ± 2.24; Ox-Fe (mg/g): 5.22 ± 3.24; Ox-Al (mg/g): 13.62 ± 10.35 |  | [118]

Share and Cite

MDPI and ACS Style

Haque, S.M.; Umar, Y.; Kabir, A. Advanced Chemometric Techniques for Environmental Pollution Monitoring and Assessment: A Review. Chemosensors 2025, 13, 268. https://doi.org/10.3390/chemosensors13070268
