Predicting Arsenic (As) Exposure on Human Health for Better Management of Drinking Water Sources

Chemical pollution from both point and non-point sources is common in the transboundary Langat River in Malaysia. As a result, the water treatment plants (WTPs) in the Langat River Basin have experienced frequent shutdown incidents, even though the Langat River is one of the main sources of drinking water for almost one-third of the population of Selangor state. Several studies have also reported a high concentration of arsenic (As) in the Langat River, which is toxic if ingested via drinking water. This is a pioneer study that predicts the As concentration in the Langat River based on time-series data from 2005–2014 in order to estimate the health risk associated with As ingestion via drinking water in the Langat River Basin. Several time-series prediction models were tested, and Gradient Boosted Trees (GBT) gave the best result; the GBT model was therefore used to predict the As concentration until December 2024. The mean concentration of As in the Langat River for both 2014 and 2024, as well as the carcinogenic and non-carcinogenic health risks of As ingestion via drinking water, were within the drinking water quality standards proposed by the World Health Organization and the Ministry of Health Malaysia. However, the ingestion of trace amounts of As over a long period might be detrimental to human health because of its non-biodegradable characteristics. Therefore, it is important to manage the drinking water sources to minimise As exposure risks to human health.


Introduction
The presence of toxic arsenic (As) in the environment, especially in the aquatic environment, is very detrimental to all living organisms due to its persistent and non-biodegradable characteristics [1][2][3][4][5]. Meanwhile, globally, many studies have reported As exposure risks to human health via environmental media, especially As ingestion via drinking water and dietary consumption [4][5][6][7][8][9][10][11][12][13]. Arsenic in the environment is mainly from natural sources such as weathering of mineral rocks that contributed to the hydrochemistry of the river [14][15][16]. Apart from natural sources, industrialization, urbanization, and other anthropogenic activities have also contributed to the As concentration in the environment and enhanced the As exposure risks to human health [17]. In the Langat River Basin in Malaysia, several studies have also reported a high concentration of As in the river both from natural and man-made sources. For instance, the mean concentrations of As were reported in the Langat River by Ahmed et al. [6], 1.65 ± 0.93 µg/L (range 0.33-3.04 µg/L); Aries et al. [14], 11.18 ± 8.29 µg/L (range 1.79-21.48 µg/L); Sarmani [18], 201.11 µg/L (range 90-330 µg/L), and Yusuf [19], 27.50 µg/L. Therefore, examining the water quality in terms of As contamination in the Langat River is very important because it provides drinking water to almost one-third of the population in Selangor state. It is assumed that the health risk of As ingestion via drinking water reduces along with the reduction of As concentration in the drinking water sources. Thus, a prediction study based on the time-series data of As concentration will also contribute to managing the pollution reduction in the Langat River as well as reducing the As exposure risks to human health.
River quality in terms of the Water Quality Index (WQI) improved across Malaysia in 2019. Out of the 672 rivers monitored, 61% showed clean water quality in 2019. The percentages of slightly polluted and polluted rivers were 30% and 9%, respectively [20]. However, the Langat River has many branches in Selangor state, and most of these branches were slightly polluted, i.e., class III, which requires extensive treatment before drinking. The average WQI of the Langat River also ranged from 'slightly polluted' to 'clean' during 2003-2019 [20][21][22][23]. Although the upstream water quality of the Langat River was clean, the physico-chemical parameters, including several metals, had in many cases exceeded the national water quality limits from midstream to downstream. Additionally, the upstream of the Langat River also recorded a high concentration of several metals such as aluminium, arsenic, cadmium, chromium, and lead, mainly from natural sources [6,24,25,26,27]. Consequently, several water treatment plants (WTPs), such as the Sg. Semenyih, Sg. Langat, Cheras Mile 11, Bukit Tempoi, and Salak Tinggi WTPs, experienced frequent shutdown incidents from 2009 to 2020 due to chemical pollution in the Langat River [25,28]. This pollution came from both point and non-point sources, such as effluent from industrial zones and animal husbandry, leachate from landfills, run-off from palm oil plantations and agricultural activities, and frequent flash floods due to climate change [25,29].
The land-use and land cover changes due to rapid urbanization and industrialization in South East Asia have also altered the underground geochemistry and enhanced the mobilization of As [30]. The dominant soil types in the Langat River Basin, such as oxisols and ultisols, are also natural sources of As in the Langat River, apart from the mining and navigation activities and the use of arsenical herbicides in agricultural activities [6,14,18]. Therefore, the leadership role of local governments in enforcing the laws effectively via a functional multi-stakeholder platform is very important for reducing pollution in the river from both point and non-point sources [24,31,32]. Ahmed et al. [26] have also emphasized the capacity-building of relevant stakeholders via awareness-raising, advocacy, and appropriate training using multi-stakeholder platforms for decision making in river basin management, including reducing pollution in line with integrated water resources management (IWRM). However, inadequate collaboration and cooperation among the relevant stakeholders have been a challenge in managing the transboundary Langat River, especially for pollution management [26,31,33,34]. Scientific data and information, such as a prediction model of As pollution in the Langat River, can therefore contribute significantly to the decision-making processes of the local government for pollution reduction in the Langat River, because the local government has the mandate under the 'Local Government Act 1976'. Thus, this study used the time-series data of As concentration in the Langat River (2005-2014) to predict the As concentration until December 2024 and to estimate the As exposure risks to human health via ingestion, in order to suggest effective leadership roles of the relevant stakeholders for better drinking water management.

Study Area
The Langat River Basin is one of the important river basins in the Selangor state of Malaysia and covers an area of approximately 2409 km² [21]. In the Langat Basin, there are two dams, Semenyih and Langat, which are potential reservoirs of drinking water. Apart from these dams, there are many ex-mining ponds all over the basin, especially at Kuala Langat near the Paya Indah Wetlands. The topography of the basin is defined as both mountainous and flat, and the average elevation ranges from 1440-400 m. The elevation of the central basin on average is below 200 m followed by less than 100 m at the lower basin. The rock beneath the Langat River Basin is igneous rock that is mainly granite. Therefore, the geology of the basin is defined as Hawthornden Schist and Kenny Hill Formation (sandstone and phyllite). Tanah Curam

Data Collection
Monthly water quality data for As (µg/L) were collected from the four water quality monitoring stations of the Department of Environment Malaysia (DOE) in the Langat River (Figure 1) from January 2005 to December 2014. Monthly data for the physico-chemical parameters, namely dissolved oxygen (DO, mg/L), pH, temperature (Temp, °C), salinity (SAL, ppt), and dissolved solids (DS, mg/L), were also obtained for 2005-2014 from the same stations. Missing monthly values during 2005-2014 were replaced with the yearly mean value of the respective As or physico-chemical parameter. The mean values of As and the physico-chemical parameters were found to be significant at a 95% confidence level via the one-sample t-test (Table 2). SPSS software (Version 21.0, IBM Corp., Armonk, NY, USA) was used to compute the descriptive statistics of the As concentrations and physico-chemical parameters. The descriptive statistics include the minimum (min.), maximum (max.), mean, and standard deviation (Std. Dev.) of the concentration of the water quality parameters in the Langat River. The standard deviation was calculated to observe the precision of each water quality parameter. Microsoft Excel 2016 (Microsoft Office Professional Plus 2016, Microsoft Corporation, Redmond, Washington, USA) was used to produce the trend graphs of As and the physico-chemical parameters of the Langat River during 2005-2014. Pearson's correlation analysis was applied to estimate the correlations between the As concentrations and the physico-chemical water quality parameters.
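The gap-filling and descriptive statistics described above can be illustrated with a minimal sketch. It assumes that missing months are filled with the mean of the observed months of the same year; the function names and the demo readings are hypothetical and are not data from this study, which used SPSS for these calculations.

```python
from statistics import mean, stdev


def impute_yearly_mean(series):
    """Fill missing monthly values with the mean of the same year.

    `series` maps (year, month) -> concentration, or None if missing.
    Years without any observation are left as None.
    """
    observed_by_year = {}
    for (year, _month), value in series.items():
        if value is not None:
            observed_by_year.setdefault(year, []).append(value)
    yearly_mean = {year: mean(vals) for year, vals in observed_by_year.items()}
    return {
        key: (value if value is not None else yearly_mean.get(key[0]))
        for key, value in series.items()
    }


def describe(values):
    """Minimum, maximum, mean, and sample standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "std_dev": stdev(values),
    }


# Hypothetical monthly As readings (µg/L), for illustration only.
demo = {
    (2005, 1): 2.0, (2005, 2): None, (2005, 3): 4.0,
    (2006, 1): 5.0, (2006, 2): 7.0, (2006, 3): None,
}
filled = impute_yearly_mean(demo)
summary = describe(list(filled.values()))
print(filled)
print(summary)
```

Filling gaps with a yearly mean keeps the annual average unchanged but reduces within-year variability, which is worth keeping in mind when interpreting the standard deviations.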

Prediction Model for Arsenic Concentration
RapidMiner Studio (Version 9.8, RapidMiner, Inc., Boston, MA, USA) was used for the predictive modelling. Six models, namely Generalized Linear Model, Deep Learning, Decision Tree, Random Forest, Gradient Boosted Trees (GBT), and Support Vector Machine (SVM), gave the best results. Of these six models, GBT had the best gain and the best performance. The six steps of the GBT process (details in Supplementary Material: Tables S1-S3, Figures S1 and S2) are generalized in Figure 2 and were run on this data set for quality assurance (QA) and quality control (QC) purposes. Friedman [38] proposed the GBT model in 2001. GBT produces competitive, highly robust, interpretable procedures for both regression and classification, and is especially appropriate for mining less-than-clean data [38,39]. GBT has also performed well in other applications: for instance, the GBT algorithm outperformed other models on both the validation set and the challenge test set in the 2017 Soccer Prediction Challenge [39], and it has been used for multi-solar power forecasting [40] and for clinical mastitis based on a milking data set [41].
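To illustrate the idea behind gradient boosting for regression, the following is a minimal sketch under simplifying assumptions: each tree is a one-split stump, the loss is the squared error, and the predictors are the previous three monthly values. It is not the RapidMiner configuration used in this study, and the series, lag count, learning rate, and function names are all hypothetical.

```python
from statistics import mean


def fit_stump(rows, targets):
    """Find the single split (feature, threshold) with the lowest squared error."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({r[j] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


def stump_predict(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_gbt(rows, targets, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = mean(targets)
    preds = [base] * len(rows)
    stumps = []
    for _ in range(n_trees):
        # Residuals are the negative gradient of the squared-error loss.
        residuals = [t - p for t, p in zip(targets, preds)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, r)
                 for p, r in zip(preds, rows)]
    return base, learning_rate, stumps


def predict_gbt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


def make_lagged(series, n_lags):
    """Turn a series into (previous n_lags values) -> next value pairs."""
    rows = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    return rows, series[n_lags:]


def mse(model, rows, targets):
    return mean((t - predict_gbt(model, r)) ** 2 for r, t in zip(rows, targets))


# Hypothetical monthly As series (µg/L), for illustration only.
series = [3.0, 3.4, 4.1, 3.8, 3.2, 2.9, 3.1, 3.6, 4.0, 3.7, 3.3, 3.0,
          3.1, 3.5, 4.2, 3.9, 3.3, 2.8, 3.0, 3.5, 4.1, 3.8, 3.2, 2.9]
rows, targets = make_lagged(series, n_lags=3)
baseline = fit_gbt(rows, targets, n_trees=0)   # mean-only model
model = fit_gbt(rows, targets, n_trees=50)
print(mse(baseline, rows, targets), mse(model, rows, targets))
```

The comparison printed at the end is a training error only; a fair model comparison would use held-out months, as in the error measures reported in Table S2.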

Human Health Risk Assessment
The following equations of the United States Environmental Protection Agency [42,43] have been used to assess the human health risk of As ingestion via drinking water from the Langat River. These equations have also been used globally by researchers to assess As exposure risks to human health [6,44,45,46,47,48,49].
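As an illustration, the widely used USEPA formulation for ingestion exposure computes a chronic daily intake, CDI = (C × IR × EF × ED) / (BW × AT), a hazard quotient, HQ = CDI / RfD, and a lifetime cancer risk, LCR = CDI × SF. The sketch below assumes this general form; the exposure parameters are hypothetical and are not those applied in this study, and the reference dose and slope factor are the values long cited for inorganic As from the USEPA IRIS database, treated here as assumptions.

```python
def chronic_daily_intake(conc_mg_l, intake_l_day, exp_freq_days_yr,
                         exp_duration_yr, body_weight_kg, averaging_time_days):
    """Chronic daily intake (CDI) in mg/kg/day for drinking water ingestion."""
    return (conc_mg_l * intake_l_day * exp_freq_days_yr * exp_duration_yr) / (
        body_weight_kg * averaging_time_days)


def hazard_quotient(cdi, rfd):
    """Non-carcinogenic risk: a value below 1 is regarded as acceptable."""
    return cdi / rfd


def lifetime_cancer_risk(cdi, slope_factor):
    """Carcinogenic risk as an excess probability over a lifetime."""
    return cdi * slope_factor


# Assumed toxicity values for inorganic As (commonly cited from USEPA IRIS).
RFD_AS = 3.0e-4   # oral reference dose, mg/kg/day
SF_AS = 1.5       # oral slope factor, (mg/kg/day)^-1

# Hypothetical exposure scenario, for illustration only.
conc = 5.0 / 1000.0          # 5 µg/L expressed in mg/L
intake = 2.0                 # L/day
exp_freq = 365               # days/year
exp_duration = 30            # years
body_weight = 60.0           # kg

# Non-carcinogenic effects are averaged over the exposure duration,
# carcinogenic effects over an assumed 70-year lifetime.
cdi_nc = chronic_daily_intake(conc, intake, exp_freq, exp_duration,
                              body_weight, exp_duration * 365)
cdi_c = chronic_daily_intake(conc, intake, exp_freq, exp_duration,
                             body_weight, 70 * 365)
hq = hazard_quotient(cdi_nc, RFD_AS)
lcr = lifetime_cancer_risk(cdi_c, SF_AS)
print(hq, lcr)
```

Because the averaging time differs between the two end points, HQ and LCR do not scale by a single constant, although both are proportional to the As concentration in the water.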

Arsenic Concentration in Langat River
The mean As concentration in the Langat River, 3.73 ± 1.97 µg/L (Table 3), was within the river and drinking water quality standards proposed by the Ministry of Health (MOH), 10 µg/L; the World Health Organization (WHO), 10 µg/L; and the United States Environmental Protection Agency (USEPA), 150 µg/L. However, the range of mean As concentrations, 0.98-12.87 µg/L, showed that the maximum mean As concentration had exceeded the MOH and WHO maximum limit of 10 µg/L. Among all the water-sampling stations, the upstream 1L15 station showed the highest maximum As concentration, 21.94 µg/L, followed by 15.04 µg/L at the 1L05 station. On the other hand, the mean values of the physico-chemical parameters, DO 6.24 ± 0.86 mg/L, pH 12.63 ± 0.67, temp 20.96 ± 1.05 °C, SAL 9.43 ± 3.15 ppt, and DS 53.14 ± 27.86 mg/L, were within the class I category proposed by the Department of Environment [52]. Among the stations, the downstream 1L25 station recorded higher values of pH 29.46 ± 1.74 and SAL 37.46 ± 12.33 ppt than the other stations (Table 3). The decreasing trend (R² = 0.81; Figure 3) of As concentration from upstream to downstream in the Langat River might be due to the higher level of salinity, 37.46 ppt (Table 3), towards downstream (R² = 0.63). The increasing trend of DO (R² = 0.98) might have also enhanced As mobility and its precipitation on sediments along with other salts. Moreover, a moderate negative correlation between As concentration and DO (r = −0.524, p = 4.2 × 10⁻¹⁰) was found in the Langat River at a 99% confidence level (Table 4). Other studies have also reported a lower As concentration downstream than upstream in the Langat River [6,57,58], which might be due to the proximity of the water sampling stations to the sea, i.e., the Straits of Malacca, where the Langat River meets the Indian Ocean. The higher level of salts downstream than upstream increases the mobility of As and its precipitation on the sediments.
However, the higher level of As at the upstream 1L15 station, 4.74 ± 2.87 µg/L (Figure 4), might be due to the natural weathering of arsenopyrite minerals in the Titiwangsa mountain range [6,58]. Man-made activities along with non-point sources of pollution might have also increased the As concentration at the midstream 1L05 station, 4.96 ± 2.57 µg/L. Effluent from sewage treatment plants and industries, leachate from landfills and animal farms, and similar discharges are the point sources of pollution in the Langat River. However, runoff of arsenical herbicides from agricultural and palm oil plantation areas, especially via the frequent flash floods, is the main non-point source of As pollution in the Langat River [6,29,59].
Accordingly, the temporal distribution of As concentration in the Langat River from 2005-2014 (Figure 4) showed a slight increasing trend (R² = 0.10) along with the increasing trend of DS (R² = 0.33). These indicate that higher runoff of nutrients into the Langat River might have been due to the extensive forest and land clearance activities for rapid urbanization and industrialization in the basin [29,31]. Moreover, the Pearson correlation analysis also showed that As concentration in the Langat River has a strong positive correlation with DS (r = 0.704, p = 1 × 10⁻¹³) and a weak positive correlation with temp. (r = 0.229, p = 0.006), both at a 99% confidence level (Table 4).
Table 4. Correlation analysis between As (µg/L) and physico-chemical parameters in the Langat River.
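The two statistics used in this section, Pearson's r between paired parameters and the R² of a linear trend, can be sketched as follows. The function names and the demo values are hypothetical, and the significance test behind the reported p-values is omitted; the sketch relies on the fact that, for a simple linear fit, R² equals the square of Pearson's r between the values and their position in the sequence.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def linear_trend_r2(values):
    """R² of the least-squares line of `values` against their index 0, 1, 2, ..."""
    positions = list(range(len(values)))
    return pearson_r(positions, values) ** 2


# Hypothetical paired readings, for illustration only.
as_conc = [1.0, 2.0, 3.0, 4.0]       # As, µg/L
ds_conc = [1.0, 3.0, 2.0, 4.0]       # DS, mg/L
r_as_ds = pearson_r(as_conc, ds_conc)

# Hypothetical station means ordered from upstream to downstream.
station_means = [4.0, 3.0, 3.5, 2.0]
r2_trend = linear_trend_r2(station_means)
print(r_as_ds, r2_trend)
```

A low R² such as 0.10 means the fitted line explains little of the variation, so the direction of such a trend should be read with caution.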

Prediction Model on Arsenic Concentration in Langat River
The GBT model predicted an As concentration of 3.11 µg/L for December 2024 (Figure 5) based on the monthly time-series data of As concentration (µg/L) from 2005-2014 in the Langat River. The predicted As concentration of 3.11 µg/L for December 2024 is lower than the As concentration of 3.73 ± 1.97 µg/L as of December 2014. This indicates better river basin management practices by all the relevant stakeholders, especially in reducing river pollution. It is also noted that the Academy of Sciences Malaysia (ASM), which is a strategic partner of the government of Malaysia, produced the 'Transforming the Water Sector: National Integrated Water Resources Management Plan, Strategies and Road Map' in 2016. This IWRM plan is currently at the implementation level and could bring good results for river basin management in Malaysia [59].
Ahmed et al. [6] also forecasted an As concentration of 3.45 µg/L for January 2020 in the Langat River, based on monthly data from January 2005 to August 2015, using autoregressive moving average analysis. That forecast was likewise lower than the mean As concentration of 3.55 ± 1.75 µg/L determined in that study for January 2005 to August 2015. Similarly, the forecast As concentration of 3.11 µg/L for December 2024 in this study was lower than the mean As concentration of 3.73 ± 1.97 µg/L from January 2005 to December 2014. Moreover, the As concentration at the four sampling stations in this study, from upstream to downstream of the Langat River, also showed a moderate decreasing trend (R² = 0.57) based on the 2005-2014 data, which might be due to the dissolution of As towards downstream because of the high level of salinity (Figure 6).

Health Risk of Arsenic Ingestion via Drinking Water
This study calculated that long-term As ingestion via drinking water from the Langat River poses no potential carcinogenic or non-carcinogenic health risk. The carcinogenic lifetime cancer risk (LCR) was calculated at 7.10 × 10⁻⁵ for 2014 and 6.55 × 10⁻⁵ for 2024 (Figure 7). An LCR of 10⁻⁵ indicates a risk of one additional occurrence of cancer in one hundred thousand people [42]. Similarly, the non-carcinogenic hazard quotient (HQ) was calculated at 0.35 for 2014 and 0.33 for 2024, both of which are less than 1. Any HQ value less than 1 is considered safe with respect to non-carcinogenic health risk. The previous study also reported no potential carcinogenic (LCR 9.7 × 10⁻⁶) or non-carcinogenic (HQ 4.8 × 10⁻²) health risks of As ingestion via drinking water in the Langat River Basin in 2015. However, an association has been indicated between long-term As ingestion via drinking water and kidney failure and liver cancer in the Langat River Basin [6]. Therefore, further study is required to establish the causal relationship between As ingestion via drinking water and human health risk in the Langat River Basin in order to reduce the health burden.
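The reported risk values can be read on the scale used above, where an LCR of 10⁻⁵ corresponds to one additional cancer case per one hundred thousand people. The short sketch below applies that conversion and the HQ < 1 criterion to the values reported in this section; the function names are hypothetical.

```python
def additional_cases_per_100k(lcr):
    """Expected additional cancer cases per 100,000 people for a given LCR."""
    return lcr * 100_000


def hq_below_threshold(hq, threshold=1.0):
    """True when the hazard quotient is below the non-carcinogenic threshold."""
    return hq < threshold


# HQ and LCR values as reported in this section.
reported = {
    2014: {"hq": 0.35, "lcr": 7.10e-5},
    2024: {"hq": 0.33, "lcr": 6.55e-5},
}

for year, risk in reported.items():
    cases = additional_cases_per_100k(risk["lcr"])
    print(year, round(cases, 2), hq_below_threshold(risk["hq"]))
```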

Conclusions
The As concentration of 3.11 µg/L predicted for 2024 by the GBT model was within the drinking water quality standards of the Ministry of Health (MOH), the World Health Organization (WHO), and the United States Environmental Protection Agency (USEPA). The prediction for 2024 was based on the monthly time-series data from 2005 to 2014. Moreover, the As concentration of 3.73 ± 1.97 µg/L determined for 2014 in the Langat River was also within the drinking water quality standards of MOH, WHO, and USEPA. Accordingly, the carcinogenic and non-carcinogenic health risks of As ingestion via drinking water in the Langat River Basin showed no potential health risks for either 2014 (HQ 0.35; LCR 7.10 × 10⁻⁵) or 2024 (HQ 0.33; LCR 6.55 × 10⁻⁵). Thus, the health risk of As ingestion via drinking water is reduced along with the reduction of As concentration for 2024 compared with 2014. This indicates better management of the Langat River, including pollution reduction via the effective implementation of IWRM policies along with the empowerment of the local government in its leadership roles.
Although the current implementation of policies such as 'Transforming the Water Sector: National Integrated Water Resources Management Plan, Strategies and Road Map' is contributing to better IWRM, the effective implementation of policies should prioritize specific units in IWRM such as the river basin management authority. Because several local authorities function within the Langat River Basin, the river basin management authority could coordinate better with all the local authorities for total water management, including the issues of transboundary river pollution. Moreover, a further study is required to establish the causal relationship between As ingestion via drinking water and human health risk in the Langat River Basin in order to suggest ways of reducing As exposure risks to human health. Hence, extensive studies on water treatment technologies at both the water treatment plant and household water filtration levels are required to ensure a safe drinking water supply in the Langat River Basin.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/ijerph18157997/s1: the six steps of the Gradient Boosted Trees (GBT) model's processes (Figure S1. Overview relative error; Figure S2. Overview runtimes (ms); Table S1. Data used for the Gradient Boosted Trees (GBT) model; Table S2. Comparison of Absolute Error, Root Mean Square, Relative Error, Correlation and Squared Error with Different Models; Table S3. Factors that support and contradict the prediction of each station for the predicted As value for December 2024).