Evaluation and Prediction of the Impacts of Land Cover Changes on Hydrological Processes in Data Constrained Southern Slopes of Kilimanjaro, Tanzania

: This study provides a detailed assessment of land cover (LC) changes on the water balance components on data constrained Kikafu-Weruweru-Karanga (KWK) watershed, using the integrated approaches of hydrologic modeling and partial least squares regression (PLSR). The soil and water assessment tool (SWAT) model was validated and used to simulate hydrologic responses of water balance components response to changes in LC in spatial and temporal scale. PLSR was further used to assess the inﬂuence of individual LC classes on hydrologic components. PLSR results revealed that expansion in cultivation land and built-up area are the main attributes in the changes in water yield, surface runoff, evapotranspiration (ET), and groundwater ﬂow. The study ﬁndings suggest that improving the vegetation cover on the hillside and abandoned land area could help to reduce the direct surface runoff in the KWK watershed, thus, reducing ﬂooding recurring in the area, and that with the ongoing expansion in agricultural land and built-up areas, there will be profound negative impacts in the water balance of the watershed in the near future (2030). This study provides a forecast of the future hydrological parameters in the study area based on changes in land cover if the current land cover changes go unattended. This study provides useful information for the advancement of our policies and practices essential for sustainable water management planning.


Introduction
Anthropic activities such as those leading to extensive land cover changes and climate changes are among the main drivers for changes in hydrological processes of the watershed [1][2][3]. Anthropogenic modification of land use/cover is a topmost determinant of environmental changes at spatial and temporal scales [4,5]. It is a principal determining factor of global environmental changes with severe impacts on human livelihoods [6]. The current rates are higher than ever recorded [7]. Many studies have also shown that land use/cover changes influence the hydrology of watersheds [8][9][10][11]. Thus, evaluating the impact of land cover (LC) and climate changes on water resource availability is an important Earth 2021, 2, 225-247. https://doi.org/10.3390/earth2020014 https://www.mdpi.com/journal/earth challenge for the current hydrological science [12,13]. Further, there is a growing need for a scientific community to balance human needs and environmental sustainability [14]. Thus, understanding the environmental impacts of land use/cover changes is a fundamental part of sustainable land planning and development [15]. In practice, land cover changes affect the surface water balance of an area, thus impacting evapotranspiration, initial surface runoff, and groundwater flow [14,16]. Apart from impacting water resources and hydrologic water balance, LC changes can directly affect local communities, the biota, and possible water management plans [17,18]. The spatial distribution patterns of land use/cover can substantially impact runoff and sediment transport processes at different dimensions [19,20]. It has both local, regional and global occurrence and is reported to continue in the future [14]. It potentially has large impacts on water resources; thus, it is important to understand the possible effects of LC changes on the runoff variability and possible measures [21].
Most places in East Africa have experienced a conversion of natural forests to settlements, urban centers, farmlands, and grazing lands at varying dimensions [22,23]. This conversion creates a need for a balance between the needs for a sprouting population, such as food production, and the minimization of the negative environmental impacts on other ecosystem services such as quality and quantity of water [24]. Although food production requires a sustainable water supply, land use/cover changes resulting from agricultural expansion affect water resources. These changes afflict food production in the long run [25]. Thus, it is essential to manage land use/cover changes at a catchment scale [26]. However, quantifying its impacts is one of the challenging activities [27].
Mt. Kilimanjaro slopes are typical landscapes with the highest recorded land cover dynamics, and their consequences on water resources, food and energy production have been reported in previous studies [28]. Changes in land cover in most of the areas surrounding Mt. Kilimanjaro slopes have the potential to impact water resources [29][30][31][32][33][34]. These changes trigger the need to understand land cover change trajectories and surface-groundwater interaction among the critical requirements in water management practices in the area. Surface-groundwater interaction affects water management and water rights changes, nutrients loading from aquifers to streams, and in-stream flow requirements for aquatic species at a watershed scale [35,36]. The knowledge regarding land use/cover changes in relation to water balance components on the slopes of Mt. Kilimanjaro is of utmost importance due to limited information with regard to groundwater flow [37].
Land degradation and land cover changes have contributed to the decline in surface water resources around Mt. Kilimanjaro [34]. Human activities are reported as the main contributor to the rapid land cover changes around the same area. The activities include encroachment due to logging, agricultural expansion and settlements, which in turn have created significant changes in the land cover [38]. Indeed, the increased anthropogenic activities are mainly driven by the fast-growing population, which also strengthen the pressure on the available water resources on a day to day basis [39]. Other drivers include socioeconomic development and pressure on land for expansion in agriculture [40].
Some of the proposed strategies to curb the expansion of agriculture due to the demand for food to suffice the expanding population on the slopes of Mt. Kilimanjaro include the improvement of the land tenure security and the introduction of modern land and water conservation measures [41]. These strategies are aimed at increasing per capital production and discourage the opening up of more land for agriculture [34]. Shrinking in the forest area along the mountain landscapes has also been reported in several studies [42,43]. However, the quantitative estimates of water losses due to deforestation in the Pangani basin are scanty or missing [44]. Hence, it is essential to understand the effect of land management practices at the basin and sub-basins scale; these practices increase the impact of hydrologic variability on the society and ecosystem [21].
Mount Kilimanjaro is hydrologically important [42,45], as a water tower for the East African region [46]. Local populace on the mountain slopes of North-eastern Tanzania and South-eastern Kenya predominately depends on freshwater supplies for domestic, hy-dropower production, industrial, and irrigation [42,47]. Further, the mountain harbors the most effective water source in the fog interception zone along the thick forest reserve [45]. Therefore, impacting the water balance of Mt. Kilimanjaro will affect the attainment of the local, regional and global sustainable development milestones.
KWK watershed is one of the mountainous watersheds along the southern slopes of Mt. Kilimanjaro and the northern part of the Pangani river basin. Being located on the mountain slopes with greater human activities, KWK experiences tremendous changes in its land cover, thus necessitating quantifying the impacts of land cover changes on the water balance of the watershed for sustainable water resources management. However, despite the growing need, simulation of hydrological processes using water balance components, such as surface runoff in mountainous areas with irregular terrain and complex hydrologic processes, is challenging [48]. Estimating the parameters for the hydrological simulation model (SWAT) is hampered by the variation of temperature and precipitation with elevation and spatial variability due to complex terrain [49].
A modeling approach is typically the best method to assess the impacts of land use/cover change on the water balance. Models can be used to evaluate the historical and future implications of land use/cover changes on the hydrology of a catchment [50]. This study used a soil and water assessment tool (SWAT) for assessing the impacts of land cover changes on the water balance components of the KWK watershed on the southern slopes of Mt. Kilimanjaro. SWAT has been tested and used to solve complex watershed management problems in many regions all over the world [21,47,[51][52][53]. Therefore, the objectives of this paper are three-fold viz: to set up, parameterize and calibrate the SWAT model in terms of streamflow. Further, to investigate the impact of land cover changes on the water balance in historical (1993-2018) and the near future (2018-2030), and to establish the impact of the individual land cover type on the water balance using PLSR modeling. This study provides a comprehensive analysis of the historical and future land cover dynamics and their impacts on the hydrological processes. Hence, the findings of this study are useful for the advancement of our policies, practices and management practices aimed to attain environmental and water resources sustainability. Apart from analyzing the historical impacts of land cover changes on hydrological processes, this study also forecasts the future hydrological parameters in the study area based on changes in land cover if the current land cover changes go unattended. In a broader perspective, the results from this study may help realization of the United Nations sustainable development goals, such as ending poverty (Goal 1), food security (Goal 2), availability and sustainable management of water and sanitation for all (Goal 6), biodiversity conservation (Goal 15) and energy for all (Goal 7) [54].

The Study Area
This study was done at the Kikafu, Weruweru, and Karanga (KWK) watershed, which spans from the thick mountain forest on the southern slopes of Mount Kilimanjaro (Figure 1a). The study area, rainfall pattern (Figure 1b), population and anthropic activities were detailed in Said, Komakech [28] and Said, Hyandye [41]. It is worth mentioning that the largest water extractor is agriculture, being dominated by small to large scale irrigation along the mountain slopes. The primary soil types in the agricultural area are chromic luvisols, Eutic cambisols, haplic nitisols, and humic nitisols (Figure 2b). More than 70% of the watershed falls under 0-8% slope class (Figure 2a) dominated by maize and bean farms; runoff velocity is high in the lowlands due to a comparatively high annual rainfall in the highlands. Therefore, runoff is the most crucial management factor in the watershed.

The General Approach of the Study
This study employed both the historical and the near future (2030) land cover change analysis approach to analyze and simulate water balance changes in the KWK watershed. Several studies suggest that there might be potential impacts of land cover changes on the

The General Approach of the Study
This study employed both the historical and the near future (2030) land cover change analysis approach to analyze and simulate water balance changes in the KWK watershed. Several studies suggest that there might be potential impacts of land cover changes on the hydrology of Mt. Kilimanjaro; nevertheless, there is limited knowledge on this. Thus, with

The General Approach of the Study
This study employed both the historical and the near future (2030) land cover change analysis approach to analyze and simulate water balance changes in the KWK watershed. Several studies suggest that there might be potential impacts of land cover changes on the hydrology of Mt. Kilimanjaro; nevertheless, there is limited knowledge on this. Thus, with a high population rate in the watershed, it was worth analyzing these impacts of land cover changes on hydrological processes. This study is an integrative work done in a geographic information system (GIS) environment and statistical analysis at a watershed scale. Field surveys and interviews were also done to gain community insight on water demand, withdrawal and crop management. Where necessary, precise locations of some of the land features such as forests, water bodies, forests and agriculture fields were determined using a hand-held Garmin Global Positioning System (GPS) for further analysis. ArcSWAT embedded in ArcGIS version 10.5 was used to set up and parameterize the KWK SWAT model. The calibration of the KWK SWAT model was done in SWAT Calibration and Uncertainty Procedures (SWAT CUP) using the river flow gauged data. The land covers were varied in SWAT to establish the relationship between land cover changes and hydrological response in the studied watershed. Thereafter, PLSR was done to determine the relationship between individual land use/cover changes and hydrological response using the XLSTAT add-in tool in Microsoft Excel.

Soil and Water Assessment Tool (SWAT)
Soil and water assessment tool (SWAT) is a popular physical-based hydrological model used to simulate hydrologic processes within the watershed [55]. Since its development by the United States Department of Agriculture (USDA), SWAT has been used to predict the impact of management practices on water, sediments, and agricultural chemical yields in large data-scarce basins [56]. SWAT has been used to assess the impacts of anthropogenic activities that degenerate the natural river basin systems and thus impacting the water balance of a watershed [57,58]. In recent years, SWAT has been widely used to simulate watershed hydrological processes in many countries [21,52,53,59,60].
SWAT operates on a daily time step with optional monthly or annual output. The model divides a watershed into a unique combination of soil and vegetation types which provides the basic unit for computation of flow accumulation. SWAT simulates the hydrological cycle using the water balance equation (Equation (1)) where SW t (mm) is the final soil water content, SW 0 is the initial soil water content on day i, t is the time, R i is the precipitation amount on day i. Q i (mm) is the amount of surface runoff on day i, ET i (mm) is the evapotranspiration (ET) amount on day i. G i (mm) is the amount of water entering the vadose zone from the soil profile on day i, and B i is the amount of return flow on day i. To be able to tell whether the changes were attributed to climate or land use/cover changes, only land use/cover was varied. The land phase of the hydrological cycle is simulated in the soil and water assessment tool (SWAT) based on the water balance equation [61]. SWAT model simulates the surface runoff volumes and peak runoff rates using the Soil Conservation Service (SCS) curve number (CN) [62] for each of the hydrologic response unit (HRU) using daily rainfall data. The HRU analysis in SWAT includes the delineation of HRUs by overlaying the spatial map of slope classes, land use, and soil data. Thus, changes in land use will likely generate HRUs of different pattern; as a result, different hydrological parameters will be simulated.

ArcSWAT Model Input Data
Watershed data, namely soil, land use, slope, and weather data, are required to run the SWAT model. These data are used to define the hydrologic response units (HRU) of the watershed. The detailed information for all data used to prepare the SWAT model for the KWK watershed is summarized in Table 1. The digital elevation model was derived from the 30 m resolution topography data obtained from the Shuttle Radar Topography Mission (SRTM). The digital elevation model (DEM) was used to delineate the watershed, generate a stream network, and provide topographical parameters, such as overland slope and slope length for each watershed. Burning of the digitized stream network from the Google Earth interface was opted to eliminate errors due to DEM elevation generalization. In this option, the stream network was overlaid onto DEM to force alignment of the stream to follow the specific path. From the same elevation data, the slope map ( Figure 2a) was generated using the spatial analysis tool. The slope map was reclassified into five categories, i.e., <0-8%, 8-15%, 15-25%, 25-45% and >45%.
Soil data is essential for hydrological simulations with SWAT. Soil physical and chemical properties (texture, organic carbon, bulk density, soil available water content, hydraulic conductivity and bulk density) in soil layers help as determining factors for surface runoff [5]. Soil data were obtained from the Food and Agriculture Organization (FAO) Harmonized global soils database at http://www.waterbase.org/download_data.html (accessed on 17 May 2021). The watershed boundary was used to extract the soil data from the FAO soil database of the African soils slice. The soil map comprised six categories ( Figure 2b). The attributes of these soils were updated using a "user soil" table from the MapWindow SWAT12 database due to the fact that the "user soil" table of ArcSWAT12 soil database contains USA soils only.
The land cover maps for the years 1993, 2006, and 2018 ( Figure 3) were classified from time-series Landsat satellite images. Further, the near future (2030) land cover map was simulated using the Markov Chain and Cellular Automata (CA) models embedded in Idrisi Selva Software. The details of the land cover classification and CA-Markov validation are not part of the current work. However, it is worth mentioning that the classification accuracy assessment values satisfy the minimum accuracy threshold of 85% needed for efficient and authentic LC change analysis and modeling [63,64]. The assessment results are regarded as acceptable because Kappa values are greater than 80%, as Jensen (2007) stated. This procedure is detailed in Said, Hyandye [41]. These maps were independently used to simulate the hydrological impacts of land cover changes at the KWK watershed.
Precipitation data was obtained from the four ground-based weather stations at Kilimanjaro International airport (KIA), Lyamungu, Moshi Airport, and Kibosho Mission ( Figure 1b). These were combined with the Climate Forecast System Reanalysis (CFSR) global weather data for SWAT. KIA weather station is slightly out of the watershed, but it contains good quality data, and it is located in influential area data that necessitated its inclusion in this study. The ground-based gauging stations data were provided by the Tanzania Meteorological Agency, whereas the CFSR weather data were freely accessed from http://globalweather.tamu.edu/ accessed on 18 October 2019. The use of the CFSR data was due to the unavailability of some parameters for the three stations within the watershed (wind speed, solar radiation, and relative humidity). Precipitation data was obtained from the four ground-based weather stations at Kilimanjaro International airport (KIA), Lyamungu, Moshi Airport, and Kibosho Mission ( Figure 1b). These were combined with the Climate Forecast System Reanalysis (CFSR) global weather data for SWAT. KIA weather station is slightly out of the watershed, but it contains good quality data, and it is located in influential area data that necessitated its inclusion in this study. The ground-based gauging stations data were provided by the Tanzania Meteorological Agency, whereas the CFSR weather data were freely accessed from http://globalweather.tamu.edu/ accessed on 18 th October 2019. The use of the CFSR data was due to the unavailability of some parameters for the three stations within the watershed (wind speed, solar radiation, and relative humidity).
The CFSR of the National Centers for Environmental Prediction (NCEP) provides ready to use weather data with a good resolution between 1979 and 2014 [65]. The CFSR data often captures the rainfall pattern very well; however, it often overestimates the gauged rainfall [66]. Hence, Kilimanjaro and Lyamungo stations were used to perform bias correction of the CFSR precipitation data. Additionally, the two more stations in the highland areas (not shown in Figure 1) were used during bias correction of CFSR precipitation data; the highest station was located at an altitude of around 2200 m a.m.s.l. The bias of the CFSR data was corrected by a linear bias correction approach which is well described by Worqlul, Yen [67]. This approach reduces the volume difference between CFSR and gauged rainfall data while keeping the pattern. The two datasets (uncorrected CFSR and gauged rainfall data) involved in the linear bias correction process covered the same time window . The annual volume difference between the observed and bias-corrected data was minimized to zero.
The river discharge data for model calibration and validation periods with varying time length ranging from 1979 to 2018 for Kikafu, Weruweru and Karanga rivers at their gauging station were obtained from the Pangani Basin Water Organization. River discharge data cleaning revealed two anomalies; the data discontinuity (gaps) which might be due to inadequate gauge reading and or records keeping and abnormally high discharge values. During data cleaning, abnormally high discharge values were removed, and gaps were filled using simple interpolation and linear regression methods [68]. The hydrological years with relatively consistent data series were considered for the model, whereas those with the most significant outliers were eliminated from further application in hydrological modeling. Generally, the gauge stations are located in perennial outlets The CFSR of the National Centers for Environmental Prediction (NCEP) provides ready to use weather data with a good resolution between 1979 and 2014 [65]. The CFSR data often captures the rainfall pattern very well; however, it often overestimates the gauged rainfall [66]. Hence, Kilimanjaro and Lyamungo stations were used to perform bias correction of the CFSR precipitation data. Additionally, the two more stations in the highland areas (not shown in Figure 1) were used during bias correction of CFSR precipitation data; the highest station was located at an altitude of around 2200 m a.m.s.l. The bias of the CFSR data was corrected by a linear bias correction approach which is well described by Worqlul, Yen [67]. This approach reduces the volume difference between CFSR and gauged rainfall data while keeping the pattern. The two datasets (uncorrected CFSR and gauged rainfall data) involved in the linear bias correction process covered the same time window . The annual volume difference between the observed and bias-corrected data was minimized to zero.
The river discharge data for model calibration and validation periods with varying time length ranging from 1979 to 2018 for Kikafu, Weruweru and Karanga rivers at their gauging station were obtained from the Pangani Basin Water Organization. River discharge data cleaning revealed two anomalies; the data discontinuity (gaps) which might be due to inadequate gauge reading and or records keeping and abnormally high discharge values. During data cleaning, abnormally high discharge values were removed, and gaps were filled using simple interpolation and linear regression methods [68]. The hydrological years with relatively consistent data series were considered for the model, whereas those with the most significant outliers were eliminated from further application in hydrological modeling. Generally, the gauge stations are located in perennial outlets with discharge throughout the year, the peak discharge follows the rain season. The discharge exhibits small increase during the short rains between October and December (OND); however, this has not been consistent due to variability in this season. The March-April-May (MAM) rains are the ones producing highest discharge with the highest peak in May. However, calibration was only done at one station in the outlet due to data quality issues.

Model Set-Up and Parameterization
A total of 33 sub-basins and 532 hydrologic response units (HRU) were delineated during the SWAT model set-up and HRU definition processes. The HRU are the areas in the watershed with a combination of unique soil, slope and land cover. This unique combination helps to account for differences in evapotranspiration and other hydrological conditions with different land covers, soils, and slopes [69]. Having achieved the basic operational KWK SWAT model, the model was further edited to include management activities such as crops management schedule season in the watershed. Since it was not possible to estimate the rate of fertilizer applications for various crops as well as quantifying the amount of water used for irrigation, the KWK SWAT model adopted auto fertilization and auto irrigation for the crops.
The KWK SWAT model parameters, such as the potential evapotranspiration, were estimated using the Hargreaves method, while the curve number method was used for the runoff estimation. Other parameters were manually edited before carrying out automatic parameters estimation using the SWAT model calibration and uncertainty procedure (SWAT-CUP) [70]. The manual parameter adjustments were based on the expert knowledge of the watershed physical and hydrogeological characteristics such as river channel width, leaf area index, and soil types.

Automatic Calibration and Validation of the SWAT Model
The SWAT model calibration and uncertainty procedure (SWAT-CUP) was used to calibrate and validate the KWK SWAT model automatically. Model calibration and sensitivity analysis were done at the watershed outlet due to data discontinuity and abnormally high discharge values as compared to precipitation. Simulations that were set up using the 1993 LULC map were used to calibrate monthly streamflow from 1987 to 1993. After calibration, the simulations that were set up using the same LULC map was used to validate the monthly streamflow from 1994 to 2000. Selection of the calibration and validation time i.e., between 1987 and 2000 was based on the quality of the available discharge data, and good data continuity. Sensitivity analysis was carried out in SWAT-CUP, whereby 22 flow parameters were used, and the model was run for 1000 iterations. The significance of the sensitivity of each parameter was determined using t-stat and p-value [71]. The higher the absolute t-stat values among the parameters and the smaller p-values, the higher the sensitivity. Usually, p-values close to 0 and 0.05 are acceptable [72]. The t-stat is used to identify the relative significance of the parameter, whereas the p-value is used to ascertain the sensitivity significance [71].
Generally, the calibration process was based on varying the hydrological parameters iteratively. The agreement between the simulated and observed streamflow was used as a decision tool for the final parameter variation. The model performance was evaluated comparatively using the streamflow hydrograph for simulated and observed streamflow for both calibration and validation periods. Statistical evaluation of the model was done using per cent bias (PBIAS), coefficient of determination (R 2 ), the Nash and Sutcliffe simulation efficiency (NS). R 2 is a measure of the extent of uniformity between observed and simulated data; R 2 ranges from 0 to 1, with higher values indicating high suitability. However, values higher than 0.5 are considered acceptable for simulation [73]. NS show the extent that observed and simulated data fit each other (Nash and Sutcliffe, 1970), where NSE = 1 is the best value. Per cent bias (PBIAS) measures the average tendency of the simulated data to be larger or smaller than their observed values for a given quantity over the calibration or validation period; the value becomes the best as it comes to zero. The RMSE-observations standard deviation ratio (RSR) standardizes RMSE using the observations standard deviation; generally, the best simulation performance is considered to have relatively lower RSR and hence lower RMSE. The details of model evaluation can be found in Moriasi, G. Arnold [74].
The calibrated flow parameters were used to check the model capacity to simulate measured streamflow results. The streamflow validation of the model was done using a new streamflow dataset without any adjustment in the calibrated flow parameters. Evaluation of the model performance was done using PBIAS, R 2 , and ENS, respectively.
The calibrated and validated parameters from SWAT-CUP were returned into the KWK SWAT model. The new parameters from SWAT-CUP were used to either replace or modify the existing values in the SWAT model. This process was accomplished using the Manual Calibration Helper function in the ArcSWAT environment. Updating the SWAT model using the calibrated and validated parameters from SWAT-CUP meant that the KWK SWAT model was ready for use in hydrological processes simulation in the KWK watershed.

Partial Least Squares Regression Analysis
Partial least squares regression (PLSR) is a robust multivariate regression method, especially when there are collinear predictors, high correlated predictors, numerous predictors equal to or higher than observations and many independent variables [75]. Analysis of land cover change impacts on water balance components is widely performed using multivariate regression analyses [5,21,74,75]. PLSR is an extension of a multiple linear regression model. In the simplest form, a linear model specifies the relationship between a dependent variable y, and a set of predictor variables x, as shown in Equation (2). The model parameter is estimated as a slope of a simple bivariate regression between a matrix column or row as the Y-variable and another parameter vector as the X-variable; this is done for each variable [76].
where k 0 is the regression coefficient for the intercept and k i values are the regression coefficients (for variables 1 to n) computed from the land cover change data.
The predictive quality of the model in this study was improved by running series of PLSR models, in each run; the Q 2 cumulated (Q 2 cum ) was used to eliminate variables with the least influence until the largest Q 2 cum was attained. Generally, Q 2 cum above 0.5 is considered to be of good predictive power [74]. The cross-validated root mean squared error (RMSECV) was used to avoid skewness by the data points, especially when there are outliers. The model's predictive power was measured by using the global goodness of fit (R 2 ) and the cross-validated model quality index (R 2 cross ). R 2 is a fraction of variance in the dependent variable, which can be predicted by the model; whereas, R 2 cross measures prediction goodness. Generally, the importance of predictors (for all variables) is measured by the variable importance for the projection (VIP), where the larger the values, the higher the predictor relevance.
In determining the land use types that interact with hydrological components more than the other, the regression coefficients (RCs) and the variable importance of the projection (VIP) were used. VIP values can be used to show a broader expression of the relative importance of predictors [77]. Based on the world's criteria, VIPs are categorized to indicate the importance of a particular predictor in influencing the dependent variables; VIP values > 1 and/or VIP > 0.8 show that the predictor variable is significantly important, i.e., those predictors with large VIP values can best explain the dependent variable [78,79], the values >0.8 are most pertinent, whereas the values <0.5 show insignificance in explaining the variable [74,78,80]. The RCs show the direction and strength of the impact of each independent variable. Generally, a small RC and large VIP shoes the importance of the variable in prediction in that direction, whereas the small RC and small VIP indicate insignificance of the particular variable; thus, it can be omitted from the model. Moreover, PLSR weight (w) provides more precise and reasonable insight into the sign (+ or −) of the weight of the corresponding coefficients in the model. Usually, the squares of w values greater than 0.2 show that the PLSR component is mainly weighted on the corresponding variables. In addition, the w value shows the direction of influence that the LC class has on the corresponding water balance components. i.e., the negative sign means inversely related, whereas a positive sign means it is directly proportional to most of the selected hydrologic components.
PLSR method associates principal component analysis (PCA) and multiple linear regression features [81]. The method is suitable when the predictors show multicollinearity [82,83]. Generally, land use data exhibit collinearity because an increase in the percentage/area of one type will automatically decrease the percentage/area of one or more of the other land use type [84]. Thus, it is appropriate in this study because it eliminates co-dependence among the variables and provides a more unbiased view of the contribution of the changes in water balance components at the watershed scale.
In this study, PLSR modeling was used to determine the association between the simulated hydrological components and land use/cover classes. The land use types are the predictors, and the annual hydrological components are the response feature. These analyses were done using the statistical package for social science (SPSS) version 21 and the XLSTAT add-in tool [50].

Simulation of Impacts of LU Change Scenarios on Hydrological Processes
The calibrated and validated SWAT model was used to simulate the impacts of LU changes on the hydrological processes of the watershed based on historical LU data for the years 1993, 2006 and 2018. The future impacts of the LC change scenario were assessed using the LU map for the year 2030. Both temporal and spatial variation of land cover change scenarios on various watershed water balance components were considered. In this study, the annual average values of the selected hydrological components were used to assess the impact of land cover changes on the hydrology of KWK watershed. The selected hydrologic components were surface runoff (SurfQ), lateral flow (LatQ), groundwater flow (GWQ), actual evapotranspiration (ET), and water yield (WatQ).
The methods and steps applied in this study to assess LUC change impacts on hydrological processes in KWK watershed are summarized in Figure 4. The figure can be split into three major blogs: the SWAT model set up and parameterization (left-hand side), SWAT model calibration and validation, and fine-tuning of the model (middle blog). The last blog (right-hand side) presents the PLSR modeling and assessment of impacts of land use and cover changes on hydrological processes.

Sensitivity Analysis
The sensitivity analysis showed that there are nine most sensitive parameters of the KWK SWAT model that minor changes in their values were found to influence river flow, shown in Figure 5. These parameters were those with p-value ≤ 0.05. The parameters include a_CN2.mgt, v_ALPHA_BF.gw, a_HRU_SLP.hru, v_GWQMN.gw, a_CH_K2.rte, a_SOL_AWC().sol, a_SLSUBBSN.hru, v_GW_REVAP.gw, a_CANMX.hru and SOL_K().sol. The t-statistical values of the parameters are used to rank the parameters in the order of magnitude of their influence on the streamflow. According to Abbaspour [72], the higher the absolute t-test values, and low p-values, the more sensitive the parameter.

Sensitivity Analysis
The sensitivity analysis showed that there are nine most sensitive parameters of the KWK SWAT model that minor changes in their values were found to influence river flow, shown in Figure 5. These parameters were those with p-value ≤ 0.05. The parameters include a_CN2.mgt, v_ALPHA_BF.gw, a_HRU_SLP.hru, v_GWQMN.gw, a_CH_K2.rte, a_SOL_AWC().sol, a_SLSUBBSN.hru, v_GW_REVAP.gw, a_CANMX.hru and SOL_K().sol. The t-statistical values of the parameters are used to rank the parameters in the order of magnitude of their influence on the streamflow. According to Abbaspour [72], the higher the absolute t-test values, and low p-values, the more sensitive the parameter. The information regarding parameters sensitivity analysis is always helpful in making modelers' work easy as it highlights which parameters to fine-tune during calibration of a hydrological model.

Model Parameters, Calibration and Validation
The calibration and validation curves plotted together with mean monthly rainfall are shown in Figure 6. The rainfall and monthly river discharge in Figure 6 show that the model captured well hydrologic processes within the watershed. Generally, the simulation reflected the observed flow, which shows that the model can efficiently simulate the hydrological impacts of land use changes over the 1993-2030 periods.  Figure 6 shows that the simulated flow reflected the observed flow. However, it is worth stating that the model did not mimic very well both low and high flows as indicated

Model Parameters, Calibration and Validation
The calibration and validation curves plotted together with mean monthly rainfall are shown in Figure 6. The rainfall and monthly river discharge in Figure 6 show that the model captured well hydrologic processes within the watershed. Generally, the simulation reflected the observed flow, which shows that the model can efficiently simulate the hydrological impacts of land use changes over the 1993-2030 periods.

Model Parameters, Calibration and Validation
The calibration and validation curves plotted together with mean monthly rainfall are shown in Figure 6. The rainfall and monthly river discharge in Figure 6 show that the model captured well hydrologic processes within the watershed. Generally, the simulation reflected the observed flow, which shows that the model can efficiently simulate the hydrological impacts of land use changes over the 1993-2030 periods.  Figure 6 shows that the simulated flow reflected the observed flow. However, it is worth stating that the model did not mimic very well both low and high flows as indicated Figure 6. Comparison between the simulated and observed monthly discharge of KWK watershed. Figure 6 shows that the simulated flow reflected the observed flow. However, it is worth stating that the model did not mimic very well both low and high flows as indicated in Figure 6. There is a high level of baseflow throughout the year due to springs along the watershed.
The model calibration parameters that were used for calibration and their default range are summarized in Table 2. Parameter values, i.e., NSE, PBIAS, and R 2 were, respectively, estimated as 0.61, 0.59, and 0.68 in calibration, and 0.66, 0.54, and 0.67 during validation as summarized in Table 3.

The Impact of Land Cover Changes on the Hydrology of the KWK Watershed
The influence of land cover changes on the hydrology of a watershed is indicated using changes in vegetation that signify changes in CN in different decades. Decrease in forest area and changes in other vegetation increases changes in CN of a watershed; reports show that increases in built-up area and increase in population decrease vegetation cover [85]. Continuous demand for space and other natural resources can influence people to shift and settle in forested areas, which will result in forest degradation and increased surface runoff in these areas.
Generally, the results of the impact of land cover changes on the hydrology of the KWK watershed showed that the LC fluctuations in the catchment for the past 25 years (1993-2018) impacted the annual water balance component values of a watershed ( Table 4). The results show higher annual variation in GWQ as compared to LatQ and WatQ. The predicted LC changes of 2030 will result in lower values for almost all selected hydrological parameters than in previous years . It is expected that expansion in agricultural land and built-up areas reduces grassland and shrubland and increases SurfQ and streamflow, and consequently, reduces GWQ [86]. Furthermore, the correlation among hydrological components was also studied ( Table 4). The results show a relatively high positive correlation among the hydrological components at varying degrees. Ideally, the results suggest that changes in one hydrological component result in a change in the value of other components.

The PLSR Model Explained Variations of Individual Land Cover Changes on Water Balance
The partial least squares regression (PLSR) model of the hydrological components in the KWK watershed presented in Table 5 indicates a strong correlation between land cover changes and the variations in water balance components. That is, the correlations between the explanatory (X) variable (Land use) and dependent (Y) variables (Water balance components), with the components all close to 1, correspond to the total explained variations in Y and Q 2 cum [21]. Variation in the hydrological components (Cum explained variation in Y (%)) were 94.1%, 96.8%, and 98.6% for component numbers 1, 2 and 3, respectively. These high correlation values indicate that the PLSR model captured well both the X and the Y variables.

Hydrological Impacts of Individual Land Cover Changes on the Selected Water Balance Components
The areal changes under built-up, agriculture, as well as water positively attributed to the changes in evapotranspiration (ET), surface runoff (surfQ), water yield (WatQ), groundwater flow (GWQ) as well as lateral flow (LatQ) from PLSR are presented in Table 6. Although glacier ice is one of the land use types, it is worth mentioning that the contribution of the snow was not taken into account as a contributor to the annual runoff. The motive behind this decision is that previous studies suggest the absence of isotopic signatures in springs and river discharges [37]. Further, the rocky surface was also omitted during VIP tests.
Water balance is and has always been a crucial aspect to grasp in order to handle water management problems effectively [21]. In this study, changes in the area under natural vegetation such as forest, shrubland, wetland, and grassland were negatively correlated with SurfQ at varying degrees (Table 7). In Table 4, these land covers were found to be in the decreasing trend. These correlation results in Table 6 would mean that improving the vegetation cover on the hillside and abandoned land area could help to reduce the direct surface runoff in the KWK watershed. As a result, it will help to reduce flooding recurring in the area and affecting most of the people in the lowlands.

Model Parameters, Calibration and Validation
The high level of baseflow could have resulted from the incapability of the model to capture baseflows. This observation was also reported by Kishiwa (2018). Additionally, the mountainous nature of the study watershed may be attributed to this phenomenon. The parameter values for NSE, PBIAS, and R 2 in Table 3 are 0.61, 0.59, and 0.68, respectively, for calibration; 0.66, 0.54, and 0.67, respectively, during validation showed satisfactory results. The NSE value > 0.5, PBIAS < ±10 < PBIAS < ±15, and R 2 ≥ 0.6 [73,88] indicating that the model is acceptable for the hydrological simulations. In this regard, the calibrated SWAT model used in this study could efficiently and reliably be used in simulating the hydrological impacts of land use changes in the KWK watershed for the 1993-2030 time step.

The Impact of Land Cover Changes on the Hydrology of the KWK Watershed
The results show that land cover has changed throughout the study period within the catchment; previous studies also show that land use has changed on the entire slopes of Mt. Kilimanjaro [29,32,33,89,90]. The possible reasons for the escalation of agricultural land and the built-up area may be due to higher population growth rate and socioeconomic development such as the fair prices for horticultural crops in the lowlands. Thus, more people are engaging in growing crops in the catchment. Further, the influx of people from outside and highland areas has resulted in an increase of cropland and extension of the built-up areas to the lowland areas that were previously considered comparatively dry and less productive. Perhaps we could experience more decrease in the area under forest; however, most of the forest area falls in the upper area of the Kilimanjaro National Park, which is protected by the law. Although, reports still show anthropogenic activities such as logging, forest burning, charcoal making, the establishment of new villages, livestock keeping and cultivation, landslides, and quarrying across the protected forest reserve [91]. Additionally, the conversion of the lower montane forest into coffee plantation are among the factors contributing to forest loss [92].
Despite the conservation activities, it is worth stating that there is still a flimsy decrease in the forest area, especially close to the forest borders and in the lowlands [29]. Thus, observation shows that the magnitude of variation in ET values is small despite the change in LC with time. However, monthly fluctuations in ET support that the model perfectly captures cropping season that is expected to have relatively higher ET values during vegetative stages; although in general agreement with the fact that seasonal crops (agricultural land) have less ET than perennial trees [5,80]. Studies show that evapotranspiration takes a countable amount of water infiltrating the soil in semiarid regions, and effective recharge depends solely on extreme rainfall events [93].
The expansion of agricultural land, built-up areas, and shrinking of grassland [41] may have resulted in increased surface runoff in the KWK watershed, especially in the lowlands. The changes in WatQ as a result of changes in vegetation is due to changes in CN values. CN values increased with an increase in built-up areas, bare agricultural lands and decreased forest areas [85]. Although, the variation in land use results in changes in hydrological processes, the soil types, geological conditions, and slope are among the factors governing the impacts of land use changes on the water balance components [94]. Thus, KWK being located on the mountain slopes is expected to be affected by these factors. Other studies reported a similar trend in the basin. For example, Kishiwa, Nobert [47] predicted an increase in the annual streamflow by 10% in the year 2060 and steady growth in the streamflow annually, taking 2001 as a baseline year.
Generally, in a situation where most vegetation is converted to built-up areas and the expansion in agricultural land, compaction causes lower permeability and storage capacity resulting in a lower infiltration capacity. Thus, transforming a substantial fraction of rainfall into surface runoff. SurfQ and GWQ increases as the areas with cleared vegetation for agriculture and built-up areas increases. A similar observation was reported in South Africa, where SurfQ is higher and GWQ is lower in bare lands [95]; GWQ is higher in the forest and other vegetative places due to increased infiltration into the shallow and deep aquifers. This may be a reason for flood reoccurrence being common in lowland areas, especially at the beginning of the season, where there are no plants in the farms.

The PLSR Model Explained Variations of Individual Land Cover Changes on Water Balance
The PLSR variable importance of the projected values (VIP) and weights of the independent variables are given in Table 7. The highest VIP value was obtained in agricultural land (VIP = 1.57), built-up area (VIP = 1.28), barren land (VIP = 1.13), shrubland and grassland (VIP = 1.10). It can be noted from Table 7 that forest has comparatively lower influence in impacting the selected water balance components (VIP 0.89). Nevertheless, the forest is still important in influencing water balance components, whereas wetland and water were of minor importance. The relative importance of predictors (VIP) shows that water and wetland exert comparatively less significance (VIP less than 0.8). Component 1 was dominated by agricultural land and built-up areas on the positive side, whereas water, forest, barren land, grassland, wetland and shrubland were on the negative side. In component 2, all land use/cover classes were negative sided except forest, which was on the positive side. Component 3 was also dominated by the negative sided classes except for forest and grassland, while the positive side included water, barren land, wetland and shrubland, agricultural land and built-up areas.

Hydrological Impacts of Individual Land Cover Changes on the Selected Water Balance Components
Most of the previous studies have shown that an increase in vegetation cover, particularly forests, leads to decreasing SurfQ and flood occurrence [5,21]. According to [96], clearing of forest by 45% increased the SurfQ by around 40%. Additionally, an increase in runoff due to replacing rangeland through expansion in agricultural land and built-up areas was reported from 2000 to 2013 in the Olifants basin, South Africa [95].
Increased ET and increased runoff may be a result of the conversion of vegetated areas to agriculture and built-up areas. However, the increased agricultural lands lead to higher water abstraction [97]; this might be the reason for the slight decrease in ET in the near future (Table 4). Increased SurfQ due to increased built-up area was reported in some previous reports [98,99]. Therefore, it is envisaged that the observed reduction in ET reflects the conversion to agricultural lands, which increases water use. Additionally, conversion to agricultural land increases the CN value, thus reducing ET. Memarian, Balasundram [15] also observed that an increase in a built-up area and agricultural land increases CN, thus reducing ET.
Further, increase in cultivated lands is at the expense of other vegetation covers; thus, increased runoff was also reported by Gashaw, Tulu [5] in the Andassa watershed, Blue Nile Basin, Ethiopia, Woldesenbet, Elagib [80], and [100] in Rwanda. Contrary to these results, the study by Twisa, Kazumba [53] in the Wami Ruvu basin revealed a negative influence on surface runoff and water yield by natural forest, woodland, bushland, grassland, water, and wetland. Further, the study reported a negative influence of the built-up area on GWQ and ET. At the same time, natural forest, bushland, woodland, grassland, water, and wetland had a positive influence on ET and GWQ. However, KWK being located on a mountainous place with complex terrain is expected to behave differently from other catchments.
Agricultural land and built-up areas are the main attributes in the changes in WatQ, SurfQ, ET, and GWQ; this means that with the ongoing expansion in agricultural land and built-up areas, the water balance of a watershed will be affected negatively. However, it is worth mentioning that groundwater was not used in the study for validation; thus, care must be taken during GWQ results interpretation. The observation supports the report by Anand, Gosain [51], stating that the intensification of urban and cultivated lands is an essential environmental stressor that significantly affects water balance components of a catchment. For instance, an increase in the built-up area contributes to a higher proportion of surface runoff and streamflow and lowers the quantity of groundwater flow [86].
The increase in WatQ observed in 2018 (Table 4) may chiefly be a result of a decrease in ET resulting from changed forest covers [101]. The small increase in water yield as a result of land cover effects has been reported in an East African watershed [102]. Previous studies also reported an increase in the WatQ and decreased ET [103,104]. Additionlly, an increase in SurfQ due to deforestation and vice versa has been reported in a watershed in South China [105].
Increased SurfQ and decreased ET rate observed in Table 4 may equally be attributed to increased built-up areas, and reduction in forest cover. Previous reports that used the SWAT model also reported a similar trend in the Jinjiang catchment, China, due to deforestation and increased built-up areas [106]. Forest degradation and increased built-up areas decrease ET and increase SurfQ in SWAT simulations [107]. Additionally, Sajikumar and Remya [108] used SWAT to assess the impacts of the land cover on runoff; their report shows increased runoff due to changed land cover in Kerala.
Generally, the increase in runoff may imply increasing soil erosion and sedimentation if left unattended. Twisting land use trends to allow more vegetation cover will reduce wet season flow, increase dry season flow, SurfQ, LatQ and GWQ [75]. Techniques such as replacing cereals with fruit trees and intensifying agronomic practices will advocate increasing productivity of small farms and discourage opening up more land for agriculture. Further, using soil and water management practices may be useful to increase vegetation cover at the KWK watershed.
The PLSR results in Table 7 show that all land use/cover classes are important in influencing changes in water balance components, except for shrubland and wetland. Thus, regulation of other LC classes should be implemented to regulate the impacts in the hydrology of the KWK watershed. The current reported rapid population growth, climate change and expansion in agricultural land, built-up areas, and other economic activities can significantly impact the future if the situation goes unattended [109,110]. The PLSR results from this study are suitable for designing and carrying out sustainable natural resources management practices at the basin scale and other similar environments [111]. Moreover, the results from this study show the potential of using a combined PLSR and hydrologic modeling framework for impact studies in data-constrained environments.

Conclusions
In this study, the implication of the hydrology of the KWK watershed due to changes in land cover over the past few decades and in the near future were evaluated using the calibrated SWAT model and PLSR. The KWK watershed has mostly featured transformation in agriculture and built-up area; thus, a forecast for the future hydrological processes based on changes in land cover in the study area was presented in this study. The effect of land cover change resulting from various land use activities such as urbanization and expansion in agricultural activities is evident. Explicitly, the findings of this work show that changes in the water balance components are the function of the land use changes and vegetation distribution within the watershed. The major LULC changes that affected surface runoff and groundwater components in the watershed during the study period were the expansion of agricultural land, built-up area, and shrinking of the grassland.
The cultivation land area is directly proportional to the surface runoff but inversely proportional to the groundwater flow. However, the decline in woody shrub area has the same effect as the expansion of cultivation land on surface runoff and groundwater flows. Farmland expansion has increased surface runoff and water yield while decreasing the groundwater component and actual evapotranspiration in the KWK watershed. Similarly, the decline in woodland coverage resulted in higher surface runoff and water-yield components but decreased groundwater components and actual evapotranspiration.
The surface runoff of the area during the wet season due to land use and land cover changes shows the implications in the water resources, environmental wellbeing and economic activities in downstream users, especially fishing and irrigation activities along the Nyumba ya Mungu Dam. The results suggest the threatening of the supply of ecosystem services, threatening the people living within and surrounding the watershed. Further, the increased surface runoff would trigger flood recurrence, and possibly sedimentation on the lower slopes poses a threat to fishing and agriculture production and hydropower production. The results of this study have contributed to understanding the attribution of land use changes on the hydrologic components. The information generated from this work will help stakeholders and managers to make rational choices regarding land and water resources management. The approach utilized in this work can be applied to the whole basin and other basins to predict the land use changes and associated impacts on water resources.

Limitations of the Study
The analytical power of PLSR is outstanding and useful for data analysis in many dimensions. However, PLSR is not with limitations; one limitation is that it only provides a general insight into the relationships between different predictors (land use types) and water balance components. However, it cannot provide detailed information on the spatial land use patterns [19,77]. The spatial distribution patterns of land use can significantly influence hydrologic processes at different scales. Thus, future research could focus on the relationship between spatial land use patterns and changes in the water balance components. Furthermore, the availability of good quality meteorological and river discharge datasets, and spatial distribution network might have affected the results to some extent. Thus, future studies should focus on generating good quality datasets and improve spatial distribution of ground based meteorological stations.