Importance of Detailed Soil Information for Hydrological Modelling in an Urbanized Environment

Soil information is critical in watershed-scale hydrological modelling; however, it is still debated which level of complexity the soil data should contain. In the present study, we compared the effect of two levels of soil data on the hydrologic simulation of a mesoscale, urbanised watershed (630 km²) in central South Africa. The first level of soil data, land type (LT) data, is currently the best readily available soil information that covers the whole of South Africa. In the LT database, the entire study area is covered by only two soil types. The second level of soil data (DSM) was created by means of digital soil mapping based on hydropedological principles. It resulted in six different soil types with different hydrological behaviour (e.g., interflow, recharge, responsive). The two levels of soil data were each included in the revised version of the Soil and Water Assessment Tool (SWAT+). To compare the effects of different complexity of soil information on the simulated water balance, the outputs of the uncalibrated models were compared to measurements at the three nested gauging stations of the watershed. For the LT scenario, Kling–Gupta efficiencies (KGE) of 0, 0.33 and −0.23 were achieved at the three nested gauging stations (640 km², 550 km², 54 km²), respectively. Under the DSM scenario, KGE increased to 0.28, 0.44 and 0.43, indicating an immediate improvement of the simulation by integrating soil data with detailed information on hydrological behaviour. In the LT scenario, actual evapotranspiration (aET) was clearly underestimated compared to MODIS-derived aET, while surface runoff was overestimated. The DSM scenario resulted in higher simulated aET and lower surface runoff than the LT scenario. The higher simulation efficiency of DSM in the smaller headwater catchments can be attributed to the inclusion of the interflow soil type, which represents the governing runoff generation process better than the LT scenario.
Our results indicate that simulations benefit from more detailed soil information, especially in smaller areas where fewer runoff generation processes dominate.


Introduction
Soil is a dominant factor in controlling hydrological flowpaths because it partitions precipitation into different components of the water balance. This is due to the ability of soil to store and transmit water [1]. Soil information is therefore an important input into physically based hydrological models [2,3], but it is often not readily available in an appropriate format for modellers to use [4]. Reasons for this are, firstly, that existing soil maps were not primarily produced with hydrological modelling purposes in mind [5], and, secondly, the costs and time involved in measuring important soil hydraulic properties

Study Area
The study area is approximately 630 km² and lies between Johannesburg (the largest city in South Africa) and the capital city, Pretoria (Figure 1). The area lies in the Gauteng Province, which hosts a quarter of the country's population of 59 million and is responsible for generating the majority of the gross domestic product. Due to the economic importance of the area, it is subjected to significant development pressure resulting from urbanisation. The Jukskei River drains the catchment in a northerly direction. The geology of the study site is granite and gneiss of the Lanseria Gneiss of the Johannesburg Dome Granite [20,22], with Leptosols, Plinthosols, Cambisols, Stagnosols and Fluvisols being the dominant reference groups found [23]. The vegetation type is Egoli Granite Grassland, forming part of the Mesic Highveld Grassland Bioregion [24]. More than two-thirds of this vegetation unit has been transformed by urbanisation. The catchment lies between 1245 and 1709 m.a.s.l., on the Highveld of South Africa. The terrain is hilly, with the majority of hillslopes having an average slope of less than 5%. The climate is marked by convectional thunderstorms during summer months (October to April), with an average annual rainfall of around 700 mm. Days are hot during summer (average maximum temperature of around 25 °C) with cold nights during winter (average minimum temperature of approximately 4 °C).

The SWAT+ Model, Model Inputs and Setup
The hydrological model SWAT+ (v 1.2.3) was used for the modelling with QSWAT+ (v. 1.2.2) to set up the watershed. SWAT+ is a revised version of the well-known Soil and Water Assessment Tool (SWAT; [25]). SWAT is a process-based semi-distributed catchment scale model which is widely used to simulate water quality and quantity to predict and assess the impacts of land use, climate change, soil erosion and pollution. As one of the first steps, the model divides the catchment into hydrological response units (HRUs). An HRU is a homogenous area in terms of soils, land use and slope. The model calculates various components of the water balance, such as overland flow, infiltration, lateral flow, percolation, evapotranspiration and discharge to the stream from each HRU. In addition, the model is capable of simulating crop growth and nutrient/pollution fluxes through the landscape. For a more complete description of the SWAT model see [26], and for changes in the SWAT+ version see [27]. Only a few important inputs and processes in the model are discussed here. The model was run from January 2000 until December 2013. The first three years were used as a warm-up period, followed by 11 years of validation. Since the aim of the study was to evaluate the direct contribution of improved soil information to modelling efficiency, we did not include a calibration period.
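The HRU concept described above can be illustrated with a minimal sketch. It assumes that an HRU is simply a unique combination of landscape unit, soil, land use and slope class; the cell values and class names below are hypothetical, and SWAT+ itself applies additional rules when delineating HRUs.

```python
def count_hrus(cells):
    """Count unique (landscape unit, soil, land use, slope class) combinations.

    Simplifying assumption: every unique combination forms one HRU.
    """
    return len(set(cells))


# Hypothetical toy cells: (landscape unit, soil, land use, slope class).
coarse_soil_cells = [
    (1, "LT_a", "urban", "0-5%"),
    (1, "LT_a", "urban", "0-5%"),
    (1, "LT_a", "grass", "0-5%"),
    (2, "LT_b", "grass", ">5%"),
]
# Same cells, but with a more detailed (hypothetical) soil layer.
detailed_soil_cells = [
    (1, "interflow", "urban", "0-5%"),
    (1, "recharge", "urban", "0-5%"),
    (1, "interflow", "grass", "0-5%"),
    (2, "responsive", "grass", ">5%"),
]

n_coarse = count_hrus(coarse_soil_cells)
n_detailed = count_hrus(detailed_soil_cells)
print(n_coarse, n_detailed)
```

Under this assumption, a soil layer with more classes can only keep or increase the number of unique combinations, which is why a more detailed soil map tends to produce more HRUs for the same DEM and land-use layer.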

Topography and Land Use
Elevation was obtained from a 30-m Shuttle Radar Topography Mission Digital Elevation Model (SRTM DEM) ([28]; Figure 2a). The current land use was obtained from the 2013-2014 SA National Land-Cover Map dataset [29]. The land-cover was re-grouped into SWAT land uses with pre-defined parameters for each use (Figure 2b).

Climate Information
Daily rainfall and minimum and maximum temperatures were obtained from two climate stations of the South African Weather Service, namely the Johannesburg Botanical Gardens (BOT) and OR Tambo Airport (INT) (Figure 1). Daily solar radiation, relative humidity and wind speed were obtained from the Climate Forecast System Reanalysis (CFSR) project [30], produced by the National Centers for Environmental Prediction (NCEP). This information was used to calculate daily potential evapotranspiration using the Penman-Monteith approach.

Soil Information
SWAT+ requires a soil dataset as a spatial layer. Details on soil horizons such as depth, particle size distribution, saturated hydraulic conductivity (Ks), bulk density (Db), carbon content and available water capacity (AWC) are required for each layer. The latter is synonymous with the more familiar 'plant available water'.
Two levels of soil information were used in this study (Figure 3). The first was the land type (LT) database [31]. The LT database is the only soil dataset that covers the whole of South Africa at a 1:250,000 scale. A land type is an area with relatively homogenous soil forming factors (climate, geology and topography) resulting in relatively homogenous soil distribution patterns [32]. The LT database is currently the best readily available source of hydrological soil information in South Africa. With the exception of Ks, all the SWAT+ required properties for the LT data are available from [33] and are summarized in Table 1. ROSETTA [34] was used to derive the Ks for different horizons from the texture classes.

Hydrology 2020, 7, x FOR PEER REVIEW
Figure 3. Different levels of soil information used in the study: (a) land type information (LT) obtained from [31] and (b) Digital Soil Map (DSM) data, produced by [17,35], together with locations of representative profiles used to derive hydraulic properties of the soil types.
The second soil dataset was developed through a digital soil mapping approach and is called DSM data for the remainder of this paper. The development of the soil map used in this study is discussed in detail in [17,35]. Here, we provide only the key methodological steps that were followed to create the DSM dataset:
• Environmental covariates (e.g., elevation, slope, topographic wetness index and NDVI) were obtained for the entire Halfway House Granite area (approximately 1050 km²).
• The conditioned Latin hypercube sampling method (cLHS) was used to identify 30 hillslopes which are representative of the entire attribute space. Accessibility of the sites was an important consideration. Landowners are not always willing (1) to allow access to their property or (2) to allow profiles to be dug on their lawns. Large areas of the catchment are also urbanized (Figure 2b) and the surface sealed; this explains the concentration of observation locations in certain areas (Figure 3b).
• A total of 273 soil observations were made with a hand auger (Figure 3b). The soils were classified in accordance with the South African Soil Classification [36] and then regrouped into hydropedological soil types (Table 2; [37]).
• The soil observation database was then divided into training (75%) and evaluation (25%) observations. The soil map was then created in R by running the multinomial logistic regression algorithm (MNLR; [38]) on the training data. The produced map (Figure 3b) had an evaluation point accuracy of 80% and a Kappa statistic value of 0.71 [17], which indicates substantial agreement with reality, and was therefore deemed acceptable for use in the modelling exercise.
• Hydraulic properties for representative profiles were obtained from two consultancy projects in the area [39,40]. Representative profiles (n = 24; Figure 3b) of different hydropedological types were opened. These profiles were typical 'modal' profiles representing the soil forms (Table 2).
• Undisturbed core samples were collected from diagnostic horizons. The core samples were used to determine Db, particle size distribution, and the water retention characteristics using the hanging column method. The double ring infiltration method was used to determine the Ks of diagnostic soil horizons in situ. For more specific details on the sampling strategy and measurement methodology, see [39].
• Lastly, the hydraulic properties used as SWAT+ inputs for the different horizons of the hydropedological soil types were obtained by averaging the property values of the soil forms in the specific hydropedological soil type (Table 2). These values are summarised in Table 1.
In contrast to the LT database, the DSM map shows a greater variety of soils (six vs. two), and there is considerably more detail on the spatial distribution of the soils (Figure 3). The DSM dataset is dominated by the hydropedological classes 'interflow (soil/bedrock)' and 'recharge (deep)'.
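The point accuracy and Kappa statistic used to evaluate the soil map can be computed from a confusion matrix of the evaluation observations. The sketch below is a generic implementation of Cohen's Kappa, not the authors' R workflow; the two-class matrix is hypothetical and does not reproduce the study's evaluation data.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's Kappa) for a square confusion matrix.

    confusion[i][j] is the number of evaluation points observed as class i
    and mapped as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of points on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical evaluation result for two soil classes.
example_matrix = [
    [40, 10],
    [10, 40],
]
example_accuracy, example_kappa = accuracy_and_kappa(example_matrix)
print(example_accuracy, example_kappa)
```

Kappa is lower than the raw accuracy because it discounts the agreement that would be expected by chance alone.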

Validation Data and Statistical Comparison
Streamflow data with long-term measurements were recorded at three weirs in the catchment, managed by the Department of Water Affairs (DWA) (Figure 1). A2H044 drains the entire studied catchment (630 km²), whereas A2H023 and A2H047 drain approximately 550 km² and 54 km², respectively. Daily streamflow was converted to monthly average values for comparison purposes.
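The conversion of daily streamflow to monthly averages can be sketched as follows. The records are hypothetical, and skipping missing days is an assumption of this sketch; the source does not state how gaps in the weir records were handled.

```python
from collections import defaultdict
from datetime import date


def monthly_means(daily):
    """Average daily values per calendar month.

    daily: iterable of (date, value) pairs; value may be None for gaps.
    Returns a dict {(year, month): mean of the available daily values}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily:
        if value is None:  # assumption: missing days are skipped
            continue
        key = (day.year, day.month)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical daily streamflow records.
daily_flow = [
    (date(2003, 1, 1), 2.0),
    (date(2003, 1, 2), 4.0),
    (date(2003, 2, 1), 1.0),
    (date(2003, 2, 2), None),
]
monthly_flow = monthly_means(daily_flow)
print(monthly_flow)
```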
We also compared simulated actual evapotranspiration (S-aET) against aET derived from energy balance modelling using remote sensing data as input. We used data from MOD16 [41,42], which is also based on the Penman-Monteith approach. The 8-day aET was converted to monthly values and averaged for the entire basin using a 'weighted average' approach. This was done by (1) assigning a monthly aET value to each landscape unit (LSU), (2) multiplying this by the fraction of the basin covered by the specific LSU, and (3) adding the aET for all LSUs to get a basin value. The monthly average basin value was compared to monthly S-aET at basin scale.
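The three-step 'weighted average' approach can be written as a short function. This is a minimal sketch; the LSU names, aET values and areas are hypothetical.

```python
def basin_average(lsu_values, lsu_areas):
    """Area-weighted basin average of a per-LSU quantity.

    lsu_values: {lsu_id: monthly aET in mm}
    lsu_areas:  {lsu_id: area, in any consistent unit}
    """
    total_area = sum(lsu_areas[lsu] for lsu in lsu_values)
    # Steps (2) and (3): weight each LSU by its basin fraction, then sum.
    return sum(
        lsu_values[lsu] * lsu_areas[lsu] / total_area for lsu in lsu_values
    )


# Hypothetical monthly aET (mm) and areas (km2) for three LSUs.
aet_mm = {"lsu_1": 60.0, "lsu_2": 40.0, "lsu_3": 50.0}
area_km2 = {"lsu_1": 10.0, "lsu_2": 30.0, "lsu_3": 10.0}
basin_aet = basin_average(aet_mm, area_km2)
print(basin_aet)
```

Because the largest LSU in this example has the lowest aET, the weighted basin value is lower than the simple arithmetic mean of the three LSU values.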

Table 2 (excerpt).
Hydropedological soil type | Soil forms | Reference group | Defining characteristics
Responsive (wet) | Katspruit, Rensburg | Gleysols | Gleyed subsoils indicate long periods of saturation, typical of wetland soils. Soils will respond quickly to rain events and promote overland flow due to saturation excess.
Responsive (shallow) | Mispah, Glenrosa | Leptosols | Shallow soils with bleached colours in the topsoil indicate that underlying bedrock is slowly permeable. Small storage capacity of the soil will quickly be exceeded following rainstorms and promote overland flow generation.
For statistical comparisons, we made use of three widely used statistical indices, namely the coefficient of determination (R²), the root-mean-square error (RMSE) and the Kling-Gupta efficiency (KGE). The latter is calculated using [43]:
$$\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + \left(\frac{\sigma_{sim}}{\sigma_{obs}} - 1\right)^2 + \left(\frac{\mu_{sim}}{\mu_{obs}} - 1\right)^2}$$
where r represents the correlation coefficient, σ_sim and σ_obs the standard deviations of the simulations and observations, respectively, and µ_sim and µ_obs the means of the simulations and observations. KGE = 1 represents a perfect fit and values smaller than −0.41 imply that the mean of the observations provides a better fit than the model [44]. In addition to these statistical indices, we assessed streamflow time series, yearly water balances and spatial aET visually to aid in the discussion of model output.
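A minimal sketch of the KGE, built from the correlation coefficient, the ratio of standard deviations and the ratio of means, is given below. The function names and the example series are illustrative assumptions, not the authors' code. With r = 0, a standard-deviation ratio of 0 and a mean ratio of 1 (the case of using the observed mean as predictor), the expression gives 1 − √2 ≈ −0.41, which is the benchmark value quoted from [44].

```python
import math


def kge_from_components(r, alpha, beta):
    """KGE from correlation r, std ratio alpha and mean ratio beta."""
    return 1.0 - math.sqrt(
        (r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2
    )


def kge(sim, obs):
    """KGE of simulated against observed values (equal-length sequences)."""
    n = len(obs)
    mu_sim = sum(sim) / n
    mu_obs = sum(obs) / n
    # Population standard deviations; the ratio is the same for sample ones.
    sd_sim = math.sqrt(sum((s - mu_sim) ** 2 for s in sim) / n)
    sd_obs = math.sqrt(sum((o - mu_obs) ** 2 for o in obs) / n)
    cov = sum((s - mu_sim) * (o - mu_obs) for s, o in zip(sim, obs)) / n
    r = cov / (sd_sim * sd_obs)
    return kge_from_components(r, sd_sim / sd_obs, mu_sim / mu_obs)


# Hypothetical monthly flows.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = kge(observed, observed)
doubled = kge([2.0 * o for o in observed], observed)
mean_benchmark = kge_from_components(0.0, 0.0, 1.0)
print(perfect, doubled, mean_benchmark)
```

The `doubled` case shows that a perfectly correlated simulation can still score poorly when its variability and mean are both biased, which is why the KGE is reported alongside R².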

Results
The two model set-ups had identical numbers of sub-basins (19) and landscape units (230), because the same DEM was used to delineate these. The number of HRUs was, however, almost twice as high for the DSM simulation as for the LT simulation (2034 and 1132, respectively). This is due to the higher level of detail in the DSM soil map compared to the LT soil map (Figure 3).
For both LT and DSM simulations, baseflow was substantially underestimated in the two larger catchments (Figure 4). The underestimation is more pronounced in the larger (>500 km²) catchments (Figure 4a,b) than in the smaller (54 km²) catchment (Figure 4c). Overestimation of peak flows is associated with both model runs, but is more obvious in the LT simulations than in the DSM simulations. Statistical indices indicate moderate correlations with observed streamflow (R² ≥ 0.6) at all scales for both simulations. The RMSE, however, is high and the KGE relatively low. At all three scales, the DSM simulation performed better than the LT simulation when all three statistical indices were considered. The difference in the KGE was especially worth noting in the smaller catchment (Figure 4c).
When compared to MOD16-derived actual evapotranspiration (aET), both simulations underestimated aET considerably (see deviation from the 1:1 line in Figure 5). The underestimation was larger with the LT soil dataset (Figure 5a) than with the DSM dataset (Figure 5b). These differences are also indicated in the RMSE values of 4.6 and 3.4 for the LT and DSM simulations, respectively. The KGE values were also notably lower for the LT simulation (0.25) than for the DSM simulation (0.4).
The underestimation of S-aET and streamflow is also visible when yearly average water balance components are considered (Table 3). Total streamflow is underestimated by 17% for LT and 38% for DSM simulations for the entire basin (A2H044). This underestimation was more pronounced at weir A2H023, with 18% for LT and 41% for DSM. In the smaller catchment (A2H047), streamflow was overestimated by 30% with the LT simulation and underestimated by 18% under the DSM simulation. For the LT simulation, the streamflow:rainfall ratio is approximately 46%, with 51% of the rainfall contributing to evapotranspiration in the entire catchment (A2H044). The streamflow:rainfall ratio is considerably smaller for the DSM simulation (34%), with a larger contribution to evapotranspiration (64%) for this catchment. The total discharge for the LT simulation is considerably higher than for the DSM simulation at all scales, a difference of 76, 84 and 127 mm year⁻¹ for A2H044, A2H023 and A2H047, respectively. This difference is largely due to four-fold higher simulated overland flow under LT at all scales. The lateral flow component of the DSM simulation is, however, between four and six times higher than for the LT simulation.
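The percentage deviations and streamflow:rainfall ratios quoted above follow from simple ratios of yearly totals. The sketch below shows the arithmetic; the input values are hypothetical and are not taken from Table 3.

```python
def percent_deviation(simulated, observed):
    """Deviation of a simulated total from the observed total, in percent.

    Negative values indicate underestimation, positive overestimation.
    """
    return (simulated - observed) / observed * 100.0


def runoff_ratio(streamflow_mm, rainfall_mm):
    """Share of rainfall that leaves the catchment as streamflow."""
    return streamflow_mm / rainfall_mm


# Hypothetical yearly totals in mm.
deviation = percent_deviation(simulated=166.0, observed=200.0)
ratio = runoff_ratio(streamflow_mm=322.0, rainfall_mm=700.0)
print(deviation, ratio)
```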
Yearly average MOD16-derived aET is 723 mm, which is considerably higher than the rainfall. S-aET is approximately half of the MOD16-derived aET. Differences between S-aET of the different model runs are worth noting. At all scales, LT produced less S-aET than DSM; these differences amount to 24%, 28% and 48% with decreasing catchment size (Table 3).
The variation in lateral flow simulation is remarkable (Figure 6). The majority of LSUs in the LT simulation produced less than 30 mm year⁻¹ (Figure 6a), whereas almost all of the LSUs in the DSM simulation produced more than 90 mm year⁻¹ (Figure 6b). It is also worth noting that there is limited visual correlation between the relatively low- and high-producing LSUs of the LT simulation and the relatively low- and high-producing LSUs of the DSM simulation.

Streamflow Simulations
The streamflow predictions were surprisingly accurate (especially for weir A2H023), considering that the models were not calibrated against measured flow. Similar studies, for example, obtained R² values of between 0.42 and 0.71 [19], 0.15 [20], and between 0.61 and 0.74 [21]. In terms of quantifying overall modelling efficiency, the KGE captures correlation and deviation between simulated and measured values. According to Knoben et al. [44], a KGE greater than −0.41 implies that the model prediction is a better fit than the mean observed value. With this as a benchmark, all the model runs produced 'reasonable' simulations.
There is, however, a clear underestimation of baseflow in the larger catchments (A2H044 and A2H023), which translates into an underestimation of total streamflow (Figure 4, Table 3). This underestimation could be corrected by adjusting groundwater parameters in the model. Decreasing the threshold depth for return flow to occur (GWQMN), decreasing the coefficient of re-evaporation from the groundwater (GW_REVAP) and increasing the threshold depth of the groundwater before re-evaporation occurs (REVAPMN) for different HRUs will likely increase baseflow contributions [45]. Optimising the simulations and evaluating parameter uncertainty could be the objective of a future study (similar to [16]), but, here, we focused on the direct contribution of different soil inputs to the model performance.
From the simulations, it is clear that more detailed soil information provided better simulations, presumably because the governing runoff generation processes are better reflected. The higher R² and, especially, the higher KGE in the large catchments (Figure 4a,b) indicate better modelling performance. Worth noting is that the improved statistical indices associated with the DSM dataset were observed despite greater underestimation of total discharge in the water balance of the larger catchments (Table 3). This is important, implying that a better representation of the overall water balance does not necessarily translate to improved representation of streamflow generation processes. The improved performance can mostly be attributed to improved predictions of peak flows. Higher conductivity (Ks) of surface horizons and deeper soils (Table 1) of the DSM soil inputs will increase infiltration and water storage. Moreover, the textural discontinuity of the interflow soils leads to higher simulated subsurface lateral flows (Figure 3), which also has an effect on the simulated peak flows. The relatively low conductivity and comparatively shallower soils (Table 1) of the LT soil inputs will promote overland flow generation and a quick response to rainfall. This is supported by the summary of the water balance in Table 3, where LT simulations generated four times more overland flow than DSM simulations.
The impact of more detailed soil information is more pronounced in the smaller catchment (Figure 4c, Table 3), with a marked decrease in RMSE and an increase in the KGE from the LT to DSM simulations. The increase in KGE is especially noteworthy: this statistical index improves from −0.23 to 0.43. For the 54-km² catchment, the more detailed soil information improved the model predictions from relatively 'poor' to 'acceptable'. In this catchment, the improvement in the simulations is also visible in the water balance. Total discharge is underestimated by 18% for the DSM simulation (compared to 38–41% for the same soil dataset in the larger catchments), whereas the LT simulations overestimate streamflow by 30%. From our simulations, it appears that detailed soil information becomes more important in smaller areas, where fewer runoff generation processes dominate, a notion supported by [14]. It is also important to recognize that land-use change, such as urbanisation or open-cast mining, is site-specific. Details of where, when and which hydrological processes dominate are therefore vital for planning in smaller catchments. The optimum scale of soil data for different-sized catchments is, however, still not known, and is certainly worth exploring in future.

Groundwater Contributions
The underestimation of baseflow, especially in the larger catchments (Figure 4a,b), could be attributed to groundwater contributions from outside the catchment area. In general, the groundwater level is deeper than 10 m in the study site [22]. The contribution of groundwater (or fractured rock aquifers) is assumed to be significant, especially during low-rainfall months. Based on our simulations, groundwater contributes between 62 and 134 mm of the Jukskei's streamflow per year, i.e., between 18% and 38% of streamflow. It would be important to validate these findings through detailed measurements and dedicated modelling.
The external contribution of groundwater from outside the catchment area could also be linked to the underestimation of S-aET when compared to the MOD16 aET (Figure 5, Table 3). One must keep in mind that MOD16 (or other remote-sensed) aET values are also 'modelled' values. Direct comparisons with simulated values and interpretations in relation to hydrological processes should therefore be made with care. In this study, it appears that MOD16 overestimates aET when compared to rainfall. This is in contrast with another South African study, where MOD16 underestimated aET when compared to measured values [46]. Regardless, the considerably higher MOD16 aET (compared to S-aET) and the underestimation of baseflow (Figure 4), especially during dry months, suggest that streamflow and aET are supported with contributions of water not accounted for in the catchment water balance.
The relatively smaller underestimation of total discharge in the 54-km² catchment with the DSM simulations and overestimation with the LT simulation could be an indication that groundwater only makes a considerable contribution to higher-order streams.

Implications for Management
In the study area, subsurface lateral flow has important implications for urbanisation. The term 'wet basement syndrome' (WBS, [39,47]) refers to the accumulation of water when lateral flowpaths are intersected by foundations. WBS has implications for infrastructural development (e.g., dampness of walls) and graveyards, as well as environmental consequences through the alteration of wetland water regimes. Where wetland waters are supplied through lateral flow, surface sealing or intersection of lateral flowpaths will change the water regimes and, ultimately, the ecosystem services provided by the wetland [48].
Representing the spatial distribution of dominant streamflow generation processes is vital for decision making (e.g., [9,16,19]). Here, we showed how different soil information datasets impact the spatial representation of lateral flow generation. In Figure 6, the relative contributions of lateral flow from 230 different landscape units (LSUs) are presented; they differ not only in terms of the volume of flow generated but also in terms of the spatial distribution of the relatively high- and low-generating LSUs. For decision making purposes, the spatial representation could be refined to HRU scale (2034 units). Such information can help with the planning of development projects, such as the location of subsurface drains, graveyards and the type of foundation to consider in different areas. Without detailed soil information, the importance of lateral flow in this catchment would have been underestimated, and this highlights the role that spatially distributed models can play in the design and evaluation of water management plans [18].

Conclusions
This work presented results from hydrological simulations using two levels of soil inputs. More detailed soil information, developed through advanced digital soil mapping techniques, resulted in more accurate simulations of streamflow when compared to measured values. The improved simulation accuracy was obtained without calibration of the model. This is promising for hydrological modelling in ungauged areas, where long-term streamflow monitoring for calibration is absent. The underestimation of baseflow, especially in the larger catchments, and potential contribution of groundwater from beyond the catchment boundary are modelling aspects that need to be considered in future studies.
In our study, the impact of the improved soil information was more pronounced in the smaller catchment than in the larger ones. The ideal level of detail (or scale) of soil information for hydrological modelling of different-sized catchments remains an important question. It is, however, clear that the SWAT+ model is sensitive to soil inputs, and we argue that the spatial representation of dominant hydrological processes is captured more accurately with more detailed soil information. Every reasonable effort should therefore be made to improve the soil information so that it realistically reflects hydrological processes, in order to improve land use planning, especially in areas earmarked for urbanisation.