Surface Formations Salinity Survey in an Estuarine Area of Northern Morocco, by Crossing Satellite Imagery, Discriminant Analysis, and Machine Learning

: The salinity of estuarine areas in arid or semi-arid environments can reach high values, conditioning the distribution of vegetation and soil surface characteristics. While many studies focused on the prediction of soil salinity as a function of numerous parameters, few attempted to explain the role of salinity and its distribution within the soil proﬁle in the pattern of landscape units. In a wadi estuary in northern Morocco, landscape units derived from satellite imagery and naturalistic environmental analysis are compared with a systematic survey of salinity by means of apparent electrical conductivity (Eca) measurements. The comparison is based on the allocation of measurement points to an area of the estuary from Eca measurements alone, using linear discriminant analysis and four machine learning methods. The results show that between 57 and 66% of the points are well-classiﬁed, highlighting that salinity is a major factor in the discrimination of estuary zones. The distribution of salinity is mainly the result of the interaction between capillary rise and ﬂooding by the tides and the wadi. The location of the misclassiﬁed points is analysed and discussed, as well as the possible causes of the confusions.


Introduction
Due to their very low altitude, estuarine areas are outpost regions exposed to climate change, one of the consequences of which is the rise in sea level [1][2][3][4]. They are not only subject to alternating tidal flooding twice a day, the amplitude of which varies according to astronomical cycles [5], but also to freshwater, which depends on the distribution of rainfall locally and in the river catchment. This results in gradients in the spatial distribution, intensity, and duration of flooding, making them extremely dynamic, complex, and particularly interesting environments to study [6].
In arid or semi-arid areas, these estuarine zones can have extreme ranges of soil and groundwater salinity, with salt being concentrated by evaporation during dry periods

Study Area
The study area (35 • 54 to 35 • 65 north and 5 • 87 to 6 • 01 west) is located on the Atlantic coast of the Tangier peninsula in northwestern Morocco. It has an area of about 16 km 2 in the estuary of Oued Tahaddart and its tributary Oued Mharhar. It is a low and marshy alluvial plain surrounded by hills, whose altitude varies from 50 to 228 m ( Figure 1).
The geological context consists mainly of formations of the northwestern Rif related to the two Rifian structural domains, namely, the flysch domain represented by the Numidian, and the external domain by the Tangier unit, the internal Prérif and the Habt [15]. These formations are covered by more recent deposits made up of Neogene post-mantle biocalcarenites, upper Tortonian blue marls, Messinian sandy marls, lower Pliocene quartz sands, then quaternary deposits constituting fluvial terraces and continental accumulation glaciers [15,16]. The marine quaternary is visible along the Atlantic coastline [17,18]. quartz sands, then quaternary deposits constituting fluvial terraces and continental accumulation glaciers [15,16]. The marine quaternary is visible along the Atlantic coastline [17,18].
The climate is typical of the western Mediterranean area, with a short, cold, rainy winter (November to April) and a dry, hot summer (May to September). Annual rainfall is relatively high, ranging from 655.8 to 765.3 mm [19], but irregularly distributed throughout the year with a maximum in December. The average annual temperature is 18.2 °C, with a maximum in August and a minimum in January (27.3 and 14.3°C, respectively) [18]. The flow regime of the Oued Mharhar, with an average discharge of about 1 m 3 /s [20], is controlled by the rainfall regime. The alluvial plains of the wadi are periodically flooded by fresh water during floods, most notably when there is a combination of the arrival of atlantic depressions and high tidal coefficients around the equinoxes. During heavy rainfall events, freshwater flows prevent saltwater upwelling, whereas during dry periods, the wadi channels are invaded by marine waters towards upstream [21,22]. A survey of the study area was conducted based on Google Earth (USGS) images from 2003 to 2021, as well as on field observations, including a detailed floristic survey. Five areas were distinguished, namely: 1. A hilly area covered by xerophilous vegetation where permeable and well aerated sandy soils are developed; The climate is typical of the western Mediterranean area, with a short, cold, rainy winter (November to April) and a dry, hot summer (May to September). Annual rainfall is relatively high, ranging from 655.8 to 765.3 mm [19], but irregularly distributed throughout the year with a maximum in December. The average annual temperature is 18.2 • C, with a maximum in August and a minimum in January (27.3 and 14.3 • C, respectively) [18].
The flow regime of the Oued Mharhar, with an average discharge of about 1 m 3 /s [20], is controlled by the rainfall regime. The alluvial plains of the wadi are periodically flooded by fresh water during floods, most notably when there is a combination of the arrival of atlantic depressions and high tidal coefficients around the equinoxes. During heavy rainfall events, freshwater flows prevent saltwater upwelling, whereas during dry periods, the wadi channels are invaded by marine waters towards upstream [21,22]. A survey of the study area was conducted based on Google Earth (USGS) images from 2003 to 2021, as well as on field observations, including a detailed floristic survey. Five areas were distinguished, namely:

1.
A hilly area covered by xerophilous vegetation where permeable and well aerated sandy soils are developed; 2.
A zone covered by halophytic plants. This zone actually includes several types of halophytic vegetation and several types of soil surface conditions, which vary spatially over short distances; 3.
A non flooded saline bare soil zone; 4.
To this zonation, we must add some information likely to influence the distribution of salinity, such as the outlet of a watershed from the Haouta Ben Mediar hill to the south Soil Syst. 2023, 7, 33 4 of 15 of the area (Arrow on Figure 1), and the presence of two sectors of salt exploitation by pumping the water table, one old, the other still in activity. Field data acquisition was complemented by 60 salinity measurements in the water table, either from hand auger holes in the alluvial plain or from existing wells in the surrounding hills.

Soil Salinity Measurements
Soil salinity was assessed by low frequency electromagnetic induction with a portable electromagnetic conductivity meter type EM38B (Geonics Ltd., Ontario Canada) [14,[23][24][25][26]. This instrument measures the apparent electrical conductivity of the soil (ECa) in millisiemens per meter (mS m −1 ). This cost-effective, noninvasive technique provides a high number of quantitative measurements that may be easily georeferenced and are therefore well suited to assess the temporal and spatial variability of soil salinity [27]. The device was used with both vertical (ECaV) and horizontal (ECaH) modes. Under these conditions, it is estimated that about 75% of the signal comes from the upper 1.8 metre and 1 metre of the soil, respectively [14]. The combination of these two measurement modes (ECaV/ECaH) therefore allows estimating the saline profile, either upward for lower values or downward for higher values [23,28]. The measurement range is 0 to 1000 mS m −1 . For values above 700 mS m −1 , a correlation was established for both, ECaV and ECaH, between measurements at the topsoil and at 0.7 m above ground, which was then applied to estimate ECa values over areas of high salinity. This procedure causes a loss of accuracy both in terms of ECa and of salinity gradient evaluation, but it allows the instrument to remain in a measurement range with a substantially linear response. The 4171 measurement points were georeferenced with an accuracy of about 2 to 3 m using GPS. They cover the whole study area with a variable density according to the lateral gradients of ECa values (Figure 1b), but also with higher density on pilot sites for further research (not presented in this work). The measurement campaign took place from August to September 2021, combining low tidal coefficients and low wadi flow in order to have a maximum surface of exposed soil.

Normality Test
Shapiro-Wilks normality tests were performed on the salinity data, both raw and log-transformed, on the whole data set or separately on each of the five zones. In addition, quantile-quantile (QQ) plots were constructed to visually compare the residuals of each distribution to a normal distribution of the same mean and standard deviation [29]. The diagonal on these plots represents the normal character of the distribution. The closer the points of the studied data distribution are to the diagonal, the closer the distribution is to normality.

ECa Data Spatial Analysis
Sample variograms were calculated to describe the spatial structure of ECa and log(ECa) variables. Variograms were estimated from the formula: where N(h) is the number of pairs of points, and s(x i ) and s(x i + h) are the ECa or log(ECa) values at x i and x i + h. The analysis of the variograms was based on their standard characteristics, i.e., range, sill, and nugget effect. The directional variograms, for which point pairs are only included in the calculation of the semi-variance if they were within a 20 degree angular sector, were calculated to obtain a maximum range. The angle minimising the range was then refined by reducing the sector step to 4 degrees. A model variogram was fitted to the sample variograms using the least squares method. Finally, ordinary kriging was used to construct ECa or log (ECa) distribution maps.

Discriminant Analysis
A discriminant analysis (DA) was performed to check whether each measurement point could be identified as belonging to one of the five zones from the ECa measurements alone. DA consists of finding a discriminant function that projects the points onto a line passing through the centroids of the scatter plots corresponding to each zone in an optimal direction in order to best discriminate these scatter plots. However, this proven method has two drawbacks. The first is the linear nature of the approach, which does not necessarily match the nature of the boundaries between the different zones in the data hyperspace. The second is that the effectiveness of the discrimination is assessed on the same dataset that was used to find the discriminant function, and not on a dataset that is independent of the analysis result. To overcome these two drawbacks, non-linear machine learning methods were also used [30].

Machine Learning
Four widely used machine learning algorithms were compared to the discriminant analysis, namely (1) linear naïve Bayesian classification (NBC), a supervised learning algorithm used for classification, (2) K-nearest neighbours (KNN), a non-parametric, supervised learning classifier, which uses proximity to make classifications or predictions about the grouping of an individual data point, (3) K-means clustering (K-means), a distance-based algorithm that group the closest points to form a cluster, and (4) Gaussian mixture models (GMM), which assume that there are X Gaussian distributions, that each of these distributions represents a cluster and therefore groups the data points belonging to a single distribution. The latter two algorithms use the entire data set and distribute it into a given number of classes (here 5 classes). For the first two algorithms, the dataset was split into a training part and a test part. Fot this, the data were classified according to their electrical conductivity and every second sample (50%), every fourth sample (25%) and three out of four samples (75%) were chosen to form a training dataset, with the rest of the data used for cross-validation, i.e., to check the performance of the method in classifying the measurement points according to their area of belonging. In this way, each subset of the data covers the entire ECa salinity range. The machine learning algorithms used do not have a linearity constraint, i.e., they can accommodate non-linear (curvilinear) boundaries between different areas in the hyperspace of the data, and are therefore likely, if this is the case, to provide better results than discriminant analysis.
All calculations were performed with the XLStat software (addinsoft). A confusion matrix was used to validate the accuracy of the classification obtained by discriminant analysis or machine learning. Particular attention was then paid to the location of misclassified points according to each method used.

Results
Within the alluvial zone, the saline water table varies between 1 and 3 m in depth upstream of the study area, in the area covered by halophyte plants and bare soil, and between 0.5 and 1.5 m downstream, in the areas of periodic or frequent flooding. The ECa range was from 3 to 2842 mS.m −1 and from 1 to 2310 mS.m −1 , with mean values of 1165 and 858 mS.m −1 for vertical (ECaV) and horizontal (ECaH) mode measurements, respectively. The coefficient of variation (Cv) was 58% for ECaV and 60% for ECaH. The other statistical parameters (median, variance, and standard deviation) are presented in Table 1. The normality tests led to the rejection of the hypothesis of a normal distribution, both on the raw and the log-transformed data (Figure 2c,f). However, although the normality test was still negative, the Q-Q plots suggested that when each zone was analysed separated, the distribution of the log-transformed data approached normality as illustrated for the hills area (Figure 2,d) and the halophytes area (Figure 2b,e).
The coefficient of variation (Cv) was 58% for ECaV and 60% for ECaH. The other statistical parameters (median, variance, and standard deviation) are presented in Table 1. The normality tests led to the rejection of the hypothesis of a normal distribution, both on the raw and the log-transformed data (Figure 2c,f). However, although the normality test was still negative, the Q-Q plots suggested that when each zone was analysed separated, the distribution of the log-transformed data approached normality as illustrated for the hills area (Figure 2,d) and the halophytes area (Figure 2b,e).  Histograms for raw and log-transformed data on the whole area are presented in Figure 3. Histograms for raw and log-transformed data on the whole area are presented in Figure 3. The average salinity values on each landscape unit were more or less contrasted. The range of salinity followed the succession hills < halophytes < bare soils < frequently flooded area < periodically flooded area (Table 2). While the hill zone was quite distinct, there was a significant overlap of ECa values for the halophyte zone and bare soil on the one hand, and for both flood zones on the other, both for the vertical (ECaV) and horizontal (ECaH) measurements.  The average salinity values on each landscape unit were more or less contrasted. The range of salinity followed the succession hills < halophytes < bare soils < frequently flooded area < periodically flooded area (Table 2). While the hill zone was quite distinct, there was a significant overlap of ECa values for the halophyte zone and bare soil on the one hand, and for both flood zones on the other, both for the vertical (ECaV) and horizontal (ECaH) measurements. The raw and directional variograms at 58 • and 148 • for the vertical mode measurements ECaV and log (ECaV) are plotted in Figure 4. The variograms show periodic structures, which are consistent with the presence of spots. The smaller range for the 58 • angle and the maximum range for the 148 • perpendicular suggest that these spots were roughly oriented along the wadi axis, i.e., parallel to the hydrological axis of the study area. For a study area of 16 km 2 , the number of 4171 measurements corresponded to one measurement per 3832 m 2 , i.e., on average, one measurement every 62 m. This value was much lower than the range, which was around 500 metres. The map calculation is therefore reliable.
The raw and directional variograms at 58° and 148° for the vertical mode measurements ECaV and log (ECaV) are plotted in Figure 4. The variograms show periodic structures, which are consistent with the presence of spots. The smaller range for the 58° angle and the maximum range for the 148° perpendicular suggest that these spots were roughly oriented along the wadi axis, i.e., parallel to the hydrological axis of the study area. For a study area of 16 km², the number of 4171 measurements corresponded to one measurement per 3832 m², i.e., on average, one measurement every 62 m. This value was much lower than the range, which was around 500 metres. The map calculation is therefore reliable.
(a) (b) The resulting kriging maps of ECaV, ECaH, and ECaV/ECaH data are presented in Figure 5. There was an upstream-downstream salinity gradient, but also a lateral variation with the highest values concentrated at the edge of the alluvial plain, near the contact with the hilly areas. In this sector of highest salinity, the ECaV/ECaH ratio was rather low (1.3 to 1.35), reflecting an upward saline profile and capillary rise. The resulting kriging maps of ECaV, ECaH, and ECaV/ECaH data are presented in Figure 5. There was an upstream-downstream salinity gradient, but also a lateral variation with the highest values concentrated at the edge of the alluvial plain, near the contact with the hilly areas. In this sector of highest salinity, the ECaV/ECaH ratio was rather low (1.3 to 1.35), reflecting an upward saline profile and capillary rise.
The results of the discriminant analysis and the confusion matrix in the classification of the points according to zones are presented in Tables 3 and 4. The best rate of well-classified points was obtained by considering the raw data (ECaV and ECaH), and the logarithm of the conductivity ratio log(ECaV/ECaH), which contains information on the vertical salinity gradient within the soil profiles. Under these conditions, the rate of well-classified points was 62.29%. Table 3. Rate of points well-classified according to data conditioning (raw or log-transformed) and according to the consideration of the salinity gradient within the soil profiles (ECa ratio).

ECaV ECaV, ECaH ECaV, ECaH Log(ECaV) Log(ECaV), Log(ECaH) And
And  The results of the discriminant analysis and the confusion matrix in the classification of the points according to zones are presented in Tables 3 and 4. The best rate of wellclassified points was obtained by considering the raw data (ECaV and ECaH), and the logarithm of the conductivity ratio log(ECaV/ECaH), which contains information on the vertical salinity gradient within the soil profiles. Under these conditions, the rate of wellclassified points was 62.29%. Table 3. Rate of points well-classified according to data conditioning (raw or log-transformed) and according to the consideration of the salinity gradient within the soil profiles (ECa ratio).

ECaV
ECaV  The points from the frequent flooding zone had the highest rate of good classification (87.56%). The hilly area also showed a good result, with 78.17% of well-classified points. This is the area with the lowest average salinity. On the other hand, the rate of well-classified observations was lower for the other zones, with many confusions between the bare soil zone and the halophyte zone, and a very poor discrimination of the sector with periodic flooding, with only 20.18% of well-classified points. The distribution of the misclassified points is presented in Figure 6. The points misclassified as belonging to the hill zone were mainly located in the north of the study area, i.e., at the immediate edge of the hill zone, in the first 300 m of the halophyte zone with lower salinity (Figure 6a). Points misclassified as belonging to the halophyte zone were mainly located in the bare soil zone, or along the steep salinity gradient between the hill zones and the other higher salinity zones. A similar distribution was observed for points misclassified as bare soil, mainly confused with the halophyte zone, but also numerous confusions with areas of periodic or frequent flooding. A concentration of confusions could be noted at the outlet of the watershed from the Haouta Ben Mediar hill (red arrow on Figure 6a). Concerning the flood zones, the points misclassified as belonging to the frequent flood zone were located on the former salt exploitation sector, or accompanied the drainage axes towards the halophyte zone. Again, a concentration of misclassified points accompanied the steep salinity gradient between the hills and the periodically flooded area.
Similarly, the results from the machine learning analysis are presented in Table 5. The best results were obtained with the Bayesian naive classification and 5 nearest neighbours algorithms and considering 75% and 25% of the points for training and testing, respectively (Figure 7). On the maps in Figure 7, the density of misclassified points was lower than in Figure 6 as the test sample was only 1042 points instead of the whole dataset for the other algorithms. The results presented by these two methods give very similar results, not only in terms of the percentage of well-classified points, but also in the detail of their location. There were many confusions along the high salinity gradient line, the points classified as belonging to the hilly area were mostly concentrated in the northern part of the halophyte area. The points classified as frequently flooded areas were mainly observed in the former salt exploitation area, and a few other misclassified points at the outlet of the Haouta Ben Mediar basin.    (a) (b) Figure 6. Position of points misclassified by the discriminant analysis: (a) misclassified as hills, halophytes and bare soils; (b) misclassified as periodic and frequent flooding. Table 5. Results of machine learning analysis using Gaussian mixture modelling (GMM), K-means clustering (K-means), K-nearest neighbors (KNN), and naive Bayesian clustering (NBC).

A relevant Zonation
The results show the existence of landscape units, each with its own distribution of ECa values, a phenomenon already observed in estuarine environments [31]. Indeed, the fact that the distribution of values is multimodal can be explained by the existence of several zones, each with its own distribution law, which is confirmed by distributions closer to normality when the zones are analysed separately (Figure 2). The hilly areas show a high ECaV/ECaH ratio (Figure 5c), which reflects a downward saline profile, and thus a leaching of salts to depth facilitated by the sandy soil texture. The other areas, on the other hand, have much lower ECaV/ECaH values, particularly the flooded areas. Due to their position further downstream, i.e., closer to the ocean outlet, the water table is close to the soil surface and capillary rise is important and contributes to an upward saline profile. In the periodic flooding zone, the redistribution of salts under the influence of inundation is less frequent, which confers a very high topsoil salinity. It is precisely in this area that people were building salt works for the exploitation of salt for several centuries.

An Effective Discrimination
Many studies aim at predicting salinity by crossing a set of satellite data combining biophysical indicators of vegetation, surface condition, and field measurements [32][33][34], but few studies focused on the importance of salinity in the zonation of natural or anthropised environments. In our estuary area, the results of the discriminant analysis show that about 62% of the points can be correctly located as belonging to one of the five zones based on the salinity values and the salt profile. The NBC and KNN machine learning algorithms perform slightly better. These results are remarkable considering that only two parameters ECaV and ECaH were measured, representing the salinity at different depths, but which are themselves highly correlated with each other (r² = 0.987). As a result, the information contained in the dataset is poor in terms of diversity, being limited to 1 or 2 parameters. These conditions are outside the usual framework of application of machine

A relevant Zonation
The results show the existence of landscape units, each with its own distribution of ECa values, a phenomenon already observed in estuarine environments [31]. Indeed, the fact that the distribution of values is multimodal can be explained by the existence of several zones, each with its own distribution law, which is confirmed by distributions closer to normality when the zones are analysed separately ( Figure 2). The hilly areas show a high ECaV/ECaH ratio (Figure 5c), which reflects a downward saline profile, and thus a leaching of salts to depth facilitated by the sandy soil texture. The other areas, on the other hand, have much lower ECaV/ECaH values, particularly the flooded areas. Due to their position further downstream, i.e., closer to the ocean outlet, the water table is close to the soil surface and capillary rise is important and contributes to an upward saline profile. In the periodic flooding zone, the redistribution of salts under the influence of inundation is less frequent, which confers a very high topsoil salinity. It is precisely in this area that people were building salt works for the exploitation of salt for several centuries.

An Effective Discrimination
Many studies aim at predicting salinity by crossing a set of satellite data combining biophysical indicators of vegetation, surface condition, and field measurements [32][33][34], but few studies focused on the importance of salinity in the zonation of natural or anthropised environments. In our estuary area, the results of the discriminant analysis show that about 62% of the points can be correctly located as belonging to one of the five zones based on the salinity values and the salt profile. The NBC and KNN machine learning algorithms perform slightly better. These results are remarkable considering that only two parameters ECaV and ECaH were measured, representing the salinity at different depths, but which are themselves highly correlated with each other (r 2 = 0.987). As a result, the information contained in the dataset is poor in terms of diversity, being limited to 1 or 2 parameters. These conditions are outside the usual framework of application of machine learning algorithms, which generally rely on a much larger number of parameters for discrimination [35][36][37].
Salinity and the salt profile are major parameters in the differentiation of the five zones of the estuary.
For the discriminant analysis, the best result is obtained from the raw data and the logarithmic transform of the salinity gradient information. The confusions concern areas with similar salinity ranges and a high degree of overlap (Table 2), with the two highsalinity inundation areas on the one hand, and the intermediate-salinity areas of bare soil or halophyte plants on the other. Taking into account the vertical salinity gradient within the soil profiles somewhat improves the discrimination. However, for hyper-saline environments, the measurement of the gradient from vertical versus horizontal salinities is less accurate for technical reasons, which does not allow the vertical salinity gradient parameter to play a more important role in the classification of these areas, where the capillary rise and redistribution of salts can be significant. However, it is an important criterion describing the operating differential between the zones. Upward gradients (lowest ECaV/ECaH) reflect areas where capillary rise is almost permanent, while flooding removes salts from the upper part of the soil profile or redistributes them within the profile itself, erasing the traces of capillary rise.
The area with a strong salinity gradient between the hilly area and the periodically flooded area concentrates a high proportion of poorly classified points. This can be explained by several factors. On the one hand, the location of the hill zone boundary, based on satellite imagery, is inaccurate by a few metres. On the other hand, the geo-referencing of the measurement points by GPS also has an inaccuracy which, together with the reading inaccuracy and the stability of the EM38 equipment, is responsible for the nugget effect observed on the variograms (Figure 4). An inaccuracy of a few metres can be accompanied by a very large variation in the electrical conductivity value in this sector with a strong salinity gradient, leading to class confusion.
The watershed outlet from the Haouta Ben Mediar hill is accompanied by a concentration of misclassified points in the flood valley to the southeast of the study area (red arrow in Figures 1, 6, and 7). This area appears darker on some of the satellite images consulted (between 2003 and 2021), but not permanently. In Figure 5a,b it can be seen that the outlet of the catchment extends into a lower conductivity axis within the floodplain. The lower ECa values are accompanied by lower ECaV/ECaH values. These observations suggest the presence of a coarser textured channel below the ground surface where the less mineralised water from the catchment area flows. This illustrates the presence of structure not readily identifiable by landscape analysis, revealed by systematic mapping, confirming the complementarity of approaches to better understand the mechansisms driving salinity distribution [28,[38][39][40].

Linearity as a Low-Constraining Condition
The K-nearest neighbours (KNN) algorithm and the naive Bayesian classification (NBC) perform slightly better than the discriminant analysis when assigning the measurement points to the five zones of the estuary, which is not always the case in salinity class prediction studies [32,41]. This suggests that the linearity condition imposed by the discriminant analysis is not very restrictive, but this difference can be interpreted as the fact that the boundaries between zones in the data space are not linear, but curvilinear and irregular. The logarithmic transformation of the data, a procedure that would be more suitable for curvilinear boundaries, is monotonic and regular, and does not improve the result. On the other hand, machine learning methods, especially K-nearest neighbours, can cope with irregular boundaries between different areas in the data space [30], which might be the case in our study.

Factors Other Than Salinity and the Saline Profile
Salinity and salt distribution in the soil profile are key factors in the discrimination of the five estuary zones, but they are not the only factors responsible for the distribution of vegetation, including halophyte plant associations. The granulometry of the soil and the frequency/duration of flooding determine the aeration status of the soil and its redox level. For the same salinity, better aerated environments will be likely to be colonised by vegetation, whereas highly reducing environments will be devoid of it [42][43][44]. The movement of sand transported to the surface by the wind according to the topography can modify the assessment of surface conditions when drawing the zonation. As a result, the distinction between halophyte and bare soil zones can be misjudged. It is important to note that the establishment of halophytic plants is not instantaneous. These plants are slow growing [45], despite high yields under cultivated and seawater-irrigated conditions [46]. Any change in salinity, which can be rapid, is not immediately accompanied by a change in vegetation cover. The river system is extremely active in estuarine environments. This was clearly visible on the satellite images consulted between 2003 and 2021. The position of the wadi and its meanders changes frequently, which notably blurs the zonation between frequent and periodic flooding zones. The same applies to the secondary hydrographic network and the tidal channels, which are generally fractal in structure [47,48] with vertical walls of about 1 metre. They can grow, expand, or fill in as the wadi wanders. This leads to a slow modification and readjustment of water and salt transfers in the landscape. For all these different reasons, the surface mapping does not represent the hydro-saline balance in detail with recent operating conditions of the estuary.

Conclusions
This study aimed to better understand the role of salinity, its intensity, and distribution within soil profiles, on the zonation of an estuary in northern Morocco. The five zones mapped by naturalistic landscape analysis and satellite imagery have their own statistical distribution of apparent electrical conductivity, with more or less overlapping ranges. The various classifications by discriminant analysis or machine learning algorithms correctly position from 57 to 66% of the points in their original area, attesting to the importance of salinity alone on the characteristics of the estuary areas. The saline profile within the soils, ascending or descending, is a secondary parameter, but it contributes to a better differentiation of the zones. The presence of curvilinear and irregular boundaries between the zones in the data hyperspace can explain slightly better results obtained by two algorithms, namely naive Bayesian clustering and 5 nearest neighbours. The statistical methods of discriminant analysis and machine learning help to critically review the zonation of surface states and the understanding of the mechanisms responsible for their spatial distribution.