2. Study Area
The study area covers Djibouti and the surrounding coastal areas of the Gulf of Tadjoura and the Bab-el-Mandeb Strait (
Figure 1). Situated in the Horn of Africa and bordered by the waters of the southern Red Sea, Djibouti is a strategic point of geo-maritime importance [
31,
32]. Djibouti is one of the hottest and driest places on Earth with many issues related to climate and weather processes [
33]. Lake Assal, a thermal saline crater lake with extreme levels of evaporation, is located in this area and is the lowest point on land in East Africa. The topography affects temperature and precipitation, which vary with elevation and differ between coastal and mountainous regions [
34].
This region experiences fluctuations in salinity and water levels due to seasonal and annual changes in temperature and precipitation. The dominant landscapes are characterised by dry mountain vegetation typical of arid climates. The distribution and composition of flora and fauna are controlled by the low precipitation and high temperatures of semi-desert and desert environments [
35,
36,
37]. The region is also prone to drought: for example, the 2011 drought in East Africa affected rain-fed agriculture in Djibouti and led to a regional famine [
38,
39,
40]. Drought-induced crises endanger long-term investments and development, with catastrophic human costs and effects on vulnerable regions of the country [
41,
42,
43].
The drought-prone regions of Djibouti have experienced strong effects from climate change, including fluctuations in the hydrological regime of thermal lakes. The arid climate has triggered environmental problems; for example, agricultural lands have been desiccated and abandoned, and water sources have been salinated and blocked as a result of sea level rise and flash floods [
44,
45]. Soil erosion and land degradation vary across topographically diverse landscapes, reflecting the prevailing climatic conditions [
46,
47]. Moreover, environmental pollution has been reported along the coasts of the Gulf of Tadjoura [
48,
49].
The surroundings of the Horn of Africa are notable for climate–hydrological fluctuations which cause land cover changes [
50,
51]. The coastal region along the Bab-el-Mandeb Strait includes diverse types of xeric grassland and shrubland typical of arid ecoregions (
Figure 2). Its land cover classes are typical of semi-desert areas: dry shrub steppe and grassland, with species such as
Vachellia tortilis,
Vachellia nubica, and
Balanites aegyptiaca. Sparse vegetation patches with scattered trees or shrubs occur on the sandy plains of bare lands and semi-deserts.
The eastern strip along the Red Sea coast is part of the Eritrean coastal desert [
52]. It includes vegetation along the muddy wadi mouths in diverse habitats [
53], grasslands [
54,
55], and secondary growth in urban areas. The floodplains along the wadis carry sparse vegetation typical of semi-desert ecosystems. Xeric vegetation, farmland, mangroves along the Red Sea coast and in wadi valleys [
56,
57], coral reefs along the coasts [
58,
59,
60] and urban areas collectively make up the region’s landscape. Djibouti is one of the most seismically unstable zones of Africa with high tectonic–magmatic activities [
61,
62,
63,
64] due to its proximity to the Afar Triple Junction [
65,
66,
67]. The thermal regime around the Red Sea is explained by its structural connection to the active zone of the East African Rift [
68,
69,
70,
71,
72].
Such a complex geological setting and arid environment have resulted in a water deficit and favoured the distribution of thermal springs with high salinity, located in three regions: Lake Assal, Lake Hanle and Lake Abhe [
73,
74]. The crater origin of Lake Assal is related to the Afar Triple Junction [
75,
76,
77]. The geochemical composition of the Assal Rift includes evaporitic halite deposits [
78,
79]. Its location in the arid zone with strong evaporation and no riverine input contributed to the lake’s increase in salinity, with a high brine concentration above 100 g/L [
80]. Fluctuations of this concentration are related to evaporation, the hot climate and geothermal processes [
81].
3. Materials and Methods
Our methodology presents a case of image reclassification and analysis using ML techniques for environmental mapping in Djibouti, East Africa. A classification scheme of 17 initial land cover classes was applied through a logical sequence of GRASS GIS modules to extract land cover information for the study area. Reclassification changes the values in a raster dataset to new values according to a set of rules that define how classes are updated. In GRASS GIS, this operation was used to simplify land cover polygons, standardise the assignment of pixels to classes and prepare the data for land cover analysis.
3.1. Data
The data used in this study include four images from the Landsat 8–9 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS): the training image from 2015, and images from 2017, 2021 and 2023, acquired at intervals of two to four years (
Figure 3).
The Landsat images were selected as the data source due to their quality and reliability [
82,
83,
84]. Earlier studies have reported the effective use of RS data in environmental research, which demonstrates their utility and value [
85,
86,
87,
88,
89,
90,
91]. Nowadays, satellite images are excellent sources of geoinformation for mapping arid regions [
12,
92,
93]. Spaceborne data are successfully used in the analysis of land cover change and deforestation in tropical forests, and for reporting landscape dynamics [
94,
95,
96,
97]. The importance of Landsat data for environmental research has been described earlier [
98,
99,
100].
In this study, the four Landsat images were acquired on the following dates: (1) 2 July 2015; (2) 23 July 2017; (3) 19 August 2021; and (4) 16 July 2023. The metadata are summarised in
Table 1. The cloudiness percentage is less than 10% for all the images. The technical parameters of the scenes are provided in
Table 2. The following parameters are common to all four images: Datum and Ellipsoid: World Geodetic System 1984 (WGS84); Product Map Projection L1: Universal Transverse Mercator (UTM), zone 38 (for Djibouti); Worldwide Reference System (WRS): Path 166, Row 52; Landsat Collection Number: 2; Category: T1; Station Identifier: LGN; and Sensor Identifier: OLI/TIRS. The images were acquired in the daytime at nadir, with a roll angle of 1° for the 2015 image and 0° for the images from 2017, 2021 and 2023.
The image quality score was 9 for all scenes, according to the Landsat specifications. The remaining metadata for each scene are summarised in
Table 2. The geographic characteristics of the images are referenced in
Table 3. The Landsat Scene Identifiers (IDs) are provided below.
The images were filtered to reduce noise caused by cloud cover and atmospheric effects. They were systematically selected from the USGS EarthExplorer repository and were acquired at the end of the dry season. During this period, cloud cover is generally low, while vegetation appears bright green and contrasts with all other land cover types.
3.2. Workflow
The schematic workflow of the research is presented in
Figure 4.
The image processing used to create a land cover map from Landsat OLI/TIRS imagery was carried out in seven steps using the following techniques of GRASS GIS:
Image import;
Data exploration and analysis;
Classification of images using MaxLike method;
Reclassification scheme using ‘echo’ of Unix and ‘r.reclass’ module;
Classification of images using gradient boosting algorithm of ML;
Image post-processing and mapping;
Accuracy assessment.
The workflow is based on scripting techniques of GRASS GIS for image analysis (software developed originally by the U.S. Army Construction Engineering Research Laboratories in 1982, now maintained by the developer team in Champaign, IL, USA). The topographic map was plotted using existing techniques in GMT version 6.4.0 [
101,
102,
103,
104,
105] (software developed originally in 1988 by Paul Wessel and Walter H. F. Smith, in Lamont-Doherty Earth Observatory of Columbia University, Palisades, NY, USA).
3.3. Image Preprocessing
The Landsat 8–9 OLI/TIRS images were geometrically rectified to the coordinate reference system with EPSG code 2095, the Universal Transverse Mercator (UTM) zone for Djibouti, following the USGS technical characteristics. A GRASS GIS scripting method was applied, which uses several modules with code snippets explained and commented below.
The images were imported into the working folder using the ‘r.import’ module; the same procedure was repeated for all bands of the four images used in this study (2015, 2017, 2021 and 2023). Afterwards, the relief map was imported from GEBCO. The region was set to the geospatial extent of the study area, and isolines were derived using the ‘r.contour’ module with a 200 m interval. To facilitate data processing, the best combinations of Landsat bands were stacked into colour composites. The bands that produced the best pixel separation were processed, excluding the panchromatic channel. The Tagged Image File Format (TIFF) files were joined using the ‘i.group’ module and enhanced using preprocessing steps that included contrast stretching and atmospheric correction.
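The import and grouping sequence described above can be sketched as a short Python helper that only assembles the GRASS GIS commands as argument lists, without executing them. This is an illustrative sketch, not the script used in the study: the scene prefix, the relief file name and the raster and group names are hypothetical placeholders, and only the basic options of ‘r.import’, ‘g.region’, ‘r.contour’ and ‘i.group’ are used.

```python
def build_import_commands(scene_prefix, bands, group_name, relief_file):
    """Return GRASS GIS commands (as argument lists) for import and grouping.

    All file and map names are hypothetical placeholders.
    """
    commands = []
    raster_names = []
    for band in bands:
        name = f"{group_name}_b{band}"
        raster_names.append(name)
        # One r.import call per Landsat band (GeoTIFF file).
        commands.append(["r.import",
                         f"input={scene_prefix}_B{band}.TIF",
                         f"output={name}"])
    # Relief raster (GEBCO in the text) and 200 m isolines.
    commands.append(["r.import", f"input={relief_file}", "output=relief"])
    commands.append(["g.region", f"raster={raster_names[0]}"])
    commands.append(["r.contour", "input=relief", "output=isolines", "step=200"])
    # Join the imported bands into one imagery group.
    commands.append(["i.group", f"group={group_name}", f"subgroup={group_name}",
                     "input=" + ",".join(raster_names)])
    return commands


# Bands 1-7 of a hypothetical 2015 scene; the panchromatic band 8 is left out.
cmds = build_import_commands("LC08_scene_2015", range(1, 8), "L2015",
                             "gebco_relief.tif")
for cmd in cmds:
    print(" ".join(cmd))
```

Inside a GRASS GIS session, each argument list could be passed to a process runner such as Python's subprocess module; this step is omitted here so that the sketch runs without a GRASS installation.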
3.4. Clustering
The clustering was performed by the ‘i.cluster’ module of GRASS GIS, which generates a signature file and reports results using the k-means clustering algorithm, as explained in previous studies [
106,
107,
108,
109]. The signature file determines the categories (classes) of land cover types by analysing the spectral reflectance of the pixels and contains clusters and covariance matrices for each image. The initial means for each class before data processing, reported in
Appendix A, indicate the starting seed values of the multispectral bands before the iterations, assigned according to the signature files. The means are then calculated during the ‘i.cluster’ procedure and recalculated following the principles of maximal class separation (the contrast between the classes) and minimal class size (the level of detail of the classification with respect to the spatial resolution of 30 m). The table in
Appendix A reports such values for each of the 10 classes and 7 multispectral bands of Landsat, starting from 1—‘Coastal aerosol’ to 7—‘SWIR-2’.
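The principle of seeding class means and recalculating them from the assigned pixels can be illustrated with a minimal k-means sketch in plain Python. It is a simplified illustration and not the ‘i.cluster’ implementation: the two-band reflectance values and the seeds are hypothetical, and the separation and minimum class size criteria are not modelled.

```python
def kmeans(pixels, seeds, iterations=10):
    """Cluster pixel vectors around seed means; returns (means, labels)."""
    means = [list(seed) for seed in seeds]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel goes to the nearest class mean
        # (squared Euclidean distance over all bands).
        for i, pixel in enumerate(pixels):
            dists = [sum((a - b) ** 2 for a, b in zip(pixel, mean))
                     for mean in means]
            labels[i] = dists.index(min(dists))
        # Update step: recompute each mean from the pixels assigned to it.
        for k in range(len(means)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # an empty cluster keeps its previous mean
                means[k] = [sum(col) / len(members) for col in zip(*members)]
    return means, labels


# Hypothetical two-band reflectance values (not taken from the Landsat scenes).
pixels = [(0.05, 0.02), (0.06, 0.03), (0.30, 0.45), (0.32, 0.43)]
means, labels = kmeans(pixels, seeds=[(0.0, 0.0), (0.5, 0.5)])
print(labels)
print(means)
```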
The initial mean values for each land cover class for the years 2015, 2017, 2021 and 2023 are reported in
Table A1,
Table A2,
Table A3 and
Table A4; see
Appendix A. Afterwards, the assignment of pixels was performed iteratively for each land cover class. The same procedure was applied to all images (2015, 2017, 2021 and 2023), which are separated by intervals of two to four years. The classification was performed using the maximum-likelihood discriminant analysis classifier of the ‘i.maxlik’ module [
110,
111]. The algorithm used the signature file calculated in the previous step to assign each pixel to its most probable land cover class (
Figure 5).
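The idea of maximum-likelihood classification from class signatures can be sketched as follows. This is a simplified illustration under stated assumptions: each class is modelled as a Gaussian with independent bands (a diagonal covariance), whereas the signature files described above contain full covariance matrices, and the class names and sample values are hypothetical.

```python
import math


def fit_signatures(samples):
    """samples: {class_name: [pixel vectors]} -> {class_name: (means, variances)}."""
    signatures = {}
    for name, vectors in samples.items():
        n = len(vectors)
        columns = list(zip(*vectors))
        means = [sum(col) / n for col in columns]
        # A small floor avoids division by zero for constant bands.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                     for col, m in zip(columns, means)]
        signatures[name] = (means, variances)
    return signatures


def log_likelihood(pixel, means, variances):
    """Gaussian log-likelihood with independent bands (simplifying assumption)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
               for x, m, var in zip(pixel, means, variances))


def classify(pixel, signatures):
    """Return the class whose signature gives the highest likelihood."""
    return max(signatures,
               key=lambda name: log_likelihood(pixel, *signatures[name]))


# Hypothetical two-band training samples for two classes.
signatures = fit_signatures({
    "water": [(0.05, 0.02), (0.06, 0.03), (0.04, 0.02)],
    "shrub": [(0.10, 0.30), (0.12, 0.34), (0.11, 0.28)],
})
print(classify((0.05, 0.025), signatures))
print(classify((0.11, 0.31), signatures))
```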
3.5. Accuracy Analysis
The analysis of accuracy (
Figure 6) revealed the areas of misclassified pixels within water bodies, including the Bab-el-Mandeb Strait. The accuracy assessment, showing the probability of correctly assigning each pixel to its respective class across all images, is presented in
Figure 6 and was calculated from the averages of each band. Afterwards, two of the ten classified categories (‘water in the sea’ and ‘water in shelf’ areas) were merged into a single class, ‘water’, and class names were assigned to each numerical cluster.
After this step, ML was applied to all the images. The ML step started with the generation of training pixels from the land cover classification of 2015, which were used as seed data. The training data were extracted using the ‘r.random’ module. Afterwards, the training pixels were used to classify the Landsat images (example for 2023) by training a model with the ‘r.learn.train’ module, with the gradient boosting classifier as the ML algorithm. Next, the prediction was performed using the ‘r.learn.predict’ module, which applied a fitted estimator from Python’s scikit-learn library [
112] to raster images in the imagery group. In the next step, the computed raster categories were automatically applied to the classification output and checked using the ‘r.category’ module.
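The extraction of training pixels from an existing classification can be illustrated with a small sampling routine. It is a plain-Python analogue of the random sampling step, not the ‘r.random’ module itself: the grid, the class codes and the number of points are hypothetical, and None stands for cells without data.

```python
import random


def sample_training_pixels(classified, n_points, seed=0):
    """Randomly pick up to n_points (row, col, label) cells from a classified grid.

    Cells holding None (no data) are skipped.
    """
    cells = [(r, c, label)
             for r, row in enumerate(classified)
             for c, label in enumerate(row)
             if label is not None]
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(cells, min(n_points, len(cells)))


# Hypothetical 3 x 4 classified map with class codes 1-3.
classified_2015 = [
    [1, 1, 2, 2],
    [1, None, 2, 3],
    [3, 3, 3, None],
]
training = sample_training_pixels(classified_2015, n_points=5)
for row, col, label in training:
    print(row, col, label)
```

The sampled labels would then serve as the response variable for a classifier trained on the band values at the same positions.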
3.6. Gradient Boosting
The images were processed using the gradient boosting ML method of image classification. The essential concept of this algorithm is that each new model is fitted to target outcomes derived from the errors of the previous iterations, which ultimately minimises the classification error. Gradient boosting is an advanced ensemble ML technique that combines the predictions of multiple decision trees. It offers a novel method of satellite image processing based on the supervised approach. The algorithm classifies images iteratively, linking each classification step to the previous one. It combines several decision trees as simple (weak) learners to create a strong prediction model. By minimising a loss function, each new learner corrects the mistakes made by the group of prior learners. In this way, the accuracy and predictive capacity of the classifier increase iteratively, guided by the gradient of the loss function.
The core idea is that the algorithm finds the best possible next model when combined with previous ones. In this way, it minimises the overall prediction error. The advantage of such an approach is that it improves the accuracy of classification through the convergence of decision cycles and assigns cells that constitute the mosaic of pixels on a raster image to the defined categories. When several images are processed as a time series, this enables us to effectively monitor landscape dynamics and detect changes.
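The iterative principle described above can be sketched with a deliberately small gradient boosting classifier in plain Python. This is an illustration under simplifying assumptions, not the scikit-learn estimator applied in the study: it handles two classes only, uses one-split decision stumps as weak learners, and takes the stump leaf values as mean residuals of the log-loss. The data, the number of rounds and the learning rate are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Best single-feature threshold split (least squares on the residuals)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [r for x, r in zip(X, residuals) if x[j] <= t]
            right = [r for x, r in zip(X, residuals) if x[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:
        raise ValueError("no valid split: all feature vectors are identical")
    return best[1:]


def stump_predict(stump, x):
    j, t, left_value, right_value = stump
    return left_value if x[j] <= t else right_value


def fit_gradient_boosting(X, y, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the negative gradient of the log-loss (y - p)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, x)
                  for s, x in zip(scores, X)]
    return stumps


def predict(stumps, x, learning_rate=0.5):
    score = sum(learning_rate * stump_predict(s, x) for s in stumps)
    return 1 if score > 0 else 0


# Hypothetical two-band pixels: class 0 (water-like), class 1 (vegetation-like).
X = [(0.05, 0.02), (0.06, 0.03), (0.07, 0.02),
     (0.30, 0.45), (0.32, 0.43), (0.28, 0.40)]
y = [0, 0, 0, 1, 1, 1]
model = fit_gradient_boosting(X, y)
print([predict(model, x) for x in X])
```

Library implementations extend this scheme to several classes, deeper trees and regularisation, which is why an established estimator is preferable for real imagery.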
The class separability matrices were computed for each pair of years (2015–2017 and 2021–2023) and show the distinguishability between classes (Equations (
A1) and (
A2)). The iterative cycles of image classification are summarised in
Appendix B in four statistical tables for each of the study periods, respectively. They report the pixels’ distribution by categories of land cover classes for each year: 2015, 2017, 2021 and 2023. The values indicate percent convergence, which shows at which values the cluster means of land cover categories become stable during iterative classification. Hence, iterative processes aim to achieve the maximal percentage of pixels that no longer move from cluster to cluster during iteration and reach the best possible values.
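The percent convergence described above can be expressed as the share of pixels that keep their cluster between two consecutive iterations. The following sketch is illustrative only: the label lists are hypothetical, and the threshold of 98% and the limit of 30 iterations are assumed example values rather than settings reported in this study.

```python
def percent_convergence(previous_labels, current_labels):
    """Share (in %) of pixels that kept their cluster between two iterations."""
    if len(previous_labels) != len(current_labels) or not current_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    stable = sum(1 for a, b in zip(previous_labels, current_labels) if a == b)
    return 100.0 * stable / len(current_labels)


def should_stop(convergence, iteration, threshold=98.0, max_iterations=30):
    """Stop when enough pixels are stable or the iteration limit is reached."""
    return convergence >= threshold or iteration >= max_iterations


# Hypothetical cluster labels of four pixels in two consecutive iterations.
previous = [1, 1, 2, 3]
current = [1, 2, 2, 3]
value = percent_convergence(previous, current)
print(value, should_stop(value, iteration=5))
```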
3.7. Reclassification
To improve accuracy, the images were reclassified according to a reclassification scheme. While the algorithm generates clusters, the means of the classified categories change constantly, because cells are assigned to these classes and each mean is then recalculated to include the new pixels. The goal of the iteration is to maximise the distances between the classes through recalculation. To this end, after the creation of all clusters and the assignment of pixels to these groups, the ‘i.cluster’ algorithm changes the cluster means through iterations. In this way, the algorithm attempts to increase the contrast between the classes, i.e., the numerical gap in values between them.
3.8. Generalisation of FAO Land Cover Map
The FAO land cover map of Djibouti was generalised from the classes of the Land Cover Classification System (LCCS) according to a technical report on the country. The scheme covering land cover categories comprising 10 classes was used and adopted for the study areas within the country (
Table 4). This part of the work was performed using QGIS software, version 3.42.2 ‘Münster’. QGIS was initially developed by G. Sherman and is now a project of the Open Source Geospatial Foundation.
The reclassified image was generated for each year using the ‘r.reclass’ module of GRASS GIS and the Unix ‘echo’ command. The raster map was reclassified based on the category values, and new raster maps for the years 2015, 2017, 2021 and 2023 were generated. The category values of the reclassified maps are based on an iteration over the categories using the rules stored in the ‘landusereclass.txt’ file. These categories were checked using the ‘r.category’ module. The maps were visualised using the ‘d.mon’ and ‘g.region’ modules. The colour palette was selected from the colour tables. Afterwards, the maps were displayed using a combination of the ‘d.rast’ and ‘d.vect’ modules. Finally, a cartographic grid was added, along with legends produced by the ‘d.legend’ module.
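A rules file of the kind passed to ‘r.reclass’ can be generated programmatically. The sketch below builds the rule text in the ‘old categories = new category label’ form used by that module; the category numbers and labels are hypothetical examples and do not reproduce the content of ‘landusereclass.txt’.

```python
def build_reclass_rules(merge_map, labels):
    """Build r.reclass-style rules.

    merge_map: {new_category: [old categories]}
    labels:    {new_category: label}
    """
    lines = []
    for new_cat in sorted(merge_map):
        old = " ".join(str(c) for c in sorted(merge_map[new_cat]))
        lines.append(f"{old} = {new_cat} {labels[new_cat]}")
    lines.append("* = NULL")  # every category not listed becomes no-data
    return "\n".join(lines) + "\n"


# Hypothetical example: two water clusters (1 and 2) merged into one class.
rules = build_reclass_rules(
    merge_map={1: [1, 2], 2: [3], 3: [4]},
    labels={1: "water", 2: "barren land", 3: "bushes and shrubs"},
)
print(rules)
```

The resulting text could be written to a file and supplied to the module through its rules option, which plays the same role as the ‘echo’ command in a shell script.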
The reclassification was performed since water surfaces in the Bab-el-Mandeb Strait and southern Red Sea had different colours. Nevertheless, they belong to the same class of water. The differences in the coastal regions were smooth at the time of image acquisition and appear as dark colours in all images. With each iteration, the values of the means shift to a higher percentage of pixels within each cluster. This is illustrated in the class separability matrix, which shows the distinction between the categories (Equations (
A1) and (
A2);
Appendix C). As the means never become completely static, a % convergence threshold and a maximum number of iterations are defined so that the reclassification process terminates. The final values of the computed class means for each land category are reported in
Appendix D. Once the maximum number of reclassification cycles is reached, the optimal % convergence is achieved through the increased number of iterations during the ‘i.cluster’ procedure. Here, the number of cycles depends on the complexity of the terrain and land cover patterns.
4. Results and Discussion
4.1. Land Cover Change Analysis
The results of the classified satellite images are presented in
Figure 5. The study area was categorised into 10 distinct land cover classes following the FAO classification scheme: (1) salt plains and hardpans; (2) barren land; (3) water bodies; (4) mangroves and aquatic vegetation on regularly flooded (temporarily or permanently) fresh or brackish water; (5) bushes and shrubs; (6) farmland, shrubs and irrigated trees; (7) built-up areas, artificial surfaces and associated areas; (8) broadleaved semi-deciduous forest and woodland; (9) mosaic cropland in cultivated areas; and (10) sparse vegetation in desert areas. Among these classes, the following categories included sub-clusters: water bodies comprise ponds, areas covered by marshes, rivers, estuaries, and coastal waterways; bushland and shrubland include non-classified types of vegetation such as grasslands, shrubs, rangelands, and savannah; and coastal areas surrounding Lake Assal are characterised by salt-tolerant vegetation and mangroves.
Over the study period from 2015 to 2023, image analysis revealed changes in land cover types across the study area. The area of mangroves along the coast decreased slightly, by 0.35 km², from 6.28 km² to 5.93 km². Furthermore, a large area previously covered by bushes and shrubland was converted into other land use categories, reducing its coverage from 1392 km² to 1194 km². As a result, the share of bush cover decreased to 15%. Both the salt-covered areas around Lake Assal and croplands also declined, with farmland falling from 3.21 km² to 2.75 km² and salt coverage from 17 km² to 16.04 km².
Additionally, saline waters mixed with the fresh waters of the estuaries of the Gulf of Tadjoura have a lighter coloration than those of the open sea. Vegetation types such as mosaic croplands, sparse vegetation, xeric shrubland and grassland, as well as bare soil areas, are detected and indicated on the maps. Broadleaved deciduous vegetation is relatively scarce in the study area. Shrub regions and flooded brackish-water areas are mostly occupied by mangrove forests, which have stronger spectral signals and appear in distinct shades of colour. The pixel counts computed for each land cover class are summarised in
Table 5.
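The conversion from pixel counts to areas, and the area changes quoted above, can be reproduced with simple arithmetic. The sketch below assumes square 30 m pixels, as stated for the Landsat multispectral bands; the pixel count in the example is hypothetical, while the area pairs are the values reported in the text.

```python
PIXEL_SIZE_M = 30  # spatial resolution of the multispectral bands (from the text)


def pixels_to_km2(n_pixels, pixel_size_m=PIXEL_SIZE_M):
    """Convert a pixel count to an area in square kilometres."""
    return n_pixels * pixel_size_m ** 2 / 1_000_000


def area_change(before_km2, after_km2):
    """Return the absolute change (km2) and the relative change (%)."""
    delta = after_km2 - before_km2
    return delta, 100.0 * delta / before_km2


# Hypothetical pixel count: 10,000 pixels of 30 m x 30 m cover 9 km2.
print(pixels_to_km2(10_000))

# Areas (km2) at the start and end of the study period, as reported in the text.
reported = {
    "mangroves": (6.28, 5.93),
    "farmland": (3.21, 2.75),
    "salt plains": (17.0, 16.04),
}
for name, (before, after) in reported.items():
    delta, pct = area_change(before, after)
    print(f"{name}: {delta:+.2f} km2 ({pct:+.1f}%)")
```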
Xeric shrubland and sparse deciduous forests typical of Djibouti are located along the coasts of the Red Sea and the Gulf of Tadjoura. The spectral signatures of this class differ from those of the mosaic vegetation (semi-deciduous forests or sparse lands) found in inland regions of the country, on the borders with Ethiopia and Eritrea. Thin coastal strips covered with mangroves along the Bab-el-Mandeb Strait are identified as wetlands partially submerged in water. This type of vegetation shows little change compared with the salt areas of Lake Assal, which changed significantly.
There were also notable expansions in other land cover categories. For example, the area of bare land in the interior deserts (western region of the country) increased between 2015 and 2023, from 5.56 km² to 6.24 km². This change highlights the desertification of regional landscapes, which is caused by recent climate change and global warming. Hence, desertification has become more pronounced over time, with arid land occupying more of the country’s interior.
4.2. Uncertainties and Sources of Error
Understanding possible uncertainties underlying the classification schemes in RS data processing is imperative. Accurate vegetation mapping using satellite image classification is not easy to achieve due to the spectral confusion between different vegetation types and similarity in spectral reflectance of various plant species [
113,
114,
115,
116]. Therefore, accuracy assessment is a crucial step in the evaluation process. The generated map was compared to ground truth values, represented here by sample points taken from the FAO-based map. The overall accuracy (OA) and Kappa coefficient (KC) were computed to evaluate the accuracy of the classified map by measuring the agreement between the sample points and the classified categories (
Table 6).
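For reference, the OA and KC can be computed from a confusion matrix as sketched below. The function follows the standard definitions (observed agreement and chance-expected agreement); the 2 × 2 matrix is a hypothetical example and does not reproduce the values of the accuracy table.

```python
def accuracy_metrics(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were mapped as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical confusion matrix for two classes and 100 sample points.
oa, kc = accuracy_metrics([[45, 5],
                           [10, 40]])
print(f"OA = {oa:.2f}, KC = {kc:.2f}")
```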
Errors in automatic classification also arise from the individual patterns of land cover categories, because the data are not normally distributed in most cases. Moreover, the light signature of deciduous broadleaved plants enabled us to detect plants growing occasionally on the mountain slopes in the central regions of the country, where precipitation is higher than in semi-desert areas. The detection of deciduous shrubland in all the images is attributed to the high backscatter coefficients of this vegetation type and to spectral reflection effects. Therefore, validation using the computed KC was performed and is reported in
Table 6.
To limit classification uncertainties in such cases, coastal vegetation types were grouped in the same class as other plants with similar spectral properties (grassland). An accuracy assessment was performed to evaluate possible sources of error. To this end, images were processed using a rejection probability test, which examines the correctness of each pixel’s classification using the chi-squared test. This made it possible to evaluate the classification of land cover types over the study area using the traditional GRASS GIS approach based on automated clustering (
Figure 6).
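The logic of a chi-squared rejection test can be sketched as follows: a pixel is flagged when its normalised squared distance from the class mean exceeds the critical value of the chi-squared distribution for the given number of bands. This is a simplified illustration, assuming independent bands (diagonal covariance) and a 95% confidence level; it does not reproduce the internal computation of the GRASS GIS classifier, and the pixel values are hypothetical.

```python
# Critical values of the chi-squared distribution at the 95% level,
# indexed by degrees of freedom (here: number of bands, up to 7).
CHI2_CRITICAL_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488,
                    5: 11.070, 6: 12.592, 7: 14.067}


def chi2_distance(pixel, means, variances):
    """Squared distance from the class mean, scaled by the band variances."""
    return sum((x - m) ** 2 / v for x, m, v in zip(pixel, means, variances))


def is_rejected(pixel, means, variances, critical=CHI2_CRITICAL_95):
    """True if the pixel is an unlikely member of the class at the 95% level."""
    return chi2_distance(pixel, means, variances) > critical[len(pixel)]


# Hypothetical two-band class signature (means and variances).
class_means = (0.05, 0.02)
class_variances = (1e-4, 1e-4)
print(is_rejected((0.05, 0.02), class_means, class_variances))
print(is_rejected((0.10, 0.02), class_means, class_variances))
```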
4.3. Interpretation and Data Analysis
Major land cover types in Djibouti include xeric shrubland, bare land in mountainous deserts, the estuary of the Gulf of Tadjoura, cropland and sparse vegetation. Multispectral bands of the images were analysed and compared using a spectral diagram and spatial analysis. The main land cover maps of Djibouti generated from the reclassified scenes and the ML-based analysis are displayed in
Figure 7 and
Figure 8.
The reclassified images are shown in
Figure 7. Here, the areas of water are merged into one class, which improved the classification. Nevertheless, the neighbouring regions containing vegetation types with similar pixel reflectance were misclassified, which required the ML approach for image processing. A fluctuation in Lake Assal visible on the images is related to climate effects and geothermal activities, as is also mentioned in previous studies. This study revealed that similar to the fluctuations of saline sabkhas in the northern Sahara [
117,
118], the saline lake, Lake Assal, also witnessed slight changes in the water surface and extent of the lacustrine coasts. Nevertheless, the nature of such physiographic variations is more related to the geologic origin [
119].
Landscape change detection was performed by comparing the land cover types detected on the ML-based classified satellite images obtained on different dates. The classification is based on an automated evaluation of the spectral reflectance values of the individual pixels, which are assigned to different land cover classes. The comparative analysis of the classified images enabled us to detect landscape changes. The observed transitions between categories were the loss of intact vegetation areas; fluctuations in the saline lake, Lake Assal; gains in vegetation and coastal areas; and secondary vegetation growth. These transitions correspond to the predominant patterns of vegetation response to climate warming and rising annual temperatures, which are visible when comparing the central and coastal regions of Djibouti; the latter are influenced by the marine climate, especially along the Gulf of Tadjoura.
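The comparison of two classified maps can be summarised as a table of class transitions. The sketch below counts, cell by cell, how many pixels moved from each class on the earlier map to each class on the later map; the two small grids and their class names are hypothetical and serve only to illustrate the cross-tabulation.

```python
from collections import Counter


def transition_counts(map_a, map_b):
    """Count (class on map_a, class on map_b) pairs over co-located cells.

    Cells holding None (no data) on either map are skipped.
    """
    counts = Counter()
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            if a is not None and b is not None:
                counts[(a, b)] += 1
    return counts


# Hypothetical classified grids for two dates.
map_2015 = [["shrub", "shrub", "water"],
            ["shrub", "bare", "water"]]
map_2023 = [["shrub", "bare", "water"],
            ["bare", "bare", "water"]]

transitions = transition_counts(map_2015, map_2023)
for (old, new), n in sorted(transitions.items()):
    print(f"{old} -> {new}: {n} pixel(s)")
```

Off-diagonal pairs, such as shrub to bare land in this toy example, correspond to the transitions of categories discussed above.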
The traditional (maximum-likelihood) and ML classification maps were compared for 2015, 2017, 2021 and 2023. Comparing the changes on classified images from different dates shows the dynamics of the landscapes over Djibouti from 2015 until 2023. The differences between the land cover or land use types are presented in
Figure 5,
Figure 7 and
Figure 8.
Figure 5 shows the results of the classification made using the maximum-likelihood approach;
Figure 6 shows the accuracy assessment;
Figure 7 shows the reclassified maps; and
Figure 8 shows the results of the classification performed using the ML approach.
The comparison of the classified maps revealed that natural vegetation, such as herbaceous and mosaic shrubland, cropland and grassland, has decreased, while the sparse vegetation and bare soil areas typical of desert and semi-desert areas in Djibouti have increased. Changes in land cover types in the coastal region of Bab-el-Mandeb and central Djibouti were detected and visualised for the period from 2015 to 2023. The images from different dates were compared to assess the landscape dynamics by identifying land cover patches and evaluating land cover types in different regions of Djibouti.
5. Conclusions
This paper demonstrated the efficiency of RS data processing using ML methods for visualising land cover changes. Using this approach, the dynamics of land cover types have been detected by identifying differences in the spectral reflectance of pixels assigned to different categories by the gradient boosting ML algorithm. This method was tested, explained and presented in the form of a series of new maps covering Djibouti. Interdisciplinary problems in geographical sciences often require decisions to be made using diversified approaches. ML algorithms, such as the gradient boosting classifier, improved the mapping workflow through programming to optimise RS data classification. When supplementary GIS modules are used, such tools point out the shortcomings of traditional methods in geoinformatics. The development of cartographic tools for RS data processing using ML is a challenging task. As technologies mature along with the development of programming algorithms, cartographic modelling and the analysis of landscape changes improve by integrating applications of scripting methods. Here, we have demonstrated the use of such tools using Python’s scikit-learn library, adapted to GRASS GIS.
Scripting cartographic tools enable us to highlight salient aspects of environmental dynamics through the automation of mapping and data visualisation. They support geospatial data processing, modelling, visualisation and interpretation. For such geologically complex areas, integrated programming methods assist in handling the issue of data processing in a spatially expressive manner. The effectiveness of ML is explained through the advanced algorithms of image classification and automated workflow. Hence, the use of ML tools is essential for implementing the workflow of processing geographic data to reveal and visualise environmental problems. Their applications demonstrate a prominent role of scripting for geospatial data handling and environmental analysis. As ML algorithms are developed and implemented in RS and mapping, their performance supports the computational complexity of the tasks involved in cartographic data processing. Many case studies on geographical analysis and mapping, which use conventional GISs, for example, require highly computational and costly mapping workload due to diversified cartographic tasks. In such cases, the use of ML, most notably gradient boosting, is an essential solution to optimising cartographic workflow.
Future research can use the thematic maps of land cover types and continue with the analysis of neighbouring regions of the Bab-el-Mandeb Strait for environmental monitoring. Moreover, it can enhance the thematic and topical research of East Africa, particularly in relation to the geological exploration of Djibouti, environmental analysis, and geophysical monitoring of tectonically active regions within the Afar Triple Junction. Furthermore, future work can adapt the ML techniques of GRASS GIS to land cover monitoring in other regions around the Red Sea and East Africa, analysing landscape dynamics over an extended period using time series analysis.