Groundwater Salinity Prediction in Deep Desert-Stressed Aquifers Using a Novel Multi-Stage Modeling Framework Integrating Enhanced Ensemble Learning and Hybrid AI Techniques
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The study is very interesting and deals with the issue of salinity in a desert area when rainfall is very meagre, evaporation rates are very high and data availability is scarce. It has been well demonstrated that the application of the Enhanced Ensemble Learning and hybrid AI techniques using a two stage modelling framework is well-suited to such areas. The research is innovative and well presented. Other than few suggestions, it would be prudent to add your recommendations for mitigation of the salinity issue in the study area. Also the scope for future work can also be incorporated as it will be helpful to the research community. Good work and all the best.
Author Response
Comments:
The study is very interesting and deals with the issue of salinity in a desert area when rainfall is very meagre, evaporation rates are very high and data availability is scarce. It has been well demonstrated that the application of the Enhanced Ensemble Learning and hybrid AI techniques using a two stage modelling framework is well-suited to such areas. The research is innovative and well presented. Other than few suggestions, it would be prudent to add your recommendations for mitigation of the salinity issue in the study area. Also the scope for future work can also be incorporated as it will be helpful to the research community. Good work and all the best.
- Response: We fully agree with the reviewer’s suggestion. Accordingly, we have revised the manuscript to include a dedicated subsection on salinity mitigation strategies in the Discussion, specifically tailored to arid regions with limited water resources. These recommendations are based on our model’s findings and include both agronomic and hydrological measures.
Reviewer 2 Report
Comments and Suggestions for Authors
This paper uses a novel multi-stage modeling framework based on deep learning to predict the salinity of deep groundwater in deserts. This research can provide an effective tool for the early detection of water quality degradation. Overall, this article is meaningful, but there are still some problems. The main problem of this article is that the number of samples is too small, making it difficult for readers to believe that it has representativeness and application value. It is hoped that the author will carefully solve this problem.
(1) The introduction about the current research status of deep learning is overly detailed. It is suggested that some appropriate deletions be made to make the research background in this area more focused. Meanwhile, the introduction should be supplemented with information about the causes of groundwater salinization, which can further enrich the background.
(2) According to Figure 1, it is found that the sampling points are very few. The specific number of samples taken at this location needs to be provided in the text. There are almost no samples in the northern part of the study area. Are the current sample positions and quantities representative?
(3) The distribution characteristics of each parameter in Figure 3 should be analyzed, and the correlations among the parameters can also be appropriately analyzed.
(4) Figure 4 illustrates the overall procedural framework adopted in this survey, but it also requires an appropriate amount of text to describe the specific process.
(5) The reasons for selecting these hydrochemical parameters in Subsection 4.1 need to be supplemented.
(6) The discussion section needs to appropriately add some reasons for the distribution characteristics of groundwater salinization.
(7) The conclusion needs to be appropriately reduced and some quantitative information added.
Author Response
Comments:
(1) The introduction about the current research status of deep learning is overly detailed. It is suggested that some appropriate deletions be made to make the research background in this area more focused. Meanwhile, the introduction should be supplemented with information about the causes of groundwater salinization, which can further enrich the background.
- Response: We sincerely thank the reviewer for the valuable feedback. In response to the comment regarding the introduction section (section 1), we have carefully considered all the remarks. We have thoroughly revised and rewritten the entire introduction to improve its focus by reducing the overly detailed parts on deep learning research status. Additionally, we have enriched the background by including a clear and concise discussion on the causes of groundwater salinization, particularly in arid deep aquifers. We believe these changes have significantly improved the clarity and relevance of the introduction.
(2) According to Figure 1, it is found that the sampling points are very few. The specific number of samples taken at this location needs to be provided in the text. There are almost no samples in the northern part of the study area. Are the current sample positions and quantities representative?
- Response: We thank the reviewer for this insightful observation. In response, we have thoroughly revised and rewritten the entire data-collection section (section 2.2) to clarify the sampling strategy and explicitly provide the exact number of samples collected (41 groundwater samples in total). We also addressed the spatial distribution of the sampling points, explaining that sampling was necessarily focused near oasis zones where boreholes are concentrated, due to the geological and regulatory constraints in this deep desert aquifer setting. The northern part of the study area is largely inaccessible for sampling, as drilling is not permitted outside the restricted oasis zones.
We believe these revisions better clarify the representativeness and limitations of the sampling network in this challenging environment, and how the dataset, though limited, is robust for characterizing groundwater salinity in this arid region. The updated section now explicitly communicates these points to improve transparency and reader understanding.
(3) The distribution characteristics of each parameter in Figure 3 should be analyzed, and the correlations among the parameters can also be appropriately analyzed.
- Response: Thank you for your valuable suggestion. We have revised the manuscript to include a detailed analysis of the distribution characteristics of each hydrochemical parameter originally shown in Figure 3. This analysis, along with the correlation assessment among the parameters, has been moved to the Results section (Section 5) for better clarity and flow. The updated spatial distributions are now presented in Figure 6, while the newly added correlation heatmap is shown in Figure 7. These additions provide deeper insights into the groundwater chemistry dynamics and improve the overall interpretation. Please refer to the updated Results section (Section 5) for further details.
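For readers unfamiliar with the correlation assessment mentioned above, the following is a minimal sketch of how a pairwise Pearson correlation matrix (the numbers behind a correlation heatmap) can be computed. It is an illustration only and not the authors' code: the parameter names, the sample values, and the helper names `pearson` and `correlation_matrix` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations, keyed by parameter name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for illustration only; these are NOT data from the manuscript.
samples = {
    "TDS": [2100.0, 3400.0, 2800.0, 4100.0, 3000.0],  # mg/L (assumed unit)
    "Cl": [600.0, 1100.0, 850.0, 1400.0, 900.0],      # mg/L (assumed unit)
    "pH": [7.8, 7.4, 7.6, 7.2, 7.5],
}

matrix = correlation_matrix(samples)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

With only a few dozen samples, individual coefficients carry wide uncertainty, so a matrix like this is best read as a qualitative guide to which ions vary together.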
(4) Figure 4 illustrates the overall procedural framework adopted in this survey, but it also requires an appropriate amount of text to describe the specific process.
- Response: We thank the reviewer for the valuable suggestion. In response, we have extensively revised and expanded the section describing the Data Processing Framework for Predicting Salinity (section 4). Additional detailed text has been added to clearly explain each step of the procedural framework illustrated in Figure 4, including data preprocessing, feature selection, model training, validation, and final prediction stages. These revisions provide a more comprehensive and transparent description of the methodology to better support the figure and enhance the reader’s understanding of the overall process.
(5) The reasons for selecting these hydrochemical parameters in Subsection 4.1 need to be supplemented.
- Response: We sincerely thank the reviewer for this valuable suggestion. In response, we have thoroughly revised Subsection 4.1 to supplement and clearly articulate the rationale behind the selection of the hydrochemical parameters used in this study. The added text explains the geochemical significance of each parameter in relation to groundwater salinity, their roles in arid environments, and their relevance based on previous hydrogeological studies. These additions aim to clarify the scientific basis for the parameter selection and strengthen the methodological foundation of our work. The revised subsection now provides a comprehensive justification, ensuring the reader fully understands the choice and importance of these features in the salinity prediction models.
(6) The discussion section needs to appropriately add some reasons for the distribution characteristics of groundwater salinization.
- Response: We sincerely thank the reviewer for this valuable suggestion. In response, we have carefully revised the discussion section (section 6) to include a detailed explanation of the hydrogeological and anthropogenic factors influencing the spatial distribution characteristics of groundwater salinization. The revised text now thoroughly addresses the natural processes, geological heterogeneities, and human activities that contribute to the observed salinity patterns in the study area. These additions provide a clearer understanding of the mechanisms driving salinization variability and enhance the overall interpretation of our results.
(7) The conclusion needs to be appropriately reduced and some quantitative information added.
- Response: We sincerely thank the reviewer for the valuable feedback. We have carefully considered all remarks and have entirely revised and fully rewritten the Conclusion section (Section 7). The revised conclusion is now more concise and includes relevant quantitative information to clearly summarize the key findings and model performance metrics.
Reviewer 3 Report
Comments and Suggestions for Authors
Dear authors!
Your work lacks description of the initial factual data - wells and hydrogeochemical analyses. I have the impression that your work has no relation to the problems of groundwater science. I recommend that you submit this work to another journal specializing in problems of new methods of processing experimental data.
Comments for author File: Comments.pdf
Author Response
Comments:
1- Your work lacks description of the initial factual data - wells and hydrogeochemical analyses. I have the impression that your work has no relation to the problems of groundwater science. I recommend that you submit this work to another journal specializing in problems of new methods of processing experimental data.
- Response: Thank you very much for your careful review and constructive feedback. We appreciate your concern regarding the description of the initial factual data, including wells and hydrogeochemical analyses. We would like to clarify that our study is firmly grounded in the hydrogeological context of the region and addresses key groundwater science issues, particularly the sustainability challenges faced by the multilayer aquifer systems under increasing exploitation and salinization.
The hydrogeological framework and flow dynamics of the aquifers, including recharge zones, flow paths, and salinization mechanisms, are described in detail with relevant references to prior foundational studies ([45,46]). Our work focuses on understanding the spatial variability and controlling factors of groundwater quality deterioration in East Kebili, which is critical for managing the water resource sustainably.
We acknowledge that additional clarity regarding the data sources, well locations, sampling campaigns, and hydrogeochemical analysis methods would strengthen the manuscript. Therefore, we have expanded the description of the initial dataset and analytical procedures in the revised version to better link the data with the scientific issues discussed.
We also emphasize that the integration of hydrogeochemical data interpretation with groundwater flow dynamics in this study directly relates to pressing groundwater science problems, such as salinization drivers, recharge limitations, and resource sustainability, rather than being solely a methodological exercise in data processing.
We hope these revisions and clarifications address your concerns and demonstrate the relevance of our work to the field of groundwater science.
2- Unfortunately, in addition to a general hydrogeological description of the area (Section 2), the work does not contain explicit data on wells and the results of their hydrogeochemical testing. This makes it impossible to verify the authors' conclusions using other independent methods.
- Response: Thank you for your valuable observation. We acknowledge the importance of providing explicit details on the wells and the hydrogeochemical data to ensure transparency and allow independent verification of our conclusions. In response, we have expanded the manuscript to include a comprehensive description of the well network—covering locations, depths, screened intervals, and aquifer systems targeted—as well as the hydrogeochemical testing results, including concentrations of major ions, physicochemical parameters, and quality control measures.
Specifically, a new subsection has been added in Subsection 2.2 (Data collection) presenting detailed well information and summarizing the analytical results. Additionally, summary tables and relevant hydrochemical statistics have been included in the Results section (Section 5). These additions provide the necessary factual basis for reproducibility and reinforce the validity of the interpretations and conclusions presented in the manuscript.
We believe these improvements will address your concerns and enhance the scientific rigor of the work.
3- Figure 1 shows the location of the wells where water was sampled for chemical analysis. Wells are located only in the southern part of the region. Figures 2 and 3 show the distribution of mineralization and 11 hydrogeochemical indicators across the entire region. How this could have been done without having data for the entire territory is absolutely unclear and raises doubts about the results of this work.
- Response: Thank you for your insightful comment. We acknowledge that the northern part of the study area remains largely unsampled due to the absence of accessible wells and strict regulatory prohibitions on new drilling within these sensitive desert zones. This limitation is a direct consequence of environmental protection policies and the naturally scarce groundwater resources in these remote areas.
The well network, therefore, is concentrated in the southern oasis zones where groundwater extraction is feasible and actively practiced. Despite this spatial constraint, the dataset provides critical hydrogeochemical information representative of the key groundwater sources sustaining local agriculture and settlements.
The primary objective of the proposed methodology is to address precisely this challenge: to enable reliable prediction and monitoring of aquifer salinization under conditions of limited and spatially clustered data. By integrating advanced artificial intelligence and machine learning techniques, our approach leverages the available dataset to extrapolate and model salinity trends beyond sampled points, thereby providing a valuable predictive tool for groundwater management across the entire region, including unsampled areas.
We have clarified these points in the revised manuscript (subsection 2.2: Data collection, and Section 6: Discussion) to better communicate the data limitations and the role of our methodology in overcoming these inherent constraints, ensuring that the interpretations and predictions are both scientifically grounded and practically applicable.
4- Only at the end of the article in section «Data Availability Statement» does it say that “data are available from the Regional Commissary for Agricultural Development Kebili – CRDA Kebili (agricultural map)”. This means that the authors did not obtain any new data on the hydrogeochemistry of groundwater in the area.
- Response: Thank you for your comment. We would like to clarify that the study area is located within a restricted zone subject to strict regulatory prohibitions on new drilling and groundwater exploitation, due to its fragile desert ecosystem and environmental protection policies. Consequently, the positions of existing wells and the extent of the aquifers are officially provided by the Regional Commissary for Agricultural Development Kebili (CRDA Kebili), which manages groundwater resources and monitoring activities in the region.
While the well locations and baseline maps are sourced from CRDA Kebili, the hydrogeochemical data presented in this study were obtained through an independent and comprehensive groundwater sampling campaign conducted by our team between 2022 and 2024. This campaign involved direct field sampling, in-situ measurements, and laboratory analyses performed under standardized protocols to generate new, high-quality data on groundwater chemistry. These original data form the core of the study’s analyses and interpretations.
We have now clarified this distinction in the revised manuscript to avoid any misunderstanding regarding the novelty and provenance of the hydrogeochemical dataset.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
Accept
Author Response
Dear Reviewer,
Thank you for your valuable suggestions. We have carefully improved the figures as recommended to enhance their clarity and quality.
We appreciate your constructive feedback and support.
Best regards,
The Authors
Reviewer 3 Report
Comments and Suggestions for Authors
I recommend providing chemical analysis data for all groundwater sampling points, indicating coordinates, sampling depth, and aquifer.
Comments for author File: Comments.pdf
Author Response
Comments:
* The authors in the revised manuscript still do not pay due attention to the original data.
For example, Figure 2 – it is not clear, in which aquifer(s) is it about the TDS values?
- Response: We thank the reviewer for pointing out the need for clarification. In the study area, the Plio-Quaternary and Complex Terminal (CT) aquifers are hydraulically connected, forming a single multilayered aquifer. In the Chott Djerid region specifically, the Mio-Plio-Quaternary formations are not differentiated in practice, and in most cases they are considered as part of a multilayered aquifer system hydraulically attached to the CT aquifer. All water samples used in this study were collected from wells tapping into this multilayered system; therefore, the TDS values presented in Figure 2 represent the integrated water quality of the combined Plio-Quaternary–CT aquifer. This clarification has been added to the hydrogeological framework section (subsection 2.1. Study area) and is highlighted in the text for ease of reference.
* A comparison of Figure 2 (TDS distribution), Figure 6 (distribution of concentration values of individual components, TDS, pH) and Figure 1 (sampling scheme) raises confusion as to how the values of TDS and other parameters were determined in the northern region, where samples were not taken (?).
- Response: Thank you for this valuable comment. We acknowledge the potential confusion regarding the representation of TDS and other parameter values in the northern region where no direct samples were collected. To clarify, the values shown in Figures 2 and 6 for this area are predicted using the proposed AI and machine learning approach, which is specifically designed to estimate groundwater salinization in zones with limited data availability. This approach uses the relationships learned from the sampled data to provide spatial predictions across the entire study area, including unsampled regions.
The machine learning framework serves as an advanced and more accurate tool for interpolation and spatial prediction compared to traditional methods, enabling us to overcome the challenges posed by data-poor zones. We have added clarifications in the manuscript to clearly distinguish measured data from predicted outputs, highlighting the model’s capability to fill data gaps and support comprehensive groundwater quality assessment.
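One simple way to make the distinction between measured and predicted values explicit on a map is to label each grid node by its distance to the nearest sampled well. The sketch below illustrates that idea only; it is not the procedure used in the manuscript. The coordinates, the 5 km support radius, and the function names `nearest_well_distance` and `label_nodes` are assumptions chosen for the example.

```python
from math import hypot


def nearest_well_distance(node, wells):
    """Distance from a grid node to the closest sampled well (same units as the input)."""
    return min(hypot(node[0] - wx, node[1] - wy) for wx, wy in wells)


def label_nodes(grid, wells, max_support_km):
    """Label each node as measured, predicted (near data), or extrapolated (far from data)."""
    labels = {}
    for node in grid:
        distance = nearest_well_distance(node, wells)
        if distance == 0.0:
            labels[node] = "measured"
        elif distance <= max_support_km:
            labels[node] = "predicted"
        else:
            labels[node] = "extrapolated"
    return labels


# Hypothetical planar coordinates in km; NOT the actual well network.
wells = [(2.0, 1.0), (5.0, 1.5), (8.0, 2.0)]
grid = [(2.0, 1.0), (5.0, 3.0), (5.0, 20.0)]

labels = label_nodes(grid, wells, max_support_km=5.0)
for node, label in labels.items():
    print(node, label)
```

Nodes labelled "extrapolated" lie outside the spatial support of the samples, so any value mapped there depends entirely on the model's assumptions and would normally be shown with a separate symbol or hatch pattern.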
* Section 2 does not contain any description of the locations where water samples were taken (depth, water level, aquifer), or any materials on the systematization of these samples by aquifers and by water sampling depth.
- Response: In response to the reviewer’s comment, a new table (Table 1) has been added in subsection 2.2 (Data Collection) providing the coordinates, depth, and aquifer information for each sampling location, along with the corresponding hydrochemical parameters. This table also allows a clear systematization of the samples by aquifer and sampling depth.
* The work is poorly systematized. Sections devoted to the description of the area and data are not linked to sections on data processing methods.
- Response: Thank you for your valuable feedback. We acknowledge the importance of clear and logical connections between the description of the study area, data, and the data processing methodology. In the revised manuscript, we have strengthened the integration between these sections by explicitly linking the characteristics of the study area and the nature of the collected data to the subsequent stages of the data processing and modeling framework.
Specifically, we have clarified how the hydrogeological and geochemical features of the area informed the selection of input variables, how the limited and spatially variable dataset influenced the choice of machine learning algorithms and preprocessing steps, and how these aspects are reflected in the model evaluation and interpretation of results.
These improvements enhance the coherence and flow of the manuscript, providing the reader with a clear understanding of how the field data and site-specific context underpin the modeling approach and outcomes.
* Section 3 provides a general description of the data processing methods, which consist of several stages. It is not clear what data are input and output at each stage. Section 4 also does not provide an idea of what tasks were solved at each stage, what set of initial data was used, and what results were obtained at each stage.
- Response: Thank you for your valuable feedback. We have revised Sections 3 and 4 to clearly specify the inputs and outputs at each stage of the data processing workflow. For each modeling stage, we now explicitly describe the dataset and variables used as input, the tasks and methods applied, and the corresponding outputs or results obtained. This includes detailed explanations of feature selection, model training, hyperparameter tuning, and performance evaluation. These additions aim to provide a clearer understanding of the methodological framework and the progression of the analysis. Please refer to the updated Sections 3 and 4 for these clarifications.
* Figure 4: The font size on the graph axes is small, making it impossible to read the text and numbers.
- Response: The font size of the axes labels and numbers in Figure 4 has been increased to ensure readability, as suggested.
* Section 5 (Results) A formal statistical description of the concentration values of individual components of the chemical composition of water, TDS and pH (Table 1) and their spatial distribution (Figure 6) is presented.
- Response: We thank the reviewer for the observation. The authors have checked Section 5 (Results) and confirm that it provides a formal statistical description of the concentration values of individual chemical components, TDS, and pH, as presented in Table 2 and illustrated by their spatial distribution in Figure 6. Additionally, we have added Table 1 in subsection 2.2 (Data Collection) to provide more detailed information on the hydrogeological characteristics and chemical analyses of the groundwater samples.
* Missing Figure 7 (?).
- Response: We thank the reviewer for pointing this out. The omission of Figure 7 in the revised version was an unintentional oversight during the preparation of the manuscript. In the revised submission, we have now included Figure 7 in the correct order and ensured that all figures and in-text references are consistent.
* Conclusion:
With a small amount of data (41 groundwater samples with unclear reference to depth and aquifers), the construction of predictive salinity distribution maps (presumably TDS?) is unconvincing.
This material needs to be substantially revised, paying closer attention to the original data.
I recommend providing chemical analysis data for all groundwater sampling points, indicating coordinates, sampling depth, and aquifer.
- Response: We sincerely thank you for your insightful comments and recommendations. In response, we have substantially revised the manuscript to clarify the nature and limitations of the dataset, explicitly referencing the 41 groundwater samples along with their depth and aquifer information.
Additionally, we have provided a detailed table including the chemical analysis data for all groundwater sampling points, with coordinates, sampling depths, and aquifer identification, as requested. This information has been incorporated to enhance transparency and strengthen the interpretation of our predictive salinity distribution maps.
We believe these revisions address your concerns and improve the overall clarity and robustness of the study.
Thank you again for your constructive feedback.
Author Response File: Author Response.pdf