Article
Peer-Review Record

Advancements in Geohazard Investigations: Developing a Machine Learning Framework for the Prediction of Vents at Volcanic Fields Using Magnetic Data

Geosciences 2024, 14(12), 328; https://doi.org/10.3390/geosciences14120328
by Murad Abdulfarraj 1,2, Ema Abraham 3, Faisal Alqahtani 1,2 and Essam Aboud 1,*
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 24 September 2024 / Revised: 26 November 2024 / Accepted: 27 November 2024 / Published: 3 December 2024
(This article belongs to the Section Geophysics)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper deals with the modeling of statistical relationships between vent location and geomagnetic data in the youngest volcanic field of the Arabian Peninsula, the RVF. The authors suggest that a Random Forest approach best reproduces the data and can eventually suggest probable vents not recognized at the moment.

The paper is well written and organized, but the dataset is not fully described and the scientific approach is not justified; therefore I cannot recommend this paper for publication. I strongly suggest adding a more critical evaluation of the dataset.

I list my main concerns in detail below; some further annotations are in the article file:

- It is not clear to me which are the exact data used for the regression. Is that lat, long, magnetic anomaly, and yes/no for vent location?

- If so, how could a logistic regression (which is a bivariate distribution) possibly be used?

- Current magnetic anomalies reflect the distribution of magnetic and non-magnetic bodies AFTER the eruptions that occurred through the vents; without any modeling of the sources, the study is potentially biased, as the anomalies are actually created by the eruptions and their effects are superposed on one another.

- The geological map presented does NOT coincide with the study area; there is no information on how many of the vents considered as single, independent structures are in fact vent clusters associated with a single eruptive event (for example, with the same fissure), which introduces a significant bias.

- There is no comparison with any other vent prediction study, and no critical evaluation of the methods or of the reliability of the study.

In conclusion, I believe that the authors did model the current distribution of vents in the field with respect to the magnetic anomaly in the ground and evaluated the accuracy of the model, but the reliability of the model as a predictor of further vent locations is neither tested nor justified. The 'blind' use of AI procedures to model natural cases (i.e. with no scientific basis, which in this case would be a critical evaluation of the distribution of eruptions, of the vent systems of single eruptive events, and a model of the magmatic sources of the magnetic anomalies) does not per se provide significant insights into the dynamics of the processes controlling the dataset and is not necessarily able to predict other structures.

I strongly suggest that the authors reevaluate the dataset and approach and eventually add volcanological expertise to the team, which I guess is lacking at the moment, also given some unusual terminology adopted (e.g. 'volcanic deposition activities', 'dome cones').

 

Comments for author File: Comments.pdf

Author Response

Author’s Response to Reviewer 1 Report 2

 

Title – Advancements in Geohazard Investigation: Developing Machine Learning Framework for the Prediction of Vents at Volcanic Fields Using Magnetic Data

Reviewer’s Reports:

Reviewer comments (1): I appreciated the authors' response to my comments, which is very technical but sometimes misses the point; I might not have been clear enough. I see no significant differences in the new manuscript version. I'll restate my two major criticisms (which could both be addressed without much work).

- Regarding my question on the regression: If you have three variables plus vent location (lat, long, magnetic anomaly and yes/no for vent), how can they be fitted to a two-variable statistical distribution? The input data should be clearly listed and explained in the method section.

 

Authors Response:

Dear respected Reviewer,

Thank you for your follow-up question. We apologize for any confusion. In our regression model, we used longitude, latitude, and magnetic anomaly as the three predictor variables, while vent presence (yes/no) was the target variable. This setup allows the model to learn a relationship between the spatial distribution of magnetic anomalies and the occurrence of volcanic vents.

You are correct in noting that fitting these three variables into a two-variable statistical distribution would not be appropriate. However, our approach does not rely on fitting to a bivariate distribution. Instead, we employed the Random Forest regression method, which can handle multiple predictor variables simultaneously without requiring them to conform to a traditional two-variable distribution.

We now list and explain the input data and the rationale behind our model choice in the Methodology section to avoid any ambiguity (as shown below, also added at Page 10 of 20). Thank you for highlighting the need for this clarification.

In our study, we used the following input data for the machine learning model:

  1. Longitude: The geographic longitude of each data point, which provides spatial information relevant to the location of magnetic anomalies and volcanic vents.
  2. Latitude: The geographic latitude of each data point, paired with longitude to give a complete spatial reference for the dataset.
  3. Magnetic Anomaly: The measured magnetic field intensity at each data point. This value is critical for identifying subsurface geological features that may be associated with volcanic activity.
  4. Vent Presence (Yes/No): A binary indicator that specifies whether a volcanic vent is present at the corresponding location. This variable serves as the target variable for the regression model, allowing us to train the model to recognize patterns associated with vent locations.

Using these variables, we structured the model to learn how spatial variations in magnetic anomaly data relate to the presence of volcanic vents across the study area.
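The four quantities listed above can be arranged as a feature matrix and a target vector. The following is a minimal sketch of that arrangement on a small synthetic grid, not the authors' code: the function name `build_dataset`, the grid extent, the anomaly values, and the vent position are all hypothetical and chosen only for illustration.

```python
import numpy as np


def build_dataset(lon, lat, anomaly, vent_mask):
    """Flatten gridded inputs into (X, y) for a supervised model.

    lon, lat  : 1-D arrays of grid coordinates (degrees)
    anomaly   : 2-D array, shape (len(lat), len(lon)), magnetic anomaly
    vent_mask : 2-D boolean array, True where a vent is mapped
    """
    # meshgrid returns two arrays shaped (len(lat), len(lon))
    lon_grid, lat_grid = np.meshgrid(lon, lat)
    # One row per mesh node: [longitude, latitude, magnetic anomaly]
    X = np.column_stack([lon_grid.ravel(), lat_grid.ravel(), anomaly.ravel()])
    # Binary target: 1 = vent present, 0 = no vent
    y = vent_mask.ravel().astype(int)
    return X, y


# Synthetic example (all values are placeholders, not survey data)
rng = np.random.default_rng(0)
lon = np.linspace(39.0, 40.0, 5)
lat = np.linspace(24.0, 25.0, 4)
anomaly = rng.normal(0.0, 50.0, size=(lat.size, lon.size))
vent_mask = np.zeros((lat.size, lon.size), dtype=bool)
vent_mask[1, 2] = True  # one hypothetical vent node

X, y = build_dataset(lon, lat, anomaly, vent_mask)
print(X.shape, int(y.sum()))
```

Arrays of this shape can be passed to any classifier that accepts several predictors at once, which is the point the response makes about not needing a two-variable distribution.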

Reviewer comments (2): In the conclusion the authors state: 'This study not only demonstrates the efficacy of integrating machine learning with geophysical data for volcanic vent prediction but also provides a valuable framework for identifying potential areas for further investigation or monitoring volcanic activity.'

I see no demonstration of the efficacy of vent prediction. How was it done? How could the authors prove that the model correctly predicts unseen vents? I understand this is an unsolvable point unless proper field surveys are done. I suggest changing the authors' view on the results by proposing that their work shows a promising method, whose efficacy should be further explored with direct surveys or in other areas.

Authors Response: Thank you for highlighting this point. We agree that the ultimate demonstration of the model's efficacy in predicting unseen vents would require validation through field surveys and real-world testing. However, in our study, we assessed the model's performance using statistical metrics derived from test data that were withheld from the model during training. These metrics, such as accuracy, precision, and recall, indicate the model's ability to identify vent locations based on patterns in the magnetic anomaly data.

While these statistical evaluations provide some confidence in the model's predictive capability, we recognize that they do not constitute proof of correctly predicting new, undiscovered vents in the field. To address this limitation, we propose that future work involve targeted field surveys in regions where the model has predicted potential vent locations. Such surveys would provide the necessary ground truth to verify the model's predictions and improve the reliability of vent prediction methods. 

We have revised our conclusion to emphasize the need for field validation (Page 18 of 20).

Reviewer comments (3): I still believe that the map in Figure 2 must match the investigated area. It is completely off and I see no use for it. Why is that?

Authors Response: Figure 2 now conforms to the investigated area (Page 5 of 21). Thank you.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Review for "Advancements in Geohazard Investigation: Developing Machine Learning Framework for the Prediction of Vents at Volcanic Fields Using Magnetic Data" by Abdulfarraj et al.

 

In this study, machine learning was applied to predict the volcanic vent locations using aeromagnetic data. Known vent locations and magnetic field intensity at each longitude and latitude were used for supervised machine learning. Five machine learning algorithms (logistic regression, decision tree, random forest, gradient boosting, and support vector machine) were tested. The study showed that the random forest was best according to its accuracy, precision, recall, F1-score, receiver operating characteristic curve, and area under the curve. The predicted vent locations correlated well with the observed locations, and several unknown vent locations were predicted. The authors speculated that these vents might have been obscured by eruptive deposits or several other processes.

I think that this study is worthy of publication, given its relatively new approach and significance for hazard assessment. However, before publication, the manuscript needs to be improved to describe the procedures used in this study more precisely. Most descriptions in the methodology section are too generalized and appear to be similar to a textbook of machine learning. The most essential information is how the authors applied these well-known machine learning techniques to the data specific to this study, that is, the vent locations and magnetic field data. The details of this application are unclear in the current version of the manuscript. Therefore, please consider addressing my major comments below to describe the methodology better.

 

Major comments:

(1) Exactly what quantities did you use for the magnetic features ("x" in Eqs. 1-6)? Were the longitude, latitude, and magnetic field intensity at each mesh node directly used for "x"? Probably, the most widely used machine learning technique is a convolutional neural network, in which spatial filters are applied to emphasize the correlations between the data intensity at each pixel and adjacent pixels. In older machine learning, various manually considered features (e.g., the average, standard deviation, and peak frequency of the spectrum) were used; for example, see Curilem et al. (2009) and Hibert et al. (2017). Did you define or calculate features in a similar manner, or just use the longitude, latitude, and magnetic field intensity at each pixel directly? Please clarify.

(2) If the longitude, latitude, and magnetic field intensity of each pixel were used directly, the machine might learn the locations of the vents rather than the spatial pattern of the magnetic field. How did you avoid this scenario? For example, in a study on the automatic identification of P- and S-wave arrivals in seismograms, Zhu and Beroza (2018) used randomly selected time windows to prevent the machine from learning the arrival time values. Clarify the techniques used in this study to prevent the machine from learning the coordinates of vent locations.

(3) Each vent has a finite size. Did you approximate the vent as a point? If two or more mesh nodes of the magnetic field data were in a finite size vent, how were they treated?

(4) Did you use all vent locations and magnetic field data to train the model? It is common to use some of the data (randomly selected) for training and the others to evaluate the model performance. If this approach was used, please clarify the distinction between the data for training and evaluation in Fig. 9.

(5) If all data were used for training, what is the difference between the newly detected vent locations and false positives? Both are locations where the value of the target variable of the supervising data is 0 (no vent) but the model predicts that there is a vent. In other words, if the machine learning model achieved 100% accuracy, would the result be no prediction of unknown vent locations?

 

Minor comments:

(6) Line 167. "the thickness of lava flows in the research area ranged from 100 m above sea level ...". The subject is the thickness, and the value is the elevation. Rewrite the sentence to make them consistent.

(7) Cite Figure 4 in the manuscript.

(8) Add a right parenthesis in Eq. (6).

(9) Add color bars in Figures 10 and 11. Indicate the bin size used to calculate the vent density in Figure 10.

(10) How was the correlation in Figure 11 calculated? If all data in Figure 10 were used, the result would be only one value for the correlation coefficient. I guess that a moving window was used to derive the correlation as a spatial function. Is this understanding correct? Please clarify.

(11) The correlation coefficient is given as a function of location (Figure 11). What does the correlation coefficient of 1 (lines 405 and 473) point to? The maximum value? Please clarify.

 

Curilem et al. (2009), Classification of seismic signals at Villarrica volcano (Chile) using neural networks and genetic algorithms, J. Volcanol. Geotherm. Res., 180, 1-8, https://doi.org/10.1016/j.jvolgeores.2008.12.002

Hibert et al. (2017), Automatic identification of rockfalls and volcano-tectonic earthquakes at the Piton de la Fournaise volcano using a Random Forest algorithm, J. Volcanol. Geotherm. Res., 340, 130-142, https://doi.org/10.1016/j.jvolgeores.2017.04.015

Zhu and Beroza (2018), PhaseNet: a deep-neural-network-based seismic arrival-time picking method, Geophys. J. Int, 216, 261-273, https://doi.org/10.1093/gji/ggy423

Author Response

Author’s Response to Reviewer 2 Report

 

Title – Advancements in Geohazard Investigation: Developing Machine Learning Framework for the Prediction of Vents at Volcanic Fields Using Magnetic Data

Reviewer’s Reports:

Reviewer comments: In this study, machine learning was applied to predict the volcanic vent locations using aeromagnetic data. Known vent locations and magnetic field intensity at each longitude and latitude were used for supervised machine learning. Five machine learning algorithms (logistic regression, decision tree, random forest, gradient boosting, and support vector machine) were tested. The study showed that the random forest was best according to its accuracy, precision, recall, F1-score, receiver operating characteristic curve, and area under the curve. The predicted vent locations correlated well with the observed locations, and several unknown vent locations were predicted. The authors speculated that these vents might have been obscured by eruptive deposits or several other processes.

 

I think that this study is worthy of publication, given its relatively new approach and significance for hazard assessment. However, before publication, the manuscript needs to be improved to describe the procedures used in this study more precisely. Most descriptions in the methodology section are too generalized and appear to be similar to a textbook of machine learning. The most essential information is how the authors applied these well-known machine learning techniques to the data specific to this study, that is, the vent locations and magnetic field data. The details of this application are unclear in the current version of the manuscript. Therefore, please consider addressing my major comments below to describe the methodology better.

 

Authors Response:

Dear respected Reviewer,

Thank you very much for the comments, suggestions and corrections you offered in your review. They were quite insightful and helpful. Additional clarifications and corrections (following the review) have been added to provide more information on the procedure used to better describe the methodology.

 

Reviewer comments (1): Exactly what quantities did you use for the magnetic features ("x" in Eqs. 1-6)? Were the longitude, latitude, and magnetic field intensity at each mesh node directly used for "x"? Probably, the most widely used machine learning technique is a convolutional neural network, in which spatial filters are applied to emphasize the correlations between the data intensity at each pixel and adjacent pixels. In older machine learning, various manually considered features (e.g., the average, standard deviation, and peak frequency of the spectrum) were used; for example, see Curilem et al. (2009) and Hibert et al. (2017). Did you define or calculate features in a similar manner, or just use the longitude, latitude, and magnetic field intensity at each pixel directly? Please clarify.

Authors Response: Thank you for your detailed question. In this study, we used the longitude, latitude, and magnetic field intensity at each mesh node as our primary features (denoted as x in Eqs. 1-6). These values were selected to capture the spatial and intensity information of magnetic anomalies, which we hypothesized would correlate with volcanic vent locations.

While we did not calculate additional derived features—such as the average, standard deviation, or spectral features of the magnetic intensity—this is an area we are actively exploring for future iterations. Incorporating these features, and experimenting with feature engineering techniques as in Curilem et al. (2009) and Hibert et al. (2017), will help refine our model’s sensitivity to specific geophysical patterns.

Regarding model selection, we chose Random Forest regression for its effectiveness in handling non-linear relationships with limited tuning requirements. However, we recognize that applying a convolutional neural network (CNN) could allow us to capture more localized spatial correlations, especially given the gridded nature of our data. We plan to evaluate CNNs in future work to determine if their spatial filtering capabilities can improve our model’s accuracy for vent prediction.

Reviewer comments (2): If the longitude, latitude, and magnetic field intensity of each pixel were used directly, the machine might learn the locations of the vents rather than the spatial pattern of the magnetic field. How did you avoid this scenario? For example, in a study on the automatic identification of P- and S-wave arrivals in seismograms, Zhu and Beroza (2018) used randomly selected time windows to prevent the machine from learning the arrival time values. Clarify the techniques used in this study to prevent the machine from learning the coordinates of vent locations.

Authors Response: Thank you for this question. To ensure that the model learned spatial patterns in the magnetic field data rather than simply memorizing the coordinates of known vents, we implemented several techniques. First, we applied a stratified k-fold cross-validation approach, ensuring that each fold included a diverse mix of vent and non-vent locations spread across different regions. This forced the model to generalize from patterns in the magnetic anomalies rather than focusing on specific vent coordinates. Additionally, we experimented with data augmentation by creating artificial spatial offsets in both the longitude and latitude. This technique introduced minor shifts to the magnetic field data while preserving underlying patterns, helping the model focus on features associated with vent presence rather than absolute positions. For future iterations, we are considering using randomized sampling techniques similar to those in Zhu and Beroza (2018), which will further help prevent location-based learning. These adjustments should allow our model to develop a more robust understanding of the magnetic anomaly patterns indicative of potential vent locations.
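The two techniques named in this response can be sketched as follows. This is only one possible reading, written with NumPy alone: `stratified_folds` and `jitter_coordinates` are hypothetical helper names, the Gaussian form of the offsets and the 0.001-degree scale are assumptions, and the record does not state how the authors implemented either step.

```python
import numpy as np


def stratified_folds(y, n_splits, seed=0):
    """Assign each sample a fold index so every fold keeps the class ratio."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    fold = np.empty(y.size, dtype=int)
    for cls in np.unique(y):
        # Shuffle the members of this class, then deal them out round-robin
        idx = rng.permutation(np.flatnonzero(y == cls))
        fold[idx] = np.arange(idx.size) % n_splits
    return fold


def jitter_coordinates(X, scale_deg, seed=0):
    """Return a copy of X with small random offsets added to lon/lat (columns 0, 1)."""
    rng = np.random.default_rng(seed)
    X_aug = np.array(X, dtype=float, copy=True)
    X_aug[:, :2] += rng.normal(0.0, scale_deg, size=(X_aug.shape[0], 2))
    return X_aug


# Placeholder data: 10 vent nodes and 90 non-vent nodes
y = np.array([1] * 10 + [0] * 90)
X = np.column_stack([
    np.linspace(39.0, 40.0, 100),  # longitude (hypothetical)
    np.linspace(24.0, 25.0, 100),  # latitude (hypothetical)
    np.zeros(100),                 # magnetic anomaly placeholder
])

fold = stratified_folds(y, n_splits=5)
X_aug = jitter_coordinates(X, scale_deg=0.001)
print(np.bincount(fold[y == 1], minlength=5))
```

Stratification of this kind balances the vent and non-vent proportions in each fold; it does not by itself keep neighbouring nodes in the same fold.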

Reviewer comments (3): Each vent has a finite size. Did you approximate the vent as a point? If two or more mesh nodes of the magnetic field data were in a finite size vent, how were they treated?

Authors Response: Thank you for raising this point regarding vent size. For this study, we approximated each vent as a central point to simplify the modeling process, given that the exact vent boundaries were not always clearly defined in the dataset. When multiple mesh nodes fell within the extent of a vent, we assigned them the same target label to indicate vent presence, treating each as representative of the larger vent structure. This approach allowed the model to recognize the spatial extent of vents without requiring detailed boundary information. However, we acknowledge that this approximation could introduce some limitations. In ongoing research on the region, we plan to refine our model by incorporating vent size as a parameter, either by defining a spatial radius around the central point or by developing a proximity-based weighting to capture more nuanced spatial relationships within larger vents.
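The labelling rule described here (every mesh node inside a vent's extent receives the vent label) might look like the sketch below. It assumes the extent is a circle of fixed radius around the vent centre and that coordinates are in a projected system in metres; both are assumptions for illustration, and `label_nodes` is a hypothetical name.

```python
import numpy as np


def label_nodes(node_xy, vent_xy, radius):
    """Label a node 1 if it lies within `radius` of any vent centre, else 0."""
    node_xy = np.asarray(node_xy, dtype=float)   # shape (n_nodes, 2)
    vent_xy = np.asarray(vent_xy, dtype=float)   # shape (n_vents, 2)
    # Pairwise Euclidean distances, shape (n_nodes, n_vents)
    d = np.linalg.norm(node_xy[:, None, :] - vent_xy[None, :, :], axis=2)
    return (d.min(axis=1) <= radius).astype(int)


# 3 x 3 mesh with 100 m spacing and one vent at the central node
xs, ys = np.meshgrid(np.arange(0.0, 300.0, 100.0), np.arange(0.0, 300.0, 100.0))
nodes = np.column_stack([xs.ravel(), ys.ravel()])
labels = label_nodes(nodes, vent_xy=[[100.0, 100.0]], radius=120.0)
print(labels.reshape(3, 3))
```

With a 120 m radius the central node and its four edge neighbours are labelled as vent, while the diagonal neighbours (about 141 m away) are not, so the choice of radius directly changes how many positive samples a single vent contributes.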

Reviewer comments (4): Did you use all vent locations and magnetic field data to train the model? It is common to use some of the data (randomly selected) for training and the others to evaluate the model performance. If this approach was used, please clarify the distinction between the data for training and evaluation in Fig. 9.

Authors Response: Thank you for your question. Yes, we used a portion of the vent locations and magnetic field data for training and reserved a separate set for model evaluation to assess its performance reliably. Specifically, we applied an 80-20 split, where 80% of the data was randomly selected for training, and the remaining 20% was used exclusively for testing.

For Fig. 9, we visualize both the predicted vent locations from the model and the actual vent locations within the test set, allowing us to assess how well the model generalizes to unseen data. By ensuring a clear separation between training and test data, we endeavored to validate the model’s predictive capability without bias from data leakage. We hope this clarification on our data split process and evaluation approach addresses your concerns.
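A random 80-20 split of the kind described can be sketched as below, assuming a simple shuffle of sample indices; the helper name `train_test_indices` and the seed are hypothetical, and the authors' actual splitting routine is not given in this record.

```python
import numpy as np


def train_test_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and return disjoint (train, test) index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return order[n_test:], order[:n_test]


train_idx, test_idx = train_test_indices(100)
print(train_idx.size, test_idx.size)
```

Because the two index arrays never overlap, any metric computed on `test_idx` uses only nodes the model did not see during fitting.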

Reviewer comments (5): If all data were used for training, what is the difference between the newly detected vent locations and false positives? Both are locations where the value of the target variable of the supervising data is 0 (no vent) but the model predicts that there is a vent. In other words, if the machine learning model achieved 100% accuracy, would the result be no prediction of unknown vent locations?

Authors Response: Thank you for this insightful question. In our approach, we did indeed reserve a portion of the data for testing to evaluate model generalization. However, if the entire dataset were used solely for training, as you pointed out, the model could achieve high accuracy but lack true predictive power, essentially memorizing known data without providing insights on potential new vent locations. To differentiate predicted vent locations from false positives, we used an independent test set where vent presence was unknown to the model during training. In this setup, 'newly detected' vents refer to locations where the model consistently identifies vent-like magnetic patterns, even if they fall outside known vent locations. By evaluating these predictions on the test set, we minimize the risk of confusing potential undiscovered vents with mere classification errors.

In an ideal scenario, as model accuracy improves, we expect it to identify locations with magnetic characteristics resembling vents while maintaining accuracy on known data. Our goal is for the model to generalize these spatial patterns effectively, allowing for meaningful predictions in areas where vent presence has not been previously recorded.
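The reviewer's question can be made concrete with a small confusion-count sketch. The labels and predictions below are invented placeholders, and the function names are hypothetical; the point is only that a node labelled 0 but predicted 1 enters the metrics as a false positive, whatever interpretation is later attached to it.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = vent)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    return {
        "tp": int(np.sum(y_true & y_pred)),
        "fp": int(np.sum(~y_true & y_pred)),
        "fn": int(np.sum(y_true & ~y_pred)),
        "tn": int(np.sum(~y_true & ~y_pred)),
    }


def summary_metrics(c):
    """Accuracy, precision, and recall from a dict of confusion counts."""
    total = c["tp"] + c["fp"] + c["fn"] + c["tn"]
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Placeholder test-set labels and predictions
y_true = np.array([1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0])

counts = confusion_counts(y_true, y_pred)
metrics = summary_metrics(counts)
# Nodes mapped as "no vent" but predicted as vent: the same set that is
# counted as false positives and that would be read as candidate vents.
candidates = np.flatnonzero((y_true == 0) & (y_pred == 1))
print(counts, metrics, candidates)
```

In this toy case the single candidate node is also the single false positive, so a model with perfect accuracy on the mapped labels would return no candidates at all.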

Reviewer comments (6): Line 167. "the thickness of lava flows in the research area ranged from 100 m above sea level ...". The subject is the thickness, and the value is the elevation. Rewrite the sentence to make them consistent.

Authors Response: Done. “The outcomes of their study indicated that the thickness of the lava flows in the research area ranged from 100 m on the eastern and western sides to as much as 300–500 m in the central part.” Page 6 of 20. Thank you.

Reviewer comments (7): Cite Figure 4 in the manuscript.

Authors Response: Done. Page 8 of 20. Thank you.

Reviewer comments (8): Add a right parenthesis in Eq. (6).

Authors Response: Done. Thank you.

Reviewer comments (9): Add color bars in Figures 10 and 11. Indicate the bin size used to calculate the vent density in Figure 10.

Authors Response: Done. Thank you. In addition, since we used Kernel Density Estimation (KDE) for our computations, which does not directly have bins, we used the bandwidth values as a proxy for bin size. The bandwidth essentially controls the smoothness of the KDE curve. We used KDE to visualize the distribution of actual and predicted vent locations. By plotting the KDEs, we could see where the vents are concentrated and compare the patterns between the actual and predicted locations. The bandwidth values calculated give an indication of the level of detail captured in the density estimation. Page 16 of 20.
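The role of the bandwidth can be illustrated with a hand-written two-dimensional Gaussian KDE. This sketch assumes an isotropic Gaussian kernel with a single fixed bandwidth; the record does not say which kernel, bandwidth rule, or library the authors used, and the point coordinates are placeholders.

```python
import numpy as np


def gaussian_kde_2d(points, grid_x, grid_y, bandwidth):
    """Evaluate an isotropic Gaussian KDE of `points` on a regular grid."""
    points = np.asarray(points, dtype=float)          # shape (n_points, 2)
    gx, gy = np.meshgrid(grid_x, grid_y)              # shape (ny, nx)
    dx = gx[..., None] - points[:, 0]                 # shape (ny, nx, n_points)
    dy = gy[..., None] - points[:, 1]
    kernel = np.exp(-(dx**2 + dy**2) / (2.0 * bandwidth**2))
    # Normalise so the density integrates to 1 over the plane
    return kernel.sum(axis=-1) / (points.shape[0] * 2.0 * np.pi * bandwidth**2)


# Three placeholder vent positions in arbitrary map units
vents = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
grid = np.linspace(-6.0, 6.0, 121)
cell_area = (grid[1] - grid[0]) ** 2

narrow = gaussian_kde_2d(vents, grid, grid, bandwidth=0.5)
wide = gaussian_kde_2d(vents, grid, grid, bandwidth=1.0)
print(narrow.sum() * cell_area, narrow.max(), wide.max())
```

A larger bandwidth spreads each vent's contribution over a wider area and lowers the peaks, which is why it plays a role comparable to a bin size when reading the density maps.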

Reviewer comments (10): How was the correlation in Figure 11 calculated? If all data in Figure 10 were used, the result would be only one value for the correlation coefficient. I guess that a moving window was used to derive the correlation as a spatial function. Is this understanding correct? Please clarify.

Authors Response: Thank you for your question. You are correct in suggesting that a moving window approach was used to calculate the correlation as a spatial function in Figure 11. Specifically, we applied a moving window over the magnetic field data and vent locations, calculating the correlation coefficient within each window to capture localized variations in correlation strength across the study area. This allowed us to represent the spatial correlation between magnetic anomalies and vent presence as a continuous function rather than a single, global value. By using this method, we intended to identify regions with strong local correlations that may indicate vent-related magnetic signatures. We hope this clarification helps explain our approach in Figure 11. Page 14 of 20.
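A moving-window correlation between two gridded quantities can be sketched as follows. The square window, the Pearson coefficient, and the window half-width are assumptions made for illustration; the response does not give the window shape or size, and `moving_window_correlation` is a hypothetical name.

```python
import numpy as np


def moving_window_correlation(a, b, half_width):
    """Pearson correlation of two 2-D grids inside a square window around each node."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_rows, n_cols = a.shape
    out = np.full(a.shape, np.nan)
    for i in range(n_rows):
        for j in range(n_cols):
            # Clip the window at the grid edges
            r0, r1 = max(i - half_width, 0), min(i + half_width + 1, n_rows)
            c0, c1 = max(j - half_width, 0), min(j + half_width + 1, n_cols)
            wa = a[r0:r1, c0:c1].ravel()
            wb = b[r0:r1, c0:c1].ravel()
            # Correlation is undefined when either window is constant
            if wa.std() > 0 and wb.std() > 0:
                out[i, j] = np.corrcoef(wa, wb)[0, 1]
    return out


rng = np.random.default_rng(0)
field_a = rng.normal(size=(6, 6))        # placeholder grid
field_b = 2.0 * field_a + 3.0            # exact increasing linear function of field_a
r_same = moving_window_correlation(field_a, field_b, half_width=1)
r_opposite = moving_window_correlation(field_a, -field_a, half_width=1)
print(np.nanmin(r_same), np.nanmax(r_opposite))
```

The result is one coefficient per node rather than a single global value, and windows in which either grid is constant are left undefined.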

Reviewer comments (11): The correlation coefficient is given as a function of location (Figure 11). What does the correlation coefficient of 1 (lines 405 and 473) point to? The maximum value? Please clarify.

Authors Response: Thank you for this question. In Figure 11, the correlation coefficient is indeed presented as a spatial function, where a value of 1 indicates a perfect positive correlation between the magnetic anomalies and vent presence within a specific windowed region. This value reflects the maximum correlation achievable and points to locations where the magnetic data patterns align most closely with the vent locations. In the areas where the correlation coefficient reaches 1, the model suggests a strong likelihood that the magnetic anomalies are associated with volcanic vents, highlighting potential predictive regions.
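For reference, if the windowed coefficient is a Pearson correlation (an assumption, since the record does not name the estimator), then for a window $W$ with magnetic values $m_i$ and vent-related values $v_i$ (presence or density), with $\bar{m}_W$ and $\bar{v}_W$ their means over the window:

```latex
r_W = \frac{\sum_{i \in W} (m_i - \bar{m}_W)(v_i - \bar{v}_W)}
           {\sqrt{\sum_{i \in W} (m_i - \bar{m}_W)^2}\,
            \sqrt{\sum_{i \in W} (v_i - \bar{v}_W)^2}},
\qquad -1 \le r_W \le 1 .
```

Under this definition $r_W = 1$ is reached only when the two quantities are related by an exactly increasing linear function inside the window, so a value of 1 is the upper bound of the coefficient.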

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I appreciated the authors' response to my comments, which is very technical but sometimes misses the point; I might not have been clear enough. I see no significant differences in the new manuscript version. I'll restate my two major criticisms (which could both be addressed without much work).

- Regarding my question on the regression: If you have three variables plus vent location (lat, long, magnetic anomaly and yes/no for vent), how can they be fitted to a two-variable statistical distribution? The input data should be clearly listed and explained in the method section.

- In the conclusion the authors state: 'This study not only demonstrates the efficacy of integrating machine learning with geophysical data for volcanic vent prediction but also provides a valuable framework for identifying potential areas for further investigation or monitoring volcanic activity.'

I see no demonstration of the efficacy of vent prediction. How was it done? How could the authors prove that the model correctly predicts unseen vents? I understand this is an unsolvable point unless proper field surveys are done. I suggest changing the authors' view on the results by proposing that their work shows a promising method, whose efficacy should be further explored with direct surveys or in other areas.

- I still believe that the map in Figure 2 must match the investigated area. It is completely off and I see no use for it. Why is that?

Author Response

Author’s Response to Reviewer 1 Report 2

 

Title – Advancements in Geoharzard Investigation: Developing Machine Learning Framework for the Prediction of Vents at Volcanic Fields Using Magnetic Data

Reviewer’s Reports:

Reviewer comments (1): I appreciated the author's response to my comments, which are very technical, but sometimes lack the point; I might have been not clear enough. I see no significant differences in the new manuscript version. I'll restate my two major criticisms (which could both be addressed without much work)

- Regarding my question on the regression: If you have three variables plus vent location (lat long, magnetic anomaly and yes/no for vent), how can they be fitted to a two variable statistical distribution?  The input data should be clearly listed and explained in the method section.

 

Authors Response:

Dear respected Reviewer,

Thank you for your follow-up question. We apologize for any confusion. In our regression model, we used longitude, latitude, and magnetic anomaly as the three predictor variables, while vent presence (yes/no) was the target variable. This setup allows the model to learn a relationship between the spatial distribution of magnetic anomalies and the occurrence of volcanic vents.

You are correct in noting that fitting these three variables into a two-variable statistical distribution would not be appropriate. However, our approach does not rely on fitting to a bivariate distribution. Instead, we employed the Random Forest regression method, which can handle multiple predictor variables simultaneously without requiring them to conform to a traditional two-variable distribution.

We have now ensured to list and provide further explanations on the input data and the rationale behind our model choice in the Methodology section to avoid any ambiguity (as shown below, also added at Page 10 of 20). Thank you for highlighting the need for this clarification.

In our study, we used the following input data for the machine learning model:

  1. Longitude: The geographic longitude of each data point, which provides spatial information relevant to the location of magnetic anomalies and volcanic vents.
  2. Latitude: The geographic latitude of each data point, paired with longitude to give a complete spatial reference for the dataset.
  3. Magnetic Anomaly: The measured magnetic field intensity at each data point. This value is critical for identifying subsurface geological features that may be associated with volcanic activity.
  4. Vent Presence (Yes/No): A binary indicator that specifies whether a volcanic vent is present at the corresponding location. This variable serves as the target variable for the regression model, allowing us to train the model to recognize patterns associated with vent locations.

Using these variables, we structured the model to learn how spatial variations in magnetic anomaly data relate to the presence of volcanic vents across the study area.
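As a rough illustration of the data layout described above, the sketch below arranges each data point as a record with three predictors and one binary target, then separates the records into a feature matrix and a label vector. The class name `GridNode`, the helper `to_features_and_target`, the nT unit, and all sample values are assumptions made for this example; they are not taken from the manuscript.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GridNode:
    """One data point of the survey grid (hypothetical layout)."""
    longitude: float         # degrees east
    latitude: float          # degrees north
    magnetic_anomaly: float  # assumed to be in nT
    vent_present: bool       # target: True if a vent is mapped at this point


def to_features_and_target(
    nodes: List[GridNode],
) -> Tuple[List[List[float]], List[int]]:
    """Separate records into predictors X (three columns) and binary target y."""
    X = [[n.longitude, n.latitude, n.magnetic_anomaly] for n in nodes]
    y = [1 if n.vent_present else 0 for n in nodes]
    return X, y


# Made-up values for illustration only; not real survey data.
nodes = [
    GridNode(39.80, 23.10, -120.5, True),
    GridNode(39.85, 23.10, 35.0, False),
    GridNode(39.90, 23.15, 80.2, False),
]
X, y = to_features_and_target(nodes)
```

With this layout, `X` holds one row of three predictors per data point and `y` holds the matching yes/no target, which is the shape a multi-predictor method such as Random Forest expects.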

Reviewer comments (2): -In the conclusion the authors state: 'This study not only demonstrates the efficacy of integrating machine learning with geophysical data for volcanic vent prediction but also provides a valuable framework for identifying potential areas for further investigation or monitoring volcanic activity.'

I see no demonstration of the efficacy of vent prediction. How was it done? How could the authors prove that the model correctly predicts unseen vents? I understand this is an unsolvable point unless proper field surveys are done. I suggest changing the authors' view of the results by proposing that their work shows a promising method, whose efficacy should be further explored with direct surveys or in other areas.

Authors Response: Thank you for highlighting this point. We agree that the ultimate demonstration of the model's efficacy in predicting unseen vents would require validation through field surveys and real-world testing. However, in our study, we assessed the model's performance using statistical metrics derived from test data that were withheld from the model during training. These metrics, such as accuracy, precision, and recall, indicate the model's ability to identify vent locations based on patterns in the magnetic anomaly data.
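For readers unfamiliar with the metrics named above, the following minimal sketch shows how accuracy, precision, and recall are computed from held-out labels and predictions. The function name `classification_metrics` and the label vectors are hypothetical and are not results from the study.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for binary labels (1 = vent)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # vents correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-vents correctly rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed vents
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up test labels and predictions, for illustration only.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

When vent nodes are much rarer than non-vent nodes, accuracy alone can look high even if many vents are missed, which is why precision and recall are usually reported alongside it.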

While these statistical evaluations provide some confidence in the model's predictive capability, we recognize that they do not constitute proof of correctly predicting new, undiscovered vents in the field. To address this limitation, we propose that future work involve targeted field surveys in regions where the model has predicted potential vent locations. Such surveys would provide the necessary ground truth to verify the model's predictions and improve the reliability of vent prediction methods. 

We have revised our conclusion to emphasize the need for field validation (Page 18 of 20).

Reviewer comments (3): -I still believe that the map in Figure 2 must match the investigated area. It is completely off and I see no use for it. Why is that?

Authors Response: Figure 2 has now been revised to match the investigated area (Page 5 of 21). Thank you.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

After reading the authors' responses to my previous comments, I understand that the study approach is valid. I think that the information provided in responses (2)-(4) should be written in the manuscript. In particular, the information listed below is essential for readers to understand the validity of this study:

* the techniques used in this study to prevent the machine from simply memorizing the coordinates of known vents (in the response to comment 2);

* assigning a target variable of 1 to all mesh nodes within the extent of a vent (in the response to comment 3); and

* the 80-20 splitting of data to training and test sets.

After writing these points, I think that the manuscript would be acceptable for publication.
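The labelling and splitting steps listed above can be sketched as follows. This is only an illustration under stated assumptions: vent extents are approximated as circles, coordinates are in projected metres, and the names `label_nodes` and `split_80_20`, the grid geometry, and the seed are all hypothetical. The techniques used to prevent memorization of vent coordinates are not detailed in this part of the record and are not covered here.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]        # (x, y) in projected metres (assumed)
Vent = Tuple[float, float, float]  # (x, y, radius): circular extent (assumed)


def label_nodes(nodes: Sequence[Point], vents: Sequence[Vent]) -> List[int]:
    """Return 1 for every mesh node inside any vent's extent, else 0."""
    labels = []
    for nx, ny in nodes:
        inside = any(math.hypot(nx - vx, ny - vy) <= r for vx, vy, r in vents)
        labels.append(1 if inside else 0)
    return labels


def split_80_20(n_samples: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and return (train, test) in an 80-20 ratio."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = (n_samples * 8) // 10
    return indices[:cut], indices[cut:]


# A 5 x 4 mesh with 100 m spacing and one vent of 120 m radius (made-up geometry).
nodes = [(100.0 * i, 100.0 * j) for i in range(5) for j in range(4)]
vents = [(200.0, 100.0, 120.0)]
labels = label_nodes(nodes, vents)
train_idx, test_idx = split_80_20(len(nodes))
```

In this toy mesh, the node at the vent centre and its four direct neighbours receive a target of 1, and the 20 nodes are divided into 16 training and 4 test samples.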

Author Response

Author’s Response to Reviewer 2 Report 2

Title – Advancements in Geohazard Investigations: Developing a Machine Learning Framework for the Prediction of Vents at Volcanic Fields Using Magnetic Data

Reviewer’s Reports:

Reviewer comments (1): After reading the authors' responses to my previous comments, I understand that the study approach is valid. I think that the information provided in responses (2)-(4) should be written in the manuscript. In particular, the information listed below is essential for readers to understand the validity of this study:

* the techniques used in this study to prevent the machine from simply memorizing the coordinates of known vents (in the response to comment 2);

* assigning a target variable of 1 to all mesh nodes within the extent of a vent (in the response to comment 3); and

* the 80-20 splitting of data to training and test sets.

After writing these points, I think that the manuscript would be acceptable for publication.

 

Authors Response:

Dear respected Reviewer,

Thank you for your comments and suggestions on our manuscript. They were very useful in improving the quality of our paper. We have implemented the suggestions above in the manuscript as advised (Page 11 of 21). Thank you.

Author Response File: Author Response.docx

Round 3

Reviewer 1 Report

Comments and Suggestions for Authors

I acknowledge the authors' answers, which essentially match the previous rebuttal. Now that Figs. 2 and 3 officially depict the same area, my initial suspicion is confirmed (this is why I thought they were not matching). The vent locations (a key point for the research) do not match. Why is that?

Author Response

Author’s Response to Reviewer 1 Report 3

Reviewer’s Reports:

Reviewer comments (1): I acknowledge the authors' answers, which essentially match the previous rebuttal. Now that Figs. 2 and 3 officially depict the same area, my initial suspicion is confirmed (this is why I thought they were not matching). The vent locations (a key point for the research) do not match. Why is that?

Authors Response:

Dear respected Reviewer,

Thank you for your follow-up question.

The slight mismatch in some vent locations between Figures 2 and 3 likely arises from differences in the data sources used to prepare the maps. The geological map (Figure 2) was derived from early geological surveys (1958–1963), while the vent locations on the map in Figure 3 were extracted from recent satellite imagery from the Saudi Geological Survey (2022). These sources may differ in accuracy or in how they interpret the precise locations of vents. In addition, depending on how vent boundaries are drawn, some vents might be represented as points in one map and as larger features (e.g., vent clusters or craters) in another, as seen in Figure 2. This simplification or generalization could lead to apparent mismatches. The vent data on the general geology map may also not have been updated to match recent satellite observations and geological studies.

The vent locations depicted in Figure 3, derived from high-resolution satellite imagery, are likely more reliable than those in Figure 2 due to the inherent advantages of remote sensing. Satellite-derived data provide precise geospatial information, consistent coverage, and the ability to detect subtle volcanic features that may not be evident in older field surveys or geological maps. Furthermore, the alignment of these vents with gravity lows and other geological patterns supports their validity. Although slight discrepancies between Figures 2 and 3 exist, the more advanced techniques used to derive the vent data in Figure 3 make it a more robust dataset for predicting volcanic vent locations.

Reviewer 2 Report

Comments and Suggestions for Authors

All my comments have been addressed properly. I have no more comments and think that the manuscript is satisfactory for publication in the present form.

Author Response

Reviewer comment:

All my comments have been addressed properly. I have no more comments and think that the manuscript is satisfactory for publication in the present form.

Author response:

Dear Respected Reviewer,

We greatly appreciate your positive response. Thank you for your assistance, which improved the manuscript.

Regards,

Essam Aboud

Corresponding author

 
