Article
Peer-Review Record

Delineation and Analysis of Regional Geochemical Anomaly Using the Object-Oriented Paradigm and Deep Graph Learning—A Case Study in Southeastern Inner Mongolia, North China

Appl. Sci. 2022, 12(19), 10029; https://doi.org/10.3390/app121910029
by Bo Zhao 1, Dehui Zhang 2,*, Rongzhen Zhang 2,3,*, Zhu Li 2,4, Panpan Tang 1 and Haoming Wan 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 1 September 2022 / Revised: 1 October 2022 / Accepted: 3 October 2022 / Published: 6 October 2022
(This article belongs to the Special Issue New Advances and Illustrations in Applied Geochemistry)

Round 1

Reviewer 1 Report

This paper requires a major revision, based on the following comments:

1. The English writing should be improved.

2. The type of mineralization is unclear and should be described, since it provides the criteria for element selection.

3. What is the main algorithm? It is unclear.

4. Why did the authors not introduce machine learning in the Introduction and literature review? They could use the following references for a better description in this section:

Afzal, P., Farhadi, S., Shamseddin Meigooni, M., Boveiri Konari, M., Daneshvar Saein, L., 2022. Geochemical Anomaly Detection in the Irankuh District Using Hybrid Machine Learning Technique and Fractal Modeling. Geopersia 12 (1), 191-199. doi: 10.22059/GEOPE.2022.336072.648644.

Farhadi, S., Afzal, P., Boveiri Konari, M., Daneshvar Saein, L., Sadeghi, B., 2022. Combination of Machine Learning Algorithms with Concentration-Area Fractal Method for Soil Geochemical Anomaly Detection in Sediment-Hosted Irankuh Pb-Zn Deposit, Central Iran. Minerals 12 (6), 689.

5. There is no geological or mineralization validation in this paper. A section should be added for this purpose.

6. The quality of the figures is low and should be improved.

Author Response

From Reviewer 1:

This paper requires a major revision, based on the following comments:

  1. The English writing should be improved.

Answer: I polished many English sentences as requested.

 

  2. The type of mineralization is unclear and should be described, since it provides the criteria for element selection.

Answer: I added the relevant description as follows: Previous studies [26] have indicated that the mineralization of Nb, Ta, Li, Be, REE, U, etc. in the study area is mostly related to highly fractionated peraluminous granitoids (S-type), while the mineralization of W, Pb, Zn, Ag, fluorite, tourmaline (B), Au, Cu, etc. is mainly of the quartz-vein type, often occurring within the syenogranitic wall rocks (highly fractionated I-type) surrounding the peraluminous granitoids. Based on these facts and according to the metallogenic specialization of granitoids [28], the relevant ore-forming elements can be divided into two groups: (1) Ag, Au, As, B, Cu, Hg, Mo, Pb, U, Zn, and Fe2O3, which are closely related to magnetite-series granites (I-type); and (2) Be, Bi, F, Mo, Nb, Pb, Sb, Sn, U, W, and Fe2O3, which are closely related to ilmenite-series granites (S-type) [26]. At the same time, considering the extensive development of granitic complexes in this area and the complex mineral paragenesis [27], such as Cu-Mo versus W-Mo, Pb-Zn versus W, Sn-Pb, U-Pb versus Th-U, and pyrite versus Fe2O3, we include Mo, Pb, U, and Fe2O3 in both groups.

 

  3. What is the main algorithm? It is unclear.

Answer: I added a paragraph in Section 2.2 to explain my algorithm, which reads: The proposed algorithm is a complex multi-step procedure that involves several different methodologies and datasets; the details of each step are given below. At the center of this algorithm is OGE, a graph network-based autoencoder; the other sub-algorithms can be regarded as pre-processing and post-processing steps for OGE.

  4. Why did the authors not introduce machine learning in the Introduction and literature review? They could use the following references for a better description in this section:

- Afzal, P., Farhadi, S., Shamseddin Meigooni, M., Boveiri Konari, M., Daneshvar Saein, L., 2022. Geochemical Anomaly Detection in the Irankuh District Using Hybrid Machine Learning Technique and Fractal Modeling. Geopersia 12 (1), 191-199. doi: 10.22059/GEOPE.2022.336072.648644.

- Farhadi, S., Afzal, P., Boveiri Konari, M., Daneshvar Saein, L., Sadeghi, B., 2022. Combination of Machine Learning Algorithms with Concentration-Area Fractal Method for Soil Geochemical Anomaly Detection in Sediment-Hosted Irankuh Pb-Zn Deposit, Central Iran. Minerals 12 (6), 689.

Answer: I added the relevant description as requested: In OBIA-based image analysis, the standard practice is to conduct “multiresolution segmentation + machine learning-based classification” [5]. Over the last few years, machine learning techniques have become an essential tool for advancing different branches of science and engineering, including geochemical anomaly recognition. For example, a hybrid machine learning method was proposed in [13], which combines K-Nearest Neighbor Regression and Random Forest Regression to predict Pb and Zn grades in the Irankuh Mining District (IMD), Iran. Reference [14] goes further: it trained four regression machine learning algorithms, i.e., the K neighbor regressor, support vector regressor, gradient boosting regressor, and random forest regressor, to build a hybrid model for predicting the Pb and Zn grades of the IMD. After that, the multifractal model [15] was used to classify Pb-Zn anomalies. Despite the success of these examples in the literature, few studies have explored the advantages of introducing traditional machine learning algorithms into OBIA-based geochemical prospecting.

  5. There is no geological or mineralization validation in this paper. A section should be added for this purpose.

Answer: I added this section as requested.

3.4 Comparison and Validation by Factor Analysis

The best way to validate the effectiveness of our OGE algorithm is to observe how many ore spots fall into the anomalous image objects, but this is not enough because most of the anomalous patches in Figure 13 are barren. Comparing Figure 13 to Figure 1, we also find that the presence of ore spots and anomalies is not completely controlled by the spatial distribution of the outcropping bedrock and the known geological structure. For example, over a third of the known ore spots occur within the N2b and Quaternary-covered areas, which have less geological exposure. Consequently, it is difficult to validate the anomalies directly from the perspective of regional geology. In this section, factor analysis is used instead. Factor analysis is a technique that is widely used to reduce a large number of variables to a smaller number of factors [32]. Although it ignores the spatial structure of geochemical patterns, it facilitates the identification of multivariate geochemical anomalies, which is consistent with the major aim of OGE. So, supposing the results of factor analysis are tenable and close to the truth, we can cross-validate the correctness and robustness of our model’s outputs by observing whether the factor-score anomalies overlap considerably with the anomalies delineated in Figure 13.

We can make the following observations from Figure 14: (1) Nearly all the factor-score anomalies, whatever elemental association each factor represents, reside within the OGE-derived anomalous areas and occupy most of their interior space. This result strongly supports the basic correctness of our model’s outputs. (2) Our OGE model is advantageous, since quite a few ore spots that cannot be identified by the factor-score anomalies are well identified by OGE. (3) The I-series elemental association can be further divided into five factors: Factor 1: As-Pb-Zn, Factor 2: Ag-B-Cu, Factor 3: Au-Hg, Factor 4: Mo-U, and Factor 5: Fe2O3. Likewise, the S-series elements can be divided into: Factor 1: Bi-W, Factor 2: Pb-Sb-Sn, Factor 3: Mo-U, Factor 4: Nb-Fe2O3, and Factor 5: Be-F. These divisions improve the interpretability of the extracted anomalies and facilitate mineral species-specific geochemical exploration.

Given the above, by calculating the ore-spot recognition rate and conducting factor analysis, the OGE-derived anomalies in Figure 13 are validated.
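
The overlap check described in this section can be illustrated with a small sketch. It is only an illustration under stated assumptions: anomalous areas are represented as sets of hypothetical grid cells, and the function name `containment` and all cell coordinates are invented for the example; they are not taken from the manuscript.

```python
# Minimal sketch (assumption: anomalies are rasterized to sets of grid cells).
# All names and values below are hypothetical, not the authors' data or code.

def containment(factor_cells, oge_cells):
    """Fraction of factor-score anomalous cells lying inside the OGE anomalies."""
    if not factor_cells:
        return 0.0
    return len(factor_cells & oge_cells) / len(factor_cells)

# Hypothetical (row, col) cells flagged as anomalous by each method.
oge_anomaly = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
factor_anomaly = {(0, 0), (1, 1), (2, 2), (3, 3)}

share_inside = containment(factor_anomaly, oge_anomaly)
print(share_inside)  # 0.75
```

A value close to 1 would correspond to observation (1) above, where nearly all factor-score anomalies fall inside the OGE-derived anomalous areas.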

 

  6. The quality of the figures is low and should be improved.

Answer: I have re-generated some of the figures as requested, and will resubmit them to the Journal. Thanks.  

 

Author Response File: Author Response.docx

Reviewer 2 Report

Review of manuscript titled Delineation and Analysis of Regional Geochemical Anomaly Using the Object-Oriented Paradigm and Deep Graph Learning - A Case Study in Southeastern Inner Mongolia, North China

for Appl. Sci. 2022

In exploration geology, anomaly detection and identification is a deep-learning technique used to indicate locations where mineral ore deposits may be present, given a geochemical input dataset. In this manuscript, the authors have developed a new deep-learning architecture intended to outperform the current state-of-the-art GAUGE architecture developed by Guan et al. in the article "Recognizing Multivariate Geochemical Anomalies Related to Mineralization by Using Deep Unsupervised Graph Learning".

To detect geochemical ore species concentration anomalies, the authors have used an autoencoder model. An autoencoder is an unsupervised neural network-based feature extraction algorithm that learns the best parameters required to reconstruct its output as close to its input as possible. Autoencoders impose a bottleneck in the network which forces a compressed knowledge representation of the original input. In this case, the compressed knowledge represents unique characteristics or principal components of the geochemical distribution of selected species.

The authors have implemented a convolutional autoencoder (CAE), which is good at capturing textural features of the input geochemical dataset. The idea is to segment the input dataset into spatially continuous and spectrally contiguous homogeneous regions where a high concentration of ore deposits is likely to reside.

The authors have contributed to the field by proposing an object-based graph learning architecture (OGE – object graph encoder) based on a graph convolutional neural network (GNN).

In the manuscript, the authors have shown that their architecture is scale and species invariant. In addition, the authors have proposed and implemented an object-oriented algorithm in which statistical metrics are computed within each object to modify detected anomalous regions.

The authors have achieved relatively high detection accuracy in that ground-truth ore locations are located near the OGE-detected anomalous regions.

The OGE-detected anomalous regions of ore deposits are likely the result of diagenesis instead of regional mineralization. As a measure of performance, the authors' OGE model was shown to identify more than 80% of ground-truth ore deposit locations in less than 45% of the input land area, which is superior to GAUGE.

This is a comprehensive manuscript that warrants publication.

The authors provide a schematic of their symmetric autoencoder in figure 5. 

Questions for the authors:

1.       The dimension of the input layer of the autoencoder is 11, which corresponds to the number of species. However, the dimension of the output of layer 1 (32) does not equal the input dimension of hidden layer 2 (256). Are there additional hidden layers present in the encoder portion not shown in figure 5?

2.       The authors did not mention which type (max, average, etc.) of pooling operation is applied after convolution and which up- and down-sampling pooling operation yielded superior accuracy.

3.       In section 2.2.3, the authors explain which loss function was employed to measure reconstruction error (SPRE). Typically, training vs. validation loss as a function of epoch number is shown to assess over- and underfitting. Can the authors show such a loss plot?

Altogether this reviewer found this paper to be very interesting, insightful, and well written.

Author Response

From Reviewer 2:

Questions for the authors:

  1. The dimension of the input layer of the autoencoder is 11, which corresponds to the number of species. However, the dimension of the output of layer 1 (32) does not equal the input dimension of hidden layer 2 (256). Are there additional hidden layers present in the encoder portion not shown in figure 5?

Answer: The output dimension of layer 1 is 32, while the input dimension of hidden layer 2 is 256. This is consistent because GAT layer 1 has eight attention heads, and 32×8 = 256. However, to avoid misunderstanding, I changed D_in=256 to D_in=32×8=256 in Figure 5. Thanks.
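
The dimension arithmetic in this answer can be checked with a short sketch. It assumes that the per-head outputs are concatenated, which is what the 32×8 = 256 figure implies; the function name `concat_heads` and the random feature values are hypothetical and are not the authors' code.

```python
# Minimal sketch of multi-head concatenation for a single node.
# Assumption: the eight head outputs are concatenated (implied by 32 x 8 = 256).
import random

NUM_HEADS = 8   # attention heads in GAT layer 1 (stated in the answer)
HEAD_DIM = 32   # output dimension of each head (stated in the answer)

def concat_heads(head_outputs):
    """Join the per-head feature vectors of one node into a single vector."""
    merged = []
    for vector in head_outputs:
        merged.extend(vector)
    return merged

random.seed(0)
# Hypothetical per-head outputs for one node: eight vectors of length 32.
heads = [[random.random() for _ in range(HEAD_DIM)] for _ in range(NUM_HEADS)]

layer2_input_dim = len(concat_heads(heads))
print(layer2_input_dim)  # 256
```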

  2. The authors did not mention which type (max, average, etc.) of pooling operation is applied after convolution and which up- and down-sampling pooling operation yielded superior accuracy.

Answer: Thanks. No pooling operation is involved in our GAT- and GCN-dominated network architecture. The GraphSAGE module, which is used to generate low-dimensional vector representations for nodes and is especially useful for graphs with rich node attribute information, may involve pooling operations, but it is not incorporated in our OGE model. That is why I did not mention which type (max, average, etc.) of pooling operation is applied. As for the up- and down-sampling pooling operations: the pooling operation in a CNN is focused on subsampling. Similarly, the graph pooling operation is designed to generate graph-level features from node features. Generally speaking, graph pooling can be seen as an operation that, given an initial graph as input, generates a coarsened graph with fewer nodes. Generating a coarsened graph with fewer nodes is not what we want, because we must ensure that every node in Figure 4, i.e., every image object in Figure 2, has an anomaly score for subsequent analysis. That is why up- and down-sampling pooling operations are not involved in this study. Our GAT- and GCN-dominated network is a good solution to this problem. Given the above, I am sorry that I did not make the relevant modifications to the manuscript.
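
The contrast drawn in this answer, between a layer that keeps one output per node and a pooling step that coarsens the graph, can be sketched as follows. This is a deliberately simplified illustration: `message_passing_layer` uses plain mean aggregation rather than the GAT or GCN layers of the OGE model, and `coarsen_pairs` is a toy pooling rule. Both functions and the four-node graph are hypothetical.

```python
# Minimal sketch: node-level layer versus pooling, on a toy 4-node path graph.
# Assumption: mean aggregation stands in for the real GAT/GCN layers.

def message_passing_layer(features, adjacency):
    """Average each node's features with those of its neighbours (self included)."""
    out = []
    for i, row in enumerate(adjacency):
        neighbours = {i} | {j for j, linked in enumerate(row) if linked}
        dim = len(features[i])
        out.append([sum(features[j][k] for j in neighbours) / len(neighbours)
                    for k in range(dim)])
    return out

def coarsen_pairs(features):
    """Toy pooling: merge consecutive node pairs, halving the node count."""
    return [[(a + b) / 2 for a, b in zip(features[i], features[i + 1])]
            for i in range(0, len(features) - 1, 2)]

# Hypothetical node features and adjacency (path graph 0-1-2-3).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]

node_level = message_passing_layer(features, adjacency)
coarsened = coarsen_pairs(features)
print(len(node_level), len(coarsened))  # 4 2
```

Only the node-level output keeps one row per image object, which is the property needed to assign every object its own anomaly score.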

  3. In section 2.2.3, the authors explain which loss function was employed to measure reconstruction error (SPRE). Typically, training vs. validation loss as a function of epoch number is shown to assess over- and underfitting. Can the authors show such a loss plot?

Answer: As our algorithm is unsupervised, no validation dataset is involved in the experiments. However, as requested, I provide a plot of the recognition rate and the loss against the number of training epochs, and added the relevant text description to the manuscript as follows:

For a better illustration, taking the I-series elemental association as an example, Figure 8 gives the recognition rate versus loss curves during the training phase. As can be seen, before 160 epochs are completed, both the recognition rate and the training loss decrease sharply. After that, the decrease in loss slows, eventually stabilizing at about 0.124; meanwhile, the recognition rate rises and then oscillates around 0.837. Therefore, to train past the oscillating phase, we set the number of training epochs to 2000.

Figure 8. Variation in loss and the recognition rate with the number of training epochs. Note that the threshold of the anomaly score map is set to the median, and the recognition rate is calculated as the ratio of the number of ore spots falling within the anomalous areas to the total number of known ore spots.
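
The recognition rate defined in the caption can be written out as a short sketch. It assumes that each ore spot is assigned to the image object containing it and that an object counts as anomalous when its score is strictly above the median; the strictness of the comparison, the function name `recognition_rate`, and all scores below are assumptions made for illustration.

```python
# Minimal sketch of the recognition rate with a median threshold.
# Assumption: "anomalous" means a score strictly above the median.
from statistics import median

def recognition_rate(object_scores, ore_spot_objects):
    """Share of known ore spots whose host image object is anomalous."""
    threshold = median(object_scores.values())
    hits = sum(1 for obj in ore_spot_objects if object_scores[obj] > threshold)
    return hits / len(ore_spot_objects)

# Hypothetical anomaly scores per image object.
scores = {"A": 0.10, "B": 0.35, "C": 0.50, "D": 0.80, "E": 0.95, "F": 0.20}
# Hypothetical host object of each known ore spot (two spots lie in object E).
ore_spots = ["D", "E", "E", "B", "C"]

rate = recognition_rate(scores, ore_spots)
print(rate)  # 0.8
```

Here the median score is 0.425, so four of the five ore spots lie in anomalous objects.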

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

ACCEPT
