3D Mineral Prospectivity Mapping of Zaozigou Gold Deposit, West Qinling, China: Deep Learning-Based Mineral Prediction

: This paper focuses on the scientiﬁc problem of quantitative mineralization prediction at large depth in the Zaozigou gold deposit, west Qinling, China. Five geological and geochemical indicators are used to establish geological and geochemical quantitative prediction model. Machine learning and Deep learning algorithms are employed for 3D Mineral Prospectivity Mapping (MPM). Especially, the Student Teacher Ore-induced Anomaly Detection (STOAD) model is proposed based on the knowledge distillation (KD) idea combined with Deep Auto-encoder (DAE) network model. Compared to DAE, STOAD uses three outputs for anomaly detection and can make full use of information from multiple levels of data for greater overall robustness. The results show that the quantitative mineral resources prediction by applying the STOAD model has a good performance, where the value of Area Under Curve (AUC) is 0.97. Finally, three main mineral exploration targets are delineated for further investigation.


Introduction
Geosciences is a data-intensive science, and geological survey and mineral exploration has accumulated a large amount of multi-source geosciences data, and the introduction of big data and its application of machine learning and deep learning can effectively support mineral resources exploration [1][2][3][4][5][6].
The Zaozigou Gold deposit is located within the West Qinling fold belt and is a typical epithermal-type gold deposit in the Xiahe-Hezuo area (Figure 1a). The main controlling factors for mineral resources formation within the region are tectonic movement and magmatism [62], with the regional tectonics spreading in a NW direction with developing folds and fracturs. Complex geological structure and magmatism are dominated by Yanshan period intermediate-acid intrusive rocks, which are widely distributed in the form of the batholith, stock, apophysis, and veins [62] (Figure 1b).
The geotectonic position of the Xiahe-Hezuo area is located at the northwestern part of the West Qinling orogenic belt and at the western extension of the Qinling-Qilian-Kunlun central orogenic belt, whose complex geological structure creates a superior mineralization environment [61].
The Zaozigou Gold deposit is located within the West Qinling fold belt and is a typical epithermal-type gold deposit in the Xiahe-Hezuo area (Figure 1a). The main controlling factors for mineral resources formation within the region are tectonic movement and magmatism [62], with the regional tectonics spreading in a NW direction with developing folds and fracturs. Complex geological structure and magmatism are dominated by Yanshan period intermediate-acid intrusive rocks, which are widely distributed in the form of the batholith, stock, apophysis, and veins [62] (Figure 1b).  [62] [62] The Triassic strata are the main stratigraphy for gold deposit. The genesis and spatialtemporal evolution of the intermediate-acidic dike is closely related to gold mineralization in the area, and during the mineralization process, the magmatic rocks not only provide the mineralized material, but also their internal environment is very suitable as an ore-bearing space, which can be regarded as a significant mineralization indicator for gold mineralization [63].
The spatial distribution of mineral deposits is directly controlled by the geotectonic position in the Xiahe-Hezuo area, which plays a major role in the formation of different types of gold deposits and is the boundary of the belt from a spatial perspective. The most important types of mineral-controlling structures are fractures and folds in this area [64,65]. The main orebodies of the Zaozigou deposit are produced in NE, NW, and near-SN oriented fracture zones, with two apparent mineralization periods, and post-formation fractures have modified and destroyed the orebodies [66][67][68][69][70].
The Zaozigou gold deposit is a typical representative of gold deposits associated with intermediate-acid dike rocks in the south of the Xiahe-Hezuo fracture. It is located at approximately 9 km southwest of the Hezuo city, Gansu Province, with convenient access to the mine site ( Figure 2a). The main ore-bearing position is between Gully 1 and Gully 4, with a total area of approximately 2.6 km 2 (Figure 2b).
The Triassic strata are the main stratigraphy for gold deposit. The genesis and spatial temporal evolution of the intermediate-acidic dike is closely related to gold mineralization in the area, and during the mineralization process, the magmatic rocks not only provide the mineralized material, but also their internal environment is very suitable as an ore bearing space, which can be regarded as a significant mineralization indicator for gold mineralization [63].
The spatial distribution of mineral deposits is directly controlled by the geotectonic position in the Xiahe-Hezuo area, which plays a major role in the formation of differen types of gold deposits and is the boundary of the belt from a spatial perspective. The mos important types of mineral-controlling structures are fractures and folds in this area [64,65]. The main orebodies of the Zaozigou deposit are produced in NE, NW, and near SN oriented fracture zones, with two apparent mineralization periods, and post-for mation fractures have modified and destroyed the orebodies [66][67][68][69][70].
The Zaozigou gold deposit is a typical representative of gold deposits associated with intermediate-acid dike rocks in the south of the Xiahe-Hezuo fracture. It is located at ap proximately 9 km southwest of the Hezuo city, Gansu Province, with convenient access to the mine site ( Figure 2a). The main ore-bearing position is between Gully 1 and Gully 4, with a total area of approximately 2.6 km 2 (Figure 2b).  There are only Triassic and Quaternary strata exposed in the Zaozigou gold deposi (Figure 2b). The strata of the area are mainly the lower part of the Gulangdi group (T2g 1 of the Middle Triassic formation, and the Quaternary (Qh alp ) is developed in the inter mountain valley. The lower part of the Gulangdi group (T2g 1 ) is the main ore-bearing rock which is a set of fine clastic rocks consisting of siliceous slate, clastic feldspathic fine sand stone with interbedded siltstone slate and argillaceous slate. The formation is composed There are only Triassic and Quaternary strata exposed in the Zaozigou gold deposit (Figure 2b). The strata of the area are mainly the lower part of the Gulangdi group (T 2 g 1 ) of the Middle Triassic formation, and the Quaternary (Q h alp ) is developed in the intermountain valley. The lower part of the Gulangdi group (T 2 g 1 ) is the main ore-bearing rock, which is a set of fine clastic rocks consisting of siliceous slate, clastic feldspathic fine sandstone with interbedded siltstone slate and argillaceous slate. The formation is composed of a large sedimentary cycle from bottom to top consisting of siltstone → argillaceous slate → calcareous slate [66,72].
The fractures are well developed and have a complex morphology, forming an intersecting trend, indicating that the area has experienced multiple phases of geological activity. Fractures strictly control the distribution of orebodies and vein rocks within the deposit. They can be classified into five groups of orientations, namely NW, N-S, NE, E-W, and NNE [73,74]. Fracture structures are the structural surfaces of mineralization, and these structural surfaces control the spreading characteristics of the orebodies [75].
Intermediate-acid dike, including fine crystalline diorite, diorite porphyrite, biotite dioritic porphyrite, and quartz diorite porphyrite densely produced with a porphyritic structure, spreading in NNE direction, turning to a nearly N-S direction in the western part of the deposit and a few NW directions. Influenced by the regional multi-period deep fracture activity, the multi-period magmatism has overlapped on multi-phases mineralization within the deposit [76].
There are 147 gold orebodies have been found in the Zaozigou gold deposit, of which 17 orebodies are main ore bodies with gold reserves greater than 1 tonne, and the total gold reserves of more than 100 tonnes [71]. According to the spatial distribution and combination of the mineralization, Zaozigou deposit can be divided into eastern and western ore groups.
The Eastern ore group is mainly located between Gully 1 and Gully 3 with the strike of NE orientation, containing Au1 controlled by F24, Au9 controlled by F21, and Au15 controlled by F25. These orebodies extend over 1000 m long and 300 m wide, with a NW direction tendency and steep dip near the ground surface, locally nearly upright. In the deep, these orebodies have been staggered by gently dipping fracture, causing the tendency to change to SE orientation ( Figure 2b). In addition, orebodies M4 and M6 are laying underground, with the strike of NWW orientation, the tendency of SW orientation, and the dip of 8 •~2 6 • . These orebodies cross the NE-striking orebodies obliquely, staggering them, and their own mineralization behavior occurs simultaneously [77].
The Western ore group is mainly distributed between Gully 3 and Gully 4, spreading in a nearly N-S direction, with August 29~August 31 as the main orebodies. The orebodies extend over 1000 m long and wider than 500 m, with a strike of 350 •~1 0 • , varying tendency and dips greater than 75 • , locally subvertical. These orebodies extend, and the mineralization is weaker in the steeper parts and stronger in the shallower parts.

Datesets Description
Historical geological and geochemical data were completed, including geological reports, geological exploration maps, and drills geochemical data, from the Development Research Center of China Geological Survey and No.3 Geological and Mineral Exploration team, Gansu Provincial Bureau of Geology and Mineral Exploration and Development, the coordinate system used in the mine-scale is Gaussian Kruger projection coordinates.
This study collected 72 drillings data in the Zaozigou gold deposit, established a drilling location database, an assay database, an inclinometry database, and a lithology database. The 3D model of drillings was constructed based on the drill hole data database ( Figure 3).
The primary geochemical halo data from the drillings were collected from the "Zaozigou gold successive resources exploration project in Hezuo city, Gansu Province", with a total of 72 drillings and 5028 samples with 12 elements of Ag, As, Au, Cu, Hg, Pb, Zn, Sb, W, Bi, Co, and Mo (the element detection methods can refer literature [71]). The sampling method was the continuous picking method with sample intervals generally within 10 m. Some orebodies or strongly altered areas being sampled at decreased intervals. The primary geochemical halo data from the drillings were collected from the "Zaozi gou gold successive resources exploration project in Hezuo city, Gansu Province", with total of 72 drillings and 5028 samples with 12 elements of Ag, As, Au, Cu, Hg, Pb, Zn, Sb W, Bi, Co, and Mo (the element detection methods can refer literature [71]). The samplin method was the continuous picking method with sample intervals generally within 10 m Some orebodies or strongly altered areas being sampled at decreased intervals.

Network Structure Designing
The basic end-to-end structure of DAE and STOAD consists of the input layer (Input convolutional layer (CONV), activation function layer, and output layer (Output).
DAE is a method that uses neural networks to compress and reconstruct data, and it core is to use an Encoder to compress the original data X to obtain the intermediate vari able Z, and then use a Decoder to reconstruct Z to obtain X' (Figure 4). It can be widel applied to tasks such as data dimensional reduction, feature extraction, and anomaly de tection.   The basic end-to-end structure of DAE and STOAD consists of the input layer (Input), convolutional layer (CONV), activation function layer, and output layer (Output).
DAE is a method that uses neural networks to compress and reconstruct data, and its core is to use an Encoder to compress the original data X to obtain the intermediate variable Z, and then use a Decoder to reconstruct Z to obtain X' (Figure 4). It can be widely applied to tasks such as data dimensional reduction, feature extraction, and anomaly detection. The primary geochemical halo data from the drillings were collected from the "Zaozigou gold successive resources exploration project in Hezuo city, Gansu Province", with a total of 72 drillings and 5028 samples with 12 elements of Ag, As, Au, Cu, Hg, Pb, Zn, Sb, W, Bi, Co, and Mo (the element detection methods can refer literature [71]). The sampling method was the continuous picking method with sample intervals generally within 10 m. Some orebodies or strongly altered areas being sampled at decreased intervals.

Network Structure Designing
The basic end-to-end structure of DAE and STOAD consists of the input layer (Input), convolutional layer (CONV), activation function layer, and output layer (Output).
DAE is a method that uses neural networks to compress and reconstruct data, and its core is to use an Encoder to compress the original data X to obtain the intermediate variable Z, and then use a Decoder to reconstruct Z to obtain X' (Figure 4). It can be widely applied to tasks such as data dimensional reduction, feature extraction, and anomaly detection.

Model Training
In the DAE model, the objective function is to measure the error between the reconstructed sample and the real sample and back propagate to adjust the network parameters. Mean Square Error (MSE) is usually used as the objective function in the reconstruction problem, and the optimization process is mainly divided into three steps: (i) forward propagation for calculating the output, (ii) calculating the loss of reconstruction, and (iii) back propagation to calculating the gradient according to the loss for optimizing the model parameters.
Equation (1) express the calculation process from the l − 1th layer to the lth layer of CNN, where x l represents the features of the lth layer (e.g., x 1 represents the input data, x 2 represents the output obtained after the first layer of convolution and activation calculation), b l represents the bias term of the lth layer, W l represents the weight parameter of the convolution of the lth layer, Z l represents the intermediate variables of the lth layer without the activation function calculation, and σ is the activation function. Rectified Linear Unit (ReLU) function is used as the activation function in this paper.
Assuming that the DAE model has a total of L layers, the errors of the input data and the reconstructed data are calculated by MSE to obtain the loss value.
δ L is the partial derivatives of Loss to z L , The partial derivatives of W and b are calculated from the losses, and is the Hadamard product.
The gradient of the back-propagation of each convolutional layer can be calculated by Equation (5), and finally the convolutional kernel parameters are updated by the following two equations. α is the learning rate.
where l = 1, 2, 3 . . . , L − 1, the above steps are repeated to achieve the training of the DAE network.

Model Testing
In the testing process, the test data X are reconstructed by DAE to obtain X', MSE between X and X' is normalized. Setting the threshold value µ, when the MSE value is greater than µ, corresponding samples are judged as anomalies, otherwise, the sample is normal. The calculation equations are as follows.

Network Structure Design
KD [78] is a mainstream method applied to model compression, which mainly consists of the Teacher network (T) and Student network (S). The larger T network is usually trained first for the tasks of image classification, target detection, semantic segmentation, etc., and then the smaller S network is trained to learn the generalization ability of the T network. Compared with T-networks, S-networks have fewer parameters and faster computation speed. Different information will be contained at different depth of the network. For fully extract metallogenic information, the STOAD model is proposed to achieve anomaly detection of layers based on calculating the differences between T encoder and S network. STOAD is divided into two structures: The T network and the S network, where the T network has the same Encoder and Decoder as the DAE, and the S network has the same structure as the Encoder in the T network. Especially, three scales output are designing to express different information at different depth of the network.
In the training process, STOAD model uses all layers to pre-train the T network for reconstruction. The encoder in the T network can be recognized as having the ability to compress the layers, and thus can be seen as having the ability to handle both normal and abnormal data [79]. If the T network is not trained, the anomaly detection can be achieved, but the feature extraction of the T network is not clear, and the interpretability is poor. Sequentially, taking normal data as input, training the S network, calculating the three scale output differences of T encoder and S network using MSE, while h i , i = 1, 2, 3, are the scales represents different depth of the network. Finally, back-propagates to update the parameters of the S network, the S network only has the capacity of handling normal data learned from the encoder of T network.
In the testing process, taking all data as input. Since the S network only learns the ability of dealing with normal data from the encoder of T network, the S network is difficult to maintain the similar multiscale output features with the T network because of a larger MSE. Therefore, the anomaly detection can be achieved by the MSE values ( Figure 5). In contrast to DAE, STOAD not only has three outputs, but also uses multiple levels of semantic and detailed information to make full use of the information in the data for anomaly detection.

Network Structure Design
KD [78] is a mainstream method applied to model compression, which mainly consists of the Teacher network (T) and Student network (S). The larger T network is usually trained first for the tasks of image classification, target detection, semantic segmentation, etc., and then the smaller S network is trained to learn the generalization ability of the T network. Compared with T-networks, S-networks have fewer parameters and faster computation speed.
Different information will be contained at different depth of the network. For fully extract metallogenic information, the STOAD model is proposed to achieve anomaly detection of layers based on calculating the differences between T encoder and S network. STOAD is divided into two structures: The T network and the S network, where the T network has the same Encoder and Decoder as the DAE, and the S network has the same structure as the Encoder in the T network. Especially, three scales output are designing to express different information at different depth of the network.
In the training process, STOAD model uses all layers to pre-train the T network for reconstruction. The encoder in the T network can be recognized as having the ability to compress the layers, and thus can be seen as having the ability to handle both normal and abnormal data [79]. If the T network is not trained, the anomaly detection can be achieved, but the feature extraction of the T network is not clear, and the interpretability is poor. Sequentially, taking normal data as input, training the S network, calculating the three scale output differences of T encoder and S network using MSE, while hi, i = 1,2,3, are the scales represents different depth of the network. Finally, back-propagates to update the parameters of the S network, the S network only has the capacity of handling normal data learned from the encoder of T network.
In the testing process, taking all data as input. Since the S network only learns the ability of dealing with normal data from the encoder of T network, the S network is difficult to maintain the similar multiscale output features with the T network because of a larger MSE. Therefore, the anomaly detection can be achieved by the MSE values ( Figure  5). In contrast to DAE, STOAD not only has three outputs, but also uses multiple levels of semantic and detailed information to make full use of the information in the data for anomaly detection.

Model Training and Testing
The training loss of STOAD consists of two parts, which are the training loss of the T network and the S network The overall T-network can be considered as a DAE, and MSE is used as the reconstruction loss of the T-network, so that the encoder part of the T-network holds the data compress capability. The training loss of the T network can be calculated as follow equation.
The loss of the S network consists of the MSEs of the three scales output of the T network. In order for the S network to learn the normal data compression ability from the T network, let h 1 = h 2 = h 3 = 1/3, and the training loss of the S network is calculated by Equation (8).
The STOAD model training and testing process is mainly as follows. 1.
The essence of T-network is a DAE, and its training process is same with DAE referring to Section 3.1.2.

2.
The S-network update is similar to the T-network, mainly in calculating the loss of the multi-scale features output from the encoder of the T-network, and using only normal data for training, thus letting the S-network to only learn the ability to compress normal data. L is the depth of S network.
δ L is the partial derivatives of z L of S network.
and the δ l is, i is 1 to L − 1: Convolutional module is updated by the following two equations. α is the learning rate.
In the testing process, we used T and S networks to calculate the testing dataset, separately. Defining the weights of three scales are W h1 = W h2 = W h3 = 1/3, the weighted sum of the output can be calculated by Equation (19), where X t i and X s i represent the output of the Encoder part of the T network and the S network, respectively. Predict (x) is the normalized Error (x), selecting a threshold µ, when the Predict (x) is greater than µ the sample point is judged to be an abnormal sample, otherwise it is a normal sample.

Geological and Geochemical Quantitative Prediction Model at Depth of Zaozigou Gold Deposit
Before carrying out the deep quantitative mineral prospectivity mapping in the Zaozigou gold deposit, it is necessary to review the geological and geochemical quantitative prediction model established in previous work [80].
Orebodies are strictly controlled by fractures in the Zaozigou gold deposit. The 30 m buffer zone of the fractures can effectively reflect the influence range of fracture, which can be used as a mineral prediction indicator. Factor F4 is an element association of Sb-Hg, which has a close relationship with fractures, and can be used as a favorable indicator for inferring deep fractures. The middle anomaly area of Au extracted by the multiple fractal C-V model can well reflect the spatial distribution of orebodies, which should be used as an important quantitative indicator. Near-ore halo element association B2 is extracted by the knowledge-driven CoDA also can express the location of orebodies well, it should be another mineral prediction indicator. The ratio of front halo to tail halo is an important geochemical parameter for predicting orebodies, B1/B3 is regarded as a prediction indicator accordingly (Table 1; Figure 6). The detailed data analysis and interpretation can refer to the literature [80]. sum of the output can be calculated by Equation (19), where t i and s i represent the output of the Encoder part of the T network and the S network, respectively. Predict (x) is the normalized Error (x), selecting a threshold μ, when the Predict (x) is greater than μ the sample point is judged to be an abnormal sample, otherwise it is a normal sample.

Geological and Geochemical Quantitative Prediction Model at Depth of Zaozigou Gold Deposit
Before carrying out the deep quantitative mineral prospectivity mapping in the Zaozigou gold deposit, it is necessary to review the geological and geochemical quantitative prediction model established in previous work [80].
Orebodies are strictly controlled by fractures in the Zaozigou gold deposit. The 30 m buffer zone of the fractures can effectively reflect the influence range of fracture, which can be used as a mineral prediction indicator. Factor F4 is an element association of Sb-Hg, which has a close relationship with fractures, and can be used as a favorable indicator for inferring deep fractures. The middle anomaly area of Au extracted by the multiple fractal C-V model can well reflect the spatial distribution of orebodies, which should be used as an important quantitative indicator. Near-ore halo element association B2 is extracted by the knowledge-driven CoDA also can express the location of orebodies well, it should be another mineral prediction indicator. The ratio of front halo to tail halo is an important geochemical parameter for predicting orebodies, B1/B3 is regarded as a prediction indicator accordingly (Table 1; Figure 6). The detailed data analysis and interpretation can refer to the literature [80].

Training Sample Selection
For mineralization prediction, the target variable (i.e., the label of trainin is either mineralized or nonmineralized (denoted by 1 and 0, respectively) 39,306 positive samples extracted from known orebodies and 49,686 negative s tracted from the nonmineralized position confirmed by drillings are used as t taset. In contrast to mineralization generates concentrated in limited space, th eralization is a widespread phenomenon, and negative samples are selected to uted as randomly and uniformly as possible throughout the study area [81] (F

Training Sample Selection
For mineralization prediction, the target variable (i.e., the label of training samples) is either mineralized or nonmineralized (denoted by 1 and 0, respectively). A total of 39,306 positive samples extracted from known orebodies and 49,686 negative samples extracted from the nonmineralized position confirmed by drillings are used as training dataset. In contrast to mineralization generates concentrated in limited space, the nonmineralization is a widespread phenomenon, and negative samples are selected to be distributed as randomly and uniformly as possible throughout the study area [81] (Figure 7).

DAE-Based 3D MPM
DAE is a widely used unsupervised model in deep learning. In this study, the DAE model is constructed by convolution, and the training process automatically learns to extract the semantic features of the normal evidence layer to compress and reconstruct the corresponding layers. In the testing process, detecting ore-induced anomalies by MSE value, the higher the MSE value is, the greater the loss of reconstruction is, which illustrates the high-probability area of ore-forming.
This DAE model is built using python 3.8, PyTorch 1.71 framework, the optimizer is Adam, the batch size is 16, the learning rate is 0.001, the learning rate decay is 10% per 5 Epochs, the CPU is 12th Gen Intel(R) Core(TM) i7-127OOKF, and the GPU is NVIDIA GeForce RTX 3090.
DAE model uses 70% of the normal data for training, and the testing dataset includes the remaining 30% of the normal samples and the relatively small number of anomaly

DAE-Based 3D MPM
DAE is a widely used unsupervised model in deep learning. In this study, the DAE model is constructed by convolution, and the training process automatically learns to extract the semantic features of the normal evidence layer to compress and reconstruct the corresponding layers. In the testing process, detecting ore-induced anomalies by MSE value, the higher the MSE value is, the greater the loss of reconstruction is, which illustrates the high-probability area of ore-forming.
This DAE model is built using python 3.8, PyTorch 1.71 framework, the optimizer is Adam, the batch size is 16, the learning rate is 0.001, the learning rate decay is 10% per 5 Epochs, the CPU is 12th Gen Intel(R) Core(TM) i7-127OOKF, and the GPU is NVIDIA GeForce RTX 3090.
DAE model uses 70% of the normal data for training, and the testing dataset includes the remaining 30% of the normal samples and the relatively small number of anomaly samples. The DAE model is trained for 500 Epochs and the loss function is plotted as follows. The model relatively converges at an MSE value of about 0.025, but the MSE fluctuates wider (Figure 8). In Figures 4 and 6, the horizontal axis indicates the sampling order and the vertical axis indicates the loss of DAE. The prediction parameters of precision value reaches 93%, Recall value is 88%, and F1-measure is 90%, which demonstrate the high accuracy and reliability of DAE-based MPM. It clearly shows that the high ore-forming probability at depth of Zaozigou gold deposit, which is worth for further investigation (Figure 9).

STOAD-Based 3D Mineral Prospectivity Mapping
The STOAD model is implemented using python 3.8, and PyTorch 1.71 framework. The T network is mainly composed of two parts: encoder and Decoder. The structure of T network is symmetric for letting the number of parameters of both parts identical, and thus the data compression ability of encoder is approximately equal to the data recon-

STOAD-Based 3D Mineral Prospectivity Mapping
The STOAD model is implemented using python 3.8, and PyTorch 1.71 framework. The T network is mainly composed of two parts: encoder and Decoder. The structure of T

STOAD-Based 3D Mineral Prospectivity Mapping
The STOAD model is implemented using python 3.8, and PyTorch 1.71 framework. The T network is mainly composed of two parts: encoder and Decoder. The structure of T network is symmetric for letting the number of parameters of both parts identical, and thus the data compression ability of encoder is approximately equal to the data reconstruction ability of Decoder.
The S network only has the encoder part, which is consistent with the T encoder. ReLU activation function is used for its capability of dealing with nonlinear data.
The same training parameters are used in the T and S networks, with an optimizer of Adam, a batch size of 16, a learning rate of 0.001, and a learning rate decay of 10% per 5 Epochs.
Similarly, STOAD uses 70% of the normal data for training, and the testing dataset includes the remaining 30% of the normal samples and the relatively small number of anomaly samples.
The STOAD model is trained with 500 Epochs, and the Teacher loss function and the Student loss function are plotted in Figure 10, the T network converges at about MSE value of 0.025 (similar to the above DAE), and fluctuates wider. The MSE curve of the S network converges at a lower value of about 0.0125, and distributes more stable than T network. The loss of the T network is the same as DAE indicating the ability to reconstruct all training data, the loss of the S network represents the difference in output between the S and T networks, and the more stable loss of the S network means that the S network can learn better the ability of the T network to reconstruct normal data.

Minerals 2022, 12, x FOR PEER REVIEW
The STOAD model is trained with 500 Epochs, and the Teacher loss function Student loss function are plotted in Figure 10, the T network converges at about MS of 0.025 (similar to the above DAE), and fluctuates wider. The MSE curve of the S n converges at a lower value of about 0.0125, and distributes more stable than T ne The loss of the T network is the same as DAE indicating the ability to reconstruct a ing data, the loss of the S network represents the difference in output between th T networks, and the more stable loss of the S network means that the S network ca better the ability of the T network to reconstruct normal data. The prediction parameter of Precision value reaches 96%, Recall value is 90%, value is 93%.
The 3D MPM based on STOAD model shows the high ore-forming probab depth of Zaozigou gold deposit which has the similar characteristics ore-forming bility distribution at depth. It gives evidence for deep mineral resources prediction 11). The prediction parameter of Precision value reaches 96%, Recall value is 90%, and F1 value is 93%.
The 3D MPM based on STOAD model shows the high ore-forming probability at depth of Zaozigou gold deposit which has the similar characteristics ore-forming probability distribution at depth. It gives evidence for deep mineral resources prediction ( Figure 11). ls 2022, 12, x FOR PEER REVIEW 15 of 22 Figure 11. STOAD-based 3D MPM.

Discussion
In this study, the deep learning methods of DAE and STOAD model are carried out for 3D MPM in Zaozigou gold deposit.
Compared with the DAE model, STOAD model has a higher precision in detection of ore-induced anomalies, which demonstrates a higher reliability of 3D MPM (Figures 9  and 11).
Meanwhile, the ROCs of the two methods were used to evaluate the models, and the AUC values of the two methods were 0.92 of DAE and 0.97 of STOAD. Observably, the STOAD model expresses a higher performance ( Figure 12).

Discussion
In this study, the deep learning methods of DAE and STOAD model are carried out for 3D MPM in Zaozigou gold deposit.
Compared with the DAE model, STOAD model has a higher precision in detection of ore-induced anomalies, which demonstrates a higher reliability of 3D MPM (Figures 9 and 11).
Meanwhile, the ROCs of the two methods were used to evaluate the models, and the AUC values of the two methods were 0.92 of DAE and 0.97 of STOAD. Observably, the STOAD model expresses a higher performance ( Figure 12).

Discussion
In this study, the deep learning methods of DAE and STOAD model are carried out for 3D MPM in Zaozigou gold deposit.
Compared with the DAE model, STOAD model has a higher precision in detection of ore-induced anomalies, which demonstrates a higher reliability of 3D MPM (Figures 9  and 11).
Meanwhile, the ROCs of the two methods were used to evaluate the models, and the AUC values of the two methods were 0.92 of DAE and 0.97 of STOAD. Observably, the STOAD model expresses a higher performance ( Figure 12).  From the points of algorithm theory, the STOAD model is proposed based on the KD idea combined with the DAE network, which is an improvement of DAE. So, it is better than the conventional DAE model in terms of algorithm performance.
From the points of application, the results of quantitative mineral resources prediction show that STOAD is more accurate and robust, which is demonstrated by prediction parameters, the loss curve (Figures 8 and 10) and the ROC curve ( Figure 12).
Furthermore, a total of 4 methods, two machine learning methods (MaxEnt model and GMM) and two deep learning methods (DAE and STOAD model), have been carried out for 3D MPM (Figure 13). The machine learning-based 3D MPM can refer to the literature [80], which is the previous work of our research team. From the points of algorithm theory, the STOAD model is proposed based on the KD idea combined with the DAE network, which is an improvement of DAE. So, it is better than the conventional DAE model in terms of algorithm performance.
From the points of application, the results of quantitative mineral resources prediction show that STOAD is more accurate and robust, which is demonstrated by prediction parameters, the loss curve (Figures 8 and 10) and the ROC curve ( Figure 12).
Furthermore, a total of 4 methods, two machine learning methods (MaxEnt model and GMM) and two deep learning methods (DAE and STOAD model), have been carried out for 3D MPM (Figure 13). The machine learning-based 3D MPM can refer to the literature [80], which is the previous work of our research team. The prediction results of the four methods exhibit a high correlation with the known orebodies in the shallow part and a slight difference in the deep part ( Figure 13). From the comparative analysis of the prediction results and model performance among the four methods, the STOAD takes advantage of mathematical theory and has the highest AUC value ( Figure 14). Especially, the high probability area can better reflect the metallogenic regularity at large depth. Therefore, it is decided to delineate a comprehensive mineral exploration targets at large depth of the Zaozigou based on the prediction results of STOAD model and supported by the prediction results of other methods. In order to improve the prediction accuracy, the area with a logical probability greater than 0.75 and less than 0.998 is defined as a medium potential area, and the area with a logical probability greater than 0.998 is defined as a high potential area, based on which three main mineral exploration targets are circled ( Figure 15). The prediction results of the four methods exhibit a high correlation with the known orebodies in the shallow part and a slight difference in the deep part ( Figure 13). From the comparative analysis of the prediction results and model performance among the four methods, the STOAD takes advantage of mathematical theory and has the highest AUC value ( Figure 14). Especially, the high probability area can better reflect the metallogenic regularity at large depth. Therefore, it is decided to delineate a comprehensive mineral exploration targets at large depth of the Zaozigou based on the prediction results of STOAD model and supported by the prediction results of other methods. In order to improve the prediction accuracy, the area with a logical probability greater than 0.75 and less than 0.998 is defined as a medium potential area, and the area with a logical probability greater than 0.998 is defined as a high potential area, based on which three main mineral exploration targets are circled ( Figure 15). Target I is located at elevation 1600~2000 m, belongs to the NE-orientation orebody group, which should be the extension of the Au1 orebody. In this position, we developed the main ore-forming fracture. The Au and Sb concentration in this position, and the ratio of front halo to tail halo has been increasing. Especially, it appears high logical probability calculated by all four methods at this position, indicating the Au1 orebody may extends deeper than the elevation of 1800 m or a concealed orebody exist therein.
Target II is located in the NW-orientation orebody group at the elevation of 2000~2400 m, where the fractures distribute complex, and the tail halo element Co has a significant vertical overlap with the front halo elements Sb and As. The elevated geochemical parameters starting from 2600 m to deep indicate that there may be an extension of ore bodies or new blind orebodies at depth. According to the theory of primary halo zonation, the front halo indicators generally appear at the leading edge of the ore body 100-300 m, and the blind ore body will appear at an altitude of 2400 m deep.  Target I is located at elevation 1600~2000 m, belongs to the NE-orientation orebody group, which should be the extension of the Au1 orebody. In this position, we developed the main ore-forming fracture. The Au and Sb concentration in this position, and the ratio of front halo to tail halo has been increasing. Especially, it appears high logical probability calculated by all four methods at this position, indicating the Au1 orebody may extends deeper than the elevation of 1800 m or a concealed orebody exist therein.
Target II is located in the NW-orientation orebody group at the elevation of 2000~2400 m, where the fractures distribute complex, and the tail halo element Co has a  Target I is located at elevation 1600~2000 m, belongs to the NE-orientation orebody group, which should be the extension of the Au1 orebody. In this position, we developed the main ore-forming fracture. The Au and Sb concentration in this position, and the ratio of front halo to tail halo has been increasing. Especially, it appears high logical probability calculated by all four methods at this position, indicating the Au1 orebody may extends deeper than the elevation of 1800 m or a concealed orebody exist therein.
Target II is located in the NW-orientation orebody group at the elevation of 2000~2400 m, where the fractures distribute complex, and the tail halo element Co has a Target III is located below 1600 m, it can be divided into targets III-1, III-2, and III-3. Target III-1 is located at elevation of about 1300 m, target III-2 located at elevation of about 1000 m., target III-3 located at elevation of 700 m. The deep fracture is the main indicator. It is worth mentioning that the deep drill for scientific research has achieved in 2021, which is the deepest drill of Zaozigou gold deposit, which reaches to the elevation of 700 m ( Figure 15). In previous study, there may be an "upper veins and lower layers" distribution feature of orebodies [82]. The shallow stratigraphy is strongly folded, and steeply dipping fracture structures are developed; the orebody is controlled by fractures and vein rocks, while the deep orebody may be mainly controlled by intrusive mass to form laminated ore bodies. This understanding is also reflected in the present prediction targets.

Conclusions
To address the scientific problem of quantitative mineralization prediction at large depth, the geological and geochemical quantitative mineral resources prediction model at depth of the Zaozigou gold deposit is constructed based on the quantitative extraction of mineralization signatures, which consists of five quantitative prediction indicators, including 30 m fracture buffer zone, element association (Sb-Hg) of ore-bearing structures, metallogenic element Au, near-ore halo element association Au-Ag-Cu-Pb-Zn, and the ratio of front halo to tail halo (As-Sb-Hg)/(W-Mo-Bi). Then, two machine learning methods (MaxEnt model and GMM) and two deep learning methods (DAE and STOAD model) are carried out for 3D MPM, and mineral exploration targets are delineated at depth (Figure 15).
Because the primary halo of drills is limited by the engineering-controlled boundaries, this study adopts the idea of quantitative expression of ore-controlling factors by element associations and limit the deep inference range within the modeling space to perform deep quantitative mineral prediction. However, the accuracy of the prediction of machine learning and deep learning methods inevitably decreases rapidly with increasing depth. This shortcoming can be supplemented by two aspects of quantitative inversion of geophysical data and numerical simulation of deep mineralization dynamics, which is also a direction worth further research.
In summary, a series of methods are developed in this study of 3D Mineral Prospectivity Mapping of Zaozigou gold deposit, West Qinling, China. The achieved results by machine learning and deep learning methods shows well performance in quantitative mineral resources prediction at depth, which is worth being promoted to similar ore deposits. Moreover, it was concluded that the deep learning-based approach works better with larger amounts of data, and the STOAD proposed in this paper has a higher accuracy in anomaly detection. May the relevant ideas of this study provide a reference for related researchers.