Basalt Tectonic Discrimination Using Combined Machine Learning Approach

Geochemical discrimination of basaltic magmatism from different tectonic settings remains an essential part of recognizing the magma generation process within the Earth’s mantle. Discriminating among mid-ocean ridge basalt (MORB), ocean island basalt (OIB) and island arc basalt (IAB) matters to geologists because these are the three basalt types of greatest concern. As a supplement to conventional discrimination diagrams, we attempt to utilize machine learning algorithms (MLAs) for basalt tectonic discrimination. A combined MLA termed swarm optimized neural fuzzy inference system (SONFIS) is presented, based on the neural fuzzy inference system and particle swarm optimization. Two geochemical datasets of basalts from GEOROC and PetDB were used to test the classification performance of SONFIS. Several typical discrimination diagrams and well-established MLAs were also used for performance comparisons with SONFIS. Results indicated that the classification accuracy of SONFIS for MORB, OIB and IAB in both datasets could reach over 90%, superior to the other methods. MLAs also showed certain advantages in making full use of geochemical characteristics and in dealing with datasets containing missing data. Therefore, MLAs provide geologists with research tools beyond discrimination diagrams, and the MLA-based technique is worth extending to tectonic discrimination of other volcanic rocks.


Introduction
Magmatic rocks (e.g., basalt and granite) form in a wide variety of tectonic settings, which primarily include mid-ocean ridges, ocean islands and island arcs. Many scholars have expressed concern regarding the restoration of the original tectonic setting of ancient magmatic rocks [1,2]. The establishment of the plate tectonics hypothesis has brought the study of tectonic setting discrimination of magmatic rocks to the forefront of geological research [3]. Previous studies have shown that it is difficult to achieve this goal through the macro geological setting alone, and that understanding the chemical composition of these rocks is crucial for accurate tectonic setting discrimination [1,4-6]. Magmatic rocks formed in different tectonic settings have unique geochemical characteristics, which are mainly reflected in differences in composition [7][8][9][10]. The contents of major elements, trace elements and isotopic composition can be obtained by whole-rock geochemical analysis. Thus, it is feasible to discriminate the tectonic settings where magma formed and the chemical properties of magmatic source areas through elemental composition [11]. Basalt, as a typical magmatic rock, is widely distributed in the Earth's crust, and its chemical element content can provide sufficient information about mantle sources, mantle partial melting, magma crystallization processes and mantle metasomatism [12,13].
Table 1. A summary of ML models and application scenarios in a few cases.

Neural Fuzzy Inference System (NFIS)
To provide options for geochemists, MLAs applied in tectonic discrimination still need to be explored [10,56]. This study introduces a robust and quantitative MLA called NFIS. Developed since 1993, NFIS integrates the concise form of fuzzy logic with the learning algorithm of neural networks. It is a hierarchical topology based on neural networks that provides the fuzzy function of the fuzzy system [57]. The unified model not only has the self-learning ability of neural networks, but also uses fuzzy logic to make up for the uninterpretability of neural networks [58,59]. Additionally, the unified model utilizes a hybrid algorithm of back-propagation and least squares estimation to adjust the antecedent and consequent parameters, and can automatically generate if-then rules [60]. Structurally, NFIS is an adaptive multi-layer feedforward neural network, composed of individual layers for fuzzification, product, normalization, defuzzification and output [61]. The algorithm has been widely used in different fields (e.g., financial industry [58], hydraulic engineering [59,60], and mining industry [61]) due to its excellent performance. Even so, literature surveys show no research regarding the application of NFIS to geochemical tectonic discrimination, which motivates its application in the current study.

Particle Swarm Optimization (PSO)
Model parameter selection also has a significant influence on classification performance in ML modeling [60]. In previous studies, however, model parameters were not specially optimized when different MLAs were used for tectonic setting discrimination; manual adjustment and grid searches were the common methods of parameter tuning, which makes it difficult to explore the full potential of model performance. As a burgeoning evolutionary computation technique, swarm intelligence optimization has long been a high-priority research area. Metaheuristic optimization techniques offer solutions to the parameter optimization problem using soft computing [62]. A number of well-known metaheuristic algorithms, such as genetic algorithm [63], ant colony optimization [64], differential evolution [65] and PSO [66], have been put forward as possible optimization techniques to address the parameter optimization weaknesses. Among them, PSO is considered a promising and powerful optimization technique, and it was therefore selected to search for the optimal parameters of NFIS [67]. PSO, originally proposed by Kennedy and Eberhart [66], is an evolutionary algorithm for solving continuous, non-linear global optimization problems. The idea was initially inspired by bird flocking patterns, from which a simplified model was established using swarm intelligence [68,69]. PSO-based algorithms are well suited to non-linear and non-convex design spaces with discontinuities on account of easy implementation, high precision, good robustness and fast convergence [70]. The algorithm has attracted extensive attention in academic circles due to its strong ability to solve practical problems [58][59][60][61]. PSO can therefore automatically optimize the antecedent and consequent parameters, which subsequently improves the adaptability of NFIS.

Problem Description and Research Contribution
This section describes some problems with existing discrimination diagrams. Owing to their strong classification capability, SONFIS and other MLAs can overcome these problems to some extent, which is the main contribution of this paper.

Limitations of Conventional Discrimination Diagrams
Though basalt discrimination diagrams are widely used, they have some inherent limitations.

• Constrained by earlier data processing techniques, discrimination diagrams were constructed from sampled data, with typical regions taken as research examples. Although these diagrams have achieved great results, they may no longer be applicable given the accumulation of massive geochemical data. Plotting compiled global data on some classical discrimination diagrams illustrates the problem, with a significant amount of these data being misclassified (see Figure 1) [53]. There is also significant overlap among MORB, OIB and IAB samples.

• Binary or ternary discrimination diagrams are the most commonly used for basalt tectonic discrimination. In other words, only a few elements or element ratios are utilized, which could affect the discrimination effect. In addition, when information about the elements required by a discrimination diagram is missing, the diagram cannot be used.

As such, the expanded known compositional ranges of basalts that form in different tectonic settings mean that the earlier developed discrimination diagrams have significant limitations that cast doubt on the validity of results obtained using these diagrams [42].

Feasibility of MLAs for Tectonic Discrimination
MLAs can overcome the two limitations of conventional discrimination diagrams. Taking SONFIS as an example, the advantages of MLAs over discrimination diagrams are described.

• MLAs originated in the era of big data and adapt well to all kinds of data. As a classifier, SONFIS can be trained on the geochemical data of a large number of basalt samples with known tectonic settings. For samples with unknown tectonic settings, the measured geochemical data can be input directly into the trained SONFIS, which then returns the corresponding tectonic setting. When different data are used to train SONFIS, the model parameters update adaptively to yield a new classifier. In summary, the performance of the classifier depends on the quantity and quality of the geochemical data used for training.

• MLAs place no limit on the amount of input geochemical data. Theoretically, the more effective information is supplied, the better the classifier performs. For samples with unknown tectonic settings, the classification effect of SONFIS remains satisfactory even if some input data are missing. Therefore, MLA-based classifiers have excellent compatibility and robustness.

The advantages of MLAs can make up for the limitations of conventional discrimination diagrams; that is, the MLA-based discrimination method can supplement the discrimination diagrams. This technique will be fully demonstrated through two datasets (in Section 5.2) and comparative experiments (in Section 6).

Neural Fuzzy Inference System (NFIS)
In order to provide a simplified visual process, the NFIS architecture with two inputs and one output is illustrated in Figure 2, where circles and squares represent fixed and adaptive nodes, respectively [71]. The mathematical principle of NFIS is explained within the diagram of this general structure.

For a first-order Takagi-Sugeno fuzzy model, a common rule set contains two fuzzy if-then rules, where $A$ and $B$ are specified as the membership functions for inputs $x$ and $y$, and $p_0$, $p_1$, $p_2$ are the function parameters of output $f$. The node in the $i$-th position of the $k$-th layer is denoted as $O_i^k$, and the node functions in the same layer are of the same function family, as described below.
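The two rules themselves did not survive in this copy. As a hedged sketch, the usual first-order Takagi-Sugeno form is shown here. It assumes that $p_0^i$ is the constant term of rule $i$ and that $p_1^i$ and $p_2^i$ multiply $x$ and $y$; the published layout may differ.

```latex
\begin{align*}
\text{Rule 1: } & \text{if } x \text{ is } A_1 \text{ and } y \text{ is } B_1,
  \text{ then } f_1 = p_0^1 + p_1^1 x + p_2^1 y \\
\text{Rule 2: } & \text{if } x \text{ is } A_2 \text{ and } y \text{ is } B_2,
  \text{ then } f_2 = p_0^2 + p_1^2 x + p_2^2 y
\end{align*}
```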

Layer 1: Fuzzification Layer
The fuzzy membership values for the input factors are generated by a membership function in this layer (Equation (1)). The Gaussian membership function (Equation (2)) is used as the input membership function. Here $A$ and $B$ are the linguistic variables, $\mu_{A_i}(x)$ and $\mu_{B_i}(y)$ are the Gaussian membership functions, and $\sigma$ and $c$ are the antecedent parameters that control the shapes of $\mu_{A_i}(x)$ and $\mu_{B_i}(y)$.

Layer 2: Product Layer
Every node in Layer 2 computes the firing strength $\omega_i$ of a rule according to Equation (3), as the product of all incoming signals.

Layer 3: Normalization Layer
Normalization in this layer follows Equation (4). The output, called the normalized firing strength $\bar{\omega}_i$, is the ratio of the $i$-th rule's firing strength to the sum of all rules' firing strengths.

Layer 4: Defuzzification Layer
This layer performs defuzzification: the value of the rule consequence is calculated for each node, where $p_0^i$, $p_1^i$ and $p_2^i$ are the consequent parameters of the output function $f_i$.

Layer 5: Output Layer
In this layer, the overall output is generated through the summation of all incoming signals.
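The five layers can be traced numerically with a minimal Python sketch of a two-rule, two-input system. The Gaussian form exp(-(v - c)^2 / (2σ^2)), the linear consequent p0 + p1·x + p2·y, and all parameter values are assumptions chosen for illustration, not values from the study.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership (assumed form): exp(-(v - c)^2 / (2 * sigma^2))."""
    return math.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))


def nfis_forward(x, y, antecedents, consequents):
    """Forward pass of a two-input, first-order Takagi-Sugeno system.

    antecedents: per rule, ((c_A, sigma_A), (c_B, sigma_B)).
    consequents: per rule, (p0, p1, p2) with f = p0 + p1 * x + p2 * y (assumed).
    """
    # Layer 1: fuzzification of both inputs for every rule
    memberships = [
        (gaussian_mf(x, c_a, s_a), gaussian_mf(y, c_b, s_b))
        for (c_a, s_a), (c_b, s_b) in antecedents
    ]
    # Layer 2: firing strength = product of incoming membership values
    firing = [mu_a * mu_b for mu_a, mu_b in memberships]
    # Layer 3: normalize each firing strength by the sum over all rules
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: weight each rule's linear consequent by its normalized strength
    weighted = [
        w_bar * (p0 + p1 * x + p2 * y)
        for w_bar, (p0, p1, p2) in zip(normalized, consequents)
    ]
    # Layer 5: overall output = sum of all incoming signals
    return sum(weighted), normalized


# Hypothetical parameters: two rules placed symmetrically around 0.5
antecedents = [((0.2, 0.3), (0.2, 0.3)), ((0.8, 0.3), (0.8, 0.3))]
consequents = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
output, weights = nfis_forward(0.5, 0.5, antecedents, consequents)
```

With these symmetric rules, an input at the midpoint fires both rules equally, so the output is the average of the two consequents. The sketch does not guard against a zero sum of firing strengths, which can occur when an input lies far from every membership center.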

Particle Swarm Optimization (PSO)
PSO is initialized with a set of random particles (candidate solutions) that have random positions and velocities. The algorithm then seeks an optimal solution iteratively, adjusting each particle's position based on its own experience and that of the other particles. The best position found so far by each particle is called its personal best $p_{best}$, and the best position found by any particle in the swarm is called the global best $g_{best}$.
Each particle is accelerated toward its own $p_{best}$ and the swarm's $g_{best}$ during the learning process. This is achieved by calculating a new velocity term for every particle based on its distance from the $p_{best}$ and $g_{best}$ positions. The $p_{best}$ and $g_{best}$ terms are randomly weighted to produce the new velocity, which in turn determines the position of the particle in the next iteration.
Compared with other optimization algorithms, the main advantage of PSO is its simplicity. The velocity update equation (see Equation (7)) and movement equation (see Equation (8)) are the only equations required in PSO. Equation (7) adjusts the velocity vectors using $p_{best}$ and $g_{best}$, while Equation (8) determines the actual motion of particles through their specific velocity vectors. Furthermore, Shi and Eberhart [72] introduced the inertia weight $w$ into PSO in order to improve the convergence speed. The inertia weight determines the contribution rate of the particle's previous velocity to its velocity at the current step.
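Equations (7) and (8) are not reproduced in this copy. A hedged sketch of the commonly used inertia-weight form follows. The symbols $x_i^t$ and $v_i^t$ (position and velocity of particle $i$ at iteration $t$), the acceleration coefficients $c_1$ and $c_2$, and the uniform random numbers $r_1, r_2 \in [0, 1]$ are assumed notation, not taken from the original.

```latex
\begin{align}
v_i^{t+1} &= w\, v_i^{t} + c_1 r_1 \left(p_{best,i} - x_i^{t}\right)
             + c_2 r_2 \left(g_{best} - x_i^{t}\right) \tag{7} \\
x_i^{t+1} &= x_i^{t} + v_i^{t+1} \tag{8}
\end{align}
```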

Overall Methodology: The Proposed Hybrid SONFIS Method
A new hybrid artificial intelligence method for tectonic discrimination of basalts, called SONFIS, was put forward based on NFIS and PSO methods. PSO was introduced to improve the flexibility and generalization ability of NFIS by seeking the optimal parameters corresponding to different datasets. The architecture of the proposed hybrid SONFIS method is shown in Figure 3. The procedure of SONFIS is outlined below.
Step 1: Dataset preparation. The training and test sets were determined by percentage segmentation. The independent and dependent variables were also specified.
Step 2: Model initialization. An initial SONFIS model was generated after all antecedent and consequent parameters were confirmed by trial and error.
Step 3: Parameter optimization. PSO was employed to optimize the model by searching for the optimal values of the antecedent and consequent parameters. Cross-validation played an important role therein.
Step 4: Fitness evaluation. The overall classification accuracy was taken as the fitness function, which was the bridge between NFIS and PSO. Model tuning was based on fitness evaluation results.
Step 5: Model validation. The maximum number of iterations or certain classification precision was considered as the stopping criterion. Once the optimal parameters were found, the SONFIS obtained was validated.
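The way Steps 1 to 5 fit together can be sketched in Python as follows. This is a toy illustration only: a one-feature nearest-center classifier stands in for NFIS, the data points and PSO settings (w, c1, c2, swarm size) are hypothetical, and the cross-validation of Step 3 is omitted for brevity.

```python
import random


def accuracy_fitness(centers, samples):
    """Step 4: fitness = fraction of samples whose nearest center matches the label."""
    hits = 0
    for value, label in samples:
        nearest = min(range(len(centers)), key=lambda k: abs(value - centers[k]))
        hits += int(nearest == label)
    return hits / len(samples)


def tune_with_pso(samples, n_params, n_particles=15, max_iter=50, seed=1):
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # hypothetical inertia and acceleration settings
    # Step 2: initial parameter guesses, one vector per particle
    pos = [[rng.random() for _ in range(n_params)] for _ in range(n_particles)]
    vel = [[0.0] * n_params for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_fit = [accuracy_fitness(p, samples) for p in pos]
    top = max(range(n_particles), key=lambda i: p_fit[i])
    g_best, g_fit = p_best[top][:], p_fit[top]
    start_fit = g_fit
    # Step 3: PSO search; stops at max_iter or once the target accuracy is met
    for _ in range(max_iter):
        if g_fit >= 1.0:
            break
        for i in range(n_particles):
            for d in range(n_params):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (p_best[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = accuracy_fitness(pos[i], samples)
            if fit > p_fit[i]:
                p_best[i], p_fit[i] = pos[i][:], fit
                if fit > g_fit:
                    g_best, g_fit = pos[i][:], fit
    return g_best, g_fit, start_fit


# Step 1: toy scaled feature with labels 0 (MORB), 1 (OIB), 2 (IAB)
train = [(0.05, 0), (0.10, 0), (0.15, 0), (0.45, 1), (0.50, 1),
         (0.55, 1), (0.85, 2), (0.90, 2), (0.95, 2)]
test = [(0.12, 0), (0.52, 1), (0.88, 2)]
params, train_acc, start_acc = tune_with_pso(train, n_params=3)
# Step 5: validate the tuned parameters on held-out samples
test_acc = accuracy_fitness(params, test)
```

Because the global best is replaced only when a particle improves on it, the tuned training accuracy can never fall below the accuracy of the best initial guess.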

Data Acquisition and Preprocessing
The two geochemical datasets of basalts from GEOROC and PetDB were used to test the classification performance of the proposed SONFIS model. One was a high-dimensional dataset, and the other was low-dimensional. The usability of SONFIS could be validated by using both types of datasets.

Dataset 1 with High-Dimensional Features
An extensive dataset of 938 basalt samples was obtained after data cleaning [1]. Major element concentrations, along with their minimum, maximum and mean values, are given in wt%; minor and trace element concentrations and their statistics are given in ppm.

Dataset 2 with Low-Dimensional Features
The low-dimensional dataset was larger than dataset 1, containing 1582 basalt samples [10]. Five hundred and thirty-nine MORB samples came from the Atlantic and Pacific mid-ocean ridges, 463 OIB samples were mainly distributed in the Atlantic and Pacific regions and 580 IAB samples were from the west and east coast of the Pacific Rim. Unlike dataset 1, each basalt sample in dataset 2 contained only 12 elements, which were K₂O, CaO, SiO₂, MgO, NiO, Na₂O, FeOᵀ, TiO₂, Al₂O₃, MnO, Cr₂O₃ and P₂O₅. The basic statistics for each element corresponding to the three tectonic settings are presented in Table 3. Element concentrations, along with their minimum, maximum and mean values, are given in wt%.
It should be noted that both datasets contained missing data [1,10]. Nothing was done with the missing data in this study, which once again illustrates the superiority of the proposed SONFIS method. Moreover, each geochemical feature in both datasets was treated as an input attribute, while the output classes were unified into three basalt tectonic settings, that is, MORB, OIB and IAB. Since the fuzzy membership values of the SONFIS model lie between 0 and 1, the values of the input features were adjusted to fit this range. The three output classes, MORB, OIB and IAB, were encoded as 0, 1 and 2, respectively. Percentage segmentation is a conventional method used for model validation in ML modeling. Both datasets were randomly split into two subsets, one as a training set (80%) and the other as a test set (20%) [10]. The segmentation results of the two datasets are shown in Tables 4 and 5, respectively. In these tables, bold italics mark the classification accuracy of the best classifier for each tectonic setting.
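The preprocessing described above can be sketched in Python as follows. Min-max scaling is an assumption here, since the text only says that features were adjusted to the range 0 to 1. The sample values are hypothetical, and missing entries are represented by None and left untouched.

```python
import random

# Class encoding stated in the text
LABELS = {"MORB": 0, "OIB": 1, "IAB": 2}


def min_max_scale(rows):
    """Scale each column to [0, 1]; missing values (None) are left as None.

    Assumes every column holds at least one observed value.
    """
    scaled = [list(row) for row in rows]
    for j in range(len(rows[0])):
        present = [row[j] for row in rows if row[j] is not None]
        low, high = min(present), max(present)
        span = high - low
        for row in scaled:
            if row[j] is not None:
                row[j] = (row[j] - low) / span if span > 0 else 0.0
    return scaled


def split_80_20(items, seed=0):
    """Random percentage split: 80% training, 20% test."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical two-feature samples with one missing measurement
rows = [[49.5, 1.2], [47.0, None], [52.1, 0.8], [50.3, 2.9], [48.8, 1.5]]
settings = ["MORB", "IAB", "OIB", "MORB", "IAB"]

scaled = min_max_scale(rows)
labels = [LABELS[name] for name in settings]
train_set, test_set = split_80_20(list(zip(scaled, labels)))
```

Computing the scaling limits on the full table, as done here, is the simplest choice; computing them on the training subset only is a common alternative that avoids leaking test-set information.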

Model Parameter Configuration
For parameter learning and tuning, the K-fold cross-validation technique was adopted for model optimization in order to verify the classification performance of the training model and adjust the parameters, as shown in Figure 5 [1,8,10]. As a general empirical rule, K = 10 is preferred. The raw dataset was first divided into 10 subsets. The hold-out method was then repeated 10 times, such that each time, one of the 10 subsets was used as the test set and the remaining nine subsets were grouped together to form a training set. The error estimate was finally averaged over all 10 trials to obtain the overall effectiveness of the model.
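The fold construction can be sketched in Python as follows. The helper names and the shuffling seed are illustrative choices, and the sample count of 938 is simply the size of dataset 1.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, evaluate, k=10):
    """Average a user-supplied evaluate(train_idx, test_idx) score over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


splits = list(k_fold_indices(938, k=10))
```

Each sample appears in exactly one test fold, so every sample is used nine times for training and once for testing.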

Fuzzy c-means clustering [60] was employed to partition all training samples into c = 10 clusters, a number determined by trial and error. The fuzzy if-then rules were generated from the clusters, and the optimal parameters of the rules were acquired through the PSO algorithm, as shown in Figure 3. During the optimization process, different combinations of antecedent and consequent parameters were generated, and each combination was tested with 10-fold cross-validation, resulting in a total of 100 iterations. The variant of classification accuracy was used for fitness evaluation. When the optimization was completed, the optimal position $g_{best}$ of the swarm was obtained, providing the highest classification accuracy once the stopping criterion was satisfied. The optimized values of the antecedent and consequent parameters of SONFIS were then determined, and the corresponding model was finally validated using the test set prior to being used for discriminating the tectonic settings of unknown samples.

Model Performance Evaluation
Evaluation metrics are fundamental to evaluating the proposed SONFIS method with existing datasets. With specific metrics, model performance can be reported reasonably and accurately. A comprehensive evaluation system, including the classification accuracy and the confusion matrix, was used to assess the classification effect of each MLA. The classification accuracy $C_i$ of an individual program $i$ depends on the number of samples correctly classified and was quantitatively evaluated by Equation (9) [53].
$$C_i = \frac{T}{N} \times 100\% \quad (9)$$
where $T$ is the number of sample cases correctly classified, and $N$ is the total number of sample cases. As a visual tool, the confusion matrix is often used to describe the performance of a classifier on a test set for which the actual values are known, and it intuitively exhibits classification precision [10]. It can also be described as a specific type of contingency table with two dimensions and identical sets of classes in each dimension. The horizontal dimension denotes the actual class, and the vertical dimension denotes the predicted class. The numbers of correct and incorrect predictions are summarized as count values and broken down by class. Further indexes (e.g., precision and recall) can also be derived from the confusion matrix. In summary, a confusion matrix provides insight not only into the errors being made by a classifier but, more importantly, into the types of errors being made.
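These metrics can be computed with a few lines of Python, sketched here for illustration. The predictions are made up, and the matrix follows the orientation described above, with rows for the predicted class and columns for the actual class.

```python
def confusion_matrix(actual, predicted, n_classes=3):
    """matrix[p][a] counts samples of actual class a predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[p][a] += 1
    return matrix


def overall_accuracy(matrix):
    """Classification accuracy in percent: correct (diagonal) over total."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


def precision_recall(matrix, k):
    """Precision and recall of class k derived from the confusion matrix."""
    predicted_k = sum(matrix[k])                # row total: predicted as k
    actual_k = sum(row[k] for row in matrix)    # column total: truly k
    precision = matrix[k][k] / predicted_k if predicted_k else 0.0
    recall = matrix[k][k] / actual_k if actual_k else 0.0
    return precision, recall


# Hypothetical labels: 0 = MORB, 1 = OIB, 2 = IAB
actual = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
predicted = [0, 0, 1, 1, 1, 1, 2, 2, 0, 2]

matrix = confusion_matrix(actual, predicted)
accuracy = overall_accuracy(matrix)
oib_precision, oib_recall = precision_recall(matrix, 1)
```

In this toy case one MORB sample is predicted as OIB and one IAB sample as MORB, so the off-diagonal cells show which classes are confused with which, something the single accuracy figure cannot reveal.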

Model Validation Scheme Design
To assess the classification performance of the proposed SONFIS method when processing geochemical data, several experiments were carried out using the two datasets described in Section 5.2.1. First, the effectiveness of the PSO algorithm in optimizing model parameters was verified and compared with two other common optimization techniques, manual adjustment and grid search. Second, three common MLAs (SVM, RF and NB) were compared with the SONFIS model, and the newer LRC and MLP models were additionally used for performance comparison. The proposed SONFIS method and the other five MLAs were all implemented in MATLAB® R2016b.

Optimization Effect Verification
The parameters of NFIS were tuned on the same datasets (Section 5.2.1) and computing platform (Intel® Core™ i7-8700 CPU @ 3.20 GHz, 16 GB of RAM, and a 64-bit Windows 10 OS), using manual adjustment, grid search and the PSO algorithm (i.e., SONFIS) in turn. Each method was run ten times, and the average classification accuracy obtained was used as the comparison value. The results show that the classification accuracy of SONFIS was approximately five to ten percent higher than that of the other two optimized models when discriminating among the tectonic settings of basalts. This improvement in classification performance is consistent with the literature [58]. The SONFIS model converged after 100 iterations, consequently taking less time, which suggests that the PSO algorithm was superior in both performance and efficiency for parameter optimization.

MLA Performance Comparison
For dataset 1, the basalt tectonic discrimination results obtained by six MLAs are shown in Table 4 and Figure 6. It can be seen from Table 4 that all MLAs had good classification performance for the three tectonic settings. The data demonstrated that SONFIS was the most capable, followed by RF, SVM and MLP, all of which had an overall classification accuracy of over 90%. However, LRC and NB did not perform well in comparison. SONFIS also demonstrated top performance in terms of the classification accuracy for each single tectonic setting. The discrimination effects of MLP on OIB and RF on IAB were also outstanding. In Figure 6, darker colored squares on the main diagonal indicate higher accuracy of the classification results. In the confusion matrix for each MLA, the squares on the main diagonal were the darkest, indicating that the number of correct classifications was large. Again, the SONFIS model performed best in the three classes. Overall, all MLAs used could take advantage of the high-dimensional information in dataset 1 and had a remarkable discrimination ability, which was not achieved in the discrimination diagrams [53]. These results were additionally consistent with previous results obtained by Ueki et al. [8].
For dataset 2, the basalt tectonic discrimination results acquired by six MLAs are shown in Table 5 and Figure 7. In this dataset, there were only a few elements and no isotopes. As noted by Petrelli and Perugini [40], a small number of elements can affect the classification effect of MLAs, which was also confirmed by the classification results below. For example, the three models MLP, SVM and RF did well with dataset 1 but did not perform well with dataset 2. However, the SONFIS model still classified best, with the single and overall classification accuracy reaching about 95%, demonstrating that the integration of the PSO algorithm improved the adaptability of SONFIS. The effects of information loss were also visually illustrated in Figure 7. Unlike Figure 6, not all of the darkest squares were on the main diagonal within every subgraph. However, the confusion matrix corresponding to the SONFIS model was consistent with Figure 6, which demonstrated the advantages of using the combined model.
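The quantities read from the tables and confusion matrices above can be sketched as follows. The labels are made-up toy values, not samples from dataset 1 or dataset 2, and the helper names are hypothetical. The sketch also assumes that the "single" classification accuracy of a tectonic setting means the fraction of samples of that setting that were classified correctly (the diagonal entry divided by its row sum).

```python
CLASSES = ["MORB", "OIB", "IAB"]

def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Rows are true settings, columns are predicted settings."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_accuracy(matrix):
    """Assumed definition: diagonal count divided by the row total."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Share of all samples that lie on the main diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Toy labels for illustration only (not data from this paper).
y_true = ["MORB"] * 4 + ["OIB"] * 3 + ["IAB"] * 3
y_pred = ["MORB", "MORB", "MORB", "OIB",
          "OIB", "OIB", "OIB",
          "IAB", "IAB", "MORB"]

if __name__ == "__main__":
    cm = confusion_matrix(y_true, y_pred)
    for name, row in zip(CLASSES, cm):
        print(name, row)
    print(per_class_accuracy(cm), overall_accuracy(cm))
```

A matrix whose largest entries all sit on the main diagonal corresponds to the pattern described for Figure 6, whereas large off-diagonal entries correspond to the misclassifications visible for some models in Figure 7.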

Contrast with Conventional Discrimination Diagrams
Some scholars [26,53] have used conventional discrimination diagrams to classify the corresponding datasets in Section 5.2.1 into MORB, OIB and IAB, which facilitates a performance contrast between MLAs and discrimination diagrams. For dataset 1, Han et al. [53] adopted five common diagrams for tectonic discrimination: TiO2-MnO-P2O5, FeOT-MgO-Al2O3, Ti-Zr-Y, Zr/Y-Zr and Ti-Zr (see Figure 1). In the initial state, the Ti-Zr diagram had the best discrimination effect, with a classification accuracy of about 75%. When the missing data were processed, the classification accuracy of the Zr/Y-Zr diagram could reach 90%, but it was still inferior to some MLAs (e.g., MLP, SVM, RF and SONFIS). For dataset 2, Li et al. [26] established, through trial and error, the novel FeOT/Na2O-FeOT/CaO diagram, which had an excellent classification effect for MORB, OIB and IAB, although the discrimination between MORB (or OIB) and IAB was still not satisfactory. By contrast, the proposed SONFIS method performed well for each tectonic setting.
The contrast may be incomplete because it did not cover all the basalt discrimination diagrams, but it still showed several advantages of MLAs: (1) MLAs could take advantage of all the element information even if the dataset contained missing data; that is, MLAs were less selective about samples. (2) The classification accuracy of MLAs was often higher, and the classification effect was uniform across the individual settings. (3) MLAs could be easily transferred to the tectonic discrimination of other volcanic rocks.

Discussion: Applicability and Deficiency of MLA-Based Discrimination Method
MLAs can extract valuable information from diverse data, which was taken into account in the selection of basalt samples. The geochemical data used in this paper were global data of the Cenozoic era with a known tectonic environment background; data from before the Cenozoic era were excluded. The datasets were large and drawn from numerous sources, including fresh, altered and metamorphic basalts. The richness and diversity of the geochemical datasets made for a reliable SONFIS model with high classification accuracy. Nevertheless, one shortcoming was that basalts influenced by crustal contamination were not considered, because only three types of tectonic settings (MORB, OIB and IAB) were studied. More tectonic settings also need to be classified accurately by MLAs in follow-up studies.
Compared with binary or ternary discrimination diagrams, another deficiency of MLAs was that high-dimensional data (more than three dimensions) were difficult to visualize. Although geologists could directly call on the trained model for tectonic discrimination and obtain satisfactory classification results, the visualization problem would limit its practical application to some extent. It is imperative to solve the visualization problem, and many high-dimensional visualization (dimensionality reduction) methods are being tried, such as principal component analysis and t-SNE. In addition, some MLAs are black-box models, which makes it difficult for geochemists to use the internal classification rules directly. The reasonable interpretation of classification results obtained by MLAs also needs to be resolved.
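As an illustration of the dimensionality reduction mentioned above, the following minimal sketch projects high-dimensional samples onto their first two principal components so that they can be drawn on an ordinary binary plot. It is not part of the workflow of this paper. The input is synthetic, the function names are hypothetical, and the eigenvectors are obtained with plain power iteration and deflation only to keep the example dependency-free; in practice a numerical library would normally be used instead.

```python
import random

def center(rows):
    """Subtract the column mean from every feature."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(mat, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in mat]

def leading_eigenpair(mat, iters=500, seed=0):
    """Power iteration: dominant eigenvalue and a unit eigenvector."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in mat]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        w = mat_vec(mat, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigval = sum(a * b for a, b in zip(v, mat_vec(mat, v)))
    return eigval, v

def pca_2d(rows):
    """Return (scores, components) for the first two principal components."""
    X = center(rows)
    C = covariance(X)
    d = len(C)
    components = []
    for _ in range(2):
        lam, v = leading_eigenpair(C)
        components.append(v)
        # Deflation: remove the direction just found before the next search.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    scores = [[sum(a * b for a, b in zip(x, c)) for c in components] for x in X]
    return scores, components

if __name__ == "__main__":
    # Synthetic four-feature samples standing in for element concentrations.
    rng = random.Random(1)
    samples = [[rng.gauss(0, 10), rng.gauss(0, 3), rng.gauss(0, 1), rng.gauss(0, 0.3)]
               for _ in range(60)]
    scores, _ = pca_2d(samples)
    print(scores[:3])
```

Each row of `scores` gives the two plotting coordinates of one sample. Unlike t-SNE, this projection is linear, so each axis remains a fixed weighted combination of the original features.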

Conclusions and Future Work
A combined technique based on fuzzy logic and neural networks, termed NFIS, was newly introduced into the research area of tectonic discrimination of basalts in this paper. A metaheuristic optimization algorithm PSO was also utilized to avoid model parameters restricting the classification performance of NFIS. As such, NFIS and PSO were combined to construct a novel intelligent discrimination method called SONFIS. The accuracy and generalizability of NFIS were compared with that of five well-established MLAs using high-dimensional and low-dimensional datasets. The model evaluation results were subsequently represented by classification accuracy calculations and confusion matrix visualization. The conclusions are summarized as follows:
• It could be found from Section 6.1 that with the help of PSO, the overall classification accuracy of SONFIS was about 5% higher than that of NFIS optimized by manual adjustment and grid search. Compared with grid search-optimized NFIS, SONFIS was more complicated, but it demonstrated better classification performance, indicating that the combined model was worth exploring.
• SONFIS had excellent generalization capacity, since PSO could automatically search for optimal parameters for different datasets. SONFIS could also be accurately applied to both high-dimensional and low-dimensional datasets, which was valuable for the study of petrology and geochemistry.
• The comparative experiments show that SONFIS was competitive for the two datasets used, demonstrating classification accuracy over 90% for both datasets. Furthermore, more elements could be utilized by SONFIS, giving it a superior ability to avoid the unreliability of the discrimination results.
• The other five well-established MLAs were also excellent methods for the tectonic discrimination of basalts, showing that ML was a particularly useful and promising tool in geochemical research. The combination of large databases and ML techniques might yield unexpected results.
Although ML had outstanding potential in petrological and geochemical research, it inevitably had some disadvantages, such as difficulty in interpretation and expression. It is necessary to explore effective methods to solve these problems in future studies, so as to improve the usability of ML in Earth Science.