Basalt Tectonic Discrimination Using Combined Machine Learning Approach

Ren, Qiubing; Li, Mingchao; Han, Shuai; Zhang, Ye; Zhang, Qi; Shi, Jonathan

doi:10.3390/min9060376

by Qiubing Ren ¹, Mingchao Li ¹,*, Shuai Han ¹, Ye Zhang ¹, Qi Zhang ² and Jonathan Shi ³

¹ State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin University, Tianjin 300354, China

² Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing 100029, China

³ College of Engineering, Louisiana State University, Baton Rouge, LA 70803, USA

* Author to whom correspondence should be addressed.

Minerals 2019, 9(6), 376; https://doi.org/10.3390/min9060376

Submission received: 20 March 2019 / Revised: 6 June 2019 / Accepted: 18 June 2019 / Published: 22 June 2019

(This article belongs to the Section Mineral Geochemistry and Geochronology)

Abstract: Geochemical discrimination of basaltic magmatism from different tectonic settings remains an essential part of recognizing the magma generation process within the Earth’s mantle. Discriminating among mid-ocean ridge basalt (MORB), ocean island basalt (OIB) and island arc basalt (IAB) matters to geologists because these are the three most widely studied basalt types. As a supplement to conventional discrimination diagrams, we attempt to use machine learning algorithms (MLAs) for basalt tectonic discrimination. A combined MLA termed the swarm optimized neural fuzzy inference system (SONFIS) is presented, based on the neural fuzzy inference system and particle swarm optimization. Two geochemical datasets of basalts from GEOROC and PetDB served to test the classification performance of SONFIS. Several typical discrimination diagrams and well-established MLAs were also used for performance comparisons with SONFIS. Results indicated that the classification accuracy of SONFIS for MORB, OIB and IAB in both datasets could reach over 90%, superior to the other methods. The results also show that MLAs had certain advantages in making full use of geochemical characteristics and in dealing with datasets containing missing data. Therefore, MLAs provide geologists with research tools beyond discrimination diagrams, and the MLA-based technique is worth extending to the tectonic discrimination of other volcanic rocks.

Keywords: basalt; tectonic setting; geochemical discrimination; machine learning; neural fuzzy inference system; particle swarm optimization

1. Introduction

Magmatic rocks (e.g., basalt and granite) form in a wide variety of tectonic settings, which primarily include mid-ocean ridges, ocean islands and island arcs. Many scholars have expressed concern regarding the restoration of the original tectonic setting of ancient magmatic rocks [1,2]. The establishment of the plate tectonics hypothesis has brought the study of tectonic setting discrimination of magmatic rocks to the forefront of geological research [3]. Previous studies have shown that it is difficult to achieve this goal through the macro geological setting alone, and that understanding the chemical composition of these rocks is crucial for accurate tectonic setting discrimination [1,4,5,6]. Magmatic rocks formed in different tectonic settings have unique geochemical characteristics, which are mainly reflected in differences in composition [7,8,9,10]. The contents of major elements, trace elements and isotopic composition can be obtained by whole-rock geochemical analysis. Thus, it is feasible to discriminate the tectonic settings where magma formed and the chemical properties of magmatic source areas through elemental composition [11]. Basalt, as a typical magmatic rock, is widely distributed in the Earth’s crust, and its chemical element content can provide sufficient information about mantle sources, mantle partial melting, magma crystallization processes and mantle metasomatism [12,13]. Moreover, basalts are mafic rocks, which are closer in composition to the parental magma that was extracted from its mantle source [14,15]. For these reasons, basalt is usually the subject of tectonic setting discrimination research [16,17,18].

Mid-ocean ridge basalt (MORB), ocean island basalt (OIB) and island arc basalt (IAB) are the three types of basalt that have received the most academic attention. MORBs erupt at mid-ocean ridges, away from subduction. They are derived from the depleted upper mantle by a high degree of partial melting, and are generally characterized by low K₂O and TiO₂ contents and by depletion in incompatible elements. OIBs occur as islands far from subduction zones and are associated with within-plate hotspot activity. They come from regions of the lower mantle enriched in large-ion lithophile elements (LILE), and their origin is related to mantle plumes. IABs are typical products of subduction zone magmatism, derived from the depleted upper mantle and components from the subducted slab [19,20,21,22,23]. Thus, how to discriminate the tectonic settings of basalts has become an essential issue in geochemistry [24,25,26].

The basalt discrimination diagram method was an important scientific development that followed the rise of plate tectonics. In the 1970s, a group of scholars, represented by Pearce, devoted themselves to the geochemical tectonic discrimination of basalts and established the discrimination diagram method, which opened up a new approach to basalt geodynamics research [10,27,28,29]. Based on the different source compositions of MORB, OIB and IAB, they explored the influence of the depth of magma formation, the degree of partial melting and magma evolution on the geochemical properties of basalts [30,31,32]. Starting from statistical theory and combining it with typical regional case studies, they put forward many basalt tectonic discrimination diagrams and achieved great results [33,34,35,36,37,38,39]. Subsequently, the discrimination diagram method was rapidly extended, which promoted the development of basalt research and plate tectonics. However, with the accumulation of geochemical data, it has been found that the common discrimination diagrams are not always applicable to new datasets [1,8,12,13,40,41,42]. Some scholars are still trying to explore other methods to establish new discrimination diagrams. Linear discriminant analysis [4], primordial mantle-normalized diagrams [43] and other geochemical discrimination tools [44] have been proposed in turn as alternatives. Despite some progress, none of these has been widely adopted, and the discrimination diagram method has reached a bottleneck. Therefore, we attempt to use artificial intelligence methods to discriminate among MORB, OIB and IAB.

The main aim of this paper is to propose a new hybrid artificial intelligence method for tectonic setting discrimination of basalts, which we have termed swarm optimized neural fuzzy inference system (SONFIS). A set of scientific and reasonable model evaluation criteria were established, which included the quantitative classification accuracy and confusion matrix visualization. Two geochemical datasets of basalts were compiled to validate SONFIS. The classification performance of SONFIS was evaluated using the same datasets through comparisons with other well-established machine learning (ML) models, such as support vector machine (SVM), random forest (RF) and naive Bayes (NB). Moreover, the logistic regression classifier (LRC) and multilayer perceptron (MLP), which have not been applied in tectonic discrimination, have also been included for comparison.

This paper is organized as follows. A literature review is introduced in Section 2. Section 3 depicts the limitations of conventional discrimination diagrams and how to overcome these problems. In Section 4, the mathematical principles of the main methods used, including the neural fuzzy inference system (NFIS) and particle swarm optimization (PSO), are briefly described. Section 5 presents the proposed hybrid SONFIS method and its implementation procedure. Section 6 illustrates and discusses the results of comparative experiments and the applicability of ML modeling in tectonic discrimination. Conclusions and future work are finally provided in Section 7.

2. Literature Review

Data sharing has dramatically expanded with the establishment of some open-access and comprehensive global geochemical databases, such as GEOROC [45] and PetDB [46], which provide reliable data support for big data analysis [47,48]. Big data makes it possible to reevaluate the conventional discrimination diagrams that are based on small samples and traditional statistics [49,50,51]. Li et al. [43], Wang et al. [13] and Di et al. [7,18] adopted discrimination diagrams to conduct big data mining research on global data samples respectively, which again showed the inapplicability of the statistical method. In an attempt to make a simplification, tectonic setting discrimination was transformed into a multi-classification task, and ML techniques were introduced into geochemical tectonic discrimination due to their dominant classification performance and innate adaptability to big data [1,4,8,40,52]. At present, this technique is still scarcely applied for tectonic discrimination, despite its extensive application in other fields [3,53,54,55]. Only several commonly used ML algorithms (MLAs) have been utilized in a few cases beyond basalts, such as classification trees (CT), SVM, decision tree (DT), RF, sparse multinomial regression (SMR), NB and k-nearest neighbors (KNN), as summarized in Table 1. These cases indicate that the ML technique, which can utilize more elements and have excellent generalization capability, has a good potential for tectonic discrimination, and it is expected to become another common method in addition to conventional discrimination diagrams.

2.1. Neural Fuzzy Inference System (NFIS)

To provide options for geochemists, MLAs applied in tectonic discrimination still need to be explored [10,56]. This study introduces a robust and quantitative MLA called NFIS. Developed since 1993, NFIS integrates the concise form of fuzzy logic with the learning algorithm of neural networks. It is a hierarchical topology based on neural networks to provide the fuzzy function of the fuzzy system [57]. The unified model not only has the self-learning ability of neural networks, but also makes up for the uninterpretability of neural networks using fuzzy logic [58,59]. Additionally, the unified model utilizes a hybrid algorithm of back-propagation and least squares estimation to adjust the antecedent and consequent parameters, and can automatically generate if-then rules [60]. Structurally, NFIS is an adaptive multi-layer feedforward neural network, composed by individual layers for fuzzification, product, normalization, defuzzification and output [61]. The algorithm has been widely used in different fields (e.g., financial industry [58], hydraulic engineering [59,60], and mining industry [61]) due to its excellent performance. Even so, literature surveys show that there is no research regarding the application of NFIS to geochemical tectonic discrimination, thus leading us to provide such applications in our current study.

2.2. Particle Swarm Optimization (PSO)

On the other hand, model parameter selection has a significant influence on classification performance in ML modeling [60]. However, in previous studies, model parameters were not specially optimized when different MLAs were used for tectonic setting discrimination, but instead manual adjustment and grid searches have been common methods of parameter tuning, which makes it difficult to explore the full potential of model performance. Swarm intelligence optimization has always been an area of high priority research as a burgeoning evolutionary computation technique. The advantages of metaheuristic optimization techniques provide solutions to the parameter optimization problem using soft computing [62]. A number of well-known metaheuristic algorithms, such as genetic algorithm [63], ant colony optimization [64], differential evolution [65] and PSO [66], have been put forward as possible optimization techniques to address the parameter optimization weaknesses. Among them, PSO is considered as a promising and powerful optimization technique, thus it was selected to search for optimal parameters with NFIS [67]. PSO, which was originally attributed to Kennedy and Eberhart [66], is an advanced evolutionary algorithm for solving the continuous global optimization problem using a non-linear technique. The idea was initially inspired by the bird flock patterns, and then a simplified model was established using swarm intelligence [68,69]. The PSO-based algorithm is well suited to deal with non-linear and non-convex design spaces with discontinuities on account of easy implementation, high precision, good robustness and fast convergence [70]. The algorithm has attracted extensive attention in academic circles due to its strong ability to solve practical problems [58,59,60,61]. Therefore, PSO can automatically optimize the antecedent and consequent parameters, which subsequently improves the adaptability of NFIS.

3. Problem Description and Research Contribution

This section first describes some problems with existing discrimination diagrams. Owing to their strong classification capability, SONFIS and other MLAs can overcome these problems to some extent, which is the main contribution of this paper.

3.1. Limitations of Conventional Discrimination Diagrams

Though basalt discrimination diagrams are widely used, they have some inherent limitations.

First, restricted by the data processing techniques available at the time, the earlier discrimination diagrams were derived from sampled data, with typical regions taken as research examples. Although these diagrams have achieved great results, they may not remain applicable as massive geochemical data accumulate. Plotting compiled global data on some classical discrimination diagrams illustrates the problem, with a significant amount of these data being misclassified (see Figure 1) [53]. There is also significant overlap among MORB, OIB and IAB samples.

Second, binary or ternary discrimination diagrams are the most commonly used for basalt tectonic discrimination. In other words, only a few elements or element ratios are utilized, which could affect the discrimination effect. In addition, when a sample lacks data for an element required by a discrimination diagram, the diagram cannot be used for that sample.

As such, the expanded known compositional ranges of basalts that form in different tectonic settings mean that the earlier developed discrimination diagrams have significant limitations that cast doubt on the validity of results obtained using these diagrams [42].

3.2. Feasibility of MLAs for Tectonic Discrimination

MLAs can overcome the two limitations of conventional discrimination diagrams. Taking SONFIS as an example, the advantages of MLAs over discrimination diagrams are described.

First, MLAs originated in the era of big data and adapt well to all kinds of data. As a classifier, SONFIS can be trained on the geochemical data of a large number of basalt samples with known tectonic settings. For samples with unknown tectonic settings, the measured geochemical data can be directly input into the trained SONFIS, and the corresponding tectonic setting can then be easily acquired. When different data are used to train SONFIS, the model parameters update adaptively to yield a new, different classifier. In summary, the performance of the classifier depends on the quantity and quality of the geochemical data used for training.

Second, MLAs place no limit on the amount of input geochemical data. Theoretically, the more effective information is available, the better the performance of the classifier. For samples with unknown tectonic settings, the classification effect of SONFIS remains satisfactory even if some input data are missing. Therefore, MLA-based classifiers have excellent compatibility and robustness.

The advantages of MLAs can make up for the limitations of conventional discrimination diagrams, that is, the MLA-based discrimination method can supplement the discrimination diagrams. This technique will be fully demonstrated through two datasets (in Section 5.2) and comparative experiments (in Section 6).

4. Mathematical Principles of Main Algorithms

4.1. Neural Fuzzy Inference System (NFIS)

In order to provide a simplified visual process, the NFIS architecture with two inputs and one output is illustrated in Figure 2, where circles and squares represent fixed and adaptive nodes, respectively [71]. The mathematical principle of NFIS is explained within the diagram of this general structure.

For a first-order Takagi-Sugeno fuzzy model, a common rule set with two fuzzy if-then rules is presented as follows:

Rule i: If x is A_{i} and y is B_{i}, then f_{i} = p_{0}^{i} + p_{1}^{i} x + p_{2}^{i} y, i = 1, 2,

where A_{i} and B_{i} are specified as the membership functions for inputs x and y, and p_{0}^{i}, p_{1}^{i} and p_{2}^{i} are the parameters of the output function f_{i}. It should also be noted that the node in the i-th position of the k-th layer is denoted as O_{i}^{k}, and the node functions in the same layer are of the same function family as described below.

4.1.1. Layer 1: Fuzzification Layer

The fuzzy membership values for input factors are generated using a membership function in this layer, as seen in Equation (1). The Gaussian membership function, namely Equation (2), is considered as the input membership function.

O_{i}^{1} = μ_{A_{i}}(x), O_{i}^{1} = μ_{B_{i}}(y), i = 1, 2,    (1)

μ_{A_{i}}(x) = e^{- \frac{(x - c_{i})^{2}}{2 σ_{i}^{2}}}, μ_{B_{i}}(y) = e^{- \frac{(y - c_{i})^{2}}{2 σ_{i}^{2}}}, i = 1, 2,    (2)

where A and B are the linguistic variables, μ_{A_{i}}(x) and μ_{B_{i}}(y) are the Gaussian membership functions, and σ and c are the antecedent parameters (width and center, respectively) that control the shapes of μ_{A_{i}}(x) and μ_{B_{i}}(y).

4.1.2. Layer 2: Product Layer

Every node in Layer 2 computes the firing strength ω_{i} of the corresponding rule based on Equation (3), which is the product of all incoming signals.

O_{i}^{2} = ω_{i} = μ_{A_{i}}(x) \cdot μ_{B_{i}}(y), i = 1, 2.    (3)

4.1.3. Layer 3: Normalization Layer

Normalization in this layer is done in accordance with Equation (4). The output, called the normalized firing strength \bar{ω}_{i}, is calculated as the ratio of the i-th rule’s firing strength to the sum of all rules’ firing strengths.

O_{i}^{3} = \bar{ω}_{i} = \frac{ω_{i}}{ω_{1} + ω_{2}}, i = 1, 2.    (4)

4.1.4. Layer 4: Defuzzification Layer

This layer is used for defuzzification, where the value of the rule consequence for each node is calculated as follows.

O_{i}^{4} = \bar{ω}_{i} f_{i} = \bar{ω}_{i} (p_{0}^{i} + p_{1}^{i} x + p_{2}^{i} y), i = 1, 2,    (5)

where p_{0}^{i}, p_{1}^{i} and p_{2}^{i} are the consequent parameters of the output function f_{i}.

4.1.5. Layer 5: Output Layer

In this layer, the overall output is generated through the summation of all incoming signals.

O_{i}^{5} = \sum_{i} \bar{ω}_{i} f_{i} = \frac{\sum_{i} ω_{i} f_{i}}{\sum_{i} ω_{i}}, i = 1, 2.    (6)
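The five layers can be traced numerically with a short sketch. The Python/NumPy code below is only an illustration of Equations (1) to (6) for the two-input, two-rule case; it is not the authors' MATLAB implementation. The function names and the dictionary layout are hypothetical, and the sketch assumes separate centers and widths for the membership functions of x and y, which Equation (2) does not distinguish in its notation.

```python
import numpy as np

def gaussian_mf(value, center, sigma):
    """Gaussian membership function, Equation (2)."""
    return np.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))

def nfis_forward(x, y, antecedent, consequent):
    """Forward pass of a two-input, two-rule first-order Takagi-Sugeno NFIS.

    antecedent: dict of arrays 'cA', 'sA', 'cB', 'sB', one entry per rule
                (assumed layout: centers and widths for A_i and B_i).
    consequent: array of shape (2, 3) holding [p0, p1, p2] for each rule.
    """
    # Layer 1: fuzzification, Equations (1) and (2)
    mu_a = gaussian_mf(x, antecedent["cA"], antecedent["sA"])
    mu_b = gaussian_mf(y, antecedent["cB"], antecedent["sB"])
    # Layer 2: firing strength of each rule, Equation (3)
    w = mu_a * mu_b
    # Layer 3: normalized firing strength, Equation (4)
    w_bar = w / w.sum()
    # Layer 4: weighted rule consequents, Equation (5)
    f = consequent[:, 0] + consequent[:, 1] * x + consequent[:, 2] * y
    rule_out = w_bar * f
    # Layer 5: overall output, Equation (6)
    return float(rule_out.sum())

antecedent = {
    "cA": np.array([0.0, 1.0]), "sA": np.array([1.0, 1.0]),
    "cB": np.array([0.0, 1.0]), "sB": np.array([1.0, 1.0]),
}
consequent = np.array([[1.0, 0.0, 0.0],   # rule 1: f_1 = 1
                       [3.0, 0.0, 0.0]])  # rule 2: f_2 = 3
output = nfis_forward(0.5, 0.5, antecedent, consequent)
```

With the input midway between the two rule centers, both rules fire equally, so the output is the plain average of the two constant consequents.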

4.2. Particle Swarm Optimization (PSO)

PSO is initialized with a set of random particles (solutions) that have random positions and velocities. The algorithm then seeks an optimal solution through an iterative process, in which each particle’s position is adjusted based on its own experience and that of the other particles. The best position found so far by each particle is called the personal best p_{best}, and the best position found so far by any particle in the swarm is called the global best g_{best}.

During the learning process, each particle is accelerated toward its own p_{best} and the swarm’s g_{best} positions. This is achieved by calculating a new velocity term for every particle based on its distance from the p_{best} and g_{best} positions. The p_{best} and g_{best} contributions are randomly weighted to produce the new velocity of the particle, which then determines the position of the particle in the next iteration.

Compared with other optimization algorithms, the main advantage of PSO is its simplicity. The velocity update equation (see Equation (7)) and movement equation (see Equation (8)) are the only equations required in PSO. Equation (7) adjusts the velocity vector with respect to p_{best} and g_{best}, while Equation (8) determines the actual motion of particles through their specific velocity vectors. Furthermore, Shi and Eberhart [72] introduced the inertia weight w into PSO in order to improve the convergence speed. The inertia weight determines the contribution rate of the particle’s previous velocity to its velocity at the current step.

\vec{v}_{new} = w \cdot \vec{v} + r_{1} C_{1} \cdot (\vec{p}_{best} - \vec{p}) + r_{2} C_{2} \cdot (\vec{g}_{best} - \vec{p}),    (7)

\vec{p}_{new} = \vec{p} + \vec{v}_{new},    (8)

where \vec{v}_{new}, \vec{v}, \vec{p}_{new} and \vec{p} are the new velocity, current velocity, new position and current position, respectively. C_{1} and C_{2} are the acceleration constants, \vec{p}_{best} is the personal best position and \vec{g}_{best} is the global best position among all particles. r_{1} and r_{2} are random values in the range (0,1) sampled from a uniform distribution, and w is the inertia weight.
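Equations (7) and (8) translate almost directly into code. The following Python/NumPy sketch minimizes a generic objective function; it is an illustration under stated assumptions rather than the configuration used in this study. The function name `pso_minimize`, the default values of w, C_1 and C_2, the search bounds, the initial velocity scale and the choice to draw r_1 and r_2 separately for every particle and dimension are all assumptions.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=30, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize objective(position) with a basic inertia-weight PSO."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    # Random initial positions and (small) random initial velocities
    pos = rng.uniform(low, high, size=(n_particles, dim))
    vel = 0.1 * rng.uniform(-(high - low), high - low, size=(n_particles, dim))
    # Personal bests start at the initial positions
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g_idx = int(pbest_val.argmin())
    gbest, gbest_val = pbest[g_idx].copy(), float(pbest_val[g_idx])
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Equation (7): velocity update
        vel = w * vel + r1 * c1 * (pbest - pos) + r2 * c2 * (gbest - pos)
        # Equation (8): movement
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        # Update personal bests where the new position is better
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        # Update the global best if any personal best beats it
        g_idx = int(pbest_val.argmin())
        if pbest_val[g_idx] < gbest_val:
            gbest, gbest_val = pbest[g_idx].copy(), float(pbest_val[g_idx])
    return gbest, gbest_val

# Usage on a simple test function (sum of squares, minimum at the origin)
best_pos, best_val = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=2)
```

When the quantity of interest is to be maximized, as with a classification accuracy, the same routine can be applied to its negative.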

5. Methodology

5.1. Overall Methodology: The Proposed Hybrid SONFIS Method

A new hybrid artificial intelligence method for tectonic discrimination of basalts, called SONFIS, was put forward based on NFIS and PSO methods. PSO was introduced to improve the flexibility and generalization ability of NFIS by seeking the optimal parameters corresponding to different datasets. The architecture of the proposed hybrid SONFIS method is shown in Figure 3. The procedure of SONFIS is outlined below.

Step 1: Dataset preparation. The training and test sets were determined by percentage segmentation. The independent and dependent variables were also specified.

Step 2: Model initialization. An initial SONFIS model was generated after all antecedent and consequent parameters were confirmed by trial and error.

Step 3: Parameter optimization. PSO was employed to optimize the model by searching for the optimal values of the antecedent and consequent parameters. Cross-validation played an important role therein.

Step 4: Fitness evaluation. The overall classification accuracy was taken as the fitness function, which was the bridge between NFIS and PSO. Model tuning was based on fitness evaluation results.

Step 5: Model validation. The maximum number of iterations or certain classification precision was considered as the stopping criterion. Once the optimal parameters were found, the SONFIS obtained was validated.
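The role of the fitness function in Step 4 can be sketched as follows. This Python/NumPy example is a hypothetical illustration, not the authors' code: it assumes that a PSO particle is a flat vector holding the Gaussian centers, the Gaussian widths and the first-order consequent parameters of every rule, and that the continuous model output is rounded to the nearest class code (0, 1 or 2) before the accuracy is computed. Neither the encoding nor the rounding rule is stated in the text.

```python
import numpy as np

def decode(particle, n_rules, n_inputs):
    """Split a flat particle into centers, widths and consequent parameters."""
    k = n_rules * n_inputs
    centers = particle[:k].reshape(n_rules, n_inputs)
    # Widths are kept strictly positive (assumed safeguard)
    sigmas = np.abs(particle[k:2 * k]).reshape(n_rules, n_inputs) + 1e-6
    conseq = particle[2 * k:].reshape(n_rules, n_inputs + 1)
    return centers, sigmas, conseq

def predict(particle, X, n_rules):
    """Class codes predicted by the fuzzy model encoded in one particle."""
    centers, sigmas, conseq = decode(particle, n_rules, X.shape[1])
    # Firing strength: product of Gaussian memberships over all inputs
    diff = X[:, None, :] - centers[None, :, :]
    w = np.exp(-(diff ** 2 / (2.0 * sigmas ** 2)).sum(axis=2))
    w_bar = w / (w.sum(axis=1, keepdims=True) + 1e-12)
    # First-order consequents f_r = p0 + p1*x1 + ... for every rule
    f = conseq[:, 0][None, :] + X @ conseq[:, 1:].T
    out = (w_bar * f).sum(axis=1)
    # Assumed decision rule: round to the nearest class code in {0, 1, 2}
    return np.clip(np.rint(out), 0, 2).astype(int)

def fitness(particle, X, y, n_rules):
    """Overall classification accuracy, used as the PSO fitness value."""
    return float(np.mean(predict(particle, X, n_rules) == y))

# Toy example: one input, three rules centered on three separated samples
X_toy = np.array([[0.0], [0.5], [1.0]])
particle = np.array([0.0, 0.5, 1.0,      # centers
                     0.05, 0.05, 0.05,   # widths
                     0.0, 0.0,           # rule 1: f = 0
                     1.0, 0.0,           # rule 2: f = 1
                     2.0, 0.0])          # rule 3: f = 2
score = fitness(particle, X_toy, np.array([0, 1, 2]), n_rules=3)
```

A swarm optimizer would then search over such particles for the one with the highest fitness.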

5.2. Methodology Implementation Procedure

5.2.1. Data Acquisition and Preprocessing

The two geochemical datasets of basalts from GEOROC and PetDB were used to test the classification performance of the proposed SONFIS model. One was a high-dimensional dataset, and the other was low-dimensional. The usability of SONFIS could be validated by using both types of datasets.

Dataset 1 with High-Dimensional Features

An extensive dataset of 938 basalt samples was obtained after data cleaning [1]. The dataset included 296 MORB samples from the East Pacific Rise, Mid Atlantic Ridge, Indian Ocean and Juan de Fuca Ridge. Three hundred and nineteen OIB samples were from St. Helena, the Canary, Cape Verde, Caroline, Crozet, Hawaii-Emperor, Juan Fernandez, Marquesas, Mascarene, Samoan and Society islands. The remaining 323 IAB samples came from the Aeolian, Izu-Bonin, Kermadec, Kurile, Lesser Antilles, Mariana, Scotia and Tonga arcs. The global distribution map of basalt samples is shown in Figure 4. Each basalt sample was assigned with 11 major elements (SiO₂, TiO₂, Al₂O₃, Fe₂O₃, FeO^T, CaO, MgO, MnO, K₂O, Na₂O and P₂O₅), 35 minor and trace elements (La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Sc, V, Cr, Co, Ni, Cu, Zn, Ga, Rb, Sr, Y, Zr, Nb, Sn, Cs, Ba, Hf, Ta, Pb, Th and U) and five isotopic ratios (¹⁴³Nd/¹⁴⁴Nd, ⁸⁷Sr/⁸⁶Sr, ²⁰⁶Pb/²⁰⁴Pb, ²⁰⁷Pb/²⁰⁴Pb and ²⁰⁸Pb/²⁰⁴Pb). Accordingly, every sample was represented by a 51-dimensional vector. The basic statistics for each dimension are shown in Table 2.

Dataset 2 with Low-Dimensional Features

The data volume of the low-dimensional dataset was larger than that of dataset 1, containing 1582 basalt samples [10]. Five hundred and thirty-nine MORB samples came from the Atlantic and Pacific mid-ocean ridges, 463 OIB samples were mainly distributed in the Atlantic and Pacific regions and 580 IAB samples were from the west and east coast of the Pacific Rim. Unlike dataset 1, each basalt sample in dataset 2 contained only 12 elements, which were K₂O, CaO, SiO₂, MgO, NiO, Na₂O, FeO^T, TiO₂, Al₂O₃, MnO, Cr₂O₃ and P₂O₅. The basic statistics for each element corresponding to three different tectonic settings are presented in Table 3.

It should be noted that both datasets contained missing data [1,10]. The missing data were left untreated in this study, once again illustrating the superiority of the proposed SONFIS method. Moreover, each geochemical feature in both datasets was treated as an input attribute, while the output classes were unified into three basalt tectonic settings, that is, MORB, OIB and IAB. Since the fuzzy membership values of the SONFIS model were between 0 and 1, the values of the input features were adjusted to fit this range. The three output classes, MORB, OIB and IAB, were coded as 0, 1 and 2, respectively. Percentage segmentation is a conventional method used for model validation in ML modeling. Both datasets were randomly split into two subsets, one as a training set (80%) and the other as a test set (20%) [10]. The segmentation results of the two datasets are shown in Table 4 and Table 5, respectively.
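The preprocessing described above can be sketched in a few lines of Python/NumPy. The text does not specify how the features were adjusted to the range from 0 to 1, so column-wise min-max scaling is an assumption here, as are the function names and the use of NaN to mark missing values. The missing entries are left in place, in line with the statement that no treatment was applied to them.

```python
import numpy as np

# Class codes as given in the text
LABELS = {"MORB": 0, "OIB": 1, "IAB": 2}

def minmax_scale_ignore_nan(X):
    """Scale each column to [0, 1], leaving missing values (NaN) untouched."""
    col_min = np.nanmin(X, axis=0)
    col_max = np.nanmax(X, axis=0)
    # Avoid division by zero for constant columns
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (X - col_min) / span

def split_80_20(n_samples, train_fraction=0.8, seed=0):
    """Random percentage segmentation into training and test indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

# Small made-up example with one missing value
X_demo = np.array([[1.0, np.nan],
                   [3.0, 10.0],
                   [5.0, 20.0]])
X_scaled = minmax_scale_ignore_nan(X_demo)
train_idx, test_idx = split_80_20(938)
```

The numbers in `X_demo` are invented for illustration only and are not taken from either dataset.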

5.2.2. Model Parameter Configuration

For parameter learning and tuning, the K-fold cross-validation technique was adopted for model optimization in order to verify the classification performance of the training model and adjust the parameters, as shown in Figure 5 [1,8,10]. As an empirical rule, K = 10 is generally preferred. The raw dataset was first divided into 10 subsets. The hold-out method was then repeated 10 times, such that each time, one of the 10 subsets was used as the test set and the remaining nine subsets were grouped together to form a training set. The error estimation was finally averaged over all 10 trials to get the total effectiveness of the model.
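The K-fold procedure can be written compactly. The Python/NumPy sketch below is illustrative only; the function names are hypothetical, and `evaluate` stands for any user-supplied routine that trains a model on the training indices and returns its score on the held-out indices.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        # Fold i is held out; the remaining k - 1 folds form the training set
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

def cross_validate(evaluate, n_samples, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k, seed)]
    return float(np.mean(scores))

# Usage with a placeholder evaluation that only reports the held-out share
pairs = list(k_fold_indices(95, k=10))
mean_share = cross_validate(lambda tr, te: len(te) / 95.0, 95, k=10)
```

Every sample appears in exactly one held-out fold, so each sample is used once for testing and nine times for training.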

Fuzzy c-means clustering [60] was employed to partition all training samples into c = 10 clusters, a number that was determined by trial and error. The fuzzy if-then rules were generated from the clusters, and the optimal parameters of the rules were acquired through the PSO algorithm, as shown in Figure 3. During the optimization process, different combinations of antecedent and consequent parameters were generated, and each combination was tested with 10-fold cross-validation, resulting in a total of 100 iterations. The variant of classification accuracy was used for fitness evaluation. When the optimization was completed, that is, once the stopping criterion was satisfied, the optimal position g_{best} of the swarm was obtained, providing the highest classification accuracy. The optimized values of the antecedent and consequent parameters of SONFIS were then determined, and the corresponding model was finally validated using the test set prior to being used for discriminating the tectonic settings of unknown samples.
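For readers unfamiliar with fuzzy c-means, a minimal Python/NumPy sketch of the standard algorithm is given below. It is not the implementation used in this study: the fuzzifier m = 2, the stopping tolerance and the function name are assumptions, and the sketch requires complete data (no missing values). How the clusters are turned into rules is not detailed in the text; one common choice, assumed here only as a possibility, is to use the cluster centers as initial centers of the Gaussian membership functions.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=10, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns cluster centers and membership matrix."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means of the samples
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Distances from every sample to every center
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Membership update: inverse distances raised to 2 / (m - 1), row-normalized
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Two well-separated groups of made-up points
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers, U = fuzzy_c_means(X_demo, n_clusters=2)
hard_labels = U.argmax(axis=1)
```

Unlike hard clustering, every sample receives a degree of membership in every cluster, which is what makes the result a natural starting point for fuzzy rules.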

5.2.3. Model Performance Evaluation

Evaluation metrics are fundamental to evaluating the proposed SONFIS method with existing datasets. Using specific metrics, model performance evaluation results can be reasonably given and accurately expressed. A comprehensive evaluation system, including the classification accuracy and confusion matrix, was used to assess the classification effect of each MLA. The classification accuracy C_i of an individual program i depended on the number of samples correctly classified and was quantitatively evaluated by Equation (9) [53].

C_{i} = \frac{T}{N} \times 100\%,    (9)

where T is the number of sample cases correctly classified, and N is the total number of sample cases.

As a visual tool, the confusion matrix is often used to describe the performance of a classifier on the test set for which the actual values are known, and intuitively exhibits classification precision [10]. It can also be described as a specific type of contingency table with two dimensions and identical sets of classes in each dimension. The horizontal dimension denotes the actual class, and the vertical dimension denotes the predicted class. The number of correct and incorrect predictions were summarized with count values, and consequently broken down by each class. Some derived indexes (e.g., precision and recall) could also be derived from the confusion matrix. In summary, a confusion matrix provided insight not only into identifying the errors being made by a classifier, but more importantly, the types of errors being made.
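Both evaluation tools are simple to compute. The Python/NumPy sketch below builds a confusion matrix with the predicted class along the rows and the actual class along the columns, which is one reading of the layout described above, and then derives the accuracy of Equation (9) together with per-class precision and recall. The function names and the small label vectors are made up for illustration.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes=3):
    """Count matrix with rows = predicted class, columns = actual class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[p, a] += 1
    return cm

def accuracy_percent(cm):
    """Equation (9): correctly classified samples over all samples, in percent."""
    return 100.0 * np.trace(cm) / cm.sum()

def precision_recall(cm):
    """Per-class precision (over predictions) and recall (over actual samples)."""
    correct = np.diag(cm).astype(float)
    precision = correct / np.maximum(cm.sum(axis=1), 1)
    recall = correct / np.maximum(cm.sum(axis=0), 1)
    return precision, recall

# Made-up labels: 0 = MORB, 1 = OIB, 2 = IAB
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(actual, predicted)
acc = accuracy_percent(cm)
precision, recall = precision_recall(cm)
```

In this toy case four of the six samples lie on the main diagonal, and the off-diagonal counts show which classes were confused with each other.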

5.2.4. Model Validation Scheme Design

To assess the classification performance of the proposed SONFIS method when processing geochemical data, several experiments were carried out using the two datasets described in Section 5.2.1. It is imperative to verify the effectiveness of the PSO algorithm in optimizing model parameters, and to compare it with two other common optimization techniques, manual adjustment and grid search. Furthermore, three additional common MLAs (SVM, RF and NB) were compared with the SONFIS model. The LRC and MLP models, which are new to tectonic discrimination, were also used for performance comparison. Finally, the proposed SONFIS method and the other five MLAs were all implemented in MATLAB® R2016b.

6. Results and Discussion

6.1. Optimization Effect Verification

The parameters of NFIS were tuned based on the same datasets (in Section 5.2.1) and computing platform (Intel® Core™ i7-8700 CPU @ 3.20 GHz, 16 GB of RAM, and a 64-bit Windows 10 OS), using manual adjustment, grid search and the PSO algorithm (i.e., SONFIS), sequentially. Each method was run ten times, and the average classification accuracy obtained was used as the comparison value. The results show that the classification accuracy of SONFIS was approximately five to ten percent higher than that of the other two optimized models when discriminating among the tectonic settings of basalts. The improvement of classification performance was consistent with the literature [58]. The SONFIS model converged after 100 iterations, consequently taking less time, which suggested the PSO algorithm was superior in both performance and efficiency for parameter optimization.

6.2. MLA Performance Comparison

For dataset 1, the basalt tectonic discrimination results obtained by six MLAs are shown in Table 4 and Figure 6. It can be seen from Table 4 that all MLAs had good classification performance for the three tectonic settings. The data demonstrated that SONFIS was the most capable, followed by RF, SVM and MLP, all of which had an overall classification accuracy of over 90%. However, LRC and NB did not perform well in comparison. SONFIS also demonstrated top performance in terms of the classification accuracy of single tectonic setting. The discrimination effects of MLP on OIB and RF on IAB were also outstanding. In Figure 6, darker colored squares on the main diagonal indicated higher accuracy of the classification results. In the confusion matrix for each MLA, the squares on the main diagonal were the darkest, indicating that the number of correct classifications was large. Again, the SONFIS model performed best in the three classes. Overall, all MLAs used could take advantage of the high-dimensional information in dataset 1 and had a remarkable discrimination ability, which was not achieved in the discrimination diagrams [53]. These results were additionally consistent with previous results obtained by Ueki et al. [8].

For dataset 2, the basalt tectonic discrimination results obtained by the six MLAs are shown in Table 5 and Figure 7. This dataset contained only a few elements and no isotopes. As noted by Petrelli and Perugini [40], a small number of elements can degrade the classification performance of MLAs, which was confirmed by the results here. For example, MLP, SVM and RF did well on dataset 1 but less well on dataset 2, where their overall accuracies fell below 90% (Table 5). However, the SONFIS model still classified best, with the per-setting and overall classification accuracy reaching about 95%, demonstrating that the integration of the PSO algorithm improved the adaptability of SONFIS. The effects of information loss are also visible in Figure 7. Unlike in Figure 6, the darkest squares were not all on the main diagonal in every panel. However, the confusion matrix of the SONFIS model remained consistent with that in Figure 6, which demonstrated the advantage of the combined model.
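The quantities read off Figures 6 and 7 can be made concrete with a short sketch. It assumes that the accuracy for a single setting is the fraction of test samples of that setting that were assigned correctly (diagonal entry divided by row total), which is consistent with the test-set sizes in Tables 4 and 5. The labels below are hypothetical and are not taken from the paper.

```python
from collections import Counter

CLASSES = ["MORB", "OIB", "IAB"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Rows are true settings, columns are predicted settings."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def per_class_accuracy(matrix):
    """Diagonal count divided by the row total, for each true class."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of samples."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical labels for eight test samples (illustration only).
y_true = ["MORB", "MORB", "MORB", "OIB", "OIB", "IAB", "IAB", "IAB"]
y_pred = ["MORB", "MORB", "OIB", "OIB", "OIB", "IAB", "MORB", "IAB"]

cm = confusion_matrix(y_true, y_pred)
```

A matrix whose off-diagonal entries rival the diagonal, as in some panels of Figure 7, corresponds to a low value in `per_class_accuracy` for that row.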

6.3. Contrast with Conventional Discrimination Diagrams

Some scholars [26,53] have used conventional discrimination diagrams to classify the corresponding datasets in Section 5.2.1 into MORB, OIB and IAB, which allows the performance of MLAs and discrimination diagrams to be contrasted. For dataset 1, Han et al. [53] adopted five common diagrams for tectonic discrimination: TiO₂-MnO-P₂O₅, FeO^T-MgO-Al₂O₃, Ti-Zr-Y, Zr/Y-Zr and Ti-Zr (see Figure 1). In the initial state, the Ti-Zr diagram discriminated best, with a classification accuracy of about 75%. After the missing data were processed, the classification accuracy of the Zr/Y-Zr diagram could reach 90%, but it was still inferior to that of some MLAs (e.g., MLP, SVM, RF and SONFIS). For dataset 2, Li et al. [26] established the novel FeO^T/Na₂O-FeO^T/CaO diagram through trial and error; it classified MORB, OIB and IAB well overall, but the discrimination between MORB (or OIB) and IAB was still not satisfactory. In contrast, the proposed SONFIS method performed well for each tectonic setting.

The contrast may be incomplete because it did not cover all basalt discrimination diagrams, but it still showed several advantages of MLAs: (1) MLAs could use all the element information even if the dataset contained missing data; that is, MLAs were less selective about samples. (2) The classification accuracy of MLAs was often higher, and the accuracy was more uniform across individual settings. (3) MLAs could be easily transferred to the tectonic discrimination of other volcanic rocks.

6.4. Discussion: Applicability and Deficiency of MLA-Based Discrimination Method

MLAs can extract valuable information from diverse data, which was taken into account in the selection of basalt samples. The geochemical data used in this paper were global data from the Cenozoic era with a definite tectonic setting; data from before the Cenozoic were excluded. The data were abundant and came from numerous sources, including fresh, altered and metamorphosed basalts. The richness and diversity of the geochemical datasets made for a reliable SONFIS model with high classification accuracy. Nevertheless, one shortcoming was that basalts influenced by crustal contamination were not considered, because only three types of tectonic settings (MORB, OIB and IAB) were studied. More tectonic settings also need to be classified accurately by MLAs in follow-up studies.

Compared with binary or ternary discrimination diagrams, another deficiency of MLAs was that high-dimensional data (more than three dimensions) were difficult to visualize. Although geologists could directly call the trained model for tectonic discrimination and obtain satisfactory classification results, the visualization problem would limit its practical application to some extent. It is therefore important to solve the visualization problem, and several high-dimensional visualization (dimensionality reduction) methods are being tried, such as principal component analysis and t-SNE. In addition, some MLAs are black-box models, which makes it difficult for geochemists to use the internal classification rules directly. How to interpret the classification results obtained by MLAs reasonably also remains to be resolved.
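As an illustration of the dimensionality reduction mentioned above, the following sketch projects multi-element samples onto their first two principal components using only the standard library. It is an assumption-laden example rather than part of the study: the four-feature `samples` are hypothetical, the function names are invented for this sketch, and power iteration with deflation is just one simple way to obtain the leading components.

```python
import math
import random


def center(rows):
    """Subtract the column means so every feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenvector(mat, n_iter=500, seed=0):
    """Power iteration: repeatedly apply `mat` and renormalise."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in mat]
    for _ in range(n_iter):
        w = [sum(m * x for m, x in zip(row, v)) for row in mat]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pca_2d(rows):
    """Project rows onto the first two principal components."""
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    pc1 = leading_eigenvector(cov)
    lam1 = sum(pc1[i] * cov[i][j] * pc1[j] for i in range(d) for j in range(d))
    # Deflation removes the first component before extracting the second.
    deflated = [[cov[i][j] - lam1 * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    pc2 = leading_eigenvector(deflated, seed=1)
    return [(sum(a * b for a, b in zip(r, pc1)),
             sum(a * b for a, b in zip(r, pc2))) for r in x]


# Hypothetical four-feature samples; a real run would use the geochemical table.
samples = [
    [49.1, 1.8, 15.7, 7.6],
    [50.2, 1.1, 16.3, 6.9],
    [47.5, 3.0, 14.2, 9.8],
    [48.8, 2.2, 15.1, 8.3],
    [51.0, 0.9, 16.8, 6.5],
]
coords = pca_2d(samples)
```

The resulting two-column `coords` can be drawn as an ordinary scatter plot coloured by tectonic setting, which gives a view comparable in form to a binary discrimination diagram.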

7. Conclusions and Future Work

A combined technique based on fuzzy logic and neural networks, termed NFIS, was introduced into the tectonic discrimination of basalts in this paper. A metaheuristic optimization algorithm, PSO, was also used so that the model parameters would not restrict the classification performance of NFIS. NFIS and PSO were thus combined to construct a novel intelligent discrimination method called SONFIS. The accuracy and generalizability of SONFIS were compared with those of five well-established MLAs using high-dimensional and low-dimensional datasets. The model evaluation results were presented as classification accuracy calculations and confusion matrix visualizations. The conclusions are summarized as follows:

(1) Section 6.1 showed that, with the help of PSO, the overall classification accuracy of SONFIS was about 5% higher than that of NFIS optimized by manual adjustment and grid search. Compared with grid search-optimized NFIS, SONFIS was more complicated, but it demonstrated better classification performance, indicating that the combined model was worth exploring.
(2) SONFIS had excellent generalization capacity, since PSO could automatically search for optimal parameters for different datasets. SONFIS could also be accurately applied to both high-dimensional and low-dimensional datasets, which was valuable for the study of petrology and geochemistry.
(3) The comparative experiments show that SONFIS was competitive for the two datasets used, with a classification accuracy of over 90% for both. Furthermore, SONFIS could make use of more elements, which helped it avoid unreliable discrimination results.
(4) The other five well-established MLAs were also excellent methods for the tectonic discrimination of basalts, showing that ML was a particularly useful and promising tool in geochemical research. The combination of large databases and ML techniques might yield unexpected results.

Although ML had outstanding potential in petrological and geochemical research, it inevitably had some disadvantages, such as difficulty in interpretation and expression. It is necessary to explore effective methods to solve these problems in future studies, so as to improve the usability of ML in Earth Science.

Author Contributions

Overall framework design, M.L.; Methodology and validation, Q.R.; Data collection and preliminary analysis, S.H.; Draft review and editing, Y.Z.; Investigation and conceptualization, Q.Z.; Data curation and formal analysis, J.S. All the authors discussed the results and commented on the manuscript.

Funding

This research was jointly funded by the National Natural Science Foundation for Excellent Young Scientists of China (Grant No. 51622904), the Tianjin Science Foundation for Distinguished Young Scientists of China (Grant No. 17JCJQJC44000) and the National Natural Science Foundation for Innovative Research Groups of China (Grant No. 51621092).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Vermeesch, P. Tectonic discrimination of basalts with classification trees. Geochim. Cosmochim. Acta 2006, 70, 1839–1848.
2. Ryan, K.M.; Williams, D.M. Testing the reliability of discrimination diagrams for determining the tectonic depositional environment of ancient sedimentary basins. Chem. Geol. 2007, 242, 103–125.
3. Liu, K.; Liu, W.B. Machine learning and identification of the tectonic environment of basalt in the continental plate. Eng. Technol. Manag. 2017.
4. Vermeesch, P. Tectonic discrimination diagrams revisited. Geochem. Geophys. Geosyst. 2006, 7, Q06017.
5. Guo, Q.Q.; Xiao, W.J.; Windley, B.F.; Mao, Q.G.; Han, C.M.; Qu, J.F.; Ao, S.J.; Li, J.L.; Yong, Y. Provenance and tectonic settings of Permian turbidites from the Beishan Mountains, NW China: Implications for the Late Paleozoic accretionary tectonics of the southern Altaids. J. Asian Earth Sci. 2012, 49, 54–68.
6. Chen, C.; Ren, Y.S.; Zhao, H.L.; Yang, Q.; Shang, Q.Q. Age, tectonic setting, and metallogenic implication of Phanerozoic granitic magmatism at the eastern margin of the Xing’an-Mongolian Orogenic Belt, NE China. J. Asian Earth Sci. 2017, 144, 368–383.
7. Di, P.F.; Wang, J.R.; Zhang, Q.; Yang, J.; Chen, W.F.; Pan, Z.J.; Du, X.L.; Jiao, S.T. The evaluation of basalt tectonic discrimination diagrams: Constraints on the research of global basalt data. Bull. Miner. Petrol. Geochem. 2017, 36, 891–896.
8. Ueki, K.; Hino, H.; Kuwatani, T. Geochemical discrimination and characteristics of magmatic tectonic settings: A machine-learning-based approach. Geochem. Geophys. Geosyst. 2018, 19, 1327–1347.
9. Shi, Y.; Huang, Q.W.; Liu, X.J.; Krapež, B.; Yu, J.H.; Bai, Z.A. Provenance and tectonic setting of the supra-crustal succession of the Qinling Complex: Implications for the tectonic affinity of the North Qinling Belt, Central China. J. Asian Earth Sci. 2018, 158, 112–139.
10. Ren, Q.B.; Li, M.C.; Han, S. Tectonic discrimination of olivine in basalt using data mining techniques based on major elements: A comparative study from multiple perspectives. Big Earth Data 2019, 1–18.
11. Mao, X.; Li, L.; Liu, Z.; Zeng, R.; Dick, J.M.; Yue, B.; Ai, Q. Multiple magma conduits model of the Jinchuan Ni-Cu-(PGE) deposit, northwestern China: Constraints from the geochemistry of platinum-group elements. Minerals 2019, 9, 187.
12. Wang, J.R.; Chen, W.F.; Zhang, Q.; Jin, W.J.; Jiao, S.T.; Wang, Y.X.; Yang, J.; Pan, Z.J. MORB data mining: Reflection of basalt discrimination diagram. Geotecton. Met. 2017, 41, 420–431.
13. Wang, J.R.; Chen, W.F.; Zhang, Q.; Jiao, S.T.; Yang, J.; Pan, Z.J.; Wang, S. Preliminary research on data mining of N-MORB and E-MORB: Discussion on method of the basalt discrimination diagrams and the character of MORB’s mantle source. Acta Petrol. Sin. 2017, 33, 993–1005.
14. Green, D.H. The origin of basaltic and nephelinitic magmas in the earth’s mantle. Tectonophysics 1969, 7, 409–422.
15. Wood, D.A. The application of a Th-Hf-Ta diagram to problems of tectonomagmatic classification and to establishing the nature of crustal contamination of basaltic lavas of the British Tertiary Volcanic Province. Earth Planet. Sci. Lett. 1980, 50, 11–30.
16. Zhang, Y.; Yu, K.; Qian, H. LA-ICP-MS analysis of clinopyroxenes in basaltic pyroclastic rocks from the Xisha Islands, northwestern South China Sea. Minerals 2018, 8, 575.
17. Shu, S.; Yang, X.; Liu, L.; Liu, W.; Cao, J.; Gao, E. Dual geochemical characteristics for the basic intrusions in the Yangtze Block, South China: New evidence for the breakup of Rodinia. Minerals 2018, 8, 228.
18. Di, P.F.; Chen, W.F.; Zhang, Q.; Wang, J.R.; Tang, Q.Y.; Jiao, S.T. Comparison of global N-MORB and E-MORB classification schemes. Acta Petrol. Sin. 2018, 34, 264–274.
19. Yoder, H.S., Jr.; Tilley, C.E. Origin of basalt magmas: An experimental study of natural and synthetic rock systems. J. Petrol. 1962, 3, 342–532.
20. Hofmann, A.W.; White, W.M. Mantle plumes from ancient oceanic crust. Earth Planet. Sci. Lett. 1982, 57, 421–436.
21. Pearce, J.A.; Lippard, S.J.; Roberts, S. Characteristics and tectonic significance of supra-subduction zone ophiolites. Geol. Soc. Lond. Spec. Publ. 1984, 16, 77–94.
22. Zindler, A.; Hart, S. Chemical geodynamics. Ann. Rev. Earth Planet. Sci. 1986, 14, 493–571.
23. Sun, S.S.; McDonough, W.F. Chemical and isotopic systematics of oceanic basalts: Implications for mantle composition and processes. Geol. Soc. Lond. Spec. Publ. 1989, 42, 313–345.
24. Safonova, I.; Maruyama, S.; Kojima, S.; Komiya, T.; Krivonogov, S.; Koshida, K. Recognizing OIB and MORB in accretionary complexes: A new approach based on ocean plate stratigraphy, petrology and geochemistry. Gondwana Res. 2016, 33, 92–114.
25. Bi, J.H.; Ge, W.C.; Yang, H.; Wang, Z.H.; Tian, D.X.; Liu, X.W.; Xu, W.L.; Xing, D.H. Geochemistry of MORB and OIB in the Yuejinshan Complex, NE China: Implications for petrogenesis and tectonic setting. J. Asian Earth Sci. 2017, 145, 475–493.
26. Li, Y.Q.; Du, X.L.; Jin, W.J.; Du, J.; Zhang, Q.; Wang, J.R.; Ma, Z. A comparative study of olivine in mid-ocean ridge basalt (MORB), ocean island basalt (OIB) and island arc basalt (IAB). Chin. J. Geol. 2018, 53, 1228–1239.
27. Pearce, J.A.; Cann, J.R. Ophiolite origin investigated by discriminant analysis using Ti, Zr and Y. Earth Planet. Sci. Lett. 1971, 12, 339–349.
28. Pearce, J.A.; Cann, J.R. Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth Planet. Sci. Lett. 1973, 19, 290–300.
29. Pearce, J.A. Statistical analysis of major element patterns in basalts. J. Petrol. 1976, 17, 15–43.
30. Hirose, K.; Kawamoto, T. Hydrous partial melting of lherzolite at 1 GPa: The effect of H₂O on the genesis of basaltic magmas. Earth Planet. Sci. Lett. 1995, 133, 463–473.
31. Farnetani, C.G.; Richards, M.A.; Ghiorso, M.S. Petrological models of magma evolution and deep crustal structure beneath hotspots and flood basalt provinces. Earth Planet. Sci. Lett. 1996, 143, 81–94.
32. Arndt, N.T.; Kerr, A.C.; Tarney, J. Dynamic melting in plume heads: The formation of Gorgona komatiites and basalts. Earth Planet. Sci. Lett. 1997, 146, 289–301.
33. Pearce, J.A.; Norry, M.J. Petrogenetic implications of Ti, Zr, Y, and Nb variations in volcanic rocks. Contrib. Mineral. Petrol. 1979, 69, 33–47.
34. Wood, D.A.; Joron, J.L.; Treuil, M. A re-appraisal of the use of trace elements to classify and discriminate between magma series erupted in different tectonic settings. Earth Planet. Sci. Lett. 1979, 45, 326–336.
35. Shervais, J.W. Ti-V plots and the petrogenesis of modern and ophiolitic lavas. Earth Planet. Sci. Lett. 1982, 59, 101–118.
36. Mullen, E.D. MnO/TiO₂/P₂O₅: A minor element discriminant for basaltic rocks of oceanic environments and its implications for petrogenesis. Earth Planet. Sci. Lett. 1983, 62, 53–62.
37. Pearce, J.A.; Peate, D.W. Tectonic implications of the composition of volcanic arc magmas. Ann. Rev. Earth Planet. Sci. 1995, 23, 251–285.
38. Workman, R.K.; Hart, S.R. Major and trace element composition of the depleted MORB mantle (DMM). Earth Planet. Sci. Lett. 2005, 231, 53–72.
39. Galoyan, G.; Rolland, Y.; Sosson, M.; Corsini, M.; Melkonyan, R. Evidence for superposed MORB, oceanic plateau and volcanic arc series in the Lesser Caucasus (Stepanavan, Armenia). Comptes Rendus Geosci. 2007, 339, 482–492.
40. Petrelli, M.; Perugini, D. Solving petrological problems through machine learning: The study case of tectonic discrimination using geochemical and isotopic data. Contrib. Min. Petrol. 2016, 171, 81.
41. Wang, J.R.; Pan, Z.J.; Zhang, Q.; Chen, W.F.; Yang, J.; Jiao, S.T.; Wang, S.H. Intra-continental basalt data mining: The diversity of their constituents and the performance in basalt discrimination diagrams. Acta Petrol. Sin. 2016, 32, 1919–1933.
42. Liu, X.L.; Zhang, Q.; Li, W.C.; Yang, F.C.; Zhao, Y.; Li, Z.; Jiao, S.T.; Wang, J.R.; Zhang, N.; Wang, S.S.; et al. Applicability of large-ion lithophile and high field strength element basalt discrimination diagrams. Int. J. Dig. Earth 2018, 11, 752–760.
43. Li, C.S.; Arndt, N.T.; Tang, Q.Y.; Ripley, E.M. Trace element indiscrimination diagrams. Lithos 2015, 232, 76–83.
44. Zhang, L. MATPLOT: A MATLAB standalone application for geochemical data analysis and plotting. Acta Petrol. Sin. 2018, 34, 495–502.
45. GEOROC. Available online: http://georoc.mpch-mainz.gwdg.de/georoc/ (accessed on 20 June 2019).
46. PetDB. Available online: http://www.earthchem.org/petdb (accessed on 20 June 2019).
47. Zhang, Q.; Zhou, Y.Z. Reflections on the scientific research method in the era of big data. Bull. Mineral. Petrol. Geochem. 2017, 36, 881–885.
48. Zhang, Q.; Zhou, Y.Z. Big data helps geology develop rapidly. Acta Petrol. Sin. 2018, 34, 3167–3172.
49. Luo, J.M.; Zhang, Q. Big data opens up new way for geology study: Mining of all data enhances the researchful precision. Chin. J. Geol. 2018, 53, 1207–1214.
50. Zhang, Q.; Jiao, S.T.; Lu, X.X. Discussion on causality and correlation in geological research. Acta Petrol. Sin. 2018, 34, 275–280.
51. Zhou, Y.Z.; Chen, S.; Zhang, Q.; Xiao, F.; Wang, S.G.; Liu, Y.P.; Jiao, S.T. Advances and prospects of big data and mathematical geoscience. Acta Petrol. Sin. 2018, 34, 255–263.
52. Han, S.; Li, M.C.; Ren, Q.B. Discriminating among tectonic settings of spinel based on multiple machine learning algorithms. Big Earth Data 2019.
53. Han, S.; Li, M.C.; Ren, Q.B.; Liu, C.Z. Intelligent determination and data mining for tectonic settings of basalts based on big data methods. Acta Petrol. Sin. 2018, 34, 3207–3216.
54. Jiao, S.T.; Zhou, Y.Z.; Zhang, Q.; Jin, W.J.; Liu, Y.P.; Wang, J. Study on intelligent discrimination of tectonic settings based on global gabbro data from GEOROC. Acta Petrol. Sin. 2018, 34, 3189–3194.
55. Trépanier, S.; Mathieu, L.; Daigneault, R.; Faure, S. Precursors predicted by artificial neural networks for mass balance calculations: Quantifying hydrothermal alteration in volcanic rocks. Comput. Geosci. 2016, 89, 32–43.
56. Ren, Q.B.; Wang, G.; Li, M.C.; Han, S. Prediction of rock compressive strength using machine learning algorithms based on spectrum analysis of geological hammer. Geotech. Geol. Eng. 2019, 37, 475–489.
57. Jang, J.S. ANFIS: Adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern. 1993, 23, 665–685.
58. Chen, M.Y. A hybrid ANFIS model for business failure prediction utilizing particle swarm optimization and subtractive clustering. Inf. Sci. 2013, 220, 180–195.
59. Basser, H.; Karami, H.; Shamshirband, S.; Akib, S.; Amirmojahedi, M.; Ahmad, R.; Jahangirzadeh, A.; Javidnia, H. Hybrid ANFIS-PSO approach for predicting optimum parameters of a protective spur dike. Appl. Soft Comput. 2015, 30, 642–649.
60. Bui, K.T.T.; Bui, D.T.; Zou, J.G.; Van Doan, C.; Revhaug, I. A novel hybrid artificial intelligent approach based on neural fuzzy inference model and particle swarm optimization for horizontal displacement modeling of hydropower dam. Neural Comput. Appl. 2018, 29, 1495–1506.
61. Shahnazar, A.; Rad, H.N.; Hasanipanah, M.; Tahir, M.M.; Armaghani, D.J.; Ghoroqi, M. A new developed approach for the prediction of ground vibration using a hybrid PSO-optimized ANFIS-based model. Environ. Earth Sci. 2017, 76, 527.
62. Boussaïd, I.; Lepagnot, J.; Siarry, P. A survey on optimization metaheuristics. Inf. Sci. 2013, 237, 82–117.
63. Tang, K.S.; Man, K.F.; Kwong, S.; He, Q. Genetic algorithms and their applications. IEEE Sig. Proc. Mag. 1996, 13, 22–37.
64. Marco, D.; Mauro, B.; Thomas, S. Ant colony optimization. IEEE Comput. Intell. Mag. 2007, 1, 28–39.
65. Storn, R.; Price, K. Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359.
66. Kennedy, J.; Eberhart, R.C. Particle swarm optimization. In Proceedings of the International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; pp. 1942–1948.
67. Song, S.; Kong, L.; Gan, Y.; Su, R. Hybrid particle swarm cooperative optimization algorithm and its application to MBC in alumina production. Prog. Nat. Sci. 2008, 18, 1423–1428.
68. Momeni, E.; Armaghani, D.J.; Hajihassani, M.; Amin, M.F.M. Prediction of uniaxial compressive strength of rock samples using hybrid particle swarm optimization-based artificial neural networks. Measurement 2015, 60, 50–63.
69. Chatterjee, S.; Sarkar, S.; Hore, S.; Dey, N.; Ashour, A.S.; Balas, V.E. Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Comput. Appl. 2017, 28, 2005–2016.
70. Wang, D.; Tan, D.; Liu, L. Particle swarm optimization algorithm: An overview. Soft Comput. 2018, 22, 387–408.
71. Rini, D.P.; Shamsuddin, S.M.; Yuhaniz, S.S. Particle swarm optimization for ANFIS interpretability and accuracy. Soft Comput. 2016, 20, 251–262.
72. Shi, Y.; Eberhart, R. A modified particle swarm optimizer. In Proceedings of the IEEE World Congress on Computational Intelligence, Anchorage, Alaska, 4–9 May 1998; pp. 69–73.

Figure 1. Classical basalt tectonic discrimination diagrams: (a) diagram of Ti-Zr-Y, (b) diagram of Ti-Zr, (c) diagram of Zr/Y-Zr, (d) diagram of FeO^T-MgO-Al₂O₃ and (e) diagram of TiO₂-MnO-P₂O₅. Abbreviations: WPB—within plate basalt; CAB—calc-alkali basalt; CB—continental basalt; SCIB—spread center island basalt; OB—orogenic basalt; OIAB—oceanic island alkali basalt.

Figure 2. General architecture of the neural fuzzy inference system (NFIS) model.

Figure 3. Architecture description of the proposed hybrid swarm optimized neural fuzzy inference system (SONFIS) method for tectonic discrimination of basalts.

Figure 4. Global distribution map of basalt sample collection sites.

Figure 5. K-fold cross-validation technique.

Figure 6. Confusion matrix visualization of tectonic discrimination of basalts in dataset 1 based on different MLAs: (a) logistic regression classifier (LRC); (b) naïve Bayes (NB); (c) multilayer perceptron (MLP); (d) support vector machine (SVM); (e) random forest (RF) and (f) SONFIS.

Figure 7. Confusion matrix visualization of tectonic discrimination of basalts in dataset 2 based on different MLAs: (a) logistic regression classifier (LRC); (b) naïve Bayes (NB); (c) multilayer perceptron (MLP); (d) support vector machine (SVM); (e) random forest (RF) and (f) SONFIS.

Table 1. A summary of ML models and application scenarios in a few cases.

Authors	MLAs	Rock Types	Tectonic Settings
Vermeesch [1]	CT	Basalt	MOR, OI and IA
Petrelli and Perugini [40]	SVM	Volcanic rocks	CA, IA, IOA, BAB, CF, MOR, OP and OI
Liu and Liu [3]	SVM and DT	Basalt	CP and PF
Ueki et al. [8]	SVM, RF and SMR	Volcanic rocks	CA, IA, IOA, BAB, CF, MOR, OP and OI
Han et al. [53]	NB, KNN, SVM and RF	Basalt	MOR, OI and IA
Jiao et al. [54]	SVM, KNN and RF	Gabbro	CF, CM, IV and OI

Abbreviations: MOR—mid-ocean ridge; OI—ocean island; IA—island arc; CA—continental arc; IOA—intra-oceanic arc; BAB—back-arc basin; CF—continental flood; OP—oceanic plateau; CP—continental plateau; PF—plateau flood; CM—convergent margin; IV—intraplate volcanic.

Table 2. Basic statistics for dataset 1 with high-dimensional features.

No.	Elements	Min.	Max.	Mean	No.	Elements	Min.	Max.	Mean
1	SiO₂	36.55	54.30	49.15	27	V	27.00	622.00	284.65
2	TiO₂	0.18	4.91	1.81	28	Cr	0.00	3700.00	246.80
3	Al₂O₃	8.52	26.16	15.71	29	Co	10.00	460.00	45.83
4	Fe₂O₃	0.00	19.33	4.25	30	Ni	0.00	900.00	120.27
5	FeO^T	1.24	15.11	7.61	31	Cu	5.00	6001.00	112.14
6	CaO	0.35	14.81	10.54	32	Zn	28.70	441.00	95.77
7	MgO	1.57	22.60	7.39	33	Ga	9.00	48.00	18.96
8	MnO	0.01	19.00	0.20	34	Rb	0.00	116.73	15.14
9	K₂O	0.02	9.68	0.70	35	Sr	6.68	1590.00	385.88
10	Na₂O	0.74	5.95	2.70	36	Y	7.00	296.00	30.86
11	P₂O₅	0.01	2.35	0.29	37	Zr	0.00	988.30	145.80
12	La	0.00	317.00	17.09	38	Nb	0.00	130.00	19.17
13	Ce	0.00	420.00	36.85	39	Sn	0.45	12.00	2.00
14	Pr	0.36	26.20	5.86	40	Cs	0.00	5.00	0.32
15	Nd	0.00	780.00	21.36	41	Ba	1.30	1088.00	219.33
16	Sm	0.52	923.00	6.82	42	Hf	0.11	21.90	3.69
17	Eu	0.20	288.00	2.23	43	Ta	0.01	6.80	1.23
18	Gd	1.03	43.60	5.52	44	Pb	0.00	47.00	3.89
19	Tb	0.10	28.30	0.97	45	Th	0.00	27.00	2.47
20	Dy	0.00	594.00	7.93	46	U	0.00	6.20	0.74
21	Ho	0.29	6.10	1.01	47	¹⁴³Nd/¹⁴⁴Nd	0.50	0.52	0.51
22	Er	0.80	259.00	3.58	48	⁸⁷Sr/⁸⁶Sr	0.70	0.71	0.70
23	Tm	0.11	2.80	0.48	49	²⁰⁶Pb/²⁰⁴Pb	17.08	38.46	18.93
24	Yb	0.26	193.00	3.38	50	²⁰⁷Pb/²⁰⁴Pb	15.39	15.83	15.56
25	Lu	0.04	243.00	0.94	51	²⁰⁸Pb/²⁰⁴Pb	18.83	40.17	38.29
26	Sc	0.00	88.00	34.95

Major element concentrations, including their minimum, maximum and mean values, are given in wt%; minor and trace element concentrations are given in ppm.

Table 3. Basic statistics for dataset 2 with low-dimensional features.

Elements	MORB Min.	MORB Max.	MORB Mean	OIB Min.	OIB Max.	OIB Mean	IAB Min.	IAB Max.	IAB Mean
K₂O	0.01	0.05	0.02	0.00	0.38	0.03	0.00	0.09	0.01
CaO	0.10	0.54	0.31	0.02	0.58	0.31	0.01	0.66	0.21
SiO₂	29.93	41.85	40.12	34.71	42.43	39.71	27.32	42.93	39.01
MgO	36.43	51.30	46.64	23.64	51.71	45.47	25.24	52.66	42.65
NiO	0.01	0.44	0.20	0.01	0.43	0.22	0.00	0.55	0.16
Na₂O	0.01	0.10	0.02	0.00	0.53	0.14	0.00	0.21	0.02
FeO^T	7.72	23.90	12.59	7.58	42.33	14.28	7.11	37.35	17.18
TiO₂	0.01	0.11	0.03	0.00	0.19	0.03	0.00	0.21	0.03
Al₂O₃	0.01	0.55	0.06	0.00	0.95	0.09	0.00	0.94	0.06
MnO	0.01	0.53	0.20	0.03	0.73	0.21	0.08	0.76	0.27
Cr₂O₃	0.01	0.21	0.06	0.00	0.23	0.05	0.00	0.29	0.04
P₂O₅	0.01	0.14	0.07	0.01	0.10	0.03	0.00	0.12	0.02

Element concentrations, including their minimum, maximum and mean values, are given in wt%.

Table 4. Classification accuracy of tectonic discrimination of basalts in dataset 1 based on six MLAs.

Tectonic Settings	Training Set	Test Set	LRC (%)	NB (%)	MLP (%)	SVM (%)	RF (%)	SONFIS (%)
MORB	239	57	91.23	82.46	85.96	91.23	91.23	92.98
OIB	262	57	87.72	84.21	98.25	91.23	96.49	98.25
IAB	249	74	89.19	93.24	89.19	93.24	95.95	95.95
Total	750	188	89.36	87.23	90.96	92.02	94.68	95.74

Values marked in bold italics denote the classification accuracy of the best classifier for each tectonic setting.
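As a consistency check on Table 4, the per-setting percentages can be converted back into counts of correctly classified test samples. The sketch below uses the SONFIS column and assumes, as the table layout suggests, that each percentage is the number of correct predictions divided by the test-set size of that setting; the variable names are illustrative only.

```python
# Test-set sizes and SONFIS accuracies (%) copied from Table 4.
test_sizes = {"MORB": 57, "OIB": 57, "IAB": 74}
sonfis_acc = {"MORB": 92.98, "OIB": 98.25, "IAB": 95.95}

# Implied number of correctly classified samples per setting.
correct = {k: round(test_sizes[k] * sonfis_acc[k] / 100) for k in test_sizes}

# Percentages recomputed from the integer counts, for comparison with the table.
recomputed = {k: round(100 * correct[k] / test_sizes[k], 2) for k in test_sizes}

# Overall accuracy from the summed counts.
total_correct = sum(correct.values())
total_tests = sum(test_sizes.values())
overall = round(100 * total_correct / total_tests, 2)
```

Under this assumption the counts are 53, 56 and 71 correct samples, i.e., 180 of 188, which reproduces the 95.74% reported in the Total row.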

Table 5. Classification accuracy of tectonic discrimination of basalts in dataset 2 based on six MLAs.

Tectonic Settings	Training Set	Test Set	LRC (%)	NB (%)	MLP (%)	SVM (%)	RF (%)	SONFIS (%)
MORB	424	115	74.78	90.43	85.22	84.35	82.61	96.52
OIB	374	89	55.06	41.57	88.76	29.21	85.39	94.38
IAB	468	112	85.71	75.89	90.18	87.50	92.86	97.32
Total	1266	316	73.10	71.52	87.97	69.94	87.28	96.20

Values marked in bold italics denote the classification accuracy of the best classifier for each tectonic setting.
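The Total row of Table 5 can be read as the test-size-weighted mean of the per-setting accuracies. The worked example below is a reader's check rather than a formula stated in the paper; the symbols $n_k$ (test-set size of setting $k$) and $A_k$ (accuracy for setting $k$) are introduced here for illustration.

```latex
\begin{align*}
A_{\text{total}} &= \frac{\sum_{k} n_k A_k}{\sum_{k} n_k},
  \qquad k \in \{\text{MORB}, \text{OIB}, \text{IAB}\} \\[4pt]
A_{\text{total}}^{\text{SVM}} &= \frac{115 \times 84.35 + 89 \times 29.21 + 112 \times 87.50}{316}
  = \frac{22099.94}{316} \approx 69.94\% \\[4pt]
A_{\text{total}}^{\text{SONFIS}} &= \frac{115 \times 96.52 + 89 \times 94.38 + 112 \times 97.32}{316}
  = \frac{30399.46}{316} \approx 96.20\%
\end{align*}
```

Both values match the table. The SVM case shows how a single weak class (OIB at 29.21%) pulls the overall figure well below the accuracies of the other two settings.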

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
