Hyperspectral Lithological Classification of 81 Rock Types Using Deep Ensemble Learning Algorithms

Xie, Shanjuan; Qiu, Yichun; Cao, Shixian; Wu, Wenyuan

doi:10.3390/min15080844

by Shanjuan Xie ^1,*, Yichun Qiu ^2, Shixian Cao ^1 and Wenyuan Wu ^1,3

^1 School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China
^2 Kharkiv College, Hangzhou Normal University, Hangzhou 311121, China
^3 Zhejiang Provincial Key Laboratory of Urban Wetlands and Regional Change, Hangzhou Normal University, Hangzhou 311121, China
^* Author to whom correspondence should be addressed.

Minerals 2025, 15(8), 844; https://doi.org/10.3390/min15080844

Submission received: 5 July 2025 / Revised: 1 August 2025 / Accepted: 6 August 2025 / Published: 8 August 2025

(This article belongs to the Section Mineral Exploration Methods and Applications)

Abstract

To address overfitting due to limited sample size, and the challenges posed by “Spectral Homogeneity with Material Heterogeneity (SHMH)” and “Material Consistency with Spectral Divergence (MCSD)”, which arise from subtle spectral differences and limit classification accuracy, this study proposes a deep ensemble model that combines the Adaptive Boosting (AdaBoost) algorithm with a convolutional recurrent neural network (CRNN). The model adopts a dual-branch architecture integrating a 2D-CNN and a gated recurrent unit (GRU) to effectively fuse spatial and spectral features of rock samples, while the AdaBoost algorithm enhances system stability and generalization capability. The experiment used a hyperspectral dataset containing 81 rock samples (46 igneous rocks and 35 metamorphic rocks) and evaluated model performance through five-fold cross-validation. The proposed 2D-CRNN-AdaBoost model achieved 92.55% overall accuracy, which was significantly better than that of the other comparative models, demonstrating the effectiveness of multimodal feature fusion and the ensemble learning strategy.

Keywords:

hyperspectral image classification; rock classification; deep learning; AdaBoost algorithm

1. Introduction

Rocks are fundamental components of the Earth’s surface, and their accurate classification is essential in fields such as geotechnical engineering, mineralogy, petrology, rock mechanics, and mineral resource exploration [1,2,3].

In geological research, rock classification and nomenclature systems facilitate the differentiation of rock types and their spatial distribution across distinct regions, thereby enhancing exploration efficiency. For example, Li et al. [4] demonstrated the critical role of lithological classification in guiding geological surveys and mineral prospecting. By identifying the types of Yanshanian granites, their research revealed potential genetic linkages to Mo-polymetallic mineralization, underscoring the practical value of systematic rock categorization in mineral resource targeting. Similarly, analyzing the microstructural characteristics of melt inclusions in anatectic metamorphic rocks necessitates precise rock classification and petrographic identification using traditional methodologies [5].

Classification methods predominantly rely on field-based macroscopic observations and laboratory microscopic identification. For example, the approach proposed by Xu et al. (2014) [6] involves field collection of rock spectra coupled with EO-1 Hyperion data matching to identify lithological types. However, such conventional methodologies suffer from limitations including low efficiency, strong subjectivity, and limited adaptability to complex geological environments. In recent years, hyperspectral imaging technology, leveraging its nanometer-scale spectral resolution (λ/100) and broad spectral coverage, has enabled the simultaneous resolution of microscale compositional variations and macroscale structural features in rocks, providing a strong technical foundation for intelligent lithological classification [7]. This advancement has been increasingly applied to geological mapping [8] and mineral resource exploration [9].

The development of hyperspectral rock classification methodologies has progressed through two pivotal phases. The first, centered on traditional machine learning, primarily employed algorithms such as spectral angle mapper [10], random forest [11,12], K-means clustering [13], and support vector machine [14]. For example, Tessier et al. [15] successfully classified desiccated rock mixtures by integrating wavelet texture analysis with SVM, while Tripathi and Garg [16] employed principal component analysis (PCA)-based dimensionality reduction and K-means clustering to identify basalt subtypes. However, these methods faced significant limitations due to the curse of dimensionality—also known as the Hughes phenomenon [17]—inherent in hyperspectral data. While subsequent hybrid methods like Kilickaya et al. [18] integrated subspace learning with one-class classification to eliminate cascaded errors from traditional separate dimensionality reduction and classification, their insufficient spatial-spectral feature decoupling capability still limits classification performance in complex geological scenarios.

The breakthrough of convolutional neural networks (CNNs) [19] in image processing has spurred researchers to explore end-to-end feature learning frameworks for the hyperspectral classification of rocks. For example, Yang et al. [20] implemented 1D-CNNs to classify coal and rock spectral curves, while Zhang et al. [21] used the fusion strategy of 2D-CNN and 1D-CNN to capture spatial correlation [22], enabling coarse classification of airborne hyperspectral rock images. However, single-modal feature extraction remains insufficient for addressing the complexity of geological environments. Sinaice et al. [23] successfully distinguished eight igneous rock units with 96% prediction accuracy by combining hyperspectral imaging with deep learning CNN technology; however, the experiment only covered a few igneous rock samples and could not fully represent more rock types.

The comprehensive development of deep learning has promoted the formation of a space-spectrum joint learning paradigm. Hyperspectral rock imagery exhibits dual attributes [24]: spatial features reflect tectonic features (e.g., stratification and joint development), while spectral features encode mineral chemical composition information [25]. These dual properties necessitate that algorithms possess multiscale feature fusion capabilities to holistically integrate spatial context and spectral signatures for robust lithological discrimination.

The dual-stream CNN architecture developed by Dang et al. [26] achieved an accuracy of 89.2% in granite classification, demonstrating the efficacy of spatial-spectral collaborative learning. Subsequent studies further incorporated temporal modeling techniques to enhance spectral sequence analysis [27]. For example, Zhao et al. [28] developed a Long Short-Term Memory and convolutional neural network (LSTM-CNN) hybrid network, validating its effectiveness in spectral-temporal feature extraction on the airborne visible/infrared imaging spectrometer copper ore dataset (1990–2480 nm). Agrawal and Govil [29] addressed training instability in multi-layer LSTM stacks by proposing residual connections integrating 1D-CNN and LSTM modules, achieving recognition accuracies of 92% and 89% for alunite and calcite, respectively. Notably, Liu et al. [30] adopted the implementation of rock thin section mineral grain classification, which improved the overall recognition rate by approximately 4% compared with the ResNet + LSTM method (93.2%), demonstrating better engineering applicability.

However, existing approaches face three core challenges:

①: Inadequate dataset foundations: Existing datasets exhibit critical constraints in lithological diversity. For instance, the Okada Mineral Set [31] covers only five mineral types, while the Galdames dataset [32] expands coverage to thirteen rock types—including andesite, tourmaline breccia, and travertine breccia. Nevertheless, the persistent lack of spectral variability analysis fundamentally restricts their applicability to complex rock identification and classification scenarios.
②: Small-sample generalization deficit: Data scarcity induces severe overfitting and accuracy degradation, especially for metamorphic rocks. As evidenced by Li et al. [33], the ShuffleNetV2 model misclassified 46 metamorphic rock samples as volcanic rocks when using deep learning to classify thin rock section images, and the overall accuracy (OA) for metamorphic rocks was the lowest (67%).
③: Spectral confusion: Spectral Homogeneity with Material Heterogeneity (SHMH) and Material Consistency with Spectral Divergence (MCSD) disrupt the spectral-lithological correspondence, significantly compromising the recognition accuracy of mainstream classification models. SHMH refers to the spectral uniformity arising from the overlapping of key mineral absorption features in different lithologies. For instance, Zhang et al. [34] discovered spectral overlap in certain mineral assemblages. Barbey et al. [35] observed that the reflection curves of cordierite hornfels and granite converge in the 2200–2400 nm band due to overlapping Fe²⁺ absorption features. Regmi [36] confirmed that metamorphic hornfels and granite exhibit similarities in Al-OH absorption due to their shared ferromagnesian minerals.

MCSD refers to the spectral divergence observed within the same lithology caused by internal variations in mineral structure or alteration. For example, Karimzadeh and Tangestani [37] found that spectral ratio characteristics vary in chlorite-serpentinite schist due to differentiation in alteration.

In response to the above challenges, this study conducted breakthrough explorations at two levels: data foundation construction and algorithm architecture innovation. First, a laboratory-grade imaging spectrometer was used to construct a standard hyperspectral rock dataset (HSRD-1.0) encompassing 46 igneous and 35 metamorphic rock types, with a focus on spectrally ambiguous lithologies. Next, an Adaptive Boosting (AdaBoost)-enhanced convolutional recurrent neural network (2D-CRNN) ensemble framework was proposed, featuring a dual-branch architecture for dynamically fusing spatial and spectral features—effectively mitigating the issue of homospectrality. Furthermore, ablation experiments on deformable convolutional layers demonstrated that the adaptive sample weighting mechanism significantly enhanced recognition accuracy and robustness for small-sample categories, offering a systematic solution to class imbalance in hyperspectral rock image classification.

2. Hardware and Dataset

2.1. Hyperspectral Imaging System

This study used the HySpex SWIR-384 push-broom hyperspectral imaging system (as shown in Figure 1), mounted on a moving platform. The system features a 16-degree field of view, 384 × 288 pixels, a spectral range of 950–2500 nm (5.45 nm resolution), and 16-bit data accuracy. It is equipped with an automatic focusing function to achieve high-precision hyperspectral imaging in moving scenes.

2.2. Hyperspectral Rock Standard Dataset HSRD-1.0

A total of 81 geologically representative newly exposed rock samples were selected, among which No. 1 to No. 46 were magmatic rocks, and No. 47 to No. 81 were metamorphic rocks (see Figure 2a for classification display). Magmatic rock samples are of great importance in the fields of geothermal development, metallurgical industry, and geological scientific research because of their high resistance to high temperatures and high compressive strength. Metamorphic rocks (such as gneiss and marble) are susceptible to the superposition of multi-stage metamorphism due to mineral recrystallization and structural complexity (such as foliation and banded structure), resulting in the overlapping of absorption peaks between bands (such as the similar response of epidote and amphibole at 1200–1400 nm), which poses a greater challenge to the spectral resolution of classification models. However, crustal evolution records and the development potential of new building materials highlight the necessity of high-precision identification.

During the collection of rock sample data, to eliminate sensor noise and environmental interference, the research team adopted the whiteboard calibration technique to obtain true reflectance data. Specifically, using a standard whiteboard with a reflectance close to 100% as the benchmark, the spectral reflectance calculation was performed according to Equation (1):

$$R = \frac{H \cdot R_{w}}{E_{w}} \tag{1}$$

where $H$ represents the hyperspectral emissivity image (raw sensor data); $R_{w}$ is the pre-calibrated reflectivity of the white reference board; and $E_{w}$ denotes the emissivity of the whiteboard, which is used to normalize the sensor’s emissivity measurements and derive accurate surface reflectivity.
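As a minimal sketch of Equation (1), the calibration can be written as a per-band division. The function name `calibrate_reflectance`, the array layout (rows, columns, bands), the scalar whiteboard reflectance, and the small `eps` guard are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def calibrate_reflectance(raw, white, white_reflectance=1.0, eps=1e-12):
    """Apply Equation (1), R = H * R_w / E_w, band by band.

    raw:   raw sensor cube H, shape (rows, cols, bands)
    white: whiteboard measurement E_w, shape (bands,)
    white_reflectance: pre-calibrated whiteboard reflectance R_w
                       (assumed to be a scalar in this sketch)
    """
    raw = np.asarray(raw, dtype=np.float64)
    white = np.asarray(white, dtype=np.float64)
    # eps avoids division by zero in dead bands (an added safeguard,
    # not part of Equation (1)).
    return raw * white_reflectance / np.maximum(white, eps)

# Toy cube: 2 x 2 pixels, 3 bands.
H = np.array([[[50.0, 100.0, 150.0], [25.0, 50.0, 75.0]],
              [[100.0, 200.0, 300.0], [0.0, 0.0, 0.0]]])
E_w = np.array([100.0, 200.0, 300.0])
R = calibrate_reflectance(H, E_w, white_reflectance=0.99)
```

A pixel whose raw signal equals the whiteboard signal comes out at the whiteboard reflectance (0.99 in this toy case), and a dark pixel comes out at 0.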

The corrected hyperspectral pseudo-color image is shown in Figure 2b, which effectively improves the signal-to-noise ratio of the original data.

This visualization method can directly reflect the spectral absorption characteristics of different rock types, laying a foundation for subsequent mineral composition analysis. For the issue of blurred edges and shadow interference in rock sample images, the study innovatively applied a 5 × 5 erosion algorithm to morphologically process the sample edges (the processing effect comparison is shown in Figure 2c). This method effectively eliminates the effect of boundary abnormal pixels on spectral feature extraction, improving the accuracy of the classification model.
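The edge-cleaning step can be illustrated with a binary erosion of the sample mask. This is only a sketch under stated assumptions: the paper specifies a 5 × 5 erosion but not the structuring element or the border handling, so a square element and a background border are assumed here, and `erode_mask` is a hypothetical helper name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode_mask(mask, size=5):
    """Binary erosion with a size x size square structuring element.

    A pixel stays True only if every pixel in its size x size
    neighborhood is True; pixels outside the image count as background
    (an assumption of this sketch).
    """
    mask = np.asarray(mask, dtype=bool)
    pad = size // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    windows = sliding_window_view(padded, (size, size))
    return windows.all(axis=(-2, -1))

sample_mask = np.zeros((9, 9), dtype=bool)
sample_mask[1:8, 1:8] = True           # a 7 x 7 sample footprint
core = erode_mask(sample_mask, size=5)  # keeps only the 3 x 3 interior
```

Only spectra from pixels where `core` is True would then be used, which discards a two-pixel rim where blurred edges and shadows are most likely.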

The first step during the data processing stage of the rock samples was to extract the spectral characteristics of 81 types of rocks from the preprocessed data and draw spectral curves (Figure 3).

2.3. Spectral Characteristics of Dataset HSRD-1.0

The spectral curves in Figure 3 show an obvious dependence on rock type. Igneous rocks are characterized by features such as the high reflectance of olivine with a strong absorption near 1000 nm, and the double absorption peaks of pyroxene around 900 nm and 2000 nm, in ferromagnesian rocks such as sample No. 1. Additionally, volcanic bombs (No. 12) and volcaniclastic rocks like pumice (No. 36), which are porous and glassy, exhibit low reflectance across the entire wavelength range and broadened water absorption peaks near 1400 nm. In contrast, the spectral features of metamorphic rocks are notably more complex. Fine-crystalline marble (No. 49) and medium-crystalline marble (No. 50) are primarily defined by the CO₃²⁻ absorption peaks near 2300 nm in the near-infrared (NIR) band, with the peak depth correlating with the grain size of the rocks. Among the sillimanite-bearing contact metamorphic rocks, the epidote skarn (No. 56) and garnet-epidote sillimanite (No. 57) both exhibit spectral features influenced by the presence of chlorite minerals, which cause their low reflectance and broadened water peaks in all bands. The striated mixed rock (No. 78) and the ophthalmic mixed rock (No. 80) share a core similarity: the spectral curves of both show significant “sawtooth” fluctuations due to the alternating distribution of light bodies (feldspathic minerals) and dark bodies (ferromagnesian minerals). These spectral characteristics demonstrate a paradoxical duality (systematic intra-class regularity coupled with inter-class convergence) that both challenges diagnostic classification and validates the representativeness of the experimental samples in capturing real-world lithological variability.

3. Method

3.1. Related Work

3.1.1. Two-Dimensional CNN

CNNs have shown significant advantages in image spatial feature extraction by virtue of their local receptive fields and weight-sharing mechanism [38]. The two-dimensional convolutional neural network (2D-CNN) is a deep learning model designed for two-dimensional grid data (e.g., images and video frames) that automatically extracts spatial features through convolutional operations, local connectivity, and weight sharing. The 2D-CNN has been widely applied in computer vision, including image classification, object detection, and image segmentation, and it is a core technology for efficiently processing two-dimensional spatial data. Given the unique characteristics of hyperspectral data, early studies used 2D-CNNs to model spatial neighborhood information, with convolution kernels capturing local structural features such as rock texture and edges. Follow-up work further introduced lightweight designs to alleviate overfitting in hyperspectral small-sample scenarios.

Given an input signal $X$ (usually a two-dimensional matrix such as an image) of size $m \times n$ and a convolution kernel (also known as a filter) $K$ of size $k \times k$, the convolution operation results in an output matrix $Y$ of size $(m - k + 1) \times (n - k + 1)$ (here, it is assumed that there is no padding, i.e., the padding is 0). The formula for each element $Y[i][j]$ of the output matrix $Y$ is

$$Y[i][j] = \sum_{p = 0}^{k - 1} \sum_{q = 0}^{k - 1} X[i + p][j + q] \times K[p][q] \tag{2}$$
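Equation (2) can be checked with a direct, unoptimized implementation. The sketch below assumes a stride of 1 and no padding, as in the text; `conv2d_valid` is an illustrative name, and deep learning frameworks compute the same quantity far more efficiently.

```python
import numpy as np

def conv2d_valid(X, K):
    """Direct evaluation of Equation (2) with no padding and stride 1."""
    m, n = X.shape
    k = K.shape[0]
    Y = np.zeros((m - k + 1, n - k + 1))
    for i in range(m - k + 1):
        for j in range(n - k + 1):
            # Sum over p, q of X[i + p][j + q] * K[p][q]
            Y[i, j] = np.sum(X[i:i + k, j:j + k] * K)
    return Y

X = np.arange(16, dtype=float).reshape(4, 4)  # 4 x 4 input
K = np.ones((3, 3))                            # 3 x 3 summing kernel
Y = conv2d_valid(X, K)                         # 2 x 2 output
```

With a 4 × 4 input and a 3 × 3 kernel, the output is 2 × 2, matching the size $(m - k + 1) \times (n - k + 1)$ stated above.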

3.1.2. Recurrent Neural Network

A recurrent neural network (RNN) is a type of neural network designed for processing sequence data (e.g., text, speech, and time series), at the core of which are recurrent connections between neurons in the hidden layer that allow information to be passed in the temporal dimension, thus capturing temporal dependencies in the sequence [39].

RNNs are widely used for the spectral sequence analysis of hyperspectral data due to their ability to model temporal dependencies. The classical LSTM [40] mitigates the vanishing gradient problem on long sequences by introducing gates such as the forget and input gates. In the field of hyperspectral rock classification, Agrawal and Govil [29] proposed residual connections between 1D-CNN and LSTM modules to ease the training difficulty caused by multi-layer LSTM stacking, and effectively identified most minerals, such as alunite and calcite, on the AVIRIS hyperspectral copper ore dataset; this is a typical application of RNNs and their variants in this field. The gated recurrent unit (GRU) [41] further simplifies the gating mechanism by dynamically balancing the historical state and the current input with an update gate and a reset gate, which balances efficiency and accuracy in spectral feature modeling. Equation (3) is the core formula of an RNN:

$$h_{t} = \sigma(W_{hh} h_{t - 1} + W_{xh} x_{t} + b_{h}) \tag{3}$$

Here, $h_{t}$ denotes the hidden state at the current time step $t$, which summarizes the information from all previous time steps; $h_{t - 1}$ is the hidden state at the previous time step $t - 1$; $x_{t}$ is the input at the current time step $t$; $W_{hh}$ and $W_{xh}$ are weight matrices that are used to linearly transform the hidden state and the inputs to capture the relationship between the inputs and the hidden state; $b_{h}$ is the bias term used to adjust the result after the linear transformation; and $\sigma$ is an activation function, such as the hyperbolic tangent function tanh or the logistic function Sigmoid, which is used to introduce nonlinearities and enable the network to learn complex patterns.
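The recurrence in Equation (3) can be sketched in a few lines. The dimensions, the random weights, and the choice of tanh as the activation are assumptions made for illustration; treating each group of bands as one "time step" mirrors how the text handles the spectral axis.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of Equation (3) with tanh as the activation."""
    return np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)

rng = np.random.default_rng(0)
d, h = 4, 3                                  # input size, hidden size (toy values)
W_xh = rng.normal(scale=0.1, size=(h, d))
W_hh = rng.normal(scale=0.1, size=(h, h))
b_h = np.zeros(h)

spectrum = rng.random((10, d))               # 10 steps along the spectral axis
h_t = np.zeros(h)
for x_t in spectrum:
    # The hidden state carries information from all earlier steps.
    h_t = rnn_step(x_t, h_t, W_xh, W_hh, b_h)
```

After the loop, `h_t` is a fixed-length summary of the whole sequence, which is the property the spectral branch relies on.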

3.2. Proposed Algorithm

This study proposes a dual-branch hyperspectral rock classification model that integrates a 2D-CNN and a GRU. The 2D-CNN extracts the spatial features of rocks, using spectral dimensionality reduction and a streamlined structure to suppress redundant information, while the GRU models the temporal dependence of spectral sequences, using the gating mechanism to capture subtle differences between bands. Both branches reduce the risk of overfitting through parameter optimization, and the AdaBoost algorithm is introduced to dynamically fuse the two-branch features and preferentially strengthen the discriminant weight of spectrally confusing samples. The method improves both classification accuracy and generalization under small-sample conditions, addresses the classification ambiguity caused by subtle differences in rock spectra, and provides an efficient and reliable technical solution for geological lithology identification (see Figure 4 for the schematic diagram).

3.2.1. A Framework for Hyperspectral Rock Classification Based on Bimodal Feature Synergy

To address the core challenges of hyperspectral rock image classification, namely subtle spectral differences and overfitting due to limited sample sizes, this study proposes a spatial-spectral bimodal cooperative learning framework. The framework features a dual-branch feature extraction network that integrates a 2D-CNN and a GRU. The 2D-CNN branch is based on PCA dimensionality reduction and a streamlined structural design (e.g., compression of spectral channels, reduction in layers, and removal of pooling layers); it prioritizes the extraction of multi-scale spatial features of rocks (e.g., texture and edges) and improves training stability on small samples through batch normalization. The GRU branch uses the gating mechanism to model the dynamic temporal dependencies within spectral sequences: the reset and update gates collaboratively capture cross-band local correlations and long-range spectral trends, enhancing sensitivity to subtle spectral differences. The dual-branch model suppresses the risk of overfitting through parameter streamlining and complementary feature design, and it is combined with the AdaBoost algorithm to adaptively fuse heterogeneous features and preferentially strengthen the discriminative weight of spectrally confusing lithological samples. Experimental results demonstrate that the framework enables effective collaborative optimization of spatial and spectral features, even with limited sample sizes. It significantly improves classification accuracy and model generalization, offering reliable technical support for intelligent lithological identification in geological exploration.

3.2.2. Two-Dimensional Convolutional Recurrent Neural Network

Through spatial-temporal bimodal feature extraction, the CNN and GRU are deeply integrated to provide the dual advantages of local detail perception and global temporal modeling. The 2D-CRNN model adopts a dual-path feature extraction architecture, and the 2D-CNN branch efficiently captures the spatial structural features of the rock image by streamlining the design of the number of convolutional channels and the number of layers while suppressing the overfitting risk. The RNN branch takes advantage of the temporal dependence modeling of the spectral sequence by the GRU to extract multi-level spectral response patterns [42].

The 2D-CNN branch first applies PCA to reduce the spectral dimension of the input data from 384 bands to 3 bands (as shown in Figure 5), which lowers computational complexity and eliminates redundant information to significantly improve model performance. The branch consists of two customized 2D convolutional layers that mine multi-scale rock spatial features from limited data, capturing key characteristics of rock images, such as edges, textures, and spatial positioning. To prevent overfitting, the spectral channels received by the 2D convolutional layers are compressed to three, and the number of convolutional layers and parallel recurrent unit layers is halved. The pooling layer is also omitted, which maintains a high spatial resolution. The convolutional layers use varying parameters and kernel sizes to extract features and are followed by batch normalization and ReLU activation functions. The former normalizes small-sample data to stabilize data distributions, accelerate model convergence, and reduce gradient fluctuations, thereby enhancing stability; its regularization-like effect further mitigates overfitting. The latter introduces nonlinear factors to capture complex spectral relationships and address spectral confusion issues.
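The PCA step can be sketched with a plain SVD on the pixel-by-band matrix. The cube size and the helper name `pca_reduce` are assumptions for illustration; the paper does not state which PCA implementation was used, and a toy 20-band cube stands in for the real data here.

```python
import numpy as np

def pca_reduce(cube, n_components=3):
    """Project each pixel spectrum onto the leading principal components.

    cube: hyperspectral cube of shape (rows, cols, bands)
    returns: array of shape (rows, cols, n_components)
    """
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(np.float64)
    pixels = pixels - pixels.mean(axis=0)          # center each band
    # Rows of Vt are principal axes, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(pixels, full_matrices=False)
    reduced = pixels @ Vt[:n_components].T
    return reduced.reshape(rows, cols, n_components)

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 20))        # toy cube: 8 x 8 pixels, 20 bands
reduced = pca_reduce(cube, n_components=3)
```

The three output channels can then be fed to a 2D convolutional layer in the same way as an ordinary three-channel image.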

The RNN branch integrates 1D-CNN and GRU to capture spectral sequence features of rock images. In this framework, the spectral dimension of hyperspectral images is treated as temporal sequence features. When spectral data are fed into this branch, the 1D-CNN first extracts the central pixel block and performs dimensionality reduction while capturing local spectral features and patterns. Batch normalization layers and ReLU activation functions are appended after the convolutional layers, outputting preliminarily processed data. The GRU layer is designed to process the temporal sequence features of hyperspectral images along the spectral dimension. These features exhibit a dynamic and continuous nature, enabling the reflection of trends in rock spectral characteristics over time or wavelength-dependent variations. By modeling sequential dependencies, the GRU captures long-range spectral correlations and adaptively tracks gradual spectral shifts, enhancing the representation of subtle spectral patterns in geological samples.

Hyperspectral datasets are typically large and contain long spectral sequences, making it challenging to model spectral dependencies effectively. Traditional RNNs often struggle with capturing long-range relationships due to vanishing or exploding gradients. In contrast, GRU networks, with their streamlined gated mechanisms that efficiently regulate information flow, offer a more robust solution for modeling spectral-temporal relationships compared to LSTMs. The GRU layer improves the efficiency of training long-sequence data by simplifying the LSTM structure. The output from the GRU layer is processed through batch normalization and a tanh activation function, enhancing the model’s adaptability to data variations and nonlinear characteristics. Subsequently, the data undergo additional batch normalization and ReLU activation to optimize the final output.

The GRU employs gated mechanisms to model temporal dependencies in spectral sequences and extract multi-level spectral response patterns. A GRU features a more streamlined gating architecture than an LSTM, with only an update gate and a reset gate. Given a time step $t$, a small batch of inputs $X_{t} \in \mathbb{R}^{n \times d}$ ($n$ is the number of samples, and $d$ is the number of inputs), and the hidden state $H_{t - 1} \in \mathbb{R}^{n \times h}$ ($h$ is the hidden unit dimension) of the previous time step, the reset gate $R_{t}$ and the update gate $Z_{t}$ are calculated as

$$R_{t} = \mathrm{sigmoid}(X_{t} W_{xr} + H_{t - 1} W_{hr} + b_{r}) \tag{4}$$

$$Z_{t} = \mathrm{sigmoid}(X_{t} W_{xz} + H_{t - 1} W_{hz} + b_{z}) \tag{5}$$

In Equation (4), $W_{xr}$ ($x$ represents the input; $r$ represents the reset gate) denotes the weight matrix from the input layer to the reset gate; $W_{hr}$ denotes the recursive weight from the hidden state to the reset gate, modeling the band correlation across time steps; and $b_{r}$ is the bias vector of the reset gate, which controls the default threshold for gate activation.

In Equation (5), $W_{xz}$ denotes the projection weights of inputs to the update gate, extracting diagnostic absorption features; $W_{hz}$ denotes the recursive weights of hidden states to the update gate, maintaining the memory of key spectral features; and $b_{z}$ is the update gate bias. Positive initialization enhances the historical state retention tendency.

The Sigmoid function serves as the activation function to map values into the interval [0, 1], thereby regulating the gating mechanisms in the GRU. This ensures that the outputs of the reset gate $R_{t}$ and update gate $Z_{t}$ are constrained within the range [0, 1]. The GRU updates the hidden state by computing a candidate hidden state $\tilde{H}_{t}$, which is achieved through element-wise multiplication (shown as ⊙) between the reset gate output at the current time step $t$ and the previous hidden state $H_{t - 1}$. Specifically, the candidate hidden state $\tilde{H}_{t} \in \mathbb{R}^{n \times h}$ (where $n$ is the batch size, and $h$ is the hidden unit dimension) at time step $t$ is formulated as

$$\tilde{H}_{t} = \tanh(X_{t} W_{xh} + (R_{t} \odot H_{t - 1}) W_{hh} + b_{h}) \tag{6}$$

where $W_{xh}$ denotes the input-to-candidate-state projection weights, which capture localized spectral details by mapping raw spectral inputs to the hidden space; $W_{hh}$ represents the recurrent weights of the reset hidden state, modeling nonlinear interactions between spectral bands to refine temporal dependencies; and $b_{h}$ is the candidate-state bias term, compensating for spectral baseline shifts and enhancing the adaptability of spectral feature representation.

This mechanism gives the GRU two key advantages. First, the reset gate $R_{t}$ explicitly captures local correlations between spectral bands (such as mineral absorption peaks), thereby increasing sensitivity to subtle spectral variations. Second, the update gate $Z_{t}$ adaptively balances long-range dependencies and short-term variations through weight optimization, effectively mitigating the vanishing gradient problem. This gating synergy makes the GRU particularly suitable for spatial-spectral joint modeling across hundreds of bands in hyperspectral data, providing theoretical guarantees for temporal feature extraction in dual-branch architectures.
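Equations (4) to (6) can be combined into a single GRU step, sketched below in NumPy. The final blending line is the standard GRU update, which the text does not write out; the convention $H_{t} = Z_{t} \odot H_{t-1} + (1 - Z_{t}) \odot \tilde{H}_{t}$ is assumed here because it agrees with the remark that a positive update-gate bias favors retaining the historical state. All sizes and the parameter dictionary layout are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(X_t, H_prev, p):
    """One GRU step for a batch: X_t is (n, d), H_prev is (n, h)."""
    R_t = sigmoid(X_t @ p["W_xr"] + H_prev @ p["W_hr"] + p["b_r"])   # Eq. (4)
    Z_t = sigmoid(X_t @ p["W_xz"] + H_prev @ p["W_hz"] + p["b_z"])   # Eq. (5)
    H_cand = np.tanh(X_t @ p["W_xh"]
                     + (R_t * H_prev) @ p["W_hh"] + p["b_h"])        # Eq. (6)
    # Standard GRU blend (assumed; not written out in the text above).
    H_t = Z_t * H_prev + (1.0 - Z_t) * H_cand
    return H_t, R_t, Z_t

rng = np.random.default_rng(0)
n, d, h = 2, 5, 4                     # batch, input, hidden sizes (toy values)
params = {}
for gate in ("r", "z", "h"):
    params[f"W_x{gate}"] = rng.normal(scale=0.1, size=(d, h))
    params[f"W_h{gate}"] = rng.normal(scale=0.1, size=(h, h))
    params[f"b_{gate}"] = np.zeros(h)

sequence = rng.random((6, n, d))      # 6 steps along the spectral axis
H_t = np.zeros((n, h))
for X_t in sequence:
    H_t, R_t, Z_t = gru_step(X_t, H_t, params)
```

When $Z_{t}$ is close to 1, the previous state passes through almost unchanged, which is how long-range spectral trends survive across many bands.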

3.2.3. AdaBoost Algorithm Integrated with Deep Learning Frameworks

After completing spatial-spectral dual-branch feature extraction, a key challenge in hyperspectral rock classification is how to effectively fuse these heterogeneous features and enhance classification robustness, especially under small-sample conditions. In the model constructed in this study, the two types of heterogeneous features are deeply fused by the fully connected layer to produce the initial classification probabilities. Here, weak classifiers are simple sub-modules with limited individual discriminative power—each focusing on specific local patterns within spatial or spectral features of hyperspectral rock data, with standalone performance slightly better than random guessing. The weight coefficients of each weak classifier are dynamically adjusted via error backpropagation by calculating the cross-entropy loss with respect to the true labels. This process enhances the weights of high-performing feature extraction paths while attenuating those of sub-optimal ones. Finally, a weighted voting mechanism integrates multi-path prediction results (i.e., summing each classifier’s output multiplied by its confidence weight), thereby achieving collaborative optimization of spatial-spectral features.
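The weighted voting described above can be sketched as a weighted sum of per-classifier class probabilities followed by an argmax. The probabilities and confidence weights below are made-up numbers for illustration, and `weighted_vote` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def weighted_vote(probabilities, weights):
    """Combine classifier outputs by confidence-weighted summation.

    probabilities: shape (n_classifiers, n_samples, n_classes)
    weights:       shape (n_classifiers,)
    """
    probabilities = np.asarray(probabilities, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Sum over classifiers of weight * output -> (n_samples, n_classes)
    scores = np.tensordot(weights, probabilities, axes=1)
    return scores.argmax(axis=1), scores

probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],   # classifier 1
    [[0.1, 0.7, 0.2], [0.3, 0.4, 0.3]],   # classifier 2
    [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]],   # classifier 3
]
conf_weights = [0.5, 0.2, 0.3]
labels, scores = weighted_vote(probs, conf_weights)
```

In this toy case the second sample is assigned class 2 even though two of the three classifiers prefer class 1, because the third classifier is very confident and carries a sizeable weight.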

The iterative sample weighting mechanism of the AdaBoost algorithm directly addresses the key challenges in geological data by dynamically adjusting the training focus. In response to spectral confusion, the algorithm increases the weights of such difficult samples during consecutive iterations, compelling the model to deepen its ability to identify mineral absorption features. This weight allocation mechanism effectively mitigates the issue of class imbalance, while the confidence-weighted voting strategy significantly reduces cascading errors arising from spatial-spectral feature conflicts.

To further improve classification accuracy and generalization in hyperspectral rock classification, this study integrates the AdaBoost algorithm with a deep learning framework. This establishes a dual enhancement mechanism by dynamically adjusting both sample weights and model decision weights. Originally proposed by Freund and Schapire [43] in the 1990s, the algorithm iteratively adjusts the sample weight distribution to force the model to persistently focus on hard samples, thereby improving discriminative capacity for fine-grained rock categories. Critically, the weak classifiers in this implementation are homogeneous: multiple identical convolutional recurrent neural networks (CRNNs), each constructed using the same CNN-GRU two-branch design described above, serve as weak learners trained on iteratively re-weighted samples. This approach constitutes a homogeneous ensemble strategy, distinct from heterogeneous integrations (e.g., combining decision stumps with CRNNs), in which the base learner remains consistent across all iterations. Specifically, the model employs structurally invariant CRNNs whose parameters are optimized on dynamically adjusted sample distributions, eliminating cascaded errors from heterogeneous model fusion while maximizing feature decoupling capabilities for hyperspectral rock discrimination (see Figure 6 for the schematic diagram).

First, the sample weights are initialized. Assume that the dataset $D$ contains $M$ samples, so that every sample starts with the same weight $w_{i}^{1} = \frac{1}{M}$, i.e., the weight of the $i$th sample in the first iteration round. Then, a weak classifier $c_{t}(x)$ (where $t$ denotes the iteration round) is trained using the current sample weights, and its weighted error rate $r_{t}$ on the training set is computed as

r_{t} = \sum_{i = 1}^{M} w_{i}^{t} \cdot I (c_{t} (x_{i}) \neq y_{i})

(7)

where $I(\cdot)$ is the indicator function, which takes the value 1 when the prediction is wrong and 0 otherwise; $x_{i}$ denotes the feature vector of the $i$th sample, i.e., the input data, so $c_{t}(x_{i})$ is the prediction (±1) of the weak classifier for sample $x_{i}$; and $y_{i}$ denotes the $i$th sample's label, i.e., the output target that the model needs to predict.

The weight $\alpha_{t}$ of the weak classifier is calculated from the error rate obtained above as

\alpha_{t} = \log \frac{1 - r_{t}}{r_{t}}

(8)

This weight reflects the importance of the weak classifier in the final decision; the lower the error rate, the higher the weight. Then, the sample weights are updated so that the weights of the misclassified samples are increased, as shown in Equation (9):

w_{i}^{t + 1} = w_{i}^{t} \exp\left(\alpha_{t} I (y_{i} \neq c_{t} (x_{i}))\right)

(9)
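As an illustration of Equations (7) to (9), the following minimal sketch applies a single boosting round to five toy samples. The toy labels and predictions, the function name `boosting_round`, and the final renormalization step (which keeps the weights summing to 1 so that the next weighted error rate remains a proportion) are assumptions made for this example and are not taken from the study's implementation.

```python
import math

def boosting_round(weights, predictions, labels):
    """Apply Equations (7)-(9) once; return (r_t, alpha_t, new_weights)."""
    # Indicator I(c_t(x_i) != y_i): 1 for a wrong prediction, 0 otherwise.
    miss = [1 if p != y else 0 for p, y in zip(predictions, labels)]
    # Equation (7): weighted error rate of the weak classifier.
    r_t = sum(w * m for w, m in zip(weights, miss))
    # Equation (8): classifier weight (undefined when r_t is 0 or 1).
    alpha_t = math.log((1.0 - r_t) / r_t)
    # Equation (9): only misclassified samples have their weights increased.
    updated = [w * math.exp(alpha_t * m) for w, m in zip(weights, miss)]
    # Assumption: renormalize so that the weights sum to 1 again.
    total = sum(updated)
    return r_t, alpha_t, [w / total for w in updated]

M = 5
weights = [1.0 / M] * M              # w_i^1 = 1/M
labels = [1, -1, 1, 1, -1]           # toy targets y_i
predictions = [1, -1, -1, 1, -1]     # the sample at index 2 is misclassified
r_t, alpha_t, new_weights = boosting_round(weights, predictions, labels)
```

In this toy case the error rate is 0.2, so the classifier weight is log 4, and after renormalization the single misclassified sample carries half of the total weight in the next round.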

At each iteration, AdaBoost adjusts the sample weights according to the current classification results based on the fused feature map and increases the weights of misclassified samples. This not only enhances the model’s ability to handle noise and outliers but also helps it focus more effectively on relevant features when dealing with complex cases such as “spectral homogeneity with material heterogeneity” (SHMH) and “material consistency with spectral divergence” (MCSD), thereby improving overall model robustness.

When all the preset iteration rounds are completed, the algorithm calculates the corresponding combination coefficient $\alpha_{t}$ for each weak classifier according to its error rate during training, and finally, the set of all weak classifiers is combined into a strong classifier through weighted linear superposition, as expressed in Equation (10):

C (x) = \operatorname{sign}\left(\sum_{t = 1}^{T} \alpha_{t} \cdot c_{t} (x)\right)

(10)

The $\operatorname{sign}(\cdot)$ function is the signum operator, which determines the final classification outcome. Here, $T$ is the total number of iterations, which equals the number of weak classifiers used during training.
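The weighted vote of Equation (10) can be sketched as follows. The sketch follows the binary (±1) form in which the equation is written; how the vote is extended to 81 classes is not specified in this section. The coefficients, the votes, and the tie-breaking rule are hypothetical values chosen for illustration.

```python
def strong_classify(alphas, weak_predictions):
    """Equation (10): sign of the alpha-weighted sum of weak predictions."""
    score = sum(a * c for a, c in zip(alphas, weak_predictions))
    # Assumption: a tie (score == 0) is resolved to +1.
    return 1 if score >= 0 else -1

alphas = [1.386, 0.405, 0.847]   # hypothetical combination coefficients
votes = [-1, 1, 1]               # predictions c_t(x) of three weak classifiers
label = strong_classify(alphas, votes)
```

Here the first classifier, which has the lowest error rate and hence the largest coefficient, outvotes the other two: the weighted score is -0.134, so the combined prediction is -1 even though the unweighted majority is +1.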

The AdaBoost framework improves classification accuracy for geologically rare and spectrally similar rock types through strategic weight amplification of hard samples, concurrently suppressing overfitting while boosting model discriminative capability and operational robustness.

A deep ensemble network model comprising a dual-branch network and the AdaBoost algorithm was designed in this study. The former exploits dual-branch feature extraction to obtain highly discriminative spatial-spectral features. The latter makes the model focus on difficult samples by iteratively adjusting sample weights, effectively sharpening the decision boundaries. Additionally, through a lightweight structural design, the model mitigates overfitting in small-sample scenarios and significantly improves overall performance in hyperspectral rock classification. This “sample-model” collaborative optimization strategy achieves a balance between accuracy and robustness in fine-grained rock classification.

4. Results and Discussion

4.1. Experimental Design

4.1.1. Datasets

Ground-truth labels were established for the 81 rock types, and the labeled pixels of each type were allocated to the training, validation, and test sets. The number of labeled pixel samples for each rock type is listed in Table 1. Labeling every class in this way supports the accuracy and generalization ability of the supervised classification model.

4.1.2. Experimental Environment and Evaluation Criteria

The experiment was performed on a platform featuring an Intel(R) Xeon(R) Platinum 8255C CPU, an NVIDIA GeForce RTX 2080Ti GPU, and the Ubuntu 20.04 operating system. All experiments used PyTorch 1.11.0, Python 3.8, and CUDA 11.3. The specific Python modules and libraries included numpy 2.4.0, scikit-learn 1.3.2, and torch 1.11.0 + cu113.

This study focuses on the classification of hyperspectral rock images, so quantitative indicators are needed to evaluate the classification results. Three indicators commonly used in this field were adopted: overall accuracy (OA), average accuracy (AA), and the Kappa coefficient. OA is the proportion of correctly classified samples among all samples, calculated as in Equation (11):

\mathrm{OA} = \frac{\sum_{i = 1}^{n} h_{i i}}{\sum_{i = 1}^{n} N_{i}}

(11)

In Equation (11), $n$ is the number of rock categories in the image, $N_{i}$ is the number of pixels in the $i$th rock category, and $h_{i i}$ is the number of pixels correctly classified in the $i$th category. AA is the mean of the per-category accuracies, i.e., the proportion of correctly classified pixels in each category, summed over all categories and divided by the number of categories, as in Equation (12):

\mathrm{AA} = \frac{1}{n} \sum_{i = 1}^{n} \frac{h_{i i}}{N_{i}}

(12)
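The difference between OA and AA is easiest to see on a small, unbalanced example. The sketch below computes both from a confusion matrix, assuming that rows hold the true categories (so the row sums are the $N_{i}$) and that the diagonal holds the $h_{i i}$; the 3 × 3 matrix is invented for illustration.

```python
def overall_accuracy(cm):
    """Equation (11): correctly classified pixels over all pixels."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def average_accuracy(cm):
    """Equation (12): mean of the per-category accuracies h_ii / N_i."""
    per_class = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    return sum(per_class) / len(per_class)

# Toy confusion matrix: rows are true categories, columns are predictions.
cm = [[90, 10, 0],
      [5, 15, 0],
      [0, 2, 8]]
oa = overall_accuracy(cm)   # 113 / 130
aa = average_accuracy(cm)   # (0.90 + 0.75 + 0.80) / 3
```

Because the first category dominates the pixel count, OA (about 0.869) is higher than AA (about 0.817); AA gives each category the same influence regardless of its size.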

The Kappa coefficient is a measure of agreement that can be used to assess the validity of a classification. It is calculated using Equation (13):

\mathrm{Kappa} = \frac{N \sum_{i = 1}^{r} x_{i i} - \sum_{i = 1}^{r} (x_{i +} x_{+ i})}{N^{2} - \sum_{i = 1}^{r} (x_{i +} x_{+ i})}

(13)

where $N$ is the total number of samples; $r$ is the number of rows (i.e., categories) in the confusion matrix; $x_{i +}$ and $x_{+ i}$ are the sums of the elements in row $i$ and column $i$ of the confusion matrix, respectively; and $x_{i i}$ is the diagonal element in row $i$ and column $i$ of the confusion matrix.
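Equation (13) can be checked on a small confusion matrix. The following sketch is a direct transcription of the formula; the 2 × 2 matrix is invented for illustration.

```python
def kappa(cm):
    """Equation (13): Kappa coefficient of a square confusion matrix."""
    r = len(cm)
    n = sum(sum(row) for row in cm)                     # N
    diag = sum(cm[i][i] for i in range(r))              # sum of x_ii
    row_sums = [sum(cm[i]) for i in range(r)]           # x_{i+}
    col_sums = [sum(cm[j][i] for j in range(r)) for i in range(r)]  # x_{+i}
    chance = sum(row_sums[i] * col_sums[i] for i in range(r))
    return (n * diag - chance) / (n * n - chance)

cm = [[40, 10],
      [5, 45]]
k = kappa(cm)   # (100 * 85 - 5000) / (100**2 - 5000)
```

For this matrix the overall accuracy is 0.85, while Kappa is 0.7, because the formula discounts the agreement that would be expected by chance from the row and column totals.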

4.1.3. Experimental Setup

In the proposed C-RNN-AdaBoost model, the AdaBoost algorithm uses 10 decision stumps as weak classifiers. The weights of the C-RNN model are initialized from the He normal distribution and the bias terms are set to zero; layers without a bias term are not initialized. The Adam optimizer was used with an initial learning rate of 0.001, and the learning rate was adjusted dynamically during training.

For data processing, a five-fold cross-validation method was used in this study. In each fold, 20% of the pixels were randomly selected as the training set, 70% were used for testing, and the remaining 10% were used for validation. This data partitioning method helps evaluate the generalization ability of the model, but it may also lead to differences in sample distribution between folds, which can affect the consistency of model performance.
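A random 20%/10%/70% partition of pixel indices can be sketched as below. The text does not state whether the split is stratified by rock type or how the five folds are seeded, so the function `split_indices`, its `seed` argument, and the use of a single class's pixel count (4986, the peridotite count in Table 1) are assumptions for illustration only.

```python
import random

def split_indices(n_samples, seed, train_frac=0.2, val_frac=0.1):
    """Shuffle indices and cut them into train / validation / test parts."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]       # the remaining ~70% of the pixels
    return train, val, test

# Hypothetical use: one split per fold, each with its own seed.
train, val, test = split_indices(4986, seed=0)
```

With 4986 pixels this yields 997 training, 498 validation, and 3491 test indices. Drawing a different random subset for each fold is one way the sample distribution can differ between folds, as noted above.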

The experimental results (as shown in Table 2) indicated that when using C-RNN to classify 81 types of rocks individually, there were significant differences in training accuracy between different folds. This was mainly attributed to the reduction in sample size and the uneven distribution of features. In addition, the decision stumps in AdaBoost performed poorly in capturing complex feature relationships, especially in multi-class tasks that require fine differentiation, exposing the limitations of traditional machine learning methods in such tasks.

4.1.4. Effect of the Number of Training Rounds on Classification Accuracy

The specific effect of the AdaBoost algorithm on model performance was analyzed through experiments with different numbers of training rounds (10, 20, 30, 40, and 50). In these experiments, the learning rate (initially 0.001) was adjusted by a factor of 10 whenever validation accuracy stagnated, weight decay was increased moderately (from 1 × 10⁻⁵ to 5 × 10⁻⁵) to mitigate overfitting, and the batch size was fixed at 32. The experimental results (as shown in Figure 7) showed that the model incorporating AdaBoost achieved higher classification accuracy at every tested number of training rounds. The accuracy of both types of models improved significantly after 20 rounds, but beyond 40 rounds the improvement became limited. To make rational use of computing resources, training was therefore terminated after 50 rounds.
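The plateau-based learning-rate adjustment described above can be sketched as follows. The text gives only the initial rate (0.001) and the factor of 10; the assumption that the rate is reduced (rather than increased), the `patience` of three rounds, and the validation accuracies are hypothetical choices for this example.

```python
def plateau_schedule(val_accuracies, initial_lr=0.001, factor=0.1, patience=3):
    """Return the learning rate used after each validation check."""
    lr = initial_lr
    best = float("-inf")
    stale = 0                      # checks since the last improvement
    history = []
    for acc in val_accuracies:
        if acc > best:
            best, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:  # accuracy has stagnated
                lr *= factor       # assumed: divide the rate by 10
                stale = 0
        history.append(lr)
    return history

accs = [0.60, 0.70, 0.70, 0.69, 0.70, 0.75]   # invented validation accuracies
lrs = plateau_schedule(accs)
```

In this trace the accuracy fails to exceed 0.70 for three consecutive checks, so the rate drops from 0.001 to 0.0001 at the fifth check and stays there.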

4.1.5. Effect of the Number of Weak Classifiers on Classification Accuracy

To investigate the effect of the number of weak classifiers on the classification accuracy of the AdaBoost algorithm, experiments were conducted using one, three, five, and seven classifiers. The experimental results (Table 3, which compares classification accuracy for different numbers of weak classifiers in the model) indicated that classification accuracy was optimal with three classifiers; training time increased compared with using one classifier, but overfitting did not occur. As the number of classifiers increased further, classification accuracy decreased rather than improved, and the model began to overfit and consume more resources. Therefore, all subsequent experiments used three classifiers.

4.1.6. Effect of RNN Branch on Classification Accuracy

The proposed C-RNN-AdaBoost model enhances spectral feature extraction through the GRU gating mechanism, highlighting the key role of spectral information in rock classification. To verify its effectiveness, three typical dual-branch spatial-spectral fusion models (2D-CNN-BiRNN-AdaBoost, 2D-1D CNN-AdaBoost, and 2D-LSTM-AdaBoost) [29,44,45] were used for comparison, and the OA, AA, and Kappa coefficient of the proposed model were all higher (as shown in Table 4). All models were trained on 20% of the data with five-fold cross-validation. In addition, by simplifying the network structure and reducing the number of parameters, the C-RNN-AdaBoost model significantly reduces computing resources and time consumption while maintaining efficient classification.

Specifically, although the bidirectional recurrent neural network (BiRNN) enhances the ability to perceive context, it performs poorly in this classification task, exhibiting gradient anomalies and defects in long-range information processing. LSTM alleviates the vanishing-gradient problem through its gating mechanism, which significantly improves classification accuracy. The 1D-CNN branch has significant advantages in local feature extraction, and its classification performance is similar to that of the model proposed in this study. However, the 1D-CNN requires more trainable parameters, resulting in a large consumption of computing resources and time.

The proposed model adopts a simplified GRU structure that contains only two gates, reducing the number of parameters compared with the LSTM’s three gates and thereby lowering the computational complexity. The update gate mechanism of the GRU achieves an effective balance between long-term and short-term dependence, which further improves the classification accuracy. The classification maps of the four algorithms shown in Figure 8 indicate that the proposed model produced smoother rock classification results than the other comparative models. Furthermore, there were fewer misclassified pixels, which verifies the high efficiency and robustness of the proposed model in complex rock classification tasks.
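The parameter saving of the GRU relative to the LSTM can be made concrete with a short count. In the textbook formulation, a GRU layer has three weight blocks (update gate, reset gate, candidate state) and an LSTM layer has four (input, forget, and output gates plus the candidate cell state), each with input weights, recurrent weights, and one bias vector. The input and hidden sizes below are hypothetical, since they are not given in this section, and some frameworks keep two bias vectors per block, which changes the totals slightly but not the 3:4 ratio.

```python
def recurrent_params(input_size, hidden_size, n_blocks):
    """Parameters of one recurrent layer with n_blocks gate/candidate blocks."""
    # Each block: input weights + recurrent weights + one bias vector.
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return n_blocks * per_block

d, h = 1, 64                          # hypothetical: one value per band, 64 units
gru = recurrent_params(d, h, 3)       # update gate, reset gate, candidate state
lstm = recurrent_params(d, h, 4)      # three gates plus candidate cell state
ratio = gru / lstm
```

For these sizes the GRU layer has 12,672 parameters against 16,896 for the LSTM layer, i.e., a quarter fewer for the same hidden size.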

4.2. Comparison and Discussion of Experimental Results

To evaluate the effectiveness of the rock classification algorithm introduced in this study, several mainstream hyperspectral image classification algorithms (including 3D-CNN, Bi-CLSTM, SSRN, and DenseNet) were selected for comparative analysis [46,47,48,49]. All models use a five-fold cross-validation method to divide the training, validation, and test data. The proposed model showed the highest classification accuracy for the dataset containing 81 types of rock as shown in Table 5.

The 3D-CNN and SSRN are traditional single-branch cascade models based on 3D convolution. The SSRN outperforms the 3D-CNN because it introduces a residual structure with skip connections, which improves training stability and classification accuracy. Bi-CLSTM combines the advantages of bidirectional LSTM and convolutional LSTM to extract local features better, but its performance is degraded by the significant difference between the distributions of the training and test data. DenseNet achieved a classification accuracy second only to the proposed model by using a dense connection mechanism, which allows each layer to directly access the feature information of all preceding layers.

As shown in Figure 9, the proposed model (e) had the fewest misclassified pixels compared with (a) 3D-CNN, (b) Bi-CLSTM, (c) SSRN, and (d) DenseNet, especially in the transition regions at the rock edges.

4.3. Analysis and Discussion of Typical Samples of Metamorphic Rock

4.3.1. Identification of Typical and Easily Confused Metamorphic Rock

For spectrally similar lithological assemblages, including fine-grained marble (sample 49) and medium-crystalline marble (sample 50), epidote skarn (sample 56) and its garnet-bearing variant (sample 57), amphibole-dominant lithologies (samples 68 and 73), and migmatitic complexes like garnet schist and sillimanite schist (70/72), the model exhibited robust multi-scale feature extraction capabilities, achieving classification accuracies above 98.6% for all critical sample sets. Detailed classification metrics for these four representative groups are provided in Table 6, and the corresponding spectral profiles are visualized in Figure 10.

4.3.2. Class Identification of Typical Metamorphic Rock

The confusion matrix intuitively shows the predicted vs. actual results across categories. Confusion matrix heatmaps visualizing prediction results for five distinct models are displayed in Figure 11.

Cordierite hornfels and granites have similar spectral characteristics and highly overlapping mineral assemblages, which increases the classification difficulty:

①: Spectral curve similarity: Cordierite formation is intrinsically linked to biotite dehydration melting reactions. This petrogenetic process induces significant spectral overlap between cordierite-dominant hornfels and granite in the 2200–2400 nm range, where overlapping hydroxyl (OH⁻) and ferrous iron (Fe²⁺) absorption features create nearly identical reflectance curve morphologies. Additionally, these lithologies show comparable full width at half maximum (FWHM) values for their broad absorption features in the 1000–1300 nm region. Such spectral convergence hinders deep learning models from resolving diagnostic mineralogical signatures, leading to blurred classification boundaries and a significant increase in error rates.
②: Mineralogical composition overlap: Both rock types share common components such as quartz and mafic minerals (e.g., cordierite and biotite), leading to similar spectral responses due to Fe²⁺ and Al-OH absorption features. Quantitative analyses reveal over 20% error in mineral abundance estimation between them. Weathered secondary minerals (e.g., limonite and kaolinite) and mixed-pixel interference (>40% mixing probability at 30 m resolution) further confound the primary spectral-compositional correlation.

In the model developed in this study, the reduced brightness of the highlighted regions (indicated by the red box) indicates that most of the previously confused samples were correctly classified. This demonstrates the model’s higher OA and improved classification performance.

Nevertheless, the proposed model exhibits persisting limitations in specific lithological identifications. Notably, all benchmark models consistently misclassified sample 36 (pumice), with the 3D-CNN erroneously identifying its primary lithology as sample 13 (volcanic lava). This confusion arises from an 82% spectral overlap in low-frequency band trends between these lithotypes (Figure 10). The issue is exacerbated when models focus more on extracting spatial features—such as textural patterns, edge contours, or pixel distribution structures within the rock image—rather than prioritizing discriminative spectral signatures that could distinguish them.

Furthermore, both our model and comparative architectures partially mislabeled regions of sample 36 (pumice) as sample 59 (serpentinite). Spectral curve analysis revealed reflectance divergence of only 3.6% in high-frequency bands between these lithologies, highlighting the persistent challenge of differentiating subtle spectral variations below a 5% contrast threshold—a critical performance boundary in hyperspectral petrological classification.

Mineralogically, this challenge arises because pumice (sample 36), a porous glassy volcaniclastic rock, and serpentinite (sample 59), dominated by hydrous serpentine, share convergent spectral drivers: both contain hydroxyl (OH⁻) groups (from adsorbed water in pumice’s glassy matrix and serpentine’s crystal structure) causing overlapping near-infrared absorption. Their fine-grained textures further minimize reflectance contrasts, making subtle <5% differences indistinct to models.

5. Conclusions

In this study, 81 rock types were classified using a series of network models designed to fully exploit both the spatial and spectral information of rocks and minerals. Ultimately, a deep learning model combining a CNN and an RNN was proposed for the classification of hyperspectral rock images. The AdaBoost integration demonstrably enhanced model stability, reducing the standard deviation of overall accuracy (OA) from 32.85% in the base C-RNN model to 7.44% in the C-RNN-AdaBoost ensemble. By fully leveraging the multi-dimensional information in hyperspectral rock images, the model reduces the occurrence of “spectral homogeneity with material heterogeneity (SHMH)” and “material consistency with spectral divergence (MCSD)” phenomena, thereby significantly enhancing classification accuracy. By introducing the AdaBoost algorithm, the stability and generalization ability of the model were enhanced, especially in cases with few samples and diverse categories. This approach effectively reduces overfitting and improves discriminative power, thereby enhancing the classification accuracy across the 81 types of fine-grained rocks. However, the misclassification of spectrally highly similar samples, such as No. 36 (pumice) and No. 59 (serpentinite), still exposes the insufficient sensitivity of the model to subtle feature differences. Future work will focus on the following three aspects. First, we will introduce prior knowledge of mineral composition to construct a spectral decoupling network, enhancing the ability to distinguish spectrally similar categories. Second, we will adapt to real-time field detection requirements through dynamic pruning and quantization compression of model parameters. Third, we will integrate multi-modal data such as LiDAR and micro-imaging data to build a cross-scale rock analysis framework, further breaking through the spectral limitations of single data sources and promoting the development of geological interpretation toward multi-dimensional intelligence.

Author Contributions

Conceptualization was done by S.X., Y.Q. and S.C. S.C. was responsible for Methodology and Software. Validation was carried out by Y.Q. Formal analysis was done by W.W. Y.Q. and S.X. wrote the original draft, while W.W. and S.X. were responsible for writing—review and editing. Visualization was done by S.X. All authors edited the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 61701153 and No. 41402304) and Zhejiang Provincial Natural Science Foundation of China (LQ13D020002).

Data Availability Statement

Dataset details and downloads are available at https://uwrl.hznu.edu.cn/c/2023-02-24/2805144.shtml (accessed on 27 August 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Karimpouli, S.; Tahmasebi, P. Segmentation of digital rock images using deep convolutional autoencoder networks. Comput. Geosci. 2019, 126, 142–150.
2. Guo, W.; Dong, C.; Lin, C.; Wu, Y.; Zhang, X.; Liu, J. Rock physical modeling of tight sandstones based on digital rocks and reservoir porosity prediction from seismic data. Front. Earth Sci. 2022, 10, 932929.
3. Houshmand, N.; Goodfellow, S.; Esmaeili, K.; Calderón, J.C.O. Rock type classification based on petrophysical, geochemical, and core imaging data using machine and deep learning techniques. Appl. Comput. Geosci. 2022, 16, 100104.
4. Li, W.; Yin, G.; Yu, H.; Liu, X. The Yanshanian granites and associated Mo-polymetallic mineralization in the Xiangcheng-Luoji area of the Sanjiang-Yangtze conjunction zone in Southwest China. Acta Geol. Sin.—Engl. Ed. 2014, 88, 1742–1756.
5. Ferrero, S.; Bartoli, O.; Cesare, B.; Salvioli-Mariani, E.; Acosta-Vigil, A.; Cavallo, A.; Groppo, C.; Battiston, S. Microstructures of melt inclusions in anatectic metasedimentary rocks. J. Metamorph. Geol. 2012, 30, 303–322.
6. Xu, Y.; Ma, H.; Peng, S. Study on identification of altered rock in hyperspectral imagery using spectrum of field object. Ore Geol. Rev. 2014, 56, 584–595.
7. Zhao, J.; Wang, G.; Zhou, B.; Ying, J.; Liu, J. Exploring an application-oriented land-based hyperspectral target detection framework based on 3D–2D CNN and transfer learning. EURASIP J. Adv. Signal Process. 2024, 2024, 37.
8. Transon, J.; D’Andrimont, R.; Maugnard, A.; Defourny, P. Survey of Hyperspectral Earth Observation Applications from Space in the Sentinel-2 Context. Remote Sens. 2018, 10, 157.
9. Jakob, S.; Zimmermann, R.; Gloaguen, R. The need for accurate geometric and radiometric corrections of drone-borne hyperspectral data for mineral exploration: MEPHySTo—A toolbox for pre-processing drone-borne hyperspectral data. Remote Sens. 2017, 9, 88.
10. Paoletti, M.E.; Haut, J.M.; Plaza, J.; Plaza, A. Deep&Dense Convolutional Neural Network for Hyperspectral Image Classification. Remote Sens. 2018, 10, 1454.
11. Zhang, X.; Li, P. Lithological mapping from hyperspectral data by improved use of spectral angle mapper. Int. J. Appl. Earth Obs. Geoinf. 2014, 31, 95–109.
12. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
13. Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
14. Ghezelbash, R.; Maghsoudi, A.; Shamekhi, M.; Pradhan, B.; Daviran, M. Genetic algorithm to optimize the SVM and K-means algorithms for mapping of mineral prospectivity. Neural Comput. Appl. 2023, 35, 719–733.
15. Tessier, J.; Duchesne, C.; Bartolacci, G. A machine vision approach to on-line estimation of run-of-mine ore composition on conveyor belts. Miner. Eng. 2007, 20, 1129–1144.
16. Tripathi, P.; Garg, R.D. Potential of DESIS and PRISMA hyperspectral remote sensing data in rock classification and mineral identification: A case study for Banswara in Rajasthan, India. Environ. Monit. Assess. 2023, 195, 575.
17. Hughes, G. On the mean accuracy of statistical pattern recognizers. IEEE Trans. Inf. Theory 1968, 14, 55–63.
18. Kilickaya, S.; Ahishali, M.; Sohrab, F.; Ince, T.; Gabbouj, M. Hyperspectral image analysis with subspace learning-based one-class classification. In Proceedings of the 2023 Photonics & Electromagnetics Research Symposium (PIERS), Prague, Czech Republic, 3–6 July 2023; pp. 953–959.
19. Güler, E.; Kakız, M.T.; Günay, F.B.; Şanal, B.; Çavdar, T. Kapalı mekan ortamında 1D-CNN kullanarak yapılan doluluk tespiti sınıflandırması. Karadeniz Fen Bilim. Derg. 2023, 13, 60–71.
20. Yang, J.; Chang, B.; Zhang, Y.; Luo, W.; Ge, S.; Wu, M. CNN coal and rock recognition method based on hyperspectral data. Int. J. Coal Sci. Technol. 2022, 9, 63.
21. Zhang, C.; Yi, M.; Ye, F.; Xu, Q.; Li, X.; Gan, Q. Application and Evaluation of Deep Neural Networks for Airborne Hyperspectral Remote Sensing Mineral Mapping: A Case Study of the Baiyanghe Uranium Deposit in Northwestern Xinjiang, China. Remote Sens. 2022, 14, 5122.
22. Jiang, T.; Wang, X.J. Hyperspectral images classification based on fusion features derived from 1D and 2D convolutional neural network. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, 42, 335–341.
23. Sinaice, B.B.; Kawamura, Y.; Kim, J.; Okada, N.; Kitahara, I.; Jang, H. Application of deep learning approaches in igneous rock hyperspectral imaging. In Proceedings of the 28th International Symposium on Mine Planning and Equipment Selection—MPES 2019, Perth, Australia, 2–4 December 2019; Topal, E., Ed.; Springer: Cham, Switzerland, 2020; pp. 347–356.
24. Ge, W.; Cheng, Q.; Tang, Y.; Jing, L.; Gao, C. Lithological classification using Sentinel-2A data in the Shibanjing ophiolite complex in Inner Mongolia, China. Remote Sens. 2018, 10, 638.
25. Bozga, M.; Maler, O.; Tripakis, S. Efficient verification of timed automata using dense and discrete time semantics. In Correct Hardware Design and Verification Methods; Pierre, L., Kropf, T., Eds.; Springer: Berlin/Heidelberg, Germany, 1999; pp. 125–141.
26. Dang, L.; Pang, P.; Zuo, X.; Liu, Y.; Lee, J. A Dual-Path Small Convolution Network for Hyperspectral Image Classification. Remote Sens. 2021, 13, 3411.
27. Hussain, M.; Bird, J.J.; Faria, D.R. A study on CNN transfer learning for image classification. In Advances in Computational Intelligence Systems: UKCI 2018; Lotfi, A., Bouchachia, H., Gegov, A., Langensiepen, C., McGinnity, M., Eds.; Springer: Cham, Switzerland, 2019; Volume 84, pp. 199–212.
28. Zhao, H.; Deng, K.; Li, N.; Wang, Z.; Wei, W. Hierarchical Spatial-Spectral Feature Extraction with Long Short Term Memory (LSTM) for Mineral Identification Using Hyperspectral Imagery. Sensors 2020, 20, 6854.
29. Agrawal, N.; Govil, H. A deep residual convolutional neural network for mineral classification. Adv. Space Res. 2023, 71, 3186–3202.
30. Liu, Y.; Wu, X.; Teng, Q.; He, H. Mineral identification of rock thin section images based on improved SKnet and Bi-GRU. Intell. Comput. Appl. 2023, 13, 104–111.
31. Okada, N.; Maekawa, Y.; Owada, N.; Haga, K.; Shibayama, A.; Kawamura, Y. Automated Identification of Mineral Types and Grain Size Using Hyperspectral Imaging and Deep Learning for Mineral Processing. Minerals 2020, 10, 809.
32. Galdames, F.J.; Perez, C.A.; Estévez, P.A.; Adams, M. Rock lithological instance classification by hyperspectral images using dimensionality reduction and deep learning. Chemom. Intell. Lab. Syst. 2022, 224, 104538.
33. Li, D.; Zhao, J.; Ma, J. Experimental studies on rock thin-section image classification by deep learning-based approaches. Mathematics 2022, 10, 2317.
34. Zhang, Z.; Zheng, C.; Liang, C.; Santosh, M.; Hao, J.; Dong, L.; Hou, J.; Hou, F.; Li, M. Metamorphism and P-T Evolution of High-Pressure Granulites from the Fuping Complex, North China Craton. Minerals 2024, 14, 138.
35. Barbey, P.; Marignac, C.; Montel, J.M.; Macaudière, J.; Gasquet, D.; Jabbori, J. Cordierite growth textures and the conditions of genesis and emplacement of crustal granitic magmas: The Velay granite complex (Massif Central, France). J. Petrol. 1999, 40, 1425–1441.
36. Regmi, K.R. Petrogenesis of the augen gneisses from Mahesh Khola section, Central Nepal. Bull. Dep. Geol. 2008, 11, 13–22.
37. Karimzadeh, Z.; Tangestani, M.H. Application of WorldView-3 data in alteration mineral mapping in Chadormalu area, central Iran. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 589–596.
38. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1989, 1, 541–551.
39. Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent neural network regularization. arXiv 2014.
40. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
41. Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600.
42. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014.
43. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
44. Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
45. Alrebdi, N.; Al-Shargabi, A.A. Emotional state prediction based on EEG signals using ensemble methods. Int. J. Adv. Comput. Sci. Appl. 2023, 14, 123.
46. Li, Y.; Zhang, H.; Shen, Q. Spectral–Spatial Classification of Hyperspectral Imagery with 3D Convolutional Neural Network. Remote Sens. 2017, 9, 67.
47. Liu, Q.; Zhou, F.; Hang, R.; Yuan, X. Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification. Remote Sens. 2017, 9, 1330.
48. Zhong, Z.; Li, J.; Luo, Z.; Chapman, M. Spectral–spatial residual network for hyperspectral image classification: A 3-D deep learning framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 847–858.
49. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.

Figure 1. HySpex hyperspectral sensor.

Figure 2. Dataset HSRD-1.0. (a) Classification of 81 rock samples, (b) false color composite image of samples, and (c) true labels after erosion.

Figure 3. Spectral curves of 81 rock types.

Figure 4. Deep integration network diagram.

Figure 5. Two-dimensional convolutional recurrent neural network module.

Figure 6. Schematic diagram of the AdaBoost algorithm.

Figure 7. Comparison of overall classification accuracy between two models with different training rounds.

Figure 8. Classification renderings of models under different RNN branches on the dataset containing 81 types of rock: (a) 2D-BRNN-AdaBoost, (b) 2D-LSTM-AdaBoost, (c) 2D-1D CNN-AdaBoost, and (d) proposed model.

Figure 9. Classification renderings of different models on the dataset containing 81 types of rock: (a) 3D-CNN, (b) Bi-CLSTM, (c) SSRN, (d) DenseNet, (e) proposed model, and (f) real label.

Figure 10. Spectral characteristics of metamorphic subtypes: (a) marble (fine-grained; No. 49, green) vs. marble (medium-grained; No. 50, blue); (b) epidote skarn (No. 56, orange) vs. garnet-epidote sillimanite rock (No. 57, yellow); (c) amphibole schist (No. 68, blue) vs. plagioclase gneiss (No. 73, red); (d) banded migmatite (No. 78, purple) vs. sillimanite schist (No. 72, black); (e) pumice (No. 36, red) vs. serpentine (No. 59, green); and (f) volcanic lava (No. 13, blue) vs. pumice (No. 36, red).

Figure 11. Confusion matrix heat maps for different models: (a) 3D CNN, (b) Bi-CLSTM, (c) SSRN, (d) DenseNet, and (e) proposed model.

Table 1. Class labels and pixel sample counts for the 81 rock types.

Rock Name	Class	Samples	Rock Name	Class	Samples
Peridotite	1	4986	Pseudoleucite Phonolite	41	5014
Pyroxenite	2	5519	Carbonatite	42	4576
Komatiite	3	5740	Pegmatite	43	6018
Amphibolite	4	4837	Gabbroic Pegmatite	44	4631
Kimberlite	5	6548	Lamprophyre	45	5078
Anorthosite	6	5404	Aplite	46	5045
Gabbro	7	5472	Quartzite	47	4175
Diabase	8	4166	Banded Magnetite Quartzite	48	4722
Basalt	9	3747	Marble (Fine-grained)	49	2771
Vesicular Basalt	10	4296	Marble (Medium-grained)	50	5699
Amygdaloidal Basalt	11	2906	Red Marble	51	5551
Volcanic Bomb	12	4739	Andalusite Hornfels	52	3969
Volcanic Lava	13	2945	Biotite Hornfels	53	4983
Diorite	14	6163	Cordierite Hornfels	54	3479
Diorite Porphyry	15	4923	Garnet Skarn	55	4842
Andesite	16	4238	Epidote Skarn	56	4256
Quartz Diorite	17	4251	Garnet-Epidote Skarn	57	3819
Granodiorite	18	5359	Greisen	58	4183
Trachyte	19	5239	Serpentinite	59	3221
Latite	20	4717	Mylonite	60	3670
Pyroxene-Quartz Syenite Porphyry	21	4492	Phyllite	61	3120
Orthoclase	22	5110	Eclogite	62	4857
Syenite Porphyry	23	3914	Gray Slate	63	4485
Granite	24	5384	Black Slate	64	4931
Aplite	25	5126	Chlorite Schist	65	5440
Monzogranite	26	3477	Talc Schist	66	4998
Porphyritic Granite	27	4656	Muscovite Quartz Schist	67	5207
Potassic Granite	28	4782	Amphibole Schist	68	4012
Graphic Granite	29	5186	Kyanite Schist	69	4664
Rhyolite	30	4263	Pyroxene Amphibolite	70	4660
Spherulitic Rhyolite	31	4529	Staurolite Schist	71	4192
Felsite	32	3906	Sillimanite Schist	72	4604
Obsidian	33	4107	Plagioclase Amphibole Schist	73	5170
Pitchstone	34	4035	Granitic Gneiss	74	5596
Perlite	35	3009	Biotite Gneiss	75	5370
Pumice	36	5155	Garnet Granulite	76	5789
Alaskite	37	4365	Leptynite	77	6215
Ijolite	38	4118	Banded Migmatite	78	6082
Nepheline Syenite	39	4892	Ptygmatic Migmatite	79	4715
Melilite Phonolite	40	3648	Augen Migmatite	80	4781
			Mixed Granite	81	4638
Total			377,577

Table 2. Results of classification of 81 types of rocks by the AdaBoost and C-RNN algorithms.

Model	OA (%)	Kappa × 100
AdaBoost	53.837 ± 1.461	53.2 ± 1.5
C-RNN	81.493 ± 32.850	81.2 ± 33.3
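
The "mean ± spread" entries in the tables can be reproduced from per-fold scores. The sketch below assumes that the ± values are standard deviations over the five cross-validation folds, which the tables do not state explicitly. The fold values in `fold_oa` are hypothetical and are not the paper's results; they are chosen only to show how a single weak fold inflates the reported spread.

```python
import numpy as np

# Hypothetical per-fold overall accuracy (percent) for five folds.
# These numbers are illustrative only and are NOT taken from the paper.
fold_oa = np.array([95.0, 96.0, 94.0, 97.0, 30.0])

mean_oa = fold_oa.mean()

# Two common conventions for the spread; which one a paper uses is
# often unstated, so both are shown here.
std_population = fold_oa.std(ddof=0)  # divide by n
std_sample = fold_oa.std(ddof=1)      # divide by n - 1

print(f"OA = {mean_oa:.3f} ± {std_population:.3f} (population std)")
print(f"OA = {mean_oa:.3f} ± {std_sample:.3f} (sample std)")
```

With only five folds the two conventions differ noticeably, so it helps to report which one was used alongside the per-fold values.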

Table 3. Comparison of classification accuracy for different numbers of weak classifiers in the model.

Number of Classifiers	OA (%)	AA (%)	Kappa × 100	Average Training Duration (Seconds)
1	88.513 ± 5.323	88.348 ± 4.993	88.40 ± 5.1	1838.816
3	92.554 ± 7.442	92.250 ± 7.861	92.50 ± 7.5	4374.326
5	91.927 ± 4.298	92.016 ± 5.134	92.10 ± 4.3	6402.439
7	89.365 ± 5.611	89.131 ± 5.309	89.20 ± 5.2	7129.034

Table 4. Comparison of classification accuracy for different RNN branches in the model.

Model	OA (%)	AA (%)	Kappa × 100	Parameter Quantity
2D-BRNN-AdaBoost	83.914 ± 1.579	83.350 ± 1.700	83.70 ± 1.6	16,342,941
2D-LSTM-AdaBoost	89.796 ± 2.842	89.298 ± 3.194	89.70 ± 2.9	8,229,085
2D-1D CNN-AdaBoost	91.118 ± 5.227	90.670 ± 5.461	91.10 ± 5.3	36,670,263
Proposed CRNN-AdaBoost	92.554 ± 7.442	92.250 ± 7.861	92.50 ± 7.5	1,012,632

Table 5. Classification results of different models on the dataset containing 81 types of rock.

Model	OA (%)	AA (%)	Kappa × 100
3D CNN	87.129 ± 6.439	88.235 ± 6.873	87.10 ± 6.5
Bi-CLSTM	89.476 ± 3.316	89.038 ± 3.128	88.20 ± 3.0
SSRN	91.012 ± 4.734	91.989 ± 4.325	90.90 ± 4.9
DenseNet	90.301 ± 3.213	90.991 ± 3.052	90.80 ± 3.1
Proposed CRNN-AdaBoost	92.554 ± 7.442	92.250 ± 7.861	92.50 ± 7.5
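
The OA, AA, and Kappa × 100 columns in the tables above can all be derived from a confusion matrix. The following is a minimal sketch using the standard definitions of overall accuracy, average (per-class) accuracy, and Cohen's kappa; it is not the authors' code, and the function name `classification_metrics` and the 3-class toy matrix are hypothetical.

```python
import numpy as np

def classification_metrics(conf):
    """Return (OA, AA, kappa), each scaled to 0-100.

    conf[i, j] is the number of samples of true class i
    that were predicted as class j.
    """
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    row_sums = conf.sum(axis=1)  # samples per true class
    col_sums = conf.sum(axis=0)  # samples per predicted class

    # Overall accuracy: fraction of all samples on the diagonal.
    oa = np.trace(conf) / total

    # Average accuracy: mean of per-class recall, skipping empty classes.
    valid = row_sums > 0
    aa = np.mean(np.diag(conf)[valid] / row_sums[valid])

    # Cohen's kappa: agreement corrected for chance agreement p_e.
    p_e = np.sum(row_sums * col_sums) / total**2
    kappa = (oa - p_e) / (1.0 - p_e)

    return 100 * oa, 100 * aa, 100 * kappa

# Hypothetical 3-class confusion matrix (not data from the paper).
toy = [[50, 2, 0],
       [3, 40, 5],
       [0, 4, 46]]
oa, aa, kappa = classification_metrics(toy)
print(f"OA = {oa:.3f}%, AA = {aa:.3f}%, Kappa x 100 = {kappa:.2f}")
```

Because AA weights every class equally while OA weights every pixel equally, the two can diverge when class sizes differ, as they do across the 81 classes in Table 1.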

Table 6. Classification accuracy of metamorphic subtypes using the CRNN-AdaBoost model.

Class Pair	Classification Accuracy
49/50	99.5%
56/57	99.4%
68/73	98.6%
70/72	99.8%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
