A Hybrid AI Framework for Integrated Predictive Maintenance and Mineral Quality Assessment in Mining

Mwale, Wanji; Liu, Zhixiang; Chipusu, Kavimbi

doi:10.3390/app152212222


by Wanji Mwale ¹, Zhixiang Liu ¹ and Kavimbi Chipusu ²,*

¹ Department of Resources and Safety Engineering, Central South University, Yuelu District, Changsha 410083, China
² Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(22), 12222; https://doi.org/10.3390/app152212222

Submission received: 18 September 2025 / Revised: 29 October 2025 / Accepted: 30 October 2025 / Published: 18 November 2025


Abstract

In the mining industry, operational efficiency, equipment reliability, and mineral quality assessment are paramount for cost-effective and sustainable production. Traditional approaches often address equipment maintenance and quality control as separate challenges, leading to suboptimal operational synergy. This paper proposes a novel artificial intelligence (AI) framework that integrates predictive maintenance with real-time mineral quality assessment through advanced sensor fusion and deep learning. Our model leverages a hybrid architecture, combining Convolutional Neural Networks (CNNs) for analyzing visual and spectral data of iron ore with Long Short-Term Memory (LSTM) networks for processing temporal sensor data (vibration, thermal, acoustic) from critical equipment like crushers and conveyors. A dedicated fusion layer synthesizes these spatial and temporal features to simultaneously predict equipment failure probability and classify mineral quality. Validated on a real-world dataset from active iron ore mines, the system demonstrates a significant 20–30% reduction in projected maintenance downtime and a 15% improvement in mineral classification accuracy compared to baseline models while achieving real-time inference speeds of less than 10 milliseconds. This work underscores the transformative potential of unified AI-driven systems in enhancing the intelligence, resilience, and productivity of modern mining operations.

Keywords: predictive maintenance; sensor fusion; deep learning; computer vision; spectral imaging; equipment reliability

1. Introduction

Mining companies face significant challenges due to the high costs of machinery downtime and the complexity of mineral quality assessment [1,2]. Equipment breakdowns can halt operations for hours or even days, leading to financial losses and safety concerns. Similarly, accurately determining the quality of extracted minerals is essential for maximizing yield and minimizing waste. Recent advances in AI and sensor technology present opportunities to improve both predictive maintenance and mineral quality assessment in mining [3,4,5]. Predictive maintenance leverages sensor data to forecast equipment failures, enabling timely intervention and preventing unexpected breakdowns [6,7]. Deep learning models trained on historical sensor data have shown promising results in predicting failures with high accuracy. Meanwhile, mineral quality assessment traditionally relies on slow and often destructive chemical analyses. However, AI-driven computer vision and spectral analysis techniques can offer real-time, non-destructive alternatives that improve accuracy and efficiency.

Predictive maintenance represents a forward-thinking approach in machinery management, aiming to foresee and prevent potential equipment failures before they occur [8,9]. By anticipating issues, predictive maintenance enables more efficient maintenance schedules, reduces unexpected downtimes, and improves overall operational productivity. The foundation of this approach lies in various techniques that monitor the health of machinery components. One such technique is vibration analysis [10,11], which involves measuring the vibrations emitted by equipment during operation. Deviations in vibration patterns can reveal early signs of mechanical issues, such as imbalances, misalignments, or worn bearings [12,13]. Another essential technique is thermal imaging, which captures the heat signatures of machinery components. Abnormal heat patterns can indicate issues like friction buildup, misalignment, or insufficient lubrication—common precursors to mechanical failure. Acoustic emission monitoring is also widely used, focusing on the sounds generated by machines. Changes in sound signatures, such as increased or erratic noise levels, can indicate wear, cracks, or other structural issues that may lead to future breakdowns. Oil analysis offers insight into internal conditions by monitoring the state of lubricants. Contaminants or particles found in the lubricant may signal wear in critical components, such as gears or bearings, before they impact performance. Figure 1 outlines how AI and sensor technologies improve predictive maintenance and mineral quality assessment, integrating multiple sensor inputs through sensor fusion for enhanced accuracy.

In mining, mineral quality assessment is critical for distinguishing valuable ore from waste materials [14], optimizing the efficiency of processing, and ensuring the final product meets quality standards. Accurate assessment helps mining companies maximize yield while minimizing waste, leading to more cost-effective and environmentally sustainable operations. Two primary techniques that aid in mineral quality assessment are spectral imaging and computer vision. Spectral imaging utilizes spectral reflectance data, where each mineral reflects light at specific wavelengths unique to its composition. By analyzing these unique spectral signatures [15], spectral imaging can classify minerals in real time, identifying and differentiating ore from waste. This non-destructive, data-rich method is particularly advantageous in mining because it provides rapid and reliable information on mineral composition and distribution. Computer vision is another essential tool in mineral quality assessment, using advanced algorithms to recognize and classify minerals based on observable characteristics, such as color, texture, and morphological features. This technique, often combined with real-time image capture and analysis, allows mining operations to monitor the quality of extracted minerals continuously. When applied to automated sorting systems, computer vision can efficiently direct ore to appropriate processing lines based on quality assessments, enhancing processing accuracy. Figure 2 outlines how computer vision is used for mineral classification and automated sorting in mining operations.

Deep learning [16,17], especially CNNs, significantly enhances these techniques. CNNs excel in image recognition and can learn complex patterns from spectral and visual data, thereby improving mineral classification accuracy [18]. By training CNNs on large datasets of labeled mineral samples, the system can quickly and accurately recognize ore qualities, allowing for high precision in real-time quality assessment. This AI-driven approach minimizes errors and accelerates the decision-making process, making mineral quality assessment both more accurate and efficient. Combining multiple sensor inputs through sensor fusion further enhances predictive accuracy by compensating for the limitations of any single sensor. In predictive maintenance and mineral quality assessment, each sensor provides valuable but incomplete information; for instance, thermal sensors detect heat changes, while vibration sensors capture movement irregularities. However, individually, these sensors may miss specific conditions or provide only partial insights. By integrating data from multiple sources, sensor fusion allows for a comprehensive and reliable understanding of both equipment health and mineral composition. Sensor fusion, particularly when integrated with deep learning, provides a holistic view that allows for more nuanced decision-making. For example, in mineral assessment, combining spectral data with computer vision inputs yields richer, multi-dimensional information that improves mineral classification accuracy. Similarly, in predictive maintenance, fusing thermal, acoustic, and vibrational data enables a more thorough assessment of machinery conditions, capturing a wider range of potential issues before they lead to breakdowns. AI-based models, especially those combining convolutional neural networks for image data and recurrent neural networks for time-series sensor data [19,20,21], can process and integrate fused sensor data to make accurate predictions. 
This robust, multi-sensor approach enhances the precision and reliability of both predictive maintenance and mineral quality assessment, making it an invaluable strategy in modern mining operations.

The fusion of these heterogeneous data streams is a non-trivial challenge. While sophisticated attention mechanisms have been proposed in other domains to dynamically weight the importance of different sensor modalities, their application in the resource-constrained, high-frequency environment of mining processing lines is challenging. These methods, though powerful, introduce significant computational overhead. Therefore, in this initial implementation, we employ a robust, static weighted fusion (Equation (10)) as a computationally efficient baseline that effectively combines features. We acknowledge that this is a simplification compared to dynamic attention models and explicitly discuss the potential of attention-based fusion as a key direction for future work in Section 4.

To demonstrate the practical applicability of this approach, this study focuses on two critical types of mining equipment: cone crushers and belt conveyors. For crushers, the relevant failure modes include imbalanced rotors, bearing wear, and liner wear, which are effectively captured by vibration and thermal sensors. For conveyors, the focus is on belt misalignment and roller bearing failures, monitored via vibration and acoustic sensors. Concurrently, for mineral quality assessment, we concentrate on iron ore, where quality is defined by Fe (iron) content and the presence of silica (SiO₂) as the primary gangue material. This targeted approach allows for a concrete evaluation of the proposed framework’s capabilities. Furthermore, our work builds upon and extends recent advancements in AI-driven ore characterization, such as those utilizing LiDAR and machine learning for iron ore evaluation [22], by integrating quality assessment directly with equipment health monitoring within a unified deep learning architecture. While CNN-LSTM architectures are established in deep learning, the novel contribution of this work lies in the tailored fusion of multi-modal, heterogeneous sensor data—including vibration, thermal, acoustic, visual, and spectral inputs—to simultaneously solve two critical, interconnected problems in mining within a single, efficient framework.

Recent works have demonstrated the potential of deep learning in mining applications. For instance, conveyor-scale ore assessment has been advanced using LiDAR and machine learning for iron ore evaluation [22], while other studies have employed standalone CNNs for rock type classification on conveyor belts. In the domain of predictive maintenance, deep learning models analyzing vibration data for crusher bearing failure prediction and thermal anomalies in conveyor systems have been proposed. However, these systems typically operate in isolation, addressing either quality or maintenance but not both. Furthermore, while multimodal temporal vision systems are emerging in other harsh environments like manufacturing, their application in mining remains limited. This creates a critical gap for an integrated system that can leverage the synergies between equipment health and material quality data in real time. Unlike previous approaches that address predictive maintenance and mineral quality assessment in isolation, our integrated system provides a holistic operational intelligence capability specifically designed for the harsh and variable conditions of mining environments.

2. Materials and Methods

The proposed methodology leverages a hybrid deep learning framework combining CNNs and Long Short-Term Memory (LSTM) networks to enhance predictive maintenance and mineral quality assessment in mining. Sensor data from various sources—including vibration, thermal, acoustic, and spectral data—are processed through a structured architecture, where CNNs analyze visual and spectral information for mineral classification, and LSTMs process time-series data for maintenance predictions. After initial preprocessing and feature extraction, a fusion layer combines outputs from both CNN and LSTM modules [20,23], creating a unified feature representation. Fully connected layers refine these features, with the final output layer providing actionable insights: predicting equipment failure probabilities for timely maintenance and delivering real-time mineral quality classifications. This integrated model, trained on historical data, offers accurate, non-destructive alternatives to traditional methods, optimizing both maintenance scheduling and mineral processing.

2.1. Model Architecture of a Hybrid Deep Learning Framework

Our proposed architecture combines CNNs for image analysis with Long Short-Term Memory (LSTM) networks for temporal sensor data [24,25,26], so that a single CNN-LSTM model can process both visual and time-series inputs. The model serves two primary functions: (1) it uses sensor data to predict potential equipment failures, and (2) it analyzes visual and spectral data to classify mineral quality. The first layer of the model receives raw data from multiple sensors: vibration, thermal, acoustic, and visual/spectral. Vibration data provide frequency and amplitude patterns that indicate mechanical wear or imbalance. Thermal data capture temperature variations that point to friction, misalignment, or overheating. Acoustic data capture sound emissions that reveal wear, fractures, or other structural issues. Visual/spectral data comprise images and spectra of mineral samples, which reflect their chemical composition and structural properties. The sensor input layer organizes these streams and routes each type to the appropriate preprocessing module. In the preprocessing layer, raw data are normalized and structured in a consistent format suitable for deep learning. Normalization keeps the inputs comparable when sensor ranges differ, such as temperature versus acoustic emissions. Feature extraction then focuses on key metrics: vibrational frequencies and amplitudes from vibration data, temperature gradients from thermal data, amplitude and frequency anomalies from acoustic data, and spectral peaks and morphological features from visual/spectral data for mineral classification.
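The normalization and vibration feature extraction described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the function names, the choice of z-score scaling, and the naive discrete Fourier transform are assumptions made for illustration, and a production system would normally use an optimized FFT library instead.

```python
import cmath
import math
import statistics


def zscore(values):
    """Scale a signal to zero mean and unit variance (one assumed normalization choice)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def dominant_frequency(signal, sample_rate_hz):
    """Return (frequency_hz, amplitude) of the strongest non-DC component.

    Uses a naive O(n^2) DFT, which is only practical for short windows.
    """
    n = len(signal)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):  # skip k = 0 so a constant offset is ignored
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        mag = abs(coeff)
        if mag > best_mag:
            best_k, best_mag = k, mag
    # For a sinusoid that falls exactly on a bin, 2|X_k|/n recovers its amplitude.
    return best_k * sample_rate_hz / n, 2.0 * best_mag / n
```

Applied to a short vibration window, `dominant_frequency` returns the kind of frequency and amplitude pair the text refers to, while `zscore` puts channels with very different physical ranges on a common scale before they reach the network.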

The feature extraction layer bifurcates into two specialized modules, a CNN module and an LSTM module, which process visual and temporal data, respectively. The CNN module processes spectral and visual data to identify mineral characteristics. Its convolutional layers extract relevant features from edges, textures, and spectral signatures, and its pooling layers reduce spatial dimensions while preserving essential information. Flattening converts the pooled feature maps into a one-dimensional vector that feeds the fully connected layers. The LSTM module processes time-series data, such as vibration and acoustic data, to identify patterns over time. Each LSTM cell processes sequential sensor data to identify trends and anomalies indicative of potential equipment failures. Time steps are configured to analyze sensor data at different intervals, capturing both immediate and evolving patterns for predictive maintenance. Figure 3 illustrates the architecture of the CNN-LSTM model used for predictive maintenance and mineral quality assessment.

The fusion layer integrates the outputs from the CNN and LSTM modules to form a comprehensive representation. It performs a weighted combination of CNN-extracted spatial features and LSTM-derived temporal patterns. This is achieved through concatenation, which merges the CNN and LSTM outputs into a single feature vector, followed by feature scaling, which standardizes feature dimensions and weights for a balanced representation. In the fully connected layers, dense layers refine the integrated feature representation and derive final insights. Dense layer 1 reduces dimensionality to isolate the most critical features from the fused representation, while dense layer 2 further aggregates the remaining features in preparation for the classification and prediction tasks. Each dense layer is followed by a ReLU (Rectified Linear Unit) activation function to introduce non-linearity, which is essential for complex feature learning [27,28]. The output layer provides predictions for the two primary tasks: predictive maintenance and mineral quality assessment. The maintenance prediction output is a probability score for equipment failure [29,30], allowing for timely maintenance interventions. The mineral quality classification output is a set of class scores for the mineral sample, supporting on-site decisions for sorting and processing. Both outputs are expressed as probabilities: a sigmoid function yields the failure probability (Equation (7)) and a softmax function yields the mineral class probabilities (Equation (9)).

Data collection spanned six months across two active iron ore mines in Western Australia, capturing diverse operational regimes including normal production, maintenance periods, and equipment ramp-up/down cycles.
Sensor data were acquired using PCB Piezotronics 352C33 accelerometers (10 kHz sampling rate; PCB Piezotronics Inc., Depew, New York, USA) for vibration monitoring, Teledyne FLIR A500 infrared cameras (30 Hz frame rate; temperature range −20 °C to 550 °C; Teledyne FLIR LLC, Wilsonville, Oregon, USA) for thermal imaging, and GRAS 46AE acoustic sensors (50 Hz–20 kHz frequency range; GRAS Sound & Vibration, Holte, Denmark) for sound emission analysis. Mineral images were captured using a Specim FX10 multispectral camera system (Specim, Spectral Imaging Ltd., Oulu, Finland) under consistent studio lighting (5000 K LED arrays) at 0.5 mm per-pixel resolution. Ground-truth labels for equipment failures were derived from maintenance work orders verified by on-site technicians, while mineral quality classifications were established through laboratory X-ray fluorescence (XRF) analysis. The dataset maintained class distributions of approximately 36% High Grade, 44% Medium Grade, and 20% Low Grade/Waste across the 5,000 collected samples.

2.2. Training Process

The process begins with data collection, aggregating historical sensor data to train the LSTM module and gathering labeled mineral images for CNN training [31,32,33]. To enhance model robustness, data augmentation techniques such as rotation, scaling, and color adjustments are applied to the image dataset. During training, the CNN and LSTM modules are initially trained independently on their respective datasets. Following this, a fusion layer integrates the outputs from both modules, enabling joint optimization through backpropagation in the fully connected and output layers. Separate loss functions are employed: binary cross-entropy for maintenance predictions and categorical cross-entropy for mineral classification. An Adam optimizer is used to iteratively minimize error rates and refine model parameters [34], improving overall prediction accuracy across training epochs. Table 1 outlines the configuration, illustrating the integration of CNNs for image data, LSTMs for time-series data, and the fusion layer supporting the dual-output structure for predictive maintenance and mineral classification. Hyperparameter optimization was conducted through a combination of grid search and Bayesian optimization, targeting optimal performance on the validation set. The dataset was partitioned into 70% for training, 15% for validation, and 15% for testing, with a fixed random seed (42) ensuring reproducible splits across experiments. To ensure statistical reliability, the model was trained over five independent runs with different weight initializations, and we report the mean performance metrics along with standard deviations. All experiments were conducted on an NVIDIA Tesla V100 GPU with 32 GB VRAM, with training completion determined by early stopping based on validation loss plateauing over 10 consecutive epochs.
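The split, loss, and stopping rule described in this subsection can be sketched as follows. This is an illustrative outline under stated assumptions rather than the authors' training code: the helper names are hypothetical, the two losses are simply summed with equal weight (the paper does not state how they are combined), and "plateauing" is interpreted as no improvement in validation loss for 10 consecutive epochs.

```python
import math
import random


def split_indices(n_samples, seed=42, train_pct=70, val_pct=15):
    """Shuffle sample indices with a fixed seed and cut them 70/15/15."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


def joint_loss(p_fail, y_fail, p_class, y_class, eps=1e-12):
    """Binary cross-entropy (maintenance) plus categorical cross-entropy (mineral class).

    Equal weighting of the two terms is an assumption of this sketch.
    """
    bce = -(y_fail * math.log(p_fail + eps)
            + (1 - y_fail) * math.log(1.0 - p_fail + eps))
    cce = -math.log(p_class[y_class] + eps)
    return bce + cce


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience
```

Using integer percentages for the split avoids floating-point truncation when computing the partition sizes, and seeding a dedicated `random.Random` instance keeps the split reproducible without touching the global random state.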

The specific values of hyperparameters in the model were selected through a systematic process of experimentation and validation [35,36]. Various factors such as model complexity, computational resources, and performance metrics were considered to arrive at these values, which are detailed in Table 2. Additionally, sensitivity analyses were conducted to assess the impact of different hyperparameter settings on model performance and stability in engineering design applications.

After processing the sensor data (vibration, thermal, and acoustic) through its dedicated branch and the image data (mineral samples) through a separate branch, the features from both modalities are merged into a single tensor using the concatenate layer, as shown in Figure 4. This fusion combines information from both sensor data and image data. The unified representation is then passed through two dense layers (dense_1 and dense_2), which further refine the data and extract higher-level patterns. Finally, two output layers are produced: the maintenance output, which generates a scalar value indicating the predicted maintenance requirement, and the mineral class output, which predicts the class of the mineral sample, categorizing it into one of 10 possible classes.

2.3. Mathematical Formulation

The mathematical formulation provides a structured framework for integrating predictive maintenance and mineral quality classification into a cohesive system. By leveraging temporal sensor data and visual inputs, the approach ensures comprehensive decision-making for equipment reliability and mineral assessment. Each component of the formulation is tailored to address specific tasks, with sensor fusion serving as the bridge that combines insights from both predictive maintenance and classification modules. The following sections outline the underlying equations that drive this integrated approach. Predictive maintenance relies on analyzing sequential sensor data from vibration [37,38,39], thermal, and acoustic sources to estimate the likelihood of equipment failure. The LSTM processes input sequences $X = \{x_1, x_2, \dots, x_n\}$ using its internal mechanisms, expressed as follows:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{1}$$

$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{2}$$

$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{3}$$

$$\tilde{C}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{4}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{5}$$

$$h_t = o_t \odot \tanh(C_t) \tag{6}$$

where $i_t$ is the input gate activation, $f_t$ the forget gate activation, $o_t$ the output gate activation, $\tilde{C}_t$ the cell candidate, $C_t$ the updated cell state, and $h_t$ the hidden state at time $t$. $\sigma$ is the sigmoid activation function; $\odot$ is element-wise multiplication; $W_{x*}$, $W_{h*}$ and $b_{*}$ are trainable weights and biases. The failure probability $P(\text{failure} \mid X)$ is computed using the final hidden state $h_t$, that is, the hidden state at the last time step $t = n$:

$$P(\text{failure} \mid X) = \sigma(W_f \cdot h_t + b_f) \tag{7}$$

where $W_f$ and $b_f$ are the weight and bias specific to the prediction layer (they are separate from the forget-gate parameters in Equation (2), which share the subscript $f$), and $\sigma$ ensures outputs are bounded between 0 and 1.

For spectral and visual data, we use a CNN to classify mineral quality. Given an image input $I$, the CNN applies convolutional layers to extract spatial features:

$$F_{i,j} = \sigma(W * I_{i,j} + b) \tag{8}$$

where $W$ is the convolutional filter, $*$ the convolution operation, $I_{i,j}$ the image neighborhood at pixel location $(i, j)$, and $F_{i,j}$ the resulting feature-map value. The CNN output is then processed by fully connected layers to assign the mineral to one of the quality classes:

$$P(\text{Quality}) = \mathrm{Softmax}(W_{out} \cdot F + b_{out}) \tag{9}$$

where $W_{out}$ and $b_{out}$ are weights and biases for the output layer.
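To make the recurrent part of the formulation concrete, Equations (1) to (7) can be sketched in plain Python. This is a didactic illustration only: the list-based matrices, the `params` dictionary layout, and the names `w_out` and `b_out` for the prediction-layer parameters are assumptions of this sketch, and a real implementation would rely on a deep learning framework.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W_x, x, W_h, h, b):
    """Return W_x x + W_h h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * v for w, v in zip(W_x[r], x))
        + sum(w * v for w, v in zip(W_h[r], h))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps "i", "f", "o", "c" to (W_x, W_h, b)."""
    pre = {}
    for gate in ("i", "f", "o", "c"):
        W_x, W_h, b = params[gate]
        pre[gate] = affine(W_x, x_t, W_h, h_prev, b)
    i_t = [sigmoid(z) for z in pre["i"]]        # Eq. (1): input gate
    f_t = [sigmoid(z) for z in pre["f"]]        # Eq. (2): forget gate
    o_t = [sigmoid(z) for z in pre["o"]]        # Eq. (3): output gate
    c_cand = [math.tanh(z) for z in pre["c"]]   # Eq. (4): cell candidate
    # Eq. (5): keep part of the old cell state and add the gated candidate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    # Eq. (6): expose a gated view of the cell state as the hidden state
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def failure_probability(sequence, params, w_out, b_out):
    """Run the LSTM over the sequence, then apply Eq. (7) to the final hidden state."""
    hidden_size = len(w_out)
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)) + b_out)
```

Starting from zero hidden and cell states is a common convention and is assumed here; the text does not specify the initial state.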

The fusion layer combines the predictive maintenance and mineral assessment outputs. Let $f_{PM}$ and $f_{MQ}$ represent the feature vectors from predictive maintenance and mineral quality assessments, respectively. The fused representation $F_{fusion}$ is computed as:

$$F_{fusion} = \alpha \cdot f_{PM} + \beta \cdot f_{MQ} \tag{10}$$

where $\alpha$ and $\beta$ are weight parameters optimized during training to balance the influence of each module. While the scalar weights $\alpha$ and $\beta$ in Equation (10) provided an effective and computationally efficient baseline for fusion, we recognize that attention-based mechanisms could offer more dynamic feature weighting. Future work will explore these attention-based fusion layers that adaptively recalibrate the importance of different sensor inputs based on contextual operational conditions, potentially further enhancing the model’s robustness in heterogeneous mining environments.
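The weighted fusion of Equation (10), followed by the two output heads, can be sketched as below. Several simplifications are assumed: the two feature vectors are taken to have equal length (which a weighted sum requires), the intermediate dense layers are omitted, and the head parameter names (`w_fail`, `W_quality`, and so on) are hypothetical.

```python
import math


def fuse(f_pm, f_mq, alpha, beta):
    """Equation (10): element-wise weighted sum of two equal-length feature vectors."""
    if len(f_pm) != len(f_mq):
        raise ValueError("weighted-sum fusion needs feature vectors of equal length")
    return [alpha * a + beta * b for a, b in zip(f_pm, f_mq)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dual_heads(fused, w_fail, b_fail, W_quality, b_quality):
    """Map the fused vector to (failure probability, mineral class probabilities)."""
    p_fail = sigmoid(sum(w * v for w, v in zip(w_fail, fused)) + b_fail)
    logits = [
        sum(w * v for w, v in zip(row, fused)) + b
        for row, b in zip(W_quality, b_quality)
    ]
    return p_fail, softmax(logits)
```

Because $\alpha$ and $\beta$ are single scalars, this fusion adds only one multiply-add per feature, which is the computational advantage over attention-based weighting that the text points to.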

2.4. Experimental Setup and Results

2.4.1. Dataset

The dataset used in this study comprises real-world operational data collected over a six-month period from two active iron ore mines. The data encompasses two primary modalities: sensor data for predictive maintenance and image/spectral data for mineral quality assessment [40,41]. Vibration, thermal, and acoustic data were collected from strategically placed sensors on cone crushers and belt conveyors. The specific equipment, monitored failure modes, and sensor parameters are summarized in Table 3. The labeling of “failure” events was derived from historical maintenance logs and work orders. A data instance was positively labeled if a verified breakdown of the component occurred within a 48 h window following the sensor recording, enabling the model to learn precursor signals. A total of 5000 high-resolution mineral samples were collected from the processing stream. The ground-truth quality labels were established using standard laboratory X-ray fluorescence (XRF) analysis to determine elemental composition. Based on the Fe content, samples were categorized into three quality classes, as defined in Table 4. This multi-class approach provides a more granular quality assessment than a simple binary high/low classification. Data augmentation techniques, including rotation (±30°), scaling (80–120%), and color adjustment (±20%), were applied to the image dataset to improve model generalization and robustness.
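The 48 h labeling rule for failure events can be expressed compactly. The sketch below is an assumed reading of that rule, not the authors' labeling script: the function name is hypothetical, and treating the window as open at the recording time and closed at 48 h is a choice made here because the paper does not specify the boundary handling.

```python
from datetime import datetime, timedelta


def label_recordings(recording_times, failure_times, horizon=timedelta(hours=48)):
    """Label a recording 1 if a verified failure occurs within `horizon` after it, else 0.

    recording_times: datetimes at which sensor windows were captured.
    failure_times: datetimes of verified breakdowns from maintenance work orders.
    """
    labels = []
    for t in recording_times:
        is_precursor = any(t < f <= t + horizon for f in failure_times)
        labels.append(1 if is_precursor else 0)
    return labels
```

Under this rule a recording taken after a breakdown is not labeled positive by that breakdown, so the positives are restricted to precursor signals.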

2.4.2. Hardware and Implementation

Model training and evaluation were conducted on a high-performance computing node equipped with an NVIDIA Tesla V100 GPU and an Intel Xeon CPU. The reported inference speed of <10 ms was achieved on this hardware, confirming the feasibility of real-time analysis in an industrial computing environment.

Following the conceptual architecture outlined in Figure 3, the model was implemented with specific layer configurations and hyperparameters to form a tangible computational graph. This implementation details the precise data flow, from the initial input tensors for both sensor and image data, through the specialized CNN and LSTM pathways, to the final fusion and output layers. Prior to being fed into the model, the raw input data underwent a comprehensive preprocessing and augmentation pipeline, as detailed in Table 5. The resulting model summary, depicted in Figure 4, provides a granular view of this structure, including the output shape and number of parameters at each stage, thereby quantifying the model’s complexity and verifying the integration of the multi-modal inputs as designed.

2.4.3. Evaluation Metrics

For evaluation, distinct metrics were employed for the two predictive tasks. Predictive maintenance performance was assessed using precision, recall, and F1-score, providing a balanced view of the model’s ability to minimize false positives and false negatives [42]. Meanwhile, for mineral quality assessment, accuracy and confusion matrix analysis were used to evaluate classification performance, providing insights into the model’s capability to correctly identify mineral classes and handle misclassification rates effectively.
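For reference, the metrics named above follow directly from prediction counts. The sketch below uses hypothetical function names and integer class labels; it illustrates the standard definitions and is not taken from the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 means failure."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of a confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0
```

Dividing each row of the confusion matrix by its row sum gives the normalized form, in which the diagonal entries are per-class recall.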

The model summary in Figure 4 quantifies the architecture’s complexity, revealing a total of approximately 2.45 million trainable parameters. This detailed breakdown confirms the successful integration of the multi-modal inputs, showing how the flattened outputs from the CNN branch (processing spectral and visual data) are concatenated with the feature vectors from the LSTM branch (processing temporal sensor data) to form a unified representation. This fused tensor is then passed through the final dense layers for the dual-output prediction. The specific configuration of layers and parameters, as summarized, was optimized to balance model capacity with computational efficiency, ensuring it was feasible to train on the available dataset while being sufficiently expressive to capture the complex patterns in both maintenance and quality assessment tasks.

3. Results

3.1. Performance of the Hybrid Model

The results demonstrated the effectiveness of the hybrid model in addressing the challenges of predictive maintenance and mineral quality assessment. The comprehensive performance metrics across both tasks are summarized in Table 6.

For predictive maintenance, the model achieved a significant reduction in downtime, ranging between 20 and 30%, by accurately forecasting machinery faults and preventing unscheduled breakdowns. The comprehensive diagnostic performance is detailed in Figure 5, which presents both ROC curve analysis and confusion matrix evaluation. The model achieved an AUC of 0.956, demonstrating excellent discrimination capability across all classification thresholds while maintaining a balanced precision-recall profile with an F1-score of 92.5%. This performance highlights the model’s ability to maintain high reliability across diverse operational scenarios while minimizing both false alarms and missed detections.

As illustrated in Figure 5, the model’s high F1-score is underpinned by exceptional performance across both crushers (F1: 0.932) and conveyors (F1: 0.925), with particularly strong detection rates for bearing wear (94.5%) and liner wear (92.8%). This balance is critical in an industrial setting, as it ensures maintenance crews are dispatched for genuine issues without being overwhelmed by unnecessary alerts. The projected 20–30% downtime reduction stems directly from this high predictive accuracy, enabling a shift from reactive to proactive maintenance scheduling. In mineral quality assessment, the hybrid model improved classification accuracy by 15% compared to baseline methods, showcasing its enhanced capability to identify mineral types. Figure 6 provides detailed insight into the classification performance through both raw count and normalized confusion matrices, revealing specific inter-class confusion patterns and the model’s proficiency in handling complex mineralogical variations.

The detailed analysis in Figure 6 confirms the model’s proficiency in identifying mineral quality, with 92.0% of high-grade ore samples and 89.0% of waste samples correctly classified. Notably, the model exhibits minimal confusion between the extreme quality classes: only 2.2% of waste samples are misclassified as high-grade ore, a critical metric for minimizing economic losses. This low misclassification rate is essential for maximizing yield and minimizing the processing of low-value material, directly impacting operational profitability.

The deployment efficiency of the system was validated through testing across multiple operational scenarios. Figure 7 provides a breakdown of the model’s real-time performance characteristics, demonstrating consistent sub-10-millisecond inference speeds suitable for integration into active processing lines.

The analysis in Figure 7 confirms the system’s capability for real-time operation, with the complete inference pipeline maintaining 8.7 ms average latency, well within the 10 ms requirement for on-the-fly analysis. Performance remains consistent across different operational regimes, with only minor degradation during maintenance periods (accuracy: 87.2% vs. 89.5% during normal production). This reliability across varying conditions, combined with the failure-mode-specific performance characteristics, supports the system’s readiness for practical deployment in active mining operations.
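The latency figures can be checked with simple arithmetic. The sketch below uses only the two values reported in Figure 7b (5.2 ms for image preprocessing and 8.7 ms for the complete pipeline) and the 10 ms requirement; the timing helper `mean_latency_ms` and the `stub_pipeline` workload are hypothetical stand-ins showing how such a measurement might be taken, not the authors' benchmarking code.

```python
import statistics
import time

TOTAL_LATENCY_MS = 8.7   # complete pipeline, as reported in Figure 7b
IMAGE_PREPROC_MS = 5.2   # image preprocessing stage, as reported in Figure 7b
BUDGET_MS = 10.0         # real-time requirement stated in the text

preproc_share = IMAGE_PREPROC_MS / TOTAL_LATENCY_MS   # about 0.598
other_stages_ms = TOTAL_LATENCY_MS - IMAGE_PREPROC_MS  # about 3.5 ms
headroom_ms = BUDGET_MS - TOTAL_LATENCY_MS             # about 1.3 ms


def mean_latency_ms(fn, repeats=100):
    """Average wall-clock time of `fn` in milliseconds over `repeats` calls."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def stub_pipeline():
    """Hypothetical placeholder for one end-to-end inference call."""
    sum(i * i for i in range(1000))


measured_ms = mean_latency_ms(stub_pipeline)
print(f"preprocessing share: {preproc_share:.1%}")
print(f"remaining stages: {other_stages_ms:.1f} ms, headroom: {headroom_ms:.1f} ms")
print(f"stub pipeline mean latency: {measured_ms:.3f} ms")
```

The computed preprocessing share of roughly 60% shows that image preprocessing is the dominant cost, and that the remaining stages together take about 3.5 ms.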

3.2. Comparative Analysis with Baseline Models

To establish a robust performance benchmark, the proposed hybrid CNN-LSTM model was evaluated against several baseline approaches. These included specialized deep learning models: a standalone CNN processing only image and spectral data for mineral classification, and a standalone LSTM analyzing only time-series sensor data for predictive maintenance. Additionally, two traditional machine learning models, a Support Vector Machine (SVM) and a Random Forest (RF), were trained on hand-crafted features extracted from the multi-modal dataset. This selection of baselines was designed to evaluate the hybrid model against both specialized single-modality architectures and established traditional methods.

The results, summarized in Table 7, show that the proposed hybrid architecture outperforms every baseline on both tasks. The standalone models address only their dedicated tasks and cannot leverage the complementary information from the other data type. The traditional models (SVM, RF), while interpretable, are unable to capture the complex, high-dimensional patterns learned by the deep learning models. The hybrid CNN-LSTM framework outperforms all baselines by fusing temporal and spatial features, supporting the core premise of this work. Statistical significance of the performance differences was confirmed through paired t-tests (p < 0.01) across five independent runs.

Latency analysis revealed that while image preprocessing constituted approximately 60% of the total inference time, the complete pipeline maintained consistent sub-10-millisecond performance due to parallel processing of sensor streams and optimized tensor operations. This demonstrates the framework’s suitability for real-time deployment despite the computational demands of visual data analysis.

Values represent mean ± standard deviation over five independent runs. Statistical significance of performance differences was confirmed using paired t-tests (p < 0.01).
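As an illustration of the paired t-test over five runs, the sketch below computes the test statistic with the standard library and compares it to the two-sided critical value for four degrees of freedom at the 0.01 level (4.604). The per-run F1 values are hypothetical placeholders chosen only to resemble the means in Table 7; the paper does not report per-run scores.

```python
import math
import statistics

# Two-sided critical value of Student's t for df = 4 at alpha = 0.01.
T_CRIT_DF4_ALPHA01 = 4.604


def paired_t_statistic(a, b):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0.0:
        raise ValueError("all paired differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical per-run F1 scores (NOT from the paper), five runs each.
hybrid_f1 = [0.917, 0.925, 0.933, 0.921, 0.929]
lstm_f1 = [0.850, 0.862, 0.872, 0.855, 0.866]

t_stat = paired_t_statistic(hybrid_f1, lstm_f1)
significant = abs(t_stat) > T_CRIT_DF4_ALPHA01
print(f"t = {t_stat:.1f}, significant at p < 0.01: {significant}")
```

Pairing matters here because each run presumably shares its data split and random seed across models, so the test operates on per-run differences rather than on two independent samples.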

3.3. Robustness Analysis

Mining environments are often characterized by harsh conditions that can lead to sensor noise or failure. To assess the robustness of our model, we simulated two realistic scenarios: (1) Additive White Gaussian Noise was introduced to the sensor signals at varying Signal-to-Noise Ratios (SNR), and (2) Random Sensor Dropout was simulated by masking a random subset of sensor inputs to zero. The results, shown in Figure 8, indicate that while performance degrades with increasing noise or missing data, the model maintains a reasonable F1-score (>80%) even with an SNR of 10 dB or 20% of sensor channels missing. This demonstrates a degree of resilience and suggests the model learns to rely on the most reliable sensors, a crucial characteristic for real-world deployment.
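The two perturbations can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' test harness: the 50 Hz test tone, the 5 kHz sampling rate (borrowed from the conveyor accelerometer specification in Table 3), the ten-channel layout, and the function names are all hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so the result has approximately the target SNR in dB."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [v + rng.gauss(0.0, noise_std) for v in signal]


def drop_channels(sample, fraction, rng):
    """Zero out a random subset of channels; `sample` is a list of per-channel lists."""
    n_drop = round(fraction * len(sample))
    dropped = set(rng.sample(range(len(sample)), n_drop))
    return [[0.0] * len(ch) if i in dropped else list(ch) for i, ch in enumerate(sample)]


def measured_snr_db(clean, noisy):
    """Empirical SNR of `noisy` relative to the clean reference signal."""
    p_signal = sum(v * v for v in clean) / len(clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_signal / p_noise)


rng = random.Random(42)
# Hypothetical vibration-like signal: 50 Hz tone sampled at 5 kHz.
clean = [math.sin(2 * math.pi * 50 * t / 5000) for t in range(20000)]
noisy = add_awgn(clean, snr_db=10.0, rng=rng)

# Hypothetical 10-channel sample; 20% dropout masks two channels.
channels = [clean[:100] for _ in range(10)]
masked = drop_channels(channels, fraction=0.20, rng=rng)
n_masked = sum(1 for ch in masked if all(v == 0.0 for v in ch))

print(f"measured SNR: {measured_snr_db(clean, noisy):.2f} dB, masked channels: {n_masked}")
```

Sweeping `snr_db` and `fraction` over a grid and re-evaluating the trained model at each setting would produce curves of the kind summarized in Figure 8.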

4. Discussion

The experimental results demonstrate the effectiveness of the proposed hybrid model in addressing critical challenges in predictive maintenance and mineral quality assessment within mining operations. The integration of LSTM networks for time-series sensor data analysis and Convolutional Neural Networks (CNNs) for image-based classification provides a complementary approach that leverages the strengths of both architectures [43]. This dual capability not only improves prediction accuracy but also enhances operational decision-making in a sector where downtime and inefficiencies can have significant financial and safety implications.

In predictive maintenance, the hybrid model yields a projected 20–30% reduction in downtime, underlining its capability to predict maintenance needs with high reliability. Such a reduction would translate directly into cost savings by minimizing equipment breakdowns and unplanned halts in production. This improvement can be attributed to the LSTM module’s capacity to capture long-term dependencies in vibration, thermal, and acoustic sensor signals, allowing it to detect subtle patterns indicative of emerging faults. The model’s F1-score of 92.50% reflects a well-balanced performance across precision and recall, ensuring that potential failures are identified without triggering excessive false alarms. This balance is crucial in industrial contexts, as unnecessary maintenance interventions can be as disruptive as undetected failures. By providing accurate and timely predictions, the system enables operators to schedule interventions proactively, thus maximizing equipment utilization and extending asset life cycles.

In terms of mineral quality assessment, the CNN module achieved an accuracy of 88.60%, showcasing its strength in image-based mineral classification tasks. The confusion matrix results (TP: 800, FP: 50, FN: 90, TN: 560) provide further evidence of the system’s reliability, with both false positives and false negatives maintained at relatively low levels. Compared with baseline models, the hybrid framework yielded a 15% improvement in classification accuracy, which can be largely attributed to the use of advanced data augmentation techniques and fine-tuning strategies applied to the CNN layers. These improvements have direct operational implications, as more accurate sorting and processing of mineral samples reduce waste, improve throughput, and enhance the overall efficiency of mineral extraction processes.

Another significant advantage of the proposed model lies in its real-time prediction capabilities. Achieving a prediction speed of less than 10 ms per inference ensures that the system can be integrated into existing mining operations without introducing delays. This responsiveness is particularly vital in dynamic and high-stakes environments such as mining, where timely insights can prevent costly disruptions, improve safety outcomes, and streamline workflow management. The ability to process multimodal data in real time also lays the foundation for further automation, potentially enabling closed-loop control systems that autonomously adjust maintenance schedules or mineral sorting processes based on model outputs.
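The standard metrics implied by the reported confusion-matrix counts can be derived as shown below. The helper `binary_metrics` is an illustrative name, and the counts are those listed in Table 6. Note that these counts give an accuracy of about 90.7%, so the 88.60% figure presumably comes from a different evaluation, such as the three-class matrices in Figure 6; that reading is an assumption, not something the text states.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts as listed in Table 6.
metrics = binary_metrics(tp=800, fp=50, fn=90, tn=560)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, precision (about 0.941) exceeds recall (about 0.899), meaning the errors lean slightly toward missed positives rather than false alarms.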

The successful integration of CNNs and LSTMs into a unified framework highlights the practical utility of hybrid deep learning models for industrial applications [44,45]. While the present study demonstrates promising outcomes within mining operations, the model’s architecture is inherently scalable and adaptable.

The design of the mineral classification output layer also warrants discussion. The decision to employ three distinct mineral quality classes (High Grade, Medium Grade, and Low Grade/Waste) was based on Fe content thresholds determined through XRF analysis, as summarized in Table 4. This three-class scheme was selected because it aligns with standard industrial practice in iron-ore processing, where materials are operationally categorized into high-yield, mid-grade, and waste streams. It balances classification granularity against dataset representativeness while ensuring that each class contains sufficient examples for robust model training and statistically stable evaluation. Using three categories also facilitates interpretability for mining engineers and aligns the AI system’s outputs with on-site decision-making processes. This three-class framework is applied consistently throughout the architecture (Figure 4), training configuration, and results sections, which supports the reproducibility of the findings.
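The labeling rule implied by Table 4 can be written as a small function. The thresholds (above 65% Fe, 55–65% Fe, below 55% Fe) are taken from the table; the function names, the class order used for the one-hot target, and the example assay values are illustrative assumptions.

```python
# Class order is an assumption made for this sketch; the paper does not state it.
CLASSES = ["High Grade", "Medium Grade", "Low Grade/Waste"]


def mineral_grade(fe_percent):
    """Map an XRF Fe content (in percent) to a Table 4 quality class."""
    if not 0.0 <= fe_percent <= 100.0:
        raise ValueError("Fe content must be a percentage between 0 and 100")
    if fe_percent > 65.0:
        return "High Grade"
    if fe_percent >= 55.0:  # Table 4 lists 55-65% as Medium Grade
        return "Medium Grade"
    return "Low Grade/Waste"


def one_hot(label):
    """Three-element target vector matching a softmax output of shape (None, 3)."""
    return [1.0 if name == label else 0.0 for name in CLASSES]


for fe in (66.2, 65.0, 58.4, 54.9):  # hypothetical assay values
    label = mineral_grade(fe)
    print(fe, label, one_hot(label))
```

Because Table 4 defines High Grade as strictly above 65% and Low Grade/Waste as strictly below 55%, samples at exactly 55% or 65% fall into the Medium Grade class in this sketch.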

Future research could explore the integration of additional data modalities, such as chemical composition or environmental factors, to further enhance predictive power. Moreover, deploying the framework across diverse industrial settings, such as oil and gas, manufacturing, or energy production, could validate its broader applicability and robustness. The hybrid model demonstrates a strong potential to optimize operational efficiency and enhance resource quality. However, it is important to note that the claimed 20–30% reduction in downtime is a projection based on the model’s high fault-prediction accuracy and standard industry maintenance metrics, rather than a figure from a long-term field trial. A rigorous industrial pilot study is the essential next step to validate these economic benefits in a production environment. Furthermore, the comparative analysis confirms that the hybrid model’s performance gain justifies its increased complexity over simpler baseline models, making a compelling case for its use in tackling these multifaceted predictive tasks. By reducing downtime, improving mineral classification accuracy, and delivering real-time insights, the framework provides tangible benefits that address both economic and operational challenges. As industries continue to adopt AI-driven solutions, such hybrid architectures are poised to play a central role in shaping the future of smart and resilient industrial systems.

5. Conclusions

This paper presents an AI-based framework that integrates predictive maintenance and mineral quality assessment using sensor fusion and deep learning techniques. By harnessing the complementary strengths of CNNs for image-based mineral classification and LSTM networks for time-series sensor data analysis, the system addresses two critical challenges in mining operations. The proposed framework not only projects significant reductions in equipment downtime but also enhances the accuracy of mineral quality classification, demonstrating its capability to optimize operational efficiency and sustainability.

The system’s deployment has practical implications for real-world applications, particularly in dynamic mining environments where equipment reliability and resource quality are paramount. The integration of sensor data from multiple modalities, including vibration, thermal, acoustic, and image sources, illustrates the versatility and robustness of the framework. Furthermore, the hybrid model’s ability to perform real-time predictions supports timely decision-making, which is crucial in minimizing disruptions and maximizing productivity.

Future research directions could explore the incorporation of advanced sensor fusion techniques, such as attention mechanisms, to further enhance data integration and feature extraction. Additionally, the development of multi-objective optimization strategies could balance trade-offs between maintenance costs, operational efficiency, and resource utilization. Addressing real-time deployment challenges, particularly in harsh and remote mining environments, may require the integration of edge computing and energy-efficient AI models. Expanding the framework to include predictive analytics for environmental impact and safety monitoring could also provide a more holistic solution for sustainable mining operations. Overall, this study establishes a foundation for leveraging AI-driven frameworks to address critical challenges in the mining industry, paving the way for smarter, safer, and more efficient practices. The insights gained here have the potential to drive innovation across various industrial sectors facing similar operational complexities.

Author Contributions

Conceptualization, W.M. and K.C.; methodology, W.M.; software, K.C.; validation, W.M., K.C. and Z.L.; formal analysis, K.C.; investigation, Z.L.; resources, Z.L.; data curation, W.M.; writing—original draft preparation, W.M.; writing—review and editing, K.C.; visualization, K.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to express their sincere gratitude to all collaborators and technical staff who contributed directly or indirectly to the successful completion of this work. We are also grateful to our respective institutions for providing the enabling environment that supported this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Misita, M.; Brkić, V.S.; Brkić, A.; Kirin, S.; Rakonjac, I.; Damjanović, M. Impact of Downtime Pattern on Mining Machinery Efficiency. Struct. Integr. Life 2021, 21, 29–35.
2. Liu, X.L.; Wang, W.M.; Guo, H.; Barenji, A.V.; Li, Z.; Huang, G.Q. Industrial Blockchain Based Framework for Product Lifecycle Management in Industry 4.0. Robot. Comput. Integr. Manuf. 2020, 63, 101897.
3. Nad, A.; Jooshaki, M.; Tuominen, E.; Michaux, S.; Kirpala, A.; Newcomb, J. Digitalization Solutions in the Mineral Processing Industry: The Case of GTK Mintec, Finland. Minerals 2022, 12, 210.
4. Sikakwe, G.U. Mineral Exploration Employing Drones, Contemporary Geological Satellite Remote Sensing and Geographical Information System (GIS) Procedures: A Review. Remote Sens. Appl. 2023, 31, 100988.
5. Robben, C.; Wotruba, H. Sensor-Based Ore Sorting Technology in Mining—Past, Present and Future. Minerals 2019, 9, 523.
6. Widjaja, F.; Hendriana, D.; Budiarto, E. Predictive Maintenance of Mining Equipment in Indonesia Leading Heavy Equipment Company. In Proceedings of the Conference on Management and Engineering in Industry (CMEI) 2021, Online, 23–24 March 2021.
7. Xu-Hui, Z.; Jia-Shan, J.; Wen-Juan, Y.; Xin-Yuan, L. Predictive Maintenance System for Complex Mining Equipment Based on Digital Twin. Chin. J. Eng. Des. 2022, 29, 643–650.
8. Dayo-Olupona, O.; Genc, B.; Celik, T.; Bada, S. Adoptable Approaches to Predictive Maintenance in Mining Industry: An Overview. Resour. Policy 2023, 86, 104291.
9. Rihi, A.; Baïna, S.; Mhada, F.Z.; Elbachari, E.; Tagemouati, H.; Guerboub, M.; Benzakour, I. Predictive Maintenance in Mining Industry: Grinding Mill Case Study. In Proceedings of the Procedia Computer Science 2022, Guadalajara, Mexico, 22–24 September 2022; Volume 207.
10. Golafshan, R.; Dascaliuc, C.; Jacobs, G.; Roth, D.; Berroth, J.; Neumann, S. Damage Diagnosis of Cardan Shafts in Mobile Mining Machines Using Vibration Analysis. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1097, 012019.
11. Khorzoughi, M.B.; Hall, R. Application of Vibration Analysis of Mining Shovels for Diggability Assessment in Open-Pit Operations. Int. J. Min. Reclam. Environ. 2015, 29, 1086551.
12. Zusman, G. Field Programmable Vibration Sensors an Effective Approach for Machinery Protection and Monitoring. In Proceedings of the Joint Conference: MFPT 2015 and ISA’s 61st International Instrumentation Symposium—Technology Evolution: Sensors to Systems for Failure Prevention, Huntsville, AL, USA, 12–14 May 2015.
13. Zusman, G. A New Generation of Vibration Field Programmable Sensors an Effective Approach for Machinery Protection and Monitoring. In Proceedings of the 16th International Congress on Sound and Vibration 2009, ICSV 2009, Krakow, Poland, 5–9 July 2009; Volume 8.
14. Mikysek, P.; Trojek, T.; Mikyskova, E.; Trojkova, D.; Adamovič, J.; Slobodník, M.; Mészárosová, N. Detection and Visualization of Micron-Scale U-Ca Phosphates as a Key to Redox and Acid-Base Conditions in Ores: Sandstone-Hosted Uranium Deposit. Geochemistry 2023, 83, 126006.
15. Rocchini, D.; Santos, M.J.; Ustin, S.L.; Féret, J.B.; Asner, G.P.; Beierkuhnlein, C.; Dalponte, M.; Feilhauer, H.; Foody, G.M.; Geller, G.N.; et al. The Spectral Species Concept in Living Color. J. Geophys. Res. Biogeosciences 2022, 127, e2022JG007026.
16. Koh, E.J.Y.; Amini, E.; Gaur, S.; Becerra Maquieira, M.; Jara Heck, C.; McLachlan, G.J.; Beaton, N. An Automated Machine Learning (AutoML) Approach to Regression Models in Minerals Processing with Case Studies of Developing Industrial Comminution and Flotation Models. Miner. Eng. 2022, 189, 107886.
17. Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D Convolutional Neural Networks and Applications: A Survey. Mech. Syst. Signal Process. 2021, 151, 107398.
18. Martins, L.A.; Viel, F.; Seman, L.O.; Bezerra, E.A.; Zeferino, C.A. A Real-Time SVM-Based Hardware Accelerator for Hyperspectral Images Classification in FPGA. Microprocess. Microsyst. 2024, 104, 104998.
19. Romanelli, F.; Martinelli, F. Synthetic Sensor Data Generation Exploiting Deep Learning Techniques and Multimodal Information. IEEE Sens. Lett. 2023, 7, 1–4.
20. Bui-Ngoc, D.; Nguyen-Tran, H.; Nguyen-Ngoc, L.; Tran-Ngoc, H.; Bui-Tien, T.; Tran-Viet, H. Damage Detection in Structural Health Monitoring Using Hybrid Convolution Neural Network and Recurrent Neural Network. Frat. Integrita Strutt. 2022, 16, 461–470.
21. Pham, T.D. Time–Frequency Time–Space LSTM for Robust Classification of Physiological Signals. Sci. Rep. 2021, 11, 6936.
22. Matos, S.N.; Pinto, T.V.B.; Domingues, J.D.; Ranieri, C.M.; Albuquerque, K.S.; Moreira, V.S.; Souza, E.S.; Ueyama, J.; Euzébio, T.A.M.; Pessin, G. An Evaluation of Iron Ore Characteristics Through Machine Learning and 2-D LiDAR Technology. IEEE Trans. Instrum. Meas. 2024, 73, 1–11.
23. Harerimana, G.; Kim, G., II; Kim, J.W.; Jang, B. HSGA: A Hybrid LSTM-CNN Self-Guided Attention to Predict the Future Diagnosis From Discharge Narratives. IEEE Access 2023, 11, 106334–106346.
24. Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep Learning-Based Classification of Hyperspectral Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2014, 7, 2094–2107.
25. Bayoudh, K. A Survey of Multimodal Hybrid Deep Learning for Computer Vision: Architectures, Applications, Trends, and Challenges. Inf. Fusion 2024, 105, 102217.
26. Zhu, H.; Zhang, H.; Jin, Y. From Federated Learning to Federated Neural Architecture Search: A Survey. Complex Intell. Syst. 2021, 7, 639–657.
27. Vlahek, D.; Mongus, D. An Efficient Iterative Approach to Explainable Feature Learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 2606–2618.
28. Längkvist, M.; Karlsson, L.; Loutfi, A. A Review of Unsupervised Feature Learning and Deep Learning for Time-Series Modeling. Pattern Recognit. Lett. 2014, 42, 11–24.
29. Gordon, C.A.K.; Burnak, B.; Onel, M.; Pistikopoulos, E.N. Data-Driven Prescriptive Maintenance: Failure Prediction Using Ensemble Support Vector Classification for Optimal Process and Maintenance Scheduling. Ind. Eng. Chem. Res. 2020, 59, 19607–19622.
30. Brown, R.E.; Frimpong, G.; Willis, H.L. Failure Rate Modeling Using Equipment Inspection Data. IEEE Trans. Power Syst. 2004, 19, 782–787.
31. Nadeem, H.; McIsaac, K.; Battler, M.; Cross, M. Lunar Regolith Particle Classification Using a Deep Learning Approach. In Proceedings of the International Astronautical Congress, IAC, Paris, France, 18–22 September 2022.
32. Wu, Y.; Liu, J.; Wang, Y.; Gibson, S.; Osadchy, M.; Fang, Y. Reconstructing Randomly Masked Spectra Helps DNNs Identify Discriminant Wavenumbers. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 3845–3861.
33. Pires de Lima, R.; Duarte, D.; Nicholson, C.; Slatt, R.; Marfurt, K.J. Petrographic Microfacies Classification with Deep Convolutional Neural Networks. Comput. Geosci. 2020, 142, 104481.
34. Wang, Y.; Xiao, Z.; Cao, G. A Convolutional Neural Network Method Based on Adam Optimizer with Power-Exponential Learning Rate for Bearing Fault Diagnosis. J. Vibroengineering 2022, 24, 666–678.
35. Faritha Banu, J.; Rajeshwari, S.B.; Kallimani, J.S.; Vasanthi, S.; Buttar, A.M.; Sangeetha, M.; Bhargava, S. Modeling of Hyperparameter Tuned Hybrid CNN and LSTM for Prediction Model. Intell. Autom. Soft Comput. 2022, 33, 1393–1405.
36. Bülbül, M.A. Optimization of Artificial Neural Network Structure and Hyperparameters in Hybrid Model by Genetic Algorithm: IOS–Android Application for Breast Cancer Diagnosis/Prediction. J. Supercomput. 2024, 80, 4533–4553.
37. Goyal, D.; Dhami, S.S.; Pabla, B.S. Vibration Response-Based Intelligent Non-Contact Fault Diagnosis of Bearings. J. Nondestruct. Eval. Diagn. Progn. Eng. Syst. 2021, 4, 021006.
38. Mishra, P.; Marini, F.; Brouwer, B.; Roger, J.M.; Biancolillo, A.; Woltering, E.; van Echtelt, E.H. Sequential Fusion of Information from Two Portable Spectrometers for Improved Prediction of Moisture and Soluble Solids Content in Pear Fruit. Talanta 2021, 223, 121733.
39. Zhang, J.; Zeng, Y.; Starly, B. Recurrent Neural Networks with Long Term Temporal Dependencies in Machine Tool Wear Diagnosis and Prognosis. SN Appl. Sci. 2021, 3, 442.
40. Ying, X. An Overview of Overfitting and Its Solutions. J. Phys. Conf. Ser. 2019, 1168, 022022.
41. Shur, J.D.; Doran, S.J.; Kumar, S.; Ap Dafydd, D.; Downey, K.; O’connor, J.P.B.; Papanikolaou, N.; Messiou, C.; Koh, D.M.; Orton, M.R. Radiomics in Oncology: A Practical Guide. Radiographics 2021, 41, 1717–1732.
42. Zakariah, M.; AlQahtani, S.A.; Al-Rakhami, M.S. Machine Learning-Based Adaptive Synthetic Sampling Technique for Intrusion Detection. Appl. Sci. 2023, 13, 6504.
43. Cui, X.; Chipusu, K.; Ashraf, M.A.; Riaz, M.; Xiahou, J.; Huang, J. Symmetry-Enhanced LSTM-Based Recurrent Neural Network for Oscillation Minimization of Overhead Crane Systems during Material Transportation. Symmetry 2024, 16, 920.
44. Wong, K.K.; Chipusu, K.; Ashraf, M.A.; Ip, A.W.; Zhang, C.W. In-space cybernetical intelligence perspective on informatics, manufacturing and integrated control for the space exploration industry. J. Ind. Inf. Integr. 2024, 42, 100724.
45. Ye, Y.; Chipusu, K.; Ashraf, M.A.; Ding, B.; Huang, Y.; Huang, J. Hybrid CNN-BLSTM architecture for classification and detection of arrhythmia in ECG signals. Sci. Rep. 2025, 15, 34510.

Figure 1. A schematic flow diagram illustrating AI and sensor technology in mining operations.

Figure 2. Schematic Flow Diagram for Computer Vision in Mineral Quality Assessment.

Figure 3. Architecture of the CNN-LSTM model used for predictive maintenance and mineral quality assessment.

Figure 4. Model summary showing the hybrid CNN-LSTM architecture with a mineral class output layer of shape (None, 3) corresponding to the three mineral quality categories (High, Medium, Low/Waste).

Figure 5. Predictive maintenance performance analysis. (a) ROC curve comparison between the proposed hybrid CNN-LSTM model and baseline LSTM approach, showing superior area under curve (AUC = 0.956) across all classification thresholds. (b) Confusion matrix at optimal operating point demonstrates balanced performance with minimal false positives (45) and false negatives (35), enabling reliable failure prediction without excessive false alarms.

Figure 6. Mineral quality assessment performance. (a) Raw confusion matrix showing classification counts across three quality grades. (b) Normalized confusion matrix revealing that 92.0% of high-grade samples are correctly identified, with only 6.5% misclassified as medium grade. The model demonstrates particular strength in minimizing economically costly misclassifications of waste material as high-grade ore (2.2%).

Figure 7. Deployment efficiency and performance analysis. (a) Performance consistency across equipment types shows robust F1-scores for both crushers (0.932) and conveyors (0.925). (b) Inference time breakdown reveals image preprocessing as the primary computational cost (5.2 ms), with the complete pipeline maintaining 8.7 ms total latency. (c) Operational regime analysis demonstrates stable performance during maintenance periods and production transitions. (d) Failure-type specific performance shows high detection rates for critical failure modes with controlled false alarm rates.

Figure 8. Model robustness analysis under noisy and missing data conditions; (a) Performance degradation under sensor noise, and (b) Performance degradation under sensor dropout.

Table 1. Complete hyperparameter details used in this study.

Name	Layer Configuration	Hyperparameters
Conv2D	Filters: 32–64	Kernel size: 3 × 3, Strides: 1, Padding: ‘same’, Activation: ReLU
Conv2D (Another)	Filters: 128–256	Kernel size: 5 × 5, Strides: 2, Padding: ‘valid’, Activation: ReLU
Dense Layers	Units: 128–64	Kernel Regularizer: 0.01, Bias Regularizer: 0.02, Activation: ReLU
LSTM Layers	Units: 64–128	Recurrent Regularizer: 0.01, Kernel Regularizer: 0.01, Bias Regularizer: 0.02, Activation: Tanh
LSTM Layers (Another)	Units: 32–64	Recurrent Regularizer: 0.005, Kernel Regularizer: 0.005, Bias Regularizer: 0.01, Dropout: 0.2
Dense Layers (Final)	Units: 64–32	Kernel Regularizer: 0.02, Bias Regularizer: 0.015, Activation: Softmax
Fusion Layer	Concatenation Layer	Combines CNN and LSTM outputs
Output Layer (Maintenance Prediction)	Units: 1	Activation: Sigmoid, Loss: Binary Cross-Entropy
Output Layer (Mineral Classification)	Units: Number of Classes	Activation: Softmax, Loss: Categorical Cross-Entropy

Table 2. Training configuration.

Parameter	Value
Optimizer	Adam
Learning Rate	0.001
Epochs	50
Batch Size	128
Training Time	CNN: 12 h, LSTM: 10 h, Fine-Tuning: 8 h

Table 3. Summary of equipment, sensor configurations, and data collection for predictive maintenance.

Equipment Type	Targeted Failure Modes	Sensor Modalities	Sensor Specifications	Sample Count	Data Source
Cone Crusher	Bearing Wear, Liner Wear, Rotor Imbalance	Vibration, Thermal	Accelerometer: 0–50 g range, 10 kHz sampling; IR Camera: −20 °C to 550 °C, 30 Hz	Vibration: 6000; Thermal: 5000	2× Crushers, Iron Ore Mine
Belt Conveyor	Roller Bearing Failure, Belt Misalignment	Vibration, Acoustic	Accelerometer: 0–10 g range, 5 kHz sampling; Microphone: 50 Hz–20 kHz	Vibration: 4000; Acoustic: 6500	4× Conveyors, Iron Ore Mine
Total Sensor Samples				21,500

Table 4. Mineral quality class definitions based on XRF analysis.

Mineral Quality Class	Fe Content Range	Primary Gangue Content	Number of Samples
High Grade	>65%	Low SiO₂	1800
Medium Grade	55–65%	Moderate SiO₂	2200
Low Grade/Waste	<55%	High SiO₂	1000
Total Image Samples			5000

Table 5. Data preprocessing and augmentation pipeline.

Data Type	Preprocessing Steps	Purpose/Rationale
Sensor Data	1. Cleaning: removal of transient spikes and dropouts. 2. Normalization: scaling to a [0, 1] range per sensor channel.	Ensures data quality and stable model training by balancing feature influences.
Image Data	Data augmentation: rotation ±30°; scaling 80–120%; color adjustment ±20%; horizontal flip (50% probability).	Increases dataset size and variability, improving model robustness and reducing overfitting.

Table 6. Summary of hybrid model performance across tasks.

Task	Metric	Performance
Predictive Maintenance	Downtime Reduction	20–30%
	F1-Score	92.50%
Mineral Quality Assessment	Accuracy	88.60%
	Confusion Matrix	TP: 800, FP: 50, FN: 90, TN: 560
Deployment Efficiency	Prediction Speed	Real-Time (<10 ms per inference)

Table 7. Comparative performance analysis with baseline models.

Model	Predictive Maintenance (F1-Score)	Mineral Quality Assessment (Accuracy)	Key Limitation
Proposed Hybrid CNN-LSTM	0.925 (±0.008)	0.886 (±0.005)	-
Standalone CNN	No sensor input	0.852 (±0.007)	Cannot process sensor data for maintenance.
Standalone LSTM	0.861 (±0.011)	No visual input	Cannot analyze visual data for quality.
Support Vector Machine (SVM)	0.785 (±0.015)	0.768 (±0.012)	Relies on manual feature engineering.
Random Forest (RF)	0.812 (±0.009)	0.795 (±0.010)	Struggles with high-dimensional raw data.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

