Article

Predicting Rock Hardness and Abrasivity Using Hyperspectral Imaging Data and Random Forest Regressor Model

Department of Civil and Mineral Engineering, The University of Toronto, Toronto, ON M5S 1A4, Canada
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(20), 3778; https://doi.org/10.3390/rs16203778
Submission received: 28 August 2024 / Revised: 25 September 2024 / Accepted: 9 October 2024 / Published: 11 October 2024
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)

Abstract

This study aimed to develop predictive models for rock hardness and abrasivity based on hyperspectral imaging data, providing valuable information without interrupting the mining processes. The data collection stage first involved scanning 159 rock samples collected from 6 different blasted rock piles using visible and near-infrared (VNIR) and short-wave infrared (SWIR) sensors. The hardness and abrasivity of the samples were then determined through Leeb rebound hardness (LRH) and Cerchar abrasivity index (CAI) tests, respectively. The data preprocessing involved radiometric correction, background removal, and stacking VNIR and SWIR images. An integrated approach based on K-means clustering and the band ratio concept was employed for feature extraction, resulting in 28 band-ratio-based features. Afterward, the random forest regressor (RFR) algorithm was employed to develop predictive models for rock hardness and abrasivity separately. The performance assessment showed that the developed models can estimate rock hardness and abrasivity of unseen data with R² scores of 0.74 and 0.79, respectively, with the most influential features located mainly within the SWIR region. The results indicate that integrated hyperspectral data and the RFR technique have strong potential for practical and efficient rock hardness and abrasivity characterization during mining processes.

1. Introduction

The interaction between rocks and mechanical equipment is one of the most predominant phenomena in geoengineering projects, such as mining, milling, and rock excavation, and it can greatly affect the lifetime of mechanical tools [1,2]. It also significantly impacts the performance, efficiency, and maintenance of the equipment, so a thorough understanding of this interaction is critically important for optimizing geoengineering operations [3]. Three distinct factors determine this interaction: the properties of rocks, the properties of mechanical tools, and how mechanical tools interact with rocks [4]. Of these three factors, only rock properties may vary significantly in different parts of a specific project; the latter two are generally constant.
From a geomechanical point of view, rock hardness and abrasivity are among the most critical properties of rocks that determine how rocks interact with mechanical tools. Rock hardness generally refers to the rock’s ability to resist scratching, penetration, or permanent deformation [5]. Various testing methods have been developed to measure rock hardness, which can be categorized into four categories based on the mechanism they use: indentation, rebound, scratch, and grinding. A comprehensive review of the existing rock hardness testing methods and their advantages and limitations can be found in [6]. Among the mentioned categories, rebound tests are flexible, practical, nondestructive, and economically viable; therefore, they have been extensively applied in many geoengineering applications [7,8,9,10,11,12,13]. The rebound hardness category consists of a variety of testing methods, including Schmidt hammer (SH), Shore scleroscope, and Leeb rebound hardness (LRH), which are among the most commonly used ones [14]. In contrast, abrasion refers to the wearing or tearing away of particles from a solid surface [15]. Abrasion is a process that could cause the removal or displacement of material at a solid surface, leading to wear, especially on tools used in mining, drilling, milling, and tunneling applications. The Cerchar abrasivity index (CAI) is among the most widely used abrasion tests in geoengineering projects [16].
Rock hardness and abrasivity together could considerably affect all mining processes, from drilling to grinding. Figure 1 schematically illustrates how higher rock hardness and abrasivity could impact the entire mine-to-mill process. The effect of hard and abrasive rock materials at each stage is briefly described here. In the drilling stage, the higher the hardness and abrasivity of the rocks, the higher the required energy for drilling, the lower the penetration rate, and the higher the wear and tear on drilling equipment (e.g., drill bit). Achieving acceptable fragmentation and blasting results (e.g., particle size distribution) requires higher blasting energy in harder rocks than in softer ones. In the loading stage, the most adverse effects of hard rocks on shovels include increased wear and tear on the shovel bucket teeth, longer loading cycles, and higher energy consumption. Similar impacts can be seen in crushing and grinding. For example, feeding hard and more abrasive rock material into grinders for an extended period would decrease the efficiency of the comminution process and increase the comminution energy requirement, the wear of mill liners, the ball consumption in ball mills, and the downtime for maintenance. More detailed information on the impacts of rock hardness and abrasivity on different mining and milling processes can be found in [6,17,18,19,20,21].
To date, numerous testing methods have been developed to assess rock hardness and abrasivity in the lab and on field scales. Like all other geomechanical testing methods, the proposed approaches for rock hardness and abrasivity have advantages and disadvantages [6]. On the one hand, these methods directly examine rock materials and provide engineers and decision-makers with insightful values. On the other hand, they are generally time-consuming, costly, and labor-demanding. To be more specific, these testing methods require rock samples to be collected, prepared, and tested, which in a mining operation environment simply means an interruption in the mining process, which is undesirable.
In this study, the primary focus is on the development of predictive models for rock hardness and abrasivity using remotely sensed hyperspectral imaging data. The aim is to provide decision-makers at mine sites with a nondestructive, contactless, and real-time method for estimating rock hardness and abrasivity. While previous research has explored the relationship between hyperspectral data and the mechanical properties of rocks, such as in references [22,23], these studies primarily utilized spectroradiometers for point measurements to investigate the correlation between rock properties and spectral information, and therefore do not provide a remote sensing method. This study is one of the initial attempts to fill the gap in estimating rock hardness and abrasivity based on remotely sensed hyperspectral imaging data.
Hyperspectral imaging technology has emerged as a promising solution for automating the discrimination process in the past couple of decades. Hyperspectral imaging involves using an imaging spectrometer, also called a hyperspectral camera, to collect spectral information. A hyperspectral camera captures a scene’s light, separated into individual wavelengths or spectral bands. It provides a two-dimensional image of a scene while simultaneously recording the spectral information of each pixel in the image [24]. The rich spectral signatures obtained enable the differentiation of minerals based on their unique spectral patterns, i.e., absorption and reflectance. This capability paves the way for refined geological interpretations and resource assessments.
As a rapid, nondestructive, simultaneous multirange spectral technique, hyperspectral imaging can be applied in remote regions to rapidly collect a considerable amount of data, allowing engineers to better understand material characteristics throughout the entire mine-to-mill process. In the mining industry, hyperspectral remote sensing has been mainly used for mineral mapping, retrieving surface compositional information for mineral exploration, and for lithologic mapping and mine tailings with a focus on acid-generating minerals [25,26,27,28,29,30,31]. However, hyperspectral remote sensing has recently found its way into other geoengineering applications, such as characterizing rocks’ physical, geochemical, and mechanical properties [22,23,32,33,34,35,36,37,38,39].
Maras et al. [23] attempted to predict the physicomechanical properties of landscape rock samples using the mineralogical composition obtained as the arithmetic average of 16 spectroradiometer measurements. They reported that although significant correlations were observed between the reflectance value of specific wavelengths and the physicomechanical properties of rocks, they could not confirm a cause-and-effect relationship that would allow them to predict the physicomechanical properties of rock through hyperspectral data. Schaefer et al. [32] explored the possibility of using reflectance spectroscopy to estimate the physical and mechanical properties of volcanic rocks, such as density, porosity, hardness, elasticity, and magnetic susceptibility. The authors assumed that these properties are related to the mineral composition and degradation of the rocks, which can be detected by the spectral signatures of iron-bearing minerals such as pyroxene, magnetite, and pyrite. The results of statistical analyses showed that reflectance spectroscopy has the potential to be a rapid and noninvasive method for characterizing volcanic rocks and suggested that aerial or satellite imaging spectroscopy could be used to map the geotechnical properties of volcanoes at a large scale. In similar studies, Kereszturi et al. [34] and Schaefer et al. [33] confirmed the potential of hyperspectral imaging as a rapid and nondestructive method for characterizing the geomechanical properties of volcanic rocks. In another study, van Duijvenbode et al. [22] discussed the use of material fingerprinting as a tool to investigate a link between rock hardness and rapidly acquired geochemical and hyperspectral data. The concept of material fingerprinting entails a fingerprint classification based on the similarity of the measured and constitutive material attributes. They concluded that the defined material fingerprints can explain material hardness, such as grindability. However, they did not propose a predictive model, and all conclusions were made based on visual comparisons.
Considering that a mining project is an interconnected operation in which the performance of each stage impacts the performance of the downstream stages, the results of these predictive models could eventually help engineers and decision-makers better plan for the downstream processes. In fact, the hardness and abrasivity estimation from the hyperspectral scanning of a newly blasted rock pile could provide valuable information regarding the diggability, crushability, grindability of the materials, and their behavior during mining and processing in terms of ease and difficulty. For instance, knowing the hardness of ore rocks enables mineral processing engineers to make better decisions regarding the prioritizing of processing and mixing of different ore categories to reduce the milling time as the highest energy consumption sector in the entire mine-to-mill process. In addition, information obtained from the predictive models may also be used to optimize previous processes, such as drilling and blasting, by constantly updating and populating the mine database and block models, improving short-term mine planning. A good example of this situation could be using the obtained hardness and abrasivity information from each blasted rock pile to update and populate the geomechanical block model of the mine to better plan for future blast drillings.

2. Material and Methods

2.1. Framework of Study

Figure 2 schematically illustrates the current research framework. The data collection comprises scanning rock samples using hyperspectral sensors and performing hardness and abrasivity tests on the collected rock samples. The preprocessing step mainly focuses on applying necessary filters on the hyperspectral data. Afterward, an appropriate train and test split is applied to the collected data, considering the distribution of rock hardness and abrasivity values. Next, a feature extraction approach based on K-means clustering and band ratio concept is developed to reduce the dimensionality of hyperspectral data. The last step deals with the development of predictive models and performance assessments. The specific approaches and methodologies applied in each step will be discussed in detail in the following.

2.2. Data Collection

A total of 159 handpicked rock samples were taken from 6 distinct blasted muck piles within a gold mine to cover all the rock types within the deposit. The collected rock samples mainly include sedimentary rocks with varying alteration degrees, as well as mafic, ultramafic, and intrusive rocks. To better understand the mineralogical composition of the studied area, 34 rock samples selected randomly from all six piles were sent for X-ray diffraction (XRD) analysis. Only 34 rock samples were sent because XRD analysis is destructive, time-consuming, and costly, and because the remaining rock samples were required for other purposes. The obtained XRD results are provided in Table 1. XRD analysis carried out on the selected rock samples revealed that quartz, biotite, muscovite, chlorite, dolomite, calcite, pyrite, talc, magnetite, hornblende, albite, and potassium feldspar are the most common minerals. However, it should be noted that not all minerals are present in every sample. For example, most samples can be classified into two categories: quartz-rich samples and quartz-free samples. Similar situations could be observed for other minerals such as dolomite, calcite, and potassium feldspar. This result confirms the diversity among the collected rock samples.
The data collection then took place in three stages. First, the collected rock samples were scanned using the HySpex Mjolnir VS-620 (Oslo, Norway), a state-of-the-art imaging system that stands out due to its high-performance specifications and scientific-grade data quality. The VS-620 operates in the visible near-infrared (VNIR) to short-wave infrared (SWIR) range, covering a spectral range from 400 to 2500 nm. In the VNIR range, the camera features 1240 spatial pixels and 200 spectral channels at 3 nm for each pixel and 620 spatial pixels and 300 spectral channels at 5.1 nm for each pixel in the SWIR range [40]. The VS-620 was mounted on a lab rack specifically designed for hyperspectral scanning. Figure 3a details different parts of the designed hyperspectral imaging setup.
The differences in the rock types and mineralogical compositions of the collected rock samples could result in a wide range of albedo when scanning the sample in the lab. Three different diffuse reflectance panels with known spectral curves were used to ensure the proper scanning of the samples. The reference panels have 20, 50, and 90 percent nominal reflectance values. It also must be mentioned that the rock samples were scanned separately. The 20 percent diffuse reflectance panel was employed for darker samples, while the 90 percent was used for brighter samples.
In the second step, rock samples were subjected to the LRH test, a portable, fast, nondestructive dynamic impact-and-rebound test. LRH was originally developed to assess metallic materials’ hardness, and has recently been extensively applied to geoengineering projects [7,41,42,43,44,45,46]. The theoretical basis of the LRH test is based on the principle of energy consumption. In this test, the hardness value is reported as the ratio of the rebound velocity of a hard tungsten carbide spherical tip impact body to its impact velocity. While the impact velocity is always constant, the physicomechanical properties of the tested rock sample, such as surface elasticity and strength, act as a resistance factor and reduce the rebounding velocity [47].
LRH tests were conducted using an Equotip 550 Leeb D device (Proceq, Schwerzenbach, Switzerland) (Figure 3b) according to ASTM procedures and manufacturer testing guidelines [47,48]. For consistency, the term HLD will be used to report the hardness value measured by a D-type Leeb rebound device in this paper. An integrated approach based on the small sample theory and confidence interval was used to determine the representative mean HLD value for each sample [45]. Based on this approach, considering an error level and a specific confidence interval, one could find the minimum number of LRH measurements instead of performing a predefined number of measurements, resulting in the representative mean HLD value for each rock sample. Regarding calibration, it must be noted that the device was continually calibrated either after ten consecutive sample tests or at the beginning of the testing day.
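The idea of measuring until the mean is known to a chosen precision, rather than taking a fixed number of readings, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [45]: the stopping rule (95% confidence half-width no larger than a relative error of the running mean), the default error level, and the function names are all hypothetical. The Leeb value uses the conventional definition of 1000 times the rebound-to-impact velocity ratio.

```python
import statistics

# Two-sided 95% Student-t critical values for 1..20 degrees of freedom.
T_95 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
        2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086]


def leeb_value(rebound_velocity, impact_velocity):
    """Leeb hardness number: 1000 x (rebound velocity / impact velocity)."""
    return 1000.0 * rebound_velocity / impact_velocity


def representative_mean(readings, max_rel_error=0.05, min_n=3):
    """Consume readings in order and stop once the 95% confidence half-width
    of the running mean is within max_rel_error of that mean (assumed rule).
    Returns (mean of the readings used, number of readings used)."""
    used = []
    for value in readings:
        used.append(value)
        n = len(used)
        if n < min_n:
            continue
        # Beyond 21 readings the df=20 value is reused (slightly conservative).
        t = T_95[min(n - 2, len(T_95) - 1)]
        mean = statistics.mean(used)
        half_width = t * statistics.stdev(used) / n ** 0.5
        if half_width <= max_rel_error * mean:
            break
    return statistics.mean(used), len(used)
```

With consistent readings the loop stops early, while scattered readings on a heterogeneous surface consume every reading supplied.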
CAI is one of the most commonly used abrasion tests for determining rock abrasivity and estimating wear on mechanical tools. Originally designed by the Centre d’Etudes et Recherches des Charbonnages de France (Cerchar) in the 1970s for coal mining applications, the test has gradually gained acceptance in other geoengineering fields, such as hard rock mining and tunneling [16,49,50]. The CAI test can be performed according to different standards, such as AFNOR NF P 94-430-1 [51], ASTM D7625-10 [52], and ISRM [15]. Several modifications have been made to enhance the accuracy and practicality of the CAI test to date; among them, the original design developed by the Cerchar Centre and the design modified by West are the most widely used [16].
In this study, the CAI tests were performed using the West device employing pins of HRC 54 (Wille Geotechnik, Rosdorf, Germany) (Figure 3c). The device consists of a mechanical vice that firmly holds the specimen while a hardened steel stylus with a 90-degree cone tip rests on the rock surface under a constant load of 70 N. The rock sample is then moved under the stationary loaded stylus, producing a 10 mm scratch on the surface of the rock specimen. Testing was conducted on freshly broken rock surfaces resulting from the point load index test. For rocks that could not be broken using the point load device, CAI tests were conducted on saw-cut surfaces instead. A correction factor of 1.14 was used to account for the smooth surface produced by the saw cut, as recommended by ISRM [15]. The average of seven CAI tests was calculated for each specimen to measure the representative mean CAI value using the following equation:
$$\mathrm{CAI} = 10 \times \frac{1}{7} \sum_{i=1}^{7} d_i$$
where $d_i$ is the average of measured wear on the stylus tip surface in two perpendicular profiles with an accuracy of 0.01 mm. A high-resolution calibrated camera mounted on a binocular microscope was used to measure the wear of the pins.
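As a worked illustration of the averaging described above, the following sketch computes a representative CAI from pin wear readings. The function name and argument layout are hypothetical, and applying the 1.14 saw-cut correction as a simple multiplier is an assumption about how the factor enters the calculation.

```python
def cai_from_wear(wear_profiles_mm, saw_cut=False, saw_cut_factor=1.14):
    """wear_profiles_mm: one (d_a, d_b) pair per scratch test, i.e. the wear
    flat in mm measured on two perpendicular profiles of the stylus tip."""
    # d_i: mean of the two perpendicular readings for test i
    per_test = [(d_a + d_b) / 2.0 for d_a, d_b in wear_profiles_mm]
    # CAI = 10 x mean wear (mm) over all tests
    cai = 10.0 * sum(per_test) / len(per_test)
    # Assumed multiplicative correction for smooth saw-cut surfaces
    return cai * saw_cut_factor if saw_cut else cai
```

For example, seven tests that each show wear flats of 0.20 mm and 0.22 mm give a mean wear of 0.21 mm and a CAI of about 2.1 on a freshly broken surface.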

2.3. Data Preprocessing

The preprocessing of raw data obtained from hyperspectral scanning involved four main steps. In the first step, radiometric correction, including reflectance calibration, was applied to the VNIR and SWIR scans. It is important to note that the raw scan stores information about the amount of radiance reflected from the surface of the scanned rock sample. The reference panels, whose spectral curves are known, were used to calibrate the reflectance values measured on the rock surface. The second step addressed the oversaturation problem, which occurs when the sensor cannot capture the amount of reflected energy in a single shot. There are two ways to deal with this problem. First, it can be prevented by adjusting the tray speed, reducing light intensity, and using an appropriate reflectance panel. However, in some cases, such as bright minerals within very dark surrounding material, obtaining a high-quality scan of the darker part may lead to oversaturation in the brighter parts. Conversely, attempting to scan the brighter parts may result in noisy scans of the darker portions. The second approach to dealing with oversaturation is to mask the oversaturated pixels and remove their values. During data collection, care was taken to avoid oversaturation; however, oversaturation was inevitable in some of the samples for the sake of the spectral curve’s quality.
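The first two steps can be illustrated for a single pixel spectrum. This is a minimal sketch rather than the processing chain used in the study: the dark-reference subtraction, the `saturation_level` parameter, and the function name are assumptions, and real pipelines operate on whole image cubes rather than Python lists.

```python
def calibrate_pixel(raw, white, dark, panel_reflectance, saturation_level):
    """Convert one pixel's raw counts to reflectance, band by band.

    raw, white, dark: per-band counts for the rock pixel, the reference
    panel, and a dark reference (assumed to be available).
    panel_reflectance: nominal panel reflectance, e.g. 0.2, 0.5, or 0.9.
    Returns None when any band reached the sensor ceiling, so the pixel
    can be masked out instead of contributing a distorted spectrum.
    """
    if max(raw) >= saturation_level:
        return None  # oversaturated pixel: mask it
    return [
        panel_reflectance * (r - d) / (w - d)
        for r, w, d in zip(raw, white, dark)
    ]
```

A pixel that returns half the counts of a 50 percent panel in every band is assigned a reflectance of 0.25.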
The third step was to stack the VNIR and SWIR scans. As mentioned, the HySpex Mjolnir VS-620 can cover the full spectral range from 400 to 2500 nm using two hyperspectral sensors with different specifications. The VNIR sensor features a spectral resolution of 3 nm, covering the range from 400 to 1000 nm in 200 bands. In contrast, the SWIR sensor covers the range from 970 to 2500 nm in 300 bands, with a spectral resolution of 5.1 nm. One way to address the difference in the obtained spectral resolutions is to resample at a desired spectral resolution when stacking images. In addition, while the spatial resolution of the VNIR sensor is exactly double that of the SWIR one, these two sensors are not fully synchronized, so they capture images of different lengths. As a result, the VNIR image has a few more lines than the SWIR image. Lastly, due to a baseline of 75 mm between the two sensors, their fields of view do not completely overlap, resulting in a common area smaller than the nominal coverage area for each sensor (Figure 4a). It must be noted that the common area’s size depends on the scanning distance (H). The shorter the scanning distance, the smaller the size of the common area.
These discrepancies prompted the need for the following preprocessing step, aimed at stacking these two hyperspectral images for further analysis. A Python algorithm was developed based on search theory and the maximum similarity between the two images to achieve this. Figure 4b demonstrates the concept behind the stacking algorithm. As can be seen, the proposed algorithm tries to find the best match between two images by finding the maximum similarity between them. The algorithm was applied to all samples, resulting in a stacked image for each rock sample. It is worth noting that because part of the VNIR range overlaps with the SWIR range, the resulting stacked image contains 487 bands, slightly fewer than 500 bands. The final preprocessing step focuses on cropping and removing background spectra to reduce the size of the stacked image. This is accomplished by cropping images to only the rock sample part. Figure 5 illustrates most of the preprocessing steps for one of the samples.
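The search for the best match can be illustrated with a small exhaustive search over row and column offsets. This sketch is not the authors' algorithm: it assumes both images have already been resampled to the same pixel size and reduced to a single band, and it uses Pearson correlation as the similarity measure, which the text does not specify. All names are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def best_offset(large, small):
    """Slide `small` over `large` (2-D lists) and return (row, col, score)
    of the window with the highest correlation to `small`."""
    height, width = len(small), len(small[0])
    flat_small = [v for row in small for v in row]
    best = (0, 0, -2.0)  # any real correlation exceeds -2
    for r in range(len(large) - height + 1):
        for c in range(len(large[0]) - width + 1):
            window = [large[r + i][c + j]
                      for i in range(height) for j in range(width)]
            score = pearson(window, flat_small)
            if score > best[2]:
                best = (r, c, score)
    return best
```

Because correlation ignores gain and offset differences, the match is still found when the two sensors report the same scene at different brightness levels.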

2.4. Train and Test Split

Data splitting is an essential step in developing machine learning (ML) models. It allows us to simulate how a model will perform on new data once it has been developed. To consider the testing dataset as unseen data, one simple requirement must be met: no information must leak into the development process from the testing dataset. Data leakage generally occurs during the data wrangling and exploratory data analysis (EDA) steps. For example, if the range of a parameter is computed over the entire dataset and then used during model development, information about the testing dataset leaks into the model. Therefore, the dataset must be split first to prevent any data leaks.
Datasets can be split in different ways depending on their characteristics. One of the most utilized approaches is the random split method, which is implemented to split the dataset by random shuffle. There could be two issues with using the random split. First, it is necessary to assess the split bias since 159 rock samples were collected at six different locations during the experimental stage. The random shuffle bias occurs when one data point is randomly selected as the training set, and its adjacent point is selected as the testing set, and this usually happens when data are geospatially correlated, such as core log data. It should be noted that the rock samples were randomly collected in different parts of blasted muck piles; therefore, it can be said that the collected samples are not geospatially related to each other.
Figure 6 shows the box plots of the mean HLD and CAI values of rock samples from each sampling location. In Figure 6, blue asterisks represent the mean value of the sampling location. In addition, the number of samples taken from each location is also shown above each box plot. Variation within the collected samples at each location can be seen for both mean HLD and CAI, indicating no geospatial correlation. Otherwise, each sampling location should have tighter box plots with lower variation.
Secondly, imbalanced data may occur, which refers to the situation in which the data are not evenly distributed across the entire range of the target parameter. In this case, most data would be found in a certain range, while the remaining parts would have less data. Data imbalances can negatively affect the performance of predictive models since, typically, the more data that are introduced to a model, the better it performs. Therefore, if there is insufficient data for a specific range of the target parameter, the model is unlikely to perform well in that range. Various techniques can be used to address this issue, such as obtaining more data, which can be time-consuming and expensive. This study uses class weighting methods to ensure there is a similar ratio between the different classes in the training and testing datasets. As shown in Figure 7, two auxiliary thresholds are applied for both mean HLD and CAI values, dividing the distributions into three groups (low, medium, and high). The thresholds are 450 and 650 for the HLD distribution and 1.5 and 3.5 for the CAI distribution. It should be noted that these classes are only used during the train and test split. Figure 8 presents the results of the random split addressing class imbalance for both HLD and CAI values. In this study, 80 percent of the collected rock samples (127 rock samples) were considered for the training, and the remaining 20 percent (32 rock samples) were used only for performance assessment as unseen data.
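One common way to keep similar class ratios in both subsets is a stratified random split, sketched below for a single target. This is an assumption about how such a split could be implemented, not the authors' code: the function names, the treatment of values exactly on a threshold, and the per-class rounding are illustrative choices, and the study balances both HLD and CAI classes.

```python
import random


def target_class(value, low, high):
    """Assign low / medium / high using two auxiliary thresholds."""
    if value < low:
        return "low"
    if value < high:
        return "medium"
    return "high"


def stratified_split(values, low, high, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with roughly the same share of
    each class in both subsets."""
    rng = random.Random(seed)
    groups = {}
    for index, value in enumerate(values):
        groups.setdefault(target_class(value, low, high), []).append(index)
    train, test = [], []
    for members in groups.values():
        rng.shuffle(members)  # random selection within each class
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)
```

For HLD, the call would use `low=450` and `high=650`; for CAI, `low=1.5` and `high=3.5`.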

2.5. Feature Extraction and Exploratory Data Analysis

One of the greatest challenges in developing predictive models using hyperspectral data is dealing with their high dimensionality. For instance, in this research, the final scan of each rock sample contains around 50,000 pixels with 487 channels each. Generally, developing machine learning or deep learning models using raw hyperspectral data requires many thousands of records, which are rarely feasible to collect in the field of geoengineering.
This study uses a feature extraction approach based on K-means clustering and the band ratio concept to extract the most important features. This section details the proposed feature extraction approach. The first step was to mosaic all stacked images of the training dataset. Figure 9 shows the true-color RGB rendering of the mosaic image used for feature extraction. Afterward, K-means clustering was applied to this image to obtain the dominant spectral curves. The elbow method was employed to determine the optimum number of clusters. Hence, several K-means analyses with different numbers of clusters ranging from 2 to 10 were performed. As shown in Figure 10, K-means clustering with 7 clusters is the optimum way to obtain the dominant spectral curves. The mean spectral curves resulting from the K-means clustering with 7 clusters are shown in Figure 11. It should be noted that one of the spectral curves belongs to the image background and was excluded.
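The elbow method can be sketched with a small, dependency-free K-means. This is an illustration only: the farthest-point initialisation is chosen here to keep the example deterministic and is not taken from the paper, the function names are hypothetical, and real pixel spectra would have 487 values per point rather than the handful used in a toy run.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain K-means. Returns (centers, inertia), where inertia is the sum of
    squared distances from each point to its nearest center."""
    # Farthest-point initialisation (assumed here for determinism).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # New center = mean spectrum of the members (kept if cluster is empty)
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia


def elbow_curve(points, k_values):
    """Inertia for each candidate number of clusters."""
    return {k: kmeans(points, k)[1] for k in k_values}
```

Plotting the returned inertia against the number of clusters gives the curve in which the bend, or elbow, marks the point beyond which extra clusters yield little further reduction.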
After retrieving the dominant spectral curves resulting from the K-means analysis, all six spectral curves were normalized and carefully studied for absorption peaks, reflectance peaks, and changes in slope. A feature was defined as a band ratio whenever a change was observed in the curves. Figure 12 shows the process of feature extraction. Table 2 provides information regarding the obtained band ratios for further analysis. In the next step, the approach was applied to all samples, reducing the dimensionality from an order of 487 to 28. For each sample, the main representative feature values were considered as the mean of the features over all pixels, and the resultant dataset was used for developing the predictive models.
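The reduction from a full spectrum per pixel to a short feature vector per sample can be sketched as follows. The band indices in any `band_pairs` list passed to this function are placeholders, since the actual band ratios are those listed in Table 2, and the function name is hypothetical.

```python
def band_ratio_features(pixels, band_pairs):
    """pixels: list of spectra, one list of reflectance values per pixel.
    band_pairs: list of (numerator_band, denominator_band) indices.
    Returns one value per band pair: the mean of the per-pixel ratios."""
    features = []
    for numerator, denominator in band_pairs:
        # Skip pixels whose denominator band is zero to avoid division errors
        ratios = [p[numerator] / p[denominator]
                  for p in pixels if p[denominator] != 0]
        features.append(sum(ratios) / len(ratios) if ratios else float("nan"))
    return features
```

Applied with 28 band pairs, each rock sample is represented by 28 numbers regardless of how many pixels its scan contains.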
To explore and understand the relationship between the selected band ratios, as the input features, and HLD and CAI values, as the target parameters, the correlation between them was explored. Figure 13 illustrates the pairwise correlation matrix of the selected band ratios and HLD and CAI values. The absolute correlation between each feature was considered to make the interpretation easier. In this case, while 0, represented by blue, indicates no linear relationship, 1, represented by green, indicates a perfect linear relationship. Although there could be strong correlations between some of the band ratios, the pairwise matrix reveals moderate correlations between some of the band ratios with the target parameters, i.e., HLD and CAI. To better understand these relations, the distribution of each feature concerning HLD and CAI classes was studied (Figure 14). It must be noted that the HLD and CAI classes were defined based on the auxiliary thresholds established in the previous section. As can be seen, certain selected features, such as feature 25 (F_25), exhibit a trend towards HLD and CAI classes. However, a meaningful relationship could not be observed for most of the selected features. It is worth noting that these distribution graphs were developed using the training dataset, and the testing dataset is yet to be revealed.
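A pairwise matrix of absolute correlations of the kind described above can be computed as in the sketch below. It assumes that the Pearson coefficient is the correlation measure, which is consistent with the reference to linear relationships but is not stated explicitly; the column names are illustrative.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant column: correlation undefined, reported as 0 here
    return cov / (var_a * var_b) ** 0.5


def abs_correlation_matrix(columns):
    """columns: dict mapping a name (e.g. 'F_25', 'HLD', 'CAI') to its values.
    Returns a nested dict of absolute correlations in the range 0 to 1."""
    names = list(columns)
    return {a: {b: abs(pearson(columns[a], columns[b])) for b in names}
            for a in names}
```

Taking the absolute value means that a strongly negative relationship and a strongly positive one both appear near 1.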
In Figure 13, a good correlation between HLD and CAI values can also be observed, which is of considerable significance. Figure 15 provides a scatter plot of HLD and CAI values to better analyze this correlation. As can be seen, there is a noticeable linear relationship between HLD and CAI values. This matters because establishing a dependence between HLD and CAI could facilitate the development of a unified predictive model capable of simultaneously estimating both HLD and CAI values. To better understand their relationship, HLD and CAI values were categorized into three distinct groups using an approach similar to the one used in the train and test split stage. A three-color system was used to distinguish between different classes of CAI, whereas HLD classes were identified by varying marker sizes: the bigger the marker, the higher the HLD value. Despite the notable correlation, the relationship between HLD and CAI values is not clear-cut. For instance, in the range 450 to 650 of HLD value, associated with medium hardness, CAI values range from 0.5 to 5, comprising all CAI classes. Therefore, separate predictive models were developed for each target variable instead of developing a unified predictive model capable of estimating both HLD and CAI values simultaneously.

2.6. Developing Predictive Models

The random forest regressor (RFR) algorithm was employed to develop predictive models for HLD and CAI values based on hyperspectral data. This section explains the theoretical concept behind the RFR algorithm and the steps taken to tune it.
A random forest (RF) algorithm is an ensemble learning algorithm that can handle both classification and regression tasks [54]. The term ensemble learning refers to a method for combining predictions from multiple machine learning algorithms, called base learners. In the case of the RF algorithm, numerous decision trees are applied to different subsets of a dataset, leading to higher precision estimations. Using multiple trees overcomes the instability and substantial variability that can arise when using individual decision tree models, which can show inconsistent generalization behavior even with insignificant changes in the dataset [55]. In fact, by forming an ensemble model from independent individual trees, one could expect more robust predictions.
For each decision tree within the RF algorithm, the data are recursively divided into more homogeneous groups, called nodes, aiming to enhance the predictability of the response variable [56]. A split is determined by the values of predictor variables with significant explanatory factors. Then, an RF model is constructed by training multiple decision trees on different bootstrapped subsets of the data. In the case of the regression analysis, the final predicted value is the mean of the fitted responses from all the individual decision trees generated by each bootstrapped sample. The foundational principles and formulations underlying the RFR are elaborated upon by [54]. Figure 16 schematically demonstrates the structure of the RF regressor algorithm.
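The bootstrap-and-average idea described above can be illustrated with a deliberately simplified sketch. It is not the RFR implementation used in the study: each "tree" here is a single-split regression stump on one feature, the random feature sub-sampling used by random forests [54] is omitted, and the data are hypothetical.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # exclude the max so the right node is never empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((y - left_mean) ** 2 for y in left) + sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x in this bootstrap sample is identical: predict the mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def fit_bagged_ensemble(xs, ys, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict(trees, x):
    """Regression output: the mean of the individual tree predictions."""
    return mean(tree(x) for tree in trees)

# Hypothetical one-feature dataset (not data from the study).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
trees = fit_bagged_ensemble(xs, ys)
low, high = predict(trees, 2), predict(trees, 7)
```

Averaging over many bootstrapped learners is what dampens the sensitivity of any single tree to small changes in the training data.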
The performance of machine learning algorithms is directly controlled by the hyperparameters, and one must first find the optimal hyperparameter values to obtain the best performance on a given dataset. This process is called hyperparameter tuning and can be categorized into uninformed and informed methods [57]. Uninformed hyperparameter tuning involves systematically exploring a predefined set of hyperparameter values without leveraging any information about the model’s performance. There are different uninformed hyperparameter tuning methods, such as grid search and random search. In grid search, the hyperparameter space is discretized into a grid, and the model is trained and evaluated for each combination of hyperparameters. While grid search is exhaustive, it can be computationally expensive and may not be efficient when dealing with high-dimensional hyperparameter spaces. To address this problem, a random search approach was proposed, based on the observation that, for most datasets, only a few hyperparameters matter. However, Bergstra and Bengio [58] showed that random search becomes unreliable as the complexity of the model increases.
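The difference between the two uninformed strategies can be sketched as follows. The objective `validation_error` is a hypothetical stand-in for a cross-validated model error, and the hyperparameter names `n_estimators` and `max_depth` are assumptions chosen for illustration; the source does not list the hyperparameters that were tuned.

```python
import itertools
import random

def validation_error(n_estimators, max_depth):
    """Hypothetical stand-in for a cross-validated error; lower is better."""
    return (n_estimators - 300) ** 2 / 1e4 + (max_depth - 12) ** 2

def grid_search(grid):
    """Evaluate every combination in a discretized hyperparameter grid."""
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in itertools.product(*grid.values())]
    best = min(candidates, key=lambda p: validation_error(**p))
    return best, len(candidates)

def random_search(bounds, n_trials, seed=0):
    """Evaluate a fixed budget of uniformly sampled integer combinations."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.randint(low, high) for name, (low, high) in bounds.items()}
        for _ in range(n_trials)
    ]
    best = min(candidates, key=lambda p: validation_error(**p))
    return best, len(candidates)

grid = {"n_estimators": [100, 200, 300, 400, 500], "max_depth": [4, 8, 12, 16]}
best_grid, n_grid = grid_search(grid)
best_random, n_random = random_search(
    {"n_estimators": (100, 500), "max_depth": (4, 16)}, n_trials=10
)
```

The grid requires one evaluation per combination (5 × 4 = 20 here), so its cost multiplies with every added hyperparameter, whereas the random-search budget is fixed in advance. Neither strategy uses earlier evaluations to choose the next candidate.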
Informed hyperparameter tuning, on the other hand, utilizes information gained during the tuning process to guide the search more efficiently. Bayesian optimization (BO) is a popular method in this category. In this situation, the tuning process can be considered an optimization problem, and since the objective function is unknown, traditional techniques, such as gradient descent, cannot be applied. BO is a highly effective technique for solving optimization problems that do not have a closed-form objective function [59,60]. Optimization involves using prior information about the unknown objective function and sample information to determine the posterior distribution of the function. The objective function can eventually be optimized based on this posterior information [61]. This study uses BO based on the Gaussian process to tune the hyperparameters of the RFR algorithm. A complete guide on how BO has been used in this study for hyperparameter tuning can be found in [57]. The following section explores the results of the predictive models and assesses their performance.

3. Results and Discussion

In this study, the RFR algorithm was used to develop models for predicting HLD and CAI values using VNIR and SWIR data. This section presents the obtained results for each model and then provides a comprehensive performance assessment of the models. To achieve this, several statistical evaluation indices, including the coefficient of determination (R2), root mean square error (RMSE), and variance accounted for (VAF) between the measured and predicted values, are employed for evaluating the accuracy of the predictive models. A model is considered ideal when R2 is 1, RMSE is 0, and VAF is 100%. The formulas for calculating these indices are presented in Equations (2)–(4).
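$$R^2 = 1 - \frac{SSR}{SST} \tag{2}$$
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(A_i - P_i\right)^2} \tag{3}$$
$$VAF = \left[1 - \frac{\mathrm{var}\left(A_i - P_i\right)}{\mathrm{var}\left(A_i\right)}\right] \times 100 \tag{4}$$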
where SSR, SST, $A_i$, $P_i$, and $N$ are the sum of squared residuals, the total sum of squares, the actual value of the ith sample, the predicted value of the ith sample, and the number of samples used for testing the models, respectively.
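The three indices can be computed as in the following minimal sketch. The function names and the sample values are illustrative assumptions, and SSR is taken to be the sum of squared differences between actual and predicted values.

```python
import math
from statistics import pvariance

def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SSR / SST."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def vaf(actual, predicted):
    """Variance accounted for, in percent."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    # The population/sample variance choice cancels in the ratio (same N in both terms).
    return (1.0 - pvariance(residuals) / pvariance(actual)) * 100.0

# Hypothetical values: every prediction is biased upward by a constant 0.5.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
scores = (r2_score(actual, predicted), rmse(actual, predicted), vaf(actual, predicted))
```

With this constant bias, R2 is 0.8 and RMSE is 0.5, while VAF remains 100% because the residuals have zero variance. This is one reason to report the three indices together rather than relying on VAF alone.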
As mentioned before, 32 records representing 20 percent of datasets, which were not incorporated in the development/training of the predictive models, were considered for testing the developed models. Figure 17 illustrates the performance evaluation for the developed models. A 1:1 plot between the actual and predicted values provides a concise and straightforward way to evaluate the performance of predictive models. The closer the data to the 1:1 line, the better the performance of the model. In addition, in the case of projecting actual and predicted values on horizontal and vertical axes, respectively, data points located under and above the 1:1 line represent under- and overestimation, respectively. As can be seen on the testing dataset, Figure 17b,e, the data are well scattered around the 1:1 line.
Regarding the performance comparison, the developed model for CAI, with an R2 of 0.79, performs slightly better than the developed model for HLD, with an R2 of 0.74. Considering the hyperspectral data as nondestructive remotely sensed data, the obtained results are promising. In addition, Figure 17c,f show the residual plots between actual and predicted values for the HLD and CAI models as a graphical means for analyzing the pattern of differences between actual and predicted values. In the ideal case, the residuals should be randomly distributed around the line residual value = 0, and no clear pattern should be observed. Evidently, no pattern can be observed in either case, confirming that the predictive models are performing well and that the residuals are randomly distributed around zero.
The impact of input features on the performance of the developed models for predicting HLD and CAI values was studied using the concentration approach by SHAP analysis, which is a method that explains the output of machine learning models by attributing the contribution of each feature to the final prediction. It uses game theory to fairly distribute the “credit” of the prediction among the input features, making it easier to understand which features are most influential [62]. The obtained results could be used as the predictive power indicator of input features. With this, we can gain more insight into the problem and select features when there are too many variables for further studies. Figure 18 shows the results of SHAP analysis on the predictive models for the top 15 features in each case. Red represents the lower value, and light green represents the higher value of each input feature. Regarding the degree of importance, the wider the distribution of the input feature, the greater the impact on the prediction. It should also be noted that the top 15 features are sorted from most to least predictive on the vertical axis, from top to bottom. Based on Figure 18, “F_25”, “F_6”, “F_23”, and “F_13” are among the most influential input features for both HLD and CAI. As can be seen, the higher the “F_25”, the higher both the predicted HLD and CAI values.
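The game-theoretic attribution behind SHAP [62] can be illustrated by computing exact Shapley values for a tiny model. This is a sketch under stated assumptions: `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical, and absent features are represented by their baseline value. It is not the procedure applied to the RFR models in this study, for which exact enumeration over all 28 features would be impractical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their actual value; the rest stay at baseline.
        return model([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi

def toy_model(features):
    """Hypothetical model: one additive term and one interaction term."""
    f1, f2, f3 = features
    return 2.0 * f1 + f2 * f3

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

Here the additive feature receives its full contribution (2.0), the two interacting features share their joint contribution equally (3.0 each), and the attributions sum to the difference between the prediction and the baseline output (8.0).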
In addition, when referring to Table 2, the obtained band ratios from the feature extraction approach, one can easily see that the most important features are located within the SWIR region in both HLD and CAI cases. This aspect bears significance, as establishing accurate predictive models solely based on either VNIR or SWIR data could significantly enhance the practicality of the proposed method by reducing the dimensionality of the scanning data and speeding up the data processing. Therefore, an attempt was made to develop separate predictive models based solely on VNIR or SWIR and compare their performance with the initial case, i.e., predictive models based on VNIR and SWIR data. Figure 19 illustrates the R2 of different models on the testing dataset. Although the performance of the SWIR-based models is slightly lower than the initial case, the results of VNIR-based models are not within the acceptable range. In addition, for all three different situations, it was observed that the predictive models perform slightly better for CAI compared to HLD values, also highlighted in Table 3.
The hyperspectral data used in this study for developing predictive models were acquired in a controlled laboratory environment, with ideal illumination, no moisture content, no atmospheric effects, and constant scanning distance, which could differ from the field conditions. Because such parameters could significantly affect the quality of hyperspectral data [63], investigating their effects on the performance of the developed models is of vital importance for developing more robust predictive models based on field conditions. It is recognized that the conditions mentioned above might affect the results of the developed predictive models based on field data.
Illumination generally affects the spectral intensity and does not significantly alter the spectral shape. Odermatt and Gege [64] investigated the effects of sun angle on the spectral curve and showed that a change in sun angle could only result in a change in spectral intensity. Similar results also can be observed in [65]. On the other hand, Philpot and Tian [66] studied the effects of soil water content on the spectral reflectance curves and explained that while the higher water content results in a lower intensity of the spectral curve, the shape of the spectral curve is almost the same.
The other factor affecting the quality of the hyperspectral images is the scanning distance (H), i.e., the distance between the sensors and the scanning surface, as shown in Figure 4a. This distance is longer in the field and can affect the spatial resolution of the scanned data. This factor could affect the hyperspectral images in two ways. First, the longer the scanning distance, the larger the pixel size, which means more mixing of the spectral curves. Second, in field practices, increasing H would increase the potential for greater atmospheric effects. It must be noted that this study does not address the uncertainties associated with the atmospheric condition and only focuses on the effects of spatial resolution on the proposed approach. Regarding the effects of different pixel sizes, i.e., different scanning distances, a resampling approach was used to simulate scanning the collected rock samples from different values of H. The main reason for following such an approach is that it was not possible to increase the actual distance between the VS-620 and the rock samples in the lab due to the lab rack’s limitations.
The utilized resampling approach consists of three steps: (1) considering a specific window size, (2) averaging the spectral curves within the window size, and then (3) assigning the average spectral curve to the new pixel. Four different window sizes, ranging from 1 to 4, were considered for the resampling purposes, representing a scanning distance ranging from 3 m to 9 m. A window size of 1 means considering a tile of 3-by-3 pixels of the original image and assigning the mean spectral value of that tile to a pixel in the new image. In this case, the pixel size will increase from 0.57 mm by 0.57 mm in the original image to 1.71 mm by 1.71 mm in the resampled image. It must be noted that the window size of 1 simulates the hyperspectral scanning from a scanning distance of 3 m. More detailed information regarding the considered window sizes is mentioned in Table 4.
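The three resampling steps can be sketched as below, assuming non-overlapping tiles and a nested-list image layout (`cube[row][col]` holds one spectrum). The function name `resample_cube` and the toy image are hypothetical; the tile side length is passed explicitly because only the 3-by-3 case (window size 1) is specified in the text.

```python
def resample_cube(cube, tile):
    """Average spectra over non-overlapping tile-by-tile pixel blocks.

    cube[row][col] is a list of band values (one spectrum per pixel).
    Rows and columns that do not fill a complete tile are dropped.
    """
    n_rows, n_cols = len(cube) // tile, len(cube[0]) // tile
    n_bands = len(cube[0][0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            # Steps 1 and 2: collect the spectra inside the window and average them per band.
            block = [cube[r * tile + i][c * tile + j] for i in range(tile) for j in range(tile)]
            mean_spectrum = [sum(s[b] for s in block) / len(block) for b in range(n_bands)]
            # Step 3: assign the mean spectrum to one pixel of the new image.
            out_row.append(mean_spectrum)
        out.append(out_row)
    return out

# Hypothetical 3 x 6 image with 2 bands: the left tile and right tile differ.
cube = [[[1.0, 10.0] if col < 3 else [3.0, 30.0] for col in range(6)] for _ in range(3)]
coarse = resample_cube(cube, tile=3)

# Pixel size for the 3-by-3 case described in the text.
pixel_size_mm = 0.57 * 3
```

The resampled image has one pixel per tile, so the 0.57 mm pixels of the original image become 1.71 mm pixels for a 3-by-3 tile.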
The resampling approach was applied only to the testing dataset in order to see whether the lab-based predictive models could be applied in field conditions. In other words, the scanning distance was varied to see how it could impact the proposed feature extraction approach and the performance of the predictive models. Figure 20 shows the results of the resampling approach on sample GS1-17. To better illustrate the effects of the resampling approach on the spectral curve, Figure 21 compares the spectral curves of the specified spot in Figure 20 for the original and the different resampled images. It is worth noting that the spectral curves are shifted upward only for better comparison. As can be easily seen, the general trends of all spectral curves are the same, and the only obvious difference is that as the window size increases, the obtained spectral curves tend to be smoother. In addition, Figure 21 shows the four most influential extracted features of the proposed models (F_25, F_23, F_13, and F_6).
Afterward, the resulting hyperspectral images from the resampling approach were fed into the feature extraction approach, and then the obtained results were fed into the predictive models. Bar plots were used to compare the performance of predicted HLD and CAI values for different window sizes used in the resampling approach (Figure 22). No significant difference between the predicted values using original and resampled images can be observed, which confirms that the proposed model is not considerably affected by the scanning distance.

4. Conclusions

Rock hardness and abrasivity are among the most critical properties of rocks, and they can significantly impact the mine-to-mill process. Accurate and continuous estimation and understanding of the spatial variabilities of rock hardness and abrasivity within the ore deposit can lead to better planning and optimization of the entire mine-to-mill process by allowing informed decisions on equipment selection, process parameters, blending strategies, and maintenance schedules. Currently, there are no remote sensing methods to estimate rock hardness and abrasivity. Hence, this study utilized the RFR algorithm to predict rock hardness and abrasivity based on hyperspectral imaging as a rapid, nondisruptive, large-scale, and multipurpose data acquisition approach. To achieve this, 159 handpicked rock samples, mainly sedimentary and ultramafic rocks, were collected from six distinct blasted muck piles within a gold mine. The testing procedure consisted of two stages: (1) scanning samples using the hyperspectral VNIR and SWIR sensors, and (2) performing LRH and CAI tests to quantify the hardness and abrasivity of the rock samples. Due to the high dimensionality of the hyperspectral data (487 bands), a feature extraction approach based on K-means clustering and the band ratio concept was applied to the training dataset (80% of the samples), resulting in 28 band-ratio-based features.
The preliminary data assessment revealed a significant correlation between HLD and CAI. However, since their relationship was unclear, separate predictive models were developed for HLD and CAI. A Bayesian optimization technique based on the Gaussian process was used to tune the hyperparameters of the RFR algorithm. The performance of the proposed predictive models for HLD and CAI was checked using three statistical indices, including R2, RMSE, and VAF, in each case. The obtained results showed that the developed models can effectively predict rock hardness and abrasivity. However, the proposed model for CAI with an R2 of 0.79 performed slightly better than the model for HLD with an R2 of 0.74. Considering the hyperspectral data as nondestructive remotely sensed data, the obtained results are promising. SHAP analysis was employed to assess the effects of the input features on the performance of predictive models. The obtained results showed that “F_25”, “F_6”, “F_23”, and “F_13” are among the most influential input features for both HLD and CAI models, mainly located within the SWIR region. Therefore, an attempt was made to develop predictive models based solely on SWIR data, which resulted in a 10 percent reduction in R2 in both HLD and CAI cases. In the last step, the effects of scanning distance on the performance of proposed models were investigated. The results showed that the proposed feature extraction and predictive models are not affected by increasing the scanning distance up to 9 m. In future works, the effects of other factors, including moisture, dust, illumination, and different scanning distances, will be evaluated in the field.

Author Contributions

Conceptualization, S.G. and K.E.; methodology, S.G. and K.E.; software, S.G.; validation, S.G. and K.E.; formal analysis, S.G. and K.E.; investigation, S.G. and K.E.; resources, K.E.; data curation, S.G.; writing—original draft preparation, S.G.; writing—review and editing, K.E.; visualization, S.G.; supervision, K.E.; project administration, K.E.; funding acquisition, K.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science and Engineering Research Council of Canada (NSERC), grant number ALLRP 561062–20.

Data Availability Statement

The datasets presented in this article are not readily available because they are part of an ongoing study and require permission from the mining company.

Acknowledgments

The authors would like to acknowledge Weir Group and the Natural Science and Engineering Research Council of Canada (NSERC) for their financial support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Thuro, K.; Singer, J.; Kasling, H.; Bauer, M. Soil Abrasivity Assessment Using the LCPC Testing Device. Felsbau 2006, 24, 37–45.
  2. Teymen, A. The Usability of Cerchar Abrasivity Index for the Estimation of Mechanical Rock Properties. Int. J. Rock Mech. Min. Sci. 2020, 128, 104258.
  3. Deketh, H. Wear of Rock Cutting Tools: Laboratory Experiments on the Abrasivity of Rock, 1st ed.; CRC Press: Rotterdam, The Netherlands, 1995.
  4. Hui, S. Collaboration to Reduce Wear and Corrosion Cost for the Mining Industry; National Research Council Canada: Vancouver, BC, Canada, 2018.
  5. Demirdag, S.; Yavuz, H.; Altindag, R. The Effect of Sample Size on Schmidt Rebound Hardness Value of Rocks. Int. J. Rock Mech. Min. Sci. 2009, 46, 725–730.
  6. Ghorbani, S.; Hoseinie, S.H.; Ghasemi, E.; Sherizadeh, T. A Review on Rock Hardness Testing Methods and Their Applications in Rock Engineering. Arab. J. Geosci. 2022, 15, 1067.
  7. Gomez-Heras, M.; Benavente, D.; Pla, C.; Martinez-Martinez, J.; Fort, R.; Brotons, V. Ultrasonic Pulse Velocity as a Way of Improving Uniaxial Compressive Strength Estimations from Leeb Hardness Measurements. Constr. Build. Mater. 2020, 261, 119996.
  8. Ince, I.; Bozdag, A. An Investigation on Sample Size in Leeb Hardness Test and Prediction of Some Index Properties of Magmatic Rocks. Arab. J. Geosci. 2021, 14, 182.
  9. Benavente, D.; Fort, R.; Gomez-Heras, M. Improving Uniaxial Compressive Strength Estimation of Carbonate Sedimentary Rocks by Combining Minimally Invasive and Non-Destructive Techniques. Int. J. Rock Mech. Min. Sci. 2021, 147, 104915.
  10. Zhang, W.; Wang, Z.; Shi, Z.; Xu, P.; Chang, Z. Influence Mechanism of High Temperature on Drilling Rate and Hardness of Sandstone. Nat. Resour. Res. 2022, 31, 2589–2601.
  11. Garrido, M.E.; Petnga, F.B.; Martínez-Ibáñez, V.; Serón, J.B.; Hidalgo-Signes, C.; Tomás, R. Predicting the Uniaxial Compressive Strength of a Limestone Exposed to High Temperatures by Point Load and Leeb Rebound Hardness Testing. Rock Mech. Rock Eng. 2022, 55, 1–17.
  12. Akbay, D.; Ekincioğlu, G. Estimating the Brittleness Values of Carbonated Rocks with Shore, Schmidt, and Leeb Hardness Values. Environ. Earth Sci. 2022, 81, 206.
  13. Ghorbani, S.; Hoseinie, S.H.; Ghasemi, E.; Sherizadeh, T. Application of Leeb Hardness Test in Prediction of Dynamic Elastic Constants of Sedimentary and Igneous Rocks. Geotech. Geol. Eng. 2022, 40, 3125–3145.
  14. Çelik, S.B.; Çobanoğlu, İ. Comparative Investigation of Shore, Schmidt, and Leeb Hardness Tests in the Characterization of Rock Materials. Environ. Earth Sci. 2019, 78, 554.
  15. Alber, M.; Yaralı, O.; Dahl, F.; Bruland, A.; Käsling, H.; Michalakopoulos, T.N.; Cardu, M.; Hagan, P.; Aydın, H.; Özarslan, A. ISRM Suggested Method for Determining the Abrasivity of Rock by the CERCHAR Abrasivity Test. Rock Mech. Rock Eng. 2013, 47, 261–266.
  16. Rostami, J.; Ghasemi, A.; Alavi Gharahbagh, E.; Dogruoz, C.; Dahl, F. Study of Dominant Factors Affecting Cerchar Abrasivity Index. Rock Mech. Rock Eng. 2014, 47, 1905–1919.
  17. Yarali, O.; Yaşar, E.; Bacak, G.; Ranjith, P.G. A Study of Rock Abrasivity and Tool Wear in Coal Measures Rocks. Int. J. Coal Geol. 2008, 74, 53–66.
  18. Hoseinie, S.H.; Ataei, M.; Mikaiel, R. Comparison of Some Rock Hardness Scales Applied in Drillability Studies. Arab. J. Sci. Eng. 2012, 37, 1451–1458.
  19. Majeed, Y.; Abu Bakar, M.Z.; Butt, I.A. Abrasivity Evaluation for Wear Prediction of Button Drill Bits Using Geotechnical Rock Properties. Bull. Eng. Geol. Environ. 2020, 79, 767–787.
  20. Jamal, A.; Kumar, R.; Singh, R.; Singh, N.; Digarse, A.K. Determination Of Longevity Of Teeth In Buckets Of Loading Equipment In Coal Mines—A Case Study. Int. J. Sci. Technol. Res. 2016, 5, 25–35.
  21. Haffez, G.S.A. Correlation between Work Index and Mechanical Properties of Some Saudi Ores. Mater. Test. 2012, 54, 108–112.
  22. van Duijvenbode, J.R.; Cloete, L.M.; Shishvan, M.S.; Buxton, M.W.N. Material Fingerprinting as a Tool to Investigate between and within Material Type Variability with a Focus on Material Hardness. Min. Eng. 2022, 189, 107885.
  23. Maras, E.E.; Caniberk, M.; Odabas, M.S.; Degerli, B.; Maras, S.S.; Maras, H.H. An Evaluation of the Relationship between Physical/Mechanical Properties and Mineralogy of Landscape Rocks as Determined by Hyperspectral Reflectance. Arab. J. Geosci. 2016, 9, 164.
  24. Yang, B.; Wang, S.; Li, S.; Zhou, B.; Zhao, F.; Ali, F.; He, H. Research and Application of UAV-Based Hyperspectral Remote Sensing for Smart City Construction. Cogn. Robot. 2022, 2, 255–266.
  25. Lypaczewski, P.; Rivard, B.; Gaillard, N.; Perrouty, S.; Piette-Lauzière, N.; Bérubé, C.L.; Linnen, R.L. Using Hyperspectral Imaging to Vector towards Mineralization at the Canadian Malartic Gold Deposit, Québec, Canada. Ore Geol. Rev. 2019, 111, 102945.
  26. Gaillard, N.; Williams-Jones, A.E.; Clark, J.R.; Lypaczewski, P.; Salvi, S.; Perrouty, S.; Piette-Lauzière, N.; Guilmette, C.; Linnen, R.L. Mica Composition as a Vector to Gold Mineralization: Deciphering Hydrothermal and Metamorphic Effects in the Malartic District, Quebec. Ore Geol. Rev. 2018, 95, 789–820.
  27. Dalm, M.; Buxton, M.W.N.; van Ruitenbeek, F.J.A. Ore–Waste Discrimination in Epithermal Deposits Using Near-Infrared to Short-Wavelength Infrared (NIR-SWIR) Hyperspectral Imagery. Math. Geosci. 2019, 51, 849–875.
  28. Tuşa, L.; Kern, M.; Khodadadzadeh, M.; Blannin, R.; Gloaguen, R.; Gutzmer, J. Evaluating the Performance of Hyperspectral Short-Wave Infrared Sensors for the Pre-Sorting of Complex Ores Using Machine Learning Methods. Miner. Eng. 2020, 146, 106150.
  29. Mathieu, M.; Roy, R.; Launeau, P.; Cathelineau, M.; Quirt, D. Alteration Mapping on Drill Cores Using a HySpex SWIR-320m Hyperspectral Camera: Application to the Exploration of an Unconformity-Related Uranium Deposit (Saskatchewan, Canada). J. Geochem. Explor. 2017, 172, 71–88.
  30. Abdolmaleki, M.; Consens, M.; Esmaeili, K. Ore-Waste Discrimination Using Supervised and Unsupervised Classification of Hyperspectral Images. Remote Sens. 2022, 14, 6386.
  31. Akbar, S.; Abdolmaleki, M.; Ghadernejad, S.; Esmaeili, K. Applying Knowledge-Based and Data-Driven Methods to Improve Ore Grade Control of Blast Hole Drill Cuttings Using Hyperspectral Imaging. Remote Sens. 2024, 16, 2823.
  32. Schaefer, L.N.; Kereszturi, G.; Villeneuve, M.; Kennedy, B. Determining Physical and Mechanical Volcanic Rock Properties via Reflectance Spectroscopy. J. Volcanol. Geotherm. Res. 2021, 420, 107393.
  33. Schaefer, L.N.; Kereszturi, G.; Ben, K.M.; Villeneuve, M. Characterizing Lithological, Weathering, and Hydrothermal Alteration Influences on Volcanic Rock Properties via Spectroscopy and Laboratory Testing: A Case Study of Mount Ruapehu Volcano, New Zealand. Bull. Volcanol. 2023, 85, 43.
  34. Kereszturi, G.; Heap, M.; Schaefer, L.N.; Darmawan, H.; Deegan, F.M.; Kennedy, B.; Komorowski, J.C.; Mead, S.; Rosas-Carbajal, M.; Ryan, A.; et al. Porosity, Strength, and Alteration—Towards a New Volcano Stability Assessment Tool Using VNIR-SWIR Reflectance Spectroscopy. Earth Planet Sci. Lett. 2023, 602, 117929.
  35. Lee, S.J.; Jeong, G.C.; Kim, J.T. Analysis and Comparison of Rock Spectroscopic Information Using Drone-Based Hyperspectral Sensor. J. Eng. Geol. 2021, 31, 479–492.
  36. Bakun-Mazor, D.; Ben-Ari, Y.; Notesko, G.; Marco, S.; Ben-Dor, E. Measuring Carbonate Rock Strength Using Spectroscopy across the Optical and Thermal Region. In Proceedings of the IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2021; Volume 833.
  37. Okada, N.; Maekawa, Y.; Owada, N.; Haga, K.; Shibayama, A.; Kawamura, Y. Automated Identification of Mineral Types and Grain Size Using Hyperspectral Imaging and Deep Learning for Mineral Processing. Minerals 2020, 10, 809.
  38. Bakun-Mazor, D.; Ben-Ari, Y.; Marco, S.; Ben-Dor, E. Predicting Mechanical Properties of Carbonate Rocks Using Spectroscopy across 0.4–12 μm. Rock Mech. Rock Eng. 2024, 1–18.
  39. Stead, D.; Donati, D.; Wolter, A.; Sturzenegger, M. Application of Remote Sensing to the Investigation of Rock Slopes: Experience Gained and Lessons Learned. ISPRS Int. J. Geoinf. 2019, 8, 296.
  40. HySpex Mjolnir VS-620 Configuration. Available online: https://www.hyspex.com/hyspex-products/hyspex-mjolnir/hyspex-mjolnir-vs-620/ (accessed on 20 May 2024).
  41. Leeb, D. Dynamic Hardness Testing of Metallic Materials. NDT Int. 1979, 12, 274–278.
  42. Aoki, H.; Matsukura, Y. A New Technique for Non-Destructive Field Measurement of Rock-Surface Strength: An Application of the Equotip Hardness Tester to Weathering Studies. Earth Surf. Process Landf. 2007, 32, 1759–1769.
  43. Desarnaud, J.; Kiriyama, K.; Bicer Simsir, B.; Wilhelm, K.; Viles, H. A Laboratory Study of Equotip Surface Hardness Measurements on a Range of Sandstones: What Influences the Values and What Do They Mean? Earth Surf. Process. Landf. 2019, 44, 1419–1429.
  44. Bhuiyan, M.; Esmaeili, K.; Ordóñez-Calderón, J.C. Evaluation of Rock Characterization Tests as Geometallurgical Predictors of Bond Work Index at the Tasiast Mine, Mauritania. Min. Eng. 2022, 175, 107293.
  45. Ghadernejad, S.; Esmaeili, K. The Application of Small Sample Theory and Confidence Interval Method to Determine the Representative Mean Leeb Rebound Hardness Value. Bull. Eng. Geol. Environ. 2024, 83, 25.
  46. Houshmand, N.; Esmaeili, K.; Goodfellow, S.; Carlos Ordóñez-Calderón, J. Predicting Rock Hardness Using Gaussian Weighted Moving Average Filter on Borehole Data and Machine Learning. Min. Eng. 2023, 204, 108448.
  47. ASTM A956; Standard Test Method for Leeb Hardness Testing of Steel Products. ASTM: West Conshohocken, PA, USA, 2017.
  48. Proceq, S.A. Equotip 3 Portable Hardness Tester, Operating Instructions; Proceq S.A: Schwerzenbach, Switzerland, 2007.
  49. Alber, M. Stress Dependency of the Cerchar Abrasivity Index (CAI) and Its Effects on Wear of Selected Rock Cutting Tools. Tunn. Undergr. Space Technol. 2008, 23, 351–359.
  50. Plinninger, R.; Kasling, H.; Thuro, K. Wear Prediction in Hardrock Excavation Using the CERCHAR Abrasiveness Index (CAI). In Proceedings of the Eurock 2004 and 53rd Geomechanics Colloquium, Salzburg, Austria, 7–9 October 2004; pp. 599–604.
  51. AFNOR. Determination du Pouvoir Abrasif d’une Roche-Partie 1: Essai de Rayure Avec une Pointe (NF P 94-430-1); Association française de Normalisation: Paris, France, 2000.
  52. ASTM D7625-10; Standard Test Method for Laboratory Determination of Abrasiveness of Rock Using the CERCHAR Method. ASTM: West Conshohocken, PA, USA, 2010.
  53. Ghadernejad, S.; Esmaeili, K. Investigating The Relationship Between Geochemistry, Leeb Rebound Hardness, and Cerchar Abrasivity Index. Int. J. Geomech. 2024, 24, 04024280.
  54. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  55. Brown, G. Ensemble Learning. In Encyclopedia of Machine Learning; Springer: Boston, MA, USA, 2011; pp. 312–320.
  56. Berhane, T.M.; Lane, C.R.; Wu, Q.; Autrey, B.C.; Anenkhonov, O.A.; Chepinoga, V.V.; Liu, H. Decision-Tree, Rule-Based, and Random Forest Classification of High-Resolution Multispectral Imagery for Wetland Mapping and Inventory. Remote Sens. 2018, 10, 580.
  57. Wu, J.; Chen, X.Y.; Zhang, H.; Xiong, L.D.; Lei, H.; Deng, S.H. Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization. J. Electron. Sci. Technol. 2019, 17, 26–40.
  58. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305.
  59. Betrò, B. Bayesian Methods in Global Optimization. J. Glob. Optim. 1991, 1, 1–14.
  60. Turner, R.; Eriksson, D.; Mccourt, M.; Kiili, J.; Xu, V.Z.; Escalante, H.J.; Hofmann, K. Bayesian Optimization Is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020. Proc. Mach. Learn. Res. 2021, 133, 3–26.
  61. Jones, D.R. A Taxonomy of Global Optimization Methods Based on Response Surfaces. J. Glob. Optim. 2001, 21, 345–383.
  62. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: New York, NY, USA, 2017; pp. 4766–4775.
  63. Choros, K.A.; Job, A.T.; Edgar, M.L.; Austin, K.J.; Mcaree, P.R. Can Hyperspectral Imaging and Neural Network Classification Be Used for Ore Grade Discrimination at the Point of Excavation? Sensors 2022, 22, 2684.
  64. Odermatt, D.; Gege, P. Lake Colors: Interpreting Apparent Optical Properties. Encycl. Inland Waters 2022, 1, 474–489.
  65. Manea, D.; Calin, M.A. Hyperspectral Imaging in Different Light Conditions. Imaging Sci. J. 2015, 63, 214–219.
  66. Philpot, W.; Tian, J. The Hyperspectral Soil Line: A Preliminary Description. In Proceedings of the Light, Energy and the Environment; Optica Publishing Group: Leipzig, Germany, 2016.
Figure 1. The rock hardness and rock abrasivity footprints on the entire mine-to-mill process.
Figure 2. Workflow of developing predictive models for rock hardness and abrasivity using hyperspectral data.
Figure 3. Data collection steps: (a) hyperspectral imaging system, (b) LRH test, (c) CAI test.
Figure 4. The schematic illustration of (a) the hyperspectral scanning using HySpex VS-620 and (b) the search algorithm for stacking VNIR and SWIR images.
Figure 5. The results of the preprocessing of hyperspectral data.
Figure 6. Boxplots of (a) mean HLD value and (b) mean CAI value of different sampling locations.
Figure 7. Distribution of (a) mean HLD value and (b) mean CAI value with respect to the considered thresholds.
Figure 8. (a) The ratio of HLD classes within the training dataset, (b) the ratio of HLD classes within the testing dataset, (c) the ratio of CAI classes within the training dataset, and (d) the ratio of CAI classes within the testing dataset.
Figure 9. True color illustration of the mosaic images (training dataset: 127 rock samples).
Figure 10. Visualization of the elbow method used for determining the optimum number of clusters.
Figure 11. The mean spectral curves of the K-means analysis with 7 clusters (training dataset).
Figure 12. The process of feature extraction on the dominant spectral curves obtained from the K-means clustering analysis.
Figure 13. Pairwise correlation matrix of different band ratios along with HLD and CAI values on the training dataset (red boxes highlight the potential relationship between band ratios, HLD, and CAI values).
Figure 14. The distribution of the last three selected band ratios, color-coded based on the HLD and CAI classes.
Figure 15. The relationship between HLD and CAI values [53].
Figure 16. The structure of the RFR algorithm.
Figure 17. The performance evaluation of the developed models: (a) 1:1 plot for the HLD model on the training dataset, (b) 1:1 plot for the HLD model on the testing dataset, (c) residual plot for the HLD model on the testing dataset, (d) 1:1 plot for the CAI model on the training dataset, (e) 1:1 plot for the CAI model on the testing dataset, (f) residual plot for the CAI model on the testing dataset.
Figure 18. SHAP feature importance analysis in (a) the HLD model and (b) the CAI model.
Figure 19. Comparison of R² values for different predictive models (VNIR, SWIR, VNIR-SWIR) for HLD and CAI.
Figure 20. The results of the resampling approach on sample GS1-17: (a) original image, (b) resampling using a window size of 1, (c) resampling using a window size of 2, (d) resampling using a window size of 3, and (e) resampling using a window size of 4.
Figure 21. The spectral curve comparison between the original image and resampled ones for the specified spot in Figure 20 for sample GS1-17.
Figure 22. The performance comparison of the developed models tested using the original and resampled data for (a) HLD and (b) CAI.
Table 1. The results of XRD analysis on the selected 34 rock samples.
SampleQBMChDCaPTMHAPo
Pile1-01131000630004730
Pile1-05311050031000446
Pile1-1019161210110003812
Pile1-32285820510003911
Pile1-48201210740002243
Pile2-012910137000000354
Pile2-02319245000000282
Pile2-042713152010000374
Pile2-08181612001000555
Pile3-01192360420004222
Pile3-0200023670513910
Pile3-0411701010417103
Pile3-0600024530594310
Pile3-09027040301025333
Pile3-1224240000000645
Pile4-01000258604351310
Pile4-0303502050207394
Pile4-04026010340934103
Pile4-0513340030000725
Pile4-08233600100005313
Pile4-1010330031000708
Pile5-01133343311231000
Pile5-023012133000000374
Pile5-03281232000000475
Pile5-040102410001164800
Pile5-09429122511362000
Pile5-102202461016531130
Pile5-22000277013371600
Pile5-3701008000007612
Pile5-42351185031000315
Pile6-0105022840484910
Pile6-0205701904019002
Pile6-08040199504032000
Pile6-1306025050335601
Q: quartz, B: biotite, M: muscovite, Ch: chlorite, D: dolomite, Ca: calcite, P: pyrite, T: talc, M: magnetite, H: hornblende, A: albite, Po: potassium feldspar (all in percent).
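Table 2. The obtained band ratios from the feature extraction approach.

| Features | Band1 | Band2 | Source | Features | Band1 | Band2 | Source |
|---|---|---|---|---|---|---|---|
| F_1 | 437 | 440 | VNIR | F_15 | 1420 | 1804 | SWIR |
| F_2 | 481 | 487 | VNIR | F_16 | 1880 | 1906 | SWIR |
| F_3 | 440 | 487 | VNIR | F_17 | 1983 | 2075 | SWIR |
| F_4 | 536 | 545 | VNIR | F_18 | 2075 | 2116 | SWIR |
| F_5 | 606 | 612 | VNIR | F_19 | 2116 | 2152 | SWIR |
| F_6 | 551 | 603 | VNIR | F_20 | 2152 | 2213 | SWIR |
| F_7 | 621 | 734 | VNIR | F_21 | 2213 | 2223 | SWIR |
| F_8 | 734 | 804 | VNIR | F_22 | 2223 | 2249 | SWIR |
| F_9 | 804 | 900 | VNIR | F_23 | 2249 | 2274 | SWIR |
| F_10 | 900 | 975 | VNIR | F_24 | 2295 | 2310 | SWIR |
| F_11 | 975 | 1164 | SWIR | F_25 | 2310 | 2336 | SWIR |
| F_12 | 1164 | 1369 | SWIR | F_26 | 2336 | 2366 | SWIR |
| F_13 | 1369 | 1394 | SWIR | F_27 | 2387 | 2418 | SWIR |
| F_14 | 1394 | 1420 | SWIR | F_28 | 2464 | 2494 | SWIR |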
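
As an illustration of how the band pairs in Table 2 could be turned into model features, the following minimal Python sketch computes a band ratio from a single spectrum. It rests on several assumptions that are not stated in this section: each feature is taken to be the reflectance at Band1 divided by the reflectance at Band2, the band values are treated as wavelengths, and the sensor band closest to each listed value is used. The function names, the toy wavelength grid, and the reflectance values are hypothetical.

```python
from bisect import bisect_left


def nearest_index(wavelengths, target):
    """Return the index of the wavelength closest to target (list sorted ascending)."""
    i = bisect_left(wavelengths, target)
    if i == 0:
        return 0
    if i == len(wavelengths):
        return len(wavelengths) - 1
    # Pick whichever neighbour is closer to the target wavelength.
    return i if wavelengths[i] - target < target - wavelengths[i - 1] else i - 1


def band_ratio(wavelengths, reflectance, band1, band2):
    """Assumed feature definition: reflectance at band1 divided by reflectance at band2."""
    r1 = reflectance[nearest_index(wavelengths, band1)]
    r2 = reflectance[nearest_index(wavelengths, band2)]
    return r1 / r2


# Three of the 28 band pairs listed in Table 2.
BAND_PAIRS = {"F_1": (437, 440), "F_15": (1420, 1804), "F_28": (2464, 2494)}

# Toy spectrum (hypothetical values, for illustration only).
wavelengths = [437, 440, 1420, 1804, 2464, 2494]
reflectance = [0.10, 0.20, 0.30, 0.60, 0.50, 0.25]

features = {
    name: band_ratio(wavelengths, reflectance, b1, b2)
    for name, (b1, b2) in BAND_PAIRS.items()
}
```

In practice the same function would be applied to every pixel or to a per-sample mean spectrum; which of the two the study used is not stated in this section.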
Table 3. The performance assessment of HLD and CAI predictive models developed based on VNIR, SWIR, and VNIR-SWIR.

| Model Developed Based on | R² (HLD) | R² (CAI) | RMSE (HLD) | RMSE (CAI) | VAF (%) (HLD) | VAF (%) (CAI) |
|---|---|---|---|---|---|---|
| VNIR | 0.17 | 0.30 | 139.8 | 1.21 | 17 | 31 |
| SWIR | 0.66 | 0.68 | 88.4 | 0.81 | 67 | 68 |
| VNIR-SWIR | 0.74 | 0.79 | 78.2 | 0.66 | 75 | 80 |
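
For readers who want to reproduce the three metrics reported in Table 3, the sketch below gives their commonly used definitions in plain Python. The formulas are the standard ones (coefficient of determination, root mean square error, and variance accounted for as one minus the ratio of residual variance to target variance, in percent); that the authors used exactly these definitions is an assumption, and the sample values are made up.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target."""
    return sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def vaf(y_true, y_pred):
    """Variance accounted for, in percent: (1 - var(residuals) / var(y_true)) * 100."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1 - variance(residuals) / variance(y_true)) * 100


# Hypothetical example: predictions with a constant bias of +1.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]

metrics = {
    "R2": r2_score(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "VAF": vaf(y_true, y_pred),
}
```

The biased example shows why the metrics can disagree: a constant offset lowers R² and raises RMSE, while VAF stays at 100% because the residuals have zero variance.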
Table 4. The parameters considered in the employed resampling approach.

| Reshaped Image | Pixels within the Window | Equivalent H (m) | New Pixel Size (mm²) |
|---|---|---|---|
| Window size of 1 | 3 × 3 | 3 | 1.71 × 1.71 |
| Window size of 2 | 5 × 5 | 5 | 2.84 × 2.84 |
| Window size of 3 | 7 × 7 | 7 | 3.98 × 3.98 |
| Window size of 4 | 9 × 9 | 9 | 5.11 × 5.11 |
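
The window sizes in Table 4 follow the pattern of a (2k + 1) × (2k + 1) pixel window for window size k. The sketch below shows one way such a spatial resampling could be implemented for a single band, under the assumption that each new pixel is the mean of a non-overlapping window; the aggregation rule actually used in the study is not described in this section, and the function names and toy image are hypothetical.

```python
def window_side(window_size):
    """Side length in pixels of the window for a given window size k (Table 4: 2k + 1)."""
    return 2 * window_size + 1


def block_average(image, window_size):
    """Downsample a 2D band (list of rows) by averaging non-overlapping windows.

    Rows and columns that do not fill a complete window are dropped.
    """
    side = window_side(window_size)
    n_rows = len(image) // side
    n_cols = len(image[0]) // side
    resampled = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            block = [
                image[r * side + i][c * side + j]
                for i in range(side)
                for j in range(side)
            ]
            out_row.append(sum(block) / len(block))
        resampled.append(out_row)
    return resampled


# Toy 3 x 6 single-band image (hypothetical values): two 3 x 3 windows side by side.
image = [
    [1.0, 1.0, 1.0, 4.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 4.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 4.0, 4.0, 4.0],
]
coarse = block_average(image, window_size=1)
```

For a hyperspectral cube the same averaging would be applied to every band independently, so each coarse pixel keeps a full spectrum.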
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ghadernejad, S.; Esmaeili, K. Predicting Rock Hardness and Abrasivity Using Hyperspectral Imaging Data and Random Forest Regressor Model. Remote Sens. 2024, 16, 3778. https://doi.org/10.3390/rs16203778
