
An Enhanced Rock Mineral Recognition Method Integrating a Deep Learning Model and Clustering Algorithm

State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin University, Tianjin 300354, China
Development Research Center of China Geological Survey, Beijing 100037, China
Author to whom correspondence should be addressed.
Minerals 2019, 9(9), 516;
Received: 26 May 2019 / Revised: 5 August 2019 / Accepted: 23 August 2019 / Published: 26 August 2019


Rock mineral recognition is a costly and time-consuming task when using traditional methods, during which physical and chemical properties are tested at micro- and macro-scale in the laboratory. As a solution, a comprehensive recognition model of 12 kinds of rock minerals can be utilized, based upon the deep learning and transfer learning algorithms. In the process, the texture features of images are extracted and a color model for rock mineral identification can also be established by the K-means algorithm. Finally, a comprehensive identification model is made by combining the deep learning model and color model. The test results of the comprehensive model reveal that color and texture are important features in rock mineral identification, and that deep learning methods can effectively improve identification accuracy. To prove that the comprehensive model could extract effective features of mineral images, we also established a support vector machine (SVM) model and a random forest (RF) model based on Histogram of Oriented Gradient (HOG) features. The comparison indicates that the comprehensive model has the best performance of all.

1. Introduction

The classification and identification of rock minerals are indispensable in geological work. Traditional recognition methods rely on the physical and chemical properties of rock minerals, which are used to identify them at macro- and micro-scales [1,2]. With the development of computer science and artificial intelligence, recognition models can also be established using machine learning methods [3,4,5,6,7]. Traditional recognition methods have a definite physical meaning, whereas machine learning methods are driven by data [8,9]. Each approach has its own strengths and weaknesses.
The main strength of a traditional identification model is that it can be explained. Vassilev and Vassileva [10,11] analyzed the optical properties of low-temperature ash (LTA) and high-temperature ash (HTA) of coal samples to obtain the weight of each mineral on a crystalline matter basis. They then classified the inorganic matter in coal using the composition from the HTA and obtained good results. However, the experimental environment and operating procedures of this method are difficult to implement. Zaini [12] used wavelength position, linear spectral unmixing and a spectral angle mapper to explore carbonate mineral compositions and their distribution on the rock mineral surface, also with good results. However, Zaini's method requires a SisuCHEMA SWIR sensor integrated with a computer workstation to collect the data. Adep [13] developed an expert system for hyperspectral data classification with neural network technology; the system was also used to calculate the depth and distribution of iron ore in a specific engineering project. In each case, the researchers chose the key factors to build the model. However, professional knowledge and specialized equipment are necessary for the process, so building a traditional model is not always efficient or economical.
In recent years, machine learning methods have been applied in image recognition, including sensing images [14,15] and mineral images [16]. Multi-source and multi-type mineral datasets are suitable for machine learning methods [17,18]. Moreover, the application of machine learning to the identification of rock images has proved useful and efficient [19,20,21,22,23]. Aligholi et al. [24] automatically classified 45 slices of 15 kinds of rock minerals based on their color and achieved high accuracy. Li et al. [25] applied feature selection and transfer learning to classify microscopic images of sandstone. Feature engineering is important when establishing a machine learning model, but it is easily influenced by subjective factors. It is therefore necessary to develop an end-to-end mineral identification model.
At present, deep learning algorithms are widely developed and studied. Google introduced the design and application of a deep learning framework that makes it flexible to build deep learning models [26]. Deep learning methods have also become prominent in the recognition of rock mineral images. Zhang et al. [27] retrained Inception-v3 to identify granite, phyllite and breccia, and thereby showed that the Inception-v3 model is suitable for rock identification. A neuro-adaptive learning algorithm was also used to classify iron ore in real time [28]. Many researchers have studied the application of deep learning to geochemical mapping [29], mineral prospectivity mapping [30] and micro-scale images of quartz and resin [31]. These studies show that deep learning models perform well in processing and analyzing complex image data. However, a deep learning model is hard to improve on its own because its architecture is fixed; a model ensemble can be adopted to address this problem.
In this paper, a comprehensive model for rock mineral identification was developed by combining the Inception-v3 model and a color model. First, as shown in Section 3.1, the texture features of rock minerals in the image were extracted based on the color and brightness of the rock minerals. Then, the Inception-v3 model was retrained with these preprocessed images. Next, a color model was proposed (Section 3.2) to recognize the mineral images based on the result of the Inception-v3 model. Finally, the comprehensive identification model was established from the deep learning model and the color model (Section 4). We also constructed two models using SVM and RF based on Histogram of Oriented Gradient (HOG) features. The comparison of these models indicates that the comprehensive model achieves the best performance. This research uses elaborate mineral images whose features are obvious, because we want to study whether the comprehensive model can extract obvious features such as cleavage and luster. If effective features can be extracted, the model can be used in a larger domain.

2. Methodology

This research is based on deep learning, transfer learning and a clustering algorithm. An overview of the method is shown in Figure 1. The procedure is divided into two parallel processes. The first cuts the raw images (images without any pre-processing) into slices; T1 is then calculated and the color model is established from these image slices. T1 is broadcast to the second process at the same time. Once it has received T1, the second process calculates and outlines the binary points in the raw images. The processed images are then used for training with the transfer learning method based on the deep learning model. Finally, a comprehensive model is established by combining the Inception-v3 model and the color model. T and T1 are thresholds defined in Section 3.1.

2.1. Deep Learning Algorithm

Deep learning was proposed by Hinton and Salakhutdinov [32] and has been widely applied in semantic recognition [33], image recognition [34], object detection [35], and the medical field [36]. In image recognition, the end-to-end training mode is adopted in the training process. In each iteration of a training process, each layer can adjust its parameters according to the feedback from training data to improve the accuracy of the model. Moreover, the nonlinear function mapping ability is added to the deep learning model. As a consequence, it reduces the computational time for extracting features from images and improves the accuracy of the deep learning model.
In this paper, the Inception-v3 model [37] was used as the pre-trained model. At the front of the Inception-v3 network there are five convolutional layers and two max-pooling layers. To reduce the computational cost, each 5 × 5 convolution kernel is replaced by two 3 × 3 kernels. Eleven mixed layers follow, which can be divided into three categories, named block module 1, block module 2 and block module 3. Block module 1 has three repeated mixed layers, each connected by a concat layer. The input size of the first mixed layer is (35,35,192), and its output size is (35,35,256). The input and output sizes of the remaining mixed layers in block module 1 are (35,35,288). Batch size is the number of training samples in each step. At the front of block module 2 there is a small mixed layer, whose input size is (35,35,288) and whose output size is (17,17,768). The input and output sizes of the remaining mixed layers in block module 2 are (17,17,768). In block module 3, the input size is (17,17,768) and the output size is (8,8,2048). After the mixed layers, a convolutional layer, a mean pooling layer and a dropout layer follow to extract image features. Finally, the softmax classifier is trained at the end of the network.
In the process of convolution, a color image is converted into a 3D matrix including all RGB values; then, each pixel is evaluated using the kernel and pooling during each iteration. The average pooling layer and max-pooling layer could get the average value and max value of a matrix, respectively. Finally, the common features can be extracted from a large number of images. In the mixed layer, convolution and pooling are synchronized, which is able to achieve better performance in feature extraction.
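To make the pooling step concrete, the following minimal sketch applies non-overlapping 2 × 2 max pooling and average pooling to a small single-channel matrix. The function name `pool2d`, the window size and the toy input are illustrative assumptions; they are not taken from the Inception-v3 implementation.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows of a 2D array."""
    h, w = x.shape
    # Crop so that height and width are multiples of the window size.
    x = x[: h - h % size, : w - w % size]
    # View the array as (row_block, row_in_block, col_block, col_in_block).
    blocks = x.reshape(x.shape[0] // size, size, x.shape[1] // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

# Toy 4 x 4 "image" (hypothetical values).
image = np.array([[1, 2, 5, 6],
                  [3, 4, 7, 8],
                  [9, 8, 1, 0],
                  [7, 6, 3, 2]], dtype=float)

max_pooled = pool2d(image, mode="max")   # largest value in each 2 x 2 window
avg_pooled = pool2d(image, mode="avg")   # mean value of each 2 x 2 window
```

Each output cell summarizes one 2 × 2 window, so a 4 × 4 input becomes a 2 × 2 output in both modes.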

2.2. Transfer Learning Method

In transfer learning, the parameters of models trained in related domains are reused to establish a new model efficiently. In a new image recognition task, the parameters of a pre-trained model are set as the initial parameters. Transfer learning helps researchers avoid building a model from scratch and make full use of the data, which reduces the computational cost and the dependence on big datasets.
In this research, the Inception-v3 model was selected as the pre-trained model for rock mineral image recognition. A pre-trained deep learning model reduces the computational and time cost of training a new model, which is why transfer learning based on deep learning models has been widely used [38,39]. The original dataset of the Inception-v3 model contains about 1.2 million images from more than 1000 categories, and the model involves about 25 million parameters. With the transfer learning method, however, training based on this model can generally be finished quickly even on a computer without a GPU. Figure 2 shows the process of the transfer learning method.
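The idea of reusing a frozen network and training only a new classifier on top can be sketched in a few lines. This is a toy illustration under strong assumptions: a fixed random projection (`W_FROZEN`) stands in for the pre-trained Inception-v3 layers, the data are synthetic, and all names are hypothetical. It is not the retraining script used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen pre-trained network (an assumption for illustration):
# a fixed random projection followed by ReLU and L2 normalization.
W_FROZEN = rng.normal(size=(8, 16))

def extract_features(x):
    """Map raw inputs to fixed 'bottleneck' features; never updated in training."""
    f = np.maximum(x @ W_FROZEN, 0.0)
    return f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row maximum for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(features, onehot, W, b):
    probs = softmax(features @ W + b)
    return float(-(onehot * np.log(probs + 1e-12)).sum(axis=1).mean())

def train_top_layer(features, labels, n_classes, lr=0.5, steps=500):
    """Gradient descent on the new softmax layer only."""
    onehot = np.eye(n_classes)[labels]
    W = np.zeros((features.shape[1], n_classes))
    b = np.zeros(n_classes)
    for _ in range(steps):
        grad = (softmax(features @ W + b) - onehot) / len(labels)
        W -= lr * (features.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic data: three classes in an 8-dimensional input space.
centers = rng.normal(scale=3.0, size=(3, 8))
labels = np.repeat(np.arange(3), 30)
inputs = centers[labels] + rng.normal(scale=0.3, size=(90, 8))

features = extract_features(inputs)
onehot = np.eye(3)[labels]
loss_before = cross_entropy(features, onehot, np.zeros((16, 3)), np.zeros(3))
W_top, b_top = train_top_layer(features, labels, n_classes=3)
loss_after = cross_entropy(features, onehot, W_top, b_top)
```

Because only the small top layer is updated, each training step is a pair of matrix products over precomputed features, which is the reason such retraining stays cheap without a GPU.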

2.3. Clustering Algorithm

Clustering algorithms are widely applied in pattern recognition (PR) and are commonly based on the Euclidean distance, which measures how strongly the points of a dataset cluster by the absolute distance between them. The data can then be ranked according to these distances.
In this research, the color model is established based on the K-means algorithm. In the algorithm, the absolute distances of objects are calculated, and the objects that share the closest absolute distances are in the same category. Finally, a silhouette coefficient is used to evaluate the clustering effect:
$$S_i = \frac{b_i - a_i}{\max(a_i, b_i)}. \tag{1}$$
In Equation (1), $a_i$ is the average distance from the ith point to the other points of the same kind, which is called the intra-cluster dissimilarity; $b_i$ is the average distance from the ith point to the points of any other kind, which is called the inter-cluster dissimilarity. When the number of kinds is more than 3, the average value is taken as the silhouette coefficient. The silhouette coefficient lies in [−1, 1]. When training the color model, the model with the maximum silhouette coefficient is the one that is established.
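As a sketch of how Equation (1) can be evaluated, the function below computes the mean silhouette value for a labeled point set. It assumes the standard definition in which $b_i$ is the smallest mean distance from point i to the points of another cluster, and it assumes at least two clusters; the function name and the toy points are illustrative.

```python
import numpy as np

def silhouette_coefficient(points, labels):
    """Mean of S_i = (b_i - a_i) / max(a_i, b_i); assumes at least two clusters."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = []
    for i in range(len(points)):
        same = labels == labels[i]
        same[i] = False                      # exclude the point itself
        if not same.any():                   # singleton cluster: S_i taken as 0
            scores.append(0.0)
            continue
        a_i = dist[i, same].mean()           # intra-cluster dissimilarity
        # Inter-cluster dissimilarity: smallest mean distance to another cluster.
        b_i = min(dist[i, labels == c].mean()
                  for c in np.unique(labels) if c != labels[i])
        scores.append((b_i - a_i) / max(a_i, b_i))
    return float(np.mean(scores))

points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_coefficient(points, [0, 0, 1, 1])   # compact, well-separated clusters
bad = silhouette_coefficient(points, [0, 1, 0, 1])    # clusters mixed across the gap
```

A labeling that groups nearby points gives a value close to 1, while a labeling that splits them gives a negative value, which is why the maximum silhouette coefficient is a reasonable selection criterion.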

2.4. Support Vector Machine

Support Vector Machine (SVM) is a supervised learning method. Given a binary classification dataset $\{x_n, y_n\}_{n=1}^{N}$ with $x_n \in \mathbb{R}^{D}$ and $y_n \in \{-1, 1\}$, an SVM model can be established by the equation:
$$f(x) = \omega^{T} x + b. \tag{2}$$
The optimal weight vector $\omega$ and bias $b$ are obtained by minimizing the hinge loss function [40]. In this research, the Histogram of Oriented Gradient (HOG) algorithm was used to extract the features of rock mineral images, and an SVM model was then trained with these features.
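The role of the hinge loss can be illustrated with a small linear SVM trained by subgradient descent. This is a minimal sketch, not the configuration used in the paper: the L2 regularization weight, learning rate, number of epochs and toy data are all assumed values.

```python
import numpy as np

def hinge_loss(w, b, X, y, reg=0.01):
    """Regularized mean hinge loss for f(x) = w^T x + b with labels in {-1, +1}."""
    margins = y * (X @ w + b)
    return float(np.maximum(0.0, 1.0 - margins).mean() + 0.5 * reg * (w @ w))

def train_linear_svm(X, y, reg=0.01, lr=0.1, epochs=200):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0               # samples that violate the margin
        # Subgradient of the regularized hinge loss.
        grad_w = reg * w - (y[active][:, None] * X[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy, linearly separable data (hypothetical values).
X = np.array([[2.0, 2.0], [3.0, 1.0], [2.5, 3.0],
              [-2.0, -1.0], [-3.0, -2.0], [-1.5, -2.5]])
y = np.array([1, 1, 1, -1, -1, -1])

w, b = train_linear_svm(X, y)
predictions = np.sign(X @ w + b)
```

Only samples with a margin below 1 contribute to the update, which is what makes the final decision boundary depend on the support vectors rather than on every training point.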

2.5. Random Forest

Random Forest (RF) was proposed by Leo Breiman. The steps for establishing an RF model are as follows:
Step 1: Input T as the training dataset size and M as the number of features.
Step 2: Input the number of features used at each node (m, m << M) to get the decision result at a node in the tree.
Step 3: Randomly select K samples with replacement from a dataset N. Then, K classification trees are constructed from these selected samples. The rest of the dataset is used to test the model.
Step 4: Randomly select m features at each node and find the best split using the selected features.
Step 5: Each tree is grown completely without pruning, which might otherwise be applied when a normal tree classifier is built.
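The two sources of randomness in the steps above, bootstrap sampling and random feature selection, together with the final vote, can be sketched as follows. The sketch deliberately leaves out tree growing; the function names and the sizes used in the example are assumptions for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(n_samples, rng):
    """Step 3: draw n_samples indices with replacement; the rest are held out."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

def random_feature_subset(n_features, m, rng):
    """Step 4: pick m of the M features to consider at one node (m << M)."""
    return rng.sample(range(n_features), m)

def majority_vote(tree_predictions):
    """Combine the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(42)
in_bag, out_of_bag = bootstrap_sample(10, rng)
features = random_feature_subset(n_features=16, m=4, rng=rng)
label = majority_vote(["calcite", "malachite", "calcite"])
```

Samples that never appear in a tree's bootstrap sample form its held-out set, which corresponds to "the rest of the dataset" used for testing in Step 3.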

3. Algorithm Implementation

Rock minerals have unique physical properties, such as color, streak, transparency, aggregate shape and luster [41]. By analyzing their properties, rock minerals can be identified more accurately. Table 1 shows the properties of 12 kinds of rock minerals.
In Table 1, color is a key feature of rock minerals in rock mineral image recognition, because different compositions of rock minerals lead to different colors. Also, the composition of different rock minerals usually changes, and the colors vary widely. Some rock minerals with a single composition have a pure color. For example, cinnabar is usually red, malachite is green, and calcite is white. A rock mineral often shows a mixed color on the surface because of mixed composition, which is often indicated by binomial nomenclature, such as lime-green. However, whether a particular type of rock mineral has a pure or mixed composition, those in that classification always share some certain kinds of colors. Therefore, color is an important classification criterion.
The texture is also an important feature for rock mineral image recognition. The main components of rock mineral texture are aggregate shape and cleavage. Different rock minerals have different distinctive textures. For example, cinnabar’s aggregate is granular, with perfect cleavage and uneven to sub-conchoidal fracture; aquamarine’s aggregate is cluster form, with imperfect cleavage and conchoidal to irregular fracture; malachite’s aggregate is multi-form, as shown in Figure 3. The texture is also helpful in recognizing rock minerals. Therefore, the combination of color and texture can improve the accuracy of the rock mineral recognition model.

3.1. Mineral Texture Feature Extraction

The visual features of rock minerals also include the texture of rock mineral aggregates and cleavage. In light conditions, the reflection of different cleavages is different. Meanwhile, the reflection of a given rock mineral follows a pattern because of its own characteristics. Therefore, the texture of rock minerals can be extracted based on the brightness and color variation of images.
In an area where the brightness changes significantly in the image, the rock mineral texture can be outlined. According to the RGB color system, color images can be converted to grayscale to show brightness variations more effectively. The conversion rule [43] is as follows:
$$gray = 0.299 \times R + 0.587 \times G + 0.114 \times B, \tag{3}$$
where, gray is the grayscale value of a pixel, and R, G and B are the three primary color values of the pixel. The gray value of the image can be calculated by Equation (3). In the light condition, the brightness in the opaque area of a rock mineral image is distinctly different from that of the surrounding pixels. If the rock mineral’s luster is brilliant in the image, some pixels achieve a higher brightness value than that of their surrounding area, while the rest of the pixels do not. If the luster of a rock mineral is dull in the image, which indicates that it has a low reflection, pixels achieve a lower brightness value than their surrounding pixels, as shown in Figure 4.
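As a quick numerical check of Equation (3), the sketch below converts a few pure colors to grayscale. The function name `to_gray` and the sample pixels are illustrative assumptions.

```python
import numpy as np

def to_gray(rgb):
    """Apply gray = 0.299 R + 0.587 G + 0.114 B to an (H, W, 3) RGB array."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.asarray(rgb, dtype=float) @ weights

# A 2 x 2 image: white, pure red, pure green, pure blue.
pixels = np.array([[[255, 255, 255], [255, 0, 0]],
                   [[0, 255, 0], [0, 0, 255]]])
gray = to_gray(pixels)
```

Because the three weights sum to 1, white stays at 255, while green contributes most to brightness and blue least.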
Under the same lighting conditions, the brightness value fluctuates around a fixed value, and the magnitude of this fluctuation is denoted $|\Delta Z_i|$. Next to a texture boundary, the brightness value changes dramatically and the variation is far larger than $|\Delta Z_i|$, namely $|\Delta Z| \gg |\Delta Z_i|$. The variation for a pixel is calculated as:
$$\Delta Z_i = gray_i - \overline{gray}, \tag{4}$$
where $\Delta Z_i$ is the variation of brightness for each pixel, and the grayscale value of each pixel is determined by Equation (3). Moreover, $\overline{gray}$ is the mean grayscale value of the central pixel and the 8 pixels around it. To eliminate edge influence, the maximum and minimum are removed, as shown in Equations (5) and (6).
$$\overline{gray} = \left( \sum_{i=1}^{n=9} gray_i - gray_{max} - gray_{min} \right) / 7, \tag{5}$$
$$|\Delta Z_i'| = \left| \, |\Delta Z_i| - \overline{|\Delta Z|} \, \right| = \left| \, |\Delta Z_i| - \frac{1}{7} \sum_{k=1}^{7} |\Delta Z_k| \, \right|, \tag{6}$$
where $gray_i$ is the grayscale value of the ith pixel, $gray_{max}$ and $gray_{min}$ are the maximum and minimum grayscale values of the pixels being calculated, and $|\Delta Z_i'|$ is the variation of the ith grayscale pixel. T is the threshold value; after experimentation, we found that T = 15 was suitable. If a point satisfies $|\Delta Z_i'| > T$, it is set as a feature point.
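One possible reading of Equations (4)–(6) is sketched below. It assumes that, for each 3 × 3 window, the trimmed mean uses the seven pixels left after removing the maximum and minimum, that the mean deviation is taken over those same seven pixels, and that image borders are skipped. These choices and the function name are assumptions, not details confirmed by the paper.

```python
import numpy as np

def brightness_feature_points(gray, T=15.0):
    """Mark pixels whose brightness deviation stands out within a 3 x 3 window."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    for r in range(1, h - 1):                # border pixels are skipped (assumption)
        for c in range(1, w - 1):
            window = np.sort(gray[r - 1:r + 2, c - 1:c + 2].ravel())
            kept = window[1:-1]              # drop the minimum and the maximum
            mean_gray = kept.sum() / 7.0     # trimmed mean of the window
            dz_center = abs(gray[r, c] - mean_gray)      # deviation of the central pixel
            dz_mean = np.abs(kept - mean_gray).mean()    # mean deviation of the 7 kept pixels
            mask[r, c] = abs(dz_center - dz_mean) > T    # compare with threshold T
    return mask

# Uniform image with one bright pixel in the middle (hypothetical values).
image = np.full((5, 5), 100.0)
image[2, 2] = 200.0
mask = brightness_feature_points(image)
flat_mask = brightness_feature_points(np.full((5, 5), 100.0))
```

In this toy image only the bright central pixel is marked, because removing the maximum keeps that pixel from shifting the local mean of its neighbors.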
In addition, texture features can be extracted from color changes at the interface of rock minerals, where the color change can be measured by the variation of the RGB proportions. Three linearly independent indexes, namely C1, C2 and C3, were selected and calculated by Equation (7), with n set to 0.01 to avoid a denominator of zero.
$$C_1 = R / (G + n), \quad C_2 = G / (B + n), \quad C_3 = B / (R + n), \tag{7}$$
Then C was calculated to evaluate the variation of C1, C2 and C3 comprehensively, as shown in Equation (8).
$$C = K \sqrt{C_1^2 + C_2^2 + C_3^2}, \tag{8}$$
where K is the expansion coefficient, which was set to 1 in this research. C changes sharply when any one of C1, C2 and C3 changes. $\Delta C$ was set as the maximum difference in coefficients:
$$\Delta C = \max\{\Delta C_i\}, \tag{9}$$
where $\Delta C_i$ is the difference between the pixel and its surrounding pixels. T1 was used as a threshold value to determine whether the point was a texture boundary point, and T' was a matrix used to store the maximum distances within each class. T1 indicates the maximum distance among the 12 rock minerals, as calculated by K-means. First, a matrix containing the RGB values of every type of rock mineral image was calculated by the K-means algorithm. Then, a result matrix containing the maximum distance for each of these 12 kinds of rock mineral was assigned to T':
$$T' = \{D_1, D_2, \ldots, D_i, \ldots, D_m\}, \tag{10}$$
where $D_i$ is the maximum distance in each category from any point's C value to the mean colors' RGB values, and m is the number of mineral classes. Finally, the maximum value in T' was assigned to T1. When T1 is smaller than $\Delta C$, the pixel is marked as a texture boundary point; otherwise, it is not marked. The texture and cleavage can be outlined in the image by combining the brightness and color changes. Extracting the texture and cleavage of rock minerals in this way can increase the recognition accuracy for rock mineral images.
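The color-based boundary test can be sketched as follows. The sketch assumes that C is the scaled Euclidean norm of (C1, C2, C3), that the "surrounding pixels" are the four direct neighbors, and that T1 is simply passed in as a number; in the paper T1 comes from the K-means color model, so the value used here is purely hypothetical.

```python
import numpy as np

def color_index(rgb, n=0.01, K=1.0):
    """C = K * sqrt(C1^2 + C2^2 + C3^2), C1 = R/(G+n), C2 = G/(B+n), C3 = B/(R+n)."""
    rgb = np.asarray(rgb, dtype=float)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    c1, c2, c3 = R / (G + n), G / (B + n), B / (R + n)
    return K * np.sqrt(c1 ** 2 + c2 ** 2 + c3 ** 2)

def color_boundary_points(rgb, T1):
    """Mark pixels whose largest difference in C to a 4-neighbor exceeds T1."""
    C = color_index(rgb)
    delta = np.zeros_like(C)
    dv = np.abs(np.diff(C, axis=0))          # differences between vertical neighbors
    dh = np.abs(np.diff(C, axis=1))          # differences between horizontal neighbors
    delta[:-1, :] = np.maximum(delta[:-1, :], dv)
    delta[1:, :] = np.maximum(delta[1:, :], dv)
    delta[:, :-1] = np.maximum(delta[:, :-1], dh)
    delta[:, 1:] = np.maximum(delta[:, 1:], dh)
    return delta > T1

# Left half gray, right half reddish (hypothetical colors); T1 = 1.0 is assumed.
image = np.zeros((3, 4, 3))
image[:, :2] = (100, 100, 100)
image[:, 2:] = (200, 50, 50)
mask = color_boundary_points(image, T1=1.0)
```

Only the two columns that touch the color interface are marked. Note that C is unchanged by cyclic permutations of (R, G, B), so in this sketch some color pairs, such as pure red next to pure green, would not be separated by C alone.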
In short, the features of rock mineral images can be extracted by calculating the changes in brightness and color values. In this research, texture features were extracted and used to strengthen the texture of the original rock mineral images by marking and outlining the texture boundary points, as shown in Figure 5. After extraction, the Inception-v3 model and the color model are trained.

3.2. Color Model of Rock Mineral

The colors of the 12 types of rock minerals considered here are shown in Table 1. A computer generates colors by combining red, green, and blue in the RGB color space, in which each of red, green, and blue takes a value in [0, 255]. As a result, there are 16,777,216 (256 × 256 × 256 = 16,777,216) colors in the RGB color space. Any change in the value of R, G or B leads to a color variation. It is therefore possible to recognize rock mineral images based on color features. The rock mineral label can be determined by the following equation:
$$S_{ji} = \rho_{ji}^{2} = (R_i - R_{ji})^2 + (G_i - G_{ji})^2 + (B_i - B_{ji})^2, \tag{11}$$
where $\rho_{ji}$ is the Euclidean distance; $R_i$, $G_i$ and $B_i$ are the mean RGB values of the rock mineral image; and $R_{ji}$, $G_{ji}$ and $B_{ji}$ are the mean RGB values of the ith color in the jth rock mineral. According to the color distance $S_{ji}$, the six top-ranked rock minerals are recognized and shown in the result. See Equations (11) and (12).
$$Score_j = 1 - \frac{\min\{S_{ji}\}}{\sum_{j=1}^{n} \min\{S_{ji}\}} \quad (i = 0, 1, \ldots, k), \tag{12}$$
$$Score_j' = \frac{Score_j}{\sum_{j=1}^{n} Score_j}. \tag{13}$$
A color model for rock minerals is thus established based on the colors of the rock minerals, and it is used to assist the Inception-v3 model in identifying them. To identify the type of a mineral, the Inception-v3 model first makes its identification and passes a set of six candidate results to the color model. The final result is given by the color model after matching the color of the mineral against the candidates from the Inception-v3 model.
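The scoring step of the color model can be sketched as below, assuming the raw score is one minus the normalized minimum squared distance and that the scores are then rescaled to sum to 1. The reference colors in the example are hypothetical placeholders, not values from Table 3, and the function name is illustrative.

```python
def color_scores(mean_rgb, mineral_colors):
    """Score candidate minerals by the closest of their reference colors.

    mean_rgb: (R, G, B) mean color of the query image.
    mineral_colors: dict mapping mineral name -> list of reference (R, G, B) colors.
    """
    if len(mineral_colors) < 2:
        raise ValueError("need at least two candidate minerals")
    # Squared Euclidean distance; keep the minimum over each mineral's colors.
    min_dist = {
        name: min(sum((q - c) ** 2 for q, c in zip(mean_rgb, color)) for color in colors)
        for name, colors in mineral_colors.items()
    }
    total = sum(min_dist.values())
    if total == 0:
        raise ValueError("all candidates match exactly; scores are undefined")
    # A smaller distance gives a higher raw score.
    raw = {name: 1.0 - d / total for name, d in min_dist.items()}
    # Normalize so that the scores sum to 1.
    norm = sum(raw.values())
    return {name: s / norm for name, s in raw.items()}

# Hypothetical reference colors for three candidates.
references = {
    "cinnabar": [(190, 30, 40)],
    "malachite": [(30, 160, 90)],
    "calcite": [(240, 240, 240)],
}
scores = color_scores((200, 30, 30), references)
best = max(scores, key=scores.get)
```

In the comprehensive model, `mineral_colors` would hold only the six candidates returned by the Inception-v3 model, so the color model re-ranks a short list rather than all 12 classes.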

4. Experiment and Results

4.1. Image Processing

The dataset [42] consists of 4178 images of 12 kinds of elaborate rock minerals. The information about rock mineral images in each category is shown in Table 2. Twenty images were selected randomly in each category as a test dataset, and the remaining images were used to retrain the Inception-v3 model.
When establishing the model, numerous factors influence its accuracy, including the number and clarity of the rock mineral images, background noise and the differences between the features of minerals. Generally, accuracy increases as the number of rock mineral images increases. However, an imbalance in the number of images among different types may lead to low accuracy. Therefore, we collected the images and ensured that each category contained at least 150. The proportion of each image occupied by the rock mineral was then adjusted to at least 80%. Finally, noise such as complex backgrounds and labels was removed, and the images were given the same size. Before training the Inception-v3 model and the color model, the texture features of all images were extracted and strengthened. All images were converted into RGB matrices. Then, the brightness values and a matrix containing the ΔC values of the points of each image were calculated by Equation (3) and Equations (7)–(9), respectively. Points determined to be boundary points were marked and outlined. Some pre-processed results are shown in Figure 5. Then, the Inception-v3 model was trained.

4.2. Model Establishment

4.2.1. Model Training

The processed images were used as the raw data. The input image size was set to 299 × 299, and all input images were pre-processed for training. There were three input channels for length, width, and color. The number of training steps was set to 20,000, and the learning rate was set to 0.01. In each step, the prediction results were compared with the true labels, so that both the training and validation accuracy were calculated and the weights in the model were updated. The true label is the class name from the dataset. The primary measures used to evaluate training effectiveness are the training accuracy, the validation accuracy and the training cross-entropy, as shown in Figure 6.
Lighting conditions influence the RGB values of the images. To obtain a better color model, a self-adaptive K-means algorithm, optimized by the silhouette coefficient, was used to train the model on 10 image slices per category taken under different lighting conditions (each with a size of about 300 × 300). This gave about 900,000 points in each category, which were then used to train the color model by K-means. During training, the silhouette coefficient and the maximum distance from a point to the mean value in each category, which is the Di in Equation (10), were calculated. Training ended when the silhouette coefficient reached its maximum. The training results of the color model are shown in Table 3. Additionally, Figure 7 shows samples of the rock mineral images used to train the color model.
To eliminate the influence of different lighting conditions, we trained on numerous images taken under different lighting conditions. Because rock minerals differ in how they reflect light, the RGB values of some minerals changed significantly while those of others did not. As a result, the number of RGB value kinds stored in the color model differs from mineral to mineral.
When training the SVM model and the RF model, the features of the raw images were extracted by the HOG method. All features were then resized to 256 × 256. The parameters of SVM and RF were determined by the grid search algorithm. The resulting penalty coefficient and gamma of the SVM were 0.9 and 0.09, respectively, and the number of trees of the RF was 180. The other parameters of the RF were left at their default values.

4.2.2. Model Test and Results

First, two Inception-v3 models were retrained, one from the raw images and one from the texture feature extraction images. Then, the model retrained from the texture feature extraction images was combined with the color model to create a comprehensive model. The three models were compared to find the one with the highest accuracy. To make the accuracy more convincing, 440 images, randomly selected from the rock mineral image dataset, were used to test these models. These testing images were used as the input data for the comprehensive model and for the Inception-v3 models trained with and without feature extraction images. When testing the comprehensive model and the Inception-v3 model trained with feature extraction images, the images were processed automatically by the program. The test results are shown in Table 4.
To ensure that the models in Table 4 are statistically feasible, the precision rate, recall rate and F1-measure value are used to validate them. In Table 4, the retrained Inception-v3 model based on raw images reached a validation accuracy of 73.1%, and its top-1 and top-3 accuracies were 64.1% and 96.0%, respectively. The test accuracies (precision rate) of SVM-HOG and RF-HOG are 32.8% and 31.2%, respectively. These accuracies are too low, which indicates that the HOG features are not effective. On the other hand, the comprehensive model performs well, which indicates that it can extract the texture features. The test accuracies of each category (recall rate) of these models are shown in Figure 8, and the F1-measure values of the models are shown in Table 5. The accuracy of each category for the comprehensive model is much higher than that of the other models, which indicates that the comprehensive model is suitable for rock mineral identification. The F1-measure value of the comprehensive model indicates that the model is statistically feasible.
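For reference, per-class precision, recall and F1-measure can be computed from predicted and true labels as in the following sketch. The labels in the example are made up for illustration and are not results from the paper.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} from two equal-length label lists."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)   # true positives
        fp = sum(t != label and p == label for t, p in pairs)   # false positives
        fn = sum(t == label and p != label for t, p in pairs)   # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics

# Made-up labels for five test images.
y_true = ["calcite", "calcite", "malachite", "malachite", "cinnabar"]
y_pred = ["calcite", "malachite", "malachite", "malachite", "cinnabar"]
metrics = per_class_metrics(y_true, y_pred)
```

Precision penalizes a class for attracting wrong predictions, recall penalizes it for missed samples, and F1 is their harmonic mean, so it is high only when both are high.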
Table 4 shows that the accuracy of the Inception-v3 model trained with the texture feature extraction images is higher than that of the model trained with the raw images, which means that texture is a significant feature for rock mineral identification. Texture extraction makes it easier to distinguish different rock minerals, which improves the accuracy of the retrained model.
The comprehensive model achieved the highest accuracy among the three models, which shows that color properties can increase identification accuracy considerably. Its top-1 accuracy is 74.2%, and its top-3 accuracy is 99.0%. By combining the retrained Inception-v3 model and the color model, a comprehensive identification can be made for different rock minerals. The texture and color features were used the most during identification.
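Top-1 and top-3 accuracy measure how often the true label appears among the first one or three ranked predictions. A minimal sketch with made-up rankings follows; the function name and data are illustrative only.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true label is among the k highest-ranked predictions."""
    hits = sum(true in ranked[:k] for ranked, true in zip(ranked_predictions, true_labels))
    return hits / len(true_labels)

# Made-up rankings for four test images, best guess first.
ranked = [
    ["calcite", "malachite", "cinnabar"],
    ["malachite", "calcite", "cinnabar"],
    ["cinnabar", "malachite", "calcite"],
    ["calcite", "cinnabar", "malachite"],
]
truth = ["calcite", "calcite", "malachite", "cinnabar"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
```

Top-3 accuracy is never lower than top-1 accuracy, because any sample counted as a top-1 hit is also a top-3 hit.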
Some recognition samples are shown in Figure 9. These models sometimes produced different results. When the texture features are similar but the colors are different, the color model can identify the minerals successfully; for example, magnetite's cleavage is similar to azurite's, but their colors are not. When the minerals' colors are similar but their cleavage differs, as with calcite and gypsum, the Inception-v3 model trained with feature extraction images can identify them successfully. Therefore, the comprehensive model has the advantages of both the color model and the Inception-v3 model trained with feature extraction images. Occasionally, however, the comprehensive model is wrong when a mineral's texture features are vague or its color is abnormal. Such conditions confuse the comprehensive model, as shown in Figure 10. The mineral in Figure 10 is magnetite, but because of the blue lighting and vague texture features, all of these models gave an incorrect result.

5. Conclusions

In this research, three models based on the Inception-v3 model were established. The deep learning model coupled with the color model reached top-1 and top-3 accuracies of 74.2% and 99.0%, respectively. The retrained model using raw images reached top-1 and top-3 accuracies of 64.1% and 96.0%, respectively. The retrained model using texture extraction images reached top-1 and top-3 accuracies of 67.5% and 98.3%, respectively. The comparison of the three models indicates that the comprehensive model is the best of all. The SVM-HOG model achieved a validation accuracy of 32.8%, and the RF-HOG model achieved a validation accuracy of 31.2%. The results indicate that the deep learning model can extract effective image features, and the comparison between the traditional models and the deep learning models shows that the latter are much more effective.
The deep learning algorithm provides a new idea for rock mineral identification, and the clustering algorithm can further improve the performance of the deep learning model. Therefore, the combination of a deep learning algorithm and a clustering algorithm is an effective approach to rock mineral identification. The color and texture features of rock minerals are significant properties for identification, and the identification accuracy of the Inception-v3 model can be improved considerably by texture and color extraction.
Furthermore, this study used elaborate mineral images with clear mineral features. If features such as cleavage and luster can be extracted, the comprehensive model can be further tested for mineral specimen identification, and even for field surveys.

Author Contributions

Modules design and development, C.L.; Overall framework design, M.L.; Data collection and application test, Y.Z. and Y.Z.; Algorithm and model improvement, S.H.


Funding

This research was funded by the National Natural Science Foundation for Excellent Young Scientists of China (Grant no. 51622904), the Tianjin Science Foundation for Distinguished Young Scientists of China (Grant no. 17JCJQJC44000) and the National Natural Science Foundation for Innovative Research Groups of China (Grant no. 51621092).

Conflicts of Interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, or company that could be construed as influencing the position presented in, or the review of, this manuscript.


References

  1. Yeshi, K.; Wangdi, T.; Qusar, N.; Nettles, J.; Craig, S.R.; Schrempf, M.; Wangchuk, P. Geopharmaceuticals of Himalayan Sowa Rigpa medicine: Ethnopharmacological uses, mineral diversity, chemical identification and current utilization in Bhutan. J. Ethnopharmacol. 2018, 223, 99–112.
  2. Rustom, L.E.; Poellmann, M.J.; Johnson, A.J.W. Mineralization in micropores of calcium phosphate scaffolds. Acta Biomater. 2019, 83, 435–455.
  3. Laura, B.; Nina, E.; Mona, W.M.; Udo, Z.; Andò Sergio Merete, M.; Reidar, I.K. Quick, easy, and economic mineralogical studies of flooded chalk for EOR experiments using Raman spectroscopy. Minerals 2018, 8, 221.
  4. Ramil, A.; López, A.J.; Pozo-Antonio, J.S.; Rivas, T. A computer vision system for identification of granite-forming minerals based on RGB data and artificial neural networks. Measurement 2017, 117, 90–95.
  5. Sadeghi, B.; Madani, N.; Carranza, E.J.M. Combination of geostatistical simulation and fractal modeling for mineral resource classification. J. Geochem. Explor. 2015, 149, 59–73.
  6. Li, R.; Albert, N.N.; Yun, M.; Meng, Y.S.; Du, H. Geological and Geochemical Characteristics of the Archean Basement-Hosted Gold Deposit in Pinglidian, Jiaodong Peninsula, Eastern China: Constraints on Auriferous Quartz-Vein Exploration. Minerals 2019, 9, 62.
  7. Rajendran, S.; Nasir, S. ASTER capability in mapping of mineral resources of arid region: A review on mapping of mineral resources of the Sultanate of Oman. Ore Geol. Rev. 2019, 108, 33–53.
  8. Shi, B.; Liu, J. Nonlinear metric learning for kNN and SVMs through geometric transformations. Neurocomputing 2018, 318, 18–29.
  9. Sachindra, D.; Ahmed, K.; Rashid, M.M.; Shahid, S.; Perera, B. Statistical downscaling of precipitation using machine learning techniques. Atmos. Res. 2018, 212, 240–258.
  10. Vassilev, S.V.; Vassileva, C.G. A new approach for the combined chemical and mineral classification of the inorganic matter in coal. 1. Chemical and mineral classification systems. Fuel 2009, 88, 235–245.
  11. Vassilev, S.V.; Vassileva, C.G. A new approach for the classification of coal fly ashes based on their origin, composition, properties, and behaviour. Fuel 2007, 86, 1490–1512.
  12. Zaini, N.; Van Der Meer, F.; Van Der Werff, H. Determination of Carbonate Rock Chemistry Using Laboratory-Based Hyperspectral Imagery. Remote Sens. 2014, 6, 4149–4172.
  13. Adep, R.N.; Shetty, A.; Ramesh, H. EXhype: A tool for mineral classification using hyperspectral data. ISPRS J. Photogramm. Remote Sens. 2017, 124, 106–118.
  14. Othman, A.A.; Gloaguen, R. Integration of spectral spatial and morphometric data into lithological. J. Asian Earth Sci. 2017, 146, 90–102.
  15. Li, Y.; Chen, C.; Fang, R.; Yi, L. Accuracy enhancement of high-rate GNSS positions using a complete ensemble empirical mode decomposition-based multiscale multiway PCA. J. Asian Earth Sci. 2018, 169, 67–78.
  16. Wang, W.; Chen, L. Flotation Bubble Delineation Based on Harris Corner Detection and Local Gray Value Minima. Minerals 2015, 5, 142–163.
  17. Chen, J.; Chen, Y.; Wang, Q. Synthetic Informational Mineral Resource Prediction: Case Study in Chifeng Region, Inner Mongolia, China. Earth Sci. Front. 2008, 15, 18–26.
  18. Shardt, Y.A.; Brooks, K. Automated System Identification in Mineral Processing Industries: A Case Study using the Zinc Flotation Cell. IFAC-PapersOnLine 2018, 51, 132–137.
  19. Ślipek, B.; Młynarczuk, M. Application of pattern recognition methods to automatic identification of microscopic images of rocks registered under different polarization and lighting conditions. Geol. Geophys. Environ. 2013, 39, 373.
  20. Młynarczuk, M.; Górszczyk, A.; Ślipek, B. The application of pattern recognition in the automatic classification of microscopic rock images. Comput. Geosci. 2013, 60, 126–133.
  21. Shu, L.; McIsaac, K.; Osinski, G.R.; Francis, R. Unsupervised feature learning for autonomous rock image classification. Comput. Geosci. 2017, 106, 10–17.
  22. Coates, A.; Ng, A.Y.; Lee, H. An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 215–223.
  23. Raina, R.; Battle, A.; Lee, H.; Packer, B.; Ng, A.Y. Self-taught learning: Transfer learning from unlabeled data. In Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA, 20–24 June 2007; pp. 759–766.
  24. Aligholi, S.; Lashkaripour, G.R.; Khajavi, R.; Razmara, M. Automatic mineral identification using color tracking. Pattern Recognit. 2016, 65, 164–174.
  25. Li, N.; Hao, H.; Gu, Q.; Wang, D.; Hu, X. A transfer learning method for automatic identification of sandstone microscopic images. Comput. Geosci. 2017, 103, 111–121.
  26. Google Deep Learning. Available online: (accessed on 21 March 2019).
  27. Zhang, Y.; Li, M.C.; Han, S. Automatic identification and classification in lithology based on deep learning in rock images. Acta Petrol. Sin. 2018, 34, 333–342.
  28. Kitzig, M.C.; Kepic, A.; Grant, A. Near Real-Time Classification of Iron Ore Lithology by Applying Fuzzy Inference Systems to Petrophysical Downhole Data. Minerals 2018, 8, 276.
  29. Xiong, Y.; Zuo, R.; Carranza, E.J.M. Mapping mineral prospectivity through big data analytics and a deep learning algorithm. Ore Geol. Rev. 2018, 102, 811–817.
  30. Zuo, R.; Xiong, Y.; Wang, J.; Carranza, E.J.M. Deep learning and its application in geochemical mapping. Earth-Sci. Rev. 2019, 192, 1–14.
  31. Iglesias, J.C.; Álvarez; Santos, R.B.M.; Paciornik, S. Deep learning discrimination of quartz and resin in optical microscopy images of minerals. Miner. Eng. 2019, 138, 79–85.
  32. Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science 2006, 313, 504–507.
  33. Hinton, G.; Mohamed, A.-R.; Jaitly, N.; Vanhoucke, V.; Kingsbury, B.; Deng, L.; Yu, D.; Dahl, G.; Senior, A.; Nguyen, P.; et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups. IEEE Signal Process. Mag. 2012, 29, 82–97.
  34. Gong, M.; Yang, H.; Zhang, P. Feature learning and change feature classification based on deep learning for ternary change detection in SAR images. ISPRS J. Photogramm. Remote Sens. 2017, 129, 212–225.
  35. Pham, C.C.; Jeon, J.W. Robust object proposals re-ranking for object detection in autonomous driving using convolutional neural networks. Signal Process. Image Commun. 2017, 53, 110–122.
  36. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
  37. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2818–2826.
  38. Qureshi, A.S.; Khan, A.; Zameer, A.; Usman, A. Wind power prediction using deep neural network based meta regression and transfer learning. Appl. Soft Comput. 2017, 58, 742–755.
  39. Xu, G.; Zhu, X.; Fu, D.; Dong, J.; Xiao, X. Automatic land cover classification of geo-tagged field photos by deep learning. Environ. Model. Softw. 2017, 91, 127–134.
  40. Xu, B.; Shen, S.; Shen, F.; Zhao, J. Locally linear SVMs based on boundary anchor points encoding. Neural Netw. 2019, 117, 274–284.
  41. Pellant, C. Rocks and Minerals—Smithsonian Handbook; DK Publisher: New York, NY, USA, 2002.
  42. The Geological Museum of China. Available online: (accessed on 28 March 2019).
  43. International Telecommunication Union. Available online: (accessed on 12 March 2019).
Figure 1. The training process and identification process of a comprehensive model.
Figure 2. Transfer learning with Inception-v3.
Figure 3. Samples of rock mineral characteristics [42]. (a) Aquamarine; (b) cinnabar; (c) hematite.
Figure 4. The brightness distribution of rock mineral images. (a) Calcite; (b) brightness distribution of calcite image; (c) cinnabar; (d) brightness distribution of cinnabar image.
Figure 5. Samples of texture extraction for rock mineral images. (a) Cinnabar; (b) texture extraction of cinnabar; (c) calcite; (d) texture extraction of calcite; (e) malachite; (f) texture extraction of malachite.
Figure 6. Training process with Inception-v3.
Figure 7. Training samples of color model. (a) Cinnabar; (b) calcite; (c) aquamarine.
Figure 8. Test accuracies of each category.
Figure 9. Samples of recognition results. (a–c) are from the comprehensive model, (d–f) are from the Inception-v3 model trained with feature extraction images, (g–j) are from the Inception-v3 model without feature extraction.
Figure 10. Identification results from these three models. (a) True; (b) true; (c) true.
Table 1. Physical properties of the twelve kinds of rock minerals.

| Minerals | Color | Streak | Transparency | Aggregate Shape | Luster |
|---|---|---|---|---|---|
| Cinnabar | Red | Red | Translucent | Granule, massive form | Adamantine luster |
| Hematite | Red | Cherry red | Opaque | Multi-form | From metallic luster to soil state luster |
| Calcite | Colorless or white | White | Translucent | Granule, massive form, threadiness, stalactitic form, soil state | Glassy luster |
| Malachite | Green | Pale green | From translucent to opaque | Emulsions, massive, incrusting, concretion forms or threadiness | Waxy luster, glassy luster, soil state luster |
| Azurite | Navy blue | Pale blue | Opaque | Granule, stalactitic, incrusting, soil state | Glassy luster |
| Aquamarine | Determined by its components | White | Transparent | Cluster form | Glassy luster |
| Augite | Black | Pale green, black | Opaque | Granule, radial pattern, massive | Glassy luster |
| Magnetite | Black | Black | Opaque | Granule, massive form | Metallic luster |
| Molybdenite | Gray | Light gray | Opaque | Clintheriform, scaly form or lobate form | Metallic luster |
| Stibnite | Gray | Dark gray | Opaque | Massive, granule or radial pattern | Strong metallic luster |
| Cassiterite | Crineus, yellow, black | White, pale brown | From opaque to transparent | Irregular granule | Adamantine luster, sub adamantine luster |
| Gypsum | White, colorless | White | Transparent, translucent | Clintheriform, massive, threadiness | Glassy luster, nacreous luster |
Table 2. The types and numbers of rock mineral images.

| Rock Minerals | Number | Rock Minerals | Number |
|---|---|---|---|
Table 3. Training results of color model.

| Rock Minerals | Colors (R, G, B) |
|---|---|
Table 4. Results of five models.

| Models | Validation Accuracy | Test Accuracy (Top-1) | Test Accuracy (Top-3) |
|---|---|---|---|
| Model based on raw images | 73.1% | 64.1% | 96.0% |
| Model based on texture extraction images | 77.4% | 67.5% | 98.3% |
| Comprehensive model | 77.4% | 74.2% | 99.0% |
SVM-HOG 33.6%
Table 5. Validation parameters of models.

| Models | Precision Rate | Recall Rate | F1-measure Value |
|---|---|---|---|
| Comprehensive model | 74.2% | 77.5% | 0.758 |
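
The F1-measure in Table 5 is consistent with the standard definition, the harmonic mean of precision and recall. The short check below assumes that standard formula; the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall of the comprehensive model from Table 5.
f1 = f1_score(0.742, 0.775)
print(round(f1, 3))  # prints 0.758
```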

Share and Cite

MDPI and ACS Style

Liu, C.; Li, M.; Zhang, Y.; Han, S.; Zhu, Y. An Enhanced Rock Mineral Recognition Method Integrating a Deep Learning Model and Clustering Algorithm. Minerals 2019, 9, 516.

AMA Style

Liu C, Li M, Zhang Y, Han S, Zhu Y. An Enhanced Rock Mineral Recognition Method Integrating a Deep Learning Model and Clustering Algorithm. Minerals. 2019; 9(9):516.

Chicago/Turabian Style

Liu, Chengzhao, Mingchao Li, Ye Zhang, Shuai Han, and Yueqin Zhu. 2019. "An Enhanced Rock Mineral Recognition Method Integrating a Deep Learning Model and Clustering Algorithm" Minerals 9, no. 9: 516.

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
