Detection of Magnesite and Associated Gangue Minerals Using Hyperspectral Remote Sensing—A Laboratory Approach

This study introduced a detection method for magnesite and associated gangue minerals, including dolomite, calcite, and talc, based on mineralogical, chemical, and hyperspectral analyses using hand samples from thirteen different source locations and Specim short wave infrared (SWIR) hyperspectral images. Band ratio methods and logistic regression models were developed based on the spectral bands selected by the random forest algorithm. The mineralogical analysis revealed the heterogeneity of mineral composition for naturally occurring samples, showing various carbonate and silicate minerals as accessory minerals. The Mg and Ca composition of magnesite and dolomite varied significantly, implying a mixture of minerals. The spectral characteristics of magnesite and associated gangue minerals showed major absorption features of the target minerals mixed with the absorption features of accessory carbonate minerals and talc, affected by mineral composition. The spectral characteristics of magnesite and dolomite showed a systematic shift of the Mg-OH absorption features toward a shorter wavelength with an increased Mg content. The spectral bands identified by the random forest algorithm for detecting magnesite and gangue minerals were mainly associated with spectral features manifested by Mg-OH, CO₃, and OH. A two-step band ratio classification method achieved overall accuracies of 92% and 55.2% for the first and second steps, respectively. The classification models developed by logistic regression showed a significantly higher accuracy of 98-99.9% for training samples and 82-99.8% for validation samples. Because the samples were collected from heterogeneous sites all over the world, we believe that the results and the approach to band selection and logistic regression developed in this study can be generalized to other case studies of magnesite exploration.

The source locations and types of sample used in this study.

Mineral Composition Analysis
In the real world, pure carbonate minerals are rarely found, and any natural sample might include a mixture of different minerals. Therefore, a hard classification of mineral types by reference to a spectral library is rarely meaningful. To measure the mineral composition of the samples, we carried out X-ray diffraction (XRD hereafter) analysis using a D8 Advance diffractometer (Bruker-AXS) with a Cu target and LynxEye position sensitive detector. The parameters for diffraction pattern acquisition were a step size of 0.01°, a 2θ range of 5°-100°, 1 s counting time for each step, and 30 rpm in PE bottles. The fundamental parameters were calibrated with standard materials (LaB6, NIST SRM 660b) under the same conditions. Representative portions of the hand samples were selected by visual inspection and cut. The cut surface of each slab was sanded to remove contamination. The processed samples were air dried for one day to remove moisture and crushed with rock hammers and jaw crushers. The quartered samples were powdered with an agate mortar for the XRD analysis and with a tungsten carbide disc mill for X-ray fluorescence (XRF hereafter) spectrometry analysis.

Chemical Analysis
As mentioned, both magnesite and dolomite contain MgO and CaO, although magnesite has a higher Mg content, and thus, the two minerals often show similar patterns in their spectral signatures. Moreover, the MgO content of one specific mineral varies significantly due to the impurities of naturally formed minerals. To analyze the spectral characteristics of magnesite and dolomite associated with MgO and CaO content, we measured the MgO and CaO content by laboratory XRF analysis. Of each preprocessed sample, 1 g was mixed with 5.5 g of Li-tetraborate (Li₂B₄O₇) in a platinum crucible. The mixed samples were entirely melted in a gas furnace at 1100 °C for 10 min. The glass beads were prepared by quenching the fully molten mixtures in a polished platinum mold. These glass beads were used for the XRF analyses. The analytical errors for MgO and CaO were within 1%. We analyzed the Mg and Ca content of all 28 magnesite samples and 38 dolomite samples using the powdered samples prepared in the previous step.

Hyperspectral Image Acquisition and Preprocessing
The hyperspectral images of the samples were acquired by a Specim hyperspectral short wave infrared (SWIR) camera (Spectral Imaging Ltd, Finland) under laboratory conditions. The SWIR imaging spectrometer has a spectral range of 1000-2500 nm, with a 15 nm bandwidth and 5.6 nm spectral sampling, producing hyperspectral images of 288 bands for 384 spatial pixels. For the Lambertian reflectance data acquisition, the samples were leveled to the camera nadir view, with a halogen lamp as the light source. The white reference panel (Spectralon material with 99% reflectance) was placed next to the samples in the field of view for radiometric calibration.
The acquired hyperspectral images were preprocessed following the workflow of [33,40]. The image pixels corresponding to the samples were selected for further processing, and the background pixels were excluded. The radiance recorded by the sensor was calibrated with the empirical line method, using the reflectance panel, and converted to reflectance spectra [41]. In addition, we applied the maximum noise fraction (MNF) transformation to remove random noise in the hyperspectral data [42]. The noise bands were defined as those with eigenvalues less than or equal to 2. Previous studies (e.g., [43]) suggested a cut-off eigenvalue of 2 for maximum noise removal without disturbance of the original data; with this cut-off, approximately 120 dimensions were retained in this study. The noise bands were then replaced with zero values and the data were transformed back to the spectral domain by inverse MNF transformation.
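The panel-based conversion from radiance to reflectance can be illustrated with a minimal sketch. It assumes a single white panel of 99% reflectance and an optional dark signal, so the empirical line reduces to a per-band gain; the function name and the radiance values are hypothetical and are not taken from the study.

```python
def radiance_to_reflectance(sample, panel, panel_reflectance=0.99, dark=None):
    """Convert per-band radiance to reflectance using a white reference panel.

    Assumption: one bright reference target and an optional dark signal,
    so the calibration line for each band passes through the dark level.
    """
    if dark is None:
        dark = [0.0] * len(sample)
    reflectance = []
    for s, p, d in zip(sample, panel, dark):
        gain = panel_reflectance / (p - d)  # reflectance per unit radiance
        reflectance.append((s - d) * gain)
    return reflectance


# Hypothetical radiance values for three bands of one sample pixel
sample_radiance = [120.0, 95.0, 60.0]
panel_radiance = [200.0, 190.0, 150.0]
reflectance = radiance_to_reflectance(sample_radiance, panel_radiance)
```

With more than one reference target, the same idea generalizes to fitting a gain and an offset per band.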
The denoised hyperspectral reflectance of the samples was extracted and transformed with a hull quotient correction. The hull quotient correction techniques enhance absorption features in reflectance spectra and are efficient for the detection of the position and depth of the absorption characteristics [44]. The hull quotient corrected spectra were used to analyze the spectral characteristics associated with mineral composition for all carbonate minerals, as well as the spectral variations associated with Mg and Ca content for dolomite and magnesite.
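As an illustration of the hull quotient correction, the sketch below fits an upper convex hull to a spectrum and divides the reflectance by the interpolated hull. It is a generic continuum-removal routine written under the assumption that wavelengths are strictly increasing; the function names and the five-band spectrum are hypothetical, and this is not the software used in the study.

```python
def upper_hull(points):
    """Upper convex hull of (wavelength, reflectance) points sorted by wavelength."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # cross >= 0: the last hull point lies on or below the chord to p
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_quotient(wavelengths, reflectance):
    """Divide each reflectance value by the hull (continuum) at that wavelength."""
    points = list(zip(wavelengths, reflectance))
    hull = upper_hull(points)
    corrected = []
    seg = 0
    for x, y in points:
        # advance to the hull segment that contains this wavelength
        while seg < len(hull) - 2 and x > hull[seg + 1][0]:
            seg += 1
        (x1, y1), (x2, y2) = hull[seg], hull[seg + 1]
        continuum = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
        corrected.append(y / continuum)
    return corrected


# Hypothetical spectrum with one absorption centred at 2300 nm
wl = [2250, 2275, 2300, 2325, 2350]
refl = [0.80, 0.70, 0.50, 0.72, 0.84]
hq = hull_quotient(wl, refl)
```

After the correction the spectrum equals 1 where it touches the hull, and the depth of an absorption is one minus the corrected value at its minimum.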

Band Importance Filtering by Random Forest
This study used a band ratio method and logistic regression models to derive simplified, open classification models that are applicable to other cases of magnesite exploration based on laboratory hyperspectral approaches. To reduce the number of variables for model construction, a random forest (RF) algorithm was employed to select the most representative bands for the classification of magnesite and associated gangue minerals. The RF algorithm is an ensemble-based machine learning model [45-47]. Although RF models can perform both classification and regression, one of their most useful aspects is the ability to rank variable importance by the Gini index [48]. The Gini index, also called the impurity index, is used by the decision tree algorithm to select the best variable for splitting the samples. A split with a lower Gini index is purer, so the variables that yield the lowest Gini index are the most important in the classification model. The RF is developed based on bootstrap samples [45]. The model grows trees from random sampling of the dataset and the variables. The RF model uses 2/3 of the samples (known as "in-bag") for the training set, and the remaining 1/3 (known as "out-of-bag") for accuracy assessment by cross-validation [45]. The randomly selected subsets of variables are created with a user-defined number of features (known as "Mtry"), and the random forest grows to the user-defined number of trees (known as "Ntree"). The final classification is decided by majority vote across all the trees. Two parameters therefore need to be set to produce the forest: Ntree and Mtry. In general, Ntree can be set as large as desired because adding trees does not cause the RF classifier to over-fit. However, we assigned an Ntree of 500 for this study, as previous studies revealed that the final decision is commonly made before Ntree reaches 500 [49]. The Mtry parameter is set to the square root of the number of input variables [50].
For band selection, we extracted 1000 pixels from each sample image, each containing information for 269 spectral bands, and selected 30 bands based on the RF model output for the derivation of band ratios and logistic regression models for the classification of magnesite and associated gangue minerals. We used the SPSS random forest package for band selection.
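To make the role of the Gini index concrete, the following sketch computes the impurity of a set of labels and the impurity decrease obtained by splitting on one band at a threshold. It only illustrates the splitting criterion and is not the SPSS random forest implementation; the function names, band values, and thresholds are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Size-weighted Gini impurity of the two groups created by a threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def gini_decrease(values, labels, threshold):
    """Impurity removed by the split; larger means a more useful band."""
    return gini(labels) - split_gini(values, labels, threshold)


# Hypothetical reflectance of four pixels in two bands
labels = ["magnesite", "magnesite", "dolomite", "dolomite"]
band_a = [0.50, 0.52, 0.80, 0.82]  # separates the two classes
band_b = [0.60, 0.90, 0.61, 0.91]  # does not separate them

decrease_a = gini_decrease(band_a, labels, threshold=0.65)
decrease_b = gini_decrease(band_b, labels, threshold=0.75)
```

In a random forest, these decreases are accumulated over all splits and trees that use a band, which gives the importance ranking used for band selection.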

Band Ratio
The band ratio is a commonly used multiband image processing method to enhance differences in spectral characteristics and remove environmental biases such as illumination variation and shadows [51-53], and it has often been used for hyperspectral data classification [54,55]. To select the best band ratio combination from the 30 bands selected by the RF algorithm, we created a simple band assemblage based on the 30 bands. Each band ratio for the mineral classes was tested by ANOVA, and the tested band combination was further statistically analyzed by the Tamhane T2 test [56] to compare classification performance among the mineral classes. The band combination with the best classification performance was further analyzed with box plots to define the index ranges indicating each class. The band combination separated carbonate minerals from the other types, including talc and other types of rock.

Multi-Variate Logistic Regression
While the band ratio method can select two bands at a time, a multi-variate logistic regression model includes all candidate bands in one model to detect mineral existence. The logistic regression method is a statistical method developed for the analysis and classification of categorical variables [57-59]. Therefore, the method is an appropriate approach for the detection of a specific mineral [60]. Logistic regression assumes that the occurrence of a binary response variable (Y) is controlled by the variables (X) and, thus, creates two classes indicating the event (Y = 1) and no event (Y = 0). The logistic function derives a probability model based on the input variables and transforms the probability value to 0 or 1 based on the cut-off value of 0.5 [58,61]. We derived a logistic regression model for each mineral class based on the reflectance values of the 30 bands selected by the RF model, following Equations (2) and (3) [62].
$$\mathrm{Logit}(P)_{mineral} = C + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n \quad (2)$$

where $\mathrm{Logit}(P)_{mineral}$ indicates the logistic probability of specific mineral occurrence, $C$ is the intercept value, $\beta_i$ are the contributions of the covariates to the probability of dependent variable occurrence, and $X_i$ is the reflectance value of the selected band. Then, the probability of the target event is calculated as

$$P = \frac{1}{1 + e^{-\mathrm{Logit}(P)_{mineral}}} \quad (3)$$

The final P value of a mineral occurrence is assigned as either 1 or 0 based on the 0.5 cut-off value [63]. The logistic regression models developed for each mineral were evaluated by -2 log-likelihood (-2LL) tests and Hosmer and Lemeshow tests. The -2LL evaluates the goodness of fit for the model based on the maximum likelihood regarding the observation and prediction dataset, where a lower value indicates a better fit [64]. The Hosmer and Lemeshow tests evaluate a model based on a log-likelihood ratio between the observation and prediction values, where a model with the highest ratio is considered to have the highest statistical significance [65]. In addition, two coefficients of determination, the pseudo-R² values of "Cox and Snell" and "Nagelkerke", were used to evaluate the logistic regression models [66,67]. The pseudo-R² values are calculated as

$$R^2 = 1 - \left(\frac{L(M_C)}{L(M_\beta)}\right)^{2/N}, \quad \text{Cox and Snell} \quad (4)$$

$$R^2 = \frac{1 - \left(L(M_C)/L(M_\beta)\right)^{2/N}}{1 - L(M_C)^{2/N}}, \quad \text{Nagelkerke} \quad (5)$$

where $L(M_C)$ is the likelihood for a model without explanatory variables, $L(M_\beta)$ is the likelihood for a model with the explanatory variables, and $N$ is the number of observations. Both pseudo-R² values range from 0 to 1, where values closer to 1 indicate better model effectiveness [68]. Moreover, the Wald statistic, $(b/\text{standard error})^2$, was used to evaluate the statistical significance of each explanatory variable [58].
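The logistic model and the two pseudo-R² measures can be sketched as follows. The code uses the standard Cox and Snell and Nagelkerke definitions expressed through log-likelihoods; the function names, the toy labels, and the predicted probabilities are hypothetical, and treating a probability of exactly 0.5 as an occurrence is an assumption.

```python
import math


def logit(intercept, coefs, reflectance):
    """Linear predictor: intercept plus the weighted band reflectances."""
    return intercept + sum(b * x for b, x in zip(coefs, reflectance))


def probability(z):
    """Logistic function mapping the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(p, cutoff=0.5):
    """Assign 1 (mineral present) or 0 using the 0.5 cut-off."""
    return 1 if p >= cutoff else 0


def log_likelihood(y_true, p_pred):
    """Bernoulli log-likelihood of the observed labels."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(y_true, p_pred))


def pseudo_r2(ll_null, ll_model, n):
    """Return (Cox and Snell, Nagelkerke) pseudo-R2 from log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell, cox_snell / max_cox_snell


# Hypothetical labels and fitted probabilities for four pixels
y = [1, 1, 0, 0]
p_model = [0.9, 0.8, 0.2, 0.1]
p_null = [0.5] * 4  # intercept-only model for balanced classes

cs, nk = pseudo_r2(log_likelihood(y, p_null), log_likelihood(y, p_model), len(y))
predicted = [classify(p) for p in p_model]
```

Because the Cox and Snell value cannot reach 1, the Nagelkerke value rescales it by its maximum, which is why the two measures can differ substantially for the same model.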

Mineral Composition of Mineral Samples Associated with Magnesite
The XRD analysis revealed the mineral composition of the magnesite, dolomite, calcite, and talc samples in this study (Table 2). The magnesite samples from four different origins showed various combinations of accessory minerals. The results confirmed that naturally occurring magnesite ore was not pure and was in the form of mineral mixtures. The accessory minerals include dolomite, calcite, chabazite, clinochlore, quartz, and siderite, where dolomite occurred in all magnesite samples. Unlike the magnesite samples, the dolomite and calcite samples showed significant variations in mineral composition (Table 2), as they are also considered major rock-forming minerals of carbonate rocks. Talc samples contained magnesite, dolomite, and calcite as accessory minerals (Table 2).
Because carbonate rocks mainly consist of calcite and dolomite and magnesite mineralization is mainly associated with carbonate rocks with talc as an accessory mineral, pure minerals with a 100% concentration of a specific mineral are rare in field samples. Depending on the involvement of hydrothermal activity, evaporation, replacement, and recrystallization, the compositional combination of calcite, dolomite, magnesite, and talc varies significantly. The results indicate heterogeneous mineral compositions, even for the same types of sample. Because the mineral composition in one type of mineral showed large variations in the mixture of magnesite, dolomite, and calcite, it is quite possible that the spectral information of a spectral library cannot detect natural occurrences. Therefore, hyperspectral approaches for magnesite exploration must consider variations in mineral composition for expanded applicability in real-world cases.
(Table 3). The stoichiometry studies on magnesite revealed that decreases in Ca, Mg, HCO₃, and the Ca/Mg ratio in carbonate fluid (calcite and aragonite) caused magnesite and dolomite mineralization. Dolomite mineralization takes the Ca from the carbonate fluid, resulting in a combination of Mg and Ca contents. Unlike dolomite mineralization, magnesite mineralization occurs at relatively higher temperatures associated with recrystallization, reducing Ca phase replacement in the mineral structure [69]. This result confirms the mineral identification of the magnesite and dolomite samples, and variations in Ca composition between the minerals may cause the mineral spectral signature to change. Because mineral composition and chemical composition are heterogeneous and vary by origin, the spectra are expected to vary accordingly.

Spectral Characteristics of Magnesite Samples
Spectral analysis of the hull quotient-corrected reflectance spectra of magnesite samples identified strong absorption features at 1850, 1930, 2130, 2300, and 2450 nm, and weak absorption features at 1720 and 2360 nm (Figure 1). Comparing the spectra of magnesite with the JPL (Jet Propulsion Laboratory) reference spectrum, overlaps in absorption features could only be found at 1389, 1920, 2300, and 2450 nm. The absorption features of the samples at 1720, 1850, and 2130 nm are attributed to dolomite, and that at 2360 nm is affected by calcite and talc.
(Table 2). The absorption features were detected at 1389 and 1920 nm for the magnesite signal; at 1720, 2140, and 2320 nm for the dolomite signal; and at 2470 nm for the calcite signal. As the results showed, the spectral characteristics of dolomite varied by mineral composition. Comparing the spectral characteristics of magnesite samples and dolomite samples, many absorption features overlap, and the condition is case dependent. This confirms our concern that, even when using hyperspectral images, the magnesite and dolomite samples are hardly separable by simple classification methods and need comprehensive spectral analyses and band selection.

Spectral Characteristics of Talc
Unlike those of the other mineral types, the spectral characteristics of talc samples showed dominant common absorption features at 1276, 1300, 1389, 1910, 2010, 2077, 2133, 2172, 2233, 2290, 2311, and 2383 nm, which were manifested by talc, while minor variation by mineral composition was observed. Group 1 of Myeongjin has talc and dolomite as major minerals, with magnesite as an accessory mineral, showing additional absorption features at 1500, 1800, and 1900 nm. Group 2 of Geumsan shows more distinctive talc spectral features at 1300, 1400, and 1530 nm, where talc is the only major mineral. Group 3 includes samples from Dashiqiao, containing talc and calcite as major minerals and magnesite as an accessory mineral. This group showed additional absorptions of calcite at 1870 nm. These results also confirmed the heterogeneous mineral compositions of naturally formed samples and the associated complex variations in spectral curves (Figure 4).
It is well known that the absorption feature of Mg-OH around 2300 nm is associated with MgO/CaO content [25,70]. The results showed a systematic shift of the Mg-OH absorption toward a shorter wavelength with an increase in Mg content, regardless of source location. For example, the dolomite spectrum with a MgO content of 13.7% (ID 46) has the maximum absorption at 2322 nm, and that with a MgO content of 21.9% (ID 55) has the maximum absorption at 2316 nm, a shift of 6 nm. The maximum absorption feature of Mg-OH for magnesite samples was located at 2294 nm. Compared to the dolomite spectrum of ID 46, the shift of the Mg-OH absorption is as much as 28 nm. This study found the same phenomenon as the previous studies for samples from various source locations [25,70]. This indicates that the shift of the Mg-OH absorption is a general phenomenon that may be useful for the detection of Mg content in dolomite and magnesite regardless of source location.
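The shift of the Mg-OH absorption can be measured by locating the minimum of a hull quotient-corrected spectrum within a search window. The sketch below is a minimal illustration; the function name, the search window, and the spectral values are hypothetical, and only the minimum positions (2322 and 2294 nm) mirror those reported in the text.

```python
def absorption_minimum(wavelengths, values, lo, hi):
    """Wavelength of the lowest value inside the window [lo, hi] (nm)."""
    window = [(v, w) for w, v in zip(wavelengths, values) if lo <= w <= hi]
    if not window:
        raise ValueError("no bands inside the search window")
    return min(window)[1]


# Hypothetical hull quotient-corrected spectra on a shared wavelength grid (nm)
wl = [2283, 2289, 2294, 2300, 2305, 2311, 2316, 2322, 2328]
dolomite_like = [0.98, 0.97, 0.96, 0.94, 0.92, 0.88, 0.82, 0.74, 0.80]
magnesite_like = [0.85, 0.78, 0.70, 0.76, 0.84, 0.90, 0.94, 0.96, 0.97]

dol_min = absorption_minimum(wl, dolomite_like, 2280, 2330)
mag_min = absorption_minimum(wl, magnesite_like, 2280, 2330)
shift_nm = dol_min - mag_min  # positive when magnesite absorbs at a shorter wavelength
```

With a spectral sampling of about 5.6 nm, a 6 nm shift corresponds to roughly one band, so fitting a smooth curve around the minimum may be needed to resolve small shifts.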

Band Selection Based on the Random Forest Model
We put the samples into a random forest classification model with Ntree = 500 and Mtry = 70. The model returned an overall accuracy of 98.2% (Table 4). The variable importance graph produced by the random forest algorithm (Figure 6) displayed the importance index of all of the input bands. Reference [50] verified that the classification accuracy is superior if a sufficiently high Ntree is used for a small number of variables. The smaller number of variables reduced variable collinearity, and thus could improve a multi-variate regression model. We selected 30 bands for the next step of analysis because, based on Figure 7, the out-of-bag (OOB) error was the lowest near 30 bands. The selected bands were mainly associated with Mg-OH (2289-2384 nm), CO₃ (2467-2500 nm), and OH (1389-1400 nm) (Figures 1 and 6). Among the 30 selected bands, the highest peak in Figure 6 is near the Mg-OH absorption feature at 2289-2384 nm. This range has the major absorption features of magnesite, dolomite, calcite, and talc, with a minor shift between the minerals [34,71-73]. The second peak in terms of band importance corresponded to carbonate absorptions at 2467-2500 nm (magnesite 2450 nm, dolomite/talc 2460 nm, and calcite 2470 nm). Notably, this spectral region showed higher importance than the other absorption features associated with CO₃²⁻, such as 1750 nm, 1870 nm, 1980 nm, and 2160 nm [36,37,74,75].
The minor shifts of absorption features in the bands between the minerals made the spectral region more effective for classification. The third largest peak comprises the bands in the range 1389-1394 nm, which is the absorption caused by OH. These spectral bands correspond to the absorption features of magnesite and talc. The selected 30 bands gave an OOB error of less than 1.5% from the RF model (Figure 7). The RF model provides an alternative dimension reduction method for hyperspectral data processing.

Band Ratio
Based on the combination of 30 selected bands, we derived two band ratio equations, following the band ratio derivation method of [22]. The band ratio equations were tested for all combinations of band math operations, and the final equations were selected based on the Tamhane T2 and ANOVA tests with the best output results. The first band ratio combination was used to separate carbonate minerals from other types including talc, as in Equation (6), where B2328 is the spectral band of 2328 nm, B2344 is the spectral band of 2344 nm, B2389 is the spectral band of 2389 nm, and B2483 is the spectral band of 2483 nm. The bands B2328, B2344, and B2483 are related to the absorptions of carbonate minerals, and band B2389 corresponds to the absorption shoulder. Then, the second band combination for the classification of carbonate minerals including magnesite, dolomite, and calcite was developed as in Equation (7)

$$Y = \frac{B2294 \times B2355}{B2333 \times B2383} \quad (7)$$

where B2294 is the spectral band of 2294 nm, B2333 is the spectral band of 2333 nm, B2355 is the spectral band of 2355 nm, and B2383 is the spectral band of 2383 nm. B2294 discerns dolomite, which has a higher reflectance than calcite at this band. B2333, B2355, and B2383 show a high reflectance of magnesite and low reflectance of calcite, where the absorption locations are aligned in the order of magnesite, dolomite, and calcite. B2294 is the major absorption band of magnesite, showing a higher reflectance of dolomite. The classification results based on the band ratios derived above are presented in the box plot (Figure 8) [76]. The first band ratio showed median values of 1.74 for carbonate rock, 1.2 for other types, and 0.65 for talc. As shown in Figure 8a, the first band ratio effectively separated carbonate minerals from talc, while confusion between carbonate minerals and other types was expected.
The overall accuracy of the classification was 82% (Table 5), which is acceptable, although the producer's accuracy for carbonate minerals was only 63.9%. The second band ratio for the discrimination of carbonate minerals showed median values of 0.71 for magnesite, 0.91 for calcite, and 1.02 for dolomite. The range of the index is 0.56-0.88 for magnesite, 0.75-1.08 for calcite, and 0.94-1.12 for dolomite (Figure 8b). The overall accuracy of classification for the carbonate minerals was 55.2% (Table 6). However, the classification results for magnesite were acceptable, with a user's accuracy of 96%.
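The second band ratio and its index ranges can be expressed as a short sketch. It reads Equation (7) as the product of B2294 and B2355 divided by the product of B2333 and B2383, which is an assumption about the layout of the equation, and it returns every mineral whose reported index range contains the ratio rather than forcing a single class. The function names and reflectance values are hypothetical.

```python
# Index ranges of the second band ratio as reported in the text (Figure 8b)
INDEX_RANGES = {
    "magnesite": (0.56, 0.88),
    "calcite": (0.75, 1.08),
    "dolomite": (0.94, 1.12),
}


def carbonate_ratio(b2294, b2333, b2355, b2383):
    """Second band ratio, assuming (B2294 * B2355) / (B2333 * B2383)."""
    return (b2294 * b2355) / (b2333 * b2383)


def candidate_minerals(ratio):
    """All carbonate classes whose reported index range contains the ratio."""
    return [name for name, (lo, hi) in INDEX_RANGES.items() if lo <= ratio <= hi]


# Hypothetical reflectances with a deep absorption at 2294 nm
ratio = carbonate_ratio(b2294=0.60, b2333=0.85, b2355=0.90, b2383=0.92)
candidates = candidate_minerals(ratio)
```

The ranges overlap between 0.75 and 0.88 and between 0.94 and 1.08, so a ratio in those intervals yields two candidates, which is consistent with the low overall accuracy of the second step.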

Binary Logistic Regression Models
To overcome the limitation of the band ratio method and for better generalization, a binary logistic regression model was developed employing the 30 bands selected by the random forest method. The water bands, namely the strong absorption wavelengths from water vapor at around 1400 and 1900 nm, have a low signal-to-noise ratio [54], so we excluded them from the regression model, resulting in 27 bands as the input variables for mineral prediction. Each mineral had one equation developed based on Equation (3) and by a stepwise variable selection mechanism to avoid multicollinearity.
The stepwise variable selection for each target mineral is listed in Table 7. The classification equation for magnesite employed 11 variables, among which eight bands are associated with Mg-OH spectral features, two bands are from Ca features, and one band is at 1237 nm (Table 7). For dolomite, the equation was developed based on nine bands from Mg-OH features, four bands from Ca features, and one band at 1248 nm (Table 7). Unlike for magnesite, seven bands were significant, and the most important bands were 2361 and 2489 nm of Mg-OH features. The calcite equation was derived from nine bands from Mg-OH features and four bands from Ca features, along with bands at 1237 and 1248 nm. The band at 1248 nm plays an important role in calcite detection. Talc classification was based on seven bands of Mg-OH features, four bands of Ca features, and the band at 1237 nm (Table 7), where Mg-OH bands play important roles. All logistic regression equations employed the band at 2467 nm, and three models employed the bands at 1237, 2294, 2316, 2355, 2361, and 2495 nm. The absorption features associated with Mg-OH and Ca participated in all regression equations, where the absorption depth, peak absorption, and absorption width vary among the target minerals. As an exception, the spectral band of 1237 nm participated in three regression models, even though the band has no absorption feature. The spectral signatures (1200 nm range) of the minerals show consistently low standard errors (Table 7). Indeed, reference [77] also used spectral features other than the 2000-2300 nm region for the detection and classification of carbonatites among sedimentary carbonates. Moreover, the RF algorithm identified the bands around 1200 nm as important for the classification of the target minerals. The overall accuracy of the classification was 99.9% for magnesite, 98% for dolomite, 99.6% for calcite, and 99.8% for talc (Table 8).

Evaluation of Binary Regression Models
The results of the Hosmer and Lemeshow test showed that the p values of the χ² values of the logistic regression models for the target minerals range from 0.472 for dolomite to 0.997 for magnesite and calcite (Table 9). In general, a significance higher than 0.05 is acceptable, and the test showed that all models are statistically coherent [64]. In addition, the goodness of fit was tested based on pseudo-R², which ranged from 0.58 to 0.628 for Cox & Snell R², and from 0.917 to 0.994 for Nagelkerke R² (Table 9). In general, Cox & Snell pseudo-R² values larger than 0.2 are considered to indicate good fit [78]. The pseudo-R² values validated that all models have a strong goodness of fit.

Validation of Binary Regression Models
The binary regression models for the detection of magnesite, dolomite, calcite, and talc developed from training samples were applied to 46 validation samples (Figure 9). The accuracy of the magnesite logistic regression model was 97.6%. All magnesite samples were correctly detected, while some pixels of magnesite samples were classified as none (Figure 9a and Table 10). The overall accuracy of the dolomite model was 82%, where the model classified 9 out of 21 dolomite samples (Figure 9b and Table 10). The accuracy for dolomite was lower than that for the other minerals. The erroneous samples include calcite and/or quartz as major minerals (Table 2). The bias might be caused by the mixing of the major spectral features manifested by both minerals. The calcite model classified 94.6% of calcite pixels correctly (Figure 9c and Table 10). The accuracy of the talc model was very high (99.8%, Figure 9d and Table 10).

Discussions and Limitations of the Present Work
Based on the classification results applied to the validation samples, the logistic regression models showed significantly higher accuracy than the band ratio method. While the band ratio method is easier to apply, its overall accuracy was about 40% lower than that of the logistic regression models for carbonate mineral classification. In addition, we compared the effectiveness of the logistic regression models with the RF method. The classification results of the RF method for the validation samples showed an accuracy ranging from 92.6% to 99.9%, with an overall accuracy of 96.6% (Table 11). The accuracy of the RF method was similar to that of the logistic regression models except for the dolomite samples, for which the RF method performed better, with 92.6%. Although the RF method shows a slightly better accuracy, the knowledge learned by the RF algorithm is wrapped in its complex data structure and remains a black box to researchers. In contrast, the logistic models have a simple form and can be easily generalized to other case studies for magnesite exploration.
Given the fact that this study developed the models using naturally occurring samples from various locations and that the mineral composition is heterogeneous, the models tested in this study could be applicable for real-world cases as prompt analytical methods for sample discrimination. Analysts can select the best model for their case studies. For hyperspectral band surveys, we recommend the logistic regression model.
Although our study is comprehensive in terms of band selection, the target minerals only include four major ones. If more minerals are considered, the complexity of the prediction model will increase drastically. Furthermore, this study is based on fresh samples under controlled dry conditions. Weathering and wet surfaces would complicate the spectral signals associated with hydrolysis and water components [38,79]. Adding samples with controlled moisture and weathered samples would allow us to better understand the uncertainty in mapping magnesite in a natural environment. The band selection and band ratio equations will be different under different assumptions of surface moisture and weathering. Our research demonstrated the feasibility of the method we developed for fresh dry samples and could serve as a model for such future case studies.

Conclusions
This study introduced a detection method for magnesite and associated gangue minerals including dolomite, calcite, and talc based on mineralogical, chemical, and hyperspectral analyses using SWIR hyperspectral images under laboratory conditions. The samples used for this study originated from thirteen different locations in South Korea, China, and North Korea and were used to develop detection models with wide applicability. The spectral characteristics of sample spectra were analyzed with consideration of minerals and composition. Using the spectral characteristics derived from the hyperspectral images, the random forest algorithm was used for band selection and dimension reduction. Band ratio and logistic regression models were developed to find the most useful detection methods.
The mineralogical analysis revealed the heterogeneity of mineral composition for naturally occurring samples. Magnesite samples contain accessory minerals such as dolomite, calcite, chabazite, clinochlore, quartz, and siderite. Dolomite and calcite samples showed the accessory minerals actinolite, augite, calcite, graphite, magnesite, phlogopite, quartz, and titanite. Talc samples had magnesite, dolomite, and calcite as accessory minerals. The results indicate the heterogeneity of mineral composition, even for the same types of sample. Because the mineral composition in one type of mineral showed large variations (mainly mixed forms of magnesite, dolomite, and calcite), hyperspectral approaches for magnesite exploration must consider variations in mineral composition in other case studies. The Mg and Ca composition of magnesite and dolomite varied significantly, where magnesite had more Mg content and dolomite had more Ca content. These results confirmed the heterogeneity of minerals in not only the mineral composition but also the chemical composition of major elements.
The magnesite samples showed absorption features at 1850, 1930, 2130, 2300, and 2450 nm, with weak absorptions at 1720 and 2360 nm. The spectral characteristics represent the heterogeneity of mineral composition, where absorption features of dolomite, calcite, and talc were found in the spectra of magnesite samples. The same phenomenon was found for dolomite, calcite, and talc samples, where the major absorptions of each mineral were mixed with other minerals' absorptions in the sample spectra, representing heterogeneous mineral composition. The spectral characteristics of magnesite and dolomite showed a systematic shift of the Mg-OH absorption features toward a shorter wavelength with an increase in Mg content. This indicates that the shift of the Mg-OH absorption may be useful for the detection of Mg content in dolomite and magnesite.
The random forest algorithm reduced the number of bands by selecting the 30 most sensitive bands for the classification of magnesite and associated gangue minerals. The selected bands were mainly associated with Mg-OH (2289-2384 nm), CO₃ (2467-2500 nm), and OH (1389-1400 nm). Among the selected bands, the bands with the highest importance were found in the spectral range of Mg-OH absorptions, followed by the spectral bands around carbonate and hydrolysis absorptions. A two-step band ratio method was derived using the selected bands. The first step separated carbonate minerals from talc and other types of sample with an accuracy of 92%. The second step classified magnesite, dolomite, and calcite with an accuracy of 55.2%, which was not satisfactory. The logistic regression models based on the 27 selected bands excluding water bands achieved accuracies of 98-99.9% for the training samples and 82-99.8% for the validation samples.
Because the models were developed from naturally formed samples from various locations with heterogeneous mineral composition, they could be applied more generally as a rapid analytical method for sample discrimination. It is necessary to include more samples from more source locations to refine and enhance the models. Furthermore, the applicability of the method to the exploration of carbonate rocks and minerals would expand significantly if the method were tested in the field.