1. Introduction
Hyperspectral (HS) sensors, capable of acquiring data in hundreds of narrow, contiguous spectral bands, are extensively utilized for geological surveys, particularly for mineral identification and mapping [1]. Ideally, mineral maps generated from such data should remain consistent across different imaging times and sensors, especially in arid or semi-arid regions with well-exposed mineralogy. However, in practice, several factors, including atmospheric correction residuals, sensor calibration drifts, inherent sensor noise, and the prevalence of mixed pixels, often lead to variations and inconsistencies in derived mineral maps [2].
The Tetracorder expert system, developed by the United States Geological Survey (USGS), is a widely recognized and powerful tool for mineral identification, known for its high accuracy based on detailed spectral feature matching [3]. However, Tetracorder’s high sensitivity to spectral features can cause temporal variability in results, as demonstrated by Ong et al. (2003) [4]. They found outlier results in a multi-year dataset and attributed them to potential changes in surface conditions or sensor calibration. This presents an opportunity to develop methods that enhance temporal consistency. Data-driven alternatives, such as the Hourglass workflow of the ENVI 6.1 software developed by NV5 Geospatial Software, Inc. (Broomfield, CO, USA), typically involve steps such as the minimum noise fraction (MNF) transformation [5], pixel purity index (PPI) analysis [6], and the spectral angle mapper (SAM). Although such techniques can be automated, their endmember selection and mapping stages are sensitive to threshold parameters, which the analyst must tune carefully to obtain robust outcomes [7,8].
To address these challenges and improve the reliability and temporal consistency of mineral maps, this study proposes a hybrid approach. It leverages the high-accuracy mineral identifications from Tetracorder for selected pixels as training data for a random forest classifier, using MNF-transformed HS data as input features. This combination aims to create mineral maps that are both accurate and robust over time.
2. Materials and Methods
2.1. Study Site
This research focused on the Cuprite mining district in southwestern Nevada, USA; its location and alteration maps are shown in Figure 1. Cuprite is a well-known validation site for geological remote sensing studies due to its exposure of hydrothermally altered volcanic rocks (mainly rhyolitic tuffs) and the presence of distinct mineral assemblages with characteristic absorption features in the short-wave infrared (SWIR) region (2.0–2.5 µm). Geologically, the Cuprite district consists of Cambrian metasedimentary rocks in the western center and Tertiary volcanic rocks in the eastern center, both of which have undergone significant hydrothermal alteration. For a detailed geological map of the area, readers are referred to Figure 3A in Swayze et al. (2014) [9]. The hydrothermal alteration has created well-defined zones of mineralization, including a central silicified core, a surrounding advanced argillic zone rich in minerals like alunite and kaolinite, and an outer propylitic zone. The excellent exposure of these distinct mineral assemblages makes the site an ideal natural laboratory for testing and validating hyperspectral mapping techniques [9].
2.2. Hyperspectral Data Used
We utilized HS images acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Classic instrument over the Cuprite site on three different dates: 16 June 2011; 7 May 2012; and 6 March 2013. These AVIRIS data were chosen because this study site is a well-established validation site for which a wealth of AVIRIS data and related findings from previous studies are available, facilitating straightforward comparative analysis [1,3,9]. These data are publicly available from the AVIRIS Data Portal site provided by the Jet Propulsion Laboratory (JPL) (https://aviris.jpl.nasa.gov/dataportal/ (accessed on 1 March 2025)). The flight lines of the images selected for 2011, 2012, and 2013 are f110616t01p00r08rdn_c, f120507t01p00r08rdn_a, and f130306t01p00r06rdn_e, respectively; each image was subsequently subset to an area of approximately 10 km × 20 km covering the main hill area of the Cuprite site. This area was selected because it is well-known for its diverse hydrothermal alteration mineralogy, making it an ideal test site [9]. These images consist of 224 contiguous spectral bands with a nominal spectral resolution of 10 nm, covering the wavelength range from 0.4 to 2.5 µm. The spatial resolution of AVIRIS data varies with flight altitude; the original resolutions for the data used were 15.5 m (2011), 15.7 m (2012), and 16.4 m (2013). To ensure consistent geolocation and pixel size for accurate temporal comparison, all images were co-registered and resampled to 15.7 m using the nearest neighbor method, aligning them with a 2006 reference dataset. The raw radiance data (referred to as “HS-rad data”) were converted to a surface reflectance image (referred to as “HS-ref data”) through atmospheric correction and used as the primary input data for subsequent analyses.
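The nearest neighbor resampling step can be illustrated with a minimal sketch. It assumes a simple north-up grid with a shared origin and ignores the co-registration to the 2006 reference dataset, which in practice would be handled by geospatial software; the function name `resample_nearest` and the toy arrays are hypothetical and are not taken from the study.

```python
import numpy as np

def resample_nearest(image, src_res, dst_res):
    """Resample a (rows, cols[, bands]) array from src_res to dst_res (same units).

    Each output pixel takes the value of the source pixel that contains its
    centre, so no new (interpolated) values are created.
    """
    rows, cols = image.shape[:2]
    new_rows = max(1, int(round(rows * src_res / dst_res)))
    new_cols = max(1, int(round(cols * src_res / dst_res)))
    scale = dst_res / src_res
    # Centre of every output pixel, expressed in source pixel coordinates.
    r_idx = np.floor((np.arange(new_rows) + 0.5) * scale).astype(int)
    c_idx = np.floor((np.arange(new_cols) + 0.5) * scale).astype(int)
    r_idx = np.clip(r_idx, 0, rows - 1)
    c_idx = np.clip(c_idx, 0, cols - 1)
    return image[np.ix_(r_idx, c_idx)]

# Toy check: halving the resolution of a 4 x 4 grid.
image = np.arange(16).reshape(4, 4)
coarse = resample_nearest(image, src_res=1.0, dst_res=2.0)

# Hypothetical 100 x 100 scene with 3 bands, moved from 15.5 m to 15.7 m pixels.
scene_2011 = np.zeros((100, 100, 3))
scene_on_common_grid = resample_nearest(scene_2011, src_res=15.5, dst_res=15.7)
```

Nearest neighbor resampling copies existing pixel values rather than averaging them, so the original spectra are left unmixed, which matters when the spectra are later matched against reference features.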
2.3. Representative Mineral Mapping Tools
2.3.1. USGS Tetracorder
Tetracorder is an expert system that identifies materials by comparing the spectrum of each pixel against a vast library of reference spectra and analyzing the presence, position, and shape of specific absorption features. In this study, Tetracorder was applied to preprocessed HS-ref data to generate initial mineral maps which provide detailed mineralogical information.
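To illustrate what analyzing the position and shape of an absorption feature involves, the sketch below computes a continuum-removed band depth, a standard quantity in imaging spectroscopy. It is not Tetracorder's implementation; the function `band_depth`, the straight-line continuum between the window edges, and the synthetic spectrum are assumptions made for illustration only.

```python
import numpy as np

def band_depth(wavelengths, reflectance, left, right):
    """Return (depth, centre wavelength) of an absorption feature.

    A straight-line continuum is drawn between the first and last samples
    inside [left, right]; the spectrum is divided by it and the depth is
    one minus the minimum of the continuum-removed spectrum.
    """
    mask = (wavelengths >= left) & (wavelengths <= right)
    w, r = wavelengths[mask], reflectance[mask]
    continuum = np.interp(w, [w[0], w[-1]], [r[0], r[-1]])
    removed = r / continuum
    i = removed.argmin()
    return 1.0 - removed[i], w[i]

# Synthetic spectrum: flat 0.5 reflectance with a Gaussian absorption at 2.2 um.
wavelengths = np.linspace(2.0, 2.4, 41)
reflectance = 0.5 - 0.2 * np.exp(-((wavelengths - 2.2) / 0.02) ** 2)
depth, centre = band_depth(wavelengths, reflectance, left=2.1, right=2.3)
```

For this synthetic spectrum the feature is centred near 2.2 µm with a depth of about 0.4. Small shifts in either value between acquisition dates are the kind of change that can alter a feature-matching result.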
2.3.2. ENVI Hourglass Workflow
This is a standard data-driven workflow for endmember extraction and mineral mapping, often referred to as the ENVI Hourglass workflow. It employs a sequential process to identify and map pure material spectra (endmembers) from the HS data. The process typically begins with the MNF transformation to segregate noise from the data, reduce dimensionality, and reorder the components by image quality, thereby optimizing the data for subsequent endmember selection. Following the MNF transformation, the PPI analysis is commonly performed to identify the spectrally extreme or “pure” pixels within the dataset, which are considered potential endmembers representative of distinct materials. These potential endmembers are then often visualized and refined using N-dimensional scatter plotting, which allows an analyst to interactively examine the distribution of pure pixels in feature space and select the most representative set of endmembers. Finally, the selected endmembers are used to classify the image and map their spatial distribution using a mapping algorithm such as SAM, which determines spectral similarity from the angular distance between pixel spectra and endmember spectra. In this way, the Hourglass workflow can produce a comprehensive mineral map by identifying and mapping multiple endmembers. In the proposed method described in the next section, we adopt the MNF transformation [5] for feature generation and the concept of PPI [6] for pure pixel selection.
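The angular distance used by SAM, and its dependence on a threshold, can be sketched as follows. This is a minimal illustration rather than the ENVI implementation; the function names, the 0.10 rad threshold, and the toy spectra are assumptions, and all spectra are assumed to be non-zero.

```python
import numpy as np

def spectral_angle(pixels, endmembers):
    """Angle in radians between every pixel and every endmember spectrum.

    pixels:     array of shape (n_pixels, n_bands)
    endmembers: array of shape (n_endmembers, n_bands)
    """
    p = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
    e = endmembers / np.linalg.norm(endmembers, axis=1, keepdims=True)
    cos = np.clip(p @ e.T, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return np.arccos(cos)

def sam_classify(pixels, endmembers, max_angle=0.10):
    """Assign each pixel to the closest endmember, or -1 if none is close enough."""
    angles = spectral_angle(pixels, endmembers)
    labels = angles.argmin(axis=1)
    labels[angles.min(axis=1) > max_angle] = -1
    return labels

# Toy example with three bands and two endmembers.
endmembers = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
pixels = np.array([[2.0, 0.1, 0.0],   # scaled version of endmember 0
                   [0.0, 3.0, 0.1],   # scaled version of endmember 1
                   [1.0, 1.0, 1.0]])  # mixture, far from both
labels = sam_classify(pixels, endmembers)
```

Because the angle ignores vector length, the first two pixels match their endmembers despite the brightness difference, while the mixed pixel is rejected. Raising `max_angle` would change which pixels are mapped, which is the threshold sensitivity noted in the Introduction.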
2.4. Proposed Method—Enhanced Mineral Mapping with a Random Forest Classifier
The proposed method integrates elements of Tetracorder’s accuracy with the reproducibility of a machine learning classifier, while avoiding the less robust automated steps of traditional data-driven workflows. The processing flow of the method is shown in Figure 2. The key steps in this flow are as follows:
First, initial mineral maps are generated by applying Tetracorder to the HS-ref data. Concurrently, the concept of identifying spectrally pure pixels is considered.
Next, the pixels to be used for training are selected. The rationale for this approach is that while Tetracorder provides accurate mineral identification, not all classified pixels are equally reliable for training purposes. Mixed pixels and those affected by noise can introduce uncertainty in the training labels. In this step, information obtained from the Tetracorder-derived mineral maps is incorporated to identify high-purity, reliably classified pixels by applying the PPI analysis to these Tetracorder-derived maps. The output of this selection process is referred to as the “PPI-Tetracorder Map”. The reliability of these labels was confirmed both quantitatively and qualitatively. Compared to a 2006 reference map, the labels showed approximately 80% accuracy (based on pixel-by-pixel comparison of mineral class assignments), and they were also in strong visual agreement with the published maps by Swayze et al. (2014) [9]. This map provides mineral labels for training a random forest classifier.
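One possible reading of this selection step is sketched below: a PPI score is computed by counting how often each pixel is an extreme of a random projection, and only pixels that are both extreme and classified by Tetracorder are kept. The text does not specify the exact procedure, so the function names, the number of projections, the `min_count` threshold, and the use of 0 as the unclassified code are all assumptions.

```python
import numpy as np

def pixel_purity_index(features, n_projections=500, seed=0):
    """Count how often each pixel is an extreme of a random 1-D projection."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(features.shape[0], dtype=int)
    for _ in range(n_projections):
        direction = rng.standard_normal(features.shape[1])
        scores = features @ direction
        counts[scores.argmin()] += 1
        counts[scores.argmax()] += 1
    return counts

def select_training_pixels(features, tetracorder_labels, min_count=1,
                           unclassified=0, n_projections=500, seed=0):
    """Boolean mask of pixels that are spectrally extreme AND carry a label."""
    counts = pixel_purity_index(features, n_projections, seed)
    return (counts >= min_count) & (tetracorder_labels != unclassified)

# Toy feature space: three "pure" corner pixels and three mixed interior pixels.
features = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0],
                     [3.0, 3.0], [2.0, 4.0], [4.0, 2.0]])
tetracorder_labels = np.array([1, 2, 0, 1, 2, 1])  # 0 = unclassified (assumed)
counts = pixel_purity_index(features)
mask = select_training_pixels(features, tetracorder_labels)
```

In the toy data only the first two pixels survive: the interior pixels are never extreme, and the third corner pixel is extreme but has no Tetracorder label.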
- 2. Feature Preparation
The original HS-ref data were processed using the MNF transformation, applied across all 224 spectral bands. Based on eigenvalue analysis and visual inspection of the component images, the first 20 components were deemed sufficient and retained for subsequent analysis. This specific MNF transformation is optimized for preparing input features for the random forest model, focusing on compressing data and reducing noise, particularly in the SWIR wavelength range (2–2.5 μm), which is critical for mineral identification. The output of this step is referred to as the “MNF-HS data”.
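A minimal sketch of an MNF transformation is given below, assuming the noise covariance is estimated from differences between horizontally adjacent pixels (a common choice, though not necessarily the one used by ENVI or in this study). The function name `mnf_transform` and the synthetic cube are hypothetical.

```python
import numpy as np

def mnf_transform(cube, n_components=20):
    """Project a (rows, cols, bands) cube onto its leading MNF components.

    Returns the transformed cube and the retained eigenvalues, which act as
    signal-to-noise style scores sorted in descending order.
    """
    cube = cube.astype(float)
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands)
    X = X - X.mean(axis=0)

    # Noise estimate from neighbouring-pixel differences (assumed noise model).
    noise = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands) / np.sqrt(2.0)
    cov_noise = np.cov(noise, rowvar=False)
    cov_data = np.cov(X, rowvar=False)

    # Step 1: whiten the noise so that its covariance becomes the identity.
    evals_n, evecs_n = np.linalg.eigh(cov_noise)
    evals_n = np.maximum(evals_n, 1e-12)
    whitener = evecs_n / np.sqrt(evals_n)  # scales each eigenvector column

    # Step 2: ordinary PCA on the noise-whitened data covariance.
    evals, evecs = np.linalg.eigh(whitener.T @ cov_data @ whitener)
    order = np.argsort(evals)[::-1][:n_components]
    transform = whitener @ evecs[:, order]
    return (X @ transform).reshape(rows, cols, n_components), evals[order]

# Synthetic cube: one smooth spatial gradient shared by 6 bands, plus noise.
rng = np.random.default_rng(0)
rows, cols, bands = 20, 20, 6
signal = np.linspace(0, 1, cols)[None, :, None] * np.arange(1, bands + 1)[None, None, :]
cube = signal + 0.05 * rng.standard_normal((rows, cols, bands))
mnf, snr = mnf_transform(cube, n_components=3)
```

In this synthetic case the first eigenvalue dominates because the cube contains a single coherent signal. Inspecting how quickly the eigenvalues fall toward 1 (the pure-noise level) is one way to decide how many components to retain.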
- 3. Mineral Mapping by a Random Forest Classifier
In this step, a random forest classifier is trained. The MNF-HS data, which contain multiple MNF bands, are used as the input features, and the corresponding mineral labels from the PPI-Tetracorder map are used as the target variable. The classifier was implemented with 100 trees (n_estimators = 100) and a maximum depth of 100 (max_depth = 100). Other key hyperparameters were left at their default values in the scikit-learn library, including criterion = ‘gini’, min_samples_split = 2, and min_samples_leaf = 1. To assess the model’s robustness and check for overfitting, a 5-fold cross-validation scheme was employed: for each year’s dataset, the training samples were randomly divided into 5 folds using stratified sampling to maintain class balance across folds. The results were highly consistent across all folds, and the analysis presented in this paper uses a representative result. Random forest, an ensemble learning method, builds multiple decision trees during training. For prediction, it outputs the class that is the mode of the classes output by the individual trees, making it robust to overfitting and effective for high-dimensional data [10].
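The training configuration described above can be sketched with scikit-learn as follows. The hyperparameters are those stated in the text; the synthetic features and labels, the `shuffle=True` setting, and the `random_state` values are assumptions added to make the example self-contained and reproducible.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the training set: 300 pixels x 20 MNF components,
# three mineral classes with 100 pixels each (not the study's data).
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 100)
X = rng.standard_normal((300, 20)) + 2.0 * y[:, None]

# Stratified 5-fold split keeps the class proportions equal in every fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_accuracies = []
for train_idx, test_idx in skf.split(X, y):
    clf = RandomForestClassifier(
        n_estimators=100,
        max_depth=100,
        criterion="gini",
        min_samples_split=2,
        min_samples_leaf=1,
        random_state=0,  # assumed; not stated in the text
    )
    clf.fit(X[train_idx], y[train_idx])
    fold_accuracies.append(clf.score(X[test_idx], y[test_idx]))
```

Comparing the five entries of `fold_accuracies` gives a quick check of whether results are consistent across folds, as reported for the real data.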
The trained random forest model is then applied to the entire MNF-HS data for each year to predict mineral classes for every pixel. This generates the final mineral map, which represents the output of the proposed method. This process is expected to enable more consistent interpolation and classification of minerals across the image by leveraging the robust learning capabilities of random forests trained with high-quality labels and informative features.
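Applying a trained classifier to every pixel amounts to flattening the image cube into a table of pixels, predicting, and reshaping the result back to the image grid. The sketch below assumes a (rows, cols, components) array layout; the helper `classify_image` and the synthetic scene are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def classify_image(clf, mnf_cube):
    """Predict a class for every pixel of a (rows, cols, n_features) cube."""
    rows, cols, n_features = mnf_cube.shape
    flat = mnf_cube.reshape(-1, n_features)
    return clf.predict(flat).reshape(rows, cols)

# Synthetic scene: class 1 on the left half, class 2 on the right half.
rng = np.random.default_rng(0)
truth = np.broadcast_to(np.where(np.arange(40) < 20, 1, 2), (30, 40))
mnf_cube = rng.standard_normal((30, 40, 5)) + 3.0 * truth[..., None]

# Only a sparse subset of pixels carries training labels (about 10% here).
train_mask = rng.random((30, 40)) < 0.1
clf = RandomForestClassifier(n_estimators=100, max_depth=100, random_state=0)
clf.fit(mnf_cube[train_mask], truth[train_mask])

mineral_map = classify_image(clf, mnf_cube)
accuracy = (mineral_map == truth).mean()
```

The classifier is trained on a small labeled subset and then fills in the remaining pixels, which corresponds to the interpolation role described for the proposed method.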
3. Results
The proposed method was compared with a method that simply uses Tetracorder applied directly to the HS-ref data without any additional processing or filtering, using the multi-temporal AVIRIS datasets. As shown in Figure 3, a visual comparison of the mineral maps generated for 2011, 2012, and 2013 indicates that the maps produced by the proposed method exhibit greater spatial consistency in mineral distributions across the different years compared to those generated by the simple Tetracorder-based method, particularly for kaolinite located in the central-right regions of each map.
To quantitatively assess and substantiate these visual observations, the inconsistency rate between two mineral maps was calculated. This rate was quantified by calculating the Root Mean Square Error (RMSE) between the pixel values of the two classification maps, providing a measure of pixel-by-pixel disagreement. This was performed for the simple Tetracorder-based method and the proposed method by comparing the mineral maps between different year pairs: 2011–2012, 2011–2013, and 2012–2013. In this context, lower inconsistency values indicate higher similarity between maps from different years, and thus better temporal consistency. The detailed results of these comparisons are summarized in Table 1, which supports the visual assessments. The proposed method consistently yields lower inconsistency values for all year pairs. To ensure a robust statistical comparison, these inconsistency rates were derived from a 5-fold cross-validation process, yielding fifteen pairs of results across the three year-pair comparisons. The average inconsistency for the proposed method was 6.64%, compared with 9.47% for the simple Tetracorder-based method. A paired t-test (p < 0.001) and a Wilcoxon signed-rank test (p < 0.001) both confirmed that this improvement is statistically significant. This difference represents an approximately 30% relative reduction in inconsistency ((9.47 − 6.64)/9.47 ≈ 0.30), highlighting the improved temporal consistency and reduced discrepancies between annually produced maps that are achieved by implementing the proposed method.
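The quantities in this comparison can be sketched as follows. The text does not state how the RMSE is converted to a percentage, so the sketch reports both an RMSE on class codes and a simple percent disagreement; the helper names, the toy maps, and the fifteen synthetic paired values are assumptions for illustration and are not the values in Table 1.

```python
import numpy as np
from scipy import stats

def rmse(map_a, map_b):
    """Root mean square error between the pixel values of two class maps."""
    diff = map_a.astype(float) - map_b.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

def disagreement_percent(map_a, map_b):
    """Percentage of pixels whose class labels differ between two maps."""
    return float(100.0 * np.mean(map_a != map_b))

# Toy 2 x 3 class maps for two years; two of six pixels change class.
map_2011 = np.array([[1, 1, 2], [2, 3, 3]])
map_2012 = np.array([[1, 2, 2], [2, 3, 1]])
toy_rmse = rmse(map_2011, map_2012)
toy_disagreement = disagreement_percent(map_2011, map_2012)

# Relative reduction implied by the reported means (6.64% versus 9.47%).
relative_reduction = (9.47 - 6.64) / 9.47 * 100.0

# Paired tests on 15 synthetic pairs centred on the reported means.
rng = np.random.default_rng(0)
baseline = 9.47 + 0.3 * rng.standard_normal(15)
proposed = 6.64 + 0.3 * rng.standard_normal(15)
p_ttest = stats.ttest_rel(baseline, proposed).pvalue
p_wilcoxon = stats.wilcoxon(baseline, proposed).pvalue
```

An RMSE on class codes depends on how the classes are numbered, whereas percent disagreement does not, which is why both are shown. A paired design is appropriate because each fold and year pair yields one value per method.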
4. Discussion
Integrating Tetracorder-based labels from high-purity pixels with a random forest classifier operating on MNF-transformed hyperspectral (HS) data significantly improved the consistency of mineral maps over time. This approach achieved an approximate 30% reduction in inconsistency compared to the simple Tetracorder-based method. While simple Tetracorder maps are spectrally accurate at the pixel level, they often show variability across acquisition dates due to atmospheric changes or sensor noise [2].
Furthermore, to validate the spatial accuracy of the proposed method, the generated mineral map (Figure 3d) was visually compared with the detailed alteration map by Swayze et al. (2014) [9] that serves as a reference (Figure 1). This comparison confirms a high degree of spatial agreement in the distribution patterns of key minerals. For instance, the central zone dominated by kaolinite (blue in Figure 3d), surrounded by areas rich in alunite (red), and the widespread presence of muscovite show strong correspondence with the reference map. It should be noted that the “white mica 1” shown in Figure 1 is treated as “muscovite” in this study, which is consistent from a mineralogical standpoint. The mineral classes used for classification were selected based on a previous study (Tsubomatsu and Tonooka, 2023 [10]), also considering the need to secure a sufficient number of pixels for each class to ensure the reliability of the machine learning model. This suggests that the proposed method not only enhances temporal consistency but also maintains high classification accuracy, producing results that are well-aligned with established geological findings for the site.
Our method utilizes Tetracorder’s spectral identifications for training data, with MNF transformation providing processed features for the random forest classifier. The random forest’s ensemble nature effectively models complex spectral-mineral relationships and generalizes mapping across images, offering more stable results in time-series data than direct spectral matching or sensitive endmember selection. Although relying on Tetracorder-based maps for training could propagate biases, focusing on high-purity pixels and using a machine learning interpolator mitigates these effects, reducing inter-map variability. This method achieves comparable accuracy to simple Tetracorder-based mapping, while demonstrating improved temporal consistency, underscoring its effectiveness for multi-temporal mineral mapping.
However, several limitations of the current approach should be acknowledged. First, the method’s performance is inherently dependent on the quality of initial Tetracorder classifications, which may introduce systematic biases. Second, the approach has been validated only on a single site with well-characterized mineralogy, and its transferability to more complex geological settings remains to be tested. Third, the computational overhead of the combined approach may be significant for large-scale operational applications.
5. Conclusions
This study demonstrated that combining the USGS Tetracorder expert system with a random forest classifier can enhance the temporal consistency of HS mineral maps. The method, using Tetracorder-identified high-purity pixels for training labels and MNF-transformed data as features, yielded maps of the Cuprite site with smaller inter-annual differences than the simple Tetracorder-based method. The observed inconsistency reduction of approximately 30% signifies improved map robustness and consistency, with accuracy comparable to the simple Tetracorder-based method.
This approach offers a pathway toward more reliable geological mapping with time-series HS imagery. Future work should validate this method in varied geological settings using different sensors to assess its generalizability for applications like mineral exploration and environmental monitoring.
Author Contributions
Conceptualization, H.T. (Hideyuki Tonooka); methodology, H.T. (Hideki Tsubomatsu) and H.T. (Hideyuki Tonooka); software, H.T. (Hideki Tsubomatsu); validation, H.T. (Hideki Tsubomatsu); formal analysis, H.T. (Hideki Tsubomatsu); investigation, H.T. (Hideki Tsubomatsu); resources, H.T. (Hideki Tsubomatsu) and H.T. (Hideyuki Tonooka); data curation, H.T. (Hideki Tsubomatsu); writing—original draft preparation, H.T. (Hideki Tsubomatsu); writing—review and editing, H.T. (Hideyuki Tonooka); visualization, H.T. (Hideki Tsubomatsu); supervision, H.T. (Hideyuki Tonooka); project administration, H.T. (Hideyuki Tonooka); funding acquisition, H.T. (Hideyuki Tonooka). All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by JST SPRING, Grant Number JPMJSP2161.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained within the article.
Acknowledgments
The authors would like to express sincere gratitude to Roger Clark for his invaluable assistance in the implementation of Tetracorder. We also deeply appreciate Satoshi Yamamoto of the Geological Survey and Research Center, National Institute of Advanced Industrial Science and Technology (AIST), for his advice and support.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Kruse, F.A.; Perry, S.L.; Caballero, A. District-Level Mineral Survey Using Airborne Hyperspectral Data, Los Menucos, Argentina. Ann. Geophys. 2006, 49, 83–92.
- Gao, B.-C.; Montes, M.J.; Davis, C.O.; Goetz, A.F.H. Atmospheric Correction Algorithms for Hyperspectral Remote Sensing Data of Land and Ocean. Remote Sens. Environ. 2009, 113, S17–S24.
- Clark, R.N.; Swayze, G.A.; Kokaly, R.F.; Higgins, C.T.; Adam, Z.R.; Anderson, F.S.; Brown, A.J.; Pieters, C.M.; Clifford, C.E.; Ehlmann, B.L.; et al. Imaging Spectroscopy: Earth and Planetary Remote Sensing with the PSI Tetracorder and Expert Systems from Rovers to EMIT and Beyond. Planet. Sci. J. 2024, 5, 276.
- Ong, C.; Swayze, G.; Clark, R. An Investigation of the Use of the Tetracorder Expert System for Multitemporal Mapping of Acid Drainage-Related Minerals Using Airborne Hyperspectral Data. In Proceedings of the 3rd EARSeL Workshop on Imaging Spectroscopy, Herrsching, Germany, 13–16 May 2003.
- Green, A.A.; Berman, M.; Switzer, P.; Craig, M.D. A Transformation for Ordering Multispectral Data in Terms of Image Quality with Implications for Noise Removal. IEEE Trans. Geosci. Remote Sens. 1988, 26, 65–74.
- Boardman, J.W.; Kruse, F.A.; Green, R.O. Mapping target signatures via partial unmixing of AVIRIS data. In Summaries of the Fifth Annual JPL Airborne Earth Science Workshop, Pasadena, CA, USA, 23–26 January 1995; Green, R.O., Ed.; JPL Publication 95-1; Jet Propulsion Laboratory: Pasadena, CA, USA, 1995; Volume 1, pp. 23–26.
- Ahmad, F. Pixel Purity Index Algorithm and N-Dimensional Visualization for ETM+ Image Analysis: A Case of District Vehari. Glob. J. Hum. Soc. Sci. Arts Humanit. 2012, 12, 76–82.
- Maggiori, E.; Plaza, A.; Tarabalka, Y. Models for Hyperspectral Image Analysis: From Unmixing to Object-Based Classification. In Mathematical Models for Remote Sensing Image Processing: Models and Methods for the Analysis of 2D Satellite and Aerial Images; Springer International Publishing: Cham, Switzerland, 2017; pp. 37–80.
- Swayze, G.A.; Clark, R.N.; Goetz, A.F.H.; Livo, K.E.; Breit, G.N.; Kruse, F.A.; Sutley, S.J.; Snee, L.W.; Lowers, H.A.; Post, J.L.; et al. Mapping Advanced Argillic Alteration at Cuprite, Nevada Using Imaging Spectroscopy. Econ. Geol. 2014, 109, 1179–1221.
- Tsubomatsu, H.; Tonooka, H. Region Expansion of a Hyperspectral-Based Mineral Map Using Random Forest Classification with Multispectral Data. Minerals 2023, 13, 754.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).