The use of reflectance spectroscopy to identify minerals through diagnostic absorption features in the visible and infrared regions has been described by several authors in recent decades [1
]. Due to these diagnostic features, the acquired mineral spectra can be employed in several knowledge-based satellite image classification approaches to delineate target areas for mineral occurrences [5
]. Moreover, field spectra allow ground-checking of the remotely sensed data (e.g., [7
]) and validation of the atmospheric corrections made to satellite images.
The growing economic importance of lithium (Li), driven mainly by its use in batteries for electric cars, has prompted several attempts to use satellite images to target occurrences of Li minerals and Li pegmatites [10
]. Reference spectra for some of the most important Li minerals, such as spodumene and lepidolite, can be found in the United States Geological Survey (USGS) [15
] and ECOSTRESS [16
] spectral libraries. Nonetheless, these open-domain spectral libraries contain no reference spectra for petalite. The Geological Survey of Brazil (CPRM) has been trying to address this issue by compiling a spectral database with a section dedicated to Li minerals, but so far the diagnostic features of petalite have not been identified [17].
To fill this gap and to support the development of image classification algorithms for Li exploration, a spectral database was built in this work from samples collected in the Fregeneda–Almendra aplite–pegmatite field (Figure 1
). In this region, spodumene, petalite, and lepidolite minerals occur in evolved pegmatites [18
] that intruded metasedimentary rocks belonging to the “Complexo Xisto–Grauváquico” (CXG) [20].
The spectral library is composed of field and laboratory spectra not only of Li minerals (spodumene, petalite, lepidolite) but also of the main outcropping lithologies of the Fregeneda–Almendra area (granitoid rocks, CXG metasediments, Li pegmatite). These rock spectra were mainly acquired in well-exposed areas so that they could be used as training areas for satellite-based lithological mapping. Additionally, to allow further investigation of the ability to discriminate Li minerals and Li pegmatites from other lithologies, the spectra from areas misclassified as Li pegmatites by machine learning algorithms [13
] are also provided. Comparing these data allows users to evaluate the degree of spectral similarity between the target minerals/rocks and the remaining within-scene elements.
Complementary data, such as (i) the sample location and coordinates (when available), (ii) the degree of alteration of the sample, (iii) the sample color, (iv) the type of face measured, and (v) the equipment used, are provided for each spectrum. Spectra acquisition and curation are described thoroughly. For the laboratory spectra, the following are also available: (i) sample photographs, (ii) the respective continuum-removed spectra files, and (iii) details on the main absorption features, which were extracted automatically.
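For readers unfamiliar with continuum removal, the idea can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes the widely used upper-convex-hull definition of the continuum, which is not necessarily the procedure used to produce the files in this database, and the function names `upper_hull` and `continuum_removed` are hypothetical. It also assumes strictly increasing wavelengths, positive reflectance values, and at least two samples per spectrum.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull of points already sorted by wavelength."""
    hull: List[Point] = []
    for p in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Cross product >= 0: the middle point lies on or below the
            # chord from hull[-2] to p, so it is not part of the upper hull.
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def continuum_removed(wavelengths: Sequence[float],
                      reflectance: Sequence[float]) -> List[float]:
    """Divide each reflectance value by the hull (continuum) value."""
    hull = upper_hull(list(zip(wavelengths, reflectance)))
    result: List[float] = []
    seg = 0  # index of the hull segment that contains the current wavelength
    for wl, r in zip(wavelengths, reflectance):
        while seg < len(hull) - 2 and wl > hull[seg + 1][0]:
            seg += 1
        (x1, y1), (x2, y2) = hull[seg], hull[seg + 1]
        continuum = y1 + (y2 - y1) * (wl - x1) / (x2 - x1)
        result.append(r / continuum)
    return result


# Toy spectrum (made-up values) with one absorption feature at 600 nm.
wl = [400.0, 500.0, 600.0, 700.0, 800.0]
refl = [0.4, 0.5, 0.3, 0.5, 0.4]
cr = continuum_removed(wl, refl)
depth = 1.0 - min(cr)  # band depth of the deepest feature
```

In the continuum-removed spectrum, points on the hull take the value 1 and absorption features appear as dips below 1, which makes their depth and asymmetry comparable between spectra with different overall brightness.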
This spectral database was established within the scope of the “Lightweight Integrated Ground and Airborne Hyperspectral Topological Solution” (LIGHTS) project, whose goal is to develop a tool that combines remote sensing data acquired at different scales with geological and geochemical data to rapidly identify target areas for Li exploration [21
]. However, the database could also be useful for other ongoing research projects, namely: (i) the fiber laser plasma spectroscopy system for real-time element analysis (FLaPsys) project, which aims to develop an advanced spectroscopy system capable of real-time element identification and quantification, mainly applied to Li mineralizations [23
], and in which laser-induced breakdown spectroscopy (LIBS) will be correlated with the visible and infrared data acquired; and (ii) the new exploration tools for European pegmatite green-tech resources (GREENPEG) project, which aims to improve responsible exploration for pegmatites in Europe through the development of integrated, multi-method exploration toolsets that include satellite image processing and airborne and ground-based geophysical and geochemical approaches [24
]. Part of this spectral database has already served as the basis for publications [25
], and further work is expected given the potential of this dataset, whose main advantages are: (i) the data are provided in a universal text file format that does not depend on specific software; (ii) field and laboratory spectra can be compared for 52 coincident spots; and (iii) the details and statistics of the extracted features allow the user to compare the shape, asymmetry, and depth of the absorption features of the distinct Li minerals, including petalite.
2. Data Description
The spectral database is divided into two subsets: the first contains the spectra collected in the laboratory and the second the spectra acquired in the field. Within each subset, the spectra were organized into categories according to their application purpose (Table 1
). To allow rapid and easy identification of the spectra, each spectrum name embeds logical codes (Table 1
). Consequently, information such as the sample location and the analyzed lithology/mineralogy can be readily extracted from the spectrum name alone. Besides the codes of Table 1
, each spectrum has its associated measurement number.
All spectra are provided in a universal UTF-8 text file format that can be read in any proprietary or open-source software, representing measurements made in the visible and near-infrared (VNIR) and shortwave infrared (SWIR) regions. From a total of 340 spectra collected in the laboratory, 84 represent Li minerals, 196 correspond to the outcropping lithologies of the Fregeneda–Almendra area that can be used as training areas for satellite image classification, and 60 spectra were collected from samples in Li pegmatite false-positive areas identified in previous satellite image classification attempts [13
]. Additionally, 75 field spectra are presented in the spectral database (35 measurements of Li minerals and 40 measurements of distinct Li pegmatites). As mentioned in Table 1
, two spectrometers were used for the spectral measurements. For the data acquired with the SR-6500, the UTF-8 text files consist of a 28-line header with information about the equipment and acquisition settings, followed by two columns: the first with the wavelength (in nanometers, nm) and the second with the measured reflectance (in percent). The spectra acquired with the ASD FieldSpec 4 contain only two columns, one with the wavelength (in nm) and the other with absolute reflectance values. For each subset (field and laboratory spectra) there is a *.xlsx table containing important ancillary data:
The coordinates are only available for samples collected in situ; samples collected from ore stockpiles, for example, have no coordinates recorded. For the spectra acquired in the laboratory, additional information is also provided, namely: (i) sample photographs with the analyzed spots highlighted; (ii) a UTF-8 text file with the respective continuum-removed spectra (with absolute reflectance values); and (iii) PNG files showing the main absorption features of each spectrum, together with a CSV file summarizing the main statistics of each feature (Section 3.2).
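Because the two spectrometers produce slightly different text layouts, a small reader can bring both onto a common reflectance scale. The sketch below is a minimal illustration and not part of the database: it assumes whitespace- or tab-separated columns and a decimal point, and the names `LAYOUTS`, `parse_spectrum`, and `load_spectrum`, as well as the instrument labels used as dictionary keys, are hypothetical. Only the 28-line header and the percent versus absolute reflectance units come from the description above.

```python
from typing import Dict, List, Tuple

# Layouts as described in the text; the keys are hypothetical labels.
LAYOUTS: Dict[str, Dict[str, object]] = {
    "SR-6500": {"header_lines": 28, "percent": True},  # reflectance in %
    "ASD": {"header_lines": 0, "percent": False},      # absolute reflectance
}


def parse_spectrum(text: str, instrument: str) -> Tuple[List[float], List[float]]:
    """Return wavelengths (nm) and reflectance as a 0-1 fraction."""
    layout = LAYOUTS[instrument]
    wavelengths: List[float] = []
    reflectance: List[float] = []
    for line in text.splitlines()[int(layout["header_lines"]):]:
        parts = line.split()  # assumes whitespace/tab-separated columns
        if len(parts) < 2:
            continue  # skip blank or incomplete lines
        try:
            wl, refl = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # skip any remaining non-numeric line
        wavelengths.append(wl)
        reflectance.append(refl / 100.0 if layout["percent"] else refl)
    return wavelengths, reflectance


def load_spectrum(path: str, instrument: str) -> Tuple[List[float], List[float]]:
    """Read a UTF-8 spectrum file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_spectrum(handle.read(), instrument)
```

Converting the SR-6500 percentages to fractions at load time lets field and laboratory spectra from the two instruments be compared directly, for example at the coincident spots.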